SOLR-4587: integrate lucene-monitor into solr #2382

Draft
wants to merge 82 commits into
base: main
Changes from 62 commits
bba4191
integrate lucene-monitor into solr
kotman12 Mar 30, 2024
fe3b413
move MonitorDataValues and check in license
kotman12 Apr 1, 2024
0989f1c
update versions.lock
kotman12 Apr 1, 2024
62ddcbc
add package-info to monitor packages
kotman12 Apr 1, 2024
33eccf7
extract helper method
kotman12 Apr 1, 2024
fa7fb40
apply errorprone suggestions
kotman12 Apr 3, 2024
11a1138
implement highlight matches
kotman12 Apr 4, 2024
7526b2f
AggregatingMatcher -> MatchesAggregator
kotman12 Apr 4, 2024
6ac7a5e
make monitor query cache optional
kotman12 Apr 4, 2024
2362ce7
move manySegmentsTest to ParallelMonitorSolrQueryTest
kotman12 Apr 4, 2024
e5a1382
call CandidateMatcher directly
kotman12 Apr 5, 2024
b7b58b3
remove doc forwarding callback
kotman12 Apr 6, 2024
ce40d60
instantiate decoder in outer loop
kotman12 Apr 9, 2024
a201676
ignore score for relevant match types
kotman12 Apr 12, 2024
60b0eb5
read MAX_SIZE_PARAM for maxSize
kotman12 Apr 12, 2024
fd04c4e
don't drop cause
kotman12 Apr 12, 2024
32cb6f6
remove superstitious delete calls
kotman12 Apr 12, 2024
b6d369b
add testDeleteByQueryId
kotman12 Apr 12, 2024
0ab5a6b
enable setting maxRamMB for monitor cache
kotman12 Apr 13, 2024
0b0120e
hardcoding luceneMatchVersion is bad
kotman12 Apr 15, 2024
24c1bf5
add multi-pass presearcher and optional field aliasing
kotman12 Apr 19, 2024
ee2992d
more accurate error
kotman12 Apr 19, 2024
6815ea1
redundant override
kotman12 Apr 19, 2024
995cfa2
wrap reserved field with _ and remove override behavior
kotman12 Apr 24, 2024
a2419ff
validate MonitorFields.RESERVED_MONITOR_FIELDS in schema
kotman12 Apr 24, 2024
0fde7ef
stricter validations of required fields
kotman12 Apr 27, 2024
5968754
narrow scope of __anytokenfield validation
kotman12 Apr 27, 2024
bb11982
getBool with default
kotman12 May 3, 2024
eba6e2d
initialize Presearcher in ReverseSearchComponent + add ReverseSearchH…
kotman12 May 4, 2024
d3ccf0c
remove unused constant
kotman12 May 4, 2024
e12ec6f
make SolrMonitorCache name optionally configurable
cpoerschke May 10, 2024
6d4986c
[exploratory] turn MonitorConstants.QUERY_DECOMPOSER into ReverseSear…
cpoerschke May 10, 2024
3aae9f0
remove REVERSE_SEARCH_PARAM_NAME flag in favor of dedicated path
kotman12 May 10, 2024
0915ce1
move getComponent from QCEVisitor to SolrMonitorQueryDecoder as sugge…
cpoerschke May 14, 2024
e6e25a5
avoid queryDecoder.getComponent(queryDecoder.decode(...), ...) usage …
cpoerschke May 14, 2024
8bb7151
Merge remote-tracking branch 'github_kotman12/solr-monitor' into solr…
cpoerschke May 14, 2024
e1365ad
remove now no-longer-used MonitorConstants.QUERY_DECOMPOSER
cpoerschke May 15, 2024
5b37a6a
coexistWithRegularDocumentsTest with multiple update chains
kotman12 Jun 4, 2024
e4af381
remove payload and additional field support
kotman12 Jun 4, 2024
7be05a3
validate fields in MonitorUpdateRequestProcessor
kotman12 Jun 4, 2024
f2d47be
start MonitorSolrQueryTest.validateDocList
cpoerschke Jun 14, 2024
d359f80
defer parallel matching from initial integration
cpoerschke Jun 17, 2024
4af2c57
move manySegments and multiPass to SingleCoreMonitorSolrTest
kotman12 Jun 22, 2024
ef8dafd
Merge remote-tracking branch 'github_kotman12/solr-monitor' into solr…
cpoerschke Jun 24, 2024
f3b33b1
Merge remote-tracking branch 'github_kotman12/solr-monitor' into solr…
cpoerschke Jun 24, 2024
7feb415
restore ParallelMonitorSolrQueryTest.java
cpoerschke Jun 24, 2024
a3c4d36
Merge pull request #2 from cpoerschke/solr-monitor-cpoerschke-5
kotman12 Jun 24, 2024
ebc4fb9
Merge pull request #1 from cpoerschke/solr-monitor-cpoerschke-4
kotman12 Jun 24, 2024
e33e8ae
./gradlew tidy
cpoerschke Jun 24, 2024
05438f4
defer monitorMatchType support (always use simple matching)
cpoerschke Jun 24, 2024
483e46f
always use ConstantScoreQuery
kotman12 Jun 25, 2024
9ed6b48
Merge remote-tracking branch 'origin/main' into solr-monitor
cpoerschke Jul 1, 2024
28bf264
remove MonitorFields.ANYTOKEN_FIELD in favour of TermFilteredPresearc…
cpoerschke Jul 1, 2024
908f462
Merge pull request #3 from cpoerschke/solr-monitor-cpoerschke-6
kotman12 Nov 8, 2024
13dffda
add back inadvertently removed import
cpoerschke Nov 8, 2024
cbf45de
remove stand-alone monitor API
kotman12 Nov 12, 2024
a3fc3d5
Merge branch 'solr-monitor' of https://github.com/kotman12/solr into …
kotman12 Nov 12, 2024
a24d66a
fix tests
kotman12 Nov 12, 2024
44eea79
avoid (minimise) solr/core changes
cpoerschke Nov 13, 2024
bf0daf7
tentative: no need to configure the query component
cpoerschke Nov 13, 2024
3e7a53c
subjective: remove SolrMatcherSinkFactory now that there's only one S…
cpoerschke Nov 13, 2024
113ca8f
some ReverseSearchComponent.prepare reordering and edits to aid code …
cpoerschke Nov 13, 2024
102cdf0
remove MatchesAggregator.java
kotman12 Nov 18, 2024
5850f8c
Merge branch 'main' of https://github.com/kotman12/solr into solr-mon…
kotman12 Nov 23, 2024
acd144f
Merge branch 'main' of https://github.com/kotman12/solr into solr-mon…
kotman12 Nov 23, 2024
3795013
update build files
kotman12 Nov 23, 2024
6989703
remove MonitorConstants
kotman12 Nov 23, 2024
1e62be2
debug component refactor
kotman12 Nov 24, 2024
f840f97
ReverseSearchQuery proof-of-concept
kotman12 Nov 25, 2024
4cc04c4
handle null scorer
kotman12 Nov 25, 2024
e68c329
wire lucene-monitor metadata to solr response
kotman12 Nov 25, 2024
c1d3f72
simplify interface and implement toString
kotman12 Nov 29, 2024
33e7400
Merge pull request #6 from kotman12/solr-monitor-rsq
kotman12 Dec 1, 2024
42e5f44
clean up
kotman12 Dec 2, 2024
882cda9
pull latest CandidateMatcher
kotman12 Dec 2, 2024
7c80dae
monitor -> saved-search
kotman12 Dec 2, 2024
50ec78c
Merge pull request #7 from kotman12/solr-monitor-test
kotman12 Dec 3, 2024
d869b33
monitor -> savedsearch package
kotman12 Dec 4, 2024
4665004
make SolrResourceLoader aware of savedsearch paths
kotman12 Dec 5, 2024
9a45f17
fix analyzeDependency error
kotman12 Dec 5, 2024
613b6cd
rename stuff to SavedSearch*
kotman12 Dec 5, 2024
2ad0db2
consolidate lucene-monitor visitors
kotman12 Dec 10, 2024
1 change: 1 addition & 0 deletions settings.gradle
@@ -52,6 +52,7 @@ include "solr:modules:ltr"
 include "solr:modules:s3-repository"
 include "solr:modules:scripting"
 include "solr:modules:sql"
+include "solr:modules:monitor"
 include "solr:webapp"
 include "solr:benchmark"
 include "solr:test-framework"
@@ -105,15 +105,7 @@ public void process(ResponseBuilder rb) throws IOException {
         info.addAll(stdinfo);
       }

-      FacetDebugInfo fdebug = (FacetDebugInfo) (rb.req.getContext().get("FacetDebugInfo"));
-      if (fdebug != null) {
-        info.add("facet-trace", fdebug.getFacetDebugInfo());
-      }
-
-      fdebug = (FacetDebugInfo) (rb.req.getContext().get("FacetDebugInfo-nonJson"));
-      if (fdebug != null) {
-        info.add("facet-debug", fdebug.getFacetDebugInfo());
-      }
+      addCustomInfo(rb, info);

       if (rb.req.getJSON() != null) {
         info.add(JSON, rb.req.getJSON());
@@ -139,6 +131,18 @@ public void process(ResponseBuilder rb) throws IOException {
     }
   }

+  protected void addCustomInfo(ResponseBuilder rb, NamedList<Object> info) {
+    FacetDebugInfo fdebug = (FacetDebugInfo) (rb.req.getContext().get("FacetDebugInfo"));
+    if (fdebug != null) {
+      info.add("facet-trace", fdebug.getFacetDebugInfo());
+    }
+
+    fdebug = (FacetDebugInfo) (rb.req.getContext().get("FacetDebugInfo-nonJson"));
+    if (fdebug != null) {
+      info.add("facet-debug", fdebug.getFacetDebugInfo());
+    }
+  }
+
   private void doDebugTrack(ResponseBuilder rb) {
     final String rid = rb.req.getParams().get(CommonParams.REQUEST_ID);
     rb.addDebug(rid, "track", CommonParams.REQUEST_ID); // to see it in the response
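The hunks above move the facet debug entries into a protected addCustomInfo method, which gives subclasses a single place to contribute their own debug entries. A minimal sketch of that hook pattern follows, using plain maps instead of Solr's ResponseBuilder and NamedList. The class names and the "queriesRun" key are hypothetical and do not come from the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Demo entry point; every class in this sketch is a hypothetical stand-in, not Solr code. */
class DebugHookDemo {
  public static void main(String[] args) {
    Map<String, Object> context = Map.of("FacetDebugInfo", "trace", "queriesRun", 3);
    System.out.println(new BaseDebugSketch().process(context));
    System.out.println(new SavedSearchDebugSketch().process(context));
  }
}

/** Stand-in for a debug component whose process method calls one overridable hook. */
class BaseDebugSketch {
  final Map<String, Object> process(Map<String, Object> context) {
    Map<String, Object> info = new LinkedHashMap<>();
    info.put("timing", "n/a"); // placeholder for the standard debug entries
    addCustomInfo(context, info); // the single extension point
    return info;
  }

  /** Default behaviour: copy the facet trace if the request context carries one. */
  protected void addCustomInfo(Map<String, Object> context, Map<String, Object> info) {
    Object facetTrace = context.get("FacetDebugInfo");
    if (facetTrace != null) {
      info.put("facet-trace", facetTrace);
    }
  }
}

/** Hypothetical subclass that replaces the facet entries with its own entry. */
class SavedSearchDebugSketch extends BaseDebugSketch {
  @Override
  protected void addCustomInfo(Map<String, Object> context, Map<String, Object> info) {
    Object queriesRun = context.get("queriesRun"); // hypothetical context key
    if (queriesRun != null) {
      info.put("queriesRun", queriesRun);
    }
  }
}
```

Keeping process final in the sketch means a subclass can only change what the hook contributes, not the order in which the debug section is assembled.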
1 change: 1 addition & 0 deletions solr/licenses/lucene-monitor-9.11.1.jar.sha1
@@ -0,0 +1 @@
fa731be343ec79d940c7abd0dd8ff1de9bf9c7f1
33 changes: 33 additions & 0 deletions solr/modules/monitor/build.gradle
@@ -0,0 +1,33 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

apply plugin: 'java-library'

description = 'Apache Solr Monitor'
Contributor

That name is puzzling to anyone who isn't intimately familiar with Lucene Monitor. I don't think we should be calling this "Solr Monitor" at all; it looks like an infrastructure monitoring thing. Possibly "Solr-Lucene-Monitor", but that is still a puzzling name.

Author

This is a great point. The library used to be called luwak, which I find to be a much better name. I'll try to think of a better name (maybe solr-reverse-search or solr-query-alerting). I'll also reply in more detail to your mailing list message, touching on solr.cool and the sandbox.

Contributor

Saved Searches is a common name; I assume it is possible to list a user's saved searches too. Or Alerting, but then most people will expect there to be some functionality to ship alerts somewhere...

Author

You're right, if anything this might be a part of some larger alerting system, but "saved search" is more accurate.

Contributor

Saved searches is a pretty indicative name. Percolator is also a known name for this kind of functionality.

Author

Interesting, I thought ES invented "percolator" as more of a metaphor... I wasn't aware that this is a more generic name. I was worried that "percolator" might clash too much with ES.


dependencies {

  implementation project(":solr:core")
  implementation project(":solr:solrj")
  implementation "org.apache.lucene:lucene-core"
  implementation "org.apache.lucene:lucene-monitor"
  implementation 'com.github.ben-manes.caffeine:caffeine'
Contributor

The CI says:

Execution failed for task ':solr:modules:monitor:analyzeClassesDependencies'.
> Dependency analysis found issues.
  usedUndeclaredArtifacts
   - io.dropwizard.metrics:metrics-core:4.2.26@jar

A remedy might be something like:

Suggested change
- implementation 'com.github.ben-manes.caffeine:caffeine'
+ implementation 'com.github.ben-manes.caffeine:caffeine'
+ implementation 'io.dropwizard.metrics:metrics-core'

However, with that change applied, running ./gradlew clean ; ./gradlew :solr:modules:monitor:analyzeClassesDependencies locally reports the opposite problem:

Execution failed for task ':solr:modules:monitor:analyzeClassesDependencies'.
> Dependency analysis found issues.
  unusedDeclaredArtifacts
   - io.dropwizard.metrics:metrics-core:4.2.25@jar

  testImplementation project(':solr:test-framework')
  testImplementation 'junit:junit'
}


@@ -0,0 +1,125 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.lucene.monitor;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

/** Class used to match candidate queries selected by a Presearcher from a Monitor query index. */
public abstract class CandidateMatcher<T extends QueryMatch> {

/** The searcher to run candidate queries against */
protected final IndexSearcher searcher;

private final Map<String, Exception> errors = new HashMap<>();
private final List<MatchHolder<T>> matches;

private long searchTime = System.nanoTime();

private static class MatchHolder<T> {
Map<String, T> matches = new HashMap<>();
}

/**
* Creates a new CandidateMatcher for the supplied DocumentBatch
*
* @param searcher the IndexSearcher to run queries against
*/
public CandidateMatcher(IndexSearcher searcher) {
this.searcher = searcher;
int docCount = searcher.getIndexReader().maxDoc();
this.matches = new ArrayList<>(docCount);
for (int i = 0; i < docCount; i++) {
this.matches.add(new MatchHolder<>());
}
}

/**
* Runs the supplied query against this CandidateMatcher's set of documents, storing any resulting
* match, and recording the query in the presearcher hits
*
* @param queryId the query id
* @param matchQuery the query to run
* @param metadata the query metadata
* @throws IOException on IO errors
*/
public abstract void matchQuery(String queryId, Query matchQuery, Map<String, String> metadata)
throws IOException;

/**
* Record a match
*
* @param match a QueryMatch object
* @param doc the index of the matching document within the batch
*/
protected final void addMatch(T match, int doc) {
MatchHolder<T> docMatches = matches.get(doc);
docMatches.matches.compute(
match.getQueryId(),
(key, oldValue) -> {
if (oldValue != null) {
return resolve(match, oldValue);
}
return match;
});
}

/**
* If two matches from the same query are found (for example, two branches of a disjunction),
* combine them.
*
* @param match1 the first match found
* @param match2 the second match found
* @return a Match object that combines the two
*/
public abstract T resolve(T match1, T match2);

/** Called by the Monitor if running a query throws an Exception */
void reportError(String queryId, Exception e) {
this.errors.put(queryId, e);
}

/**
* @return the matches from this matcher
*/
public final MultiMatchingQueries<T> finish(long buildTime, int queryCount) {
doFinish();
this.searchTime =
TimeUnit.MILLISECONDS.convert(System.nanoTime() - searchTime, TimeUnit.NANOSECONDS);
List<Map<String, T>> results = new ArrayList<>();
for (MatchHolder<T> matchHolder : matches) {
results.add(matchHolder.matches);
}
return new MultiMatchingQueries<>(
results, errors, buildTime, searchTime, queryCount, matches.size());
}

/** Called when all monitoring of a batch of documents is complete */
protected void doFinish() {}

/** Copy all matches from another CandidateMatcher */
protected void copyMatches(CandidateMatcher<T> other) {
this.matches.clear();
this.matches.addAll(other.matches);
}
}
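The addMatch and resolve pair above keeps one map of query id to match per document and combines a second match for the same query instead of overwriting the first. A simplified sketch of that bookkeeping follows. It is not the Lucene class: MatchTableSketch is a hypothetical name, and a BinaryOperator stands in for the abstract resolve method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Hypothetical stand-in for the per-document match table; T is any match type. */
class MatchTableSketch<T> {
  private final List<Map<String, T>> matchesPerDoc = new ArrayList<>();
  private final BinaryOperator<T> resolver; // plays the role of resolve(match1, match2)

  MatchTableSketch(int docCount, BinaryOperator<T> resolver) {
    this.resolver = resolver;
    // One map per document, created up front so that addMatch can index by doc.
    for (int i = 0; i < docCount; i++) {
      matchesPerDoc.add(new HashMap<>());
    }
  }

  /** Stores the match, or combines it with an earlier match for the same query id. */
  void addMatch(String queryId, T match, int doc) {
    matchesPerDoc
        .get(doc)
        .compute(
            queryId,
            (key, oldValue) -> oldValue != null ? resolver.apply(match, oldValue) : match);
  }

  Map<String, T> matches(int doc) {
    return matchesPerDoc.get(doc);
  }

  public static void main(String[] args) {
    // Integer hit counts stand in for QueryMatch objects; summing stands in for resolve.
    MatchTableSketch<Integer> table = new MatchTableSketch<>(2, Integer::sum);
    table.addMatch("q1", 1, 0);
    table.addMatch("q1", 2, 0); // same query, same doc: combined rather than replaced
    table.addMatch("q2", 5, 1);
    System.out.println(table.matches(0));
    System.out.println(table.matches(1));
  }
}
```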
@@ -0,0 +1,64 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/

package org.apache.lucene.monitor;
Contributor

Opened apache/lucene#13993 to propose making DocumentBatch public.


import java.io.Closeable;
import java.io.IOException;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.LeafReader;

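/**
* Public wrapper around the non-public DocumentBatch, so that code outside this package can build
* a batch, read it as a LeafReader and close it.
*/
public class DocumentBatchVisitor implements Closeable, Supplier<LeafReader> {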

private final DocumentBatch batch;
private final List<Document> docs;

private DocumentBatchVisitor(DocumentBatch batch, List<Document> docs) {
this.batch = batch;
this.docs = docs;
}

public static DocumentBatchVisitor of(Analyzer analyzer, List<Document> docs) {
return new DocumentBatchVisitor(
DocumentBatch.of(analyzer, docs.toArray(new Document[0])), docs);
}

@Override
public void close() throws IOException {
batch.close();
}

@Override
public LeafReader get() {
return batch.get();
}

public int size() {
return docs.size();
}

@Override
public String toString() {
return docs.stream().map(Document::toString).collect(Collectors.joining(" "));
}
}
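Because the wrapper above is both a Closeable and a Supplier, a caller would presumably hold it in a try-with-resources block and read from it only while it is open. A minimal sketch of that usage pattern follows, with a list of strings standing in for the Lucene documents and reader. BatchHandleSketch and its closed-state check are hypothetical and are not part of the PR.

```java
import java.io.Closeable;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a closeable batch handle; strings replace Lucene documents. */
class BatchHandleSketch implements Closeable, Supplier<List<String>> {
  private final List<String> docs;
  private boolean closed;

  private BatchHandleSketch(List<String> docs) {
    this.docs = List.copyOf(docs);
  }

  /** Static factory, mirroring the private-constructor-plus-of() shape used above. */
  static BatchHandleSketch of(List<String> docs) {
    return new BatchHandleSketch(docs);
  }

  @Override
  public List<String> get() {
    if (closed) {
      throw new IllegalStateException("batch already closed");
    }
    return docs;
  }

  int size() {
    return docs.size();
  }

  boolean isClosed() {
    return closed;
  }

  @Override
  public void close() {
    closed = true; // a real batch would release its in-memory index here
  }

  public static void main(String[] args) {
    // try-with-resources closes the handle even if matching throws.
    try (BatchHandleSketch batch = of(List.of("doc-a", "doc-b"))) {
      System.out.println(batch.size() + " docs: " + batch.get());
    }
  }
}
```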
@@ -0,0 +1,60 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/
package org.apache.lucene.monitor;

import java.util.List;
import java.util.Map;
import org.apache.lucene.search.Query;

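/**
* Combines the matches and errors of several finished CandidateMatchers into one result. It cannot
* run queries itself; duplicate matches are resolved by the supplied resolving matcher.
*/
public class MatchesAggregator<T extends QueryMatch> extends CandidateMatcher<T> {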

private final CandidateMatcher<T> resolvingMatcher;

private MatchesAggregator(
List<CandidateMatcher<T>> matchers, CandidateMatcher<T> resolvingMatcher) {
super(resolvingMatcher.searcher);
this.resolvingMatcher = resolvingMatcher;
for (var matcher : matchers) {
var matches = matcher.finish(Long.MIN_VALUE, -1);
for (int doc = 0; doc < matches.getBatchSize(); doc++) {
for (T match : matches.getMatches(doc)) {
this.addMatch(match, doc);
}
}
for (Map.Entry<String, Exception> error : matches.getErrors().entrySet()) {
this.reportError(error.getKey(), error.getValue());
}
}
}

@Override
public void matchQuery(String queryId, Query matchQuery, Map<String, String> metadata) {
throw new UnsupportedOperationException("only use for aggregating other matchers");
}

@Override
public T resolve(T match1, T match2) {
return resolvingMatcher.resolve(match1, match2);
}

public static <T extends QueryMatch> MultiMatchingQueries<T> aggregate(
List<CandidateMatcher<T>> matchers, CandidateMatcher<T> resolver, int queryCount) {
return new MatchesAggregator<>(matchers, resolver).finish(Long.MIN_VALUE, queryCount);
}
}
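The aggregation above walks every finished matcher, re-adds each match under its document index, and lets the resolver combine duplicates. A simplified sketch of the same merge over plain maps follows. AggregateSketch is a hypothetical name, integer values stand in for QueryMatch objects, and error propagation is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Hypothetical stand-in for merging per-document results from several matchers. */
class AggregateSketch {

  /**
   * Merges partial results, each a list with one query-id-to-match map per document.
   * Assumes every partial covers at most docCount documents.
   */
  static <T> List<Map<String, T>> aggregate(
      List<List<Map<String, T>>> partials, int docCount, BinaryOperator<T> resolver) {
    List<Map<String, T>> merged = new ArrayList<>();
    for (int i = 0; i < docCount; i++) {
      merged.add(new HashMap<>());
    }
    for (List<Map<String, T>> partial : partials) {
      for (int doc = 0; doc < partial.size(); doc++) {
        for (Map.Entry<String, T> entry : partial.get(doc).entrySet()) {
          // Map.merge inserts the value, or combines it with the one already present.
          merged.get(doc).merge(entry.getKey(), entry.getValue(), resolver);
        }
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    List<Map<String, Integer>> first = List.of(Map.of("q1", 1), Map.<String, Integer>of());
    List<Map<String, Integer>> second = List.of(Map.of("q1", 2, "q2", 1), Map.of("q3", 4));
    System.out.println(aggregate(List.of(first, second), 2, Integer::sum));
  }
}
```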
@@ -0,0 +1,32 @@
/*
*
* * Licensed to the Apache Software Foundation (ASF) under one or more
* * contributor license agreements. See the NOTICE file distributed with
* * this work for additional information regarding copyright ownership.
* * The ASF licenses this file to You under the Apache License, Version 2.0
* * (the "License"); you may not use this file except in compliance with
* * the License. You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/

package org.apache.lucene.monitor;

import java.util.Set;

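/**
* Names of the schema fields the monitor module requires. Each name is the corresponding
* QueryIndex.FIELDS constant with a trailing underscore appended.
*/
public class MonitorFields {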

public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_";
public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_";
public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_";

public static final Set<String> REQUIRED_MONITOR_SCHEMA_FIELDS =
Set.of(QUERY_ID, CACHE_ID, MONITOR_QUERY);
}
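The commit list mentions validating these required fields against the schema. A minimal sketch of such a check follows, written against a plain set of field names rather than Solr's schema API. The literal values assume that Lucene's QueryIndex.FIELDS constants are "_query_id", "_cache_id" and "_mq"; that assumption, the class name and the method name are illustrative and not taken from the PR.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical check that a schema declares every required monitor field. */
class RequiredFieldsSketch {

  // Assumed values: the Lucene constants "_query_id", "_cache_id", "_mq" plus a trailing "_".
  static final Set<String> REQUIRED = Set.of("_query_id_", "_cache_id_", "_mq_");

  /** Returns the required field names that the given schema field names do not include. */
  static Set<String> missingFields(Set<String> schemaFields) {
    Set<String> missing = new TreeSet<>(REQUIRED); // sorted, for stable error messages
    missing.removeAll(schemaFields);
    return missing;
  }

  public static void main(String[] args) {
    Set<String> schemaFields = Set.of("id", "title", "_query_id_");
    Set<String> missing = missingFields(schemaFields);
    if (!missing.isEmpty()) {
      System.out.println("schema is missing required fields: " + missing);
    }
  }
}
```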