Lucene Query Types

02 Jan 2021

A reference list of the different query types in Lucene.

In the queries package:

| Query | Overview |
| --- | --- |
| CommonTermsQuery | A query that executes high-frequency terms in an optional sub-query to prevent slow queries due to “common” terms like stopwords. This query builds two queries from the added terms: low-frequency terms are added to a required boolean clause and high-frequency terms are added to an optional boolean clause. |
| FunctionMatchQuery | A query that retrieves all documents with a DoubleValues value matching a predicate. This query works by a linear scan of the index, and is best used in conjunction with other queries that can restrict the number of documents visited. |
| FunctionQuery | Returns a score for each document based on a ValueSource, often some function of the value of a field. |
| FunctionRangeQuery | A query wrapping a ValueSource that matches documents whose value from the value source falls within a configured range. The score is the float value. |
| FunctionScoreQuery | A query that wraps another query, and uses a DoubleValuesSource to replace or modify the wrapped query’s score. |
| IntervalQuery | A query that retrieves documents containing intervals returned from an IntervalsSource. |
| MoreLikeThisQuery | A simple wrapper for MoreLikeThis for use in scenarios where a Query object is required, e.g. in custom QueryParser extensions. |
| PayloadScoreQuery | A query class that uses a PayloadFunction to modify the score of a wrapped SpanQuery. |
| SpanPayloadCheckQuery | Returns only those matches that have a specific payload at the given position. |
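
The term split described for CommonTermsQuery above can be illustrated with a small standard-library sketch. This is not Lucene code: the class and method names are hypothetical, and the cutoff rule (a term is "common" when it appears in more than a given fraction of all documents) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stdlib-only illustration of the CommonTermsQuery term split; not Lucene code. */
final class CommonTermsSplitSketch {

    /** Low-frequency terms go to the required clause, high-frequency terms to the optional one. */
    static final class Split {
        final List<String> required = new ArrayList<>();
        final List<String> optional = new ArrayList<>();
    }

    /**
     * Splits terms by document frequency.
     * Assumption: a term is "common" when it occurs in more than
     * maxTermFrequency * maxDoc documents (a fractional cutoff).
     */
    static Split split(Map<String, Integer> docFreqs, int maxDoc, double maxTermFrequency) {
        Split result = new Split();
        double cutoff = maxTermFrequency * maxDoc;
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            if (e.getValue() > cutoff) {
                result.optional.add(e.getKey());  // high-frequency: optional clause
            } else {
                result.required.add(e.getKey());  // low-frequency: required clause
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical document frequencies in an index of 1000 documents.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("the", 950);
        docFreqs.put("lucene", 12);
        docFreqs.put("query", 40);
        Split s = split(docFreqs, 1000, 0.1);
        System.out.println("required=" + s.required + " optional=" + s.optional);
    }
}
```

With these made-up frequencies, only "the" exceeds the cutoff, so documents are selected by "lucene" and "query", and "the" merely contributes to the score of documents that already match.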

In the core package:

| Query | Overview |
| --- | --- |
| AutomatonQuery | This query will match documents that contain terms accepted by a given finite-state machine. The automaton can be constructed with the org.apache.lucene.util.automaton API. Alternatively, it can be created from a regular expression with RegexpQuery or from the standard Lucene wildcard syntax with WildcardQuery. |
| BlendedTermQuery | A query that blends index statistics across multiple terms. This is particularly useful when several terms should produce identical scores, regardless of their index statistics. |
| BooleanQuery | A query that matches documents matching boolean combinations of other queries, e.g. TermQuerys, PhraseQuerys or other BooleanQuerys. See here for an overview of Lucene’s boolean query and operator rules. |
| BoostQuery | A query wrapper that applies a boost to the wrapped query. Boost values less than one give this query less importance relative to other queries, while values greater than one give more importance to the scores it returns. More complex boosts can be applied by using FunctionScoreQuery. |
| ConstantScoreQuery | A query that wraps another query and returns a constant score of 1 for every document that matches it, discarding the wrapped query’s own scores. |
| DisjunctionMaxQuery | A query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie-breaking increment for any additional matching subqueries. |
| DocValuesFieldExistsQuery | A query that matches documents that have a value for a given field as reported by doc values iterators. |
| FieldMaskingSpanQuery | Wrapper to allow SpanQuery objects to participate in composite single-field span queries by ’lying’ about their search field. That is, the masked SpanQuery will function as normal, but SpanQuery.getField() simply hands back the value supplied in this class’s constructor. This can be used to support queries like SpanNearQuery or SpanOrQuery across different fields, which is not ordinarily permitted. |
| FuzzyQuery | Implements the fuzzy search query. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm, though you can explicitly choose classic Levenshtein by passing false to the transpositions parameter. |
| IndexOrDocValuesQuery | A query that uses either an index structure (points or terms) or doc values in order to run a query, depending on which one is more efficient. |
| LatLonDocValuesPointInPolygonQuery | Polygon query for LatLonDocValuesField. |
| MatchAllDocsQuery | A query that matches all documents. |
| MatchNoDocsQuery | A query that matches no documents. |
| MultiPhraseQuery | A generalized version of PhraseQuery that allows more than one term at the same position; those terms are treated as a disjunction (OR). |
| MultiTermQuery | An abstract query that matches documents containing a subset of terms provided by a FilteredTermsEnum enumeration. This query cannot be used directly; you must subclass it and define getTermsEnum(Terms,AttributeSource) to provide a FilteredTermsEnum that iterates through the terms to be matched. |
| NGramPhraseQuery | This is a PhraseQuery which is optimized for n-grams. |
| NormsFieldExistsQuery | A query that matches documents that have a value for a given field as reported by field norms. This will not work for fields that omit norms, e.g. StringField. |
| PhraseQuery | A query that matches documents containing a particular sequence of terms. A PhraseQuery is built by QueryParser for input like “new york”. All terms in the phrase must match, even those at the same position. If you have terms at the same position, perhaps synonyms, you probably want MultiPhraseQuery instead, which requires only one term at a position to match. |
| PointInSetQuery | Abstract query class to find all documents whose single or multi-dimensional point values, previously indexed with e.g. IntPoint, are contained in the specified set. |
| PointRangeQuery | Abstract class for range queries against single or multi-dimensional points such as IntPoint. |
| PrefixQuery | A query that matches documents containing terms with a specified prefix. A PrefixQuery is built by QueryParser for input like app*. |
| RegexpQuery | A fast regular expression query based on the org.apache.lucene.util.automaton package. The supported syntax is documented in the RegExp class. Note that this syntax may differ from other regular expression implementations. |
| SpanQuery | Spans support proximity searching. See the overview of spans for more details. This is an abstract class. See the separate section below for a list of span query implementations. |
| SynonymQuery | A query that treats multiple terms as synonyms. For scoring purposes, this query tries to score the terms as if you had indexed them as one term: it will match any of the terms but only invoke the similarity a single time, scoring the sum of all term frequencies for the document. |
| TermInSetQuery | Specialization for a disjunction (OR) over many terms that behaves like a ConstantScoreQuery over a BooleanQuery containing only BooleanClause.Occur.SHOULD clauses. |
| TermQuery | A query that matches documents containing a term. This may be combined with other terms with a BooleanQuery. |
| TermRangeQuery | A query that matches documents within a range of terms. This query matches the documents looking for terms that fall into the supplied range according to BytesRef.compareTo(BytesRef). |
| WildcardQuery | Implements the wildcard search query. Supported wildcards are *, which matches any character sequence (including the empty one), and ?, which matches any single character. \ is the escape character. Note this query can be slow, as it needs to iterate over many terms. In order to prevent extremely slow wildcard queries, a wildcard term should not start with the wildcard *. |
| XYDocValuesPointInGeometryQuery | XYGeometry query for XYDocValuesField. |
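
The scoring rule described for DisjunctionMaxQuery above (the best subquery score plus a tie-breaking increment for the others) can be written out as a short standard-library sketch. This is not Lucene code: the class, the method and the `tieBreaker` parameter name are hypothetical, and the sketch only mirrors the arithmetic for a single document.

```java
/** Stdlib-only illustration of the DisjunctionMaxQuery scoring rule; not Lucene code. */
final class DisMaxScoreSketch {

    /**
     * Combines the scores of the subqueries that matched one document:
     * the best score, plus tieBreaker times the sum of the remaining scores.
     */
    static double combine(double[] subScores, double tieBreaker) {
        if (subScores.length == 0) {
            throw new IllegalArgumentException("no matching subqueries");
        }
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        for (double s : subScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Hypothetical scores from three subqueries matching the same document.
        double[] scores = {2.0, 1.0, 0.5};
        System.out.println(combine(scores, 0.0)); // only the best subquery counts
        System.out.println(combine(scores, 0.5)); // other matches count for half
        System.out.println(combine(scores, 1.0)); // equivalent to summing all scores
    }
}
```

A tie-breaker of 0 rewards only the single best-matching field, while values closer to 1 move the result toward the plain sum that a BooleanQuery of optional clauses would give.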

Span queries:

| Query | Overview |
| --- | --- |
| SpanTermQuery | Matches all spans containing a particular Term. This should not be used for terms that are indexed at position Integer.MAX_VALUE. |
| SpanNearQuery | Matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQuerys) and inter-phrase proximity (when constructed from other SpanNearQuerys). |
| SpanWithinQuery | Matches spans which occur inside other spans. |
| SpanContainingQuery | Matches spans which contain other spans. |
| SpanOrQuery | Merges spans from a number of other SpanQuerys. |
| SpanNotQuery | Removes spans matching one SpanQuery which overlap (or come near) another. This can be used, e.g., to implement within-paragraph search. |
| SpanFirstQuery | Matches spans matching a query q whose end position is less than a limit n. This can be used to constrain matches to the first part of the document. |
| SpanPositionRangeQuery | A more general form of SpanFirstQuery that can constrain matches to arbitrary portions of the document. |
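
The relationship between SpanFirstQuery and SpanPositionRangeQuery described above can be sketched with plain Java. This is not Lucene code and all names are hypothetical. It assumes each span is a half-open range of token positions, so "end position less than n" is read as the last covered position being below n.

```java
import java.util.ArrayList;
import java.util.List;

/** Stdlib-only illustration of position-constrained span matching; not Lucene code. */
final class SpanPositionSketch {

    /** A span covering token positions start (inclusive) to end (exclusive). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Keeps only spans lying entirely inside [from, to): the general position-range case. */
    static List<Span> inRange(List<Span> spans, int from, int to) {
        List<Span> kept = new ArrayList<>();
        for (Span s : spans) {
            if (s.start >= from && s.end <= to) {
                kept.add(s);
            }
        }
        return kept;
    }

    /** The "first n positions" case: the same filter with the range starting at 0. */
    static List<Span> first(List<Span> spans, int n) {
        return inRange(spans, 0, n);
    }

    public static void main(String[] args) {
        // Hypothetical spans produced by some inner span query for one document.
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(0, 2));
        spans.add(new Span(3, 4));
        spans.add(new Span(8, 10));
        System.out.println(first(spans, 4));
        System.out.println(inRange(spans, 3, 10));
    }
}
```

Expressing `first` as a call to `inRange` mirrors the description in the table: the "first part of the document" filter is the position-range filter with its lower bound fixed at zero.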