Search API Reference
All the search request parameters listed below can be set in query
profiles. The first four blocks of properties are also modeled as
query profile types. These types can be referenced from query profiles
(and inheriting types) to provide type checking of the parameters.
These parameters often have both a full name, which includes the
path from the root query profile, and one or more abbreviated
names. Both kinds of name can be used in search requests, but only full
names can be used in query profiles. Full names are case
sensitive, while abbreviated names are case insensitive.
The parameters modeled as query profiles are also available to
Searcher components as Java objects, through get methods on the Query.
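As a rough illustration of full versus abbreviated names, the sketch below builds the same request twice using Python's standard library. The host, port and /search/ path are assumptions made for this example, as is the helper name search_url; substitute the address of your own container.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed endpoint, for illustration only.
ENDPOINT = "http://localhost:8080/search/"


def search_url(params):
    """Return ENDPOINT with params appended as an encoded query string."""
    return ENDPOINT + "?" + urlencode(params)


# Full names: case sensitive, and the only form usable in query profiles.
full = search_url({
    "model.queryString": "foo",
    "model.type": "all",
    "ranking.profile": "default",
})

# Abbreviated names: accepted in search requests only, case insensitive.
short = search_url({"query": "foo", "type": "all", "ranking": "default"})

print(full)
print(short)
```

Both URLs are meant to express the same request; only the parameter spelling differs.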
Index
Query
- Native Execution Parameters
  - hits [count]
  - offset [start]
  - queryProfile
  - nocache
  - groupingSessionCache
  - searchChain
  - timeout
  - tracelevel
  - trace.timestamps
- Query Model Parameters
  - model.defaultIndex [default-index]
  - model.encoding [encoding]
  - model.filter [filter]
  - model.language [lang, language]
  - model.queryString [query]
  - model.restrict [restrict]
  - model.searchPath [path]
  - model.sources [search, sources]
  - model.type [type]
- Ranking
  - ranking.location [location]
  - ranking.features [rankfeature]
  - ranking.listFeatures [rankfeatures]
  - ranking.profile [ranking]
  - ranking.properties [rankproperty]
  - ranking.sorting [sorting]
  - ranking.freshness
  - ranking.queryCache
  - ranking.matchPhase
- Presentation
  - presentation.bolding [bolding]
  - presentation.format [format]
  - presentation.template
  - presentation.summary [summary]
  - presentation.timing
- Grouping
- Geographical Searches
- Streaming Search
- Semantic Rules
- Other
Query
yql
Alias
Values String
Default None
The YQL query is parsed and executed in the backend.
Only simple YQL programs are supported; refer to
YQL for details.
select
A select query is equivalent to a YQL query, but written in JSON. It contains the subparameters where
and grouping.
where
Alias
Values JSON
Default None
grouping
Alias
Values JSON
Default None
The where and grouping queries are parsed and executed in the backend.
Refer to the
Select Reference for details.
Native Execution Parameters
These parameters are defined in the native
query profile type.
hits
Alias count
Values
A positive integer, or 0. The sum of offset and
hits should be lower than the configured maxOffset
value, and will be adjusted to fit. See also the comment
at offset.
Default 10
The maximum number of hits to return from the result set.
Must be lower than maxHits, which is either set in a
query profile or defaults to 400.
offset
Alias start
Values
A positive integer, or 0.
Default 0
The index of the first hit to return from the result set.
Must be lower than maxOffset, which is either set in a
query profile or defaults to 1000.
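The relationship between hits and offset can be sketched as a small paging helper. The function name, the zero-based page convention and the decision to reject out-of-range values (rather than adjust them) are assumptions made for this example; the limits mirror the defaults stated above.

```python
def page_params(page, page_size=10, max_hits=400, max_offset=1000):
    """Translate a zero-based page number into hits/offset parameters.

    Values above the limits are rejected in this sketch; how the exact
    boundary is treated by the server is not modeled here.
    """
    if page < 0 or page_size < 0:
        raise ValueError("page and page_size must not be negative")
    offset = page * page_size
    if page_size > max_hits:
        raise ValueError("hits exceeds maxHits")
    if offset > max_offset:
        raise ValueError("offset exceeds maxOffset")
    return {"hits": page_size, "offset": offset}


print(page_params(0))      # first page with the default page size
print(page_params(3, 25))  # fourth page, 25 hits per page
```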
queryProfile
Alias None
Values
A query profile id - name:version, where version can be omitted
or partially specified, e.g. "myprofile:2.1"
Default default
A query profile has default properties for a query.
The default query profile is named default - example:
<query-profile id="default">
<field name="maxHits">10</field>
<field name="maxOffset">1000</field>
</query-profile>
nocache
Alias
Values
True or false
Default false
Set to true to avoid the result being fetched from cache, and avoid
writing the result to cache after fetching it.
groupingSessionCache
Alias
Values
True or false
Default false
Set to true to store intermediate grouping results in the search back ends when
using multi level grouping expressions in order to speed up grouping at a
potential loss of accuracy. See the grouping reference for more
details.
searchChain
Alias
Values
A search chain id - name:version, where version can be
omitted or partially specified, e.g. "mychain:2.1.3".
Default default
The search chain initially invoked when processing this query. This
search chain may invoke other chains.
timeout
Alias
Values
Positive floating point number with an optional unit. The default unit
is seconds (s); valid unit strings include ms and s. To set
a timeout of one minute, the argument could be set to 60 s.
The space between the number and the unit is optional.
Default Undefined, but guaranteed to be at least 5000 milliseconds. This default can be overridden by configuring timeout in a query profile.
The query timeout.
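The timeout value format described above can be sketched as a small client-side parser. This is only an illustration: the function name is hypothetical, and it handles just the two units named in the text (ms and s).

```python
import re

# Number, optional whitespace, optional unit (only ms and s are handled here).
_TIMEOUT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)?\s*$")


def timeout_seconds(value):
    """Convert a timeout string such as '60 s', '500ms' or '2.5' to seconds."""
    match = _TIMEOUT.match(value)
    if match is None:
        raise ValueError(f"not a valid timeout: {value!r}")
    number = float(match.group(1))
    if number <= 0:
        raise ValueError("timeout must be positive")
    unit = match.group(2) or "s"  # seconds when no unit is given
    return number / 1000.0 if unit == "ms" else number


print(timeout_seconds("60 s"))
print(timeout_seconds("500ms"))
```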
tracelevel
Alias
Values
Any positive number
Default No tracing
Set to a positive number to collect trace information for debugging
when running a query. Higher numbers give
progressively more detail on query transformations and searcher
execution.
trace.timestamps
Alias
Values
true or false
Default No timestamps in trace
Set to true to include timing information already at tracelevel=1. This is useful for debugging latency in the different components of the search chain without rendering the large amount of string data that comes with higher trace levels.
Query Model Parameters
model.defaultIndex [default-index]
Alias default-index
Values An index name
Default default
The field which is searched for query terms that do not explicitly specify an index.
model.encoding [encoding]
Alias encoding
Values Encoding names or aliases defined in the IANA character sets
Default utf-8
Sets the encoding to use when returning a result. The encodings big5,
euc-jp, euc-kr, gb2312, iso-2022-jp and shift-jis
also influence how tokenization is done in the absence of an explicit language setting.
The query is always encoded as UTF-8, independently of how the result will be encoded.
model.filter [filter]
Alias filter
Values Any allowed collection of filter terms
Default Not set
Sets a filter to be combined with the query. Typical use of a filter
is to add machine-generated or preference-based filter terms to a raw
user query. The filter is parsed the same way as a query of type any;
the full syntax is available. The positive terms (preceded by +) and
phrases act as AND filters, the negative terms (preceded by -) act as
NOT filters, while the unprefixed terms will be used to RANK the
results. Unless the query has no positive terms, the filter will only
restrict and influence ranking of the result set, never cause more
matches than the query.
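The three kinds of filter terms can be illustrated with a small helper that assembles a filter string. The function name and the field names in the usage line (lang, adult) are hypothetical and chosen only for this sketch.

```python
def build_filter(required=(), excluded=(), rank_only=()):
    """Assemble a filter string from three kinds of terms.

    required  -> prefixed with '+', act as AND filters
    excluded  -> prefixed with '-', act as NOT filters
    rank_only -> unprefixed, only influence ranking
    """
    terms = [f"+{term}" for term in required]
    terms += [f"-{term}" for term in excluded]
    terms += list(rank_only)
    return " ".join(terms)


# Hypothetical field names, for illustration only.
print(build_filter(["lang:en"], ["adult:true"], ["sports"]))
```

The resulting string still has to be URL-encoded before it is sent as the filter parameter.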
model.language [lang, language]
Alias language, lang
Values Ref. RFC 3066
Default Unspecified
Informs Vespa about the natural language of the query. Please see
linguistics for details.
This attribute should always be set when it is known. If this
parameter is not set, it will be guessed from the query and encoding, and
default to English if it cannot be guessed.
model.queryString [query]
Alias query
Values Any HTTP encoded legal Vespa query language string
Default Not set
The Simple Vespa Query Language query string
specifying which documents to match in this query.
model.restrict [restrict]
Alias restrict
Values A comma delimited list of document type names.
Default Search unrestricted
The document types to restrict the search to when different document
types share the same search cluster.
model.searchPath [path]
Alias searchpath
Values
- searchpath::ELEMENT [';' ELEMENT]*
- ELEMENT::PART ['/' ROW]
- PART::EXP [',' EXP]*
- EXP::NUM | RANGE
- ROW::NUM
- RANGE::'[' NUM ',' NUM '>'
Default Whole cluster
Specification of which path to send the query to.
Used to select which set of search nodes in the cluster should be used.
Only meant for debugging/monitoring.
Note that in an indexed content cluster with flat distribution there is one implicit row,
and each search node represents a part.
Examples:
- '7/3' = part 7, row 3.
- '7/' = part 7, any row.
- '7,1,9/0' = parts 1,7 and 9, row 0.
- '1,[3,9>/0' = parts 1,3,4,5,6,7,8, row 0.
In a cluster with a multi-level dispatch setup we must specify a search path element for each level.
Let's say we have a setup with 2 mid-level dispatch groups, each containing 3 search nodes (and 3 dispatchers):
- '0/;2/' = dispatch group (part) 0, any of the dispatchers (row); search node (part) 2, any row (of 1 present).
- '0/1;2/0' = dispatch group (part) 0, dispatcher (row) 1; search node (part) 2, row 0 (of 1 present).
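To make the searchpath grammar concrete, the following sketch expands a searchpath string into explicit part numbers and rows. It is a lenient illustration only: the function name is hypothetical, malformed input is not rejected, and a missing row is represented as None (any row).

```python
import re

# Matches either a half-open range "[low,high>" or a single number.
_EXP = re.compile(r"\[(\d+),(\d+)>|(\d+)")


def parse_search_path(path):
    """Return one (parts, row) tuple per ';'-separated element.

    parts is a sorted list of part numbers; row is an int, or None
    when no row is given (any row).
    """
    elements = []
    for element in path.split(";"):
        part_spec, _, row_spec = element.partition("/")
        parts = set()
        for match in _EXP.finditer(part_spec):
            if match.group(3) is not None:
                parts.add(int(match.group(3)))
            else:
                # The range includes the lower bound and excludes the upper.
                parts.update(range(int(match.group(1)), int(match.group(2))))
        row = int(row_spec) if row_spec else None
        elements.append((sorted(parts), row))
    return elements


print(parse_search_path("1,[3,9>/0"))
print(parse_search_path("0/1;2/0"))
```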
model.sources [search, sources]
Alias search, sources
Values A comma separated list of search cluster names or other source names
Default Search unrestricted
The names of the sources to search, e.g. one or more search clusters and/or federated sources.
model.type [type]
Alias type
Values web, all, any, phrase, yql, adv (deprecated) -
refer to simple query language reference
Default all
Selects the query language syntax of the query parameter.
Ranking
ranking.location [location]
Alias location
Values See Geo search
Default None
Point (one- or two-dimensional) location to use as base for location ranking.
For geographical locations, it is recommended to add the location using pos.ll.
ranking.features.featurename [rankfeature.featurename]
Alias rankfeature.featurename
Values Any string
Default None
Set a rank feature to a value. This works for any query feature,
that is, any key of the form query(anyname),
and can also be used to override any existing (match or document) feature.
Example: query=foo&ranking.features.query(userage)=42&ranking.features.fieldMatch(title)=0.65
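Feature names contain parentheses, so a client will usually want to URL-encode them. The sketch below builds the example request above with Python's standard library; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def rank_feature_params(features):
    """Prefix each feature name with 'ranking.features.'."""
    return {f"ranking.features.{name}": str(value)
            for name, value in features.items()}


params = {"query": "foo"}
params.update(rank_feature_params({
    "query(userage)": 42,
    "fieldMatch(title)": 0.65,
}))

# urlencode percent-encodes the parentheses in the parameter names.
query_string = urlencode(params)
print(query_string)
```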
ranking.listFeatures [rankfeatures]
Alias rankfeatures
Values boolean
Default false
Set to true to request all rank features to be calculated and returned.
The rank features will be returned in the summary field rankfeatures.
This option is typically used for MLR training and should not be used in production.
ranking.profile [ranking]
Alias ranking
Values Any rank profile name
Default default
Sets the name of the rank profile to use for assigning relevancy scores.
The default rank profile will be used for back-ends which do not have the given rank profile.
ranking.properties.propertyname [rankproperty.propertyname]
Alias rankproperty.propertyname
Values Any string
Default None
Set a rank property that is passed to, and used by a feature executor for this query.
Example: query=foo&ranking.properties.dotProduct.X={a:1,b:2}
ranking.sorting [sorting]
Alias sorting
Values A valid sort specification
Default None - order by relevance
A specification of how to sort the result.
Fields you want to sort on must be stored as document attributes in the index structure
by adding attribute to the indexing statement.
ranking.freshness
Alias
Values [integer], an absolute time in seconds since epoch; now-[integer], to use a time [integer] seconds into the past; or now, to use the current time
Default None - use the current time on each node.
Sets the time which will be used as now during execution.
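The three accepted value forms can be illustrated by resolving them to an absolute time on the client side. This is a sketch of the value syntax only, not of how the search nodes evaluate it; the function name and the optional now argument are assumptions made for the example.

```python
import time


def freshness_epoch(value, now=None):
    """Resolve a ranking.freshness value to seconds since epoch.

    Accepts 'now', 'now-<seconds>' or an absolute integer time.
    """
    current = int(time.time()) if now is None else now
    if value == "now":
        return current
    if value.startswith("now-"):
        seconds = value[len("now-"):]
        if not seconds.isdigit():
            raise ValueError(f"not a valid freshness value: {value!r}")
        return current - int(seconds)
    if not value.isdigit():
        raise ValueError(f"not a valid freshness value: {value!r}")
    return int(value)


print(freshness_epoch("now-3600"))  # one hour before the current time
```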
ranking.queryCache
Alias
Values boolean
Default false
Turns query cache on or off. Search is a two-phase process. If the
query cache is on, the query is stored on the search nodes between the
first and second phase, saving network bandwidth and also query setup
time, at the expense of using more memory.
ranking.matchPhase
Settings which control Vespa's behavior during the match phase.
If these are set in the query they will override any match-phase setting
in the rank profile.
- ranking.matchPhase.maxHits: the max number of hits that should be generated during the match phase
- ranking.matchPhase.attribute: the attribute to limit matches by if more than maxHits hits will be generated
- ranking.matchPhase.ascending: whether to keep the documents having the highest (default) or lowest values of the attribute
- ranking.matchPhase.diversity.attribute: the attribute to use to guarantee diversity
- ranking.matchPhase.diversity.minGroups: the minimum number of groups grouped by the diversity attribute
ranking.matchPhase.maxHits
Alias
Values long
Default If sorting and not ranking: max(10000, maxhits+maxoffset).
Otherwise: none.
The max hits the engine should attempt to produce in the match phase on each partition.
If it is determined during matching that many more hits than this will be generated, the matching will fall back to
take the best (highest or lowest) values of the attribute given by ranking.matchPhase.attribute.
By default, this will be turned on only when sorting is used and grouping is not.
If sorting is used, the primary sort attribute will be used as the match phase attribute if it has fast-search set.
In that case the default can be overridden by setting this value explicitly.
ranking.matchPhase.attribute
Alias
Values An attribute name
Default none
The attribute to decide which documents are a match if the match phase
estimates that there will be more than maxHits matches.
This attribute should have fast-search set and should correlate with the order
which would be produced by a full evaluation.
ranking.matchPhase.ascending
Alias
Values boolean
Default false
Whether the attribute should be sorted in ascending or descending (default) order
to determine which documents to keep as matches.
ranking.matchPhase.diversity.attribute
Alias
Values An attribute name
Default none.
The attribute to be used for producing the desired diversity.
Also see attribute.
ranking.matchPhase.diversity.minGroups
Alias
Values long
Default none
The minimum number of groups that should be returned from the match phase grouped by the diversity attribute.
Also see min-groups.
Presentation
presentation.bolding [bolding]
Alias bolding
Values boolean
Default true
Whether or not to bold search terms in search definition
fields defined with bolding: on or summary: dynamic.
presentation.format [format]
Alias format
Values
- No value or default: the default, builtin JSON format
- json: builtin JSON format
- xml: deprecated, builtin XML format
- page: alternative deprecated XML format which is suitable for use with page templates
- Any other value: a custom result renderer supplied by the application
Default default
presentation.summary [summary]
Alias summary
Values
The name of the summary class
used to select fields in results.
Default The default summary class of the search definition.
presentation.template
Alias
Values Any id specification of a deployed page template.
Default
The id of the page template to use for this result. This should be used with the
page result format.
presentation.timing
Alias
Values boolean
Default false
Whether a result renderer should try to add optional timing information
to the rendered page.
Grouping and Aggregation
select
Alias
Values A valid grouping specification.
Default No grouping
Requests specific multi-level result set statistics and/or hit groups to be returned in the result.
Fields you want to retrieve statistics or hit groups for must be stored as document attributes
in the index structure by adding attribute to the indexing statement.
See the grouping guide.
collapsefield
Alias
Values Any document summary field name
Default No field collapsing
Collapse (i.e. aggregate) results using this field.
Collapsing is run in the container, not content node level.
Define a collapsefield to remove duplicates if the corpus has few duplicates;
this is more efficient than using grouping.
Otherwise, use grouping.
collapsesize
Alias
Values A positive integer
Default 1
The number of hits to keep in each collapsed bucket
collapse.summary
Alias
Values A valid name of a document summary class.
Default Use default summary or attributes.
Use this summary class to fetch the field used for collapsing.
Geographical Searches
pos.ll
Alias
Values
Position given in latitude and longitude - example: S22.4532;W123.9887
Refer to position field
for format specification.
Default None
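A value in the style of the example above can be produced from signed decimal degrees as sketched below. The function name is hypothetical, and only the hemisphere-prefixed decimal form shown in the example is covered; consult the position field reference for the full format.

```python
def pos_ll(latitude, longitude):
    """Format signed decimal degrees as a hemisphere-prefixed pos.ll value.

    Negative latitude maps to S, negative longitude to W.
    """
    ns = "N" if latitude >= 0 else "S"
    ew = "E" if longitude >= 0 else "W"
    return f"{ns}{abs(latitude)};{ew}{abs(longitude)}"


print(pos_ll(-22.4532, -123.9887))
```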
pos.radius
Alias
Values
Radius of the circle used for filtering. Valid units of measurement are km, m and mi. Examples:
- pos.radius=100m
- pos.radius=42mi
- pos.radius=4km
One can also specify just a number (internal units, micro-degrees), but this is not recommended.
Default 50km
pos.bb
Alias
Values
Bounding box for positions, given as latitude and longitude boundaries.
The four boundaries must be specified as N, S, E, W, with degrees as
a decimal fraction. Degrees south of equator or west of Greenwich are
input as negative numbers. Examples:
- n=37.44899,s=37.3323,e=-121.98241,w=-122.06566
- s=40.183868,w=-74.819519,n=40.248291,e=-74.728798
Default None
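The bounding box syntax can be sketched with a pair of helpers that format and parse such a value. The function names are hypothetical, and the check that south does not exceed north is an assumption added for this example.

```python
def bounding_box(north, south, east, west):
    """Format four boundaries, in decimal degrees, as a pos.bb value."""
    if south > north:
        raise ValueError("south boundary must not exceed north boundary")
    return f"n={north},s={south},e={east},w={west}"


def parse_bounding_box(spec):
    """Parse a pos.bb value into a dict; the boundaries may come in any order."""
    box = {}
    for item in spec.split(","):
        key, _, value = item.partition("=")
        box[key.strip()] = float(value)
    if set(box) != {"n", "s", "e", "w"}:
        raise ValueError("expected exactly the boundaries n, s, e and w")
    return box


print(bounding_box(37.44899, 37.3323, -121.98241, -122.06566))
print(parse_bounding_box("s=40.183868,w=-74.819519,n=40.248291,e=-74.728798"))
```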
pos.attribute
Alias
Values Any attribute that has zcurve encoded positions as a long attribute.
Default Random choice among the ones declared as position in the search definition.
Which attribute to use for the position. Can be either single- or multi-value.
Streaming Search
The features in this section apply to streaming search only.
streaming.userid
Alias
Values An integer in decimal notation in the range [0, 2^64>
Default None
Restricts streaming search to only stream through documents whose document id has the n=<number>
modifier and a userid part that matches the supplied value. This can be used for grouping documents on a 64-bit integer.
streaming.groupname
Alias
Values A string
Default None
Restricts streaming search to only stream through documents whose document id has the g=<groupname>
modifier and a groupname part that matches the supplied value. This can be used for grouping documents on a string.
streaming.selection
Alias
Values A string
Default None
Restricts streaming search using a document selection.
This can be used for selecting a subset of documents based on an advanced expression.
streaming.priority
Alias
Values Priority class
Default VERY_HIGH
Priority of the streaming search visitor. Having a high priority visitor helps maintain low latencies
even when the system is under load.
streaming.maxbucketspervisitor
Alias
Values int
Default 1 (if ordering is set), or infinite
If set, visit only this many buckets at a time.
Combine with ordering to reduce visiting time for large users/groups.
Semantic Rules
Refer to semantic rules.
rules.off
Alias
Values Boolean
Default True
Turn rule evaluation off for this query
rules.rulebase
Alias
Values String
Default A rule base name
The name of the rule base to use for these queries
tracelevel.rules
Alias
Values int
Default 1-5 (?)
The amount of rule evaluation trace output to show, higher number means more details.
This is useful to see a trace from rule evaluation
without having to see trace from all other searchers at the same time.
Other
recall
Alias
Values Any allowed collection of recall terms
Default No recall
Sets a recall parameter to be combined with the query.
This is identical to filter,
except that recall terms are not exposed to the ranking framework and thus not ranked.
As such, one cannot use unprefixed terms; they must be either positive or negative.
user
Alias
Values A string
Default None
The id of the user making the query. The contents of the argument are made available to the search chain,
but it triggers no features in Vespa apart from being propagated to the access log.
nocachewrite
Alias
Values Boolean
Default False
Set to true to avoid the result being written to cache when fetched.
hitcountestimate
Alias
Values Boolean
Default False
Make this an estimation query.
No hits will be returned, and total hit count will be set to an estimate of what executing
the query as a normal query would give.
metrics.ignore
Alias
Values Boolean
Default False
Ignore metric collection for this query request. This is useful for warm-up queries.