
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Benchmarking Lucene By Tasks
*
* This package provides "task based" performance benchmarking of Lucene. One can use the
* predefined benchmarks, or create new ones.
*
*
 * Contained packages:
 *
 * - stats: Statistics maintained when running benchmark tasks.
 * - tasks: Benchmark tasks.
 * - feeds: Sources for benchmark inputs: documents and queries.
 * - utils: Utilities used for the benchmark, and for the reports.
 * - programmatic: Sample performance test written programmatically.
*
 * Table Of Contents
 *
 * - Benchmarking By Tasks
 * - How to use
 * - Benchmark "algorithm"
 * - Supported tasks/commands
 * - Benchmark properties
 * - Example input algorithm and the resulting benchmark report
 * - Results record counting clarified
*
* Benchmarking By Tasks
*
* Benchmark Lucene using task primitives.
*
*
 * A benchmark is composed of some predefined tasks, allowing for creating an index, adding
* documents, optimizing, searching, generating reports, and more. A benchmark run takes an
* "algorithm" file that contains a description of the sequence of tasks making up the run, and some
* properties defining a few additional characteristics of the benchmark run.
*
*
 * How to use
 *
 * The easiest way to run a benchmark is using gradle:
*
*
 * - ./gradlew -p lucene/benchmark getReuters run
 *   would run the micro-standard.alg "algorithm".
 * - ./gradlew -p lucene/benchmark getReuters run -Ptask.alg=conf/compound-penalty.alg
 *   would run the compound-penalty.alg "algorithm".
 * - ./gradlew -p lucene/benchmark getReuters run -Ptask.alg=[full-path-to-your-alg-file]
 *   would run your perf test "algorithm".
 * - java org.apache.lucene.benchmark.byTask.programmatic.Sample
 *   would run a performance test programmatically - without using an alg file. This is less
 *   readable, and less convenient, but possible.
*
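 *
 * If you prefer to drive an algorithm file from your own code instead of gradle, a minimal
 * sketch could look like the following. It assumes the alg path conf/micro-standard.alg relative
 * to the working directory, and relies on the Benchmark(Reader) constructor and its execute()
 * method:
 *
 * import java.io.Reader;
 * import java.nio.charset.StandardCharsets;
 * import java.nio.file.Files;
 * import java.nio.file.Paths;
 *
 * import org.apache.lucene.benchmark.byTask.Benchmark;
 *
 * public class RunAlg {
 *   public static void main(String[] args) throws Exception {
 *     // Parse the algorithm file and execute it; any reports are produced by the
 *     // report tasks contained in the algorithm itself.
 *     try (Reader algReader =
 *         Files.newBufferedReader(Paths.get("conf/micro-standard.alg"), StandardCharsets.UTF_8)) {
 *       new Benchmark(algReader).execute();
 *     }
 *   }
 * }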
*
 * You may find the existing tasks sufficient for defining the benchmark you need; otherwise,
 * you can extend the framework to meet your needs, as explained herein.
*
*
 * Each benchmark run has a DocMaker and a QueryMaker. These two should usually match, so that
* "meaningful" queries are used for a certain collection. Properties set at the header of the alg
* file define which "makers" should be used. You can also specify your own makers, extending
* DocMaker and implementing QueryMaker.
*
*
*
* Note: since 2.9, DocMaker is a concrete class which accepts a ContentSource. In most
* cases, you can use the DocMaker class to create Documents, while providing your own ContentSource
* implementation. For example, the current Benchmark package includes ContentSource implementations
* for TREC, Enwiki and Reuters collections, as well as others like LineDocSource which reads a
* 'line' file produced by WriteLineDocTask.
*
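 * For example, a few header lines pairing the stock DocMaker with LineDocSource might look as
 * follows (the line-file path is just an illustrative value):
 *
 * content.source=org.apache.lucene.benchmark.byTask.feeds.LineDocSource
 * docs.file=work/enwiki.lines.txt
 * doc.maker=org.apache.lucene.benchmark.byTask.feeds.DocMaker
 *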
*
*
 * The benchmark .alg file contains the benchmark "algorithm". The syntax is described below. Within
* the algorithm, you can specify groups of commands, assign them names, specify commands that
* should be repeated, do commands in serial or in parallel, and also control the speed of "firing"
* the commands.
*
*
 * This allows, for instance, specifying that an index should be opened for update, documents
 * should be added to it one by one but no faster than 20 docs a minute, and, in parallel with
 * this, some N queries should be searched against that index, again, no more than 2 queries a
 * second. You can have the searches all share an index reader, or have each of them open its own
 * reader and close it afterwards.
*
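 * An illustrative fragment of such an algorithm (names, counts and rates are arbitrary) might be:
 *
 * OpenIndex
 * [
 *   { "AddAtRate"    AddDoc } : 200 : 20/min
 *   { "SearchAtRate" Search } : 50  : 2/sec
 * ]
 * CloseIndex
 *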
*
 * If the commands available for use in the algorithm do not meet your needs, you can add
* commands by adding a new task under org.apache.lucene.benchmark.byTask.tasks - you should extend
* the PerfTask abstract class. Make sure that your new task class name is suffixed by Task. Assume
* you added the class "WonderfulTask" - doing so also enables the command "Wonderful" to be used in
* the algorithm.
*
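 * A skeletal sketch of such a task (the class and its body are purely illustrative) might be:
 *
 * package org.apache.lucene.benchmark.byTask.tasks;
 *
 * import org.apache.lucene.benchmark.byTask.PerfRunData;
 *
 * public class WonderfulTask extends PerfTask {
 *
 *   public WonderfulTask(PerfRunData runData) {
 *     super(runData);
 *   }
 *
 *   // Perform the actual work of the task; the returned value is the number of
 *   // "work items" done here, and it contributes to the records count.
 *   public int doLogic() throws Exception {
 *     // ... do something wonderful ...
 *     return 1;
 *   }
 * }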
*
 * External classes: It is sometimes useful to invoke the benchmark package with your
 * external alg file that configures the use of your own doc/query maker and/or HTML parser. You can
* work this out without modifying the benchmark package code, by passing your class path with the
* benchmark.ext.classpath property:
*
*
* - ./gradlew -p lucene/benchmark run -Ptask.alg=[full-path-to-your-alg-file] -Dbenchmark.ext.classpath=/mydir/classes -Dtask.mem=512M
*
*
 * External tasks: When writing your own tasks under a package other than
 * org.apache.lucene.benchmark.byTask.tasks, specify that package through the alt.tasks.packages property.
*
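 * For example (the package name below is just a placeholder for your own):
 *
 * alt.tasks.packages=com.example.mybench.tasks
 *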
*
Benchmark "algorithm"
*
* The following is an informal description of the supported syntax.
*
*
* - Measuring: When a command is executed, statistics for the elapsed execution time and
* memory consumption are collected. At any time, those statistics can be printed, using one
* of the available ReportTasks.
 *
 * - Comments start with '#'.
 * - Serial sequences are enclosed within '{ }'.
 * - Parallel sequences are enclosed within '[ ]'.
 * - Sequence naming: To name a sequence, put '"name"' just after '{' or '['.
* Example - { "ManyAdds" AddDoc } : 1000000 - would name
* the sequence of 1M add docs "ManyAdds", and this name would later appear in statistic
* reports. If you don't specify a name for a sequence, it is given one: you can see it as the
* algorithm is printed just before benchmark execution starts.
 * - Repeating: To repeat sequence tasks N times, add ': N' just after the sequence
 * closing tag - '}' or ']' or '>'.
* Example - [ AddDoc ] : 4 - would do 4 addDoc in
* parallel, spawning 4 threads at once.
* Example - [ AddDoc AddDoc ] : 4 - would do 8 addDoc in
* parallel, spawning 8 threads at once.
* Example - { AddDoc } : 30 - would do addDoc 30 times in
* a row.
* Example - { AddDoc AddDoc } : 30 - would do addDoc 60
* times in a row.
* Exhaustive repeating: use * instead of a number
* to repeat exhaustively. This is sometimes useful, for adding as many files as a doc maker
* can create, without iterating over the same file again, especially when the exact number of
* documents is not known in advance. For instance, TREC files extracted from a zip file.
* Note: when using this, you must also set content.source.forever to false.
* Example - { AddDoc } : * - would add docs until the doc
* maker is "exhausted".
 * - Command parameter: a command can optionally take a single parameter. If a
 * command does not support a parameter, or if the parameter is of the wrong type, reading the
 * algorithm will fail with an exception and the test will not start. Currently the following
 * tasks take optional parameters:
 *
 * - AddDoc takes a numeric parameter, indicating the required size of the added
 * document. Note: if the DocMaker implementation used in the test does not support
 * makeDoc(size), an exception would be thrown and the test would fail.
 * - DeleteDoc takes a numeric parameter, indicating the docid to be deleted. The
 * latter is not very useful for loops, since the docid is fixed, so for deletion in
 * loops it is better to use the doc.delete.step property.
 * - SetProp takes a mandatory name,value parameter, with ',' used as the separator.
 * - SearchTravRetTask and SearchTravTask take a numeric parameter,
 * indicating the required traversal size.
 * - SearchTravRetLoadFieldSelectorTask takes a string parameter: a comma separated
 * list of Fields to load.
 * - SearchTravRetHighlighterTask takes a string parameter: a comma separated list
 * of parameters to define highlighting. See that task's javadocs for more information.
*
*
* Example - AddDoc(2000) - would add a document of size
* 2000 (~bytes).
* See conf/task-sample.alg for how this can be used, for instance, to check which is faster,
* adding many smaller documents, or few larger documents. Next candidates for supporting a
* parameter may be the Search tasks, for controlling the query size.
 * - Statistic recording elimination: a sequence can also end with '>', in which case its child
 * tasks would not store their statistics. This can be useful to avoid exploding stats data,
 * when adding, say, 1M docs.
 * Example - { "ManyAdds" AddDoc > : 1000000 - would
 * add a million docs, measure that total, but not save stats for each addDoc.
 * Notice that the granularity of System.currentTimeMillis() (which is used here) is system
 * dependent, and on some systems an operation that takes 5 ms to complete may show 0 ms
* latency time in performance measurements. Therefore it is sometimes more accurate to look
* at the elapsed time of a larger sequence, as demonstrated here.
 * - Rate: To set a rate (ops/sec or ops/min) for a sequence, add ': N : R' just after
 * the sequence closing tag. This would specify a repetition of N with a rate of R operations/sec.
 * Use 'R/sec' or 'R/min' to explicitly specify whether the rate is per second or
 * per minute. The default is per second.
* Example - [ AddDoc ] : 400 : 3 - would do 400 addDoc in
* parallel, starting up to 3 threads per second.
* Example - { AddDoc } : 100 : 200/min - would do 100
* addDoc serially, waiting before starting next add, if otherwise rate would exceed 200
* adds/min.
* - Disable Counting: Each task executed contributes to the records count. This count is
* reflected in reports under recs/s and under recsPerRun. Most tasks count 1, some count 0,
* and some count more. (See Results record counting clarified for
* more details.) It is possible to disable counting for a task by preceding it with -.
* Example - -CreateIndex - would count 0 while the
* default behavior for CreateIndex is to count 1.
* - Command names: Each class "AnyNameTask" in the package
* org.apache.lucene.benchmark.byTask.tasks, that extends PerfTask, is supported as command
* "AnyName" that can be used in the benchmark "algorithm" description. This allows to add new
* commands by just adding such classes.
*
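 * Putting several of these constructs together, an illustrative (not prescriptive) fragment
 * could read:
 *
 * ResetSystemErase
 * -CreateIndex
 * { "BuildIndex" AddDoc > : 20000 : 500/sec
 * ForceMerge(1)
 * CloseIndex
 * RepSumByName
 *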
*
*
*
* Supported tasks/commands
*
* Existing tasks can be divided into a few groups: regular index/search work tasks, report
* tasks, and control tasks.
*
*
* - Report tasks: There are a few Report commands for generating reports. Only task runs
* that were completed are reported. (The 'Report tasks' themselves are not measured and not
* reported.)
 *
 * - RepAll - all (completed) task runs.
 * - RepSumByName - all statistics, aggregated by
 * name. So, if AddDoc was executed 2000 times, only 1 report line would be created for
 * it, aggregating all those 2000 statistic records.
 * - RepSelectByPref prefixWord - all records
 * for tasks whose name starts with prefixWord.
 * - RepSumByPref prefixWord - all records for
 * tasks whose name starts with prefixWord,
 * aggregated by their full task name.
 * - RepSumByNameRound - all statistics, aggregated by
 * name and by Round. So, if AddDoc was executed
 * 2000 times in each of 3 rounds, 3 report lines
 * would be created for it, aggregating all those 2000 statistic records in each round.
 * See more about rounds in the NewRound command
 * description below.
 * - RepSumByPrefRound prefixWord - similar to
 * RepSumByNameRound, just that only tasks whose
 * name starts with prefixWord are included.
 *
* If needed, additional reports can be added by extending the abstract class ReportTask, and
* by manipulating the statistics data in Points and TaskStats.
 * - Control tasks: A few of the tasks control the benchmark algorithm overall:
*
* - ClearStats - clears the entire statistics.
* Further reports would only include task runs that would start after this call.
*
 * - NewRound - virtually start a new round of
* performance test. Although this command can be placed anywhere, it mostly makes sense
* at the end of an outermost sequence.
* This increments a global "round counter". All task runs that would start now would
* record the new, updated round counter as their round number. This would appear in
* reports. In particular, see RepSumByNameRound
* above.
 * An additional effect of NewRound is that numeric and boolean properties defined (at
 * the head of the .alg file) as a sequence of values, e.g. merge.factor=mrg:10:100:10:100, would increment (cyclically) to the next
 * value. Note: this would also be reflected in the reports, in this case under a column
 * that would be named "mrg".
 * - ResetInputs - DocMaker and the various
 * QueryMakers would reset their counters to start. The way these Maker interfaces work,
 * each call for makeDocument() or makeQuery() creates the next document or query that
 * it "knows" to create. If that pool is "exhausted", the "maker" starts over again. The
 * ResetInputs command therefore allows making the rounds comparable. It is therefore
 * useful to invoke ResetInputs together with NewRound.
*
 * - ResetSystemErase - reset all index and input data
 * and call gc. Does NOT reset statistics. This contains ResetInputs. All
 * writers/readers are nullified, deleted, closed. Index is erased. Directory is erased.
 * You would have to call CreateIndex once this was called.
*
 * - ResetSystemSoft - reset all index and input data
 * and call gc. Does NOT reset statistics. This contains ResetInputs. All
 * writers/readers are nullified, closed. Index is NOT erased. Directory is NOT erased.
 * This is useful for testing performance on an existing index, for instance if the
 * construction of a large index took a very long time and now you would like to test its
 * search or update performance.
*
 * - Other existing tasks are quite straightforward and are just briefly described here.
*
* - CreateIndex and OpenIndex both leave the index open for later update operations.
* CloseIndex would close it.
*
 * - OpenReader, similarly, would leave an index
 * reader open for later search operations. But this has further semantics. If a Read
 * operation is performed, and an open reader exists, it would be used. Otherwise, the
 * read operation would open its own reader and close it when the read operation is
 * done. This allows testing various scenarios - sharing a reader, searching with a "cold"
 * reader, with a "warmed" reader, etc. (see the short fragment after this list). The read
 * operations affected by this are: Warm, Search, SearchTrav (search and traverse), and
 * SearchTravRet (search and traverse and retrieve).
 * Notice that each of the 3 search task types maintains its own queryMaker instance.
*
 * - CommitIndex and ForceMerge can be used to commit changes to the index then merge the
* index segments. The integer parameter specifies how many segments to merge down to
* (default 1).
*
 * - WriteLineDoc prepares a 'line' file where each
 * line holds a document with title, date and body elements,
 * separated by [TAB]. A line file is useful if one wants to measure pure indexing
 * performance, without the overhead of parsing the data.
 * You can use LineDocSource as a ContentSource over a 'line' file (see the fragment after this list).
 * - ConsumeContentSource consumes a ContentSource.
 * Useful, e.g., for testing a ContentSource's performance, without the overhead of
 * preparing a Document out of it.
*
*
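 * The reader-sharing behavior mentioned above can be exercised with a fragment like the
 * following (sequence names are arbitrary):
 *
 * # all searches share the reader opened here
 * OpenReader
 * { "SrchSameRdr" Search } : 500
 * CloseReader
 *
 * # each of these searches opens (and closes) its own reader
 * { "SrchNewRdr" Search } : 50
 *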
*
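 * Likewise, a tiny algorithm that turns the Reuters collection into a line file could look like
 * this (the output path is illustrative; line.file.out names the file to write):
 *
 * content.source=org.apache.lucene.benchmark.byTask.feeds.ReutersContentSource
 * content.source.forever=false
 * line.file.out=work/reuters.lines.txt
 *
 * { WriteLineDoc } : *
 *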
*
*
* Benchmark properties
*
* Properties are read from the header of the .alg file, and define several parameters of the
* performance test. As mentioned above for the NewRound task,
 * numeric and boolean properties that are defined as a sequence of values, e.g. merge.factor=mrg:10:100:10:100, would increment (cyclically) to the next value when
* NewRound is called, and would also appear as a named column in the reports (column name would be
* "mrg" in this example).
*
 *
 * Some of the currently defined properties are:
*
*
* - analyzer - full class name for the analyzer to use.
* Same analyzer would be used in the entire test.
 *
 * - directory - which directory implementation to use for the performance test.
 * - Index work parameters: Multi int/boolean values would be iterated with calls to
 * NewRound. They would also be added as columns in the reports; the first string in the sequence
 * is the column name. (Make sure it is no shorter than any value in the sequence.)
*
* - max.buffered
* Example: max.buffered=buf:10:10:100:100 - this would define using maxBufferedDocs of
* 10 in iterations 0 and 1, and 100 in iterations 2 and 3.
* - merge.factor - which merge factor to use.
*
 * - compound - whether the index is using the
* compound format or not. Valid values are "true" and "false".
*
*
*
* Here is a list of currently defined properties:
*
*
 * - Root directory for data and indexes:
 *   - work.dir (default is System property "benchmark.work.dir" or "work".)
 * - Docs and queries creation:
 *   - analyzer
 *   - doc.maker
 *   - content.source.forever
 *   - html.parser
 *   - doc.stored
 *   - doc.tokenized
 *   - doc.term.vector
 *   - doc.term.vector.positions
 *   - doc.term.vector.offsets
 *   - doc.store.body.bytes
 *   - docs.dir
 *   - query.maker
 *   - file.query.maker.file
 *   - file.query.maker.default.field
 *   - search.num.hits
 * - Logging:
 *   - log.step
 *   - log.step.[class name]Task, e.g. log.step.DeleteDoc (or log.step.Wonderful for the
 *     WonderfulTask example above)
 *   - log.queries
 *   - task.max.depth.log
 * - Index writing:
 *   - compound
 *   - merge.factor
 *   - max.buffered
 *   - directory
 *   - ram.flush.mb
 *   - codec.postingsFormat (e.g. Direct). Note: no codec should be specified through
 *     default.codec.
 * - Doc deletion:
 *   - doc.delete.step
 * - Spatial: Numerous; see spatial.alg
 * - Task alternative packages:
 *   - alt.tasks.packages - comma separated list of additional packages where task classes
 *     will be looked for when not found in the default package (that of PerfTask). If the
 *     same task class appears in more than one package, the package indicated first in this
 *     list will be used.
*
*
*
* For sample use of these properties see the *.alg files under conf.
*
*
 * Example input algorithm and the resulting benchmark report
*
* The following example is in conf/sample.alg:
*
*
* # --------------------------------------------------------
* #
* # Sample: what is the effect of doc size on indexing time?
* #
* # There are two parts in this test:
* # - PopulateShort adds 2N documents of length L
* # - PopulateLong adds N documents of length 2L
* # Which one would be faster?
* # The comparison is done twice.
* #
* # --------------------------------------------------------
*
* # -------------------------------------------------------------------------------------
* # multi val params are iterated by NewRound's, added to reports, start with column name.
* merge.factor=mrg:10:20
* max.buffered=buf:100:1000
* compound=true
*
* analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
* directory=FSDirectory
*
* doc.stored=true
* doc.tokenized=true
* doc.term.vector=false
* doc.add.log.step=500
*
* docs.dir=reuters-out
*
* doc.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleDocMaker
*
* query.maker=org.apache.lucene.benchmark.byTask.feeds.SimpleQueryMaker
*
* # task at this depth or less would print when they start
* task.max.depth.log=2
*
* log.queries=false
* # -------------------------------------------------------------------------------------
* {
*
* { "PopulateShort"
* CreateIndex
* { AddDoc(4000) > : 20000
* Optimize
* CloseIndex
* >
*
* ResetSystemErase
*
* { "PopulateLong"
* CreateIndex
* { AddDoc(8000) > : 10000
* Optimize
* CloseIndex
* >
*
* ResetSystemErase
*
* NewRound
*
* } : 2
*
* RepSumByName
* RepSelectByPref Populate
*
*
*
* The command line for running this sample:
* ./gradlew -p lucene/benchmark getReuters run -Ptask.alg=conf/sample.alg
*
*
 * The output report from running this test contains the following:
*
*
 * Operation     round mrg  buf   runCnt   recsPerRun        rec/s  elapsedSec    avgUsedMem    avgTotalMem
 * PopulateShort     0  10  100        1        20003        119.6      167.26    12,959,120     14,241,792
 * PopulateLong -  - 0  10  100 -  -   1 -  -   10003 -  -  -  74.3 -  -  134.57 -  17,085,208 -   20,635,648
 * PopulateShort     1  20 1000        1        20003        143.5      139.39    63,982,040     94,756,864
 * PopulateLong -  - 1  20 1000 -  -   1 -  -   10003 -  -  -  77.0 -  -  129.92 -  87,309,608 -  100,831,232
*
*
*
*
* Results record counting clarified
*
 * Two columns in the results table indicate record counts: records-per-run and
 * records-per-second. What do they mean?
*
*
 * Almost every task gets 1 in this count just for being executed. Task sequences aggregate the
* counts of their child tasks, plus their own count of 1. So, a task sequence containing 5 other
* task sequences, each running a single other task 10 times, would have a count of 1 + 5 * (1 + 10)
* = 56.
*
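 * As an illustrative fragment, the outer sequence below would contribute 1 + 5 * (1 + 10) = 56
 * to recsPerRun:
 *
 * {
 *   { AddDoc } : 10
 *   { AddDoc } : 10
 *   { AddDoc } : 10
 *   { AddDoc } : 10
 *   { AddDoc } : 10
 * }
 *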
*
 * The traverse and retrieve tasks "count" more: a traverse task would add 1 for each traversed
* result (hit), and a retrieve task would additionally add 1 for each retrieved doc. So, regular
* Search would count 1, SearchTrav that traverses 10 hits would count 11, and a SearchTravRet task
* that retrieves (and traverses) 10, would count 21.
*
*
 * Confusing? This might help: always examine the elapsedSec column, and always
 * compare "apples to apples", i.e. it is interesting to check how the rec/s
 * changed for the same task (or sequence) between two different runs, but it is not very useful
 * to know how the rec/s differs between Search and SearchTrav tasks. For
 * the latter, elapsedSec would bring more insight.
*/
package org.apache.lucene.benchmark.byTask;