An example job definition for a parameter grid search evaluating search metrics against a given endpoint
is provided within the scripts folder. The definition is contained in the file testSearchEval.json,
which can be sent to the respective Kolibri endpoints (see start_searcheval.sh).
Where the response is written is configured via properties/env variables
(see the respective part of the documentation).
A simpler way is to start up the app along with the UI (Kolibri Watch, see the respective section of this doc),
navigate to the CREATE menu, select the search evaluation type and choose a job execution definition template.
From this UI you can directly edit an existing template, save it and start the execution.
Note that the execution below makes a few assumptions for the processing to work.
Let's have a look at the definition and then define the meaning of the distinct sections of the used json. You'll see something like this:
```json
{
  "jobName": "testJob",
  "fixedParams": {
    "k1": ["v1", "v2"],
    "k2": ["v3"]
  },
  "contextPath": "search",
  "connections": [
    {"host": "search-service", "port": 80, "useHttps": false},
    {"host": "search-service1", "port": 81, "useHttps": false}
  ],
  "requestPermutation": [
    {
      "type": "ALL",
      "value": {
        "params": {
          "type": "FROM_FILES_LINES",
          "values": {
            "q": "/app/test-files/test-paramfiles/test_queries.txt"
          }
        }
      }
    },
    {
      "type": "ALL",
      "value": {
        "params": {
          "type": "GRID_FROM_VALUES_SEQ",
          "values": [
            {"name": "a1", "values": [0.45, 0.32]},
            {"name": "o", "start": 0.0, "end": 2000.0, "stepSize": 1.0}
          ]
        }
      }
    }
  ],
  "batchByIndex": 0,
  "parsingConfig": {
    "singleSelectors": [],
    "seqSelectors": [
      {
        "name": "productIds",
        "castType": "STRING",
        "selector": {
          "type": "PLAINREC",
          "path": "\\ response \\ docs \\\\ product_id"
        }
      }
    ]
  },
  "excludeParamsFromMetricRow": ["q"],
  "taggingConfiguration": {
    "initTagger": {
      "type": "REQUEST_PARAMETER",
      "parameter": "q",
      "extend": false
    },
    "processedTagger": {"type": "NOTHING"},
    "resultTagger": {"type": "NOTHING"}
  },
  "requestTemplateStorageKey": "requestTemplate",
  "mapFutureMetricRowCalculation": {
    "functionType": "IR_METRICS",
    "name": "irMetrics",
    "queryParamName": "q",
    "requestTemplateKey": "requestTemplate",
    "productIdsKey": "productIds",
    "judgementProvider": {
      "type": "FILE_BASED",
      "filename": "/app/test-files/test-judgements/test_judgements.txt"
    },
    "metricsCalculation": {
      "metrics": [
        {"name": "DCG_10", "function": {"type": "DCG", "k": 10}},
        {"name": "NDCG_10", "function": {"type": "NDCG", "k": 10}},
        {"name": "PRECISION_4", "function": {"type": "PRECISION", "k": 4, "threshold": 0.1}},
        {"name": "ERR_10", "function": {"type": "ERR", "k": 10}}
      ],
      "judgementHandling": {
        "validations": ["EXIST_RESULTS", "EXIST_JUDGEMENTS"],
        "handling": "AS_ZEROS"
      }
    },
    "excludeParams": ["q"]
  },
  "singleMapCalculations": [],
  "allowedTimePerElementInMillis": 1000,
  "allowedTimePerBatchInSeconds": 6000,
  "allowedTimeForJobInSeconds": 720000,
  "expectResultsFromBatchCalculations": false,
  "wrapUpFunction": {
    "type": "AGGREGATE_FROM_DIR_BY_REGEX",
    "weightProvider": {"type": "CONSTANT", "weight": 1.0},
    "regex": "[(]q=.+[)]",
    "outputFilename": "(ALL1)",
    "readSubDir": "testJob",
    "writeSubDir": "testJob"
  }
}
```
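To illustrate how such a definition expands into requests, here is a minimal sketch (illustrative Python, not Kolibri code; the file-based query values are replaced by stand-in literals, and the `o` range is truncated to three steps):

```python
# Conceptual sketch of request permutation and batching. All names and
# stand-in values are illustrative, not taken from Kolibri's codebase.
from itertools import product

# fixedParams: appended to every request
fixed_params = {"k1": ["v1", "v2"], "k2": ["v3"]}

# FROM_FILES_LINES: one modifier per line of the query file (stand-ins here)
queries = ["q1", "q2"]

# GRID_FROM_VALUES_SEQ: explicit values for a1, a start/end/stepSize range for o
a1_values = [0.45, 0.32]
o_values = [0.0 + i * 1.0 for i in range(3)]  # truncated range for illustration

# full permutation: every query combined with every grid point
permutations = [
    {"q": q, "a1": a1, "o": o}
    for q, a1, o in product(queries, a1_values, o_values)
]

# batchByIndex = 0: batch by the first generator, i.e. one batch per query value
batches = {q: [p for p in permutations if p["q"] == q] for q in queries}
```

With two queries, two `a1` values and three `o` values this yields twelve parameter combinations, split into one batch per query since the query generator sits at index 0.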
The above, on posting to the search_eval_no_ser endpoint, is parsed into a JobMessages.SearchEvaluation instance.
Within Kolibri, the parsing of sent data utilizes the spray-json library, and all types except base types need
a JsonFormat definition that specifies how a passed json is transformed into the specific object and how an object
is transformed back into its string representation. Those definitions are always found within the `de.awagen.kolibri-[datatypes/base].io.json` package
and carry the suffix JsonProtocol. More details on this in the follow-up sections.
The actual SearchEvaluation message case class looks like this:
```scala
case class SearchEvaluation(jobName: String,
                            fixedParams: Map[String, Seq[String]],
                            contextPath: String,
                            connections: Seq[Connection],
                            requestPermutation: Seq[ModifierGeneratorProvider],
                            batchByIndex: Int,
                            parsingConfig: ParsingConfig,
                            excludeParamsFromMetricRow: Seq[String],
                            requestTemplateStorageKey: String,
                            mapFutureMetricRowCalculation: FutureCalculation[WeaklyTypedMap[String], MetricRow],
                            singleMapCalculations: Seq[Calculation[WeaklyTypedMap[String], CalculationResult[Double]]],
                            taggingConfiguration: Option[BaseTaggingConfiguration[RequestTemplate,
                              (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate), MetricRow]],
                            wrapUpFunction: Option[JobWrapUpFunction[Unit]],
                            allowedTimePerElementInMillis: Int = 1000,
                            allowedTimePerBatchInSeconds: Int = 600,
                            allowedTimeForJobInSeconds: Int = 7200,
                            expectResultsFromBatchCalculations: Boolean = true) extends JobMessage
```
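Kolibri's JsonProtocol definitions are spray-json formats in Scala; purely to illustrate the read/write idea behind a JsonFormat, here is a hypothetical Python analogue for the Connection type (field names taken from the json above, everything else illustrative):

```python
# Illustrative sketch of the JsonFormat idea: a format knows how to read a
# json document into a typed object and how to write it back. This is NOT
# Kolibri/spray-json code, just an analogue of the concept.
import json
from dataclasses import dataclass

@dataclass
class Connection:
    host: str
    port: int
    use_https: bool

class ConnectionFormat:
    @staticmethod
    def read(d: dict) -> Connection:
        # json -> object direction
        return Connection(d["host"], d["port"], d["useHttps"])

    @staticmethod
    def write(c: Connection) -> dict:
        # object -> json direction
        return {"host": c.host, "port": c.port, "useHttps": c.use_https}

raw = json.loads('{"host": "search-service", "port": 80, "useHttps": false}')
conn = ConnectionFormat.read(raw)
```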
Let's summarize what the distinct attributes are used for:

Name: Type | What for? |
---|---|
jobName: String | job name for the execution. If execution with same jobName is running, the request to start another one with the same name will be denied. |
fixedParams: Map[String, Seq[String]] | Parameter name/values mapping for parameters that won't change between requests. |
contextPath: String | Context path for the requests. The host settings where to send those requests to is defined within connections. |
connections: Seq[Connection] | Single or multiple connections against which the requests shall be sent. A connection holds host, port, flag whether to use https or http and optional credentials. |
requestPermutation: Seq[ModifierGeneratorProvider] | Single or multiple ModifierGeneratorProvider. Each of those providers offers methods to retrieve the Seq of generators of modifiers of RequestTemplateBuilders, or a partitioning, which is a generator of generators of the mentioned modifiers. For more detail see later sections. |
batchByIndex: Int | Index (0-based) defining which generator of modifiers to batch by. E.g., in the above example specification, setting this value to 0 batches by the generator that produces the modifiers corresponding to the single query-parameter values, since it is the first one in the definition, thus index 0. |
parsingConfig: ParsingConfig | The parsing configuration defining which values to extract as what data type and under which key to place into the result map. The result map can then be utilized to derive metrics / tags or similar. |
excludeParamsFromMetricRow: Seq[String] | Gives the parameter names that shall not be part of the parameter set in the aggregation result (MetricRow[Double]). For the same given tags, results are aggregated per set of parameters; thus if this shall not happen on the per-query level, or if the overall aggregation shall aggregate values over multiple queries, the query parameter should be added here. If granularity on the per-query level is needed, this should be reflected in the tag attached to the result instead (for more details on tagging see later sections of the doc). |
requestTemplateStorageKey: String | This simply defines an arbitrary storage key used to put the request template in the result map for further reference down the processing chain. |
mapFutureMetricRowCalculation: FutureCalculation[WeaklyTypedMap[String], MetricRow] | Definition of the MetricRow calculation based on WeaklyTypedMap[String], yielding a Future result due to additional steps involved such as loading the judgements. |
singleMapCalculations: Seq[Calculation[WeaklyTypedMap[String], CalculationResult[Double]]] | Additional calculations based on WeaklyTypedMap[String] parsed response, leading to CalculationResult[Double] |
taggingConfiguration: Option[BaseTaggingConfiguration[RequestTemplate, (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate), MetricRow]] | This specifies a tagging configuration, allowing tagging on the request level (using RequestTemplate), on the response level (using (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate)) and on the final outcome level (using the result MetricRow[Double] object) |
wrapUpFunction: Option[JobWrapUpFunction[Unit]] | Wrap-up function to execute after the execution has finished. This could be the aggregation of all single results into an overall result or similar. It is executed on the node of the Job Manager Actor. In case of many single results, it is beneficial to write results directly from the nodes generating them and aggregate them into an overall result later, instead of sending all partial results as serialized messages across the cluster. |
allowedTimePerElementInMillis: Int | Specifies the time (in milliseconds) a single processing element in a batch may take to finish. |
allowedTimePerBatchInSeconds: Int | Specifies the time (in seconds) a single batch is allowed to take to finish execution. |
allowedTimeForJobInSeconds: Int | Specifies the time (in seconds) a full job is allowed to execute. If the time is exceeded, the job is aborted. |
expectResultsFromBatchCalculations: Boolean | Specifies whether the job manager actor expects results for single batches back from the single executing nodes. |
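The metrics configured above (DCG_10, NDCG_10, PRECISION_4, ERR_10) are standard IR measures. As a rough, non-authoritative sketch of the math (not Kolibri's implementation), DCG/NDCG with `AS_ZEROS` judgement handling could look like this:

```python
# Illustrative DCG/NDCG computation; judgements missing for a product id are
# treated as zeros, mirroring the configured handling "AS_ZEROS".
import math

def dcg(judgements, k):
    # standard DCG: sum of judgement / log2(position + 1), 1-based positions
    return sum(j / math.log2(i + 2) for i, j in enumerate(judgements[:k]))

def ndcg(judgements, k):
    # normalize by the DCG of the ideally ordered judgement list
    ideal_dcg = dcg(sorted(judgements, reverse=True), k)
    return dcg(judgements, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# product ids parsed from the response, looked up in the judgement data;
# products without a judgement get 0.0
judgement_map = {"p1": 1.0, "p3": 0.5}
product_ids = ["p1", "p2", "p3"]
judgements = [judgement_map.get(p, 0.0) for p in product_ids]
```

The judgement data itself is supplied by the FILE_BASED judgementProvider referenced in the job definition; the sketch above only shows the per-query calculation step.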
In the above example job definition you can observe the following:

- The fixedParams are appended to every request, i.e. each request carries `k1=v1&k1=v2&k2=v3`.
- The parsing configuration uses a selector of type `PLAINREC`, meaning plain recursive. The path is set to `\\ response \\ docs \\\\ product_id`, which describes a json structure like this:

```json
{
  "response": {
    "docs": [
      {"product_id": "value1"},
      {"product_id": "value2"}
    ]
  }
}
```

  In the result, the product ids can be retrieved from the result map via key "productIds".
- There are multiple variants to parse data out of a json (see JsonSelectorJsonProtocol and TypedJsonSelectorJsonProtocol):
  - `SINGLEREC`: single recursive selector, e.g. applied recursively on the json root without any plain selectors before
  - `PLAINREC`: some plain path selectors followed by a recursive selector at the end
  - `RECPLAIN`: recursive selector (may contain a plain path), then mapped to some plain selection (applied to each element from the recursive selection)
  - `RECREC`: recursive selector (may contain a plain path), then flatMapped to another recursive selector (applied to each element from the first recursive selection, e.g. mapping the Seq[JsValue] elements)
- Due to the tagging configuration, results are tagged per value of the q parameter, i.e. with tags of the form `(q=[paramValue])`, e.g. (q=q1), (q=q2), and so on.
- The wrapUpFunction aggregates all partial results whose file names match the given regex into an overall result:

```json
{
  "type": "AGGREGATE_FROM_DIR_BY_REGEX",
  "weightProvider": {
    "type": "CONSTANT",
    "weight": 1.0
  },
  "regex": "[(]q=.+[)]",
  "outputFilename": "(ALL1)",
  "readSubDir": "testJob",
  "writeSubDir": "testJob"
}
```
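To make the selector semantics concrete, here is a minimal sketch (a simplified assumption, not Kolibri's actual selector code) of how a `PLAINREC` path resolves against the json above: the plain part walks `response` and `docs`, and the final recursive part collects `product_id` from every element of the docs array.

```python
# Simplified illustration of a PLAINREC-style selection:
# plain selectors ("response", "docs") followed by a selection mapped over
# the resulting sequence ("product_id" from each doc).
doc = {
    "response": {
        "docs": [
            {"product_id": "value1"},
            {"product_id": "value2"},
        ]
    }
}

def select_plainrec(data, plain_path, seq_field):
    # follow the plain part of the path...
    for key in plain_path:
        data = data[key]
    # ...then map the final selector over the sequence of elements
    return [element[seq_field] for element in data]

product_ids = select_plainrec(doc, ["response", "docs"], "product_id")
```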
Kolibri also provides an execution endpoint, for which examples can be found in 'testAggregation.json' (an aggregation example, same as used for the wrap-up function above) and 'testAnalyze.json':
```json
{
  "type": "ANALYZE_BEST_WORST_REGEX",
  "directory": "testJob",
  "regex": "[(]q=.+[)]",
  "currentParams": {
    "a1": ["0.45"],
    "k1": ["v1", "v2"],
    "k2": ["v3"],
    "o": ["479.0"]
  },
  "compareParams": [
    {
      "a1": ["0.32"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["1760.0"]
    },
    {
      "a1": ["0.45"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["384.0"]
    },
    {
      "a1": ["0.45"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["1325.0"]
    }
  ],
  "metricName": "NDCG_10",
  "queryParamName": "q",
  "n_best": 5,
  "n_worst": 4
}
```
The latter picks the single result files according to the provided regex, defines the current parameter setting ('currentParams') and the variants to compare it against ('compareParams'). Further, 'metricName' defines the name of the metric to use for comparison, and 'n_best' and 'n_worst' define how many of the most increasing / most decreasing values are kept. 'queryParamName' specifies the parameter name that, in this example, is extracted via regex from the file name of the partial result.
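The best/worst comparison can be sketched as follows (assumed semantics based on the description above, not the actual implementation; metric values and n_best/n_worst are stand-ins):

```python
# Illustrative best/worst analysis: per-query metric deltas between the
# current parameter setting and one compare variant, ranked by change.
import re

# per-query NDCG_10 values; the query is extracted from file names like "(q=q1)"
filename_regex = re.compile(r"[(]q=(.+)[)]")
current = {"(q=q1)": 0.80, "(q=q2)": 0.50, "(q=q3)": 0.60}
variant = {"(q=q1)": 0.70, "(q=q2)": 0.90, "(q=q3)": 0.65}

diffs = {
    filename_regex.match(name).group(1): variant[name] - current[name]
    for name in current
}
ranked = sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)
n_best, n_worst = 2, 1
best = ranked[:n_best]      # queries with the largest metric increase
worst = ranked[-n_worst:]   # queries with the largest metric decrease
```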