Executing examples

An example job definition for a parameter grid search evaluating search metrics against a given endpoint is provided within the scripts folder. The definition is contained in the file testSearchEval.json, which can be sent to the respective Kolibri endpoint (see start_searcheval.sh). Where the results are written is configured via properties / environment variables (see the respective part of the documentation). A simpler way is to start the app along with the UI (Kolibri Watch, see the respective section of this doc), navigate to the CREATE menu, select the search evaluation type and choose a job execution definition template. From this UI you can directly edit an existing template, save it and start the execution. Note that the execution below makes a few assumptions for the processing to work.
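
For reference, here is a minimal sketch of posting such a definition from code using the JDK HTTP client; host, port and file path are assumptions here, see start_searcheval.sh for the actual call used in the project:

// Minimal sketch of posting a job definition file to a running Kolibri instance.
// Host, port and file path are assumptions; see start_searcheval.sh for the real call.
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.file.Paths

object PostJobDefinition extends App {
  val client = HttpClient.newHttpClient()
  val request = HttpRequest.newBuilder()
    .uri(URI.create("http://localhost:8000/search_eval_no_ser")) // assumed host/port
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofFile(Paths.get("scripts/testSearchEval.json"))) // assumed path
    .build()
  val response = client.send(request, HttpResponse.BodyHandlers.ofString())
  println(s"${response.statusCode()}: ${response.body()}")
}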

Let's have a look at the definition and then go through the meaning of its distinct sections. You'll see something like this:

{
  "jobName": "testJob",
  "fixedParams": {
    "k1": [
      "v1",
      "v2"
    ],
    "k2": [
      "v3"
    ]
  },
  "contextPath": "search",
  "connections": [
    {
      "host": "search-service",
      "port": 80,
      "useHttps": false
    },
    {
      "host": "search-service1",
      "port": 81,
      "useHttps": false
    }
  ],
  "requestPermutation": [
    {
      "type": "ALL",
      "value": {
        "params": {
          "type": "FROM_FILES_LINES",
          "values": {
            "q": "/app/test-files/test-paramfiles/test_queries.txt"
          }
        }
      }
    },
    {
      "type": "ALL",
      "value": {
        "params": {
          "type": "GRID_FROM_VALUES_SEQ",
          "values": [
            {
              "name": "a1",
              "values": [
                0.45,
                0.32
              ]
            },
            {
              "name": "o",
              "start": 0.0,
              "end": 2000.0,
              "stepSize": 1.0
            }
          ]
        }
      }
    }
  ],
  "batchByIndex": 0,
  "parsingConfig": {
    "singleSelectors": [],
    "seqSelectors": [
      {
        "name": "productIds",
        "castType": "STRING",
        "selector": {
          "type": "PLAINREC",
          "path": "\\ response \\ docs \\\\ product_id"
        }
      }
    ]
  },
  "excludeParamsFromMetricRow": [
    "q"
  ],
  "taggingConfiguration": {
    "initTagger": {
      "type": "REQUEST_PARAMETER",
      "parameter": "q",
      "extend": false
    },
    "processedTagger": {
      "type": "NOTHING"
    },
    "resultTagger": {
      "type": "NOTHING"
    }
  },
  "requestTemplateStorageKey": "requestTemplate",
  "mapFutureMetricRowCalculation": {
    "functionType": "IR_METRICS",
    "name": "irMetrics",
    "queryParamName": "q",
    "requestTemplateKey": "requestTemplate",
    "productIdsKey": "productIds",
    "judgementProvider": {
      "type": "FILE_BASED",
      "filename": "/app/test-files/test-judgements/test_judgements.txt"
    },
    "metricsCalculation": {
      "metrics": [
        {"name": "DCG_10", "function": {"type": "DCG", "k": 10}},
        {"name": "NDCG_10", "function": {"type": "NDCG", "k": 10}},
        {"name": "PRECISION_4", "function": {"type": "PRECISION", "k": 4, "threshold":  0.1}},
        {"name": "ERR_10", "function": {"type": "ERR", "k": 10}}
      ],
      "judgementHandling": {
        "validations": [
          "EXIST_RESULTS",
          "EXIST_JUDGEMENTS"
        ],
        "handling": "AS_ZEROS"
      }
    },
    "excludeParams": [
      "q"
    ]
  },
  "singleMapCalculations": [],
  "allowedTimePerElementInMillis": 1000,
  "allowedTimePerBatchInSeconds": 6000,
  "allowedTimeForJobInSeconds": 720000,
  "expectResultsFromBatchCalculations": false,
  "wrapUpFunction": {
    "type": "AGGREGATE_FROM_DIR_BY_REGEX",
    "weightProvider": {
      "type": "CONSTANT",
      "weight": 1.0
    },
    "regex": "[(]q=.+[)]",
    "outputFilename": "(ALL1)",
    "readSubDir": "testJob",
    "writeSubDir": "testJob"
  }
}

Example Job Definition Explained

Parameter meaning

On posting to the search_eval_no_ser endpoint, the above is parsed into a JobMessages.SearchEvaluation instance. Within Kolibri, the parsing of sent data utilizes the spray-json library, and all types except base types need a JsonFormat definition that specifies how a passed JSON is transformed into the specific object and how an object is transformed back into its string representation. Those definitions are always found within the 'de.awagen.kolibri-[datatypes/base].io.json' packages and carry the suffix JsonProtocol. More details on this follow in later sections.
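
As a minimal sketch of what such a spray-json format definition typically looks like (the case class and protocol object here are illustrative only, not Kolibri's actual types):

// Illustrative spray-json protocol sketch; ExampleConnection is not a Kolibri type.
import spray.json._
import spray.json.DefaultJsonProtocol._

case class ExampleConnection(host: String, port: Int, useHttps: Boolean)

object ExampleJsonProtocol {
  // one line derives both directions: JSON -> ExampleConnection and ExampleConnection -> JSON
  implicit val exampleConnectionFormat: RootJsonFormat[ExampleConnection] = jsonFormat3(ExampleConnection)
}

object ProtocolDemo extends App {
  import ExampleJsonProtocol._
  val parsed: ExampleConnection =
    """{"host": "search-service", "port": 80, "useHttps": false}""".parseJson.convertTo[ExampleConnection]
  println(parsed)                      // ExampleConnection(search-service,80,false)
  println(parsed.toJson.compactPrint)  // back to its string representation
}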

The actual SearchEvaluation message case class looks like this:

case class SearchEvaluation(jobName: String,
                            fixedParams: Map[String, Seq[String]],
                            contextPath: String,
                            connections: Seq[Connection],
                            requestPermutation: Seq[ModifierGeneratorProvider],
                            batchByIndex: Int,
                            parsingConfig: ParsingConfig,
                            excludeParamsFromMetricRow: Seq[String],
                            requestTemplateStorageKey: String,
                            mapFutureMetricRowCalculation: FutureCalculation[WeaklyTypedMap[String], MetricRow],
                            singleMapCalculations: Seq[Calculation[WeaklyTypedMap[String], CalculationResult[Double]]],
                            taggingConfiguration: Option[BaseTaggingConfiguration[RequestTemplate, 
                            (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate), MetricRow]],
                            wrapUpFunction: Option[JobWrapUpFunction[Unit]],
                            allowedTimePerElementInMillis: Int = 1000,
                            allowedTimePerBatchInSeconds: Int = 600,
                            allowedTimeForJobInSeconds: Int = 7200,
                            expectResultsFromBatchCalculations: Boolean = true) extends JobMessage

Let's summarize what the distinct attributes are used for:

  • jobName: String - Job name for the execution. If an execution with the same jobName is already running, the request to start another one with that name will be denied.
  • fixedParams: Map[String, Seq[String]] - Parameter name/values mapping for parameters that won't change between requests.
  • contextPath: String - Context path for the requests. The host settings defining where those requests are sent are given within connections.
  • connections: Seq[Connection] - Single or multiple connections against which the requests shall be sent. A connection holds host, port, a flag whether to use https or http, and optional credentials.
  • requestPermutation: Seq[ModifierGeneratorProvider] - Single or multiple ModifierGeneratorProvider. Each of those providers offers methods to retrieve the Seq of generators of modifiers of RequestTemplateBuilders, or a partitioning, which is a generator of generators of the mentioned modifiers. For more detail see later sections.
  • batchByIndex: Int - Index (0-based) defining which generator of modifiers to batch by. In the above example specification, setting this value to 0 batches by the generator that produces the modifiers corresponding to the single query-parameter values; it is the first one in the definition, thus index 0 (see the sketch following this overview).
  • parsingConfig: ParsingConfig - The parsing configuration defining which values to extract, as which data type, and under which key to place them in the result map. The result map can then be utilized to derive metrics, tags or similar.
  • excludeParamsFromMetricRow: Seq[String] - Gives the parameter names that shall not be part of the parameter set in the aggregation result (MetricRow). For the same tags, results are aggregated per set of parameters; thus if this shall not happen on the per-query level, or if the overall aggregation shall aggregate values over multiple queries, the query parameter should be added here. If per-query granularity is needed, this should be reflected in the tag attached to the result instead (for more details on tagging see later sections of the doc).
  • requestTemplateStorageKey: String - Defines an arbitrary storage key under which the request template is put into the result map for further reference down the processing chain.
  • mapFutureMetricRowCalculation: FutureCalculation[WeaklyTypedMap[String], MetricRow] - Definition of the MetricRow calculation based on WeaklyTypedMap[String], yielding a Future result due to the additional steps involved, such as loading the judgements.
  • singleMapCalculations: Seq[Calculation[WeaklyTypedMap[String], CalculationResult[Double]]] - Additional calculations based on the WeaklyTypedMap[String] parsed response, leading to CalculationResult[Double].
  • taggingConfiguration: Option[BaseTaggingConfiguration[RequestTemplate, (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate), MetricRow]] - Specifies a tagging configuration, allowing tagging on the request level (using RequestTemplate), on the response level (using (Either[Throwable, WeaklyTypedMap[String]], RequestTemplate)) and on the final outcome level (using the resulting MetricRow object).
  • wrapUpFunction: Option[JobWrapUpFunction[Unit]] - Wrap-up function to execute after the execution has finished, e.g. the aggregation of all single results into an overall result. This is executed on the node of the Job Manager Actor. In case of many single results, it is beneficial to write results directly from the nodes generating them and to aggregate everything into an overall result later, instead of sending all partial results as serialized messages across the cluster.
  • allowedTimePerElementInMillis: Int - Time in milliseconds a single processing element within a batch may take to finish.
  • allowedTimePerBatchInSeconds: Int - Time a single batch is allowed to take to finish execution (in seconds).
  • allowedTimeForJobInSeconds: Int - Time the full job is allowed to execute (in seconds). If exceeded, the job is aborted.
  • expectResultsFromBatchCalculations: Boolean - Specifies whether the Job Manager Actor expects results for single batches back from the executing nodes.
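
To make batchByIndex more concrete, here is a small standalone sketch (plain Scala, not Kolibri code) of what batching by the generator at index 0 means for a parameter grid like the one in the example (value ranges truncated):

// Standalone sketch: with q as the generator at index 0, batching by index 0 yields
// one batch per q value, each holding the full grid over the remaining parameters.
object BatchByIndexSketch extends App {
  val q  = Seq("q1", "q2", "q3")   // generator at index 0 (queries)
  val a1 = Seq(0.45, 0.32)         // generator at index 1
  val o  = Seq(0.0, 1.0, 2.0)      // generator at index 2 (truncated range)

  val batches: Map[String, Seq[(String, Double, Double)]] =
    (for (qv <- q; a <- a1; ov <- o) yield (qv, a, ov)).groupBy(_._1)

  batches.toSeq.sortBy(_._1).foreach { case (qv, perms) =>
    println(s"batch for q=$qv -> ${perms.size} permutations")
  }
}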

Example definition explained

In the above example job definition you can observe the following:

  • jobName is “testJob”
  • fixed parameters are set and will be used for each request: k1=v1&k1=v2&k2=v3
  • contextPath is “search”
  • connections specifies two distinct connections, one going to host “search-service:80”, the other to “search-service1:81”, both using plain http:// requests without credentials set. This makes the assumption that the corresponding services are running and indeed expose a “search” endpoint. The execution flow balances requests across connections, thus roughly equal load can be expected on both if they have similar latencies.
  • the request permutation permutes only URL parameters and does not vary headers or bodies. The permuted parameter values include q1-q10 for parameter “q” (read from the queries file), values 0.45 and 0.32 for parameter “a1”, and 2001 values in the range [0.0, 2000.0] with step size 1 for parameter “o”. Note that the order of parameters matters here, since “batchByIndex” refers to exactly this ordering: if set to 0, batching uses parameter “q”; index 1 would refer to parameter “a1”, index 2 to parameter “o”.
  • batchByIndex = 0, thus each batch only handles a single setting for parameter “q”. We want per-query granularity here, thus each batch result corresponds to a valid aggregation by itself. These considerations about the smallest granularity needed should go into the decision which modifiers to batch by.
  • the parsing config only specifies the extraction of values under key “productIds”, assuming each element to be of type String. That we are expecting a Seq and not a single value follows from the selector being defined within “seqSelectors”; within the “selector”, the type is PLAINREC, meaning plain recursive. The path is set to \\ response \\ docs \\\\ product_id, which describes a JSON structure like this:
{
  "response": {
    "docs": [
      {"product_id":  "value1"},
      {"product_id":  "value2"}
    ]
  }
}

In the result, the product ids can be retrieved from the result map via key “productIds”. There are multiple variants to parse data out of a JSON (see JsonSelectorJsonProtocol and TypedJsonSelectorJsonProtocol), such as the ones below; a small extraction sketch in plain spray-json follows after this list:

  1. SINGLEREC: single recursive selector, e.g. applied recursively on the JSON root without any selectors before
  2. PLAINREC: some plain path selectors followed by a recursive selector at the end
  3. RECPLAIN: recursive selector (may contain a plain path), then mapped to some plain selection (applied to each element from the recursive selection)
  4. RECREC: recursive selector (may contain a plain path), then flatMapped to another recursive selector (applied to each element from the first recursive selection, e.g. mapping the Seq[JsValue] elements)
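
As a plain spray-json sketch (not using Kolibri's selector classes), the extraction described by the PLAINREC path above roughly corresponds to the following:

// Standalone sketch: extracts all product_id values below response -> docs,
// analogous to what the selector path \ response \ docs \\ product_id describes.
import spray.json._

object SelectorSketch extends App {
  val json: JsValue =
    """{"response": {"docs": [{"product_id": "value1"}, {"product_id": "value2"}]}}""".parseJson

  val productIds: Seq[String] =
    json.asJsObject.fields("response").asJsObject.fields("docs") match {
      case JsArray(elements) =>
        elements.collect { case obj: JsObject =>
          obj.fields.get("product_id").collect { case JsString(value) => value }
        }.flatten
      case _ => Vector.empty
    }

  println(productIds) // Vector(value1, value2)
}
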
  • parameter “q” is the only parameter given in the “excludeParamsFromMetricRow” setting, since we don't want the parameter “q” to occur as a parameter in our aggregated MetricRow result. Aggregations are done per parameter setting later on, and aggregating over multiple queries would not be possible without first removing the parameter “q” from the single results. As you'll see below, we rather include parameter “q” in the tagging of the single results.
  • “taggingConfiguration” specifies that we won't add a tag for the parsed result or the final MetricRow result, but we will tag based on the RequestTemplate's request parameter “q” for each request to the external endpoint. This effectively results in one partial result per query, with the filename given by the Tag's toString method, in this case yielding names of the form (q=[paramValue]), e.g. (q=q1), (q=q2), and so on.
  • “requestTemplateStorageKey” simply defines under which key the RequestTemplate is stored in the result map. As seen above, the parsed product IDs are already stored under key “productIds”; the RequestTemplate used for the request will be stored under “requestTemplate”.
  • “mapFutureMetricRowCalculation” specifies which judgement file the calculation is based on and which metrics are calculated (in the example DCG@10, NDCG@10, PRECISION@4 and ERR@10; see the generic DCG/NDCG sketch after this list). It also specifies validations on judgements and the handling of missing judgements: in the given example, the validations require that there are productId results and that some judgements exist for them, and if those validations pass, missing judgements are replaced with the value 0.0.
  • “singleMapCalculations” is empty, thus no additional calculations are executed apart from the metrics defined above.
  • “allowedTimePerElementInMillis” is set to 1000, thus we allow up to 1s for each element to finish processing.
  • “allowedTimePerBatchInSeconds” is set to 6000 seconds, thus 100 minutes for a single batch to execute.
  • “allowedTimeForJobInSeconds” is set to 720000 seconds, i.e. 200 hours for the whole job.
  • “expectResultsFromBatchCalculations” is set to false, thus no results are serialized and sent across the cluster back to the Job Manager.
  • “wrapUpFunction” is defined such that all results matching the given regex are aggregated into an overall result and written to a file named “(ALL1)” in the same folder the single results were picked from (for this, the subDir settings must match the jobName). Also, a weightProvider can be defined to provide distinct weights per query, or just one of type “CONSTANT” with the weight to apply to all samples:
{
  "type": "AGGREGATE_FROM_DIR_BY_REGEX",
  "weightProvider": {
    "type": "CONSTANT",
    "weight": 1.0
  },
  "regex": "[(]q=.+[)]",
  "outputFilename": "(ALL1)",
  "readSubDir": "testJob",
  "writeSubDir": "testJob"
}
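
To make the configured IR metrics more tangible, here is a generic, standalone sketch of DCG@k and NDCG@k using one common formulation (gain rel_i discounted by log2(rank + 1)); it is purely illustrative, not necessarily the exact variant Kolibri implements, and the judgement values are made up:

// Generic DCG@k / NDCG@k illustration; not Kolibri's implementation, judgements made up.
object NdcgSketch extends App {
  def dcgAtK(judgements: Seq[Double], k: Int): Double =
    judgements.take(k).zipWithIndex.map { case (rel, i) =>
      rel / (math.log(i + 2) / math.log(2)) // position i has rank i + 1, discount log2(rank + 1)
    }.sum

  def ndcgAtK(judgements: Seq[Double], k: Int): Double = {
    val ideal = dcgAtK(judgements.sorted.reverse, k) // DCG of the best possible ordering
    if (ideal == 0.0) 0.0 else dcgAtK(judgements, k) / ideal
  }

  // hypothetical judgements for the product ids returned for a single query, in result order
  val judgementsInResultOrder = Seq(0.0, 1.0, 0.5, 0.0, 1.0)
  println(f"DCG@10  = ${dcgAtK(judgementsInResultOrder, 10)}%.4f")
  println(f"NDCG@10 = ${ndcgAtK(judgementsInResultOrder, 10)}%.4f")
}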

Example Aggregation / Analyze Executions

Kolibri provides an execution endpoint, for which examples can be found in 'testAggregation.json' (an aggregation example, the same as used for the wrap-up function above) and 'testAnalyze.json':

{
  "type": "ANALYZE_BEST_WORST_REGEX",
  "directory": "testJob",
  "regex": "[(]q=.+[)]",
  "currentParams": {
    "a1": ["0.45"],
    "k1": ["v1", "v2"],
    "k2": ["v3"],
    "o": ["479.0"]
  },
  "compareParams": [
    {
      "a1": ["0.32"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["1760.0"]
    },
    {
      "a1": ["0.45"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["384.0"]
    },
    {
      "a1": ["0.45"],
      "k1": ["v1", "v2"],
      "k2": ["v3"],
      "o": ["1325.0"]
    }
  ],
  "metricName": "NDCG_10",
  "queryParamName": "q",
  "n_best": 5,
  "n_worst": 4
}

The latter picks the single result files according to the provided regex, defines the current parameter setting ('currentParams') and the variants to compare against ('compareParams'). Further, 'metricName' defines the name of the metric to use for the comparison, and 'n_best' and 'n_worst' define how many of the most increasing / most decreasing values are kept. 'queryParamName' specifies the parameter name that, in this example, is extracted via regex from the file name of each partial result.
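
As a purely illustrative sketch (not Kolibri's implementation; the per-query NDCG_10 values and the sign convention of the delta are made up for demonstration), the n_best / n_worst selection can be thought of as ranking queries by the metric delta between the current parameter setting and a compared one:

// Illustrative only: rank queries by metric delta and keep the n most increasing / decreasing.
object BestWorstSketch extends App {
  // made-up per-query NDCG_10 values for the current setting and one compared setting
  val currentNdcg = Map("q1" -> 0.40, "q2" -> 0.55, "q3" -> 0.60, "q4" -> 0.20)
  val compareNdcg = Map("q1" -> 0.50, "q2" -> 0.35, "q3" -> 0.65, "q4" -> 0.10)

  // delta per query; the sign convention here is only for illustration
  val deltas: Seq[(String, Double)] =
    currentNdcg.keys.toSeq.map(q => q -> (compareNdcg(q) - currentNdcg(q)))

  val nBest = 2   // corresponds to n_best
  val nWorst = 2  // corresponds to n_worst
  println(s"most increasing: ${deltas.sortBy(-_._2).take(nBest)}")
  println(s"most decreasing: ${deltas.sortBy(_._2).take(nWorst)}")
}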