A parsing configuration consists of the following parts:
selector: defines which fields to pick from a json
name: the name under which the extracted data is stored
castType: defines what the value is cast to
Note: for a recursive selector that extracts a sequence of single-value fields, use the single-value cast type. That is, if you use a recursive selector and each extracted element is a string, use castType ‘STRING’, not ‘SEQ[STRING]’. If every element is itself a list of strings, use ‘SEQ[STRING]’.
The selector syntax is straightforward. Let’s use the following json as an example:
{
  "response": {
    "numFound": 10,
    "docs": [
      {
        "product_id": "id1",
        "description": "yummy yummy",
        "title": "yummy",
        "innerJson": {
          "key1": "value1"
        }
      }
    ]
  }
}
We distinguish between plain and recursive selectors. Both kinds can be combined within one selector:
plain: \ is the selector. It can be applied multiple times to navigate deeper into a structure. Example: response \ numFound (in this case castType should be set to INT).
recursive: \\ is the selector. It is used to extract a sequence of values from a list of jsons. Example: response \ docs \\ product_id (in this case castType should be set to STRING, although the result of applying the selector will be a list of strings).
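The castType rule for the two examples above can be illustrated with a small Python sketch. It is not the actual implementation: the `CASTS` table and the `apply_cast` helper are hypothetical names, and the values are assumed to have been extracted already. The point it shows is that for a recursive selector the cast is applied to each extracted element, so the castType names the element type rather than the type of the whole result.

```python
from typing import Any, Callable, Dict

# Hypothetical cast functions, keyed by the castType names used in the text.
CASTS: Dict[str, Callable[[Any], Any]] = {
    "INT": int,
    "STRING": str,
    "SEQ[STRING]": lambda values: [str(v) for v in values],
}


def apply_cast(extracted: Any, cast_type: str, recursive: bool) -> Any:
    """Cast an already extracted value.

    For a recursive selector the extracted value is a list, and the cast
    is applied to every element of that list individually.
    """
    cast = CASTS[cast_type]
    if recursive:
        return [cast(element) for element in extracted]
    return cast(extracted)


# response \ numFound: plain selector, single value, castType INT
num_found = apply_cast(10, "INT", recursive=False)

# response \ docs \\ product_id: recursive selector, each element is a
# string, so castType is STRING even though the result is a list
product_ids = apply_cast(["id1"], "STRING", recursive=True)

# Made-up data (not part of the example json): each element is itself a
# list of strings, so castType is SEQ[STRING]
tag_lists = apply_cast([["a", "b"], ["c"]], "SEQ[STRING]", recursive=True)
```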
If your recursive selector picks up elements that are themselves json objects, you can pick a field from each of them by applying another plain selector, for example: response \ docs \\ innerJson \ key1
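To make the selector semantics concrete, here is a minimal Python sketch that evaluates the three example selectors against the example json. It only mirrors the behaviour described in this section and is not the actual implementation: `parse_selector` and `apply_selector` are hypothetical helpers, and the sketch assumes that a recursive step looks up the field in every element of the current list, and that any plain step after a recursive one is applied to each element in turn.

```python
from typing import Any, List, Tuple

PLAIN = "\\"        # a single backslash
RECURSIVE = "\\\\"  # a double backslash


def parse_selector(selector: str) -> List[Tuple[str, str]]:
    """Split 'a \\ b \\\\ c' style text into (operator, field) steps."""
    tokens = selector.split()
    # A valid selector alternates field, operator, field, ... and so has
    # an odd number of tokens.
    if not tokens or len(tokens) % 2 == 0:
        raise ValueError(f"malformed selector: {selector!r}")
    steps = [(PLAIN, tokens[0])]  # the first field is a plain lookup
    for op, field in zip(tokens[1::2], tokens[2::2]):
        if op not in (PLAIN, RECURSIVE):
            raise ValueError(f"unknown operator {op!r} in {selector!r}")
        steps.append((op, field))
    return steps


def apply_selector(data: Any, selector: str) -> Any:
    """Apply the selector steps to parsed json (dicts and lists)."""
    current = data
    in_sequence = False  # becomes True after the first recursive step
    for op, field in parse_selector(selector):
        if op == RECURSIVE or in_sequence:
            # Look the field up in every element of the current list.
            current = [element[field] for element in current]
            in_sequence = True
        else:
            current = current[field]
    return current


example = {
    "response": {
        "numFound": 10,
        "docs": [
            {
                "product_id": "id1",
                "description": "yummy yummy",
                "title": "yummy",
                "innerJson": {"key1": "value1"},
            }
        ],
    }
}

# Raw strings keep the backslashes exactly as written in the selector.
num_found = apply_selector(example, r"response \ numFound")
product_ids = apply_selector(example, r"response \ docs \\ product_id")
inner_values = apply_selector(example, r"response \ docs \\ innerJson \ key1")
```

With the example json, `num_found` is `10`, `product_ids` is `["id1"]` and `inner_values` is `["value1"]`, which matches the cast types INT, STRING and STRING respectively.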