
How to extract key-value objects

To extract key-value pairs from a document, do the following:

  • Create a new field using Add FIELD with auto KEY-VALUE extraction
  • Make sure to enable Regex for this new field

Then put the following expression into the expression property:

(?<key>): (?<value>)

What it means:

  1. ?<key> marks the macro next to it as the name of the field (i.e. the key). Important: if the expression with ?<key> matches multiple times, it generates multiple objects as output, one for every match.
  2. ?<value> marks the macro next to it as the value of the field.
  3. `` is the macro that captures a sentence with single spaces inside. See all macros here
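
The behavior of the two named groups can be sketched outside the tool with an ordinary regex engine. The Python snippet below is only an approximation: `SENTENCE` is a hypothetical stand-in for the macro (words separated by single spaces), `extract_key_values` is an illustrative helper rather than part of the product, and `pageIndex` is simply assumed to be 0. Note that Python spells named groups as `(?P<name>...)`.

```python
import re

# Hypothetical stand-in for the macro: words separated by single spaces.
SENTENCE = r"\S+(?: \S+)*"

# Python writes named groups as (?P<name>...) rather than (?<name>...).
PATTERN = re.compile(rf"(?P<key>{SENTENCE}): (?P<value>{SENTENCE})")


def extract_key_values(text, page_index=0):
    """Return one field object per match, mirroring the sample output shape."""
    return [
        {
            "name": m.group("key"),
            "objectType": "field",
            "value": m.group("value"),
            "pageIndex": page_index,  # assumed: all text comes from one page
        }
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    for obj in extract_key_values("Name: Alfred Pennyworth\nID: 000012345"):
        print(obj)
```

Because the pattern is applied with `finditer`, every match yields its own object, which is the "multiple objects" behavior described in item 1.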

Sample:

Input text:

Name: Alfred Pennyworth
ID: 000012345
DOB: 8/16/1943 (78 years)

Generated JSON output:

{
  "objects": [
    {
      "name": "Name",
      "objectType": "field",
      "value": "Alfred Pennyworth",
      "pageIndex": 0
    },
    {
      "name": "ID",
      "objectType": "field",
      "value": "000012345",
      "pageIndex": 0
    },
    {
      "name": "DOB",
      "objectType": "field",
      "value": "8/16/1943 (78 years)",
      "pageIndex": 0
    }
    ...

CSV output:

Name,ID,DOB
Alfred Pennyworth,000012345,8/16/1943 (78 years),
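
The relationship between the JSON objects and the CSV output can be illustrated with a short Python sketch: each object's name becomes a column header and its value becomes the cell in the data row. The helper name `objects_to_csv` is hypothetical, and this is not the tool's own exporter, so details such as the trailing comma in the row above are not reproduced.

```python
import csv
import io


def objects_to_csv(objects):
    """Write field names as the header row and field values as one data row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([obj["name"] for obj in objects])
    writer.writerow([obj["value"] for obj in objects])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"name": "Name", "objectType": "field", "value": "Alfred Pennyworth", "pageIndex": 0},
        {"name": "ID", "objectType": "field", "value": "000012345", "pageIndex": 0},
        {"name": "DOB", "objectType": "field", "value": "8/16/1943 (78 years)", "pageIndex": 0},
    ]
    print(objects_to_csv(sample), end="")
```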