โ† All workers

Entity Extraction

Coming Soon

Pull structured fields out of unstructured documents. Text becomes data.

Document + Article | Standard

What it does

Extracts named entities (people, organizations, dollar amounts, dates) and attaches them as typed, span-addressed extension fields. Each field carries character offsets back into the source for provenance.

Example output

JSON Feed extension fields attached to each processed item under the _entities key:

{
  "_entities": {
    "people": [{ "name": "Jane Smith", "role": "CFO", "span": [1204, 1214] }],
    "organizations": [{ "name": "Acme Corp", "ticker": "ACME", "span": [502, 511] }],
    "amounts": [{ "value": 4200000, "currency": "USD", "span": [3301, 3316] }]
  }
}
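Because every entity carries a span, a consumer can check each extraction against the source text. The following is a minimal sketch, not the worker's own code. It assumes spans are half-open [start, end) character offsets, which this page does not state but which fits the example ("Jane Smith" is 10 characters and 1214 - 1204 = 10). The helper names resolve_span and check_provenance are hypothetical.

```python
from typing import Any


def resolve_span(source: str, span: list[int]) -> str:
    """Return the source text a span points at, assuming [start, end) offsets."""
    start, end = span
    if not (0 <= start <= end <= len(source)):
        raise ValueError(f"span {span} is outside the source (length {len(source)})")
    return source[start:end]


def check_provenance(
    source: str, entities: dict[str, list[dict[str, Any]]]
) -> list[tuple[str, str]]:
    """Pair each entity kind with the exact text its span cites."""
    cited = []
    for kind, items in entities.items():
        for item in items:
            cited.append((kind, resolve_span(source, item["span"])))
    return cited


# Toy document and a hand-written _entities payload in the shape shown above.
source = "Acme Corp named Jane Smith as CFO."
entities = {
    "people": [{"name": "Jane Smith", "role": "CFO", "span": [16, 26]}],
    "organizations": [{"name": "Acme Corp", "span": [0, 9]}],
}

cited = check_provenance(source, entities)
for kind, text in cited:
    print(f"{kind}: {text!r}")
```

If the offsets turn out to be inclusive, or to count bytes rather than characters, only resolve_span needs to change.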

Use cases

  • SEC filing analysis: extract officers, amounts, effective dates
  • Patent processing: resolve assignees, pull classification codes
  • News monitoring: track when specific companies are mentioned
  • Compliance: build audit trails with cited provenance
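As an illustration of the news monitoring case, the sketch below filters processed feed items against a watchlist of company names or tickers. It assumes each item is a JSON Feed item with an id and an _entities object shaped like the example output. The function name matching_orgs and the sample items are hypothetical.

```python
from typing import Any


def matching_orgs(item: dict[str, Any], watchlist: set[str]) -> list[dict[str, Any]]:
    """Return the organizations in one item whose name or ticker is on the watchlist."""
    orgs = item.get("_entities", {}).get("organizations", [])
    return [
        org
        for org in orgs
        if org.get("name") in watchlist or org.get("ticker") in watchlist
    ]


# Hand-written sample items; a real feed would come from the worker's output.
items = [
    {
        "id": "1",
        "_entities": {
            "organizations": [{"name": "Acme Corp", "ticker": "ACME", "span": [0, 9]}]
        },
    },
    {"id": "2", "_entities": {"organizations": [{"name": "Globex", "span": [4, 10]}]}},
    {"id": "3"},  # no entities were attached to this item
]

watchlist = {"ACME"}
hits = [
    (item["id"], [org["name"] for org in matched])
    for item in items
    if (matched := matching_orgs(item, watchlist))
]
print(hits)
```

Each match keeps its span, so an alert can quote the exact passage that triggered it.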