Obligations & verbatim citations

Every fact clmSpace records is anchored to a verbatim span of source text. Obligations without citations are not obligations.

What an obligation is

An obligation is a single structured fact extracted from a contract: a payment term, a notice period, a liability cap, an SLA, an exclusivity restriction. The extractor is bound by a JSON schema (Anthropic tool_use), so every obligation it emits has the same shape.
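
As a rough illustration of what binding the extractor to a schema could look like, the sketch below builds a tool definition in the name / description / input_schema form used by Anthropic tool use. The tool name `record_obligation`, the description, the `direction` values and the choice of required fields are assumptions made for this example, not clmSpace's actual definition; only the field names come from this page.

```python
# Hypothetical tool definition: the name, description and schema details are
# illustrative assumptions. Only the field names are taken from this page.
OBLIGATION_TYPES = [
    "Payment", "Notice", "Renewal", "Termination", "SLA", "MinimumCommitment",
    "Exclusivity", "DataProtection", "Reporting", "IP", "Confidentiality", "Liability",
]

CITATION_SCHEMA = {
    "type": "object",
    "properties": {
        "document_id": {"type": "string"},
        "page": {"type": "integer", "minimum": 1},  # pages are 1-indexed
        "bbox": {
            "type": "array",
            "items": {"type": "number"},
            "minItems": 4,
            "maxItems": 4,
        },
        "text": {"type": "string", "minLength": 1},  # the verbatim span
    },
    "required": ["document_id", "page", "bbox", "text"],
}

RECORD_OBLIGATION_TOOL = {
    "name": "record_obligation",
    "description": "Record one obligation together with its verbatim citation.",
    "input_schema": {
        "type": "object",
        "properties": {
            "obligation_type": {"type": "string", "enum": OBLIGATION_TYPES},
            "value_text": {"type": "string"},
            "value_numeric": {"type": "number"},
            "value_unit": {"type": "string"},
            "direction": {"type": "string", "enum": ["vendor", "customer"]},
            "citation": CITATION_SCHEMA,
        },
        # Assumption: `verified` is set by the platform, not by the model,
        # so it is left out of the extractor's schema in this sketch.
        "required": ["obligation_type", "value_text", "direction", "citation"],
    },
}
```

Listing `citation` under `required` is how a schema like this would express the rule that an obligation cannot be emitted without one.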

Recorded fields (selection)

  • obligation_type: Payment, Notice, Renewal, Termination, SLA, MinimumCommitment, Exclusivity, DataProtection, Reporting, IP, Confidentiality, Liability.
  • value_text: the obligation expressed for humans (“Net 30”, “12 months”, “£500,000”).
  • value_numeric / value_unit: normalised for arithmetic where applicable.
  • direction: whether the obligation sits on the vendor side or the customer side.
  • citation: see below.
  • verified: false until a human passes it through the verification gate.
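
To show why the normalised value_numeric / value_unit pair is useful, here is a minimal sketch of the record as a Python dataclass with one small calculation on top of it. The `Obligation` class layout and the `due_date` helper are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Obligation:
    obligation_type: str                   # e.g. "Payment"
    value_text: str                        # human-readable, e.g. "Net 30"
    direction: str                         # "vendor" or "customer"
    citation: dict                         # source span backing this obligation
    value_numeric: Optional[float] = None  # normalised number, where one applies
    value_unit: Optional[str] = None       # e.g. "days"
    verified: bool = False                 # stays False until a human verifies it


def due_date(payment: Obligation, invoice_date: date) -> date:
    """Hypothetical helper: add a day-based payment term to an invoice date."""
    if payment.value_numeric is None or payment.value_unit != "days":
        raise ValueError("payment term is not normalised to days")
    return invoice_date + timedelta(days=int(payment.value_numeric))


net30 = Obligation(
    obligation_type="Payment",
    value_text="Net 30",
    direction="customer",
    citation={"page": 7, "text": "Customer shall pay each invoice within "
                                 "thirty (30) days of the invoice date."},
    value_numeric=30,
    value_unit="days",
)
print(due_date(net30, date(2024, 1, 15)))  # 2024-02-14
```

The human-readable value_text is kept for display, while the arithmetic runs only on the normalised fields.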

What a citation is

A citation is the bridge between the structured obligation and the original PDF. It records:

  • The document_id of the source PDF.
  • The page number (1-indexed).
  • A bbox: a bounding box [x0, y0, x1, y1] in PDF user-space units.
  • The verbatim text span that was the basis for the extracted value.

Example obligation with its citation (JSON):

{
  "obligation_type": "Payment",
  "value_text": "Net 30 from invoice date",
  "value_numeric": 30,
  "value_unit": "days",
  "direction": "customer",
  "citation": {
    "document_id": "9b1f…",
    "page": 7,
    "bbox": [72.0, 540.2, 540.0, 558.6],
    "text": "Customer shall pay each invoice within thirty (30) days of the invoice date."
  },
  "verified": false
}
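
The citation fields lend themselves to a few mechanical sanity checks. The sketch below is one possible version, assuming the text of the cited page is available as a plain string; `check_citation` and `page_text` are hypothetical names, and the requirement that x0 < x1 and y0 < y1 is an assumption about how the bbox is ordered.

```python
def check_citation(citation: dict, page_text: str) -> list:
    """Return a list of problems found in a citation (empty if none)."""
    problems = []

    page = citation.get("page")
    if not isinstance(page, int) or page < 1:
        problems.append("page must be a 1-indexed integer")

    bbox = citation.get("bbox") or []
    # Assumption: the box is ordered so that x0 < x1 and y0 < y1.
    if len(bbox) != 4 or not (bbox[0] < bbox[2] and bbox[1] < bbox[3]):
        problems.append("bbox must be [x0, y0, x1, y1] with x0 < x1 and y0 < y1")

    text = citation.get("text") or ""
    # "Verbatim" is taken literally: the span must occur character for
    # character in the page text.
    if not text or text not in page_text:
        problems.append("text is not a verbatim span of the page")

    return problems


citation = {
    "document_id": "9b1f…",
    "page": 7,
    "bbox": [72.0, 540.2, 540.0, 558.6],
    "text": "Customer shall pay each invoice within thirty (30) days of the invoice date.",
}
page_text = "4.2 Payment. " + citation["text"] + " Late amounts accrue interest."
print(check_citation(citation, page_text))  # []
```

A strict substring test is deliberately unforgiving. Text extracted from a real PDF may differ in whitespace or hyphenation, so a production check would probably normalise both sides first.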

Why this matters

  • Trust. A reviewer in the workspace clicks the citation chip and the PDF viewer opens with the exact span highlighted. No guessing what the AI “saw”.
  • Audit. Every authoritative obligation in Dataverse can be traced back to ink. Compliance teams stop asking “where did this come from?”.
  • Drafting. When the platform proposes a counter-position or a renewal recommendation, it cites the source span it is responding to.

No citation, no write

The extractor refuses to emit an obligation without a citation. If a clause is genuinely absent from the document, the extractor returns nothing for that obligation_type; it reports only what the text supports.
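
The same rule could also be enforced as a guard in front of the write path. The following is a minimal sketch of that idea, not a description of how clmSpace implements it; `keep_cited` and the sample candidates are made up for illustration.

```python
def keep_cited(candidates: list) -> list:
    """Drop any candidate obligation that lacks a non-empty citation text."""
    kept = []
    for candidate in candidates:
        citation = candidate.get("citation") or {}
        text = citation.get("text")
        if isinstance(text, str) and text.strip():
            kept.append(candidate)
    return kept


candidates = [
    {
        "obligation_type": "Payment",
        "value_text": "Net 30",
        "citation": {
            "page": 7,
            "text": "Customer shall pay each invoice within thirty (30) days "
                    "of the invoice date.",
        },
    },
    # No supporting span: this candidate must never be written.
    {"obligation_type": "Renewal", "value_text": "12 months", "citation": None},
]

print([c["obligation_type"] for c in keep_cited(candidates)])  # ['Payment']
```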