Architecture at a glance
clmSpace runs on Microsoft 365 and Azure with Dataverse as the system of record. This page walks through every component your tenant interacts with.
Topology
From the user’s point of view there is one product: the portal at app.clmspace.com. Underneath, four planes do the work; the table lists them alongside the portal UI itself.
| Plane | Component | Purpose |
|---|---|---|
| Identity | Microsoft Entra ID | SSO. Per-tenant app registration; user identity flows through to every backend call. |
| Storage of record | Microsoft Dataverse | Agreements, obligations, citations, playbook entries, lifecycle events, audit log. |
| File storage | SharePoint Online | Source PDFs of agreements and templates live in tenant-owned SharePoint libraries. |
| Compute | Azure Container Apps (UK South) | The backend API, extraction, and analysis. A scheduled job checks bound SharePoint folders roughly every five minutes. |
| UI | Next.js portal (this app) | Workspace at /, docs at /docs. Delivered through Vercel (London); the system of record stays in Dataverse and SharePoint (see Data residency). |
The data path
A new contract dropped into a tenant’s SharePoint Agreements folder goes through this sequence:
- Intake checks each bound folder roughly every five minutes through Microsoft Graph and picks up new files.
- New files are pulled and parsed: PDF (with OCR for scanned pages), Word, Excel, CSV and plain text. A schema-locked Claude Sonnet extraction call then returns structured obligations with verbatim citation spans. Claude Sonnet is the default extraction model, chosen for enterprise-grade recall.
- Obligations land in Dataverse as AI-proposed. Each citation carries the bounding box the in-app PDF viewer uses to highlight the original text.
- An analysis pass with Claude Sonnet compares each obligation to the tenant’s playbook, raising deviations where the actual position strays from the standard.
- An item is treated as authoritative only after a human verifies it through the Verification gate; until then it stays AI-proposed.
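The lifecycle above can be sketched as a small data model. This is an illustrative sketch only, not the clmSpace schema: the class names, field names, the `(x0, y0, x1, y1)` bounding-box layout, and the playbook being a plain clause-to-position mapping are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Status(Enum):
    AI_PROPOSED = "ai_proposed"  # written by extraction, not yet reviewed
    VERIFIED = "verified"        # confirmed by a human reviewer


@dataclass(frozen=True)
class Citation:
    page: int
    text: str                                # verbatim span from the source
    bbox: Tuple[float, float, float, float]  # assumed (x0, y0, x1, y1) layout


@dataclass
class Obligation:
    clause: str          # hypothetical clause key, e.g. "payment_terms"
    position: str        # the position the contract actually takes
    citation: Citation
    status: Status = Status.AI_PROPOSED
    verified_by: Optional[str] = None


def find_deviations(obligations: List[Obligation],
                    playbook: Dict[str, str]) -> List[Obligation]:
    """Return obligations whose position differs from the playbook standard."""
    return [
        o for o in obligations
        if o.clause in playbook and o.position != playbook[o.clause]
    ]


def verify(obligation: Obligation, reviewer: str) -> Obligation:
    """Mark an obligation as verified by a named human reviewer."""
    if not reviewer:
        raise ValueError("a human reviewer is required")
    obligation.status = Status.VERIFIED
    obligation.verified_by = reviewer
    return obligation


def is_authoritative(obligation: Obligation) -> bool:
    """Only human-verified obligations count as authoritative."""
    return obligation.status is Status.VERIFIED
```

In this sketch an obligation starts as `AI_PROPOSED`, can be flagged as a deviation at any point, and only becomes authoritative after `verify` records a reviewer.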
Where your data lives
- Dataverse: your derived structured contract data (agreements, obligations, citations) lives in your own Microsoft Power Platform tenant, one environment per tenant.
- SharePoint: source documents stay in your own Microsoft 365 tenancy. clmSpace reads each source file through Microsoft Graph and keeps no separate copy of it.
- Backend API: the extraction and analysis service runs on Azure Container Apps in UK South. A read model in Neon Postgres (AWS London, UK) holds a tenant-scoped replica of derived structured data so lists and dashboards stay fast.
- Anthropic: text from your documents is sent to Anthropic (United States) for inference only. Under Anthropic’s API terms and our commercial agreement, this data is not used to train models. See Anthropic data handling.
The two SharePoint roles
Every bound folder carries a role:
Templates
Tenant-authored draft contracts. Their extracted obligations feed your Tier 1 standards. They are never treated as live obligations on a counterparty.
Agreements
Signed contracts. Files dropped here are extracted and surfaced in the workspace as live obligations and renewal candidates.
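One way to picture the effect of a folder's role is as a routing decision made after extraction. The sketch below is a hypothetical illustration: `FolderRole`, `route_extraction`, and the flag names are invented for the example and are not part of the clmSpace API.

```python
from enum import Enum
from typing import Dict


class FolderRole(Enum):
    TEMPLATES = "templates"
    AGREEMENTS = "agreements"


def route_extraction(role: FolderRole) -> Dict[str, bool]:
    """Decide where obligations extracted from a bound folder end up."""
    if role is FolderRole.TEMPLATES:
        # Drafts feed the standards and never become live obligations.
        return {
            "feeds_standards": True,
            "live_obligations": False,
            "renewal_candidates": False,
        }
    if role is FolderRole.AGREEMENTS:
        # Signed contracts surface in the workspace as live items.
        return {
            "feeds_standards": False,
            "live_obligations": True,
            "renewal_candidates": True,
        }
    raise ValueError(f"unknown folder role: {role!r}")
```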
Intake is built in
SharePoint folder bindings and the built-in intake schedule move documents from SharePoint into clmSpace for you: bind a folder once and new files are picked up automatically. A webhook integration path is also available for tenants that prefer to push documents in directly.
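As a rough illustration of what "picked up automatically" means, the sketch below filters one folder listing down to the files intake has not yet seen. It assumes items shaped like Microsoft Graph driveItems (an `id`, a `name`, and a `file` facet on files), and the suffix list is a guess based on the formats named on this page. It does not describe the actual intake implementation.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Set

# Assumed mapping of the supported formats to file suffixes.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".xlsx", ".csv", ".txt"}


def pick_new_files(listing: List[Dict], seen_ids: Set[str]) -> List[Dict]:
    """Return supported files in the listing that have not been processed yet."""
    new_files = []
    for item in listing:
        if "file" not in item:
            continue  # sub-folders carry no "file" facet
        if item["id"] in seen_ids:
            continue  # already picked up on an earlier run
        suffix = PurePosixPath(item["name"]).suffix.lower()
        if suffix not in SUPPORTED_SUFFIXES:
            continue  # unsupported format
        new_files.append(item)
    return new_files
```

A scheduled job could call `pick_new_files` on each run and add the returned IDs to `seen_ids`, so that a file is processed once no matter how many times the folder is checked.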