AnythingGraph — VLM document extraction guide

Overview

AnythingGraph does not run OCR or vision models inside the platform. You provide the document; your VLM (GPT-4o, Claude, Gemini, or an on-prem model) returns JSON that matches your entity structure. The graph stores validated rows, applies playbook mappings, and exposes the same data to the dashboard and MCP agents.

Each entity field supports AI metadata: description, example, and extraction_hint (where to find the value on a page). Playbooks package record types, relationships, connector mappings, and ingest instructions.
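
As a rough illustration of the AI metadata described above, the sketch below models one field and renders it as a single prompt line. The FieldSpec class, the type and required attribute names, the describe_field helper, and the sample description text are assumptions made for this example, not part of AnythingGraph; only description, example, and extraction_hint come from the guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """One entity field plus its AI metadata (hypothetical shape, for illustration only)."""
    field_name: str
    type: str
    required: bool = False
    description: Optional[str] = None      # what the value means
    example: Optional[str] = None          # a representative value
    extraction_hint: Optional[str] = None  # where to find the value on a page


def describe_field(field: FieldSpec) -> str:
    """Render one field as a single line suitable for a VLM prompt."""
    head = f"{field.field_name} ({field.type}{', required' if field.required else ''})"
    parts = [head]
    if field.description:
        parts.append(field.description)
    if field.example:
        parts.append(f"e.g. {field.example}")
    if field.extraction_hint:
        parts.append(f"look: {field.extraction_hint}")
    return "; ".join(parts)


# Illustrative values; the hint text is the one quoted later in this guide.
invoice_number = FieldSpec(
    field_name="invoice_number",
    type="string",
    required=True,
    description="Supplier-assigned invoice identifier",
    example="INV-2024-0042",
    extraction_hint="top-right corner, labeled Invoice #",
)
```

Calling describe_field(invoice_number) yields one line that combines the name, type, description, example, and hint, which is roughly what a vision model needs to locate the value.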

Recommended playbook for invoices: invoice-records-structured. It assumes extraction happens outside AnythingGraph; the playbook validates and graphs invoice facts.

Prerequisites

Run the stack from the repository root: ./start-all.sh (data-layer :8182, dashboard API :5180, MCP HTTP :3333/mcp when started separately).
Install a playbook in the dashboard (Playbooks → Install), or create custom record types under Entity structure with AI metadata filled in.
Choose how your orchestrator fetches schema:
- REST — dashboard API (GET /api/entities/:id, GET /api/playbooks/:id).
- MCP — connect Cursor, Claude Desktop, or your agent runtime to AnythingGraph MCP; the host calls get_entity before vision.

Step 1 — Define or install schema

Option A — Playbook (catalog)

Install a catalog playbook such as invoice-records-structured, procure-to-pay, or document-registry. Playbook JSON lives under dashboard/backend/src/playbook/playbooks/ and defines entities, fields, and example payloads.

Option B — Custom entity structure

In the dashboard, open Entity structure → create or edit a record type → expand AI metadata on each field. Hints such as “top-right corner, labeled Invoice #” improve VLM accuracy.

Step 2 — Fetch the extraction specification

Before calling your VLM, assemble a machine-readable spec: entity names, field names, types, required flags, descriptions, examples, and extraction hints. Use either REST (direct HTTP) or MCP (agent host invokes tools).

| Method | Best for | Primary calls |
| --- | --- | --- |
| REST | Custom pipelines, server-side ETL, dashboard API | `GET /api/entities/:id`, `GET /api/playbooks/:id` |
| MCP | Agent + VLM in one session (no manual copy/paste) | `list_entities` → `get_entity` |

Note: get_graph_query_context and anythinggraph://schema-summary are for graph Q&A, not rich extraction prompts. Always use get_entity (MCP) or GET /api/entities/:id (REST) for field-level metadata.
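
One way to assemble such a spec from entity definitions is sketched below. The input shape (a name plus a fields list whose items carry field_name, type, required, description, example, and extraction_hint) is an assumption; check the actual response of GET /api/entities/:id or get_entity and adjust the key names before relying on it.

```python
from typing import Any, Dict, List

# Keys copied into the spec. The entity payload shape is an assumption, not a documented contract.
FIELD_KEYS = ("field_name", "type", "required", "description", "example", "extraction_hint")


def build_extraction_spec(entities: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Reduce entity definitions to an entities[].fields[] structure for a VLM prompt."""
    spec_entities = []
    for entity in entities:
        fields = [
            # Missing keys become None so every field has the same set of keys.
            {key: field.get(key) for key in FIELD_KEYS}
            for field in entity.get("fields", [])
        ]
        spec_entities.append({"name": entity.get("name"), "fields": fields})
    return {"entities": spec_entities}
```

Keeping the spec to a fixed set of keys makes the prompt predictable and drops any storage-level attributes the model does not need.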

REST API

List entities, then fetch each definition (includes extraction_hint):

curl -s http://127.0.0.1:5180/api/entities
curl -s http://127.0.0.1:5180/api/entities/1

# Playbook catalog + install status (field defs from entities after install)
curl -s http://127.0.0.1:5180/api/playbooks/invoice-records-structured
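
The same two-step fetch can be scripted. The sketch below uses only the Python standard library and assumes that GET /api/entities returns a JSON array of objects with an id key; that response shape is not confirmed by this guide, so verify it against your running dashboard API. The function names are illustrative.

```python
import json
from typing import Any, Callable, List
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:5180"


def http_get_json(url: str) -> Any:
    """GET a URL and decode the JSON response body."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_entity_definitions(
    base_url: str = BASE_URL,
    get_json: Callable[[str], Any] = http_get_json,
) -> List[Any]:
    """List entities, then fetch each full definition by id.

    Assumes the list endpoint returns a JSON array of objects with an "id" key.
    """
    root = base_url.rstrip("/")
    listing = get_json(f"{root}/api/entities")
    return [get_json(f"{root}/api/entities/{item['id']}") for item in listing]
```

The get_json parameter exists so the function can be exercised with a stub instead of a live server.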

MCP (agent host)

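Start the MCP HTTP server (cd mcp-service && npm run start:http), then connect Cursor, Claude Desktop, or your orchestrator to http://127.0.0.1:3333/mcp. Tool sequence:

1. health_check
2. get_graph_query_context with optional playbook_id (only to discover which entities are in scope, not for field-level metadata)
3. get_entity for each entity in scope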

{
  "mcpServers": {
    "anythinggraph": {
      "url": "http://127.0.0.1:3333/mcp"
    }
  }
}

Interactive schema fetcher

Use the panel below to pull schema from a running dashboard API and generate a VLM-ready extraction spec. MCP cannot be called from the browser directly; switch to the MCP tab for host configuration and agent instructions.

Connect an MCP host (not the raw VLM API) to AnythingGraph. The host calls tools, builds the spec, then passes document bytes to your vision model in the same agent run.

Tool sequence for extraction

1. health_check
2. get_graph_query_context with playbook_id (optional; only to discover the entity ids in scope)
3. get_entity with entity_id for each entity (the response includes extraction_hint)
4. Run the VLM on the document using the composed spec
5. Ingest: create_entity_row per record, or POST to the playbook webhook from your orchestrator
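
The schema-fetching part of this sequence might look like the sketch below in an agent host. call_tool is a hypothetical callable supplied by whatever MCP client library you use (tool name and arguments in, result out). The tool names and the entity_id and playbook_id argument names come from this guide; the empty argument object for health_check and the return structure are assumptions. Entity ids are passed in by the caller because the shape of the get_graph_query_context result is not documented here.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical adapter around an MCP client: (tool_name, arguments) -> result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def fetch_schema_via_mcp(
    call_tool: ToolCaller,
    entity_ids: List[int],
    playbook_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Run health_check, the optional scoping call, then get_entity for each id."""
    call_tool("health_check", {})
    context = None
    if playbook_id is not None:
        # Optional scoping call; returned as-is because its shape is not assumed here.
        context = call_tool("get_graph_query_context", {"playbook_id": playbook_id})
    entities = [call_tool("get_entity", {"entity_id": eid}) for eid in entity_ids]
    return {"context": context, "entities": entities}
```

The entities list is what feeds the extraction spec; the context value is only useful for deciding which entity ids to request.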

Step 3 — Build and run the VLM prompt

Pass the generated extraction spec plus the document (image or PDF pages) to your vision model.

{
  "task": "Extract structured business records from the attached document.",
  "rules": [
    "Output a JSON object with a records array.",
    "Use field_name keys exactly as defined in the schema.",
    "Use null for missing optional fields; do not invent values.",
    "Dates: prefer ISO-8601 (YYYY-MM-DD) when possible."
  ],
  "schema": { "...": "from Step 2 — entities[].fields[]" }
}
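
A prompt of this shape can be generated from the spec rather than written by hand. The sketch below reuses the task and rules text shown above; the build_prompt name is illustrative, and how the resulting string and the document are passed to a particular vision API is left to your orchestrator.

```python
import json
from typing import Any, Dict

TASK = "Extract structured business records from the attached document."

RULES = [
    "Output a JSON object with a records array.",
    "Use field_name keys exactly as defined in the schema.",
    "Use null for missing optional fields; do not invent values.",
    "Dates: prefer ISO-8601 (YYYY-MM-DD) when possible.",
]


def build_prompt(spec: Dict[str, Any]) -> str:
    """Serialize task, rules, and the Step 2 spec as one JSON prompt string."""
    payload = {"task": TASK, "rules": RULES, "schema": spec}
    # ensure_ascii=False keeps non-ASCII hints (for example "→") readable in the prompt.
    return json.dumps(payload, indent=2, ensure_ascii=False)
```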

Example model output for invoice-records-structured:

{
  "records": [
    {
      "invoice_number": "INV-2024-0042",
      "vendor_name": "Acme Supplies Ltd",
      "total_amount": 1250.0,
      "invoice_date": "2024-03-15"
    }
  ]
}
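
Model output is free text, so it is worth checking before ingest. The sketch below parses a response, tolerates a Markdown code fence around the JSON (a common habit of chat models), and reports unknown or missing keys. The function names and the allowed and required sets are illustrative; AnythingGraph performs its own validation on ingest, so this is only an optional early check.

```python
import json
from typing import Any, Dict, Iterable, List

FENCE = "`" * 3  # Markdown code fence, built indirectly so it can appear inside a fenced block


def parse_records(raw: str) -> List[Dict[str, Any]]:
    """Parse VLM text into the records list, tolerating a surrounding code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]  # drop the language tag that followed the opening fence
    data = json.loads(text)
    records = data.get("records") if isinstance(data, dict) else None
    if not isinstance(records, list):
        raise ValueError("model output must be a JSON object with a 'records' array")
    return records


def check_record(record: Dict[str, Any], allowed: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return a list of problems: keys outside the schema, then required keys that are missing or null."""
    allowed_set = set(allowed)
    problems = [f"unknown field: {key}" for key in sorted(record) if key not in allowed_set]
    problems += [f"missing required field: {key}" for key in sorted(required) if record.get(key) is None]
    return problems
```

An empty list from check_record means the record uses only schema keys and carries every required value.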

Step 4 — Ingest into AnythingGraph

Playbook webhook (recommended for bulk)

After the playbook is installed, POST the records to its webhook:

POST http://127.0.0.1:5180/api/playbooks/invoice-records-structured/webhook
Content-Type: application/json

{
  "records": [
    {
      "invoice_number": "INV-2024-0042",
      "vendor_name": "Acme Supplies Ltd",
      "total_amount": 1250,
      "invoice_date": "2024-03-15"
    }
  ]
}

The connector validates required fields, applies field mappings, routes rows to entities, and sends failures to the landing zone.
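
The request above can be sent from Python with the standard library alone. In the sketch below the URL pattern and body follow the example; the function names, the timeout, and returning only the status code are choices made for illustration, and the webhook's response body is not described in this guide. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.

```python
import json
from typing import Any, Dict, List
from urllib.request import Request, urlopen


def build_webhook_request(
    playbook_id: str,
    records: List[Dict[str, Any]],
    base_url: str = "http://127.0.0.1:5180",
) -> Request:
    """Build the POST request for a playbook webhook without sending it."""
    body = json.dumps({"records": records}).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/api/playbooks/{playbook_id}/webhook",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_records(playbook_id: str, records: List[Dict[str, Any]]) -> int:
    """Send the records and return the HTTP status code."""
    with urlopen(build_webhook_request(playbook_id, records), timeout=30) as response:
        return response.status
```

Separating request construction from sending makes the payload easy to inspect or log before anything reaches the connector.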

MCP row insert (single records)

create_entity_row(
  entity_id=<from list_entities>,
  values_json='{"invoice_number":"INV-2024-0042",...}'
)
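
Because values_json is a JSON string rather than an object, each record has to be serialized before the tool call. The sketch below builds one argument set per record; the row_insert_calls name is illustrative, and treating entity_id as an integer is an assumption based on the /api/entities/1 example earlier in this guide.

```python
import json
from typing import Any, Dict, List


def row_insert_calls(entity_id: int, records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one create_entity_row argument set per extracted record."""
    return [
        # sort_keys gives a stable string, which helps when comparing or logging payloads.
        {"entity_id": entity_id, "values_json": json.dumps(record, sort_keys=True)}
        for record in records
    ]
```

Each returned dictionary can then be passed as the arguments of a create_entity_row call by your MCP host.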

SDK

client.dashboard.playbook_webhook("invoice-records-structured", {"records": [...]})

Step 5 — Validate and use the graph

Review failed rows in the dashboard Landing zone.
Sync the RDF cache (Settings → Caching), then open Graph View.
Query via MCP with ask_graph, or with SPARQL (call sync_rdf_cache, then run_sparql).

Architecture

┌─────────────┐     REST or MCP      ┌──────────────────┐
│ Orchestrator│ ──────────────────► │ AnythingGraph    │
│ (your app)  │ ◄── schema / spec   │ data-layer + MCP │
└──────┬──────┘                     └────────▲─────────┘
       │                                   │
       │ document + spec                   │ JSON records
       ▼                                   │
┌─────────────┐                     ┌──────┴───────┐
│ VLM         │                     │ Connector /  │
│ (vision API)│                     │ webhook      │
└─────────────┘                     └──────────────┘

The VLM never talks to LMDB directly. Your orchestrator owns the loop: fetch schema → extract → ingest → optional graph queries.
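
The loop the orchestrator owns can be expressed as one small function with the three stages injected. This is a sketch of the control flow only: fetch_spec, extract, and ingest are hypothetical callables you would implement with REST or MCP calls and your vision API.

```python
from typing import Any, Callable, Dict, List

Spec = Dict[str, Any]
Record = Dict[str, Any]


def run_extraction(
    document: bytes,
    fetch_spec: Callable[[], Spec],
    extract: Callable[[bytes, Spec], List[Record]],
    ingest: Callable[[List[Record]], None],
) -> List[Record]:
    """Fetch schema, extract with the VLM, then ingest; returns the extracted records."""
    spec = fetch_spec()                # schema comes from AnythingGraph (REST or MCP)
    records = extract(document, spec)  # the VLM sees only the document and the spec
    if records:
        ingest(records)                # webhook or create_entity_row; skipped when nothing was extracted
    return records
```

Injecting the stages keeps the VLM vendor and the ingest path swappable without touching the loop itself.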

Troubleshooting

| Issue | What to check |
| --- | --- |
| REST fetch fails (network) | Dashboard API running on `:5180`; CORS enabled on dashboard; open this page via `http://` not `file://` |
| Playbook entities not found | Install the playbook first; entity names in LMDB must match playbook `entities[].name` |
| MCP tools missing | `cd mcp-service && npm run start:http`; data-layer on `:8182` |
| Ingest validation errors | Landing zone; required fields; field mappings if VLM keys differ from schema |
| Empty extraction hints | Playbook catalog JSON may omit hints; add them on entity fields in the dashboard or use `GET /api/entities/:id` after editing AI metadata |
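
When validation fails because the VLM used different keys than the schema, playbook field mappings are the place to fix it on the AnythingGraph side. As a client-side stopgap, the orchestrator can also rename keys before ingest; the sketch below is a hypothetical helper, and the mapping shown in the test data is an invented example.

```python
from typing import Any, Dict


def remap_keys(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename VLM output keys to schema field names; unmapped keys pass through unchanged."""
    return {mapping.get(key, key): value for key, value in record.items()}
```

For example, a mapping of {"Invoice #": "invoice_number"} turns a record keyed by the printed label into one keyed by the schema field name.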
