AnythingGraph — AI metadata

Definition

In AnythingGraph, every record type (entity) has fields, each with a name, a type, and a required flag. Beyond that, you can attach AI metadata to each field: three text properties that describe it for the machines and humans that build or run extraction and agent workflows.

AI metadata does not change validation rules or storage. It is documentation embedded in the schema so that vision-language models (VLMs), MCP tools, and your own ETL jobs know what each field means and where to find its value in source material.

In the dashboard, open Entity structure → create or edit a record type → expand AI metadata on each field row.

The three fields

Description

JSON key: description

Plain-language meaning of the field — what business fact it captures, units, or constraints agents should respect.

Example: “Unique invoice identifier from the vendor”

Example

JSON key: example

A representative value showing format and style. Helps models normalize output (casing, prefixes, date shape).

Example: INV-2024-0042

Extraction hint

JSON key: extraction_hint

Where or how to locate the value in a document, email, PDF, or upstream API payload — especially useful for VLMs reading scans.

Example: “Top-right header, labeled Invoice #”

All three are optional strings. Empty values are omitted or stored as blank; they do not block ingest or row creation.
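Because all three properties are optional, a client may want to tidy a field object before sending it. The following is a minimal sketch of such a client-side helper; `clean_ai_metadata` and `AI_METADATA_KEYS` are hypothetical names, not part of AnythingGraph, and the cleanup is optional since blank values do not block ingest.

```python
# Hypothetical client-side helper; not an AnythingGraph API.
AI_METADATA_KEYS = ("description", "example", "extraction_hint")


def clean_ai_metadata(field: dict) -> dict:
    """Return a copy of `field` with blank AI metadata keys dropped.

    Non-blank values are stripped of surrounding whitespace.
    Keys other than the three AI metadata properties are left untouched.
    """
    cleaned = dict(field)
    for key in AI_METADATA_KEYS:
        value = cleaned.get(key)
        if value is None or not str(value).strip():
            # Blank or missing: omit the key entirely.
            cleaned.pop(key, None)
        else:
            cleaned[key] = str(value).strip()
    return cleaned
```

Omitting blank keys rather than sending empty strings keeps payloads small; either form should be acceptable according to the rule above.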

Where to set it

Source	When to use
Dashboard → Entity structure	Custom record types; edit anytime; best for iterative VLM tuning
Playbook JSON (catalog)	Ship defaults with starter packs under `playbook/playbooks/`
Data-layer API / MCP	`POST /entities`, `PUT /entities/:id`, or MCP `create_entity` / `update_entity` with `description`, `example`, `extraction_hint` in each field object
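As an illustration of the data-layer route, the sketch below prepares a `POST /entities` request with AI metadata on each field object. Only the endpoint path and the per-field keys come from this page; the top-level payload keys (`name`, `fields`), the base URL used in the usage note, and the helper names `make_field` and `build_create_request` are assumptions made for the example, so check them against the data-layer API reference before relying on them.

```python
import json
import urllib.request


def make_field(field_name, field_type, is_required=False, **ai_metadata):
    """Build one field object, adding only the AI metadata keys that are set."""
    field = {
        "field_name": field_name,
        "field_type": field_type,
        "is_required": is_required,
    }
    for key in ("description", "example", "extraction_hint"):
        if ai_metadata.get(key):
            field[key] = ai_metadata[key]
    return field


def build_create_request(base_url, entity_name, fields):
    """Prepare (but do not send) a POST /entities request."""
    # ASSUMPTION: the top-level keys "name" and "fields" are illustrative.
    payload = {"name": entity_name, "fields": fields}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/entities",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A call such as `build_create_request("http://localhost:8000", "invoice", [make_field("invoice_number", "TEXT", True, example="INV-2024-0042")])` returns a request object that can be passed to `urllib.request.urlopen`; the host and port here are placeholders.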

How it is used

VLM document extraction — fetch entity schema, build a prompt from AI metadata, extract JSON, ingest via playbook webhook. See the VLM extraction guide.
MCP agents — get_entity returns full field definitions including AI metadata for tool-using assistants.
REST integrators — GET /api/entities/:id (dashboard) or GET /entities/:id (data-layer) expose the same properties.
Dashboard entity detail — view descriptions and extraction hints when reviewing schema (expand field metadata on the entity page).
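The "build a prompt from AI metadata" step in the VLM flow above could look roughly like this. It is a sketch under assumptions: `build_extraction_prompt` is a hypothetical helper, the prompt wording is arbitrary, and the input is assumed to be a list of field objects shaped like the API JSON shown on this page.

```python
def build_extraction_prompt(entity_name: str, fields: list[dict]) -> str:
    """Turn field definitions plus AI metadata into a plain-text prompt."""
    lines = [
        f"Extract one '{entity_name}' record from the attached document.",
        "Return a single JSON object with exactly these keys:",
    ]
    for field in fields:
        required = "required" if field.get("is_required") else "optional"
        line = f"- {field['field_name']} ({field.get('field_type', 'TEXT')}, {required})"
        if field.get("description"):
            line += f": {field['description']}"
        lines.append(line)
        # Example and extraction hint are optional; add them only when present.
        if field.get("example"):
            lines.append(f"  Example value: {field['example']}")
        if field.get("extraction_hint"):
            lines.append(f"  Where to look: {field['extraction_hint']}")
    lines.append("Use null for any value you cannot find.")
    return "\n".join(lines)
```

The resulting string is sent to the vision model together with the document image; AnythingGraph itself does not run the model.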

AI metadata is not used for row-level access control (ReBAC uses subject + playbook headers) or for automatic OCR. AnythingGraph only stores the metadata; your pipeline runs the vision model.

Example field in API JSON

{
  "field_name": "invoice_number",
  "field_type": "TEXT",
  "is_required": true,
  "is_identifier": true,
  "description": "Unique invoice identifier from the vendor",
  "example": "INV-2024-0042",
  "extraction_hint": "Top-right of page 1, labeled Invoice #"
}

API access

# Full entity with AI metadata on every field
curl -s http://127.0.0.1:5180/api/entities/1

# MCP (agent host)
# list_entities → get_entity(entity_id)
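The same dashboard request can be made from Python, for example to audit which fields still lack AI metadata. This is a sketch only: the URL matches the curl call above, but the response shape (a top-level `fields` list of field objects) is an assumption, and `fetch_entity` and `fields_missing_metadata` are hypothetical helper names.

```python
import json
import urllib.request

AI_KEYS = ("description", "example", "extraction_hint")


def fetch_entity(entity_id: int, base_url: str = "http://127.0.0.1:5180") -> dict:
    """GET /api/entities/:id from the dashboard and parse the JSON body."""
    url = f"{base_url}/api/entities/{entity_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def fields_missing_metadata(entity: dict) -> dict:
    """Map each field name to the AI metadata keys that are absent or blank."""
    # ASSUMPTION: field definitions live under a top-level "fields" list.
    missing = {}
    for field in entity.get("fields", []):
        absent = [k for k in AI_KEYS if not (field.get(k) or "").strip()]
        if absent:
            missing[field.get("field_name", "?")] = absent
    return missing
```

Running `fields_missing_metadata(fetch_entity(1))` against a local dashboard would list the fields worth filling in before tuning a VLM prompt.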

Related concepts

Identifier field (is_identifier) — marks the primary business key for a row; separate from AI metadata but often set on the same field.
Playbook field mappings — rename incoming JSON keys to canonical field names; mappings are not AI metadata.
Graph query manifest — entity names and relationships for ask_graph; does not include extraction hints (use get_entity instead).
