Metadata-First Knowledge Graphs: Solving the Enterprise AI Context Problem
The Most Expensive Question in Enterprise AI
Every day, millions of AI-powered applications ask the same question: "What does this organization know?"
And every day, they answer it the same way: badly.
A modern enterprise AI assistant responding to "Who owns the authentication service?" must search email (312 results), scan Slack (47 threads), parse meeting transcripts (12 mentions), check the wiki (3 outdated pages), and query the directory (1 person who transferred 8 months ago). It loads 40,000–55,000 tokens of context, makes sequential API calls to 4-6 systems, waits 6-15 seconds, and (here's the expensive part) does this from scratch on every single query.
Now multiply that by every AI application in the organization. The chat assistant does it. The code assistant does it. The meeting summarizer does it. The onboarding tool does it. Each one independently builds the same fragmented picture, consuming the same tokens, hitting the same APIs, implementing the same permission checks.
At enterprise scale (organizations with 100,000+ employees), this architectural failure represents hundreds of millions of dollars annually in redundant inference costs, duplicated infrastructure, and wasted engineering time. Gartner projects that over 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. Missing organizational context directly compounds each of these failure modes. Foundation Capital, meanwhile, calls organizational context graphs "AI's trillion-dollar opportunity."
This article proposes an alternative architecture.
Why RAG Alone Doesn't Solve This
Retrieval-Augmented Generation (RAG) was the first wave of enterprise AI architecture. It works: ingest documents, create embeddings, retrieve relevant chunks at query time, and pass them to the LLM. For document Q&A, it's excellent.
But RAG has a fundamental limitation for organizational intelligence: it treats all information as content to be searched, when most organizational questions are about relationships.
Consider these queries:
- "Who actually owns the retry logic in the payment pipeline?" (relationship: person → code → system)
- "Are two teams building the same caching layer?" (relationship: team → project → capability overlap)
- "What decisions were made about the Q3 roadmap?" (relationship: meeting → decision → project → timeline)
RAG retrieves documents that mention these concepts. It doesn't know how they connect. The LLM must infer relationships from context, requiring massive token budgets and still frequently hallucinating connections that don't exist.
The fundamental insight: most organizational queries are graph traversal problems, not search problems.
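To make the distinction concrete, here is a minimal sketch of the first query above treated as a traversal rather than a search. The edge list, relation names (`HAS_COMPONENT`, `LAST_COMMITTED_BY`), and people are hypothetical; a production system would issue an equivalent query against a graph database instead of walking an in-memory dictionary.

```python
from collections import deque

# Hypothetical typed edges: (source, relation, target). All names are illustrative.
EDGES = [
    ("payment-pipeline", "HAS_COMPONENT", "retry-logic"),
    ("retry-logic", "LAST_COMMITTED_BY", "dana"),
    ("retry-logic", "REVIEWED_BY", "lee"),
    ("dana", "MEMBER_OF", "payments-team"),
]


def build_adjacency(edges):
    """Index edges by source node for constant-time neighbor lookup."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def find_paths(adjacency, start, wanted_relation, max_depth=3):
    """Breadth-first search: return every path from `start` whose last edge
    has the relation `wanted_relation`, following at most `max_depth` edges."""
    results = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_depth:
            continue  # the depth bound also stops runaway walks on cyclic graphs
        for relation, target in adjacency.get(node, []):
            new_path = path + [(node, relation, target)]
            if relation == wanted_relation:
                results.append(new_path)
            queue.append((target, new_path))
    return results


adjacency = build_adjacency(EDGES)
paths = find_paths(adjacency, "payment-pipeline", "LAST_COMMITTED_BY")
owners = [path[-1][2] for path in paths]
print(owners)
```

The answer falls out of the edges themselves: no document text is read and no model is asked to infer the connection.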
The Two-Tier Architecture
The architecture I propose inverts the enterprise AI cost structure through two complementary layers:
Tier 1: Metadata Knowledge Graph (Zero LLM Cost)
Core idea: Pre-compute organizational relationships from metadata only, with no full document ingestion required.
Source Systems (Email, Calendar, Slack, Directory, Wiki, Tickets)
│
▼ (metadata extraction, not content)
Relationship Extraction
│
├── WHO: Person → Team → Manager → Org
├── WHAT: Project → Service → Component → Owner
├── WHEN: Meeting → Decision → Action Item → Deadline
├── WHERE: Document → Author → Reviewers → Related Docs
└── HOW: Communication frequency → Collaboration strength
│
▼
Knowledge Graph (Neptune / Neo4j / equivalent)
│
▼
Graph Traversal API (sub-second, zero LLM cost)
What metadata reveals without reading content:
- Email headers: who communicates with whom, about what projects (subject lines), how frequently
- Calendar data: who attends which meetings, recurring vs. one-off, duration signals importance
- Directory: org structure, role changes, team membership
- Ticket systems: who's assigned what, which projects are active, what's blocked
- Code repos: who commits where, who reviews whom, which services change together
- Wiki: who authored what, when it was last updated, which pages link to each other
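As an illustration of the first bullet, the sketch below turns email headers into graph edges using Python's standard `email` package. The `[project-tag]` subject-line convention, the relation names, and the sample message are assumptions made for the example; real subject lines would need a more robust project matcher, and a real pipeline would request headers only from the mail system.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Sample message. The body is present only to show that it is never inspected.
RAW_MESSAGE = """From: Dana <dana@example.com>
To: Lee <lee@example.com>, sam@example.com
Subject: [auth-migration] rollout timing
Date: Mon, 03 Mar 2025 10:00:00 +0000

Body text that this sketch never reads.
"""


def edges_from_headers(raw_message):
    """Derive (source, relation, target) edges from header fields only."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg["From"])[1]
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    subject = msg["Subject"] or ""

    edges = [(sender, "EMAILED", recipient) for recipient in recipients]

    # Assumed convention: a leading "[tag]" in the subject names a project.
    if subject.startswith("[") and "]" in subject:
        project = subject[1:subject.index("]")]
        edges.append((sender, "DISCUSSED", project))
    return edges


header_edges = edges_from_headers(RAW_MESSAGE)
for edge in header_edges:
    print(edge)
```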
The 70% insight: In my experience, approximately 70% of organizational queries ("who owns X?", "what team handles Y?", "when was Z decided?") are answerable through graph traversal of pre-computed relationships alone. No LLM required. Sub-second response. Zero token cost.
Tier 2: Graph-Informed Content Fetch (Targeted LLM Cost)
For the remaining 30% (questions that require actual content understanding, such as "summarize the authentication team's concerns about the migration"), the knowledge graph provides surgical targeting:
User Query: "What were the concerns about migrating to the new auth system?"
│
▼
Tier 1 Graph Traversal
│
├── Identifies: Auth team members (from directory + commit history)
├── Identifies: Migration project (from tickets + meetings)
├── Identifies: Relevant time window (from project timeline)
└── Identifies: Communication artifacts (5 emails, 2 meeting summaries)
│
▼
Tier 2: Fetch ONLY those 5 emails + 2 summaries
│
▼
LLM: Summarize concerns (2,000 tokens instead of 55,000)
The cost inversion:
- Traditional: Load everything → Filter with LLM → 55,000 tokens
- Two-tier: Graph determines relevance → Fetch only what matters → 2,000 tokens
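- Result: about 27x fewer input tokens in this example; the cost model later in this article, which budgets 2,000–8,000 tokens for Tier 2, works out to roughly 8x lower inference cost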
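The Tier 2 flow can be sketched in a few lines. Everything here is a stand-in: `ARTIFACTS_BY_PROJECT` plays the role of the Tier 1 graph result, `SOURCE_CONTENT` plays the role of the source systems, and the artifact IDs are invented. The point is only that content is fetched for the artifacts the graph selected and for nothing else.

```python
# Hypothetical Tier 1 output: artifact IDs the graph links to each project.
ARTIFACTS_BY_PROJECT = {
    "auth-migration": ["email-17", "email-22", "meeting-summary-3"],
}

# Stand-in for the source systems; a real deployment would call their APIs.
SOURCE_CONTENT = {
    "email-17": "Concern: token refresh breaks for legacy mobile clients.",
    "email-22": "Concern: rollback plan is untested.",
    "meeting-summary-3": "Team agreed to stage the rollout by region.",
    "email-99": "Lunch menu for Friday.",
}


def select_artifacts(project):
    """Tier 1: return the artifact IDs the graph associates with a project."""
    return list(ARTIFACTS_BY_PROJECT.get(project, []))


def fetch_selected(artifact_ids, fetch_one):
    """Tier 2: fetch content for the selected artifacts only."""
    return {artifact_id: fetch_one(artifact_id) for artifact_id in artifact_ids}


def build_prompt(question, documents):
    """Assemble the (small) context that would be sent to the LLM."""
    parts = [f"Question: {question}", "Sources:"]
    parts += [f"[{artifact_id}] {text}" for artifact_id, text in documents.items()]
    return "\n".join(parts)


selected = select_artifacts("auth-migration")
documents = fetch_selected(selected, SOURCE_CONTENT.__getitem__)
prompt = build_prompt(
    "What were the concerns about migrating to the new auth system?", documents
)
print(prompt)
```

The unrelated message never enters the prompt, which is where the token savings come from.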
Why Metadata-First (Not Content-First)
This is the counterintuitive design decision: why build the graph from metadata instead of ingesting full document content?
1. Security Simplification
Full content ingestion creates a massive security surface area. Every email body, document paragraph, and Slack message carries its own access controls, retention policies, and sensitivity classifications.
Metadata has a radically simpler permission model:
- "Person A emailed Person B about Project X on Date Y": this fact about a relationship is typically accessible to anyone in the organization
- "Here's what the email said": this content requires permission checks
By building Tier 1 on metadata only, you eliminate most of the security complexity (roughly 90%, by my estimate). Content access (Tier 2) is handled on demand with real-time permission checks: never stored, never indexed, never at rest in your system.
2. Freshness at Zero Marginal Cost
Content ingestion is expensive to keep fresh. Documents change. Emails arrive continuously. Re-embedding a 500-page wiki every time someone edits a paragraph is wasteful.
Metadata changes are events: "new email sent," "meeting created," "ticket assigned." Event-driven graph updates are cheap, immediate, and incremental. The graph stays fresh without batch reprocessing.
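A minimal sketch of such an incremental update follows. The event kinds mirror the examples above; the `MetadataEvent` shape, the relation names, and the in-memory dictionary standing in for the graph store are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEvent:
    kind: str        # e.g. "email_sent", "meeting_created", "ticket_assigned"
    subject: str     # node the event originates from
    obj: str         # node the event points to
    timestamp: str   # ISO 8601 in UTC, so string comparison matches time order


# Assumed mapping from event kinds to graph relations.
RELATION_BY_KIND = {
    "email_sent": "EMAILED",
    "meeting_created": "ORGANIZED",
    "ticket_assigned": "ASSIGNED_TO",
}


def apply_event(graph, event):
    """Upsert one edge in place; return False for event kinds we do not model."""
    relation = RELATION_BY_KIND.get(event.kind)
    if relation is None:
        return False
    key = (event.subject, relation, event.obj)
    edge = graph.setdefault(key, {"count": 0, "last_seen": event.timestamp})
    edge["count"] += 1
    edge["last_seen"] = max(edge["last_seen"], event.timestamp)
    return True


graph = {}
events = [
    MetadataEvent("email_sent", "dana", "lee", "2025-03-03T10:00:00Z"),
    MetadataEvent("email_sent", "dana", "lee", "2025-03-04T09:30:00Z"),
    MetadataEvent("ticket_assigned", "AUTH-42", "dana", "2025-03-04T11:00:00Z"),
    MetadataEvent("wiki_viewed", "lee", "auth-runbook", "2025-03-04T12:00:00Z"),
]
applied = [apply_event(graph, event) for event in events]
print(graph)
```

Each event touches exactly one edge, so the cost of staying fresh scales with the volume of changes rather than with the size of the corpus.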
3. The Cold Start Problem Vanishes
Content-based systems need to ingest everything before they're useful. A RAG system with 10% of documents indexed gives 10% of the value.
A metadata graph is useful from day one: connect the directory + calendar + email headers and you immediately know who works with whom, which meetings exist, and which projects are active. Full value from organizational structure without reading a single document.
4. Cross-System Relationships Emerge Automatically
The unique power of a graph approach: relationships that exist across systems, but are invisible within any single system, become explicit.
Example: The person who attends the "Auth Migration" meeting (calendar) is the same person who's assigned the blocking ticket (Jira) and who emailed the VP about concerns (email). No single system knows this. The graph knows it trivially.
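The "trivially" depends on one prerequisite: each system's local identifier must resolve to the same canonical person. The sketch below assumes the directory supplies that mapping; the identifiers, ticket key, and records are invented for the example.

```python
# Assumed directory mapping from system-local identifiers to one canonical ID.
DIRECTORY = {
    "dana@example.com": "emp-1001",
    "dkim": "emp-1001",            # the same person's ticket-system username
    "lee@example.com": "emp-1002",
}

# Hypothetical metadata records: (system, local_id, relation, target).
RECORDS = [
    ("calendar", "dana@example.com", "ATTENDS", "Auth Migration"),
    ("jira", "dkim", "ASSIGNED", "AUTH-42"),
    ("email", "dana@example.com", "EMAILED", "vp@example.com"),
    ("calendar", "lee@example.com", "ATTENDS", "Auth Migration"),
    ("jira", "unknown-user", "ASSIGNED", "AUTH-43"),
]


def merge_by_person(records, directory):
    """Group records from all systems under one canonical person ID."""
    merged = {}
    for system, local_id, relation, target in records:
        person = directory.get(local_id)
        if person is None:
            continue  # unresolved identities are skipped rather than guessed
        merged.setdefault(person, []).append((system, relation, target))
    return merged


merged = merge_by_person(RECORDS, DIRECTORY)
systems_for_dana = {system for system, _, _ in merged["emp-1001"]}
print(merged["emp-1001"])
```

Once the three records share a node, the cross-system pattern is a one-hop lookup.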
The Permission Model Challenge
The hardest problem in enterprise knowledge graphs isn't technology; it's permissions.
The Naive Approach Fails
"Index everything, check permissions at query time" sounds simple. In practice:
- Source system permissions change constantly (team transfers, project reassignments)
- Some data has time-based access (visible during the project, archived after)
- Organizational hierarchy access ≠ project-level access ≠ document-level access
- "Can see metadata" ≠ "Can see content"
A Layered Permission Architecture
The approach I advocate separates permissions into three layers:
Layer 1: Graph-Level Access (Structural Relationships)
- Who can see that Person A is connected to Project B?
- Generally: anyone in the organization can see structural relationships
- Exceptions: classified projects, HR-sensitive relationships
Layer 2: Metadata-Level Access (Contextual Facts)
- Who can see that there was a meeting about X on date Y with attendees A, B, C?
- Generally: meeting attendees + their management chain
- Configurable per source system
Layer 3: Content-Level Access (Actual Documents)
- Who can read the email/document/message?
- Delegated to source system: real-time permission check before serving content
- Never cached, never stored in the graph system
This separation means the graph itself (Layers 1-2) can be broadly accessible while content (Layer 3) remains under source-system control.
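One way to picture the three layers is as three separate predicates, with the third delegating to the source system. This is a sketch under assumptions, not a reference design: the `Edge` fields, the `cleared_users` set, and `fake_source_check` are hypothetical, and a real deployment would likely express these rules in a policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    restricted: bool = False                    # Layer 1 exception, e.g. classified project
    metadata_viewers: frozenset = frozenset()   # Layer 2 audience for contextual facts


def can_see_structure(user, edge, cleared_users):
    """Layer 1: structural relationships are open unless the edge is restricted."""
    return (not edge.restricted) or (user in cleared_users)


def can_see_metadata(user, edge, cleared_users):
    """Layer 2: contextual facts require Layer 1 access plus audience membership."""
    return can_see_structure(user, edge, cleared_users) and user in edge.metadata_viewers


def can_read_content(user, artifact_id, source_system_check):
    """Layer 3: always delegated to the source system at request time."""
    return bool(source_system_check(user, artifact_id))


meeting = Edge(
    "dana", "ATTENDED", "auth-migration-sync",
    metadata_viewers=frozenset({"dana", "lee", "dana-manager"}),
)
secret = Edge("sam", "WORKS_ON", "project-x", restricted=True)


def fake_source_check(user, artifact_id):
    """Stand-in for a real-time call to the owning system's permission API."""
    return (user, artifact_id) in {("dana", "email-17")}


print(can_see_structure("pat", meeting, set()))   # structural edge is visible
print(can_see_metadata("pat", meeting, set()))    # meeting details are not
```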
Cost Model: Why This Matters at Scale
The Math That Breaks Current Architectures
| Metric | Per Query (Current) | Per Query (Two-Tier) |
|---|---|---|
| Tokens consumed | 40,000–55,000 | 2,000–8,000 |
| API calls to source systems | 4-6 (sequential) | 0-2 (targeted) |
| Latency | 6-15 seconds | <1 second (Tier 1) / 2-4 sec (Tier 2) |
| Cost at $3/M input tokens | $0.12–$0.17 per query | $0.006–$0.024 per query |
The $3 per million input tokens figure corresponds to Claude Sonnet on Amazon Bedrock at current published rates ($3 input / $15 output per million tokens). Higher-tier models cost more and smaller models less, but the relative cost inversion between the two architectures holds regardless of the specific model.
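The per-query cost row is straightforward arithmetic on input tokens, reproduced below so the figures can be rechecked against a different price. Output tokens are ignored here, as they are in the table; the function name is mine.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the rate quoted above


def input_cost(tokens, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS):
    """Cost in USD of sending `tokens` input tokens at the given rate."""
    return tokens * price_per_million / 1_000_000


for label, tokens in [
    ("current, low", 40_000),
    ("current, high", 55_000),
    ("two-tier, low", 2_000),
    ("two-tier, high", 8_000),
]:
    print(f"{label:>15}: {tokens:>6} tokens -> ${input_cost(tokens):.3f}")
```

The high end of the current architecture comes to $0.165, which the table rounds to $0.17.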
At Enterprise Scale
| Organization Size | Daily Queries (est.) | Annual Cost: Current | Annual Cost: Two-Tier | Savings |
|---|---|---|---|---|
| 10,000 employees | 50,000 | $2.2M | $280K | $1.9M |
| 100,000 employees | 500,000 | $22M | $2.8M | $19.2M |
| 500,000 employees | 2,500,000 | $110M | $14M | $96M |
These estimates assume current LLM pricing (which is declining) and conservative query volumes. The actual savings may be higher as AI assistants become more deeply embedded in daily workflows.
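The annual figures can be approximated as daily queries × 365 × cost per query. The table does not state which per-query values it used, so the inputs below are a back-calculation on my part: $0.12 (the low end of the current range) and $0.015 (the midpoint of the two-tier range). Treat both as assumptions.

```python
DAYS_PER_YEAR = 365

CURRENT_COST_PER_QUERY = 0.12     # assumed: low end of the current range
TWO_TIER_COST_PER_QUERY = 0.015   # assumed: midpoint of $0.006 and $0.024


def annual_cost(daily_queries, cost_per_query):
    """Annual inference spend in USD for a steady daily query volume."""
    return daily_queries * DAYS_PER_YEAR * cost_per_query


for daily_queries in (50_000, 500_000, 2_500_000):
    current = annual_cost(daily_queries, CURRENT_COST_PER_QUERY)
    two_tier = annual_cost(daily_queries, TWO_TIER_COST_PER_QUERY)
    print(
        f"{daily_queries:>9,} queries/day: "
        f"current ${current:,.0f}, two-tier ${two_tier:,.0f}, "
        f"savings ${current - two_tier:,.0f}"
    )
```

Under these assumptions, 500,000 queries per day comes to about $21.9M versus $2.7M, close to the table's $22M and $2.8M; the small gap suggests the table used a slightly higher two-tier cost per query than the midpoint.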
Implementation Considerations
Technology Stack (Reference Architecture)
| Component | Options | Role |
|---|---|---|
| Knowledge Graph | Amazon Neptune, Neo4j, TigerGraph | Relationship storage + traversal |
| Semantic Search | OpenSearch, Elasticsearch, Pinecone | Content retrieval when Tier 2 is needed |
| Event Processing | Kafka, EventBridge, SQS | Real-time metadata ingestion |
| LLM Integration | Bedrock, OpenAI API, Anthropic | Tier 2 content summarization |
| Permission Engine | Custom or OPA | Three-layer access control |
Build vs. Buy
The market is emerging:
- Glean ($7.2B valuation, June 2025): full-content enterprise search with AI. Content-first, not metadata-first.
- Worklytics: organizational metadata analytics. Metadata-first but focused on ONA (Organizational Network Analysis), not AI-agent serving.
- Microsoft Graph: organizational metadata API. Provides raw data but no intelligence layer.
- Custom build: the two-tier architecture described here. Requires engineering investment but provides maximum control over cost optimization and the permission model.
No commercially available solution currently provides the full two-tier metadata-first architecture with cost inversion. The market opportunity remains open.
Risks and Open Questions
- Graph staleness: How quickly must metadata updates propagate? For most organizational queries, minutes are acceptable; for real-time collaboration queries, seconds matter.
- Schema evolution: As new source systems are added, the graph schema must evolve without breaking existing traversals.
- False negatives: If a relationship isn't in the graph, the system won't find it. Completeness of metadata ingestion is critical.
- Adversarial queries: Can users craft queries that exploit graph relationships to infer information they shouldn't access? The permission layer must account for transitive inference.
- LLM routing accuracy: Correctly deciding whether a query needs Tier 1 only (graph traversal) or Tier 2 (content fetch) is itself an AI problem. Mis-routing is expensive in either direction.
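To show where the routing decision sits, here is a deliberately naive, rule-based router. It is a placeholder, not a proposal: the patterns and cue words are invented, and a production router would more plausibly be a trained classifier evaluated against labeled queries. The one design choice worth noting is the default: when no rule matches, the sketch falls back to Tier 2, trading cost for answer completeness.

```python
import re

# Assumed patterns for questions a graph traversal can answer directly.
TIER1_PATTERNS = [
    r"^who (owns|manages|leads|works on)\b",
    r"^what team\b",
    r"^when was\b",
]

# Assumed cue words suggesting the answer requires reading content.
TIER2_CUES = ("summarize", "concerns", "why", "explain")


def route(query):
    """Return "tier1" for graph-only queries and "tier2" for content queries."""
    normalized = query.strip().lower()
    if any(cue in normalized for cue in TIER2_CUES):
        return "tier2"
    if any(re.search(pattern, normalized) for pattern in TIER1_PATTERNS):
        return "tier1"
    return "tier2"  # unsure: pay for content rather than risk an incomplete answer


for query in [
    "Who owns the authentication service?",
    "When was the Q3 roadmap decided?",
    "Summarize the authentication team's concerns about the migration",
    "Is the caching layer production ready?",
]:
    print(f"{route(query)}: {query}")
```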
The National Significance
Enterprise AI adoption is the defining technology challenge of this decade. The United States government recognizes this: Artificial Intelligence and Machine Learning sits on the White House Critical and Emerging Technologies list because national economic competitiveness depends on how effectively American organizations deploy AI.
The current architectural paradigm (every AI application independently rebuilding organizational context through brute-force token consumption) is a structural barrier to adoption. It makes AI expensive, slow, and security-fragmented. It limits deployment to well-funded teams that can absorb the cost.
A metadata-first knowledge graph architecture removes these barriers:
- 8x cost reduction makes AI economically viable for more use cases
- Sub-second response makes AI feel like a tool, not a waiting room
- Centralized security makes AI deployable without per-application security review
- Shared infrastructure eliminates redundant engineering across teams
This isn't a single-company problem. Every American organization with 10,000+ employees deploying AI across fragmented data sources faces the same challenge. The pattern described here, metadata-first knowledge graphs with two-tier cost inversion, is a reference architecture for the entire enterprise AI industry.
Conclusion
The enterprise AI context problem is a trillion-dollar challenge hiding in plain sight. Current approaches (brute-force RAG, per-application data pipelines, sequential API calls) work at small scale but break catastrophically as organizations deploy AI to hundreds of thousands of users.
The two-tier metadata-first knowledge graph architecture provides a path forward:
- Tier 1 answers 70% of queries through graph traversal at zero LLM cost
- Tier 2 surgically retrieves only relevant content for the remaining 30%
- The permission model separates structural access from content access
- The cost structure inverts from "expensive by default" to "expensive only when necessary"
The organizations that solve this problem first will deploy AI faster, cheaper, and more securely than their competitors. The architectural pattern exists. The market has validated the category. The question is who builds it.
Cost and query-distribution figures in this article are illustrative estimates based on current public LLM pricing and the author's professional experience; they are intended to model the economics of the pattern, not to report any specific organization's metrics.
References
- Gartner, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027" (June 2025).
- Foundation Capital, "Context Graphs: AI's Trillion-Dollar Opportunity" (2025).
- TechCrunch, "Enterprise AI startup Glean lands a $7.2B valuation" (June 2025).
- White House Office of Science and Technology Policy, Critical and Emerging Technologies List (February 2024 update).
- Amazon Web Services, Amazon Bedrock Pricing (per-token model rates).
Sebastian Undurraga is a Senior Technical Program Manager specializing in enterprise AI architectures, knowledge graph systems, and deploying AI at organizational scale. He designs systems that make AI economically viable for workforces of hundreds of thousands.