Knowledge Graphs for AI Coding Assistants

An AI coding assistant is only as good as the context it can fit in a prompt. A knowledge graph of your repository gives it a compact, navigable map — so it reasons about structure instead of grepping raw files.

The context-window problem

Every coding assistant — Claude Code, OpenAI Codex, OpenCode, OpenClaw, Factory Droid — hits the same ceiling: a codebase plus its docs, RFCs, papers, and diagrams does not fit in a single prompt. Traditional RAG splits everything into chunks and retrieves by embedding similarity, but that loses structural information: who calls whom, which module depends on which, what rationale sat in the commit message that created a function.

A knowledge graph preserves that structure. Nodes are concepts — classes, functions, design decisions, paper sections, diagrams. Edges are relationships: calls, imports, rationale_for, semantically_similar_to. Instead of retrieving chunks, the assistant traverses edges.
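As a minimal sketch, such a graph is just typed nodes plus typed edges with a traversal operation. Names and relation types here are illustrative, not tracely360's internal representation:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy typed graph: nodes are concepts, edges carry a relation type."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"kind": ...}
        self.edges = defaultdict(list)  # src id -> [(relation, dst id)]

    def add_node(self, node_id, kind):
        self.nodes[node_id] = {"kind": kind}

    def add_edge(self, src, relation, dst):
        self.edges[src].append((relation, dst))

    def neighbors(self, node_id, relation=None):
        """Traverse outgoing edges, optionally filtered by relation type."""
        return [dst for rel, dst in self.edges[node_id]
                if relation is None or rel == relation]

g = KnowledgeGraph()
g.add_node("DigestAuth", "class")
g.add_node("Response", "class")
g.add_node("rfc2617.md", "doc")
g.add_edge("DigestAuth", "calls", "Response")
g.add_edge("rfc2617.md", "rationale_for", "DigestAuth")

print(g.neighbors("DigestAuth", "calls"))  # ['Response']
```

The point of the structure: answering "what does DigestAuth call?" is one edge lookup, not a similarity search over text chunks.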

Why graphs beat vector search for code

Structure is signal

A calls edge like DigestAuth → Response is meaningful whether or not the two files share vocabulary. Vector similarity, which only measures textual overlap, misses such structural relationships.

Provenance is preserved

Every edge in tracely360 is tagged EXTRACTED, INFERRED, or AMBIGUOUS with a confidence score. You always know what was found vs guessed.

Multi-modal by construction

A diagram node can connect to a code class node and a paper-section node on the same graph — impossible with a flat vector store.

Compression compounds

On a 52-file mixed corpus, an average query costs ~1.7k tokens against the graph vs ~123k reading raw files — a 71.5× reduction.

How tracely360 fits into your assistant

tracely360 ships as a slash command. Type /tracely360 . in Claude Code, Codex, Cursor, VS Code Copilot Chat, or any of the other 14 supported assistants. It writes a tracely360-out/ folder containing an interactive graph.html, a one-page GRAPH_REPORT.md audit, and a persistent graph.json. From then on, queries read the graph instead of the raw tree.
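A consumer of the persisted graph might look like the sketch below. The graph.json schema shown (flat lists of "nodes" and "edges") is an assumed shape for illustration, not tracely360's documented format:

```python
import json

# Assumed schema: in reality you would json.loads the contents of
# tracely360-out/graph.json rather than an inline string.
sample = json.loads("""
{
  "nodes": [{"id": "DigestAuth", "kind": "class"},
            {"id": "Response", "kind": "class"}],
  "edges": [{"src": "DigestAuth", "rel": "calls",
             "dst": "Response", "tag": "EXTRACTED", "conf": 0.98}]
}
""")

def edges_from(graph, node_id):
    """All outgoing edges of a node: one list scan, no file reads."""
    return [e for e in graph["edges"] if e["src"] == node_id]

print(edges_from(sample, "DigestAuth")[0]["dst"])  # Response
```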

For Claude Code there is a deeper integration: a PreToolUse hook fires before every Glob and Grep call and reminds Claude to consult GRAPH_REPORT.md first, so the graph is always checked before any raw file search.
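In Claude Code's hooks schema, a PreToolUse registration scoped to Glob and Grep looks roughly like this. The overall matcher/command shape follows Claude Code's settings.json hooks format; the echo command is a stand-in, not tracely360's actual hook script:

```python
import json

# Stand-in hook configuration: the reminder text and command are
# illustrative, not what tracely360 actually installs.
hook_config = {
    "hooks": {
        "PreToolUse": [
            {
                "matcher": "Glob|Grep",  # fire only before these tools
                "hooks": [
                    {
                        "type": "command",
                        "command": "echo 'Consult tracely360-out/GRAPH_REPORT.md before searching raw files.'",
                    }
                ],
            }
        ]
    }
}

print(json.dumps(hook_config, indent=2))
```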

What the graph actually contains

God nodes

The concepts that everything routes through, identified by betweenness centrality rather than raw degree. These are your architectural keystone classes and functions.
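A simplified sketch of that detection step: count how often each node sits strictly inside a shortest path between two other nodes. Node names are illustrative, and this counts only one shortest path per pair, a simplification of true betweenness centrality, which a real pipeline would compute with networkx:

```python
from collections import deque
from itertools import permutations

# Tiny call graph: everything routes through Session.
edges = {
    "parse_config": ["Session"],
    "DigestAuth": ["Session"],
    "Session": ["Response", "ConnectionPool"],
    "Response": [],
    "ConnectionPool": [],
}

def shortest_path(src, dst):
    """BFS returning one shortest path, or None if unreachable."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in edges[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

score = {n: 0 for n in edges}
for a, b in permutations(edges, 2):
    path = shortest_path(a, b)
    if path:
        for mid in path[1:-1]:  # interior nodes only
            score[mid] += 1

god_node = max(score, key=score.get)
print(god_node)  # Session
```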

Surprising connections

Ranked cross-file or cross-modal edges, each with a plain-English explanation of why the connection is unexpected and worth investigating.

Rationale nodes

Docstrings, # NOTE: / # WHY: comments, and design discussion from docs, attached as rationale_for edges so design intent travels with the code.
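A naive version of that mining step can be sketched with a regex pass; real extraction would walk an AST, and the source snippet here is invented for illustration:

```python
import re

source = '''
def retry(request):
    # WHY: upstream flakes under load; one retry halves error rate
    return send(request) or send(request)
'''

edges = []
current_fn = None
for line in source.splitlines():
    fn = re.match(r"def (\w+)\(", line)
    if fn:
        current_fn = fn.group(1)  # remember the enclosing function
    note = re.search(r"#\s*(NOTE|WHY):\s*(.+)", line)
    if note and current_fn:
        # attach the comment text to the function as a rationale_for edge
        edges.append((note.group(2), "rationale_for", current_fn))

print(edges)
```

The payoff: when the assistant later touches `retry`, the "why" travels along the edge instead of being rediscovered by rereading the file.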

Hyperedges

Group relationships connecting 3+ nodes when a pairwise edge would be lossy — e.g. all classes implementing a shared protocol appear as a single hyperedge.
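One way to represent that, sketched with invented names: store the hyperedge as a single record with a member list, rather than expanding it into pairwise edges that lose the grouping:

```python
# A single hyperedge grouping every implementer of a protocol.
hyperedge = {
    "relation": "implements_protocol",
    "target": "AuthBase",
    "members": ["DigestAuth", "BasicAuth", "TokenAuth"],
}

# The lossy pairwise expansion: three separate edges, and the fact
# that these classes belong together is no longer first-class.
pairwise = [(m, "implements", hyperedge["target"])
            for m in hyperedge["members"]]
print(len(pairwise))  # 3
```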

Confidence model

Every edge carries an EXTRACTED, INFERRED, or AMBIGUOUS tag. EXTRACTED edges are directly proven by the source; INFERRED edges are deduced from context and carry a confidence score; AMBIGUOUS edges are flagged for human review.
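In practice that tag makes triage a one-line filter. The edge tuples below are illustrative, not real output:

```python
# (src, relation, dst, tag, confidence) -- invented example edges
edges = [
    ("DigestAuth", "calls", "Response", "EXTRACTED", 0.99),
    ("design.md", "rationale_for", "Session", "INFERRED", 0.74),
    ("utils", "semantically_similar_to", "helpers", "AMBIGUOUS", 0.41),
]

proven = [e for e in edges if e[3] == "EXTRACTED"]        # safe to trust
needs_review = [e for e in edges if e[3] == "AMBIGUOUS"]  # surface to a human
print(len(proven), len(needs_review))  # 1 1
```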

Token reduction in numbers

~1.7k tokens per graph query (Karpathy mixed corpus)
~123k tokens reading raw files (same corpus, no graph)
71.5× reduction factor (verified on the 52-file corpus)