How I Built a Second Brain with Karpathy's LLM Wiki: 153 Reports to a Living Knowledge Graph
When Andrej Karpathy published his LLM Wiki pattern as a GitHub Gist in early April 2026, it hit like a revelation. I had already been building a second brain with Obsidian and Claude Code. But something was missing--a systematic way to extract structured knowledge from raw sources. Karpathy's pattern was exactly that missing piece.
I took 153 research and sensing reports sitting idle in Obsidian, ran them through the LLM Wiki pipeline, and ended up with 146 source summaries, 48 entity pages, and 29 concept pages--all cross-linked into a living knowledge graph. Here is exactly how I did it, what worked, and what still needs fixing.

Karpathy's LLM Wiki GitHub Gist--the blueprint for compile-time knowledge processing. Source: llm-wiki
What Is Karpathy's LLM Wiki and Why It Matters
LLM Wiki is a knowledge management pattern where an LLM reads raw source documents, extracts entities and concepts, and writes structured wiki pages in markdown. Unlike RAG (Retrieval-Augmented Generation), which searches and assembles knowledge at query time, LLM Wiki processes everything at ingest time. The knowledge synthesis cost occurs once, not on every question.
Karpathy describes this as "compile time vs. query time." RAG assembles answers from scattered chunks every time you ask. LLM Wiki compiles knowledge into organized pages once, then serves answers from that structured base.
The numbers back this up. According to Atlan's analysis, LLM Wiki can reduce token usage by up to 95% for small-to-medium knowledge bases. Karpathy's own wiki--built from a single ML research domain--already holds roughly 100 documents and 400,000 words. None of it written by hand.
Why RAG Falls Short for Personal Knowledge
RAG has a fundamental problem for knowledge workers: it does not accumulate understanding. Upload 10 PDFs and ask a question. The LLM performs a vector search, finds relevant chunks, and assembles an answer. Ask the same question tomorrow and it starts from scratch.
This matters most when questions require cross-referencing multiple sources. If an answer depends on connecting insights from 5 different documents, RAG has to locate and stitch together scattered fragments every single time. There is no memory. No building on previous synthesis.
LLM Wiki flips this model. When a new source is ingested, the LLM reads it fully, identifies entities (companies, people, technologies), extracts concepts (patterns, frameworks, methodologies), and either creates new wiki pages or updates existing ones with the new information. Cross-references are generated automatically. The knowledge base grows richer with every source added.

LLM Wiki processes knowledge at ingest time; RAG assembles it at query time. Source: Atlan
The 3-Layer Architecture: Simpler Than You Think
LLM Wiki's structure is surprisingly minimal. Three layers handle everything:
| Layer | Role | Contents |
|---|---|---|
| Raw Sources | Immutable storage | PDFs, articles, meeting notes--original documents that never change |
| The Wiki | Knowledge pages | Markdown files written and maintained by the LLM |
| The Schema | Rules and structure | Configuration defining wiki organization and workflows |
Drop a source into the raw folder. The LLM reads it, summarizes it, extracts entities and concepts, and creates or updates wiki pages. Cross-references are linked automatically. In Karpathy's words, "the tedious part of maintaining a knowledge base is not reading or thinking--it's the organizing," and the LLM handles precisely that organizing work.
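To make the three layers concrete, here is a minimal Python sketch that scaffolds them as folders. The exact folder names (`raw`, `wiki/sources`, `wiki/entities`, `wiki/concepts`, `wiki/answers`, `schema.md`) are my own choices for this article, not prescribed by the gist:

```python
from pathlib import Path

def scaffold_wiki(root: str) -> Path:
    """Create the three LLM Wiki layers as plain folders."""
    base = Path(root)
    # Layer 1: immutable raw sources -- original documents, never modified.
    (base / "raw").mkdir(parents=True, exist_ok=True)
    # Layer 2: the wiki -- markdown pages written and maintained by the LLM.
    for sub in ("sources", "entities", "concepts", "answers"):
        (base / "wiki" / sub).mkdir(parents=True, exist_ok=True)
    # Layer 3: the schema -- rules defining organization and workflows.
    (base / "schema.md").touch()
    return base

scaffold_wiki("my-vault")
```

Everything else in the system is just an agent reading from `raw/` and writing into `wiki/` under the rules in `schema.md`.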
4 Commands That Power the Entire System
I built the LLM Wiki workflow using Claude Code's sub-agent and skill architecture. Four commands handle the complete lifecycle:
/ingest -- Transform Sources into Knowledge
This command analyzes documents in the raw folder, generates source summary pages, and creates or updates entity and concept pages. Indexes, overviews, and logs are refreshed automatically. One source in, multiple wiki pages updated.
The key insight: ingestion is not just summarization. The LLM identifies which entities (Google, Anthropic, specific researchers) and which concepts (agentic workflows, knowledge graphs, prompt engineering patterns) appear in the source. Each gets its own page that grows as more sources mention it.
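The create-or-update mechanic can be sketched in a few lines. In the real system the LLM decides what counts as an entity or concept; here a fixed vocabulary stands in for that extraction step so the sketch stays runnable (the `ingest` function and vocabularies are illustrative, not part of the gist):

```python
from pathlib import Path

# Toy stand-in for the LLM's extraction step: the real /ingest asks the
# model to identify entities and concepts in the source text.
KNOWN_ENTITIES = {"Anthropic", "Google", "Obsidian"}
KNOWN_CONCEPTS = {"knowledge graph", "agentic workflow"}

def ingest(source: Path, wiki: Path) -> list[Path]:
    """One source in, multiple wiki pages created or updated."""
    text = source.read_text()
    found = [(e, "entities") for e in KNOWN_ENTITIES if e in text]
    found += [(c, "concepts") for c in KNOWN_CONCEPTS if c.lower() in text.lower()]
    touched = []
    for name, kind in found:
        page = wiki / kind / (name.replace(" ", "-").lower() + ".md")
        page.parent.mkdir(parents=True, exist_ok=True)
        # Create the page on first mention; append a backlink on later ones.
        body = page.read_text() if page.exists() else f"# {name}\n\n"
        page.write_text(body + f"- Mentioned in [[{source.stem}]]\n")
        touched.append(page)
    return touched
```

The important property is in the last loop: each entity or concept page accumulates backlinks as more sources mention it, which is exactly how pages "grow" over time.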
/query -- Search and Synthesize Across the Wiki
This command searches the entire wiki, reads relevant pages, and produces a synthesized answer with source citations. It is not keyword matching. Because the knowledge is already structured and interconnected, the LLM reasons over organized information rather than raw fragments. Each answer includes a confidence assessment (high / medium / low).
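The retrieval half of this is simple to picture. A minimal sketch of "find the relevant pages" (the synthesis and confidence rating are the LLM's job and are omitted; `gather_context` and its scoring are my own illustration, not the actual command):

```python
from pathlib import Path

def gather_context(wiki: Path, question: str, top_k: int = 3) -> list[Path]:
    """Rank wiki pages by naive term overlap with the question.

    A real /query hands the top pages to the LLM, which reads them fully
    and synthesizes a cited answer with a confidence rating.
    """
    terms = {t.lower().strip("?.,") for t in question.split() if len(t) > 3}
    scored = []
    for page in wiki.rglob("*.md"):
        text = page.read_text().lower()
        score = sum(text.count(t) for t in terms)
        if score:
            scored.append((score, page))
    scored.sort(key=lambda s: -s[0])
    return [page for _, page in scored[:top_k]]
```

Because the pages are already synthesized and cross-linked, even this naive ranking lands the model on organized summaries rather than raw document chunks.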
/file-answer -- Turn Answers Back into Knowledge
This is the most compelling part of the system. The /query result gets saved as a wiki page in wiki/answers/. The answer to a question becomes a new source in the wiki. Knowledge circulates and expands through this feedback loop. This is what transforms a static repository into a growing knowledge system.
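The feedback loop is easy to see in code. A sketch of saving an answer as a first-class wiki page, with frontmatter and source links so later queries can find and cite it (`file_answer` and the frontmatter fields are my own illustration):

```python
from datetime import date
from pathlib import Path

def file_answer(wiki: Path, question: str, answer: str,
                sources: list[str], confidence: str = "medium") -> Path:
    """Persist a /query result under wiki/answers/ so it becomes a source itself."""
    # Slugify the question into a filename.
    clean = "".join(c for c in question.lower() if c.isalnum() or c == " ")
    page = wiki / "answers" / ("-".join(clean.split())[:60] + ".md")
    page.parent.mkdir(parents=True, exist_ok=True)
    links = "".join(f"- [[{s}]]\n" for s in sources)
    page.write_text(
        f"---\nquestion: {question}\nconfidence: {confidence}\n"
        f"date: {date.today().isoformat()}\n---\n\n"
        f"{answer}\n\n## Sources\n{links}"
    )
    return page
```

Once filed, the page sits in the same folder tree the query command searches, so the next question can build on this answer instead of re-deriving it.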
/lint -- Health Check for the Wiki
Detects orphan pages, broken links, frontmatter errors, and contradictions between sources. As the wiki scales, this maintenance becomes essential.
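Two of these checks, orphan pages and broken links, are purely mechanical and can be sketched without an LLM at all (contradiction detection does need the model). A minimal version, assuming Obsidian-style `[[wikilinks]]`:

```python
import re
from pathlib import Path

# Capture the target of [[target]], [[target|alias]], or [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki: Path) -> dict[str, list[str]]:
    """Flag broken [[wikilinks]] and orphan pages with no inbound links."""
    pages = {p.stem: p for p in wiki.rglob("*.md")}
    inbound: set[str] = set()
    broken: list[str] = []
    for stem, page in pages.items():
        for target in WIKILINK.findall(page.read_text()):
            target = target.strip()
            if target in pages:
                inbound.add(target)
            else:
                broken.append(f"{stem} -> {target}")
    return {"broken": sorted(broken),
            "orphans": sorted(set(pages) - inbound)}
```

Run periodically, this keeps the graph navigable; the harder contradiction checks still go through the LLM.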
From 153 Reports to a Knowledge Graph: Real Results
Here is what happened when I fed 153 research and sensing reports--previously sitting dormant in Obsidian--into the LLM Wiki pipeline using /ingest:
- 146 sources successfully ingested (95.4% success rate)
- 48 entities automatically extracted (companies, people, technologies)
- 29 concepts automatically generated (patterns, frameworks, methodologies)
The numbers alone are impressive. But the real value is in the connections. Entity pages link to related concepts. Concept pages reference their original sources. Navigate from an entity like "Anthropic" to a concept like "agentic engineering" to the original report that first introduced the connection. Obsidian's graph view makes this entire structure visible at a glance.
The difference from manual tagging is stark. Previously, I had to tag and link notes by hand. LLM Wiki understands the content and builds the classification system automatically. The organizing burden nearly disappears. Manually categorizing 48 entities and 29 concepts would have taken days. The LLM completed it in a fraction of the time.
Obsidian + Claudian: No Terminal Required
Here is a practical tip that dramatically improves the workflow. Claudian is an Obsidian plugin that embeds Claude Code directly in the Obsidian sidebar. With Claudian installed, you can run /query without ever opening a terminal.

Claudian embeds Claude Code directly in Obsidian's sidebar. Source: GitHub - YishenTu/claudian
The workflow becomes seamless: browse the knowledge graph in Obsidian's graph view, spot something interesting, ask /query in the sidebar, and if the answer is valuable, save it with /file-answer. Explore, question, and capture knowledge--all without leaving one tool.
Claudian works because it has direct access to vault files. Reading, writing, searching, and bash commands all function natively. If Claude Code CLI is installed, Claudian works out of the box with no additional configuration.
Building a Web Interface
I also built the LLM Wiki as a standalone website. While Obsidian with Claudian excels for editing and daily operation, a web interface provides better readability and shareability. The wiki's interlinked page structure feels more natural in a browser. Running both in parallel--Obsidian for authoring, web for reading and sharing--is currently the most effective combination.
3 Challenges That Still Need Solving
Honest assessment: the system is not perfect yet. Three issues stand out.
1. Hallucination Management
The LLM occasionally adds information that does not exist in the original source when writing wiki pages. The /lint command catches contradictions between sources, but it cannot fully filter content that goes beyond the source material. Atlan's analysis flags the same concern: unlike RAG's transient hallucinations, incorrect information written into a wiki page can propagate through subsequent generations. Solving this is the highest-priority challenge.
2. Taxonomy Scaling
The current system uses two classification axes: entities and concepts. With 48 entities, this works fine. At 200+ entities, finer grouping will be necessary. Designing a more wiki-native category hierarchy is an open problem.
3. Automated Insight Extraction
The system currently structures knowledge and answers questions. The next frontier is automatically surfacing new insights from connected knowledge--identifying common patterns across different entities or hidden relationships between concepts without being explicitly asked.
How to Get Started Today
If you have notes accumulating in Obsidian but rarely use them, LLM Wiki is worth trying. You do not need to start with everything. Pick one topic you care about. Gather 5-10 source documents.
Here is one action you can take right now: open Karpathy's llm-wiki gist and tell Claude Code: "Use this prompt as a reference to set up an LLM Wiki in my Obsidian vault." The AI handles the rest.
The gap between collecting information and actually using it has always been the organizing work. LLM Wiki removes that gap entirely.
Frequently Asked Questions
What is the difference between LLM Wiki and RAG?
LLM Wiki processes and structures knowledge at ingest time, creating persistent wiki pages. RAG searches and assembles answers from raw documents at query time. LLM Wiki provides accumulated, interconnected knowledge; RAG provides on-demand retrieval without memory between queries.
Do I need coding skills to set up an LLM Wiki?
No advanced coding is required. With Claude Code installed, you can set up the entire system by referencing Karpathy's GitHub Gist and asking the AI to configure the wiki structure, commands, and schema in your Obsidian vault.
How does LLM Wiki handle hallucinations?
The /lint command detects contradictions between sources and structural errors. However, content that goes beyond source material is harder to catch automatically. This remains an active challenge. Best practice is to periodically review high-traffic wiki pages against their original sources.
Can LLM Wiki replace my existing note-taking system?
LLM Wiki complements rather than replaces tools like Obsidian. It adds an automated knowledge structuring layer on top of your existing vault. Your original notes stay unchanged in the raw sources folder while the LLM builds and maintains the wiki layer.
How many sources can LLM Wiki handle effectively?
Karpathy's own wiki runs about 100 documents at 400,000 words. The system tested in this article processed 153 reports successfully. For small-to-medium knowledge bases, Atlan reports up to 95% token savings compared to RAG. Very large corpora may benefit from a hybrid approach.