# Karpathy LLM Wiki Plugin for Obsidian
AI-powered structured knowledge base that ingests your notes and generates a connected Wiki — based on Andrej Karpathy's LLM Wiki concept.
Author: Greener-Dalii | Version: 1.7.9
English | 中文文档 | Discussions
## What is LLM-Wiki?
You write. AI organizes. You ask. That's it.
The problem. Your notes are a goldmine — people, concepts, ideas, connections. But right now they're just files in folders. Finding what relates to what means searching, tagging, and hoping you remember the thread.
The fix. Andrej Karpathy suggested something elegant: treat your notes as raw material, and let an LLM do the architect work. It reads what you write, pulls out entities and concepts, and weaves them into a structured Wiki — complete with [[bidirectional links]], an auto-generated index, and a chat interface that answers questions from your knowledge.
So you don't have to be the librarian. No deciding what deserves a page. No maintaining cross-links. No wondering if something is out of date. Drop notes into sources/ and the LLM reads, extracts, writes, links, and even flags contradictions — while you stay in flow.
And it's not another chatbot. ChatGPT knows the internet. LLM-Wiki knows you — or rather, what you've taught it. Every answer carries [[wiki-links]] back into your knowledge graph. Every response is a trailhead, not a dead end.
## Why Obsidian + LLM-Wiki?
Obsidian is brilliant at linked thinking. But there's a catch: you're the one doing all the linking.
LLM-Wiki flips that. Instead of you building the graph by hand, the AI grows it with you. Add a note about a new concept — it finds the connections you'd miss. Ask a question — it walks your own knowledge graph and brings back answers with citations.
- Your Graph View comes alive. New notes don't just sit there — they sprout links to entities, concepts, and sources. The graph grows organically, not manually.
- Your notes learn to talk back. Search becomes conversation. "What did I write about X?" becomes a dialogue, with streaming responses and [[wiki-links]] as breadcrumbs.
- Obsidian becomes a thinking partner. It stops being a cabinet for notes and starts being something that helps you think — surfacing connections, spotting gaps, remembering what you forgot you knew.
## Features
### Ingestion Improvements
- Smart Batch Skip — Automatically detect and skip already-ingested files during folder batch operations, saving time and API costs. Shows skipped count in reports
- Parallel Page Generation — Configurable 1-5 concurrent pages for sources with 50+ entities, 3x faster with error isolation
- Iterative Batch Extraction — Adaptive batch sizing eliminates max_tokens bottleneck for long documents, extracting all entities/concepts systematically
- Verbatim Source Mentions — Original language quotes preserved with optional translation, ensuring traceability
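The Iterative Batch Extraction above is essentially an adaptive batching loop. Here is a minimal sketch of the idea, with illustrative names and thresholds rather than the plugin's actual code:

```typescript
// Illustrative sketch: extract entities from a long document in batches,
// shrinking the batch size whenever the model's output hits the token ceiling.
type ExtractFn = (chunk: string) => Promise<{ entities: string[]; truncated: boolean }>;

async function extractAllEntities(
  paragraphs: string[],
  extract: ExtractFn,
  initialBatchSize = 20,
): Promise<string[]> {
  const found = new Set<string>();
  let batchSize = initialBatchSize;
  let i = 0;

  while (i < paragraphs.length) {
    const chunk = paragraphs.slice(i, i + batchSize).join("\n\n");
    const { entities, truncated } = await extract(chunk);
    entities.forEach((e) => found.add(e));

    if (truncated && batchSize > 1) {
      batchSize = Math.max(1, Math.floor(batchSize / 2)); // retry this span in smaller pieces
      continue;
    }
    i += batchSize;
  }
  return [...found];
}
```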
### Knowledge Quality
- Save-to-Wiki Quality — Conversation saves now match file ingestion quality: proper summary pages, frontmatter fields, entity/concept reports
- Enhanced Entity/Concept Relations — Separate "Related Entities" and "Related Concepts" sections for bidirectional tracking
- Smart Knowledge Fusion — Intelligent merge on page updates: detect duplicates, preserve contradictions with attribution, maintain links
- Content Truncation Protection — 8000 max_tokens + automatic retry at 2x tokens across all providers
- Contradiction State Machine — `detected → review_ok → resolved` (AI fix) or `detected → pending_fix` (manual)
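A minimal sketch of how those states and transitions could be modeled; the type and function names are illustrative, not the plugin's actual API:

```typescript
// Illustrative model of the contradiction lifecycle listed above.
type ContradictionState = "detected" | "review_ok" | "resolved" | "pending_fix";

const allowedTransitions: Record<ContradictionState, ContradictionState[]> = {
  detected: ["review_ok", "pending_fix"], // reviewer accepts the AI fix, or defers to manual
  review_ok: ["resolved"],                // AI applies the fix
  resolved: [],                           // terminal
  pending_fix: [],                        // assumption: closed by a manual edit, outside this machine
};

function advance(current: ContradictionState, next: ContradictionState): ContradictionState {
  if (!allowedTransitions[current].includes(next)) {
    throw new Error(`Invalid contradiction transition: ${current} -> ${next}`);
  }
  return next;
}
```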
### Query & Feedback
- Conversational Query — ChatGPT-style dialog with streaming Markdown and [[wiki-links]], multi-turn history
- Query-to-Wiki Feedback — 3-stage value assessment on close, semantic deduplication before save
- Duplicate Save Prevention — Hash tracking stops re-evaluation of unchanged conversations
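A minimal sketch of the hash tracking behind the last item, using a hypothetical helper rather than the plugin's actual code:

```typescript
import { createHash } from "crypto";

// Illustrative: skip re-evaluating a conversation whose content hash has already been seen.
type ChatMessage = { role: "user" | "assistant"; content: string };

const evaluatedHashes = new Set<string>();

function conversationHash(messages: ChatMessage[]): string {
  const serialized = messages.map((m) => `${m.role}:${m.content}`).join("\n");
  return createHash("sha256").update(serialized).digest("hex");
}

function shouldEvaluate(messages: ChatMessage[]): boolean {
  const hash = conversationHash(messages);
  if (evaluatedHashes.has(hash)) return false; // unchanged since the last evaluation
  evaluatedHashes.add(hash);
  return true;
}
```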
### LLM & Language
- Multi-Provider Support — Anthropic, Anthropic Compatible (Coding Plan), Gemini, OpenAI, DeepSeek, Kimi, GLM, OpenRouter, Ollama, custom endpoints
- Dynamic Model List — Real-time fetching from provider APIs
- Wiki Output Language — 8 languages independent of UI (EN/ZH/JA/KO/DE/FR/ES/PT), with custom input
- Internationalization — English and Chinese UI (default: English), all notices respect language setting
### Maintenance & Architecture
- Schema Layer — `wiki/schema/config.md` with templates, merge policies, content guidelines injected into all prompts (see the sketch after this list)
- Lint AI Auto-Fix — Per-item buttons for dead links, empty pages, orphans in LintReportModal
- Auto Maintenance — Multi-folder file watcher, periodic lint, startup health check (all optional)
- Knowledge Graph — Entity/concept relationships visualized in Obsidian's Graph View
- Auto Index — `index.md` and `log.md` maintained automatically
- Modular Codebase — 9 focused modules for maintainability
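A minimal sketch of the Schema Layer idea: read `wiki/schema/config.md` and prepend it to the prompt. The Obsidian Vault calls are real API; the function itself is an illustration, not the plugin's actual code.

```typescript
import { App, TFile } from "obsidian";

// Illustrative: inject schema templates, merge policies, and content guidelines into a prompt.
async function withSchemaContext(app: App, taskPrompt: string): Promise<string> {
  const file = app.vault.getAbstractFileByPath("wiki/schema/config.md");
  if (!(file instanceof TFile)) return taskPrompt; // no schema yet: send the prompt as-is

  const schema = await app.vault.read(file);
  return ["Follow these Wiki schema rules:", schema, "---", taskPrompt].join("\n\n");
}
```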
## Quick Start
### Installation
Manual (recommended):
- Download `main.js`, `manifest.json`, `styles.css` from Releases
- In Obsidian, go to Settings → Community plugins. On the Installed plugins tab, click the folder icon to open your plugins directory
- Create a folder named `karpathywiki`, drop the three files inside
- Back in Obsidian, click the refresh icon — Karpathy LLM Wiki will appear under Installed plugins
- Toggle it on to enable
Development: `git clone`, `pnpm install`, `pnpm build`.
### Configure an LLM Provider
- Open Settings → Karpathy LLM Wiki
- Pick a provider from the dropdown (Anthropic, Anthropic Compatible, Google Gemini, OpenAI, DeepSeek, Kimi, GLM, Ollama, OpenRouter, or custom)
- Enter your API key (not needed for Ollama)
- Click Fetch Models to populate the model dropdown, or type a model name manually
- Click Test Connection, then Save Settings
Ollama (local, no API key): Install Ollama, pull a model (`ollama pull gemma4`), select "Ollama (Local)" in the provider dropdown.
See README_CN.md for provider-specific instructions in Chinese.
### Model Selection Guide
This plugin follows Karpathy's philosophy: feed the LLM full Wiki context, not chunked RAG retrieval. Long-context models are strongly recommended — the larger your Wiki grows, the more context the LLM needs to maintain cross-page consistency and answer questions accurately.
Why not RAG/embeddings? Karpathy's original critique argues that RAG fragments knowledge and breaks the LLM's ability to reason across the full knowledge graph. A single long-context LLM call over the relevant Wiki pages preserves relational understanding.
Top recommendations:
| Model | Context Window | Why |
|---|---|---|
| DeepSeek V4 | 1M tokens | Best value — ultra-low pricing, strong Chinese support. Ideal for large Wikis. |
| Gemini 3.1 Pro | 1M+ tokens | Largest context window. Strong reasoning. Excellent for very large Wikis. |
| Claude Opus 4.7 | 1M tokens | Strongest agentic coding and reasoning. Best for complex multi-page synthesis. |
| GPT-5.5 | 1M tokens | Latest OpenAI flagship. Top AI intelligence index. Excellent for knowledge work. |
| Claude Sonnet 4.6 | 1M tokens | Great balance of speed, cost, and quality for mid-size Wikis |
For local models (Ollama): context windows are typically smaller (8K–128K). Consider limiting Wiki scope or using a cloud provider for ingestion + local model for query.
Anthropic Compatible (Coding Plan): If your provider offers an Anthropic-compatible API endpoint (common with Coding Plan subscriptions), select "Anthropic Compatible" and enter your provider's Base URL and API Key. Uses the same Claude models via the Anthropic SDK format.
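For reference, a minimal sketch of that setup in the Anthropic SDK format, using the Anthropic TypeScript SDK with a custom base URL; the endpoint, key variable, and model name below are placeholders, not defaults of this plugin:

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Illustrative: an Anthropic-compatible endpoint only needs a Base URL and an API Key.
const client = new Anthropic({
  apiKey: process.env.PROVIDER_API_KEY ?? "",
  baseURL: "https://your-provider.example.com/anthropic", // placeholder Base URL
});

const response = await client.messages.create({
  model: "claude-sonnet-4-6", // placeholder: whichever Claude model your plan exposes
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize my note on supervised learning." }],
});
console.log(response.content);
```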
## Usage
| Method | How |
|---|---|
| Ingest from sources/ | Cmd+P → "Ingest Sources" — processes the entire sources/ folder |
| Ingest any folder | Cmd+P → "Ingest from Folder" — pick a folder, generate Wiki from existing notes |
| Query Wiki | Cmd+P → "Query Wiki" — ask questions, get streaming answers with [[wiki-links]] |
Re-ingesting the same source does incremental updates on entity/concept pages (new info merged in). Summary pages are regenerated.
Smart Batch Skip: When ingesting a folder, the plugin automatically detects already-processed files and skips them to save time and API costs. If wiki/sources/${slug}.md exists, the source is considered ingested. The batch report shows "Skipped (already ingested): X/Y" count.
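A minimal sketch of that skip check; illustrative only, the real plugin may track ingestion state differently:

```typescript
import { App } from "obsidian";

// Illustrative: a source counts as already ingested if its summary page exists.
function isAlreadyIngested(app: App, slug: string): boolean {
  return app.vault.getAbstractFileByPath(`wiki/sources/${slug}.md`) !== null;
}
```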
Ingestion Acceleration: For sources with many entities (20+), enable parallel page generation in Settings → Ingestion Acceleration:
- Page Generation Concurrency: 1 (serial, safest) to 5 (parallel, fastest). Start with 3 for most providers. Increase batch delay if you hit rate limits.
- Batch Delay: 100-2000ms between parallel batches. Increase to 500ms+ for OpenAI (60 RPM limit) or if you see 429 errors.
Safety note: Parallel generation uses `Promise.allSettled` — if one page fails, others continue. Failed pages are retried individually with exponential backoff. Links to not-yet-created pages are valid Obsidian syntax ([[entity-name]]) and resolve automatically once the target page exists.
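A minimal sketch of that pattern: bounded concurrency with `Promise.allSettled`, a configurable batch delay, and per-page retry with exponential backoff. The names and defaults here are illustrative, not the plugin's actual code.

```typescript
// Illustrative: generate pages in batches of `concurrency`, isolating failures per page.
async function generatePages(
  entities: string[],
  generate: (entity: string) => Promise<void>,
  concurrency = 3,
  batchDelayMs = 100,
): Promise<void> {
  const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));
  const failed: string[] = [];

  for (let i = 0; i < entities.length; i += concurrency) {
    const batch = entities.slice(i, i + concurrency);
    const results = await Promise.allSettled(batch.map((e) => generate(e)));
    results.forEach((r, idx) => {
      if (r.status === "rejected") failed.push(batch[idx]);
    });
    await sleep(batchDelayMs); // back off between batches to respect provider rate limits
  }

  // Retry failed pages one at a time with exponential backoff.
  for (const entity of failed) {
    for (let attempt = 0, delay = 1000; attempt < 3; attempt++, delay *= 2) {
      try {
        await generate(entity);
        break;
      } catch {
        await sleep(delay);
      }
    }
  }
}
```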
Source Mentions Preservation: The "Mentions in Source" section preserves verbatim quotes from your original source. If your Wiki output language differs from the source language, translations appear in parentheses after the original text. This ensures traceability while maintaining readability.
```markdown
---
type: entity
created: 2026-04-29
reviewed: true
---
# Supervised Learning

Your carefully curated content here...
```
## Commands
| Command | Description |
|---|---|
| Ingest single source | Select a note → generate Wiki pages |
| Ingest from folder | Select any folder → batch generate Wiki from existing notes |
| Query wiki | Conversational Q&A over your Wiki, with streaming |
| Lint wiki | Detect contradictions, stale info, orphaned pages |
| Regenerate index | Manually rebuild wiki/index.md |
| Suggest schema updates | LLM analyzes Wiki and proposes schema improvements |
## Example

Input: `sources/machine-learning.md`

```markdown
# Machine Learning

Machine learning uses algorithms to learn from data.

## Types
- Supervised learning
- Unsupervised learning
- Reinforcement learning
```
Output — Summary: `wiki/sources/machine-learning.md`

```markdown
# Machine Learning

Core concepts and algorithms for learning from data.

## Key Concepts
- [[Supervised Learning]] — Learning from labeled data
- [[Unsupervised Learning]] — Discover patterns in unlabeled data
- [[Reinforcement Learning]] — Learn through interaction
```
Output — Entity: `wiki/entities/supervised-learning.md`

```markdown
---
type: entity
created: 2026-05-08
sources: [[sources/machine-learning]]
---
# Supervised Learning

## Definition
Supervised learning learns predictive models from labeled data.

## Key Features
- Requires labeled dataset
- Common algorithms: linear regression, decision trees, neural networks

## Related Concepts
- [[Machine Learning]] — The broader field
- [[Unsupervised Learning]] — Learning without labels

## Related Entities
- [[Arthur Samuel]] — Pioneer of machine learning

## Mentions in Source
- "Machine learning uses algorithms to learn from data."
```
## Architecture

Karpathy's three-layer separation design:

```
sources/   # Your source documents (read-only)
   ↓ ingest
wiki/      # LLM-generated Wiki pages
   ↓ query / maintain
schema/    # Wiki structure configuration (naming, templates, categories)
```
Modular codebase (src/): wiki/ (engine, query, ingest, lint, page-factory), schema/ (schema-manager, auto-maintain), ui/ (settings, modals), plus shared llm-client.ts, prompts.ts, texts.ts (i18n).
Generated pages:
- `wiki/sources/filename.md` — Source summary
- `wiki/entities/entity-name.md` — Entity pages (people, orgs, projects, etc.)
- `wiki/concepts/concept-name.md` — Concept pages (theories, methods, terms, etc.)
- `wiki/index.md` — Auto-generated index
- `wiki/log.md` — Operation log
## Troubleshooting
"Please configure API Key first" — Go to Settings → LLM Wiki, enter your API key, click Test Connection then Save.
Wiki pages show as code blocks — Fixed in v1.0.7+. Rebuild affected pages.
Chinese filenames become untitled-xxx — Fixed in v1.0.3+. Full Unicode supported.
JSON parsing / "Source analysis failed" — Fixed in v1.0.8+ with LLM-based repair fallback. Open Developer Tools (Ctrl+Shift+I) for detailed logs.
## Contributing
- Fork the repo
- Create a feature branch: `git checkout -b feature/your-feature`
- Commit: `git commit -m 'feat: add your feature'`
- Push and open a Pull Request
Use TypeScript, follow existing code style, and update version numbers in `manifest.json`, `package.json`, and `versions.json`.
Questions, ideas, or want to share how you use LLM-Wiki? Join the Discussions.
## License
MIT License — see LICENSE.
## Acknowledgments
- Concept: Andrej Karpathy's LLM Wiki — the original vision that inspired this plugin
- Platform: Obsidian Plugin API
- LLM SDKs: Anthropic SDK, OpenAI SDK