Vault Coach
🌐 Language: English | 中文
An Obsidian plugin for intelligent local knowledge base Q&A, powered by Advanced RAG (Retrieval-Augmented Generation) and a locally-running Ollama service. Your data never leaves your machine.

Table of Contents
Features
✅ Phase 1 (Current — v0.0.2)
| Feature |
Description |
| 🔍 Hybrid Retrieval |
TF-IDF keyword search + semantic vector search + RRF fusion |
| ✏️ Query Rewrite |
Local LLM auto-rewrites queries to improve retrieval quality |
| 📊 Rerank |
Optional external rerank service, falls back to heuristic rerank |
| 🧠 Long-term Memory |
Cross-session extraction and injection of user preferences |
| 🔄 Incremental Index Sync |
Watches vault file changes and updates the index automatically |
| 🌊 Streaming Output |
Pseudo-streaming response rendering with low perceived latency |
| 💾 Conversation Persistence |
Chat history saved locally and restored after restart |
| 🔒 100% Local |
Powered by Ollama REST API — no data leaves your device |
Quick Start
Prerequisites
- Obsidian v1.5.0+
- Ollama running locally
- At least one chat model (e.g.
gemma3:4b) and one embedding model pulled
Installation (Development Build)
# 1. Clone into your vault's plugin directory
cd <your-vault>/.obsidian/plugins
git clone https://github.com/Wanjin5508/vault-coach vault-coach
# 2. Install dependencies and build
cd vault-coach
npm install
npm run build
# 3. Enable Vault Coach in Obsidian Settings > Community Plugins
Basic Setup
- Open Settings → Vault Coach
- Set your Ollama base URL (default:
http://127.0.0.1:11434)
- Enter your chat model name (e.g.
gemma3:4b) and embedding model name
- Click Rebuild Index or wait for auto-sync to complete
- Click the 💬 ribbon icon to start chatting
Configuration
General
| Setting |
Description |
| Assistant Name |
Name shown in the sidebar header |
| Default Greeting |
First message shown after conversation reset (Markdown supported) |
| Open on Startup |
Auto-open Vault Coach when Obsidian loads |
| Default Retrieval Mode |
keyword / vector / hybrid (hybrid recommended) |
| Collapse Sources |
Whether to collapse the sources section by default |
Knowledge Base
| Setting |
Default |
Description |
| Scope |
Whole Vault |
Or limit to a specific folder |
| Chunk Size |
600 chars |
Maximum characters per chunk |
| Chunk Overlap |
120 chars |
Overlap between adjacent chunks |
| Auto Sync |
✅ |
Watch file changes and update index automatically |
| Debounce Delay |
15,000 ms |
Wait time after last file change before syncing |
| Max Wait Time |
120,000 ms |
Force sync after this duration regardless |
| File Threshold |
8 files |
Trigger immediate sync when this many files change |
Advanced RAG
| Setting |
Default |
Description |
| Query Rewrite |
✅ |
LLM rewrites the query before retrieval |
| Vector Retrieval |
✅ |
Generate embeddings during index build |
| Rerank |
✅ |
Rerank retrieved candidates |
| Keyword top-k |
10 |
Keyword retrieval candidate count |
| Vector top-k |
10 |
Vector retrieval candidate count |
| Hybrid limit |
12 |
Max candidates after fusion |
| Rerank top-k |
8 |
Candidates entering rerank stage |
| Context chunks |
8 |
Final chunks injected into the prompt |
| Source limit |
5 |
Max sources shown per answer |
| Temperature |
0.2 |
Generation temperature (keep low for RAG) |
Local Model
| Setting |
Description |
| LLM Base URL |
Ollama address, default http://127.0.0.1:11434 |
| Chat Model |
Used for query rewrite and answer generation |
| Embedding Model |
Used for vector retrieval; rebuild index after changing |
| Rerank Service URL |
Optional; leave empty to use heuristic rerank |
| Rerank Model |
Used when a rerank service URL is configured |
Long-term Memory
| Setting |
Default |
Description |
| Enable Memory |
✅ |
Extract and store useful facts after each turn |
| Memory Top-k |
4 |
Max memories injected per answer |
| Max Memory Items |
150 |
Oldest/least-accessed items are evicted when exceeded |
| Max Persisted Messages |
60 |
Max conversation messages kept in local storage |
Roadmap
See PROJECT_PLAN.md for the full three-phase development plan.
Phase Overview
| Phase |
Name |
Status |
Key Goal |
| Phase 1 |
Advanced RAG Q&A |
✅ Complete |
Hybrid retrieval + Rerank + Long-term memory + Incremental index |
| Phase 2 |
Knowledge Graph Enhancement |
🔜 Planned |
Entity extraction + Graph construction + Smart query routing |
| Phase 3 |
Agentic Interview Assistant |
🔜 Planned |
Multi-role agents + Interview simulation + Skill diagnosis |
| Phase 4 |
Standalone Application |
🔜 Vision |
Independent frontend/backend + Voice + Avatar |
Architecture
┌─────────────────────────────────────────────────┐
│ view.ts (UI) │
│ Obsidian ItemView · Right Sidebar │
└────────────────────┬────────────────────────────┘
│
┌────────────────────▼────────────────────────────┐
│ main.ts │
│ Plugin Entry · State Management · Vault Events │
└──────┬─────────────┬──────────────┬─────────────┘
│ │ │
┌──────▼──────┐ ┌────▼─────┐ ┌────▼──────────────┐
│ rag-engine │ │knowledge │ │ persistent-store │
│ Advanced │ │ -base │ │ runtime-state.json │
│ RAG Flow │ │Index/Ret.│ │ index-snapshot.json│
└──────┬──────┘ └──────────┘ └───────────────────┘
│
┌──────▼──────────────────────────────────────────┐
│ model-client.ts │
│ Ollama REST API · /api/chat · /api/embed │
└─────────────────────────────────────────────────┘
For detailed technical documentation, see TECHNICAL_DOC.docx.
Known Issues
- Non-true streaming:
requestUrl returns the full response at once; true token-level streaming will require a fetch + ReadableStream migration
- Memory search lacks semantic similarity: Currently uses keyword matching only; embedding-based memory search is planned for Phase 2
- VIEW_TYPE typo: The constant value in
constants.ts says value-coach-view instead of vault-coach-view (non-breaking, will be fixed in next release)
Contributing
Issues and PRs are welcome! Please check PROJECT_PLAN.md for the current development direction before opening a PR.
License
MIT License