Quick Start
Get tldw running locally in under a minute. Choose pip for development or Docker Compose for a batteries-included setup.
With pip (development):
pip install tldw_server
python -m uvicorn tldw_Server_API.app.main:app --reload

With Docker Compose (batteries included):
git clone https://github.com/rmusser01/tldw_server.git
cd tldw_server
docker compose up
API docs at localhost:8000/docs · Quickstart wizard at localhost:8000/api/v1/config/quickstart
What is tldw?
tldw (Too Long; Didn't Watch) is an open-source research assistant and media analysis platform. It helps you ingest, transcribe, analyze, and interact with video, audio, PDFs, ebooks, and web pages through a powerful REST API and web interface.
The project draws inspiration from The Young Lady's Illustrated Primer in Neal Stephenson's The Diamond Age — a personal knowledge assistant that helps you learn and research at your own pace. While we can't replicate the Primer, tldw is a practical step toward that vision.
Everything runs on your infrastructure. No telemetry, no data collection, no cloud lock-in. You own your data and control every aspect of the system.
- 25 LLM Providers — OpenAI, Anthropic, Google Gemini, Cohere, DeepSeek, Groq, Mistral, OpenRouter, HuggingFace, Qwen, plus Ollama, llama.cpp, vLLM, Kobold.cpp, MLX, and more
- 15+ Media Formats — Video, audio, PDF, EPUB, DOCX, HTML, XML, email, and MediaWiki dumps. yt-dlp for 1,000+ sites. OCR and visual document processing
- 7 STT Engines — faster_whisper, NeMo Parakeet/Canary, Qwen3-ASR, VibeVoice-ASR with real-time WebSocket streaming and diarization
- 18 TTS Engines — OpenAI, ElevenLabs, Kokoro, Dia, Chatterbox, VibeVoice, NeuTTS, and more. Voice cloning, emotion control, and streaming synthesis
- Hybrid RAG Pipeline — BM25 + vector fusion, semantic caching, HyDE, query expansion, cross-encoder reranking, faithfulness scoring, regression detection, and batch checkpointing
- 6 Evaluation Types — G-Eval, RAG eval, proposition checking, response quality, OCR quality, and embeddings A/B testing with batch processing and webhooks
- Character Chat — SillyTavern V2 cards with PNG export, world books, 20+ prompt formats, chatbooks, prompt library, and notebook notes
- 11 MCP Modules — Knowledge, media, notes, chats, characters, prompts, flashcards, quizzes, slides, and kanban with JWT/RBAC and Prometheus metrics
- Multi-Provider Search — Google, Brave, DuckDuckGo, SearXNG, and arXiv with LLM-powered result aggregation and full content scraping
Features
A modular platform that covers the full research workflow — from ingestion to insight.
Ingest video, audio, PDF, EPUB, DOCX, HTML, Markdown, and XML. Built-in yt-dlp integration downloads from 1,000+ sites with automatic metadata extraction.
Transcribe with faster_whisper, NVIDIA NeMo (Parakeet, Canary), or Qwen2Audio. Real-time streaming over WebSocket. Text-to-speech via OpenAI-compatible API and local Kokoro ONNX.
Hybrid retrieval combining SQLite FTS5 full-text search with ChromaDB vector embeddings, BM25 scoring, and cross-encoder re-ranking for high-accuracy results.
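To illustrate how lexical and vector rankings can be fused, here is a minimal reciprocal rank fusion (RRF) sketch, a common fusion technique for hybrid retrieval. This is an illustrative sketch, not tldw's actual scoring code; the document IDs and ranked lists are made up.

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal rank fusion: each list contributes 1/(k + rank) per document.

    ranked_lists: iterable of doc-ID lists, best-first (e.g. one from
    BM25/FTS5 full-text search, one from the vector index).
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: a BM25 ranking and a vector-similarity ranking
bm25 = ["doc3", "doc1", "doc7"]
vector = ["doc1", "doc9", "doc3"]
fused = rrf_fuse([bm25, vector])
# → ["doc1", "doc3", "doc9", "doc7"]
```

Documents that rank well in both lists (doc1, doc3) float to the top, which is why fusion tends to beat either signal alone. A re-ranking stage (e.g. a cross-encoder) can then reorder the fused top-k.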
Drop-in replacements for Chat Completions, Audio Transcription, Embeddings, and TTS endpoints. Use your existing client libraries without changes.
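Because the endpoints mirror OpenAI's request and response shapes, an existing client only needs its base URL pointed at the local server. A minimal stdlib sketch, assuming the server runs on localhost:8000 and exposes an OpenAI-style /v1/chat/completions route; the path prefix and model name here are assumptions, not confirmed values:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed OpenAI-compatible prefix

# Standard Chat Completions payload; "model" is whatever provider/model
# the server is configured to route to.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize my last transcript."}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <your-key>",
    },
    method="POST",
)
# With the server running: urllib.request.urlopen(req) returns a standard
# Chat Completions JSON body. Existing OpenAI client libraries work the
# same way by overriding their base_url to BASE_URL.
```

The same pattern applies to the audio transcription, embeddings, and TTS endpoints: keep your client library, change only the base URL and key.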
Self-hosted with zero telemetry. Run models locally with llama.cpp, Ollama, or vLLM. All data stays in your SQLite or PostgreSQL databases. Bring your own keys.
SillyTavern-compatible character cards with V2 PNG export. Prompt library with import/export. Notebook-style note-taking. Chat history search and management.
FAQ
Is this production-ready?
tldw is in active development (v0.1.x). The API surface is stabilizing but may still change. It's suitable for personal research use, home labs, and development environments. We welcome early adopters and contributors who are comfortable with a fast-moving project.
How is my data handled?
All data stays local in SQLite databases (or your own PostgreSQL). There is no telemetry, no analytics, no phone-home behavior. You control ingestion, storage, and deletion. API keys are stored in local config files or environment variables — never transmitted to us.
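For example, provider keys can be supplied as environment variables before starting the server. The variable names below are the conventional ones for each provider; check your tldw config for the exact names it reads.

```shell
# Bring-your-own-keys: exported locally, read at startup,
# never sent anywhere except the provider you call.
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
python -m uvicorn tldw_Server_API.app.main:app
```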
Which models are supported?
25 providers including OpenAI, Anthropic, Google, Cohere, DeepSeek, Groq, Mistral, OpenRouter, HuggingFace, and Qwen. Local options include llama.cpp, Kobold.cpp, Ollama, vLLM, TabbyAPI, and any OpenAI-compatible endpoint.
How can I contribute?
Check out the good first issues on GitHub, or join the Discord to chat with the community. The project follows the GPL-3.0 license and welcomes contributions of all kinds.
Community
Join the conversation, get help, share ideas, and help shape the project.