I've been frustrated with AI coding tools that load 15K-28K tokens of system prompt before you can even ask a question. The model spends most of its attention reading the manual instead of working on your code.
So I built Huiyu Pi — a self-hosted AI coding agent that starts at ~80 tokens.

What it does:
Browser-based IDE (no heavy Electron app)
~80-token system prompt (not 20K)
~0.3s to first token
90%+ cheaper per request, since the prompt overhead is so much smaller
100% local: your code, API keys, and conversations stay on your machine, apart from the requests you send to your chosen LLM provider
Multi-LLM: Claude, GPT, DeepSeek, Gemini, Mistral, Groq, xAI, OpenRouter
Built-in terminal, file editor, Git integration
PWA support (works on mobile)
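
For anyone wondering where the cost number comes from, here is a back-of-the-envelope sketch in TypeScript. It only counts input tokens. The 20,000-token baseline and the request sizes are illustrative assumptions rather than measurements, and `inputTokenSavings` is a hypothetical helper, not part of Huiyu Pi. Output tokens and provider-side prompt caching are ignored.

```typescript
// Fraction of input tokens saved per request when a large system prompt
// is replaced by a small one. All numbers below are illustrative assumptions.
function inputTokenSavings(
  largePromptTokens: number, // assumed baseline system prompt size
  smallPromptTokens: number, // ~80 tokens, per the post
  requestTokens: number,     // user message plus attached code (hypothetical size)
): number {
  const before = largePromptTokens + requestTokens;
  const after = smallPromptTokens + requestTokens;
  return (before - after) / before;
}

// Compare a few hypothetical request sizes against a 20K-token baseline.
for (const requestTokens of [500, 2_000, 10_000]) {
  const saved = inputTokenSavings(20_000, 80, requestTokens);
  console.log(
    `${requestTokens} request tokens: ${(saved * 100).toFixed(1)}% fewer input tokens`,
  );
}
// 500 request tokens: 97.2% fewer input tokens
// 2000 request tokens: 90.5% fewer input tokens
// 10000 request tokens: 66.4% fewer input tokens
```

Under these assumptions the saving is above 90% for short requests and shrinks as the request or conversation history grows, so the exact figure depends on how you use it.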

How to try:
npx huiyu-pi
Then open http://localhost:9144

Tech stack:
Frontend: React 19, TypeScript, Vite, Tailwind CSS
Backend: Fastify, WebSocket, SSE
Terminal: xterm.js + node-pty
License: MIT
GitHub: https://github.com/huiyu9144/Huiyu-Pi

Would love feedback from the self-hosted community!
