Offline-first · No API keys · Open source

Your code stays
on your machine.

CyberPaw is a desktop coding agent powered by local LLMs. Read files, write code, run tests — all without sending a single byte to the cloud.

Download & Install Developer Guide → ★ GitHub
CyberPaw — ~/projects/myapp
Working directory: ~/projects/myapp
Loading gemma-4-E4B-it-Q4_K_M.gguf…
Model ready.

add error handling to the login function in src/auth.py

▸ Read src/auth.py
▸ Edit src/auth.py — wrapped auth call in try/except
▸ Bash pytest tests/test_auth.py

Done. Added try/except around the JWT decode call;
it now returns a 401 on InvalidTokenError. All 12 tests pass.

_

Everything a coding agent needs,
nothing it doesn't.

Built for developers who want full capability without giving up privacy or control.

🔒
Fully Offline
Runs entirely on your hardware using llama-cpp-python. No internet required after setup. Your code never leaves your machine.
🛠️
20 Built-in Tools
Read, write, edit files. Run shell commands. Search codebases with grep and glob. Execute Python in a persistent REPL. Browse the web (opt-in).
🖥️
Native Desktop App
Built with Tauri 2 — a native macOS/Windows/Linux window. Not Electron. Fast startup, small footprint, proper OS integration.
🤖
Multi-Agent
Spawn sub-agents for focused sub-tasks. Up to 3 levels of nesting. Each sub-agent has its own conversation context and tool access.
⚡
Metal Accelerated
On Apple Silicon, llama-cpp-python runs on the GPU via Metal. Get 30–50 tokens/sec on an M2 Pro with the E4B model.
🧩
Extensible
Add a new tool in one Python file. The tool registry auto-generates the schema injected into the system prompt — no boilerplate.
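The "one Python file" extensibility claim can be sketched as a minimal tool registry. The decorator name, registry layout, and schema fields below are illustrative assumptions, not CyberPaw's actual API:

```python
import inspect
import json

# Hypothetical registry; CyberPaw's real decorator and field names may differ.
TOOL_REGISTRY = {}

def tool(description):
    """Register a function as an agent tool and capture its parameter schema."""
    def decorate(fn):
        params = {
            name: p.annotation.__name__
            for name, p in inspect.signature(fn).parameters.items()
        }
        TOOL_REGISTRY[fn.__name__] = {
            "description": description,
            "parameters": params,
            "fn": fn,
        }
        return fn
    return decorate

@tool("Count lines in a text file")
def count_lines(path: str) -> int:
    with open(path) as f:
        return sum(1 for _ in f)

def tools_for_system_prompt():
    """Auto-generate the schema block injected into the system prompt."""
    return json.dumps(
        {name: {k: v for k, v in t.items() if k != "fn"}
         for name, t in TOOL_REGISTRY.items()},
        indent=2,
    )
```

Because the schema is derived from the function signature, a new tool is just a decorated function in its own file: no separate schema to maintain.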

Three layers, one pipe.

CyberPaw is a Tauri app wrapping a React terminal UI that talks to a Python agent over NDJSON on stdin/stdout.

01 — FRONTEND
React + xterm.js
A terminal UI in a Tauri WebView. Handles input, renders streamed tokens, shows tool call events and permission dialogs.
02 — BRIDGE
Tauri (Rust)
Spawns the Python sidecar, bridges WebView ↔ sidecar via NDJSON on stdin/stdout. Persists config. Manages the window.
03 — AGENT
Python + llama.cpp
Asyncio event loop running the agent harness. Streams tokens from the local model, parses tool calls, executes tools, loops.

Zero data leaves
your machine.

Every inference call goes to a process on localhost. No telemetry, no cloud sync, no API keys to rotate.

  • Local inference: llama-cpp-python runs the model in-process. No network calls during generation.
  • No telemetry: CyberPaw collects nothing. No crash reports, no usage analytics, no account required.
  • Opt-in network access: Web tools (WebFetch, WebSearch, Playwright) are disabled by default and require explicit opt-in per session.
  • Permission system: Every file write and shell command can require your approval before executing.
# All traffic stays on localhost
 
OUTBOUND_REQUESTS = 0
API_KEYS_REQUIRED = 0
TELEMETRY_EVENTS = 0
CLOUD_SYNC = False
 
# Inference path
user_prompt
→ tauri_ipc
→ python_sidecar
→ llama_cpp (localhost)
→ streamed_tokens
→ terminal_ui
 
✓ No external calls made

Download once, run forever.

CyberPaw ships without model weights. Pick the one that fits your hardware. Download from the app — no manual steps.

RECOMMENDED
Gemma 4 E2B
Google's 2B MoE model. Fast responses, good code quality. Best choice for 8 GB RAM machines.
Q4_K_M 2.9 GB 8 GB RAM
Gemma 4 E4B
Google's 4B MoE model. Better reasoning and code quality. Recommended for 16 GB RAM machines.
Q4_K_M 4.6 GB 16 GB RAM

Models are saved to ~/CyberPaw/models/. The app detects already-downloaded models and shows a Load button — no re-downloading needed.

Your private coding agent is one command away.

Run the setup script, download a model, and start coding — all offline.