Offline-first · No API keys · Open source

Your code stays
on your machine.

CyberPaw is a desktop coding agent powered by local LLMs. Read files, write code, run tests — all without sending a single byte to the cloud.

Download & Install Developer Guide → ★ GitHub
CyberPaw — ~/projects/myapp
Working directory: ~/projects/myapp
Loading gemma-4-E4B-it-Q4_K_M.gguf…
Model ready.

add error handling to the login function in src/auth.py

▸ Read src/auth.py
▸ Edit src/auth.py — wrapped auth call in try/except
▸ Bash pytest tests/test_auth.py

Done. Added try/except around the JWT decode call;
it now returns a 401 on InvalidTokenError. All 12 tests pass.

_

Everything a coding agent needs,
nothing it doesn't.

Built for developers who want full capability without giving up privacy or control.

🔒
Fully Offline
Runs entirely on your hardware using llama-cpp-python. No internet required after setup. Your code never leaves your machine.
🛠️
20 Built-in Tools
Read, write, edit files. Run shell commands. Search codebases with grep and glob. Execute Python in a persistent REPL. Browse the web (opt-in).
🖥️
Native Desktop App
Built with Tauri 2 — a native macOS/Windows/Linux window. Not Electron. Fast startup, small footprint, proper OS integration.
🤖
Multi-Agent
Spawn sub-agents for focused sub-tasks. Up to 3 levels of nesting. Each sub-agent has its own conversation context and tool access.
⚡
Metal Accelerated
On Apple Silicon, llama-cpp-python runs on the GPU via Metal. Get 30–50 tokens/sec on an M2 Pro with the E4B model.
🧩
Extensible
Add a new tool in one Python file. The tool registry auto-generates the schema injected into the system prompt — no boilerplate.
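The "one Python file" extensibility claim can be sketched as a minimal tool registry. The decorator name, registry layout, and schema fields below are illustrative assumptions, not CyberPaw's actual API:

```python
import inspect
import json

# Hypothetical registry; CyberPaw's real decorator and field names may differ.
TOOL_REGISTRY = {}

def tool(description):
    """Register a function as an agent tool and capture its parameter schema."""
    def decorate(fn):
        params = {
            name: p.annotation.__name__
            for name, p in inspect.signature(fn).parameters.items()
        }
        TOOL_REGISTRY[fn.__name__] = {
            "description": description,
            "parameters": params,
            "fn": fn,
        }
        return fn
    return decorate

@tool("Count lines in a text file")
def count_lines(path: str) -> int:
    with open(path) as f:
        return sum(1 for _ in f)

def tools_for_system_prompt():
    """Auto-generate the schema block injected into the system prompt."""
    return json.dumps(
        {name: {k: v for k, v in t.items() if k != "fn"}
         for name, t in TOOL_REGISTRY.items()},
        indent=2,
    )
```

Because the schema is derived from the function signature, a new tool is just a decorated function in its own file: no separate schema to maintain.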

Three layers, one pipe.

CyberPaw is a Tauri app wrapping a React terminal UI that talks to a Python agent over NDJSON on stdin/stdout.

01 — FRONTEND
React + xterm.js
A terminal UI in a Tauri WebView. Handles input, renders streamed tokens, shows tool call events and permission dialogs.
02 — BRIDGE
Tauri (Rust)
Spawns the Python sidecar, bridges WebView ↔ sidecar via NDJSON on stdin/stdout. Persists config. Manages the window.
03 — AGENT
Python + llama.cpp
Asyncio event loop running the agent harness. Streams tokens from the local model, parses tool calls, executes tools, loops.

Zero data leaves
your machine.

Every inference call goes to a process on localhost. No telemetry, no cloud sync, no API keys to rotate.

  • Local inference: llama-cpp-python runs the model in-process. No network calls during generation.
  • No telemetry: CyberPaw collects nothing. No crash reports, no usage analytics, no account required.
  • Opt-in network access: Web tools (WebFetch, WebSearch, Playwright) are disabled by default and require explicit opt-in per session.
  • Permission system: Every file write and shell command can require your approval before executing.
# All traffic stays on localhost
 
OUTBOUND_REQUESTS = 0
API_KEYS_REQUIRED = 0
TELEMETRY_EVENTS = 0
CLOUD_SYNC = False
 
# Inference path
user_prompt
→ tauri_ipc
→ python_sidecar
→ llama_cpp (localhost)
→ streamed_tokens
→ terminal_ui
 
✓ No external calls made

Download once, run forever.

CyberPaw ships without model weights. Pick the one that fits your hardware. Download from the app — no manual steps.

RECOMMENDED
Gemma 4 E2B
Google's 2B MoE model. Fast responses, good code quality. Best choice for 8 GB RAM machines.
Q4_K_M 2.9 GB 8 GB RAM
Gemma 4 E4B
Google's 4B MoE model. Better reasoning and code quality. Recommended for 16 GB RAM machines.
Q4_K_M 4.6 GB 16 GB RAM

Models are saved to ~/CyberPaw/models/. The app detects already-downloaded models and shows a Load button — no re-downloading needed.

Your private coding agent is one command away.

Run the setup script, download a model, and start coding — all offline.