Problem
You want to provide Claude Code as a service to multiple users, but running Claude Code against the API with per-token billing costs $15/$75 per million input/output tokens. A Claude Max subscription at $100-200/month is considerably cheaper for heavy usage, but the Claude Code binary is designed for single-user, local use. You need an architecture that runs the actual Claude Code binary for each user while keeping costs manageable and sessions isolated.
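To make the cost gap concrete, here is a small worked calculation using the per-token prices quoted above. The monthly usage figures are hypothetical, chosen only to illustrate a heavy user, and the sketch ignores prompt-caching discounts and subscription usage limits.

```typescript
// API prices quoted above, in USD per million tokens.
const INPUT_USD_PER_MTOK = 15;
const OUTPUT_USD_PER_MTOK = 75;

interface MonthlyUsage {
  inputTokens: number;
  outputTokens: number;
}

// Cost of the given usage at per-token API pricing.
function apiCostUsd(usage: MonthlyUsage): number {
  return (
    (usage.inputTokens / 1_000_000) * INPUT_USD_PER_MTOK +
    (usage.outputTokens / 1_000_000) * OUTPUT_USD_PER_MTOK
  );
}

// Hypothetical heavy user: 20M input and 2M output tokens in one month.
const heavyUser: MonthlyUsage = { inputTokens: 20_000_000, outputTokens: 2_000_000 };

// 20 * 15 + 2 * 75 = 450 USD, versus a $100-200 subscription.
const heavyUserApiCost = apiCostUsd(heavyUser);
```

Under these assumed numbers the API bill is more than twice the price of the top subscription tier; lighter users may fall below the break-even point.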
Solution
Architecture: Node.js orchestrator + Fly.io Firecracker VMs + WebSocket tunnels
User Browser
↓ WebSocket
Node.js Orchestrator (Fly.io app)
↓ SSH / WebSocket tunnel
Ephemeral Firecracker VM (per user)
└── Claude Code binary + user's git repo
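The orchestrator in this diagram has to map each user to exactly one VM. A minimal sketch of that bookkeeping follows; `Provisioner` and `SessionRegistry` are hypothetical names introduced for illustration, and the real `provision` and `destroy` implementations would wrap the Fly Machines calls from Steps 1 and 4.

```typescript
// Hypothetical interface; real implementations would call the Fly Machines API.
interface Provisioner {
  provision(userId: string): Promise<string>; // resolves to a machine ID
  destroy(machineId: string): Promise<void>;
}

class SessionRegistry {
  // Stores the in-flight promise so concurrent connects from one user share a VM.
  private sessions = new Map<string, Promise<string>>();
  private provisioner: Provisioner;

  constructor(provisioner: Provisioner) {
    this.provisioner = provisioner;
  }

  // Return the user's machine ID, provisioning a VM only on first use.
  acquire(userId: string): Promise<string> {
    const existing = this.sessions.get(userId);
    if (existing) return existing;

    const pending = this.provisioner.provision(userId);
    this.sessions.set(userId, pending);
    // Forget failed provisions so the next connect can retry.
    pending.catch(() => {
      if (this.sessions.get(userId) === pending) this.sessions.delete(userId);
    });
    return pending;
  }

  // Drop the session and destroy its VM, if one exists.
  async release(userId: string): Promise<void> {
    const pending = this.sessions.get(userId);
    if (!pending) return;
    this.sessions.delete(userId);
    await this.provisioner.destroy(await pending);
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

Storing the promise rather than the resolved ID avoids provisioning two VMs when a user opens two browser tabs at nearly the same time.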
Step 1: Define the VM template with Fly Machines API
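// Fly API token, read from the orchestrator's environment.
const FLY_API_TOKEN = process.env.FLY_API_TOKEN;

interface VMConfig {
  userId: string;
  repoUrl: string;
  oauthToken: string;
}

// Create one ephemeral machine for a user and return its machine ID.
async function provisionVM(config: VMConfig): Promise<string> {
  const response = await fetch("https://api.machines.dev/v1/apps/cc-workers/machines", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${FLY_API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      config: {
        image: "registry.fly.io/cc-worker:latest",
        guest: { cpus: 2, memory_mb: 2048 },
        env: {
          REPO_URL: config.repoUrl,
          CLAUDE_OAUTH_TOKEN: config.oauthToken,
        },
        // Tag the machine with its owner so it can be looked up later.
        metadata: { user_id: config.userId },
        auto_destroy: true,
        restart: { policy: "no" },
      },
    }),
  });
  // Fail loudly instead of returning an undefined machine ID.
  if (!response.ok) {
    throw new Error(`Fly Machines API returned ${response.status}: ${await response.text()}`);
  }
  const machine = await response.json();
  return machine.id;
}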
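Creating a machine returns before the VM is ready to accept connections, so the orchestrator may want to block until it reports the `started` state. The sketch below assumes the Machines API wait endpoint (`/wait?state=started&timeout=...`); `FetchLike`, `buildWaitUrl`, and `waitForStarted` are hypothetical helper names, and the fetch implementation is passed in so the function can be exercised without network access.

```typescript
// Minimal structural type so the sketch does not depend on one fetch implementation.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number }>;

const FLY_API_BASE = "https://api.machines.dev/v1";

// Build the URL for the (assumed) wait endpoint of a single machine.
function buildWaitUrl(app: string, machineId: string, timeoutSec: number): string {
  return `${FLY_API_BASE}/apps/${app}/machines/${machineId}/wait?state=started&timeout=${timeoutSec}`;
}

// Resolve once the machine is started; throw if the wait fails or times out.
async function waitForStarted(
  fetchImpl: FetchLike,
  token: string,
  app: string,
  machineId: string,
  timeoutSec = 60,
): Promise<void> {
  const response = await fetchImpl(buildWaitUrl(app, machineId, timeoutSec), {
    method: "GET",
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!response.ok) {
    throw new Error(`Machine ${machineId} did not reach "started" (HTTP ${response.status})`);
  }
}
```

In the orchestrator this could be called right after provisioning, for example `await waitForStarted(fetch, token, "cc-workers", machineId)`, before opening the tunnel to the VM.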
Step 2: Dockerfile for the worker VM
FROM ubuntu:24.04
# Install runtime dependencies and the Claude Code CLI, then drop the apt cache
RUN apt-get update && apt-get install -y \
    curl git nodejs npm openssh-server \
    && npm install -g @anthropic-ai/claude-code \
    && rm -rf /var/lib/apt/lists/*
# Setup SSH for tunnel access
RUN mkdir -p /run/sshd
COPY sshd_config /etc/ssh/sshd_config
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
Step 3: Stream output back via WebSocket
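import { WebSocketServer } from "ws";
import { spawn } from "child_process";

// Inside the VM: bridge Claude Code stdio to WebSocket
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (ws) => {
  // One Claude Code process per connection. The flags assume a CLI version that
  // supports streaming JSON I/O in print mode; confirm with `claude --help`.
  const claude = spawn(
    "claude",
    ["--print", "--verbose", "--input-format", "stream-json", "--output-format", "stream-json"],
    {
      cwd: "/workspace",
      // Assumption: the CLI reads subscription credentials from CLAUDE_CODE_OAUTH_TOKEN.
      env: { ...process.env, CLAUDE_CODE_OAUTH_TOKEN: process.env.CLAUDE_OAUTH_TOKEN },
    },
  );
  claude.stdout.on("data", (data) => ws.send(data.toString()));
  // Surface CLI errors instead of silently dropping them.
  claude.stderr.on("data", (data) => console.error(data.toString()));
  ws.on("message", (msg) => claude.stdin.write(msg.toString() + "\n"));
  claude.on("exit", () => ws.close());
  ws.on("close", () => claude.kill());
});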
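One caveat with forwarding stdout chunks directly: `data` events deliver arbitrary byte ranges, so a single JSON message can be split across two WebSocket frames or merged with the next one. A minimal sketch of line framing follows, assuming the CLI emits newline-delimited JSON; `LineBuffer` is a hypothetical helper name.

```typescript
// Accumulates stream chunks and emits only complete, newline-terminated lines.
class LineBuffer {
  private pending = "";

  // Feed one chunk; returns every full line completed by it.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is an unfinished line (or "" if the chunk ended on "\n").
    this.pending = parts.pop() ?? "";
    // Skip blank lines so the client never receives empty frames.
    return parts.filter((line) => line.length > 0);
  }
}

// Example: one message arrives split across three chunks.
const exampleBuffer = new LineBuffer();
const framesA = exampleBuffer.push('{"type":"res');
const framesB = exampleBuffer.push('ult"}\n{"type":"x"}\n{"ty');
const framesC = exampleBuffer.push('pe":"y"}\n');
```

In the bridge, each line returned by `push` would be sent as its own WebSocket message. Calling `claude.stdout.setEncoding("utf8")` first also avoids splitting multi-byte characters at chunk boundaries.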
Step 4: Destroy VM after session ends
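// Destroy a user's machine; force=true removes it even if it is still running.
async function destroyVM(machineId: string): Promise<void> {
  const response = await fetch(
    `https://api.machines.dev/v1/apps/cc-workers/machines/${machineId}?force=true`,
    { method: "DELETE", headers: { Authorization: `Bearer ${FLY_API_TOKEN}` } },
  );
  if (!response.ok) {
    throw new Error(`Failed to destroy machine ${machineId}: HTTP ${response.status}`);
  }
}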
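Deciding when a session has ended is left open above. One common approach is an idle timeout: the orchestrator records the time of the last WebSocket message per machine and periodically destroys machines that have been quiet too long. The sketch below covers only the selection step; `SessionActivity` and `findIdleMachines` are hypothetical names, and the 15-minute limit is an arbitrary example value.

```typescript
interface SessionActivity {
  machineId: string;
  lastActivityMs: number; // epoch milliseconds of the last message in either direction
}

// Return the IDs of machines that have been idle for at least idleLimitMs.
function findIdleMachines(
  sessions: SessionActivity[],
  nowMs: number,
  idleLimitMs: number,
): string[] {
  return sessions
    .filter((session) => nowMs - session.lastActivityMs >= idleLimitMs)
    .map((session) => session.machineId);
}

// Example with a 15-minute idle limit (an arbitrary choice).
const FIFTEEN_MINUTES_MS = 15 * 60 * 1000;
const exampleNowMs = 1_700_000_000_000;
const idleIds = findIdleMachines(
  [
    { machineId: "m-active", lastActivityMs: exampleNowMs - 60_000 },
    { machineId: "m-idle", lastActivityMs: exampleNowMs - 20 * 60 * 1000 },
  ],
  exampleNowMs,
  FIFTEEN_MINUTES_MS,
);
```

A periodic timer in the orchestrator could pass each returned ID to `destroyVM`. Keeping the selection logic pure makes the timeout policy easy to test without real machines.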
Why It Works
Fly.io Machines run as Firecracker microVMs, which boot in under a second and isolate each workload in its own virtual machine. Each user gets an ephemeral VM with Claude Code installed, their git repo cloned, and OAuth credentials injected. The auto_destroy: true flag removes the VM once its main process exits, so you only pay for active compute. Because each VM runs the actual Claude Code binary authenticated with an OAuth subscription token, usage counts against the flat subscription instead of being billed at per-token API prices.
Context
- Fly Machines bill per-second of compute, so a 10-minute session costs fractions of a cent in compute
- The OAuth token approach uses the Claude subscription, which is significantly cheaper than API pricing for heavy usage
- Always use auto_destroy and set resource limits to prevent runaway costs
- Git operations in the VM should point at a remote server so work persists after VM destruction
- Consider Fly.io Sprites (sprites.dev) for a managed version of this pattern with built-in sandbox tooling
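The per-second billing point in the list above can be checked with simple arithmetic. The rate below is a hypothetical placeholder, not a quoted Fly.io price; substitute the current per-second rate for the VM size you choose.

```typescript
// Compute cost of one session under per-second billing.
function sessionComputeCostUsd(durationSec: number, usdPerSecond: number): number {
  return durationSec * usdPerSecond;
}

// Hypothetical rate for illustration only; look up real pricing before relying on it.
const ASSUMED_USD_PER_SECOND = 0.000005;

// A 10-minute session: 600 s * 0.000005 USD/s is roughly 0.003 USD (0.3 cents).
const tenMinuteSessionUsd = sessionComputeCostUsd(10 * 60, ASSUMED_USD_PER_SECOND);
```

At this assumed rate, compute is a negligible part of the total compared with the subscription itself.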