
Route AI tasks to the best model using a Node.js orchestrator


Different AI tasks need different models, but manually switching between providers breaks your workflow.

Tags: node, agents, routing, anthropic-sdk, orchestration

Problem

Some tasks need deep reasoning (Opus), some need speed (Haiku), and some need vision or code generation (Sonnet). When you hard-code a single model, you either overpay for simple tasks or get poor results on complex ones. Manually choosing a model per request is tedious and breaks the flow of automated pipelines. You need a router that dispatches to the right model based on what the task actually requires.

Solution

Step 1: Define task categories and model mappings

const TASK_ROUTES = {
  reasoning: {
    model: "claude-opus-4-6",
    description: "Architecture decisions, complex debugging, code review",
  },
  generation: {
    model: "claude-sonnet-4-5-20250929",
    description: "Code generation, refactoring, standard implementation",
  },
  triage: {
    model: "claude-haiku-4-5-20251001",
    description: "Classification, summarization, simple Q&A, routing",
  },
  vision: {
    model: "claude-sonnet-4-5-20250929",
    description: "Image analysis, screenshot interpretation, diagram reading",
  },
};

Step 2: Build the router with the Anthropic SDK

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function classifyTask(prompt) {
  const response = await client.messages.create({
    model: TASK_ROUTES.triage.model,
    max_tokens: 50,
    messages: [{
      role: "user",
      content: `Classify this task into exactly one category: reasoning, generation, triage, vision.
Task: "${prompt.substring(0, 200)}"
Category:`,
    }],
  });
  // Scan the reply for a known category name rather than requiring an exact
  // match, since the model may answer with extra words (e.g. "Category: triage")
  const raw = response.content[0].text.trim().toLowerCase();
  const category = Object.keys(TASK_ROUTES).find((c) => raw.includes(c));
  return category ?? "generation";
}

async function routeTask(prompt, options = {}) {
  const category = options.forceCategory || await classifyTask(prompt);
  // Fall back to the mid-tier route if forceCategory names an unknown category
  const route = TASK_ROUTES[category] ?? TASK_ROUTES.generation;

  const response = await client.messages.create({
    model: route.model,
    max_tokens: options.maxTokens || 4096,
    messages: [{ role: "user", content: prompt }],
  });

  return {
    category,
    model: route.model,
    content: response.content[0].text,
    usage: response.usage,
  };
}
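To tune the category-to-model assignments over time, it helps to record each routing decision as it happens. A minimal in-memory sketch (the routeLog array and both function names are illustrative, not part of the SDK; result is the object routeTask returns):

```javascript
// Record each routing decision so category frequencies can be reviewed later.
const routeLog = [];

function logDecision(result) {
  routeLog.push({
    at: new Date().toISOString(),
    category: result.category,
    model: result.model,
    inputTokens: result.usage.input_tokens,
    outputTokens: result.usage.output_tokens,
  });
  return result; // pass through, so it can wrap a routeTask call inline
}

// Tally how often each category was chosen.
function categoryCounts() {
  return routeLog.reduce((acc, entry) => {
    acc[entry.category] = (acc[entry.category] || 0) + 1;
    return acc;
  }, {});
}
```

Wrapping is a one-liner at the call site: `const result = logDecision(await routeTask(prompt));`.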

Step 3: Add cost tracking

const COST_PER_MILLION = {
  "claude-opus-4-6": { input: 15, output: 75 },
  "claude-sonnet-4-5-20250929": { input: 3, output: 15 },
  "claude-haiku-4-5-20251001": { input: 0.8, output: 4 },
};

function estimateCost(model, usage) {
  const rates = COST_PER_MILLION[model];
  const inputCost = (usage.input_tokens / 1_000_000) * rates.input;
  const outputCost = (usage.output_tokens / 1_000_000) * rates.output;
  return { inputCost, outputCost, total: inputCost + outputCost };
}
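As a sanity check, here is what the rate table above implies for a typical routed call. The token counts are illustrative; the usage object is shaped like response.usage from the SDK:

```javascript
// Same rate table and helper as above, repeated so this snippet runs standalone.
const COST_PER_MILLION = {
  "claude-opus-4-6": { input: 15, output: 75 },
  "claude-sonnet-4-5-20250929": { input: 3, output: 15 },
  "claude-haiku-4-5-20251001": { input: 0.8, output: 4 },
};

function estimateCost(model, usage) {
  const rates = COST_PER_MILLION[model];
  const inputCost = (usage.input_tokens / 1_000_000) * rates.input;
  const outputCost = (usage.output_tokens / 1_000_000) * rates.output;
  return { inputCost, outputCost, total: inputCost + outputCost };
}

// Illustrative request: 2,000 input tokens, 500 output tokens.
const usage = { input_tokens: 2000, output_tokens: 500 };

const sonnet = estimateCost("claude-sonnet-4-5-20250929", usage);
const haiku = estimateCost("claude-haiku-4-5-20251001", usage);

console.log(sonnet.total.toFixed(4)); // "0.0135"
console.log(haiku.total.toFixed(4));  // "0.0036"
```

At these rates the same request costs 3.75x more on Sonnet than on Haiku, which is where the routing savings come from.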

Step 4: Use the router in your pipeline

// Simple tasks go to Haiku automatically
const summary = await routeTask("Summarize this error log in one sentence: ...");

// Complex tasks route to Opus
const review = await routeTask("Review this auth module for security issues: ...");

// Force a specific route when you know best
const code = await routeTask(prompt, { forceCategory: "generation" });

Why It Works

Using Haiku for classification is cheap and fast, costing fractions of a cent per routing decision. The classification step inspects the task description and selects the model that matches the complexity, so simple tasks avoid the cost and latency of Opus while complex tasks get the reasoning power they need. The forceCategory escape hatch lets you override the router when you have domain knowledge about what a task requires.

Context

  • The triage model (Haiku) adds approximately 200ms to each request for classification
  • For latency-sensitive pipelines, skip classification and use static route rules based on the calling function
  • Cost savings are significant: routing 80% of tasks to Haiku instead of Sonnet can reduce API spend by 70%+
  • The same pattern works with OpenAI, Google, or mixed-provider setups by extending the route map
  • Add fallback logic so if the primary model is overloaded, the router retries with an alternative
  • Log routing decisions to track which categories dominate and tune the model assignments over time
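The fallback bullet above can be sketched as a thin wrapper around messages.create. This is a sketch under assumptions: the FALLBACKS map and callWithFallback are illustrative names, and it keys off the 529 status code the Anthropic API returns for overloaded_error:

```javascript
// Illustrative fallback chain: if the primary model is overloaded, retry the
// same request once on the next model down. Names here are not part of the SDK.
const FALLBACKS = {
  "claude-opus-4-6": "claude-sonnet-4-5-20250929",
  "claude-sonnet-4-5-20250929": "claude-haiku-4-5-20251001",
};

async function callWithFallback(client, model, params) {
  try {
    return await client.messages.create({ model, ...params });
  } catch (err) {
    const fallback = FALLBACKS[model];
    if (fallback && err.status === 529) {
      // Primary model overloaded: retry once on the fallback model.
      return client.messages.create({ model: fallback, ...params });
    }
    throw err; // no fallback configured, or a non-overload error
  }
}
```

Inside routeTask, this would replace the direct call: `callWithFallback(client, route.model, { max_tokens, messages })`.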
About this share

Contributor: mblode
Repository: mblode/shares
Created: Feb 10, 2026