
Test voice agents with simulated callers using Hamming


Testing voice agents across different accents, background noise levels, and edge cases otherwise requires manual phone calls.

Tags: voice-agent, testing, hamming, simulation

Problem

Voice agents built with ElevenLabs, Play.ht, or custom pipelines are difficult to test systematically. Verifying accent handling, background noise robustness, and interruption behavior requires a person to manually call the agent and try each variation. This is slow, non-reproducible, and scales poorly. You cannot easily test 50 accent variations or simulate a noisy restaurant by hand.

Solution

Use Hamming.ai to automate voice agent testing with simulated callers, accent variations, and scenario playback.

Step 1: Define test scenarios

# test-scenarios.yaml
scenarios:
  - name: "Basic order placement"
    caller_profile:
      accent: "american-midwest"
      background_noise: "quiet"
    script:
      - say: "Hi, I'd like to place an order for delivery"
      - wait_for_response: true
      - say: "A large pepperoni pizza and garlic bread"
      - wait_for_response: true
      - say: "123 Main Street, apartment 4B"
    expected_outcomes:
      - order_captured: true
      - address_captured: "123 Main Street, apartment 4B"

  - name: "Accent handling - Indian English"
    caller_profile:
      accent: "indian-english"
      background_noise: "moderate-office"
    script:
      - say: "I am wanting to make a reservation for four people"
      - wait_for_response: true
      - say: "Saturday evening, seven thirty PM"
    expected_outcomes:
      - reservation_party_size: 4
      - reservation_time: "19:30"

  - name: "Interruption handling"
    caller_profile:
      accent: "australian"
    script:
      - say: "I need to cancel my--"
      - interrupt_after_ms: 500
      - say: "Sorry, I need to cancel my appointment tomorrow"
    expected_outcomes:
      - intent_recognized: "cancel_appointment"
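Before uploading a scenario file, it helps to sanity-check its shape locally. A minimal sketch of such a check is below; the `ScenarioStep` type and `validateScenario` helper are illustrative and not part of Hamming's SDK, they simply mirror the fields used in the YAML above:

```typescript
// Hypothetical shape check for scenario definitions (not part of Hamming's SDK).
type ScenarioStep = {
  say?: string;
  wait_for_response?: boolean;
  interrupt_after_ms?: number;
};

interface Scenario {
  name: string;
  caller_profile: { accent: string; background_noise?: string };
  script: ScenarioStep[];
  expected_outcomes: Record<string, unknown>;
}

function validateScenario(s: Scenario): string[] {
  const errors: string[] = [];
  if (!s.name) errors.push("scenario missing name");
  if (!s.caller_profile?.accent) errors.push(`${s.name}: missing caller accent`);
  if (!s.script?.length) errors.push(`${s.name}: empty script`);
  if (Object.keys(s.expected_outcomes ?? {}).length === 0)
    errors.push(`${s.name}: no expected outcomes to assert`);
  return errors;
}

// The "Interruption handling" scenario from the YAML, expressed as an object:
const scenario: Scenario = {
  name: "Interruption handling",
  caller_profile: { accent: "australian" },
  script: [
    { say: "I need to cancel my--" },
    { interrupt_after_ms: 500 },
    { say: "Sorry, I need to cancel my appointment tomorrow" },
  ],
  expected_outcomes: { intent_recognized: "cancel_appointment" },
};

console.log(validateScenario(scenario).length === 0 ? "valid" : "invalid");
```

Catching a missing accent or empty outcome list locally is cheaper than burning a simulated call on a malformed scenario.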

Step 2: Run the test suite

import { HammingClient } from "@hamming/sdk";

const hamming = new HammingClient({ apiKey: process.env.HAMMING_API_KEY });

const results = await hamming.runTestSuite({
  agentPhoneNumber: "+1-555-YOUR-AGENT",
  scenarioFile: "./test-scenarios.yaml",
  parallel: 5,
});

for (const result of results) {
  console.log(`${result.scenario}: ${result.passed ? "PASS" : "FAIL"}`);
  console.log(`  Latency: ${result.avgResponseMs}ms`);
  console.log(`  Accuracy: ${result.transcriptAccuracy}%`);
}
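In CI, the same results can gate a deploy. The sketch below assumes the `scenario`/`passed` fields shown in the snippet above; treat that result shape as an assumption about the SDK rather than a documented contract:

```typescript
// Summarize suite results and derive a CI exit code (sketch; the result
// shape mirrors the snippet above and is an assumption about the SDK).
interface TestResult {
  scenario: string;
  passed: boolean;
  avgResponseMs: number;
}

function summarize(results: TestResult[]): { failed: string[]; exitCode: number } {
  const failed = results.filter((r) => !r.passed).map((r) => r.scenario);
  return { failed, exitCode: failed.length > 0 ? 1 : 0 };
}

// Example with one failing scenario:
const summary = summarize([
  { scenario: "Basic order placement", passed: true, avgResponseMs: 820 },
  { scenario: "Interruption handling", passed: false, avgResponseMs: 1450 },
]);
console.log(summary.failed.join(", "), summary.exitCode);
// In a real pipeline you would end with: process.exit(summary.exitCode)
```

A non-zero exit code fails the pipeline step, so a regression in interruption handling blocks the release instead of surfacing in production calls.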

Why It Works

Hamming simulates realistic callers by combining text-to-speech with accent models and background noise injection. The simulated caller follows a script but adapts to the agent's responses, mimicking real conversation flow. By defining expected outcomes, you get automated pass/fail results without listening to every call. Running scenarios in parallel tests dozens of variations in minutes instead of hours of manual calling.
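The pass/fail mechanism can be pictured as a comparison between the expected outcomes declared in the YAML and facts extracted from the call transcript. This is a hypothetical sketch of that comparison, not Hamming's actual implementation:

```typescript
// Hypothetical outcome check: every declared expectation must match a fact
// extracted from the call; extra extracted facts are ignored.
function checkOutcomes(
  expected: Record<string, unknown>,
  extracted: Record<string, unknown>,
): boolean {
  return Object.entries(expected).every(([key, value]) => extracted[key] === value);
}

// Expectations from the "Accent handling" scenario vs. extracted call facts:
const passed = checkOutcomes(
  { reservation_party_size: 4, reservation_time: "19:30" },
  { reservation_party_size: 4, reservation_time: "19:30", notes: "window seat" },
);
console.log(passed); // true
```

Because the check is keyed on declared expectations only, the agent is free to capture additional details without failing the scenario.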

Context

  • Hamming.ai focuses specifically on voice agent QA, unlike general speech-to-text benchmarks
  • Use it for regression testing after changing your voice agent's prompt, model, or speech provider
  • Pairs well with ElevenLabs Conversational AI, Sesame, or any agent reachable via phone number
  • The cost per simulated call is significantly lower than manual QA staff test calls
  • For testing speech-to-text accuracy alone, consider dedicated STT benchmarks rather than full call simulation
About this share
Contributor: mblode
Repository: mblode/shares
Created: Feb 10, 2026