---
title: Voice Simulation Runs
description: Test your Voice Agent's interaction capabilities with realistic voice simulations on thousands of scenarios and evaluate your agent.
---

## Evaluate voice agent performance at scale with simulated sessions across multiple scenarios

Run a test run against a dataset containing multiple scenarios for your voice agent.

<Steps>

<Step title="Create a dataset for testing">
Configure the agent dataset template with:
- **Agent scenarios**: Define specific situations to test (e.g., "Update address", "Order an iPhone")
- **Expected steps**: List the actions and responses you expect
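The exact dataset schema depends on your workspace template; as a rough illustration only (the field names `scenario` and `expected_steps` are hypothetical), a two-entry dataset could look like this:

```python
# Hypothetical dataset entries mirroring the template fields above;
# the real template in your workspace may use different field names.
dataset = [
    {
        "scenario": "Update address",
        "expected_steps": [
            "Verify the caller's identity",
            "Confirm the new address",
            "Acknowledge the update and close the call",
        ],
    },
    {
        "scenario": "Order an iPhone",
        "expected_steps": [
            "Ask for the desired model and storage size",
            "Confirm the shipping address",
            "Summarize the order before placing it",
        ],
    },
]
```

Each entry drives one simulated session, so more entries mean broader scenario coverage per test run.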

</Step>

<Step title="Set up the test run">
- Navigate to your voice agent and click "Test". "Simulated session" mode is pre-selected, since voice agents cannot be tested in a single turn.
- Pick your agent dataset from the dropdown.
- Select the relevant evaluators.

<Note>
  At the moment, only built-in evaluators are supported for voice simulation runs; custom evaluators will be supported in the future.
</Note>

</Step>

<Step title="Execute the test run">
- Click "Trigger test run" to begin.
- The system calls your voice agent and simulates a conversation for each scenario.
</Step>

<Step title="Review results">
- Each session runs end-to-end for thorough evaluation.
- You'll see detailed results for every scenario.
- Text-based evaluators are scored on the turns of the call transcription; the rest are scored on the call recording audio file.
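This split can be pictured as a dispatch step: text evaluators receive the transcript turns, audio evaluators receive the recording. The sketch below is illustrative only and is not the platform's internal API; all names in it are made up:

```python
# Illustrative dispatch: text evaluators see transcript turns,
# audio evaluators see the recording file path.
def route_evaluators(evaluators, transcript_turns, recording_path):
    """Run each evaluator against the input its kind requires."""
    results = {}
    for name, evaluator in evaluators.items():
        if evaluator["kind"] == "text":
            results[name] = evaluator["run"](transcript_turns)
        else:  # "audio"
            results[name] = evaluator["run"](recording_path)
    return results

turns = [
    "Agent: Hello, how can I help?",
    "User: I want to update my address.",
]
evaluators = {
    "greets_user": {"kind": "text", "run": lambda t: "hello" in t[0].lower()},
    "has_recording": {"kind": "audio", "run": lambda path: path.endswith(".wav")},
}
results = route_evaluators(evaluators, turns, "call_001.wav")
```

The takeaway is simply that an evaluator's kind determines which artifact of the session it inspects.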

</Step>

<Step title="Inspect a single entry">
- Click any entry to inspect the detailed results for that specific scenario.
- By default, a test run evaluates a few metrics for each entry based on the recording audio file:
  - **Avg latency**: How long the agent took to respond to the prompt
  - **Talk ratio**: How much of the time the agent was talking compared to the simulation agent
  - **Avg pitch**: The average pitch of the agent's responses
  - **Words per minute**: How many words the agent spoke per minute
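To make two of these metrics concrete, here is a rough sketch of how talk ratio and words per minute could be derived from timestamped transcript segments. The segment format and numbers are invented for illustration and do not reflect how the platform computes them internally:

```python
# (speaker, start_sec, end_sec, text) — illustrative timestamped segments
segments = [
    ("agent", 0.0, 4.0, "Hello, thanks for calling, how can I help you today"),
    ("sim",   4.5, 7.0, "I would like to update my address"),
    ("agent", 7.5, 12.0, "Sure, I can help with that, what is the new address"),
]

# Talk ratio: agent speaking time relative to the simulation agent's
agent_time = sum(end - start for spk, start, end, _ in segments if spk == "agent")
sim_time = sum(end - start for spk, start, end, _ in segments if spk == "sim")
talk_ratio = agent_time / sim_time

# Words per minute: agent word count divided by agent speaking minutes
agent_words = sum(len(text.split()) for spk, _, _, text in segments if spk == "agent")
words_per_minute = agent_words / (agent_time / 60)
```

A high talk ratio with a high words-per-minute figure often points to an agent that monologues instead of letting the caller steer the conversation.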

</Step>

</Steps>