Measure and improve the quality of the AI-Q blueprint.
To create custom evaluators or benchmarks, refer to the [NeMo Agent Toolkit Evaluation documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/improve-workflows/evaluate.html). The benchmarks below are pre-built for AI-Q.
- Benchmarks: Run standardized evaluation suites