
Conversation

@vinhngx commented Nov 5, 2025

Implementing an LLM TCO Calculator per the blog post: https://developer.nvidia.com/blog/llm-inference-benchmarking-how-much-does-your-llm-inference-cost/

This notebook allows you to:

  • set up a NIM LLM inference server
  • benchmark performance with AIPerf
  • collect the data and plug it into the LLM TCO calculator (an Excel spreadsheet)
  • work out inference costs, such as $/million tokens
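The cost metric in the last step boils down to a conversion from sustained throughput and hourly hardware cost. As an illustrative sketch (not code from the notebook; the rates below are placeholder values, not measured data):

```python
# Illustrative sketch (not from the notebook): converting sustained
# throughput and hourly GPU cost into $/million tokens.
# The input values below are placeholders, not measured data.

def cost_per_million_tokens(gpu_hourly_cost_usd: float,
                            tokens_per_second: float) -> float:
    """Cost in USD to generate one million tokens at a sustained rate."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_cost_usd / tokens_per_hour * 1_000_000

# Example: a $2.50/hour GPU instance sustaining 1000 output tokens/s
print(round(cost_per_million_tokens(2.50, 1000.0), 4))  # → 0.6944
```

The TCO calculator spreadsheet performs this kind of arithmetic from the benchmark data; the function above only shows the shape of the computation.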

Summary by CodeRabbit

Release Notes

  • Documentation

    • Added comprehensive README for AIPerf Utility Notebooks with setup and usage guidance.
  • New Features

    • New notebook for benchmarking NIM LLM deployments with performance metrics collection across multiple configurations.
    • Export aggregated performance results to Excel format for TCO calculator integration.

@github-actions bot commented Nov 5, 2025

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@main

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@main

@codecov bot commented Nov 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


@coderabbitai bot commented Nov 5, 2025

Walkthrough

The PR introduces two new files: a README documenting AIPerf Utility Notebooks and a comprehensive Jupyter notebook that integrates NVIDIA AIPerf benchmarking with NIM LLM server performance metrics collection, exporting results to Excel for TCO calculator integration.

Changes

Documentation: notebooks/README.md
Introduces a README for the AIPerf Utility Notebooks with an overview and a pointer to TCO_calculator.ipynb for benchmarking NIM LLM deployments and exporting results to the TCO calculator.

Benchmark Notebook: notebooks/TCO_calculator.ipynb
Comprehensive new notebook integrating AIPerf with a NIM LLM server. It covers setup/installation of AIPerf, metadata configuration for models and hardware, NIM server orchestration, benchmark execution across multiple concurrencies and sequence lengths, JSON artifact parsing, dataframe aggregation of performance metrics with metadata, and an Excel export workflow for downstream TCO calculator import.
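The parse/aggregate/export steps described above can be sketched roughly as follows. The metric field names and the in-memory "artifact" are assumptions for illustration; the actual AIPerf profile export schema may differ.

```python
# Rough sketch of the notebook's aggregation step (field names assumed;
# the real AIPerf profile_export schema may differ).
import json
import pandas as pd

def profile_to_row(profile_json: str, metadata: dict) -> dict:
    """Flatten one AIPerf profile export plus run metadata into one row."""
    metrics = json.loads(profile_json)
    row = dict(metadata)   # model, hardware, concurrency, sequence lengths...
    row.update(metrics)    # throughput, latency metrics, ...
    return row

# Stand-in for a real JSON artifact, so the sketch is self-contained
fake_profile = json.dumps({"output_token_throughput": 1234.5,
                           "time_to_first_token_ms": 85.2})
rows = [profile_to_row(fake_profile, {"model": "meta/llama3-8b-instruct",
                                      "concurrency": 8})]
df = pd.DataFrame(rows)
df.to_excel("data.xlsx", index=False)  # sheet later imported into the TCO calculator
```

In the notebook, one such row would be collected per concurrency/sequence-length combination before export.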

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Verify AIPerf installation and deprecation warning handling correctness
  • Validate benchmark orchestration logic and concurrency/sequence length iteration patterns
  • Review data aggregation logic that combines JSON profiles with metadata
  • Confirm Excel export schema and column mappings align with TCO calculator requirements
  • Validate sample metadata structure and configuration completeness

Poem

🐰 Hop, skip, and benchmark so fine,
AIPerf and NIM in perfect align,
Data streams flow like carrots divine,
To spreadsheets they hop in columns that shine,
TCO calculations now measure and design! ✨

Pre-merge checks

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add a LLM TCO calculator tool' directly and clearly summarizes the main change - introducing a new LLM TCO calculator tool. It accurately reflects the primary objective of the PR.


@coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
notebooks/TCO_calculator.ipynb (1)

772-775: Consider automating the Excel import process.

The current workflow requires manual copying of data from data.xlsx to the TCO calculator spreadsheet. This manual step is error-prone and could be automated using Python libraries like openpyxl.

If you'd like to automate this, I can help generate a script that:

  1. Opens the LLM_TCO_Calculator.xlsx workbook
  2. Locates the "data" sheet
  3. Appends the benchmark data programmatically
  4. Saves the updated workbook

Would you like me to create this automation?
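A minimal sketch of that automation, assuming openpyxl and the workbook/sheet names mentioned above; the column layout is an assumption and would need to match the calculator's "data" sheet:

```python
# Minimal sketch of the suggested automation using openpyxl.
# Workbook path and "data" sheet name come from the suggestion above;
# the column order is an assumption about the calculator's layout.
from openpyxl import Workbook, load_workbook

def append_benchmark_rows(workbook_path: str, rows: list) -> None:
    wb = load_workbook(workbook_path)  # 1. open the TCO workbook
    ws = wb["data"]                    # 2. locate the "data" sheet
    for row in rows:                   # 3. append benchmark rows
        ws.append(row)
    wb.save(workbook_path)             # 4. save the updated workbook

# Demo against a throwaway workbook so the sketch is self-contained;
# in practice the path would be LLM_TCO_Calculator.xlsx.
wb = Workbook()
wb.active.title = "data"
wb.save("demo.xlsx")
append_benchmark_rows("demo.xlsx", [["meta/llama3-8b-instruct", 8, 1234.5]])
```

ws.append adds rows after the last populated row, so repeated benchmark runs would accumulate in the sheet rather than overwrite it.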

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR, between commits 80a0901 and eb77e19.

⛔ Files ignored due to path filters (1)
  • notebooks/LLM_TCO_Calculator.xlsx is excluded by !**/*.xlsx
📒 Files selected for processing (2)
  • notebooks/README.md (1 hunks)
  • notebooks/TCO_calculator.ipynb (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Pre-commit
notebooks/README.md

[error] 1-1: Trailing whitespace check failed. Files were modified by this hook.


[error] 1-1: Add-license hook modified the file to include license header.

notebooks/TCO_calculator.ipynb

[warning] 1-1: No handler registered for file: notebooks/TCO_calculator.ipynb. Please add a new handler to /home/runner/work/aiperf/aiperf/tools/add_copyright.py!

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: integration-tests (ubuntu-latest, 3.12)
  • GitHub Check: integration-tests (ubuntu-latest, 3.11)
  • GitHub Check: integration-tests (macos-latest, 3.11)
  • GitHub Check: integration-tests (macos-latest, 3.10)
  • GitHub Check: integration-tests (macos-latest, 3.12)
  • GitHub Check: integration-tests (ubuntu-latest, 3.10)
  • GitHub Check: build (macos-latest, 3.10)
  • GitHub Check: build (macos-latest, 3.11)
  • GitHub Check: build (ubuntu-latest, 3.10)
  • GitHub Check: build (ubuntu-latest, 3.11)
🔇 Additional comments (2)
notebooks/TCO_calculator.ipynb (2)

744-765: LGTM!

The Excel export logic is clean and well-structured. Column ordering is logical, with metadata fields first followed by performance metrics.


324-373: No fixes required; the model identifier formats are correct for their respective purposes.

The apparent inconsistency between model identifiers is not problematic. AIPerf's -m parameter (line 349: meta/llama3-8b-instruct) correctly identifies the model deployed at the NIM endpoint, matching the Docker image specification (line 300: nvcr.io/nim/meta/llama3-8b-instruct:latest). The --tokenizer parameter (line 363: meta-llama/Meta-Llama-3-8B-Instruct) correctly uses HuggingFace's model identifier format for token counting—these serve different purposes and do not require matching.

Per AIPerf documentation, the -m parameter must match the deployed endpoint model, which it does. The REQUEST_COUNT = CONCURRENCY * 3 multiplier aligns with NVIDIA's documented benchmarking best practice for obtaining stable measurements.
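For illustration, the parameters discussed above fit together roughly like this. The subcommand and flag spellings other than -m and --tokenizer are assumptions about the AIPerf CLI, not verified against it.

```python
# Sketch only: assembling an AIPerf invocation from the values reviewed above.
# Subcommand and flag names other than -m / --tokenizer are assumed.
CONCURRENCY = 8
REQUEST_COUNT = CONCURRENCY * 3  # documented practice for stable measurements

cmd = [
    "aiperf", "profile",
    "-m", "meta/llama3-8b-instruct",                       # model at the NIM endpoint
    "--tokenizer", "meta-llama/Meta-Llama-3-8B-Instruct",  # HF id for token counting
    "--concurrency", str(CONCURRENCY),
    "--request-count", str(REQUEST_COUNT),
]
print(" ".join(cmd))
```

The point of the sketch is the split of responsibilities: -m must match the deployed endpoint model, while --tokenizer only needs to resolve to an equivalent tokenizer for counting.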

@debermudez (Contributor) left a comment

Just a question on the Triton container version.

@debermudez (Contributor) left a comment

Let's go ahead with this now.
I will add an item to our backlog to test this on updated containers in the near future.

@vinhngx changed the title from "Adding a LLM TCO calculator tool" to "feat: add a LLM TCO calculator tool" Nov 13, 2025
@github-actions bot added the feat label Nov 13, 2025
@vinhngx (Author) commented Nov 13, 2025

Thanks @matthewkotila and @debermudez. I resolved some minor issues raised in this thread. Could this be merged now?
I noticed a failing check; I'm not sure whether it affects the merge.

@debermudez (Contributor) commented

The macOS test failure should be fine.
@saturley-hall, do you know how to resolve the sign-off issue for @vinhngx? He is an NVIDIA employee.
