feat: add a LLM TCO calculator tool #445
base: main
Conversation
Try out this PR

Quick install:

```bash
pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@main
```

Recommended with a virtual environment (using uv):

```bash
uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@main
```
Codecov Report

✅ All modified and coverable lines are covered by tests.
Walkthrough

The PR introduces two new files: a README documenting the AIPerf utility notebooks, and a comprehensive Jupyter notebook that integrates NVIDIA AIPerf benchmarking with NIM LLM server performance-metrics collection, exporting results to Excel for TCO calculator integration.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
Pre-merge checks

✅ Passed checks (2 passed)
Actionable comments posted: 4
🧹 Nitpick comments (1)
notebooks/TCO_calculator.ipynb (1)
772-775: Consider automating the Excel import process.

The current workflow requires manually copying data from `data.xlsx` to the TCO calculator spreadsheet. This manual step is error-prone and could be automated using a Python library like `openpyxl`. If you'd like to automate this, I can help generate a script that:

- Opens the `LLM_TCO_Calculator.xlsx` workbook
- Locates the "data" sheet
- Appends the benchmark data programmatically
- Saves the updated workbook

Would you like me to create this automation?
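A minimal sketch of what such a script could look like, assuming the exported benchmark rows live in the active sheet of `data.xlsx` and the calculator workbook has a sheet named "data" (both assumptions, not verified against the notebook):

```python
from openpyxl import load_workbook

# Source: benchmark rows exported by the notebook (active sheet assumed).
src_ws = load_workbook("data.xlsx").active

# Destination: the TCO calculator's "data" sheet (sheet name assumed).
dst_wb = load_workbook("LLM_TCO_Calculator.xlsx")
dst_ws = dst_wb["data"]

# Skip the source header row and append each benchmark row in order.
for row in src_ws.iter_rows(min_row=2, values_only=True):
    dst_ws.append(row)

dst_wb.save("LLM_TCO_Calculator.xlsx")
```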
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`notebooks/LLM_TCO_Calculator.xlsx` is excluded by `!**/*.xlsx`
📒 Files selected for processing (2)
- `notebooks/README.md` (1 hunks)
- `notebooks/TCO_calculator.ipynb` (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Pre-commit
notebooks/README.md
[error] 1-1: Trailing whitespace check failed. Files were modified by this hook.
[error] 1-1: Add-license hook modified the file to include license header.
notebooks/TCO_calculator.ipynb
[warning] 1-1: No handler registered for file: notebooks/TCO_calculator.ipynb. Please add a new handler to /home/runner/work/aiperf/aiperf/tools/add_copyright.py!
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: integration-tests (ubuntu-latest, 3.12)
- GitHub Check: integration-tests (ubuntu-latest, 3.11)
- GitHub Check: integration-tests (macos-latest, 3.11)
- GitHub Check: integration-tests (macos-latest, 3.10)
- GitHub Check: integration-tests (macos-latest, 3.12)
- GitHub Check: integration-tests (ubuntu-latest, 3.10)
- GitHub Check: build (macos-latest, 3.10)
- GitHub Check: build (macos-latest, 3.11)
- GitHub Check: build (ubuntu-latest, 3.10)
- GitHub Check: build (ubuntu-latest, 3.11)
🔇 Additional comments (2)
notebooks/TCO_calculator.ipynb (2)
744-765: LGTM!

The Excel export logic is clean and well-structured. Column ordering is logical, with metadata fields first followed by performance metrics.
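For reference, a sketch of the ordering pattern being praised; the column names and values here are illustrative, not the notebook's actual fields:

```python
import pandas as pd

# Illustrative only: metadata columns first, performance metrics after.
records = [{
    "model": "meta/llama3-8b-instruct",  # metadata
    "concurrency": 10,                   # metadata (illustrative value)
    "ttft_ms": 42.0,                     # performance metric (hypothetical)
    "tokens_per_sec": 1234.5,            # performance metric (hypothetical)
}]
pd.DataFrame(records).to_excel("data.xlsx", index=False)
```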
324-373: No fixes required; the model identifier formats are correct for their respective purposes.

The apparent inconsistency between model identifiers is not problematic. AIPerf's `-m` parameter (line 349: `meta/llama3-8b-instruct`) correctly identifies the model deployed at the NIM endpoint, matching the Docker image specification (line 300: `nvcr.io/nim/meta/llama3-8b-instruct:latest`). The `--tokenizer` parameter (line 363: `meta-llama/Meta-Llama-3-8B-Instruct`) correctly uses HuggingFace's model identifier format for token counting; the two serve different purposes and do not need to match.

Per the AIPerf documentation, the `-m` parameter must match the deployed endpoint model, which it does. The `REQUEST_COUNT = CONCURRENCY * 3` multiplier aligns with NVIDIA's documented benchmarking best practice for obtaining stable measurements.
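Restated as a small sketch (identifier strings copied from the cited cells; the concurrency value is illustrative, not the notebook's):

```python
# The two identifiers serve different purposes and need not match.
ENDPOINT_MODEL = "meta/llama3-8b-instruct"            # aiperf -m: model served at the NIM endpoint
HF_TOKENIZER = "meta-llama/Meta-Llama-3-8B-Instruct"  # aiperf --tokenizer: HuggingFace ID for token counting

CONCURRENCY = 10                  # illustrative value
REQUEST_COUNT = CONCURRENCY * 3   # NVIDIA's recommended multiplier for stable measurements
```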
debermudez left a comment:
Just a question on the Triton container version.
debermudez left a comment:
Let's go ahead with this now.
I will add an item to our backlog to test this in the near future on updated containers.
Signed-off-by: Vinh Nguyen <[email protected]>
Signed-off-by: Vinh Nguyen <[email protected]>
Thanks @matthewkotila and @debermudez. Resolved some minor issues in the PR in this thread. Could this be merged now?
The Mac test should be fine.
Implementing an LLM TCO Calculator per the blog: https://developer.nvidia.com/blog/llm-inference-benchmarking-how-much-does-your-llm-inference-cost/
This notebook allows you to:

- Benchmark an LLM served by NVIDIA NIM using AIPerf
- Collect server performance metrics
- Export the results to Excel for use with the LLM TCO calculator
Summary by CodeRabbit

Release Notes

Documentation

- Added a README documenting the AIPerf utility notebooks.

New Features

- Added a Jupyter notebook that benchmarks NIM LLM servers with AIPerf and exports the results to Excel for TCO calculator integration.