feat: cost analysis tool with pricing configs, one-time report generation tools, and profiling config example (#172)
Greptile Summary

This PR adds a new cost analysis and profiling tool. Three P2 findings remain:

Confidence Score: 5/5. Safe to merge; all remaining findings are P2 quality/hardening suggestions with no blocking defects on the primary path. No P0 or P1 issues. The three P2 findings are either best-practice hardening (HTML escaping in an offline report), user-experience nits (silent truncation of extra trace paths), or only triggered by unusual pricing configs (ambiguous substring model names). Core parsing, pricing arithmetic, and report generation are correct.

Files flagged: pricing.py (substring model matching), _report_template_single.py and _report_template_comparison.py (innerHTML escaping).

Important Files Changed
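The "ambiguous substring model names" finding can be illustrated with a minimal sketch. The function and pricing numbers below are hypothetical, not the PR's actual pricing.py code; they only show why first-substring-match lookup picks the wrong tier when one pricing key is a prefix of another:

```python
# Made-up $/1M-token rates; only for illustrating substring ambiguity.
PRICING = {
    "gpt-4": {"input": 30.0, "output": 60.0},
    "gpt-4o": {"input": 2.5, "output": 10.0},
}

def match_price(model_name: str) -> dict:
    """Hypothetical substring matching: first pricing key contained in the name wins."""
    for key, price in PRICING.items():
        if key in model_name:
            return price
    raise KeyError(f"no pricing entry matches {model_name!r}")

# "gpt-4" is a substring of "gpt-4o-mini", so dict order decides the result:
price = match_price("gpt-4o-mini")  # matches the "gpt-4" entry first
```

Because `"gpt-4"` matches before `"gpt-4o"` is ever considered, the cheaper model is billed at the expensive tier; that is why this only bites with unusual (overlapping-prefix) pricing configs.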
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as python -m report
    participant GR as generate_report()
    participant YAML as config.yml
    participant PR as PricingRegistry
    participant NA as nat_adapter.parse_trace()
    participant BD as _build_report_data()
    participant RT as render_html()
    participant HTML as tokenomics_report.html
    CLI->>GR: trace path(s) + config path
    GR->>YAML: yaml.safe_load(config)
    YAML-->>GR: pricing dict
    GR->>PR: PricingRegistry.from_dict(pricing_raw)
    PR-->>GR: pricing registry
    loop for each trace path
        GR->>NA: parse_trace(path, pricing)
        Note over NA: builds _TaskWindow list, infers phase per LLM_END
        NA-->>GR: list[RequestProfile]
        GR->>BD: _build_report_data(profiles, pricing)
        BD-->>GR: report_data dict
    end
    alt single run
        GR->>RT: render_html → _report_template_single
    else comparison mode
        GR->>GR: _build_comparison_data(run_datas)
        GR->>RT: render_html → _report_template_comparison
    end
    RT-->>GR: HTML string
    GR->>HTML: write to output_path
```
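The diagram's flow can be sketched in plain Python. The names (`PricingRegistry`, `parse_trace`, `_build_report_data`, `_build_comparison_data`, `render_html`) come from the diagram, but every body below is a stub assumption for illustration, not the PR's implementation; the real flow also loads the pricing dict with `yaml.safe_load`, which is replaced here by a plain dict so the sketch is self-contained:

```python
from dataclasses import dataclass

@dataclass
class PricingRegistry:
    models: dict  # model name -> rate dict (shape assumed)

    @classmethod
    def from_dict(cls, raw: dict) -> "PricingRegistry":
        return cls(models=dict(raw))

def parse_trace(path, pricing):
    """Stand-in for nat_adapter.parse_trace: NAT events -> RequestProfile list."""
    return []

def _build_report_data(profiles, pricing):
    return {"profiles": profiles, "total_cost": 0.0}

def _build_comparison_data(run_datas):
    return {"runs": run_datas}

def render_html(data):
    return f"<html><body>{data}</body></html>"

def generate_report(trace_paths, pricing_raw):
    # Real code would do: pricing_raw = yaml.safe_load(open(config_path))
    pricing = PricingRegistry.from_dict(pricing_raw)
    run_datas = [_build_report_data(parse_trace(p, pricing), pricing)
                 for p in trace_paths]
    # One trace -> single-run template; several -> comparison template.
    data = run_datas[0] if len(run_datas) == 1 else _build_comparison_data(run_datas)
    return render_html(data)

html = generate_report(["trace.json"], {"gpt-4o": {"input": 2.5, "output": 10.0}})
```

The single-run vs. comparison branch mirrors the `alt` block in the diagram.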
Reviews (3). Last reviewed commit: "Merge branch 'NVIDIA-AI-Blueprints:devel..."
This cost/profiling tool takes an NAT eval trace and generates a single HTML report. It breaks down cost ($), tokens, and latency by model, workflow phase (orchestrator vs planner vs researcher), tools, and per-query behavior.
The report's final charts cover cost, latency, cache use, and token patterns.
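The per-phase rollup the report describes can be sketched as a simple aggregation. The `RequestProfile` field names below are assumptions; only the breakdown axes (model, phase, tokens, cost) come from the PR description:

```python
from collections import defaultdict

# Hypothetical per-request profiles; field names and values are made up.
profiles = [
    {"model": "gpt-4o", "phase": "planner",    "input_tokens": 1200, "output_tokens": 300, "cost": 0.006},
    {"model": "gpt-4o", "phase": "researcher", "input_tokens": 5000, "output_tokens": 900, "cost": 0.021},
    {"model": "gpt-4o", "phase": "planner",    "input_tokens": 800,  "output_tokens": 200, "cost": 0.004},
]

# Roll cost and total tokens up by workflow phase.
by_phase = defaultdict(lambda: {"cost": 0.0, "tokens": 0})
for p in profiles:
    agg = by_phase[p["phase"]]
    agg["cost"] += p["cost"]
    agg["tokens"] += p["input_tokens"] + p["output_tokens"]
```

The same loop keyed on `p["model"]` or a tool name yields the per-model and per-tool breakdowns.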
The nat_adapter.py module is needed because NAT emits a raw event trace; it does not tell you how much the planner vs. the researchers cost. So this tool infers each call's phase from task tool spans and subagent_type, applies pricing, and turns the result into actionable rollups.
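The phase-inference idea can be sketched as follows. The event shape, field names, and `infer_phase` helper are assumptions; only the approach (assign each LLM event to the enclosing task span's subagent_type, defaulting to the orchestrator) comes from the description above:

```python
def infer_phase(event, task_windows):
    """Assign an LLM_END event to a phase via the enclosing task span, if any."""
    for window in reversed(task_windows):  # innermost (most recent) span wins
        if window["start"] <= event["ts"] <= window["end"]:
            return window.get("subagent_type", "orchestrator")
    return "orchestrator"  # no enclosing task span -> top-level agent

windows = [{"start": 10, "end": 50, "subagent_type": "researcher"}]
inside = infer_phase({"ts": 30}, windows)   # falls inside the researcher span
outside = infer_phase({"ts": 70}, windows)  # outside every span -> orchestrator
```

With phases attached, the pricing step can charge each call's tokens to planner, researcher, or orchestrator buckets.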