feat: cost analysis tool with pricing configs, one-time report generation tools, and profiling config example #172

Open
cdgamarose-nv wants to merge 7 commits into NVIDIA-AI-Blueprints:develop from cdgamarose-nv:cdgamarose/tokenomics
cdgamarose-nv (Collaborator) commented Apr 6, 2026

This cost/profiling tool takes a NAT eval trace and generates a single HTML report. The report breaks down cost ($), tokens, and latency by model, workflow phase (orchestrator, planner, researcher), tool, and query.

It includes charts for cost, latency, cache use, and token patterns.

nat_adapter.py is needed because NAT emits a raw event trace that does not attribute cost to phases, so it cannot answer "how much did the planner cost versus the researchers?". The adapter infers the phase from task tool spans and subagent_type, applies pricing, and aggregates the result into actionable rollups.
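
As a rough illustration of the phase inference described above, the sketch below attributes an LLM call to a phase by checking whether its timestamp falls inside a task tool span. The `TaskWindow` and `infer_phase` names, the field layout, the "narrowest window wins" rule, and the orchestrator default are assumptions made for this example, not the code in nat_adapter.py.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskWindow:
    """Time span of one `task` tool call (hypothetical structure)."""
    start: float
    end: float
    subagent_type: str  # e.g. "planner" or "researcher"


def infer_phase(event_time: float, windows: List[TaskWindow],
                default: str = "orchestrator") -> str:
    """Return the phase for an LLM call that finished at `event_time`.

    A call inside a task window is attributed to that window's
    subagent_type; a call outside every window goes to `default`.
    If several windows contain the call, the narrowest one wins
    (an assumption of this sketch).
    """
    matches = [w for w in windows if w.start <= event_time <= w.end]
    if not matches:
        return default
    return min(matches, key=lambda w: w.end - w.start).subagent_type


# Made-up timings, purely for illustration.
windows = [
    TaskWindow(start=10.0, end=20.0, subagent_type="planner"),
    TaskWindow(start=25.0, end=60.0, subagent_type="researcher"),
]
phases = [infer_phase(t, windows) for t in (5.0, 12.5, 40.0, 61.0)]
```

Calls that fall outside every task span are treated here as orchestrator work, which is one plausible reading of "infers phase from task tool spans".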

@cdgamarose-nv cdgamarose-nv marked this pull request as ready for review April 8, 2026 05:49
greptile-apps bot (Contributor) commented Apr 8, 2026

Greptile Summary

This PR adds a new aiq_agent.tokenomics package that parses NAT profiler trace JSON files, applies configurable per-model/per-tool pricing, and renders self-contained HTML reports with cost, latency, token, and efficiency breakdowns. The architecture is clean and well-structured across its modules, with a solid test suite covering core parsing and aggregation logic.
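
To make "applies configurable per-model pricing" concrete, here is a minimal sketch of the arithmetic for one LLM call. The `ModelPrice` fields, the per-million-token unit, the separate cached-input rate, and the rates themselves are all assumptions for illustration; the actual schema in pricing.py may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """USD per one million tokens (hypothetical schema, placeholder rates)."""
    input_per_1m: float
    cached_input_per_1m: float
    output_per_1m: float


def llm_call_cost(price: ModelPrice, prompt_tokens: int,
                  cached_tokens: int, completion_tokens: int) -> float:
    """Cost in USD of one call; cached prompt tokens use the cached rate."""
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * price.input_per_1m
        + cached_tokens * price.cached_input_per_1m
        + completion_tokens * price.output_per_1m
    )
    return total / 1_000_000


# Placeholder rates, not real vendor prices.
price = ModelPrice(input_per_1m=2.0, cached_input_per_1m=0.5, output_per_1m=8.0)
cost = llm_call_cost(price, prompt_tokens=10_000, cached_tokens=4_000,
                     completion_tokens=1_000)
```

Summing this per-call value by model, phase, or query would give the kind of rollups the report presents.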

Three P2 findings remain:

  • Question text from eval traces is inserted into the detail table via innerHTML without HTML escaping (_report_template_single.py L748, _report_template_comparison.py L510).
  • When more than two --trace paths are supplied, only the first two are compared and the rest are silently dropped.
  • The PricingRegistry substring fallback is insertion-order-dependent; overlapping model keys (e.g., "gpt-4" and "gpt-4o") can silently resolve to the wrong price.
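
The third finding can be reproduced in a few lines. This sketch assumes the fallback checks whether a configured key is contained in the reported model name; `first_match` and `longest_match` are hypothetical helpers, not PricingRegistry methods, and preferring the longest key is only one possible fix.

```python
from typing import List, Optional


def first_match(model: str, keys: List[str]) -> Optional[str]:
    """Insertion-order fallback: return the first key contained in `model`."""
    for key in keys:
        if key in model:
            return key
    return None


def longest_match(model: str, keys: List[str]) -> Optional[str]:
    """Order-independent fallback: prefer the most specific (longest) key."""
    candidates = [key for key in keys if key in model]
    return max(candidates, key=len) if candidates else None


keys = ["gpt-4", "gpt-4o"]          # configured in this order
model = "gpt-4o-2024-08-06"         # name as it might appear in a trace

by_order = first_match(model, keys)     # picks "gpt-4"
by_length = longest_match(model, keys)  # picks "gpt-4o"
```

With the longest-key rule, the result no longer depends on the order in which keys appear in the config.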

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 quality/hardening suggestions with no blocking defects on the primary path.

No P0 or P1 issues. The three P2 findings are either best-practice hardening (HTML escaping in an offline report), user-experience nits (silent truncation of extra trace paths), or only triggered by unusual pricing configs (ambiguous substring model names). Core parsing, pricing arithmetic, and report generation are correct.
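
For the HTML-escaping item, one possible hardening is to escape question text on the Python side before it is embedded in markup. The sketch below uses the standard-library `html.escape`; the `escape_question` helper and the table-cell layout are hypothetical. It assumes the template inserts the value as markup; if the template were changed to use `textContent` instead, escaping here as well would double-encode the text.

```python
import html


def escape_question(text: str) -> str:
    """Escape &, <, >, and quotes so the text cannot be parsed as markup."""
    return html.escape(text, quote=True)


# A question string that would run script if inserted unescaped.
question = '<img src=x onerror="alert(1)">'
cell = "<td>{}</td>".format(escape_question(question))
```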

pricing.py (substring model matching), _report_template_single.py and _report_template_comparison.py (innerHTML escaping)

Important Files Changed

| Filename | Overview |
| --- | --- |
| `src/aiq_agent/tokenomics/pricing.py` | Pricing registry with exact → substring → default lookup; substring fallback is insertion-order-dependent and can silently match the wrong model when multiple configured keys overlap. |
| `src/aiq_agent/tokenomics/nat_adapter.py` | New module converting NAT profiler traces to RequestProfile objects via timing-window-based phase inference; logic is sound, ast.literal_eval fallback well-guarded. |
| `src/aiq_agent/tokenomics/report/_report_builders.py` | Aggregation logic is well-structured; comparison builder silently ignores runs beyond the second without warning. |
| `src/aiq_agent/tokenomics/report/_report_template_single.py` | Comprehensive single-run HTML/JS template; question text is inserted into tbody.innerHTML without HTML escaping. |
| `src/aiq_agent/tokenomics/report/_report_template_comparison.py` | Comparison template mirrors the single-run structure; same unescaped question-text innerHTML issue present in renderDetail. |
| `src/aiq_agent/tokenomics/profile.py` | Clean data classes for PhaseStats and RequestProfile; consistent within the codebase. |
| `src/aiq_agent/tokenomics/report/_report_base.py` | Shared CSS/JS constants and HTML assembly helper; loads Plotly from a public CDN. |
| `src/aiq_agent/tokenomics/report/_report_stats.py` | Pure stat helpers (percentiles, latency stats, CSV prediction loader); clean and correct. |
| `src/aiq_agent/tokenomics/report/__init__.py` | Public API: generates a single- or comparison-mode HTML report; output path fallback logic is clear and correct. |
| `src/aiq_agent/tokenomics/report/__main__.py` | Minimal CLI entry point with argparse; correctly delegates to generate_report. |
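
For readers unfamiliar with the pattern, a guarded `ast.literal_eval` fallback might look roughly like the sketch below. It assumes the fallback exists to parse payloads serialized as Python literals rather than JSON; the `parse_payload` name and the choice to return `None` on failure are illustrative and not taken from nat_adapter.py.

```python
import ast
import json
from typing import Any


def parse_payload(raw: str) -> Any:
    """Parse a trace payload as JSON, then as a Python literal, else None."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    try:
        # literal_eval accepts only literals (dicts, lists, strings, numbers,
        # ...); function calls and names are rejected rather than executed.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        return None


from_json = parse_payload('{"subagent_type": "planner"}')
from_repr = parse_payload("{'subagent_type': 'planner'}")
rejected = parse_payload("__import__('os').getcwd()")
```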

Sequence Diagram

sequenceDiagram
    participant CLI as python -m report
    participant GR as generate_report()
    participant YAML as config.yml
    participant PR as PricingRegistry
    participant NA as nat_adapter.parse_trace()
    participant BD as _build_report_data()
    participant RT as render_html()
    participant HTML as tokenomics_report.html

    CLI->>GR: trace path(s) + config path
    GR->>YAML: yaml.safe_load(config)
    YAML-->>GR: pricing dict
    GR->>PR: PricingRegistry.from_dict(pricing_raw)
    PR-->>GR: pricing registry

    loop for each trace path
        GR->>NA: parse_trace(path, pricing)
        Note over NA: builds _TaskWindow list, infers phase per LLM_END
        NA-->>GR: list[RequestProfile]
        GR->>BD: _build_report_data(profiles, pricing)
        BD-->>GR: report_data dict
    end

    alt single run
        GR->>RT: render_html → _report_template_single
    else comparison mode
        GR->>GR: _build_comparison_data(run_datas)
        GR->>RT: render_html → _report_template_comparison
    end

    RT-->>GR: HTML string
    GR->>HTML: write to output_path

Reviews (3): Last reviewed commit: "Merge branch 'NVIDIA-AI-Blueprints:devel..."

@cdgamarose-nv cdgamarose-nv marked this pull request as draft April 8, 2026 15:55
@cdgamarose-nv cdgamarose-nv marked this pull request as ready for review April 9, 2026 16:41
@cdgamarose-nv cdgamarose-nv requested a review from AjayThorve April 9, 2026 16:41