Commit 29ae9f6 (parent 37e2ec5)

feat: add aiperf viz DEP

1 file changed: deps/0012-aiperf-viz.md (+239, −0)

# Dynamo AIPerf Analyze and Visualize For Profiling Sweeps

**Status**: Draft

**Authors**: [ilana-n]

**Category**: Feature

**Replaces**: N/A

**Replaced By**: N/A

**Sponsor**: [TBD]

**Required Reviewers**: [TBD]

**Review Date**: [TBD]

**Pull Request**: [TBD]

**Implementation PR / Tracking Issue**: [TBD]

# Summary

Introduce two new commands to AIPerf: `aiperf analyze` runs parameter sweeps and stores results, and `aiperf plot` generates visualizations from stored results. This addresses the common workflow of profiling across multiple configurations (e.g., different concurrency levels) to compare performance and generate Pareto curves.

# Motivation

Users need to profile models across multiple parameter configurations to understand performance trade-offs.

The typical workflow is:
1. Run profiling sweeps using a custom script with different parameters (concurrency, sequence length, etc.)
2. Visualize results to identify optimal configurations
3. Generate Pareto curves showing throughput vs latency trade-offs

AIPerf can improve the user experience by providing built-in methods to orchestrate these sweeps and visualize the results.

## Goals

* Run parameter sweeps with a single command that orchestrates multiple profiling runs
* Store sweep results in organized directories
* Provide interactive visualizations where users can adjust settings in the browser
* Allow users to configure which plots are generated by default
* Support static artifact generation for reports and CI/CD

## Non Goals

* Real-time visualization during profiling
* Integration with external databases

# Proposal

## Command Design

### Analyze Command

Runs a parameter sweep and stores results:
```bash
# Basic sweep
aiperf analyze \
  --model Qwen3-0.6B \
  --concurrency 1,4,8,16 \
  --output-dir ./my_sweep

# Multiple parameters
aiperf analyze \
  --model Qwen3-0.6B \
  --concurrency 1,4,8 \
  --output-seq-len 128,256,512 \
  --output-dir ./my_sweep

# With config file
aiperf analyze --config sweep.yaml --output-dir ./my_sweep
```

**Behavior:**
- Runs each profiling configuration sequentially (an open question: runs could execute in parallel when multiple GPUs are available)
- Stores each run in a unique subdirectory
- Shows progress: `[2/4] Running: concurrency=4... (48s)`
- Outputs a message at completion: `Run 'aiperf plot ./my_sweep --host' to visualize`
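
As a rough illustration of this behavior, the orchestration loop could look like the sketch below; the `run_profiling` helper and the run-directory naming scheme are hypothetical, not committed interfaces.

```python
# Hypothetical sketch of the `aiperf analyze` orchestration loop;
# run_profiling() and the directory naming scheme are illustrative only.
import time
from pathlib import Path

def run_profiling(**kwargs) -> None:
    """Placeholder for a single AIPerf profiling run (hypothetical)."""

def run_sweep(model: str, configs: list[dict], output_dir: Path) -> None:
    for i, config in enumerate(configs, start=1):
        # Each configuration gets its own subdirectory,
        # e.g. my_sweep/Qwen3-0.6B-concurrency8/
        run_name = model + "".join(f"-{k}{v}" for k, v in config.items())
        run_dir = output_dir / run_name
        run_dir.mkdir(parents=True, exist_ok=True)

        params = ", ".join(f"{k}={v}" for k, v in config.items())
        start = time.monotonic()
        run_profiling(model=model, artifact_dir=run_dir, **config)
        print(f"[{i}/{len(configs)}] Running: {params}... ({time.monotonic() - start:.0f}s)")

    print(f"Run 'aiperf plot {output_dir} --host' to visualize")
```
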
### Plot Command

Generates visualizations from stored results. Supports both sweep-level comparison and single-run analysis:
```bash
# Sweep-level: Compare multiple runs interactively
aiperf plot ./my_sweep --host

# Single-run: Analyze one profiling run
aiperf plot ./my_sweep/Qwen3-0.6B-concurrency8 --host

# Or use --run flag to specify subdirectory
aiperf plot ./my_sweep --run Qwen3-0.6B-concurrency8 --host

# Generate static images for sweep
aiperf plot ./my_sweep --format png

# Generate static images for single run
aiperf plot ./my_sweep/Qwen3-0.6B-concurrency8 --format png

# Generate HTML report
aiperf plot ./my_sweep --format html
```

**Sweep-Level Mode (multiple runs):**

When pointing to a sweep directory containing multiple runs:
- Launches web server with sweep comparison dashboard
- Users can adjust settings in browser:
  - Select which runs to compare
  - Change x-axis/y-axis for comparison plots
  - Toggle between plot types
  - Apply filters
- Default plots focus on comparison:
  - Pareto curve (throughput vs latency across runs)
  - Throughput vs swept parameter
  - Latency distribution comparison
  - Resource utilization comparison

**Single-Run Mode (one run):**

When pointing to a specific run subdirectory:
- Launches web server with single-run deep-dive dashboard
- Users can explore per-request metrics over time:
  - Time series: TTFT (Time to First Token) per request
  - Time series: ITL (Inter-token Latency) per request
  - Time series: E2E latency per request
  - Request throughput over time
  - Resource utilization over time (GPU memory, utilization)
- Interactive controls:
  - Zoom into time ranges
  - Filter by request characteristics (input length, output length)
  - Toggle between metrics
  - Overlay multiple metrics

**Static Mode (`--format`):**
- Generates plot files in `{sweep_dir}/visualizations/` or `{run_dir}/visualizations/`
- PNG: Individual plot images
- HTML: Self-contained interactive report
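
Since `aiperf plot` accepts either a sweep directory or a single run directory, it has to infer which dashboard to launch. A minimal sketch of that detection, assuming each run stores a results file (the `profile_export.json` name is hypothetical, not a committed format):

```python
# Hypothetical mode detection for `aiperf plot`; the results file name
# (profile_export.json) is illustrative only.
from pathlib import Path

def detect_mode(path: Path) -> str:
    # A run directory holds its own results file; a sweep directory
    # holds one subdirectory per run.
    if (path / "profile_export.json").exists():
        return "single-run"
    if any((d / "profile_export.json").exists() for d in path.iterdir() if d.is_dir()):
        return "sweep"
    raise ValueError(f"No AIPerf results found under {path}")
```
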
## Configuration

Users can configure default plots for both interactive mode and static mode in `~/.aiperf/config.yaml`:
```yaml
visualization:
  default_plots:
    - pareto_curve
    - throughput_vs_concurrency
    - latency_distribution
    - resource_utilization

  custom_plots:
    - name: "Memory Efficiency"
      type: "line"
      x_axis: "concurrency"
      y_axis: "gpu_memory_used"
```

If no config exists, AIPerf provides a set of predetermined defaults:
- Pareto curve (throughput vs latency)
- Throughput vs swept parameter
- Latency distribution
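
A sketch of how the plot command might load this file and fall back to the built-in defaults; the loader and the built-in plot names are illustrative (requires PyYAML):

```python
# Illustrative loader for the visualization config; falls back to
# built-in defaults when ~/.aiperf/config.yaml is absent.
from pathlib import Path

import yaml

BUILTIN_DEFAULTS = ["pareto_curve", "throughput_vs_swept_parameter", "latency_distribution"]

def load_plot_config(path: Path = Path.home() / ".aiperf" / "config.yaml") -> dict:
    if not path.exists():
        return {"default_plots": BUILTIN_DEFAULTS, "custom_plots": []}
    viz = (yaml.safe_load(path.read_text()) or {}).get("visualization", {})
    return {
        "default_plots": viz.get("default_plots", BUILTIN_DEFAULTS),
        "custom_plots": viz.get("custom_plots", []),
    }
```
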

## Sweep Configuration File
```yaml
# sweep.yaml
model: Qwen3-0.6B

sweep:
  concurrency: [1, 4, 8, 16, 32]
  output_seq_len: [128, 256, 512]

fixed:
  input_seq_len: 512
```
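
The swept parameters expand to the cartesian product of their values, each merged with the `fixed` parameters, so the example above yields 5 × 3 = 15 runs. A sketch of that expansion (`expand_sweep` is an illustrative name):

```python
# Illustrative expansion of sweep.yaml into per-run configurations:
# the cartesian product of swept values, merged with fixed parameters.
import itertools

def expand_sweep(sweep: dict[str, list], fixed: dict) -> list[dict]:
    keys = list(sweep)
    return [
        {**fixed, **dict(zip(keys, values))}
        for values in itertools.product(*sweep.values())
    ]

configs = expand_sweep(
    {"concurrency": [1, 4, 8, 16, 32], "output_seq_len": [128, 256, 512]},
    {"input_seq_len": 512},
)
assert len(configs) == 15
# configs[0] == {'input_seq_len': 512, 'concurrency': 1, 'output_seq_len': 128}
```
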

## Visualization Technology: Plotly Dash

The plot command is planned to use **Plotly Dash**, a Python framework for building interactive web applications.

**Key Benefits:**
- **Pure Python** - No JavaScript required, consistent with AIPerf's codebase
- **Interactive by design** - Users adjust settings (axes, filters, run selection) in the browser without re-running commands
- **Self-contained** - Runs locally with no external services required
- **Customizable** - Full control over AIPerf-specific visualizations and styling

**Proposed Flow:**
```
aiperf plot ./my_sweep --host

Load sweep data → Create Dash app → Launch server at localhost:8080

Browser opens with:
- Sidebar: controls for run selection, axes, plot types
- Main area: interactive Plotly charts (zoom, pan, hover)
- Tabs: switch between visualizations (Pareto, throughput, latency)
```
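
In Dash terms, that flow could map onto a layout roughly like the following sketch; the component IDs, run names, and port are illustrative assumptions:

```python
# Hypothetical Dash layout for the sweep dashboard: sidebar controls
# plus tabbed Plotly charts. IDs, run names, and port are illustrative.
from dash import Dash, dcc, html

app = Dash(__name__)
app.layout = html.Div([
    # Sidebar: run selection and axis controls
    html.Div([
        dcc.Checklist(id='runs', options=['concurrency1', 'concurrency4', 'concurrency8']),
        dcc.Dropdown(id='x-axis', options=['concurrency', 'output_seq_len'], value='concurrency'),
    ], style={'width': '20%', 'display': 'inline-block'}),
    # Main area: tabs switching between interactive Plotly charts
    dcc.Tabs([
        dcc.Tab(label='Pareto', children=[dcc.Graph(id='pareto-plot')]),
        dcc.Tab(label='Throughput', children=[dcc.Graph(id='throughput-plot')]),
        dcc.Tab(label='Latency', children=[dcc.Graph(id='latency-plot')]),
    ]),
])

if __name__ == '__main__':
    app.run(port=8080)  # Dash serves on 8050 by default; 8080 matches the flow above
```
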
**Example Interactivity:**
```python
from dash import Dash, Input, Output

app = Dash(__name__)

# When user changes dropdown, plot updates instantly via callback
@app.callback(Output('plot', 'figure'), Input('x-axis', 'value'))
def update_plot(x_axis):
    return generate_plot(x_axis=x_axis)  # generate_plot: AIPerf helper returning a plotly Figure
```

This approach should provide the customization needed for AIPerf's use cases. Alternative frameworks (Streamlit, React+Flask) could be considered during implementation if requirements change.

# Alternate Solutions

## Alt 1: Single Command (Sweep + Auto-Visualize)

Automatically generate visualizations after the sweep completes.

**Pros:**
- One command for everything
- Immediate results

**Cons:**
- Sweeps can take hours; users may want to visualize later
- Cannot re-visualize with different settings without re-running the sweep
- Not flexible for CI/CD

**Reason Rejected:**
Separate commands provide better control: they avoid re-running expensive profiling when users only want different visualizations, and avoid generating visualizations when users only want raw profiling exports.

## Alt 2: External Tools (TensorBoard/WandB)

Use existing visualization platforms.

**Pros:**
- No maintenance burden for UI/UX integrations
- Feature-rich

**Cons:**
- WandB requires an account login and network access
- TensorBoard is tailored to ML training workflows rather than inference profiling sweeps
- Less control over AIPerf-specific visualizations
