avxcputime: Summarize AVX-512 cputime per task as a histogram
The AVX-512 instruction set on x86 can accelerate floating-point and vector
computation, but it can also cause a drop in core turbo frequency that affects
sibling CPUs and degrades other workloads running on them. Detecting AVX-512
usage is therefore important for workload placement in cloud environments.

This tool summarizes AVX-512 cputime as a histogram, showing how much CPU time
each task spends in AVX-512 mode. It can identify the processes and CPUs
executing AVX-512 so that sensitive jobs are not scheduled alongside them.
The cputime distribution can also help debug AVX-512 performance issues.

This program builds on the following Linux kernel patch, which tracks per-task
AVX-512 usage:

torvalds/linux@2f7726f

It uses BPF to measure the time between a task's AVX-512 timestamps to calculate
the CPU time in AVX-512 mode. The aggregated cputime distribution is printed for
analysis.
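
As a rough illustration of the aggregation step only, the following Python sketch bins a list of per-task AVX-512 durations into power-of-two buckets, in the style of the histograms bcc tools print. It is not the tool's BPF code: the function names (`log2_bucket`, `bucket_range`, `summarize`) and the sample durations are hypothetical, and the durations are assumed to have been collected already.

```python
from collections import Counter


def log2_bucket(value):
    """Map a non-negative integer to a power-of-two bucket index.

    0 -> bucket 0, 1 -> bucket 1, 2..3 -> bucket 2, 4..7 -> bucket 3, ...
    """
    return value.bit_length()


def bucket_range(index):
    """Return the inclusive (low, high) range covered by a bucket index."""
    if index == 0:
        return (0, 0)
    return (1 << (index - 1), (1 << index) - 1)


def summarize(durations_ns, unit_ns=1000):
    """Count durations per bucket after scaling to the display unit.

    unit_ns=1000 gives microsecond buckets; unit_ns=1_000_000 gives
    millisecond buckets (comparable to what a -m style option would show).
    """
    hist = Counter()
    for duration in durations_ns:
        hist[log2_bucket(duration // unit_ns)] += 1
    return hist


if __name__ == "__main__":
    # Hypothetical sample: five AVX-512 intervals, in nanoseconds.
    sample = [500, 1500, 3000, 9000, 9500]
    for index, count in sorted(summarize(sample).items()):
        low, high = bucket_range(index)
        print(f"{low:>6} -> {high:<6} : {count}")
```

Scaling before bucketing keeps the bucket boundaries aligned with the displayed unit, so the same helper serves both microsecond and millisecond output.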

Signed-off-by: Zhiyong Ye <[email protected]>
yezhiyong30 committed Nov 6, 2023
1 parent b8b943a commit 3b9d4ba
Showing 4 changed files with 614 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -85,6 +85,7 @@ pair of .c and .py files, and some are directories of files.


- tools/[argdist](tools/argdist.py): Display function parameter values as a histogram or frequency count. [Examples](tools/argdist_example.txt).
- tools/[avxcputime](tools/avxcputime.py): Summarize AVX-512 cputime per task as a histogram. [Examples](tools/avxcputime_example.txt).
- tools/[bashreadline](tools/bashreadline.py): Print entered bash commands system wide. [Examples](tools/bashreadline_example.txt).
- tools/[bindsnoop](tools/bindsnoop.py): Trace IPv4 and IPv6 bind() system calls (bind()). [Examples](tools/bindsnoop_example.txt).
- tools/[biolatency](tools/biolatency.py): Summarize block device I/O latency as a histogram. [Examples](tools/biolatency_example.txt).
113 changes: 113 additions & 0 deletions man/man8/avxcputime.8
@@ -0,0 +1,113 @@
.TH avxcputime 8 "2023-10-29" "USER COMMANDS"
.SH NAME
avxcputime \- Summarize AVX-512 cputime per task as a histogram.
.SH SYNOPSIS
.B avxcputime [\-h] [\-T] [\-m] [\-C] [\-P] [\-c CPU] [\-p PID] [interval] [count]
.SH DESCRIPTION
The AVX-512 instruction set on x86 can accelerate floating-point and vector
computation, but it can also cause a drop in core turbo frequency that affects
sibling CPUs and degrades other workloads running on them. Detecting AVX-512
usage is therefore important for workload placement in cloud environments.

This tool summarizes AVX-512 cputime as a histogram, showing how much CPU time
each task spends in AVX-512 mode. It can identify the processes and CPUs
executing AVX-512 so that sensitive jobs are not scheduled alongside them.
The cputime distribution can also help debug AVX-512 performance issues.

This tool uses in-kernel eBPF maps for storing timestamps and the histogram,
for efficiency. Despite this, the overhead of this tool may become significant
for some workloads: see the OVERHEAD section.

Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-T
Include timestamps on output.
.TP
\-m
Output histogram in milliseconds.
.TP
\-C
Print a histogram for each CPU separately.
.TP
\-P
Print a histogram for each PID (tgid from the kernel's perspective).
.TP
\-c CPU
Only show this CPU (filtered in kernel for efficiency).
.TP
\-p PID
Only show this PID (filtered in kernel for efficiency).
.TP
interval
Output interval, in seconds.
.TP
count
Number of outputs.
.SH EXAMPLES
.TP
Summarize AVX-512 cputime per task as a histogram:
#
.B avxcputime
.TP
Print 1 second summaries, 10 times:
#
.B avxcputime 1 10
.TP
Print 1 second summaries, using milliseconds as units for the histogram, and include timestamps on output:
#
.B avxcputime \-mT 1
.TP
Show each CPU separately, 1 second summaries:
#
.B avxcputime \-C 1
.TP
Show each PID separately, 1 second summaries:
#
.B avxcputime \-P 1
.TP
Trace CPU 1 only, 1 second summaries:
#
.B avxcputime \-c 1 1
.TP
Trace PID 185 only, 1 second summaries:
#
.B avxcputime \-p 185 1
.SH FIELDS
.TP
usecs
Microsecond range
.TP
msecs
Millisecond range
.TP
count
How many times a task event fell into this range
.TP
distribution
An ASCII bar chart to visualize the distribution (count column)
.SH OVERHEAD
This traces scheduler tracepoints, which can become very frequent. While eBPF
has very low overhead, and this tool uses in-kernel maps for efficiency, the
frequency of scheduler events for some workloads may be high enough that the
overhead of this tool becomes significant. Measure in a lab environment
to quantify the overhead before use.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _example.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Zhiyong Ye <[email protected]>
.SH SEE ALSO
cpudist(8)