avxtime: Summarize AVX-512 cputime per task as a histogram #4795

Open
wants to merge 1 commit into
base: master

Conversation

yezhiyong30
Contributor

The AVX-512 instruction set on x86 can accelerate FPU execution, but it also causes a core turbo frequency drop that impacts sibling CPUs. This can degrade the performance of other workloads running on the sibling CPU. Detecting AVX-512 usage is therefore critical for workload placement in cloud environments.

This tool summarizes AVX-512 cputime as a histogram, showing the amount of CPU time consumed by AVX-512 per-task. This provides valuable insights - it can identify processes and CPUs executing AVX-512 to avoid scheduling sensitive jobs together. The cputime distribution can also help debug AVX-512 performance issues.

This program is based on this Linux kernel patch that tracks per-task AVX-512 usage:

torvalds/linux@2f7726f

It uses BPF to measure the time between a task's AVX-512 timestamps to calculate the CPU time in AVX-512 mode. The aggregated cputime distribution is printed for analysis.
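
For a quick cross-check without BPF, the kernel-side tracking can also be read directly, assuming the running kernel exposes an `AVX512_elapsed_ms` field in `/proc/<pid>/arch_status` (added by a follow-up kernel change, not by this PR). The sketch below only parses that field; the helper name `avx512_elapsed_ms` is hypothetical.

```python
def avx512_elapsed_ms(arch_status_text):
    """Return the AVX512_elapsed_ms value from arch_status text, or None.

    The kernel documentation describes -1 as "no AVX-512 usage recorded"
    for the task. None means the field is absent (assumed: older kernel
    or a non-x86 machine).
    """
    for line in arch_status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "AVX512_elapsed_ms":
            return int(value.strip())
    return None


# Example use on a live system (the path is only valid where the field exists):
#   with open("/proc/self/arch_status") as f:
#       print(avx512_elapsed_ms(f.read()))
```

A small value suggests the task used AVX-512 recently; it does not say how much CPU time was spent in AVX-512, which is what the histogram is meant to add.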

@yezhiyong30
Contributor Author

Could you help review it? I will address any comments.
@brendangregg @goldshtn @devidasjadhav @drzaeus77 @yonghong-song @4ast

@chenhengqi
Collaborator

I wonder how to create a workload to test this tool.

Member

@brendangregg brendangregg left a comment

I'm not familiar with tracing x86_fpu_regs_deactivated, so I'll assume you've dug into that and understand how it all works.

Note that there are AVX-512 PMCs but I don't think they are easily findable. https://github.com/intel/PerfSpect recently added support for AVX-512 metrics, although I don't think it can emit a per-PID breakdown or the context-switch event runtimes. (It's also unlikely to work in most cloud environments due to a lack of the PMU.)

I think this tool should be a good starting point. Future enhancements could add more detail about how the system is affected (power, clockspeed, CPI) during AVX-512 runs, likely requiring PMCs.

critical in cloud environments for placement.

This tool summarizes AVX-512 cputime as a histogram, showing the amount of CPU
time consumed by AVX-512 per-task. This provides valuable insights - it can
Member

I'm a bit confused about this first sentence: If it's measuring AVX-512 time per task, what is the interval? Per second? It looks like it's doing it for each context switch.

Contributor Author

Yes, maybe I didn't describe it clearly enough. It is similar to cpudist. It calculates the AVX-512 time from the beginning of each context switch until the end of x86_fpu_regs_deactivated.
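
The interval described above can be modeled in user space as follows. This is a minimal sketch, not the tool's BPF code: the event tuple layout, the `used_avx512` flag, and the function names are assumptions for illustration, and the power-of-two bucketing only approximates a log2 histogram.

```python
from collections import defaultdict


def log2_bucket(duration_us):
    """Power-of-two bucket index: 0 -> 0, 1 -> 1, 2-3 -> 2, 4-7 -> 3, ..."""
    return max(int(duration_us), 0).bit_length()


def avx_time_histogram(events):
    """Build a per-PID histogram of switch-in to FPU-deactivation intervals.

    events: iterable of (timestamp_us, kind, pid, used_avx512), where kind is
    "switch_in" or "fpu_deactivated" (hypothetical names for the two probes).
    Only intervals in which the task used AVX-512 are counted.
    """
    start = {}
    hist = defaultdict(lambda: defaultdict(int))
    for ts_us, kind, pid, used_avx512 in events:
        if kind == "switch_in":
            start[pid] = ts_us
        elif kind == "fpu_deactivated":
            began = start.pop(pid, None)
            if began is None or not used_avx512:
                continue  # no matching switch-in, or no AVX-512 use
            hist[pid][log2_bucket(ts_us - began)] += 1
    return hist


events = [
    (0, "switch_in", 100, False),
    (5, "fpu_deactivated", 100, True),    # 5 us of AVX-512 time
    (10, "switch_in", 200, False),
    (12, "fpu_deactivated", 200, False),  # ignored: no AVX-512 use
    (20, "switch_in", 100, False),
    (27, "fpu_deactivated", 100, True),   # 7 us of AVX-512 time
]
hist = avx_time_histogram(events)
```

Each on-CPU stint contributes one sample, which is why the output resembles cpudist rather than a per-second rate.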

README.md Outdated
@@ -85,6 +85,7 @@ pair of .c and .py files, and some are directories of files.


- tools/[argdist](tools/argdist.py): Display function parameter values as a histogram or frequency count. [Examples](tools/argdist_example.txt).
- tools/[avxcputime](tools/avxcputime.py): Summarize AVX-512 cputime per task as a histogram. [Examples](tools/avxcputime_example.txt).
Member

I'd consider calling it just avxtime.

Contributor Author

OK, avxtime sounds great.

@yezhiyong30
Contributor Author

> I wonder how to create a workload to test this tool.

Hi hengqi, thanks for the comment. I used this benchmark to simulate an AVX-512 workload for testing. I hope it is helpful to you.

https://github.com/VictorRodriguez/AVX-SG
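
Before running such a benchmark, it may help to confirm that the test machine advertises AVX-512 at all. The sketch below scans `/proc/cpuinfo`-style text for `avx512*` flags; the helper name `avx512_flags` and the sample text are illustrative only.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted avx512* CPU flags from the first "flags" line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


# Made-up sample; on a real machine read the text from /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f avx512dq avx512bw\n"
found = avx512_flags(sample)
```

An empty result means the workload cannot exercise AVX-512 on that host, so the tool would have nothing to report.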

@yezhiyong30
Contributor Author

> I'm not familiar with tracing x86_fpu_regs_deactivated, so I'll assume you've dug into that and understand how it all works.
>
> Note that there are AVX-512 PMCs but I don't think they are easily findable. https://github.com/intel/PerfSpect recently added support for AVX-512 metrics, although I don't think it can emit a per-PID breakdown or the context-switch event runtimes. (It's also unlikely to work in most cloud environments due to a lack of the PMU.)
>
> I think this tool should be a good starting point. Future enhancements could add more detail about how the system is affected (power, clockspeed, CPI) during AVX-512 runs, likely requiring PMCs.

Hi Brendan, thank you for your insightful comments. As you mentioned, there are now some PMCs available to support the detection of AVX-512. The PMCs of the fp_arith_inst_retired series can be used to calculate FLOPS. However, I found that many machines do not support this PMC, especially in cloud environments.
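
As a rough illustration of the FLOPS calculation mentioned above, each `fp_arith_inst_retired` sub-event count can be weighted by the number of elements per vector. This is a sketch under assumptions: the sub-event names follow Intel's naming on recent server parts, and the counts and interval are hypothetical inputs that would come from a tool such as `perf stat`.

```python
# Floating-point operations represented by one count of each sub-event
# (vector width divided by element width; scalar events count one).
FLOPS_PER_COUNT = {
    "scalar_single": 1, "scalar_double": 1,
    "128b_packed_single": 4, "128b_packed_double": 2,
    "256b_packed_single": 8, "256b_packed_double": 4,
    "512b_packed_single": 16, "512b_packed_double": 8,
}


def estimate_flops(counts, seconds):
    """Estimate FLOPS from per-sub-event counts over an interval in seconds."""
    total_ops = sum(FLOPS_PER_COUNT[name] * n for name, n in counts.items())
    return total_ops / seconds


# Hypothetical counts collected over a 2-second window.
flops = estimate_flops({"512b_packed_double": 1000, "scalar_double": 500}, 2.0)
```

Only the `512b_*` sub-events are specific to AVX-512, so their share of the total is the relevant signal here.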

In addition, there are some PMCs that can hint at the execution of AVX-512 instructions, such as:

  • core_power.lvl0_turbo_license
  • core_power.lvl1_turbo_license
  • core_power.lvl2_turbo_license

The more AVX-512 executions there are, the higher the level will be, and the CPU will downclock more severely. However, these PMCs can only be used to estimate the execution of AVX-512 instructions.
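
One way to turn these counters into an estimate is to compare the cycles spent at each license level. The sketch below assumes each event counts core cycles at that level and uses made-up counts; the function name `license_share` is hypothetical.

```python
def license_share(lvl0, lvl1, lvl2):
    """Return the fraction of counted cycles spent at each license level."""
    total = lvl0 + lvl1 + lvl2
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (lvl0 / total, lvl1 / total, lvl2 / total)


# Made-up cycle counts for levels 0, 1, and 2.
share = license_share(2_000_000, 1_000_000, 1_000_000)
```

A growing level 1 or level 2 share hints at heavier vector use, but as noted above it cannot attribute that use to a specific task.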

The AVX-512 instruction set in x86 can accelerate FPU execution but also causes
core turbo frequency drop that impacts sibling CPUs. This can affect performance
of other workloads running on the sibling CPU. Detecting AVX-512 usage is thus
critical in cloud environments for placement.

This tool summarizes AVX-512 cputime as a histogram, showing the amount of CPU
time consumed by AVX-512 per-task. Within the specified time period, it tracks
the time spent from being scheduled on the CPU until the FPU registers are
deactivated for each task that uses AVX-512.

This provides valuable insights - it can identify processes and CPUs executing
AVX-512 to avoid scheduling sensitive jobs together. The cputime distribution
can also help debug AVX-512 performance issues.

This program is based on this Linux kernel patch that tracks per-task AVX-512 usage:

torvalds/linux@2f7726f

It uses BPF to measure the time between a task's AVX-512 timestamps to calculate
the CPU time in AVX-512 mode. The aggregated cputime distribution is printed for
analysis.

Signed-off-by: Zhiyong Ye <[email protected]>
@yezhiyong30
Contributor Author

Hi @brendangregg,

I've updated the PR. Changes are:

  • Rename avxcputime to avxtime.
  • Add an explanation of the AVX-512 time measurement method:

Within the specified time period, it tracks the time spent from being scheduled
on the CPU until the FPU registers are deactivated for each task that uses AVX-512.

Please review it again when you have time. Any comments will help me improve this patch.

@yezhiyong30 changed the title from "avxcputime: Summarize AVX-512 cputime per task as a histogram" to "avxtime: Summarize AVX-512 cputime per task as a histogram" on Dec 13, 2023
3 participants