Metrics

Performance metrics evaluate the efficiency of computer hardware through specific, user-defined calculations based on hardware events. One frequently used metric is "Cycles per Instruction" (CPI). It measures how many CPU cycles are consumed, on average, to execute one instruction. The fewer cycles needed per instruction, the more efficiently the workload runs on the hardware.
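
To make the definition concrete, here is a minimal sketch of the CPI calculation on two raw counter values. The helper names `cycles_per_instruction` and `instructions_per_cycle` are hypothetical and not part of perf-cpp; the sketch only illustrates the arithmetic.

```cpp
#include <cstdint>
#include <optional>

/// Hypothetical helper (not part of perf-cpp): average cycles spent per instruction.
/// Returns std::nullopt when no instruction was counted, since the ratio is undefined.
std::optional<double> cycles_per_instruction(const std::uint64_t cycles, const std::uint64_t instructions)
{
    if (instructions == 0U) {
        return std::nullopt;
    }
    return static_cast<double>(cycles) / static_cast<double>(instructions);
}

/// Hypothetical helper: the inverse view, instructions retired per cycle.
std::optional<double> instructions_per_cycle(const std::uint64_t cycles, const std::uint64_t instructions)
{
    if (cycles == 0U) {
        return std::nullopt;
    }
    return static_cast<double>(instructions) / static_cast<double>(cycles);
}
```

For example, 3000 cycles spent on 1500 instructions give a CPI of 2.0 and an IPC of 0.5.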

Tip

Our examples include a working code example: statistics/metric.cpp.

When defining custom metrics, you should take a look at the list of metrics in the Likwid project.

Note

Metrics are not applicable for live events.

Built-in Metrics

perf-cpp ships with several built-in metrics that can be used just like events. To use one, add its name to the perf::EventCounter instance, as shown in the Utilizing Metrics section:

Metric name	Description
`gigahertz`	Processor speed during the measurement (`cycles / (seconds * 1e+09)`).
`cycles-per-instruction`	Represents the number of cycles required per instruction.
`instructions-per-cycle`	Represents the number of instructions executed per cycle.
`cache-hit-ratio`	Indicates the ratio of cache hits to total cache accesses.
`cache-miss-ratio`	Indicates the ratio of cache misses to total cache accesses.
`dTLB-miss-ratio`	The ratio of data TLB misses to data TLB accesses.
`iTLB-miss-ratio`	The ratio of instruction TLB misses to instruction TLB accesses.
`L1-data-miss-ratio`	Reflects the ratio of L1 data cache misses to L1 data cache accesses.
`branch-miss-ratio`	Reflects the ratio of branch misses to executed branches.
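
The arithmetic behind two of these metrics can be sketched as follows. This is an illustration only: the helpers `gigahertz` and `miss_ratio` are hypothetical, and the sketch assumes that each ratio is simply misses divided by accesses, as the descriptions in the table suggest.

```cpp
#include <cstdint>

/// Hypothetical helper: average clock frequency in GHz,
/// computed from a cycle count and the elapsed wall-clock time in seconds.
double gigahertz(const std::uint64_t cycles, const double seconds)
{
    return static_cast<double>(cycles) / (seconds * 1e9);
}

/// Hypothetical helper: share of misses among all accesses.
/// Returns 0.0 when nothing was accessed (a choice made for this sketch).
double miss_ratio(const std::uint64_t misses, const std::uint64_t accesses)
{
    if (accesses == 0U) {
        return 0.0;
    }
    return static_cast<double>(misses) / static_cast<double>(accesses);
}
```

For instance, 3,000,000,000 cycles counted over 1.5 seconds correspond to 2.0 GHz, and 25 misses out of 100 accesses give a miss ratio of 0.25.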

Utilizing Metrics

Metrics function similarly to hardware events in the perf::EventCounter:

#include <perfcpp/event_counter.h>

const auto counter_definition = perf::CounterDefinition{};
auto event_counter = perf::EventCounter{ counter_definition };

/// Add the metric like a "normal" hardware event.
event_counter.add("cycles-per-instruction");

/// Record events and metrics.
event_counter.start();
/// ....
event_counter.stop();
const auto result = event_counter.result();

/// Access the metric like events.
const auto cycles_per_instruction = result.get("cycles-per-instruction");

When a metric is added, perf-cpp internally counts the hardware events it requires (such as cycles and instructions for CPI) but reports only the metrics and events you explicitly added.
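
The idea of expanding requested names into the events that must actually be counted can be sketched as below. This is not perf-cpp's implementation; `MetricDependencies` and `events_to_count` are hypothetical names used to illustrate the principle.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

/// Hypothetical table: metric name -> hardware events the metric depends on.
using MetricDependencies = std::map<std::string, std::vector<std::string>>;

/// Returns every hardware event that has to be counted for the requested names.
/// A requested metric contributes its dependencies; any other name is treated as an event.
std::set<std::string> events_to_count(const std::vector<std::string>& requested,
                                      const MetricDependencies& metrics)
{
    auto events = std::set<std::string>{};
    for (const auto& name : requested) {
        const auto metric = metrics.find(name);
        if (metric != metrics.end()) {
            events.insert(metric->second.begin(), metric->second.end());
        } else {
            events.insert(name);
        }
    }
    return events;
}
```

In this sketch, requesting "cycles-per-instruction" and "cache-misses" leads to three counted events (cycles, instructions, cache-misses), while only the two requested names would appear in the reported result.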

Creating Custom Metrics

Metrics are often based on the performance counters supported by the underlying hardware. You can create custom metrics to tailor them to your specific hardware.

Tip

The Likwid project gives an excellent and extensive list of available metrics for various CPUs. Take a look at their groups/ directory.

There are two ways to define custom metrics. For both, you will need to create your own instance of the perf::CounterDefinition and pass it to the perf::EventCounter or perf::Sampler.

Using Formulas

The first option is to express a metric as a calculation of several hardware and time events, for example:

auto counter_definition = perf::CounterDefinition{};
counter_definition.add("stalls-by-mem-loads", 
                        "(CYCLE_ACTIVITY_STALLS_LDM_PENDING / CYCLE_ACTIVITY_STALLS_TOTAL) * 100");

auto event_counter = perf::EventCounter{ counter_definition };

The formula can use the following operators: +, -, *, and /.

In addition, numbers in scientific notation (e.g., 1E5, 1e-5) can be used.

Note

In formulas, event names that contain operators (like - in L1D-misses) need to be escaped using single quotes, e.g., 'L1D-misses'.

The formula example above depends on events from the Intel SkylakeX architecture and is taken from Likwid.
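
To illustrate how such formulas can be read, including quoted event names and scientific notation, here is a small self-contained evaluator. It is a sketch under stated assumptions and not perf-cpp's parser: the class `FormulaEvaluator` is hypothetical, and event values are passed in as a plain map.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

/// Hypothetical mini-evaluator, NOT perf-cpp's parser. Supports + - * /,
/// parentheses, numbers such as 100 or 1e-5, and plain or 'quoted' event names.
class FormulaEvaluator
{
public:
    FormulaEvaluator(std::string formula, std::map<std::string, double> events)
        : _formula(std::move(formula)), _events(std::move(events))
    {
    }

    /// Evaluates the formula; throws on unknown names, a missing ')' or a missing closing quote.
    double evaluate()
    {
        _position = 0U;
        return sum();
    }

private:
    std::string _formula;
    std::map<std::string, double> _events;
    std::size_t _position{ 0U };

    /// Skips whitespace and returns the next character ('\0' at the end).
    char peek()
    {
        while (_position < _formula.size() && std::isspace(static_cast<unsigned char>(_formula[_position]))) {
            ++_position;
        }
        return _position < _formula.size() ? _formula[_position] : '\0';
    }

    /// sum := product (('+' | '-') product)*
    double sum()
    {
        auto value = product();
        while (peek() == '+' || peek() == '-') {
            const auto op = _formula[_position++];
            const auto rhs = product();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    /// product := operand (('*' | '/') operand)*
    double product()
    {
        auto value = operand();
        while (peek() == '*' || peek() == '/') {
            const auto op = _formula[_position++];
            const auto rhs = operand();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    /// operand := '(' sum ')' | number | 'quoted name' | name
    double operand()
    {
        const auto next = peek();
        if (next == '(') {
            ++_position;
            const auto value = sum();
            if (peek() != ')') { throw std::invalid_argument{ "missing ')'" }; }
            ++_position;
            return value;
        }
        if (std::isdigit(static_cast<unsigned char>(next)) || next == '.') {
            char* end = nullptr; /// std::strtod also understands 1E5 and 1e-5.
            const auto value = std::strtod(_formula.c_str() + _position, &end);
            _position = static_cast<std::size_t>(end - _formula.c_str());
            return value;
        }
        auto name = std::string{};
        if (next == '\'') { /// Quoted names may contain operators, e.g. 'L1D-misses'.
            const auto close = _formula.find('\'', _position + 1U);
            if (close == std::string::npos) { throw std::invalid_argument{ "missing closing quote" }; }
            name = _formula.substr(_position + 1U, close - _position - 1U);
            _position = close + 1U;
        } else {
            while (_position < _formula.size() &&
                   (std::isalnum(static_cast<unsigned char>(_formula[_position])) || _formula[_position] == '_')) {
                name += _formula[_position++];
            }
        }
        return _events.at(name); /// Throws std::out_of_range for unknown names.
    }
};
```

With CYCLE_ACTIVITY_STALLS_LDM_PENDING set to 25 and CYCLE_ACTIVITY_STALLS_TOTAL set to 100, the formula from the example above evaluates to 25. The quoting rule matters because, without the quotes, L1D-misses would be read as the subtraction L1D minus misses.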

Implementing Metrics using the Interface

The second option is to define metrics by implementing the perf::Metric interface, for example:

#include <optional>
#include <string>
#include <vector>
#include <perfcpp/metric.h>

class StallsPerCacheMiss final : public perf::Metric
{
public:
    /// Provides a name used to access the metric value.
    [[nodiscard]] std::string name() const override 
    {
        return "stalls-per-cache-miss"; 
    }
    
    /// Identifies the necessary hardware events for this metric.
    [[nodiscard]] std::vector<std::string> required_counter_names() const override
    { 
        return {"stalls", "cache-misses"}; 
    }
    
    /// Calculates the metric using the recorded hardware event data.
    /// Calculation happens after stopping the EventCounter.
    [[nodiscard]] std::optional<double> calculate(const perf::CounterResult& result) const override
    {
        const auto stalls = result.get("stalls");
        const auto cache_misses = result.get("cache-misses");

        if (stalls.has_value() && cache_misses.has_value())
        {
            return stalls.value() / cache_misses.value();
        }

        return std::nullopt;
    }
};

After implementing a custom metric, register it with the perf::CounterDefinition so that it can be added by name:

auto counter_definition = perf::CounterDefinition{};
counter_definition.add(std::make_unique<StallsPerCacheMiss>());

auto event_counter = perf::EventCounter{ counter_definition };
event_counter.add("stalls-per-cache-miss");

You can also rename the metrics as needed:

/// Add the metric using a custom name:
counter_definition.add("SPCM", std::make_unique<StallsPerCacheMiss>());

/// Use the custom name:
event_counter.add("SPCM");
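
The effect of registering a metric under a custom name can be sketched without the library. Everything below is a simplified stand-in, not perf-cpp code: `Metric` imitates the shape of an interface like perf::Metric, `MetricRegistry` is a hypothetical container, and event values are passed as a plain map. The sketch also guards against dividing by zero cache misses, which is an addition made for this illustration.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

using EventValues = std::map<std::string, double>;

/// Simplified, hypothetical stand-in for an interface like perf::Metric.
class Metric
{
public:
    virtual ~Metric() = default;
    [[nodiscard]] virtual std::string name() const = 0;
    [[nodiscard]] virtual std::optional<double> calculate(const EventValues& events) const = 0;
};

class StallsPerCacheMiss final : public Metric
{
public:
    [[nodiscard]] std::string name() const override { return "stalls-per-cache-miss"; }

    [[nodiscard]] std::optional<double> calculate(const EventValues& events) const override
    {
        const auto stalls = events.find("stalls");
        const auto cache_misses = events.find("cache-misses");
        /// No value if an event is missing or the divisor is zero.
        if (stalls == events.end() || cache_misses == events.end() || cache_misses->second == 0.0) {
            return std::nullopt;
        }
        return stalls->second / cache_misses->second;
    }
};

/// Hypothetical registry: a metric is found only under the name it was registered with.
class MetricRegistry
{
public:
    /// Registers the metric under its own name().
    void add(std::unique_ptr<Metric>&& metric)
    {
        auto name = metric->name();
        _metrics.insert_or_assign(std::move(name), std::move(metric));
    }

    /// Registers the metric under a custom name.
    void add(std::string name, std::unique_ptr<Metric>&& metric)
    {
        _metrics.insert_or_assign(std::move(name), std::move(metric));
    }

    [[nodiscard]] std::optional<double> calculate(const std::string& name, const EventValues& events) const
    {
        const auto metric = _metrics.find(name);
        if (metric == _metrics.end()) {
            return std::nullopt;
        }
        return metric->second->calculate(events);
    }

private:
    std::map<std::string, std::unique_ptr<Metric>> _metrics;
};
```

In this sketch, a metric registered as "SPCM" is reachable only under that name; asking for "stalls-per-cache-miss" yields no value unless the metric was also registered without a custom name.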
