
What is C-state Cx in cpupower monitor

I am profiling an application for execution time on an x86-64 processor running Linux. Before I start benchmarking the application, I want to make sure that dynamic frequency scaling and idle states are disabled.

Check on Frequency scaling

$ cat /sys/devices/system/cpu/cpufreq/boost
0

This tells me that frequency boosting (Intel’s Turbo Boost or AMD’s Turbo Core) is disabled. In fact, we set the frequency to a constant 2 GHz, which is evident from the next exercise.

Check on CPU idling

$ cpupower --cpu 0-63 idle-info
CPUidle driver: none

CPUidle governor: menu
analyzing CPU 0:

CPU 0: No idle states

analyzing CPU 1:

CPU 1: No idle states

analyzing CPU 2:

CPU 2: No idle states
...

So, the idle states are disabled. Now that I am sure both “features” that can interfere with benchmarking are disabled, I go ahead and monitor the application using cpupower.

But when I run my application while monitoring the C-states, I see that more than 99% of the time is spent in the C0 state, as expected. However, I also see something called the Cx state, in which the cores spend 0.01 – 0.02% of the time.

$ cpupower monitor -c ./my_app
./my_app took 32.28017 seconds and exited with status 0
    |Mperf
CPU | C0   | Cx   | Freq
   0| 99.98|  0.02|  1998
  32| 99.98|  0.02|  1998
   1|100.00|  0.00|  1998
  33| 99.99|  0.01|  1998
   2|100.00|  0.00|  1998
  34| 99.99|  0.01|  1998
   3|100.00|  0.00|  1998
  35| 99.99|  0.01|  1998
  ...

So, I would be glad to understand the following.

  1. What is the Cx state? And can I safely ignore such low numbers?
  2. Are there any other features apart from frequency scaling and CPU idling that I should care about (from a benchmarking perspective)?

Bonus Question

  1. What does CPUidle driver: none mean?

Edit 1

Regarding the second question, on additional concerns during benchmarking: I recently found out that the local timer interrupts a CPU core receives for scheduling purposes could skew the measurements, so CONFIG_NO_HZ_FULL is enabled in the Linux kernel to enable tickless mode.


Answer

The beauty of open source software is that you can always go and check 🙂
cpupower monitor uses several monitors; the mperf monitor defines this array:

static cstate_t mperf_cstates[MPERF_CSTATE_COUNT] = {
    {
        .name           = "C0",
        .desc           = N_("Processor Core not idle"),
        .id         = C0,
        .range          = RANGE_THREAD,
        .get_count_percent  = mperf_get_count_percent,
    },
    {
        .name           = "Cx",
        .desc           = N_("Processor Core in an idle state"),
        .id         = Cx,
        .range          = RANGE_THREAD,
        .get_count_percent  = mperf_get_count_percent,
    },

    {
        .name           = "Freq",
        .desc           = N_("Average Frequency (including boost) in MHz"),
        .id         = AVG_FREQ,
        .range          = RANGE_THREAD,
        .get_count      = mperf_get_count_freq,
    },
};

Quite logically, Cx means any C-state other than C0, i.e. any idle state (note that these states are not the ACPI states, though a higher number is a deeper sleep state; for ACPI, off is C6).

Note how Cx is computed:

if (id == Cx)
    *percent = 100.0 - *percent;

Cx is simply the complement of C0.
This is because the IA32_MPERF and IA32_APERF counters that the monitor uses do not count in any C-state but C0. The Intel manuals describe IA32_MPERF as:

C0 TSC Frequency Clock Count
Increments at fixed interval (relative to TSC freq.) when the logical processor is in C0.

A similar definition for IA32_APERF is present in the manuals.
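
The arithmetic behind the three columns can be sketched as follows. This is only an illustration under stated assumptions, not the cpupower implementation: the struct and function names are hypothetical, the counter snapshots are taken as given (reading the MSRs needs privileged access and is not shown), and the TSC frequency is passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the three counters for one logical CPU. */
struct counter_sample {
    uint64_t tsc;    /* time stamp counter, always running */
    uint64_t mperf;  /* ticks at the TSC rate, but only in C0 */
    uint64_t aperf;  /* ticks at the actual core clock, only in C0 */
};

/* C0 residency: share of TSC ticks during which MPERF also ticked. */
double c0_percent(const struct counter_sample *start,
                  const struct counter_sample *end)
{
    uint64_t tsc_delta = end->tsc - start->tsc;
    uint64_t mperf_delta = end->mperf - start->mperf;

    assert(tsc_delta > 0);
    return 100.0 * (double)mperf_delta / (double)tsc_delta;
}

/* Cx is not measured directly; it is whatever is left over from C0. */
double cx_percent(const struct counter_sample *start,
                  const struct counter_sample *end)
{
    return 100.0 - c0_percent(start, end);
}

/* Average frequency while in C0, scaled from an assumed TSC frequency. */
double avg_freq_mhz(const struct counter_sample *start,
                    const struct counter_sample *end,
                    double tsc_mhz)
{
    uint64_t mperf_delta = end->mperf - start->mperf;
    uint64_t aperf_delta = end->aperf - start->aperf;

    assert(mperf_delta > 0);
    return tsc_mhz * (double)aperf_delta / (double)mperf_delta;
}
```

With these definitions, a core whose MPERF advanced by 9998 ticks while the TSC advanced by 10000 shows up as roughly 99.98% C0 and 0.02% Cx, which matches the kind of numbers in the question.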


There are a lot of things to consider when benchmarking, probably more than can be listed as a secondary answer.
In general, later runs of the tested code will find at least part of the data hot in the caches (the same goes for TLBs and any internal caching).
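
One common way to account for this is to run the code a few times before timing it, so that every timed run starts from a similarly warm state. Below is a minimal sketch of that idea; `sample_workload`, `workload_calls` and `time_after_warmup` are hypothetical names chosen for illustration, and the run counts are arbitrary.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the code under test; counts its invocations. */
unsigned long workload_calls;
static volatile unsigned long sink;

void sample_workload(void)
{
    for (unsigned long i = 0; i < 100000UL; i++)
        sink += i;
    workload_calls++;
}

static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Runs fn untimed warmup_runs times, then returns the mean wall-clock
 * time in seconds of the following timed_runs calls. */
double time_after_warmup(void (*fn)(void), int warmup_runs, int timed_runs)
{
    assert(fn != NULL && warmup_runs >= 0 && timed_runs > 0);

    for (int i = 0; i < warmup_runs; i++)
        fn();  /* results discarded: these runs only warm caches and TLBs */

    double start = now_seconds();
    for (int i = 0; i < timed_runs; i++)
        fn();
    return (now_seconds() - start) / timed_runs;
}
```

Whether warm or cold numbers are the relevant ones depends on how the application is used in practice; the sketch only makes the choice explicit.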

Interrupt affinity is also something to consider depending on the benchmarked program.
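
As an illustration, on Linux an IRQ can be steered away from the benchmark core by writing a hexadecimal CPU mask to /proc/irq/<N>/smp_affinity. The sketch below only builds such a mask string for a machine with at most 64 CPUs; the function names are hypothetical, the IRQ number is system-specific, writing the file needs root and is not shown, and some IRQs cannot be moved or may be rebalanced by a daemon such as irqbalance.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Mask with one bit per CPU in 0..ncpus-1, minus the excluded CPU. */
uint64_t mask_without_cpu(unsigned ncpus, unsigned excluded_cpu)
{
    assert(ncpus >= 1 && ncpus <= 64 && excluded_cpu < ncpus);

    uint64_t all = (ncpus == 64) ? UINT64_MAX
                                 : ((UINT64_C(1) << ncpus) - 1);
    return all & ~(UINT64_C(1) << excluded_cpu);
}

/* Formats the mask as comma-separated 32-bit hex groups, highest group
 * first, e.g. "ffffffff,fffffffb". Returns what snprintf returns. */
int format_smp_affinity(uint64_t mask, char *buf, size_t len)
{
    uint32_t high = (uint32_t)(mask >> 32);
    uint32_t low = (uint32_t)(mask & 0xffffffffu);

    return snprintf(buf, len, "%08" PRIx32 ",%08" PRIx32, high, low);
}
```

For the 64-CPU machine in the question, excluding CPU 2 gives the string ffffffff,fffffffb.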

However, I’d say that with turbo boost and scaling disabled, you are pretty much ready to test.


A CPUIdle driver is the kernel component that controls the platform-dependent part of entering and exiting idle states.
For Intel CPUs (and AMD ones?), the kernel can use either the ACPI processor_idle driver (if enabled) or intel_idle (which uses mwait).
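
The driver currently in use is also exposed through sysfs, in /sys/devices/system/cpu/cpuidle/current_driver. A small sketch of reading it from a program follows; `read_first_line` is a hypothetical helper, and the file may be absent on kernels built without cpuidle support.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads the first line of a text file into buf, without the trailing
 * newline. Returns 0 on success, -1 if the file cannot be read. */
int read_first_line(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';  /* strip the newline, if any */
    return 0;
}

/* Example call (output depends on the system):
 *   char name[64];
 *   if (read_first_line("/sys/devices/system/cpu/cpuidle/current_driver",
 *                       name, sizeof name) == 0)
 *       printf("cpuidle driver: %s\n", name);
 */
```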

User contributions licensed under: CC BY-SA