Skip to content

Tag: perf

Counting L3 cache access event on Amd Zen 2 processors

I am trying to figure out the event to use with the perf stat command to count L3 cache accesses on an AMD Zen 2 processor. As per the PPR (, section, page 168, the event is x01 and the umask is x80 for “[L3 Cache Accesses] (L3RequestG1)”. From what I understand, the event to use in perf stat

Weird Backtrace in Perf

I used the following command to extract backtraces leading to user level L3-misses in a simple evince benchmark: As it is clear, the sampling period is quite large (10000 events between consecutive samples). For this experiment, the output of perf script had some samples similar to this one: At the bottom of the backtrace, there is a symbol called [unknown],

definition of linux perf cache-misses event?

I am trying to use linux perf to profile cache performance. perf list shows there is a cache-misses event. However, what’s the definition of this “cache-misses” event? Is it one of L1D/L1i cache, L2 cache or L3 cache? Thanks! Answer The cache-misses event corresponds to the misses in the last level cache (LLC). Note that this is an architectural performance

how to get rid of the “unknown” section in the perf

what I did is: then I get a tiny part of “unknown”, like looks this is due to libc ‘malloc’ call. then I write a program on the same machine to test it. then I did the same thing as above, there is no “unknown” section. how to explain/fix this? Answer The [unknown] block in the perf report output refers

Linux perf_events annotation frame pointer confusion

I ran sudo perf record -F 99 find / followed by sudo perf report and selected “Annotate fdopendir” and here are the first seven instructions: push %rbp push %rbx mov %edi,%esi mov %edi,%ebx mov $0x1,%edi sub $0xa8,%rsp mov %rsp,%rbp The first instruction appears to be saving the caller’s base frame pointer. I believe instructions 2 through 5 are irrelevant to

How does perf associate events to functions?

More precisely how does the perf tool associate PMU events to functions i already realized that when the kernel perf subsystem records the event counters it also records the Program Counter (PC) so it can associate the count to a function. However to really get fine grain result, you need to sample the counters in a very high rate, otherwise

Which perf events can use PEBS?

I want to understand which events can have the precise modifier on my CPU (Sandy Bridge). Intel Software Developer’s Manual (Table 18-32. PEBS Performance Events for Intel Microarchitecture Code Name Sandy Bridge) contains only the following events: INST_RETIRED, UOPS_RETIRED, BR_INST_RETIRED, BR_MISP_RETIRED, MEM_UOPS_RETIRED, MEM_LOAD_UOPS_RETIRED, MEM_LOAD_UOPS_LLC_HIT_RETIRED. And SandyBridge_core_V15.json lists the same events with PEBS > 0. However there are some examples of

Use linux perf utility to report counters every second like vmstat

There is perf command-linux utility in Linux to access hardware performance-monitoring counters, it works using perf_events kernel subsystems. perf itself has basically two modes: perf record/perf top to record sampling profile (the sample is for example every 100000th cpu clock cycle or executed command), and perf stat mode to report total count of cycles/executed commands for the application (or for