I am trying to figure out which event to use with the perf stat command to count L3 cache accesses on an AMD Zen 2 processor. As per the PPR (http://developer.amd.com/wordpress/media/2017/11/54945_PPR_Family_17h_Models_00h-0Fh.pdf), section 2.1.13.4.1, page 168, the event is 0x01 and the umask is 0x80 for “…
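As a rough illustration of how an event/umask pair such as the one quoted above is usually turned into a single raw value, here is a minimal Python sketch. The bit layout (event bits 7:0 in config bits 7:0, umask in bits 15:8, event bits 11:8 in config bits 35:32) is an assumption about how AMD family 17h PMUs are described to perf; the authoritative layout is whatever the PMU's `format` directory under `/sys/bus/event_source/devices/` reports on the machine. The helper name `amd_raw_config` is hypothetical. The excerpt does not settle which PMU (core or a separate L3 PMU) the event belongs to, and the sketch does not try to.

```python
def amd_raw_config(event: int, umask: int) -> int:
    """Pack an event select and a unit mask into one raw config value.

    Assumed layout (check the PMU's sysfs format files before relying on it):
      event bits 7:0  -> config bits 7:0
      umask bits 7:0  -> config bits 15:8
      event bits 11:8 -> config bits 35:32
    """
    if not 0 <= event <= 0xFFF:
        raise ValueError("event select must fit in 12 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("umask must fit in 8 bits")
    low = event & 0xFF           # low byte of the event select
    high = (event >> 8) & 0xF    # upper nibble of the event select
    return low | (umask << 8) | (high << 32)


if __name__ == "__main__":
    # Values quoted in the question: event 0x01, umask 0x80.
    config = amd_raw_config(0x01, 0x80)
    # perf's raw-event syntax is the letter r followed by the hex value.
    print(f"r{config:x}")
```

With the values from the question, the packed value is 0x8001. Whether that value counts L3 accesses depends on the PMU it is programmed on, which is the open part of the question.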
Tag: perf
Is Linux perf accurate for measuring cache misses in a multithreaded C program?
Can Linux perf measure cache misses for a multithreaded program, or can it only report the result for the master thread? I used it on a C program using pthread, and the cache miss count seemed lower than expected. Answer Yes, perf stat gives an accurate total across all threads. (Unless your CPU has an er…
Weird Backtrace in Perf
I used the following command to extract backtraces leading to user-level L3 misses in a simple evince benchmark: As is clear, the sampling period is quite large (10000 events between consecutive samples). For this experiment, the output of perf script had some samples similar to this one: At the bottom of …
Definition of the Linux perf cache-misses event?
I am trying to use Linux perf to profile cache performance. perf list shows there is a cache-misses event. However, what’s the definition of this “cache-misses” event? Does it refer to the L1D/L1I cache, the L2 cache, or the L3 cache? Thanks! Answer The cache-misses event corresponds to the misses in the las…
How to get rid of the “unknown” section in the perf
What I did is: Then I get a small “unknown” section, like: It looks like this is due to the libc ‘malloc’ call. Then I wrote a program on the same machine to test it. Then I did the same thing as above, and there was no “unknown” section. How can this be explained or fixed? Answer The [unknown] block in…
Linux perf_events annotation frame pointer confusion
I ran sudo perf record -F 99 find / followed by sudo perf report and selected “Annotate fdopendir” and here are the first seven instructions: push %rbp; push %rbx; mov %edi,%esi; mov %edi,%ebx; mov $0x1,%edi; sub $0xa8,%rsp; mov %rsp,%rbp. The first instruction appears to be saving the caller’s bas…
How does perf associate events to functions?
More precisely, how does the perf tool associate PMU events with functions? I have already realized that when the kernel perf subsystem records the event counters, it also records the program counter (PC), so it can associate the count with a function. However, to get really fine-grained results, you need to sample the counte…
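The idea described in this question, recording a program counter with each sample and then attributing the sample to the function containing that address, can be sketched in a few lines of Python. This is only an illustration of the lookup step, not perf's actual implementation; the symbol table, the addresses, and the names `resolve` and `histogram` are all made up for the example.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table, sorted by start address:
# (start address, end address (exclusive), function name).
SYMBOLS = [
    (0x1000, 0x1040, "main"),
    (0x1040, 0x10A0, "parse_input"),
    (0x10A0, 0x1200, "hot_loop"),
]


def resolve(pc, symbols=SYMBOLS):
    """Return the name of the function whose address range contains pc."""
    starts = [start for start, _end, _name in symbols]
    # Index of the last symbol that starts at or before pc.
    i = bisect_right(starts, pc) - 1
    if i >= 0:
        start, end, name = symbols[i]
        if start <= pc < end:
            return name
    # No symbol covers this address.
    return "[unknown]"


def histogram(sampled_pcs, symbols=SYMBOLS):
    """Count how many sampled PCs fall into each function."""
    return Counter(resolve(pc, symbols) for pc in sampled_pcs)


# Made-up sampled program counters, one per counter-overflow sample.
SAMPLES = [0x10B0, 0x10C4, 0x1004, 0x11F8, 0x1050, 0x2000]

if __name__ == "__main__":
    for name, count in histogram(SAMPLES).most_common():
        print(f"{name:12s} {count}")
```

Addresses that fall outside every known range end up in a catch-all bucket, which is the same kind of bucket a profiler shows when it cannot map an address to a symbol.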
Which perf events can use PEBS?
I want to understand which events can have the precise modifier on my CPU (Sandy Bridge). Intel Software Developer’s Manual (Table 18-32. PEBS Performance Events for Intel Microarchitecture Code Name Sandy Bridge) contains only the following events: INST_RETIRED, UOPS_RETIRED, BR_INST_RETIRED, BR_MISP_R…
How do I use a newer perf tool front end with a record from an older perf version
I am running perf record on an older version of the kernel on an ARM board. The kernel version is 3.18.21-rt19. The perf version on the board is similarly perf version 3.18.21. While I can record and use the report feature on this perf, the TUI for report on this version is quite awful/non-existent. Instead of…
Use linux perf utility to report counters every second like vmstat
There is the perf command-line utility in Linux for accessing hardware performance-monitoring counters; it works using the perf_events kernel subsystem. perf itself has basically two modes: perf record/perf top to record a sampling profile (a sample is taken, for example, every 100000th CPU clock cycle or executed instruction), an…