There is the perf command-line utility in Linux for accessing hardware performance-monitoring counters; it works through the perf_events kernel subsystem.
perf itself has basically two modes: perf record / perf top, which record a sampling profile (a sample is taken, for example, every 100000th CPU clock cycle or executed instruction), and perf stat, which reports the total count of cycles or executed instructions for the application (or for the whole system).
Is there a mode of perf that prints a system-wide or per-CPU summary of the total counts every second (or every 3, 5, 10 seconds), the way vmstat and the sysstat-family tools do (iostat, mpstat, sar -n DEV, … as listed in http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html)? For example, with the cycles and instructions counters I would get the mean IPC for every second of the system (or of every CPU).
Is there any non-perf tool (in https://perf.wiki.kernel.org/index.php/Tutorial or http://www.brendangregg.com/perf.html) that can get such statistics from the perf_events kernel subsystem? What about a system-wide per-process IPC calculation with a resolution of seconds?
Answer
perf stat has an "interval-print" option, -I N, where N is the interval in milliseconds; it prints the counters every N milliseconds (N >= 10): http://man7.org/linux/man-pages/man1/perf-stat.1.html
    -I msecs, --interval-print msecs
        Print count deltas every N milliseconds (minimum: 10ms). The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution. example: perf stat -I 1000 -e cycles -a sleep 5

        For best results it is usually a good idea to use it with interval mode like -I 1000, as the bottleneck of workloads can change often.
The results can also be exported in machine-readable form, and with -I the first field is a timestamp:
With -x, perf stat is able to output a not-quite-CSV format output … optional usec time stamp in fractions of second (with -I xxx)
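To get the per-second IPC asked about in the question, the -x output can be post-processed. Below is a minimal Python sketch, not part of perf itself. It assumes the output of something like perf stat -a -I 1000 -x, -e cycles,instructions, where the fields are timestamp, counter value, unit, event name (no -A, so no CPU column). The sample lines and the function names parse_interval_lines and ipc_per_interval are made up for illustration; check the field order against your perf version.

```python
from collections import defaultdict

def parse_interval_lines(lines, sep=","):
    """Group counter values by interval timestamp.

    Assumed field order (no -A): timestamp, value, unit, event, ...
    """
    intervals = defaultdict(dict)
    for line in lines:
        fields = line.strip().split(sep)
        if len(fields) < 4:
            continue  # blank or unrelated line
        timestamp, value, _unit, event = fields[:4]
        try:
            ts = float(timestamp)
            count = int(value)
        except ValueError:
            continue  # e.g. "<not counted>" or "<not supported>"
        intervals[ts][event] = count
    return dict(intervals)

def ipc_per_interval(intervals):
    """Return a list of (timestamp, instructions / cycles) pairs."""
    result = []
    for ts in sorted(intervals):
        counts = intervals[ts]
        cycles = counts.get("cycles")
        insns = counts.get("instructions")
        if cycles and insns is not None:
            result.append((ts, insns / cycles))
    return result

# Hypothetical sample of two one-second intervals (values are invented).
SAMPLE = """\
1.000123456,2000000,,cycles,1000123456,100.00,,
1.000123456,3000000,,instructions,1000123456,100.00,1.50,insn per cycle
2.000234567,4000000,,cycles,1000111111,100.00,,
2.000234567,2000000,,instructions,1000111111,100.00,0.50,insn per cycle
"""

for ts, ipc in ipc_per_interval(parse_interval_lines(SAMPLE.splitlines())):
    print(f"{ts:.3f}s IPC={ipc:.2f}")
```

Since the man page notes that commas in the output are not quoted, a different separator (for example -x \; with sep=";") may be easier to parse.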
The equivalent of the periodic printing done by vmstat and the sysstat-family tools (iostat, mpstat, etc.) is the -I 1000 option of perf stat (every second), for example system-wide (add -A to separate the per-CPU counters):
perf stat -a -I 1000
The option is implemented in builtin-stat.c (http://lxr.free-electrons.com/source/tools/perf/builtin-stat.c?v=4.8), in the __run_perf_stat function:
    static int __run_perf_stat(int argc, const char **argv)
    {
        int interval = stat_config.interval;
For perf stat -I 1000 with a program argument (forks=1), for example perf stat -I 1000 sleep 10, there is an interval loop (ts is the millisecond interval converted to struct timespec):
    enable_counters();

    if (interval) {
        while (!waitpid(child_pid, &status, WNOHANG)) {
            nanosleep(&ts, NULL);
            process_interval();
        }
    }
    ...
    disable_counters();
For the system-wide variant of hardware performance counter monitoring (forks=0) there is another interval loop:
    enable_counters();
    while (!done) {
        nanosleep(&ts, NULL);
        if (interval)
            process_interval();
    }
    ...
    disable_counters();
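The shape of these two loops (sleep for one interval, then process the counters, until the workload is done) can be imitated outside perf. Here is a rough Python sketch of the same pattern, not perf's code. The names split_interval_ms and interval_loop and the callback parameters are hypothetical, and the way the millisecond interval is split into seconds and nanoseconds is my assumption about what "converted to struct timespec" amounts to.

```python
import time

def split_interval_ms(interval_ms):
    """Split a millisecond interval into (seconds, nanoseconds),
    i.e. the two fields a struct timespec would hold."""
    return interval_ms // 1000, (interval_ms % 1000) * 1_000_000

def interval_loop(child_exited, process_interval, interval_ms, sleep=time.sleep):
    """Call process_interval() once per interval until child_exited() is true.

    child_exited plays the role of the non-blocking waitpid(..., WNOHANG)
    check (or of the `done` flag in the system-wide variant).
    Returns the number of intervals processed.
    """
    sec, nsec = split_interval_ms(interval_ms)
    ticks = 0
    while not child_exited():
        sleep(sec + nsec / 1e9)   # stand-in for nanosleep(&ts, NULL)
        process_interval()        # read and print the counters here
        ticks += 1
    return ticks
```

With a real child process, child_exited could be something like lambda: proc.poll() is not None for a subprocess.Popen object proc. As in the C loop, the last interval is still processed even if the child exits during the sleep.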
process_interval() (http://lxr.free-electrons.com/source/tools/perf/builtin-stat.c?v=4.8#L347), from the same file, calls read_counters(), which loops over the event list and invokes read_counter(), which in turn loops over all known threads and all CPUs and calls the actual reading function:
    for (thread = 0; thread < nthreads; thread++) {
        for (cpu = 0; cpu < ncpus; cpu++) {
            ...
            count = perf_counts(counter->counts, cpu, thread);
            if (perf_evsel__read(counter, cpu, thread, count))
                return -1;
perf_evsel__read is the actual counter read, performed while the program is still running:
    int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
                         struct perf_counts_values *count)
    {
        memset(count, 0, sizeof(*count));

        if (FD(evsel, cpu, thread) < 0)
            return -EINVAL;

        if (readn(FD(evsel, cpu, thread), count, sizeof(*count)) < 0)
            return -errno;

        return 0;
    }
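To illustrate what comes back from that read, here is a small Python sketch that decodes such a buffer. It assumes that struct perf_counts_values holds three unsigned 64-bit fields (val, ena, run: the raw count, the time the event was enabled, and the time it was actually running on the PMU), and that a multiplexed counter is scaled by ena/run. The function names are hypothetical, and the rounding is my approximation rather than a copy of perf's code.

```python
import struct

# Assumed layout: val, ena, run as three native-endian unsigned 64-bit integers.
COUNTS_FORMAT = "=QQQ"

def decode_counts(buf):
    """Unpack a 24-byte counter read into (val, ena, run)."""
    return struct.unpack(COUNTS_FORMAT, buf)

def scaled_value(val, ena, run):
    """Estimate the full count when the event was multiplexed.

    Returns None if the event never ran, so no estimate is possible.
    """
    if run == 0:
        return None
    if run < ena:
        return round(val * ena / run)
    return val

def interval_delta(previous, current):
    """Count for one interval from two cumulative readings."""
    return current - previous

# Invented example: the event ran for half of the enabled time.
raw = struct.pack(COUNTS_FORMAT, 1000, 200, 100)
val, ena, run = decode_counts(raw)
print(scaled_value(val, ena, run))
```

The interval_delta helper only shows the idea behind "Print count deltas" in the man page: each printed line is the difference between two consecutive cumulative readings.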