I have a program and I want to measure its execution (wallclock) time for different input sizes.
In some other, similar questions I read that using `clock_gettime` in the source code wouldn't be reliable because of the CPU's branch predictor, register renaming, speculative execution, out-of-order execution, etc., and that sometimes even the optimizer can move the `clock_gettime` call somewhere other than where I put it.
But those questions were about measuring the time of a specific function. Would these problems still exist if I'm measuring the whole program (i.e. the `main` function)? I'm looking for relative measurements (how the execution time changes for different input sizes), not absolute values.
How would I get better results? Using timing functions in the code:

```c
start = clock_gettime(); do_stuff(); end = clock_gettime(); execution_time = end - start;
```

or with the `time` command in bash:

```sh
time ./program
```
Answer
Measuring in the program will give you a more accurate answer. Sure, in theory, in some cases the compiler can move the `clock_gettime` calls somewhere you don't expect. In practice, it will not happen if you have only a function call in between. (If in doubt, check the resulting assembly code.)
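As a minimal sketch of the in-program approach (the `do_stuff()` body here is a hypothetical stand-in for your workload), using `CLOCK_MONOTONIC`, which is not affected by system clock adjustments:

```c
/* expose clock_gettime when compiling with -std=c99 or similar */
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* hypothetical stand-in for the code you want to measure */
static void do_stuff(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)
        sink += i;
}

int main(void)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do_stuff();
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* convert the two timespecs into elapsed seconds */
    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    printf("do_stuff: %.6f s\n", elapsed);
    return 0;
}
```

On older glibc you may also need to link with `-lrt` for `clock_gettime`.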
Calling `time` in the shell will include things you don't care about, like the time it takes to load your executable and get to the interesting point. On the other hand, if your `do_stuff` takes a few seconds, then it doesn't really matter.
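For reference, bash's built-in `time` reports three figures, of which `real` is the wall-clock time you are after (the numbers below are purely illustrative):

```sh
$ time ./program

real    0m2.314s
user    0m2.250s
sys     0m0.060s
```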
I’d go with the following recommendation:
- If you can isolate your function easily and make it take a few seconds (you can also loop it, but measure an empty loop for comparison as well; see the sketch after this list), then either `clock_gettime` or `time` will do just fine.
- If you cannot isolate it easily, but your function consistently takes hundreds of milliseconds, use `clock_gettime`.
- If you cannot isolate it and you're optimising something tiny, have a look at rdtsc timing for measuring a function, which talks about measuring actual executed cycles.
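If the function is too quick to time directly, the loop-and-subtract idea from the first bullet can look like the following sketch. The `do_stuff()` body and the iteration count `N` are hypothetical placeholders; tune `N` so each timed loop runs for at least a second or so:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

#define N 100000000UL  /* hypothetical iteration count */

/* hypothetical tiny workload */
static void do_stuff(void)
{
    volatile int x = 0;
    x++;
}

static double elapsed_s(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec)
         + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
    struct timespec t0, t1;

    /* baseline: an empty loop of the same length;
       the volatile counter keeps it from being optimised away */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (volatile unsigned long i = 0; i < N; i++)
        ;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double baseline = elapsed_s(t0, t1);

    /* measured: the same loop, but calling do_stuff() */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (volatile unsigned long i = 0; i < N; i++)
        do_stuff();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double measured = elapsed_s(t0, t1);

    printf("per-call estimate: %.3f ns\n",
           (measured - baseline) / (double)N * 1e9);
    return 0;
}
```

Subtracting the empty-loop baseline removes the loop overhead itself, so the difference is a rough per-call figure; it is still only an estimate, since the compiler may inline or partially optimise such a tiny function.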