
Save output from ‘stats’ command in gnuplot

I want to statistically analyse the output files from a benchmark that runs on 600 nodes. In particular, I need the max, upper quartile, median, lower quartile, min and mean values. My output files are named testrun16-[1-600].

I extract the values with this code:

ListofFiles = system('dir testrun16-*')

set print 'MaxValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_max
}

set print 'upquValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_up_quartile
}

set print 'MedianValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_median
}

set print 'loquValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_lo_quartile
}

set print 'MinValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_min
}

set print 'MeanValues.dat'
do for [file in ListofFiles]{
stats file using 1 nooutput
print STATS_mean
}

unset print
set term x11
set title 'CLAIX2016 distribution of OSnoise using FWQ'
set xlabel "Number of Nodes"
set ylabel "Runtime [ns]"
plot 'MaxValues.dat' using 1 title 'maximum value', 'upquValues.dat' title 'upper quartile', 'MedianValues.dat' using 1 title 'median value', 'loquValues.dat' title 'lower quartile', 'MinValues.dat' title 'minimum value', 'MeanValues.dat' using 1 title 'mean value';
set term png
set output 'noises.png'
replot

I get these values and can plot them. However, the tuples from each run get mixed up. The mean of testrun16-17.dat is plotted at x=317, and its min ends up at yet another position.

How can I save the output but keep the tuples together and plot each node at its actual position?


Answer

Windows (and Linux?) might sort (or not sort) a directory listing in its own way. Often the order is plain alphabetical, so testrun16-10 would come before testrun16-2. To eliminate this uncertainty, you can loop over your files by number. However, this assumes that all numbers from 1 to the maximum (= FilesCount, in your case 600) actually exist. You tagged Linux; sorry, I only know Windows, where the command to get a list of only the filenames is 'dir /B testrun16-*'.
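For Linux, a possible equivalent of the Windows listing command is sketched below. It assumes a shell with GNU coreutils, where `ls -v` sorts numbers inside filenames in natural order; this is an assumption about the asker's system, not something tested on it.

```gnuplot
# Assumption: Linux with GNU coreutils ('ls -v' = natural sort of numbers in names)
FileList = system('ls -v testrun16-*')

# Quick check of what was found and which file comes first
print "Files found: ", words(FileList)
print "First file:  ", word(FileList, 1)
```

Even with a sorted list, it is safer to store the file number next to the statistics than to rely on the listing order.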

Is there a special reason why you write the statistics into 6 different files? Why not into one file?

Something like this: (modified after OP comment)

### batch statistics
reset session

FileRootName = 'testrun16'
FileList = system('dir /B '.FileRootName.'-*')
FilesCount =  words(FileList)
print "Files found: ", FilesCount

# function for extracting the number from the filename 
GetFileNumber(s) = int(s[strstrt(s,"-")+1:strstrt(s,".dat")-1])

set print FileRootName.'_Statistics.dat'
    print "File Max UpQ Med LoQ Min Mean"
    do for [FILE in FileList] {
        stats FILE u 1 nooutput
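        # one row per file: number first, then the six statistics
        # (a trailing backslash continues the command on the next line)
        print sprintf("%d %g %g %g %g %g %g", \
        GetFileNumber(FILE), \
        STATS_max, STATS_up_quartile, STATS_median, \
        STATS_lo_quartile, STATS_min, STATS_mean)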
    }
set print

# column 1 is the file number, so every value lands at its own node
plot FileRootName.'_Statistics.dat' \
       u 1:2 title 'maximum value', \
    '' u 1:3 title 'upper quartile', \
    '' u 1:4 title 'median value', \
    '' u 1:5 title 'lower quartile', \
    '' u 1:6 title 'minimum value', \
    '' u 1:7 title 'mean value'
### end of code
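
The loop-by-number alternative mentioned at the start of the answer could look roughly like the sketch below. It assumes that every file from testrun16-1.dat to testrun16-600.dat exists (a missing file makes `stats` stop with an error) and that the files carry the `.dat` extension, as in the filename quoted in the question. The loop variable `i` is introduced here for illustration only.

```gnuplot
### batch statistics, looping by file number (sketch)
FileRootName = 'testrun16'
FilesCount   = 600      # assumption: files 1..600 all exist

set print FileRootName.'_Statistics.dat'
    do for [i=1:FilesCount] {
        # build the filename from the number, e.g. testrun16-17.dat
        FILE = sprintf('%s-%d.dat', FileRootName, i)
        stats FILE u 1 nooutput
        print sprintf("%d %g %g %g %g %g %g", \
            i, STATS_max, STATS_up_quartile, STATS_median, \
            STATS_lo_quartile, STATS_min, STATS_mean)
    }
set print
### end of code
```

Because the rows are written in numerical order, the resulting file can also be plotted with lines without the curve jumping back and forth.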
User contributions licensed under: CC BY-SA