Track down high CPU load average

I am trying to understand what is going on with my server. It is a 2-CPU server:

$> grep 'model name' /proc/cpuinfo | wc -l
2

Yet the load average shows a queue of about 8:

$> uptime
16:31:30 up 123 days,  9:04,  1 user,  load average: 8.37, 8.48, 8.55

So you can assume the load is really high and things are piling up: there is sustained load on the system, not just a spike. However, looking at the top CPU consumers:

$> ps -eo pcpu,pid,user,args | sort -k 1 -r | head -6
%CPU   PID USER     COMMAND
 8.3 27187 ****     server_process_c
 1.0 22248 ****     server_process_b
 0.5 22282 ****     server_process_a
 0.0 31167 root     head -6
 0.0 31166 root     sort -k 1 -r
 0.0 31165 root     ps -eo pcpu,pid,user,args

Results of free command:

             total       used       free     shared    buffers     cached
Mem:          7986       7934         52          0          9       2446
-/+ buffers/cache:       5478       2508
Swap:        17407         60      17347

This is the picture on an ongoing basis, i.e. not even a single CPU is fully used; the top consumer is always at about 8.5%.

My question: how can I track down the root cause of the high load?

Answer

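Based on your free output, there are times when system memory runs short, so swap is used (see the Swap row, used = 60). Free memory is almost zero (52 MB out of 7986 MB), although part of the used figure is buffers and cache that the kernel can reclaim, which is what the -/+ buffers/cache row accounts for. It suggests there are times when nearly all physical RAM is consumed.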
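
As a sanity check on those figures, here is a minimal sketch of the arithmetic behind the -/+ buffers/cache row, using only the numbers from the question's free output (the variable names are mine, not part of free):

```shell
# Values in MB, copied from the Mem: row of the free output in the question.
used=7934; mem_free=52; buffers=9; cached=2446

# Memory actually held by applications: used minus buffers and page cache.
app_used=$(( used - buffers - cached ))

# Memory that can be handed to applications: free plus buffers and page cache.
reclaimable=$(( mem_free + buffers + cached ))

echo "application memory: ${app_used} MB"
echo "available memory:   ${reclaimable} MB"
```

This prints 5479 MB and 2507 MB, which agrees with the 5478 / 2508 shown in the -/+ buffers/cache row up to rounding.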

On a server, avoid swapping as much as possible: page faults that move data between system memory and swap space (in either direction) are expensive, because accessing a hard drive is far slower than accessing RAM.
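
To see whether the box is swapping right now, rather than having swapped at some point in the past, one option is to watch the kernel's cumulative swap counters. This is a rough sketch assuming a Linux kernel with swap support, where /proc/vmstat exposes the pswpin and pswpout fields; swap_counters is a helper name made up for this example:

```shell
# swap_counters FILE: print the cumulative number of pages swapped in and out.
swap_counters() {
    awk '$1 == "pswpin" || $1 == "pswpout" { print $1, $2 }' "$1"
}

# On a live Linux system, sample the counters from /proc/vmstat.
if [ -r /proc/vmstat ]; then
    swap_counters /proc/vmstat
fi
```

Run it twice a few seconds apart. If the numbers do not change, the 60 MB in swap is probably left over from an earlier moment of memory pressure and is not what drives the load now.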

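In your top output, look at the wa (I/O wait) value in the Cpu(s) summary line. A high percentage means the CPUs spend more time waiting for disk I/O than doing useful computation. On Linux, processes blocked on I/O also count toward the load average, which would explain a high load with low CPU usage.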

Cpu(s): 87.3%us,  1.2%sy,  0.0%ni, 27.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
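
Besides the wa figure, it can help to list the processes that are stuck in uninterruptible sleep (state D), since on Linux they add to the load average without using any CPU. A small sketch, assuming a procps-style ps; only_blocked is a helper name invented here:

```shell
# only_blocked: keep the header line plus rows whose state starts with D
# (uninterruptible sleep, usually waiting on disk or network I/O).
only_blocked() {
    awk 'NR == 1 || $1 ~ /^D/'
}

# State is printed first so the filter can match on column 1.
ps -eo state,pid,user,args | only_blocked
```

If the same processes show up in state D across several runs, they are the first candidates to investigate.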

Try to stop daemons or services that you do not need in order to reduce the memory footprint, and consider adding more RAM to the system.

For a 2-CPU server, the ideal load is below 2.0 (below 1.0 per CPU). A load of 8.0 means roughly 4.0 per CPU, which is far too high.
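
The per-CPU figure is simply the load average divided by the number of CPUs. A minimal sketch of that calculation follows; per_cpu_load is a hypothetical helper, and the live part assumes Linux with /proc/loadavg and the coreutils nproc command available:

```shell
# per_cpu_load LOAD NCPU: print the load average divided by the CPU count.
per_cpu_load() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.1f\n", load / ncpu }'
}

# Numbers from the question: 1-minute load of 8.37 on 2 CPUs.
per_cpu_load 8.37 2

# On a live Linux system, use the current 1-minute load and CPU count instead.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    per_cpu_load "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
fi
```

With the values from the question this prints 4.2, i.e. about four tasks running or waiting per CPU.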
