How do I fix ksoftirqd using 100% CPU?

I am doing nothing, yet ksoftirqd uses 100% of my CPU and my PC is very slow. I looked in /proc/interrupts and there are a lot of “local timer interrupts” and “thermal event interrupts”. How do I fix it?

(I run Ubuntu 18.04)

sensors

iwlwifi-virtual-0
Adapter: Virtual device
temp1:        +44.0°C  

dell_smm-virtual-0
Adapter: Virtual device
fan1:        3893 RPM
fan2:        3916 RPM

acpitz-virtual-0
Adapter: Virtual device
temp1:        +25.0°C  (crit = +107.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +100.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +74.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:       +100.0°C  (high = +100.0°C, crit = +100.0°C)
Core 2:        +73.0°C  (high = +100.0°C, crit = +100.0°C)
Core 3:        +78.0°C  (high = +100.0°C, crit = +100.0°C)
Core 4:        +73.0°C  (high = +100.0°C, crit = +100.0°C)
Core 5:        +72.0°C  (high = +100.0°C, crit = +100.0°C)
Core 6:        +74.0°C  (high = +100.0°C, crit = +100.0°C)
Core 7:        +71.0°C  (high = +100.0°C, crit = +100.0°C)

pch_cannonlake-virtual-0
Adapter: Virtual device
temp1:        +63.0°C  

Answer

As you can see from the sensors output, your CPU is running too hot: Core 1 and the package both sit at the +100.0°C "high" limit. In response, the CPU is probably throttling heavily to protect itself. Strangely, only one of the cores seems to be too hot, which is unusual, because the workload is normally spread fairly evenly across the cores.
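To make that reading of the sensors output repeatable, here is a minimal Python sketch that flags every coretemp line whose temperature has reached its "high" threshold. The function name `cores_at_limit` and the regular expression are my own assumptions, written against the output format pasted in the question; other versions of lm-sensors may format these lines differently.

```python
import re

# Matches coretemp lines such as:
#   Core 1:       +100.0°C  (high = +100.0°C, crit = +100.0°C)
# The pattern is an assumption based on the output shown in the question.
LINE_RE = re.compile(
    r"^(?P<label>Package id \d+|Core \d+):\s+\+?(?P<temp>-?[\d.]+)°C"
    r"\s+\(high = \+?(?P<high>-?[\d.]+)°C"
)


def cores_at_limit(sensors_text):
    """Return (label, temperature) for each line at or above its 'high' value."""
    hot = []
    for line in sensors_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and float(m["temp"]) >= float(m["high"]):
            hot.append((m["label"], float(m["temp"])))
    return hot


# A few lines copied from the question's sensors output.
SAMPLE = """\
Package id 0: +100.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +74.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:       +100.0°C  (high = +100.0°C, crit = +100.0°C)
Core 2:        +73.0°C  (high = +100.0°C, crit = +100.0°C)
"""

print(cores_at_limit(SAMPLE))
```

On a live system you could feed it the text printed by `sensors` instead of `SAMPLE`.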

Here is what I’d recommend for debugging and fixing this:

  1. verify that the CPU fan is running fine,
  2. verify that the heatsink and fan are properly mounted on the CPU (no gaps),
  3. verify that the thermal paste between the CPU and the heatsink is sufficient; in my experience this is actually the most likely culprit.

If all of these are fine, then I’m afraid your CPU is broken and needs to be replaced.

But one thing is certain: this is not a software issue.


PS: I think you can ignore the high number of local timer interrupts, because they are not unusual. The thermal event interrupts, on the other hand, are. Here are my current values on a system that shows no issues:

           CPU0       CPU1       CPU2       CPU3       
...
LOC:  254543051  255115593  261569855  252995765   Local timer interrupts
...
TRM:          0          0          0          0   Thermal event interrupts
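If you want to check whether the thermal event counters are still climbing, a small Python sketch like the following could compare two snapshots of /proc/interrupts. The helper names `irq_counts` and `growth` are hypothetical, and the parsing assumes the row layout shown above (a label such as `TRM:` followed by one counter per CPU, then a description).

```python
def irq_counts(interrupts_text, name):
    """Return the per-CPU counters for one row (e.g. 'TRM') of /proc/interrupts text."""
    prefix = name + ":"
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == prefix:
            counts = []
            for tok in fields[1:]:
                if not tok.isdigit():
                    break  # reached the textual description
                counts.append(int(tok))
            return counts
    return []  # row not present


def growth(before, after):
    """Per-CPU increase between two snapshots of the same row."""
    return [b - a for a, b in zip(before, after)]


# The values from the healthy system shown above.
SAMPLE = """\
LOC:  254543051  255115593  261569855  252995765   Local timer interrupts
TRM:          0          0          0          0   Thermal event interrupts
"""

print(irq_counts(SAMPLE, "TRM"))
```

On a real machine you would read /proc/interrupts twice, a few seconds apart, and pass both `TRM` rows to `growth`. Nonzero growth would suggest thermal events are still firing.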