
Kafka broker crashes every day – OOM killer

I have a cluster of 3 Kafka brokers, version 0.10.2.1. Each broker runs on its own host with 2 CPUs / 16 GB RAM. In addition, we use Docker to wrap the broker process.

The problem is as follows: almost every day, at the same time, all of our Kafka clients fail for about 10 minutes. At first I thought it was related to Kafka No broker in ISR for partition, but after a while I discovered that the broker simply crashes because of the OOM killer.

I also played with Xmx and Xms before I discovered that it was the OOM killer. I tried:

-Xmx2048M -Xms2048M

-Xmx4096M -Xms2048M

Same behavior with both.
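For reference, this is roughly how the flags reach the broker in our Docker setup (a sketch only; the image name is a placeholder and the 6g container limit is an assumed value, not something from our configuration):

# kafka-server-start.sh reads the heap size from the KAFKA_HEAP_OPTS environment
# variable, so the -Xmx/-Xms flags are passed into the container that way.
# --memory puts a hard cap on the container so one broker cannot exhaust the host.
docker run -d --name kafka-broker \
  -e KAFKA_HEAP_OPTS="-Xmx4096M -Xms2048M" \
  --memory=6g \
  our-kafka-image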

In addition, we currently don’t have any ulimit configured.
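To see which limits the kernel actually enforces on the running broker (just an illustration; the pgrep pattern assumes the broker’s main class kafka.Kafka appears on the command line):

# print the limits (open files, address space, ...) applied to the broker process
cat /proc/$(pgrep -f kafka.Kafka)/limits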


less kern.log

LOGS:


And ….

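The OOM events themselves can be confirmed like this (a sketch, assuming the default Ubuntu kernel log location):

# the OOM killer writes "Out of memory" / "Killed process" lines to the kernel log
grep -iE "out of memory|killed process" /var/log/kern.log
# same information from the kernel ring buffer, with readable timestamps
dmesg -T | grep -iE "out of memory|killed process"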

Any suggestions on how to approach this?


Answer

We found the problem. First I will say that adding more RAM to the machine also solved it, but that is an “expensive solution”.

The problem was as follows: since I was running the Ubuntu distribution on EC2, the daily crontabs ran on all the machines of my cluster at exactly the same time. One of the scripts was mlocate, and this script apparently took too many resources.

I assume that since the whole Kafka cluster was struggling with I/O and memory at the same time, the brokers tried to use more memory and the OOM killer killed them. With 2 of my 3 brokers down, some services went down as well.

So the solution was:

  1. Change the crontab so that the daily jobs run at a different hour of the day on each broker.

  2. Disable mlocate (a sketch of both changes is shown below).
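On Ubuntu both changes look roughly like this (a sketch; the paths are the stock Ubuntu ones and the 03:25 time is only an example, not the schedule we actually chose):

# 1. Stagger cron.daily per broker: in /etc/crontab, change the hour on the
#    cron.daily line so every broker gets a different slot, e.g. on broker 1:
#      25 3 * * *  root  test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
#    (broker 2 at 04:25, broker 3 at 05:25)

# 2. Disable the daily mlocate/updatedb job; run-parts skips non-executable scripts
sudo chmod -x /etc/cron.daily/mlocate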
