
Changing the Kafka host name entry in ZooKeeper and persisting it across Storm topology restarts

Background

  • 6 node Kafka Cluster
  • 3 node Zookeeper Cluster
  • 3 node Nimbus Cluster
  • Apache Storm worker hosts, scaled dynamically using an Amazon Spot Fleet

Scenario

For a given topology and one of the partitions it subscribes to, the ZooKeeper entry looks as follows:

{"topology":{"id":"Topology_Name-25-1520374231","name":"Topology_Name"},"offset":217233,"partition":0,"broker":{"host":"Zk_host_name","port":9092},"topic":"topic1"}

For the worker hosts to resolve Zk_host_name, a mapping of the form "ip Zk_host_name" is added to the /etc/hosts file on each worker host.

We then decided to move to Route 53, the DNS management service provided by AWS. That way a fixed name such as QA-ZK-Host1 can be mapped to the corresponding IP, and the IP can be changed later without touching the name.

For consistency, the original entry above needed to be changed. The corresponding topology was stopped to avoid ongoing changes to the offset, and the host name was changed using the set command:

set /node_path {"topology":{"id":"Topology_Name-25-1520374231","name":"Topology_Name"},"offset":217233,"partition":0,"broker":{"host":"QA-ZK-Host1","port":9092},"topic":"topic1"}
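Editing the JSON by hand is error-prone, so the payload for the set command could also be generated from the original entry. The following is a minimal Python sketch that only rewrites the broker host. The helper name with_broker_host is made up for illustration, and the code does not talk to ZooKeeper itself.

```python
import json

# Original node value, copied from the entry shown in the question.
ORIGINAL = (
    '{"topology":{"id":"Topology_Name-25-1520374231","name":"Topology_Name"},'
    '"offset":217233,"partition":0,'
    '"broker":{"host":"Zk_host_name","port":9092},"topic":"topic1"}'
)


def with_broker_host(node_value: str, new_host: str) -> str:
    """Return the node value with only broker.host replaced (hypothetical helper)."""
    entry = json.loads(node_value)
    entry["broker"]["host"] = new_host
    # Compact separators keep the same no-whitespace layout as the original entry.
    return json.dumps(entry, separators=(",", ":"))


if __name__ == "__main__":
    # Print the payload that would follow "set /node_path".
    print(with_broker_host(ORIGINAL, "QA-ZK-Host1"))
```

Every other field, including the offset, is carried over unchanged, so the only difference from the stored value is the host name.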

Problem

The above command works, and a get on the path returns the changed value. But the moment the topology is restarted, the old name is restored. How can the change be made to persist across topology restarts?


Answer

The object you are referencing is written to Storm’s ZooKeeper here: https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/PartitionManager.java#L341

The “broker” property is created at https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/DynamicBrokersReader.java#L186. As you can see, the host property is not your ZooKeeper host, but the host running Kafka. The value is read from Kafka’s ZooKeeper (see point 3 at https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper), which is why a manual edit to the spout’s node gets overwritten once the topology is running again.
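To make the data flow concrete, here is a rough Python sketch of what that lookup amounts to. It is not Storm's code: the sample registration payload is hypothetical (modelled on the /brokers/ids/<broker id> layout described on the wiki page), and broker_from_registration is an illustrative name.

```python
import json

# Hypothetical example of what a Kafka broker registers under
# /brokers/ids/<broker id> in Kafka's ZooKeeper (values are made up).
SAMPLE_REGISTRATION = (
    b'{"jmx_port":-1,"timestamp":"1520374000000",'
    b'"host":"Zk_host_name","version":1,"port":9092}'
)


def broker_from_registration(data: bytes) -> dict:
    """Extract the host/port pair that ends up in the spout's "broker" property."""
    registration = json.loads(data.decode("utf-8"))
    return {"host": registration["host"], "port": int(registration["port"])}


if __name__ == "__main__":
    # Whatever the broker registers here is what the spout stores,
    # regardless of any manual edit made to the spout's own node.
    print(broker_from_registration(SAMPLE_REGISTRATION))
```

The point of the sketch is that the host name originates in the broker's own registration, so that is the place to change it.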

If you want to change the value, you’ll likely need to do it in Kafka. Take a look at http://kafka.apache.org/090/documentation.html (or the documentation for whatever version you’re using) and search for “advertised.host.name”; I think that’s the setting you want to change.
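As a quick way to see what a broker is configured to advertise, the sketch below parses the text of a server.properties file. It assumes the fallback described in the Kafka 0.9 documentation (advertised.host.name falls back to host.name when unset). The sample file contents and the function names are hypothetical, and later Kafka versions deprecate this setting in favour of advertised.listeners.

```python
from typing import Optional


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def advertised_host(props: dict) -> Optional[str]:
    """Return the configured advertised host, or None when neither setting is
    present and the broker would use the machine's canonical host name."""
    return props.get("advertised.host.name") or props.get("host.name")


if __name__ == "__main__":
    # Hypothetical server.properties excerpt; the values are made up.
    sample = """
    broker.id=1
    host.name=10.0.0.12
    advertised.host.name=QA-ZK-Host1
    """
    print(advertised_host(parse_properties(sample)))
```

Running this against each broker's configuration would show which name is published to Kafka's ZooKeeper.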
