
The best way to authorize ssh key of each node to all nodes in the cluster

I want to create a cluster infrastructure in which each node communicates with the others over SSH. I want to use Ansible to create an idempotent playbook/role that can be executed when the cluster is initialized or when new nodes are added to it. I could think of two scenarios to achieve this.

 First Scenario

  • Task 1 fetches the SSH public key from a node (probably assigning it to a variable or writing it to a file).
  • Task 2, executed locally, loops over the other nodes and authorizes the first node on each of them with the fetched key.

This scenario supports the free strategy: tasks can be executed without waiting for all hosts. However, it requires every node to already have the relevant user and public key. If the users are created within the same playbook, then under the free strategy some nodes may not have their user yet when task 2 starts running.

Although I am a big fan of the free strategy, I didn't implement this scenario for efficiency reasons: it makes n^2 connections for an n-node cluster.

 Second Scenario

  • Task 1 fetches the SSH public key from every node in turn and writes each one to a file named after ansible_hostname.
  • Task 2, executed locally, loops over the other nodes and authorizes all keys.

This scenario only supports the linear strategy. You can create the users within the same playbook because, with the linear strategy, all users are created before task 1 starts running.

I think this is an efficient scenario: it makes only 2n connections for an n-node cluster. I implemented it; the snippet I wrote is below.

---
- name: create node user
  user:
    name: "{{ node_user }}"
    password: "{{ node_user_pass |password_hash('sha512') }}"
    shell: /bin/bash
    create_home: yes
    generate_ssh_key: yes

- name: fetch all public keys from managed nodes to manager
  fetch: 
    src: "/home/{{ node_user }}/.ssh/id_rsa.pub"
    dest: "tmp/{{ ansible_hostname }}-id_rsa.pub"
    flat: yes

- name: authorize public key for all nodes
  authorized_key:
    user: "{{ node_user }}"
    key: "{{ lookup('file', 'tmp/' ~ item ~ '-id_rsa.pub') }}"
    state: present
  with_items:
    - "{{ groups['cluster_node'] }}"

- name: remove local public key copies
  become: false
  local_action: file dest='tmp/' state=absent
  changed_when: false
  run_once: true

Maybe I could use lineinfile instead of fetch, but other than that I don't know whether this is the right way. It takes very long as the cluster grows (because of the linear strategy). Is there a more efficient way that I can use?

Answer

When Ansible loops through authorized_key, it will (roughly) perform the following tasks:

  1. Create a temporary authorized_key Python script on the control node
  2. Copy the new authorized_key Python script to the managed node
  3. Run the authorized_key Python script on the managed node with the appropriate parameters

This scales as n^2 with the number of managed nodes: with 1000 boxes, this task is performed 1000 times per box, or 1,000,000 times in total.

I’m having trouble finding specific docs that explain exactly what’s going on under the hood, so I’d recommend running an example script to get a feel for it:

- hosts: all
  tasks:
    - name: do thing
      shell: "echo 'hello this is {{ item }}'"
      with_items:
        - alice
        - brian
        - charlie

This should be run with the triple verbose flag (-vvv) and with the output redirected to a log file (e.g. ansible-playbook example-loop.yml -i hosts.yml -vvv > example-loop-output.log). Searching through that log for command.py and sftp will help you get a feel for how your script scales as the list retrieved by "{{ groups['cluster_node'] }}" grows.

For small clusters, this inefficiency is perfectly acceptable. However, it may become problematic on large clusters.

Now, the authorized_key module is essentially just generating an authorized_keys file with a) the keys which already exist within authorized_keys and b) the public keys of each node on the cluster. Instead of repeatedly generating an authorized_keys file on each box individually, we can construct the authorized_keys file on the control node and deploy it to each box.

The authorized_keys file itself can be generated with assemble; this will take all of the gathered keys and concatenate them into a single file. However, if we just synchronize or copy this file over, we’ll wipe out any non-cluster keys added to authorized_keys. To avoid this, we can use blockinfile. blockinfile can manage the cluster keys added by Ansible. We’ll be able to add new keys while removing those which are outdated.

- hosts: cluster
  name: create node user and generate keys
  tasks:
    - name: create node user
      user:
        name: "{{ node_user }}"
        password: "{{ node_user_pass |password_hash('sha512') }}"
        shell: /bin/bash
        create_home: yes
        generate_ssh_key: yes

    - name: fetch all public keys from managed nodes to manager
      fetch:
        src: "/home/{{ node_user }}/.ssh/id_rsa.pub"
        dest: "/tmp/keys/{{ ansible_host }}-id_rsa.pub"
        flat: yes
  become: yes

- hosts: localhost
  name: generate authorized_keys file
  tasks:
    - name: Assemble authorized_keys from a directory
      assemble:
        src: "/tmp/keys"
        dest: "/tmp/authorized_keys"

- hosts: cluster
  name: update authorized_keys file
  tasks:
    - name: insert/update configuration using a local file
      blockinfile:
        block: "{{ lookup('file', '/tmp/authorized_keys') }}"
        dest: "/home/{{ node_user }}/.ssh/authorized_keys"
        backup: yes
        create: yes
        owner: "{{ node_user }}"
        group: "{{ node_group }}"
        mode: "0600"
  become: yes
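
As a sketch of how blockinfile keeps the cluster keys apart from any other keys: the module wraps the inserted text between a BEGIN and an END marker line, and on later runs it only rewrites what sits between those two lines. By default the marker is "# {mark} ANSIBLE MANAGED BLOCK"; the variant below sets an explicit one. The marker text is an illustrative choice and not part of the playbook above, and node_user / node_group are the same variables the playbook already assumes are defined.

```yaml
# Sketch only: same task as above, with an explicit (hypothetical) marker text.
- name: insert/update cluster keys inside a labelled block
  blockinfile:
    path: "/home/{{ node_user }}/.ssh/authorized_keys"
    block: "{{ lookup('file', '/tmp/authorized_keys') }}"
    # {mark} is replaced with BEGIN and END by blockinfile; lines starting
    # with '#' are treated as comments in authorized_keys.
    marker: "# {mark} CLUSTER NODE KEYS (managed by Ansible)"
    create: yes
    owner: "{{ node_user }}"
    group: "{{ node_group }}"
    mode: "0600"
```

Keys outside the two marker lines are left untouched. If the marker text is changed after a first run, the block written under the old marker stays in the file, so it is probably best to pick the marker once and keep it.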

As-is, this solution isn’t easily compatible with roles; a role runs inside a single play and so only sees that play's value for hosts (a host, group, set of groups, etc.), while the above solution requires switching between a group and localhost.

We can remedy this with delegate_to, although it may be somewhat inefficient with large clusters, as each node in the cluster will try assembling authorized_keys. Depending on the overall structure of the Ansible project (and the size of the team working on it), this may or may not be ideal; when skimming a large script with delegate_to, it can be easy to miss that something’s being performed locally.

- hosts: cluster
  name: create node user and generate keys
  tasks:
    - name: create node user
      user:
        name: "{{ node_user }}"
        password: "{{ node_user_pass |password_hash('sha512') }}"
        shell: /bin/bash
        create_home: yes
        generate_ssh_key: yes

    - name: fetch all public keys from managed nodes to manager
      fetch:
        src: "/home/{{ node_user }}/.ssh/id_rsa.pub"
        dest: "/tmp/keys/{{ ansible_host }}-id_rsa.pub"
        flat: yes

    - name: Assemble authorized_keys from a directory
      assemble:
        src: "/tmp/keys"
        dest: "/tmp/authorized_keys"
      delegate_to: localhost

    - name: insert/update configuration using a local file
      blockinfile:
        block: "{{ lookup('file', '/tmp/authorized_keys') }}"
        dest: "/home/{{ node_user }}/.ssh/authorized_keys"
        backup: yes
        create: yes
        owner: "{{ node_user }}"
        group: "{{ node_group }}"
        mode: "0600"
  become: yes
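
One possible way to reduce the repeated assembling mentioned above is to add run_once to the delegated task, so that it is executed a single time per play rather than once per node. This is a sketch under assumptions, not something I have benchmarked: it relies on the default linear strategy, where every host finishes the fetch task before the next task starts, and the become: false line is my own addition on the assumption that the fetched files under /tmp/keys are readable by the user running Ansible.

```yaml
# Sketch only: drop-in replacement for the assemble task in the play above.
- name: Assemble authorized_keys from a directory
  assemble:
    src: "/tmp/keys"
    dest: "/tmp/authorized_keys"
  delegate_to: localhost
  run_once: true    # execute once for the whole batch instead of once per node
  become: false     # assumption: no privilege escalation needed on the control node
```

With the free strategy this would not be safe, since the single run could happen before every node's key has been fetched.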