gathering public SSH keys from the AWS System Log and creating custom SSH host entries using Ansible

what is this and why?

I work in private VPCs in AWS a lot. I’m testing, debugging, and fidgeting with instances, requiring SSH access. If DNS isn’t configured, I’ve then had to look up IP addresses for instances through the AWS console (or just memorize the IP). After that, there’s the SSH public host key fingerprint. I either blindly accept it or manually look it up. If I terminate and redeploy the instance, there’s a chance I’ll get that SSH WARNING!, where I’ll need to delete the known_hosts entry.

It was annoying and tedious and manual and of questionable security. I needed a way to automate a human-friendly hostname (especially if DNS isn’t configured internally) and gather those SSH public host keys. That’s what I set out to do.

This playbook configures the ~/.ssh/config file, allowing ZSH (Bash supports this too, I think) to auto-complete using the AWS instance name. It also imports every public SSH fingerprint/host key it can find through the AWS System Log. This should achieve the goals of:

  • Ease of use when SSH’ing to instances. Removing the need to memorize the IP addresses in a VPC or go to the AWS console and look up an instance’s IP addresses.
  • A secure method for accepting host fingerprints. SSH’ing to an instance and blindly accepting the public host fingerprint could be considered as on-par with clicking through a self-signed certificate. This securely (through an authenticated interface) captures the instance’s public host key and seeds the known_hosts file with those keys.
  • And the final kicker, manually approving all those SSH keys can be tedious. This playbook automates that process.

introducing the SSH host scanner

And thus, the Ansible playbook localhost_ssh_scan.yml was born. It’s a sweet little playbook that does the following:

  • gathers EC2 remote facts using the Ansible module ec2_remote_facts.
  • configures an SSH host entry in the .ssh/config file for each AWS host found (and with ZSH, you have tab auto-complete for those hostnames).
  • scrapes the AWS System Log for the SSH public key of every instance in the region specified.
  • adds those SSH public keys to the .ssh/known_hosts file.

There are a few more touches to that but they will be covered in a bit. Requirements include:

  • ZSH
  • Ansible 2.x
  • An AWS account with instances
  • AWS access/secret credentials configured (either through config files or environment variables)
  • No fear of an automated script touching your ~/.ssh/known_hosts and ~/.ssh/config files. Use at your own risk and make backups and all that. :wink:
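On the backup point, a timestamped copy of both files before each run is cheap insurance. The following is only a sketch: backup_ssh_files is a hypothetical helper name (it is not part of the playbook), and it assumes the files live in ~/.ssh unless another directory is passed in.

```shell
# backup_ssh_files: copy config and known_hosts to timestamped .bak files.
# Hypothetical helper for illustration; pass a directory to override ~/.ssh.
backup_ssh_files() {
  ssh_dir="${1:-$HOME/.ssh}"
  stamp="$(date +%Y%m%d-%H%M%S)"
  for f in config known_hosts; do
    # Only copy files that actually exist; -p keeps mode and timestamps.
    if [ -f "$ssh_dir/$f" ]; then
      cp -p "$ssh_dir/$f" "$ssh_dir/$f.$stamp.bak"
    fi
  done
}
```

Calling backup_ssh_files with no arguments right before ansible-playbook leaves something to roll back to if a generated entry goes wrong.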

Take a look at the playbook here. Note that normally I create roles for playbooks. This was whipped up real quick like. Maybe one day I’ll rip it up into roles.

Running it is as simple as:

ansible-playbook localhost_ssh_config.yml -e region=us-east-1

And this is what it does…

first up - get all the EC2 instances

First up, gathering the running EC2 instances. Super useful Ansible module:

    - name: Gather EC2 remote facts.
      ec2_remote_facts:
        region: "{{ region }}"
        filters:
          instance-state-name: running
      register: ec2_remote_facts

Now we have instances to add.

configure SSH hosts

In order to easily tab complete and SSH to say, host dev-myserver-1, SSH hostname entries are used. For example, something like this:


Host dev-myserver-1
  HostName 192.168.1.5
  ServerAliveInterval 30

makes it very trivial to ssh user@dev-myserver-1.

And that’s exactly what this Ansible blockinfile task performs against the ~/.ssh/config file. The first step here is to remove old instances/blocks from the file, in case an instance has been modified or removed:


    - name: Remove previous SSH config settings.
      blockinfile:
        dest: "{{ ansible_env.HOME }}/.ssh/config"
        marker: "# {mark} {{ region }} - {{ item.tags.Name.replace(' ','_') }} - ansible generated"
        state: absent
      with_items: "{{ ec2_remote_facts.instances }}"
      when: item.tags.Name is defined and wipe is not defined

After clearing out the entries, the existing AWS instances are added and host blocks are created. These shortcuts allow us to translate AWS instance Names to IP addresses:

    - name: Configure SSH proxy host.
      blockinfile:
        dest: "{{ ansible_env.HOME }}/.ssh/config"
        marker: "# {mark} {{ region }} - {{ item.tags.Name.replace(' ','_') }} - ansible generated"
        block: |
          Host {{ item.tags.Name.replace(' ','_') }}
            HostName {{ item.private_ip_address }}
            ServerAliveInterval 30
      with_items: "{{ ec2_remote_facts.instances }}"
      when: item.tags.Name is defined and item.private_ip_address and wipe is not defined

For example, the above will produce the following in the ~/.ssh/config:

# BEGIN us-east-1 - dev-myserver-1 - ansible generated
Host dev-myserver-1
  HostName 192.168.1.5
  ServerAliveInterval 30
# END us-east-1 - dev-myserver-1 - ansible generated

This creates a clear way to SSH to an instance (ssh dev-myserver-1), as opposed to memorizing or looking up an IP address. Coupled with some autocomplete features (like the ones provided with ZSH and oh-my-zsh), this makes for an easy way to remember and interact with instances.
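As a quick sanity check, the aliases that ended up in the config file can be listed straight from it. This is a minimal sketch and not part of the playbook: list_ssh_hosts is a hypothetical helper name, and it assumes the keyword is spelled "Host" exactly as the generated blocks spell it.

```shell
# list_ssh_hosts: print every alias declared on a "Host" line of an SSH config.
# Hypothetical helper for illustration; wildcard patterns such as "*" are skipped.
list_ssh_hosts() {
  config_file="${1:-$HOME/.ssh/config}"
  awk '$1 == "Host" {
         for (i = 2; i <= NF; i++)
           if ($i !~ /[*?!]/) print $i
       }' "$config_file"
}

# Demo against a throwaway file modelled on the generated block above.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
# BEGIN us-east-1 - dev-myserver-1 - ansible generated
Host dev-myserver-1
  HostName 192.168.1.5
  ServerAliveInterval 30
# END us-east-1 - dev-myserver-1 - ansible generated
Host *
  ServerAliveInterval 60
EOF
list_ssh_hosts "$demo_config"   # prints: dev-myserver-1
rm -f "$demo_config"
```

Run without an argument, it reads ~/.ssh/config, which gives a quick view of the names available to ssh.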

scraping the AWS System Log

I’ve spoken to this before, but I use the system log as a nice secure way to collect the SSH public key from instances. I’m a huge fan. Considering I’m often creating/destroying instances, scraping this is supremely useful.

For this to work, the SSH public key has to be in the AWS System Log. This occurs under two conditions:

  • the instance has started up for the first time.
  • the instance is configured to output the SSH public key to the AWS System Log.

The second one is a bit of a customization. When an AWS instance is first initialized, it outputs the SSH public key to the system log. That’s great. Unfortunately after subsequent reboots, it no longer generates a new key and the public key is no longer written to the AWS System Log. To make sure that after every reboot the public key is available, two startup scripts are modified: /etc/rc.local and /var/lib/cloud/scripts/per-boot/ssh-to-system-log.sh.
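The contents of that per-boot script are not shown here, so the following is only a sketch of what such a script could look like. It assumes the host keys live in /etc/ssh and that text written to /dev/console ends up in the AWS System Log. The function name, the banner lines, and the WRITE_TO_CONSOLE switch are all illustrative and not taken from the playbook.

```shell
#!/bin/sh
# Sketch of a per-boot script such as
# /var/lib/cloud/scripts/per-boot/ssh-to-system-log.sh (contents assumed).

# print_host_keys: emit each public host key found in a directory, one per line,
# wrapped in banner lines so the keys are easy to spot in the log.
print_host_keys() {
  key_dir="${1:-/etc/ssh}"
  echo "-----BEGIN SSH HOST KEY KEYS-----"
  for pub in "$key_dir"/ssh_host_*_key.pub; do
    # Skip anything unreadable (including an unmatched glob).
    [ -r "$pub" ] && cat "$pub"
  done
  echo "-----END SSH HOST KEY KEYS-----"
}

# On an instance, send the keys to the console so they show up in the system log.
# WRITE_TO_CONSOLE is a hypothetical guard so the sketch is harmless elsewhere.
if [ "${WRITE_TO_CONSOLE:-no}" = "yes" ] && [ -w /dev/console ]; then
  print_host_keys /etc/ssh > /dev/console
fi
```

A real script would likely drop the guard and write to the console unconditionally on every boot.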

And now, the public key can be scraped:

    - name: Get SSH key from system log.
      # Keep only the base64 blob of the most recent ECDSA host key in the console output.
      shell: |
        aws ec2 get-console-output \
          --region {{ region }} \
          --instance-id {{ item.id }} \
          --output text \
          | sed -n 's/^.*\(ecdsa-sha2-nistp256 \)\(.*\)/\2/p' \
          | tail -n 1 \
          | awk '{print $1}' \
          | awk '{print substr($0,1,140)}'
      register: host_key_info
      with_items: "{{ ec2_remote_facts.instances }}"
      when: item.tags.Name is defined and wipe is not defined

All SSH keys are now registered in the variable host_key_info. Time to dump them into the known_hosts file.
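To see what the sed/tail/awk pipeline in that task actually pulls out, it helps to run the same steps against canned text instead of a live aws call. The sketch below does that; extract_host_key is a hypothetical wrapper name and the sample log lines are made up for illustration.

```shell
# extract_host_key: read console output on stdin, print the newest ECDSA key blob.
extract_host_key() {
  # 1. sed keeps only lines containing "ecdsa-sha2-nistp256 " and strips
  #    everything up to and including that key type.
  # 2. tail keeps the last such line (the most recent boot).
  # 3. awk drops the trailing comment (e.g. root@host), leaving the base64 blob.
  # 4. The final awk trims to 140 characters, the base64 length of a
  #    nistp256 public key blob (104 bytes).
  sed -n 's/^.*\(ecdsa-sha2-nistp256 \)\(.*\)/\2/p' \
    | tail -n 1 \
    | awk '{print $1}' \
    | awk '{print substr($0,1,140)}'
}

# Made-up console output: two boots, each printing an ECDSA key line.
sample_log='ec2: -----BEGIN SSH HOST KEY KEYS-----
ec2: ecdsa-sha2-nistp256 AAAAE2VjZHNhExampleKeyOne= root@ip-192-168-1-5
ec2: ssh-rsa AAAAB3NzaExampleRsaKey root@ip-192-168-1-5
ec2: -----END SSH HOST KEY KEYS-----
ec2: ecdsa-sha2-nistp256 AAAAE2VjZHNhExampleKeyTwo= root@ip-192-168-1-5'

printf '%s\n' "$sample_log" | extract_host_key   # prints: AAAAE2VjZHNhExampleKeyTwo=
```

When the log contains no ECDSA key line the function prints nothing, which is why the import step has to skip empty results.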

creating the known_hosts entries

First step, clear out old entries. The Ansible lineinfile module handles this, finding all of the markers using a regex:

    - name: Clean all environment created SSH fingerprints.
      lineinfile:
        dest: "{{ ansible_env.HOME }}/.ssh/known_hosts"
        regexp: "^.*#\\s{{ region }} - {{ item.tags.Name.replace(' ','_') }} - ansible generated$"
        state: absent
      with_items: "{{ ec2_remote_facts.instances }}"
      when: item.tags.Name is defined and wipe is not defined

Using the parallel lists ec2_remote_facts.instances and host_key_info.results, the AWS instance name, private DNS name, private IP address, public IP address, and the discovered SSH public key are used to construct a known_hosts entry:

    - name: Import SSH fingerprints - private IP addresses for items found in the AWS system console.
      lineinfile:
        dest: "{{ ansible_env.HOME }}/.ssh/known_hosts"
        line: "{{ item.0.tags.Name.replace(' ','_') }},{{ item.0.private_dns_name }},{{ item.0.private_ip_address }}{% if item.0.public_ip_address is defined %},{{ item.0.public_ip_address }}{% endif %} ecdsa-sha2-nistp256 {{ item.1.stdout }} # {{ region }} - {{ item.0.tags.Name.replace(' ','_') }} - ansible generated"
      with_together:
        - "{{ ec2_remote_facts.instances }}"
        - "{{ host_key_info.results }}"
      when: wipe is not defined and item.1.stdout is defined and item.1.stdout != ''

For the region us-east-1, this creates an entry like:

dev-myserver-1,ip-192-168-1-5.ec2.internal,192.168.1.5, ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBO3V70mRnDaQNIxbDwOBdS2XY3zcHP4ziEFTG5iaGqSQrnUmhC8CFgwvFXpZY4y3thJAuYu4gX7tMFmmd+gp7qI= # us-east-1 - dev-myserver-1 - ansible generated

Now there is a human readable hostname for SSH, complete with the SSH public key securely added to the known_hosts file.

full cleanup option

In case there is a need to just wipe the slate clean, there are a few wipe conditionals. This basically just removes all the entries that the playbook has added, based on the region marker.

ansible-playbook localhost_ssh_config.yml -e "region=us-east-1 wipe=yes"
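The wipe tasks themselves are not shown in this post. As a rough idea of what removing entries by region marker amounts to, here is a shell sketch of the same filtering. It is not the playbook’s implementation; both function names are hypothetical, and each writes the filtered file to stdout rather than editing in place.

```shell
# strip_generated_blocks: drop every "# BEGIN <region> - ... - ansible generated"
# through "# END <region> - ... - ansible generated" block from an SSH config.
strip_generated_blocks() {
  region="$1"; file="$2"
  sed "/^# BEGIN ${region} - .* - ansible generated\$/,/^# END ${region} - .* - ansible generated\$/d" "$file"
}

# strip_generated_keys: drop every known_hosts line that ends with the
# "# <region> - ... - ansible generated" marker comment.
strip_generated_keys() {
  region="$1"; file="$2"
  grep -v "# ${region} - .* - ansible generated\$" "$file"
}
```

For example, strip_generated_keys us-east-1 ~/.ssh/known_hosts > known_hosts.new leaves a copy without the generated us-east-1 lines, which can be reviewed before replacing the original.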

conclusion

With this playbook, a user can now SSH to instances using the more human-friendly AWS names. Tab completion with ZSH makes it even more useful. It gathers these in a relatively secure fashion and makes trusting new (and existing) hosts an automated process.

In the future, I’d like to (re)add the custom marker functionality. For now, I ripped it out to keep the playbook mildly simpler.

Improvements I’d like to make include:

  • The cleanup steps are a bit kludgy. It’s possible they are not needed. Refactor them.
  • Add a marker variable, allowing for a customized labeling mechanism.

Cheers, - b

links