Automatically populating .ssh/known_hosts
One of the more irritating things about working with virtual machines is SSH host keys. Launch a new virtual machine. Get a new host key generated. Try to SSH in. Get a pesky warning message telling you that the authenticity of the host can't be established. Find the host key fingerprint in the virtual machine's console logs. Eyeball the two 32-character hexadecimal strings. Type "yes" and hope that they really were the same and not just mostly the same. Of course, if you don't care about security you could arrange for all your virtual machines to use the same host key, or use the -o StrictHostKeyChecking=no option; but as the FreeBSD Security Officer and the author of a secure online backup service, neither of those is acceptable as far as I'm concerned.
My work on FreeBSD AMIs for EC2 has made me even more sensitive to the irritation of host key checking, since building a set of AMIs for the 7 EC2 regions involves launching and SSHing into no fewer than 20 virtual machines. A couple of weeks ago I asked Twitter for advice about this; ten people replied, and two of them (Daniel Shahaf and Markus Friedl) made the critical observation that I wanted to use two tools: ssh-keyscan, to get a host key in a form suitable for the known_hosts file; and ssh-keygen -lf, to take the host key from that form and convert it into a fingerprint I could compare against a known good value.
At that point I got busy with other things (most notably final preparations for the FreeBSD 9.0-RELEASE announcement), but on Sunday evening I sat down and wrote a much-needed shell script:
    # ssh-knownhost hostname [fingerprint ...]
The ssh-knownhost script uses ssh-keyscan to download all the host keys for the specified hostname; uses ssh-keygen to compute their fingerprints; compares them to the list of fingerprints provided on the command line; and adds any new host keys to ~/.ssh/known_hosts. Short, simple, and effective.
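For readers who want to see the shape of those steps, here is a minimal sketch. It is not the ssh-knownhost script itself: the function names fingerprint_is_trusted and add_known_host are made up for illustration, and it assumes the fingerprint is the second field of the ssh-keygen -lf output. Recent OpenSSH releases print SHA256 fingerprints by default, so to compare against the colon-separated hexadecimal form you would probably need to add -E md5 and allow for the "MD5:" prefix it emits.

```shell
#!/bin/sh
# Sketch only; not the author's ssh-knownhost script.

# fingerprint_is_trusted FP [TRUSTED ...]
# Succeeds only if FP exactly matches one of the trusted fingerprints.
fingerprint_is_trusted() {
    fp=$1
    shift
    for trusted in "$@"; do
        [ "$fp" = "$trusted" ] && return 0
    done
    return 1
}

# add_known_host HOST [TRUSTED ...]   (hypothetical driver function)
# Scans HOST, fingerprints each key, and appends the trusted ones that
# are not already present in ~/.ssh/known_hosts.
add_known_host() {
    host=$1
    shift
    known="$HOME/.ssh/known_hosts"
    tmp=$(mktemp) || return 1
    # ssh-keyscan prints one "host keytype base64key" line per host key.
    ssh-keyscan "$host" 2>/dev/null | while IFS= read -r keyline; do
        printf '%s\n' "$keyline" > "$tmp"
        # Assumed output layout: "bits fingerprint comment (type)".
        fp=$(ssh-keygen -lf "$tmp" 2>/dev/null | awk '{ print $2 }')
        if fingerprint_is_trusted "$fp" "$@"; then
            # Skip keys that are already recorded verbatim.
            grep -qxF -- "$keyline" "$known" 2>/dev/null ||
                printf '%s\n' "$keyline" >> "$known"
        fi
    done
    rm -f "$tmp"
}
```

Each key is written to a temporary file and fingerprinted on its own, which avoids relying on ssh-keygen handling several keys in one file. A key whose fingerprint is not in the list is silently ignored rather than added.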
Of course, this only works if you know which fingerprints to specify on the command line; for newly launched EC2 instances, they're mixed in with other console output. Enter another script:
    # ec2-get-console-output INSTANCE | ec2-knownhost hostname
The ec2-knownhost script uses the fact that EC2 AMIs (standard ones, at least) print their host keys prefixed with ec2: and between lines -----BEGIN SSH HOST KEY FINGERPRINTS----- and -----END SSH HOST KEY FINGERPRINTS-----. A few lines of shell script are all it takes to extract the host key fingerprints and pass them to ssh-knownhost. Again, short, simple, and effective.
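The extraction step can be sketched in a few lines of awk. Again, this is an illustration rather than the ec2-knownhost script: the function name extract_fingerprints is hypothetical, and the sketch assumes each line between the markers looks like "ec2: bits fingerprint path (type)", so that the fingerprint is the third whitespace-separated field.

```shell
#!/bin/sh
# Sketch only; not the author's ec2-knownhost script.

# extract_fingerprints: read console output on stdin and print one
# fingerprint per line, taken from "ec2:"-prefixed lines between the
# BEGIN and END marker lines.
extract_fingerprints() {
    awk '
        /-----BEGIN SSH HOST KEY FINGERPRINTS-----/ { inside = 1; next }
        /-----END SSH HOST KEY FINGERPRINTS-----/   { inside = 0; next }
        # Assumed layout: "ec2: bits fingerprint path (type)".
        inside && $1 == "ec2:"                      { print $3 }
    '
}
```

With that in place, something like ec2-get-console-output INSTANCE | extract_fingerprints would yield the list of fingerprints to hand to ssh-knownhost as arguments. Lines outside the markers are ignored even if they carry the ec2: prefix.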
The scripts are available for download, and I'm placing them in the public domain, so please feel free to redistribute, modify, incorporate into other code, et cetera: ssh-knownhost, ec2-knownhost. I've signed their SHA256 hashes using GPG: ssh-knownhost-sigs.asc.
Enjoy!