The missing ImportVolume documentation
As a general rule, the documentation provided by Amazon Web Services is very good; in many ways, they set the standard for what documentation for public APIs should look like. Occasionally, however, important details are inexplicably absent from the documentation, and (I suspect in part due to Amazon's well-known culture of secrecy) it tends to be very difficult to get those details. One such case is the EC2 ImportVolume API call.
As the maintainer of the FreeBSD/EC2 platform, I wanted to use this API call to simplify the process of building EC2 images for FreeBSD. For several years my build process has involved launching an EC2 instance, building a FreeBSD disk image directly onto an EC2 volume, and then converting that volume into an EC2 machine image ("AMI"); but in order to integrate better with the FreeBSD release process, it was essential to be able to build the disk image offline and then upload it, ideally without ever launching an EC2 instance. The ImportVolume API call is intended for exactly this purpose; but one of the mandatory parameters to this call, Image.ImportManifestUrl, came without any documentation of what data the "disk image manifest" file should contain. Without that crucial documentation, it's impossible to create a manifest file; without creating a manifest file, it's impossible to make use of the ImportVolume API; and without the ImportVolume API I could not streamline the FreeBSD/EC2 build process.
Fortunately developers inside Amazon have access to better documentation. While the ImportVolume API is not implemented in the AWS CLI, it is implemented in the — much older, and now rarely used — EC2 API Tools package. Despite the EC2 API Tools not being useful for the FreeBSD release process — being written in Java, they are far too cumbersome — they did allow me to figure out the structure of the requisite manifest file; and so I am able to present the missing ImportVolume documentation:
One interesting thing about the manifest file is that it does not contain any information to allow the disk image parts stored on S3 to be validated; nor, for that matter, is there any mechanism in the EC2 ImportVolume API call to allow the manifest file itself to be validated. One assumes that this is due to a presumption that data stored in S3 is safe from any tampering; if I were designing this API, I would add a <sha256> tag into each <part> and add an Image.ImportManifestSHA256 parameter to the ImportVolume API call. In fact, these could easily be added as optional parameters, in order to provide security without compromising backwards compatibility.
EC2 ImportVolume manifest file format
The EC2 ImportVolume manifest file is a standalone XML file containing a top-level <manifest> tag. This tag contains:
- A <version> tag with the value "2010-11-15".
- A <file-format> tag containing the image file format, as in the Image.Format parameter to ImportVolume: "RAW", "VHD", or "VMDK".
- An <importer> tag providing information about the software used to generate the manifest file; it contains <name>, <version> and <release> tags which seem to be purely informational.
- A <self-destruct-url> tag containing an S3 URL which is pre-signed for issuing a DELETE request on the manifest file object.
- An <import> tag containing a <size> tag (with the size in bytes of the disk image, as in the Image.Bytes parameter to ImportVolume), a <volume-size> tag (with the size in GB of the volume to be created, as in the Volume.Size parameter to ImportVolume), and a <parts> tag with a count attribute set to the number of <part> tags which it contains.
Each <part> tag corresponds to a portion of the disk image, has an index attribute identifying the position of this part in sequence (numbered starting at 0), and contains:
- A <byte-range> tag with start and end attributes specifying the position of the first and last bytes of this part in the disk image.
- A <key> tag containing the S3 object name for this part. [I'm not sure what purpose this serves, and it's possible that these could just be any unique names for the parts.]
- <head-url>, <get-url>, and <delete-url> tags containing S3 URLs which are pre-signed for issuing HEAD, GET, and DELETE requests respectively on the S3 object containing this part.
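As a rough illustration of the structure described above, the following Python sketch assembles a manifest with the standard library's xml.etree.ElementTree. It is not the code from the EC2 API Tools or from my upload tool. The presign callback (which takes an HTTP method and an S3 key and returns a pre-signed URL), the 10 MiB part size, the importer name, and the key naming scheme are all assumptions, with the latter chosen to mirror the example manifest in this post.

```python
import xml.etree.ElementTree as ET

# Assumption: 10 MiB parts, which matches the byte ranges in the example manifest.
PART_SIZE = 10 * 1024 * 1024


def part_ranges(image_size, part_size=PART_SIZE):
    """Yield (index, start, end) for each part; end offsets are inclusive."""
    for index, start in enumerate(range(0, image_size, part_size)):
        yield index, start, min(start + part_size, image_size) - 1


def build_manifest(image_size, volume_gb, key_prefix, presign, file_format="RAW"):
    """Return the manifest as a string.

    presign(method, key) is a hypothetical callback returning a pre-signed
    S3 URL for the given HTTP method and object key.
    """
    manifest = ET.Element("manifest")
    ET.SubElement(manifest, "version").text = "2010-11-15"
    ET.SubElement(manifest, "file-format").text = file_format

    # The importer tags appear to be purely informational.
    importer = ET.SubElement(manifest, "importer")
    ET.SubElement(importer, "name").text = "example-uploader"
    ET.SubElement(importer, "version").text = "1.0.0"
    ET.SubElement(importer, "release").text = "2010-11-15"

    manifest_key = key_prefix + "manifest.xml"
    ET.SubElement(manifest, "self-destruct-url").text = presign("DELETE", manifest_key)

    imp = ET.SubElement(manifest, "import")
    ET.SubElement(imp, "size").text = str(image_size)       # bytes
    ET.SubElement(imp, "volume-size").text = str(volume_gb)  # GB

    ranges = list(part_ranges(image_size))
    parts = ET.SubElement(imp, "parts", count=str(len(ranges)))
    for index, start, end in ranges:
        key = f"{key_prefix}.part{index}"
        part = ET.SubElement(parts, "part", index=str(index))
        ET.SubElement(part, "byte-range", start=str(start), end=str(end))
        ET.SubElement(part, "key").text = key
        ET.SubElement(part, "head-url").text = presign("HEAD", key)
        ET.SubElement(part, "get-url").text = presign("GET", key)
        ET.SubElement(part, "delete-url").text = presign("DELETE", key)

    header = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    return header + ET.tostring(manifest, encoding="unicode")
```

ElementTree escapes the "&" characters in the pre-signed URLs when it serializes the document, so the callback can return ordinary URLs. Whether EC2 cares about the order of tags or attributes is something I have not tested.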
Example
The following is an example of a manifest file (with a repetitive section elided):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<manifest>
    <version>2010-11-15</version>
    <file-format>RAW</file-format>
    <importer>
        <name>ec2-upload-disk-image</name>
        <version>1.0.0</version>
        <release>2010-11-15</release>
    </importer>
    <self-destruct-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.imgmanifest.xml?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=LZi1Hzkq%2FfC%2BUJrxj7m1DfozTUI%3D</self-destruct-url>
    <import>
        <size>1073741824</size>
        <volume-size>1</volume-size>
        <parts count="103">
            <part index="0">
                <byte-range end="10485759" start="0"/>
                <key>c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0</key>
                <head-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=%2FFHSYRWxr5F7h1yoUuT1uV4lXD0%3D</head-url>
                <get-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=KulBDTq%2BoHeWJzXZ2iOPSjk%2FN%2BQ%3D</get-url>
                <delete-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part0?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=NgupOVUophQxUzgkBl9t%2BngofuM%3D</delete-url>
            </part>
            . . .
            <part index="102">
                <byte-range end="1073741823" start="1069547520"/>
                <key>c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102</key>
                <head-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=L7XD9KPWd1O%2B3Yt%2BLBJUzRXA4HA%3D</head-url>
                <get-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=0Nl%2BRFguwfnRuYDS22F0noJ8BwE%3D</get-url>
                <delete-url>https://import-volume-example.s3.amazonaws.com/c05aebc0-98d3-41df-bd64-53eacf4de842/disk.img.part102?AWSAccessKeyId=07G3159HQ3Z614FJ8GR2&amp;Expires=1417078408&amp;Signature=S%2BuNzWlxA9%2F8P7549s4O2NbwVss%3D</delete-url>
            </part>
        </parts>
    </import>
</manifest>
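Before handing a complete manifest (one without the elision used in this post) to ImportVolume, it can be worth checking that it is internally consistent. The sketch below is one way to do that in Python. The checks reflect my reading of the format (the count attribute matches the number of parts, indexes run from 0 in order, and the byte ranges are contiguous and cover exactly the stated size); I do not know which of these EC2 itself enforces, and check_manifest is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET


def check_manifest(xml_bytes):
    """Return a list of problems found; an empty list means no inconsistency was detected."""
    root = ET.fromstring(xml_bytes)
    imp = root.find("import")
    size = int(imp.findtext("size"))
    parts_el = imp.find("parts")
    parts = parts_el.findall("part")
    problems = []

    if int(parts_el.get("count")) != len(parts):
        problems.append("count attribute does not match the number of <part> tags")

    expected_start = 0
    for position, part in enumerate(parts):
        if int(part.get("index")) != position:
            problems.append(f"part at position {position} has index {part.get('index')}")
        byte_range = part.find("byte-range")
        start = int(byte_range.get("start"))
        end = int(byte_range.get("end"))
        if start != expected_start:
            problems.append(f"part {position} starts at {start}, expected {expected_start}")
        # The end attribute is inclusive, so the next part starts one byte later.
        expected_start = end + 1

    if expected_start != size:
        problems.append(f"parts cover {expected_start} bytes but <size> is {size}")
    return problems
```

Passing the manifest as bytes (for instance, the result of reading the file in binary mode) lets the parser honor the encoding named in the XML declaration.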
Having determined the format of this file, I was then able to return to my original task — streamlining the FreeBSD EC2 image build process. To that end, I wrote a BSD EC2 image upload tool; and yesterday I finished preparing patches to integrate EC2 builds into the FreeBSD release process. While I still have to negotiate with the FreeBSD release engineering team about these — they are, naturally, far more familiar with the release process than I am, and there may be some adjustments needed for my work to fit into their process — I am confident that when the FreeBSD 10.2 release cycle starts there will be EC2 images built by the release engineering team rather than by myself.
And, of course, in the spirit of open source: My code and the formerly missing ImportVolume documentation are now available for anyone who might find either of them useful.