The busy freebsd-update server
On Wednesday evening, Ken Smith announced the availability of FreeBSD 7.0-RELEASE. And update1.freebsd.org started crying.
Based on what I saw when FreeBSD 6.3-RELEASE was announced, I didn't expect any problems -- there was a visible increase in traffic, but it didn't come anywhere close to tying up the server. I hadn't accounted for two important factors:
- Upgrading from FreeBSD 6.x to FreeBSD 7.0 involves more and larger updates than upgrading to FreeBSD 6.3.
- FreeBSD 7.0 is far more popular than FreeBSD 6.3.
On Thursday, February 28th, 885 systems used FreeBSD Update to upgrade to FreeBSD 7.0-RELEASE; of these, 282 were running FreeBSD 7.0 betas or release candidates, 404 were running FreeBSD 6.3, 174 were running FreeBSD 6.2, 13 were running FreeBSD 6.1, 11 were running FreeBSD 6.0, and 1 was running FreeBSD 5.5.
In total, update1.freebsd.org handled 5.01 million HTTP requests -- an average of 58 requests per second -- serving up 130939 distinct files and patches totalling 39.9 GB -- an average data rate of 3.7 Mbps (not counting HTTP/TCP/IP overhead). The effect of this traffic on the server is perhaps best illustrated by the following two MRTG graphs; the first graph shows total and active Apache processes, while the second shows incoming and outgoing bandwidth:
A few notes are in order concerning the above graphs:
- This server has an uplink capped at 10 Mbps; on several occasions it came very close to that limit.
- The primary reason it didn't hit the 10 Mbps limit more often is that for six hours Apache was at the maximum number of processes I had configured (100) and all of them were busy handling requests. When I woke up on Thursday morning (around 1800 UTC -- 10AM in my time zone) I logged in and increased Apache's process limit.
- When I wrote the code for converting MRTG bandwidth statistics into 95th percentile and GB/month values, I didn't bother handling leap years.
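For illustration only, here is a minimal sketch of the kind of conversion the last note refers to. It is not the code used on the server: the function names, the nearest-rank percentile convention, and the use of decimal gigabytes are all assumptions. Taking the month length from `calendar.monthrange` is what avoids the leap year problem.

```python
import calendar

def percentile_95(samples_mbps):
    """Nearest-rank 95th percentile of a list of bandwidth samples (Mbps)."""
    if not samples_mbps:
        raise ValueError("no samples")
    ordered = sorted(samples_mbps)
    # 1-based nearest rank, ceil(0.95 * n), computed with integers
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def gb_per_month(avg_mbps, year, month):
    """Convert an average rate in Mbps into decimal GB for a calendar month."""
    days = calendar.monthrange(year, month)[1]  # 29 for a leap-year February
    seconds = days * 86400
    return avg_mbps * 1e6 * seconds / 8 / 1e9
```

Hard-coding February as 28 days would understate the monthly total in a leap year by one day's worth of traffic.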
In short, the FreeBSD Update server was handling about as much traffic as it is capable of handling (at least unless its uplink is upgraded and I switch from Apache to a faster web server), and there were most likely some people who tried to use FreeBSD Update between 1200 UTC and 1800 UTC and found that the server was either very slow or completely unresponsive. If you had problems upgrading, please try again later -- perhaps a random day next week, since as I write this I already see the load increasing as Friday afternoon (UTC) approaches. For myself, I've learned an important lesson: Next time there's a FreeBSD release, I'm going to make sure there are several FreeBSD Update mirrors ready to share the load.
One final addendum: While my bsdiff binary patching tool is usually highly efficient -- for security updates, it routinely provides a greater than fifty-fold reduction in download size -- it performed quite poorly overall at producing patches for upgrading from FreeBSD 6.x to FreeBSD 7.0, providing only a five-fold reduction in download size. Why? Because FreeBSD 6.x uses gcc 3.4, while FreeBSD 7.0 uses gcc 4.2. Such a major change in compiler means that even binaries compiled from identical source code differ throughout, dramatically reducing the potential for bsdiff (or any other binary patch tool) to identify similarities. Let this be a lesson to anyone who uses binary patches to update devices: Think twice before changing compilers!
The (good) deal with freebsd-update(8)
Earlier today, I stumbled across a blog post by Radu Cristian Fotescu entitled The (bad) deal with freebsd-update(8), which (as the title suggests) casts FreeBSD Update in a rather unfavourable light. Since the author is misinformed about several details, I'm taking this opportunity to set the record straight.
First, the author points out that there is an older version of FreeBSD Update available in the ports tree, which he states "can only fetch updates for FreeBSD 6.1". In fact, the version in the ports tree works for releases dating back to FreeBSD 4.7 (although it obviously doesn't provide binary updates to fix bugs which were uncovered after a release ceased to be supported by the FreeBSD Security Team). The only releases which the version of FreeBSD Update in the ports tree does not support are FreeBSD 6.2 and up -- versions of FreeBSD which contain a new (and vastly improved) version of FreeBSD Update in the base system. Once FreeBSD Update is in all supported FreeBSD releases (i.e., in June) I'll remove the old FreeBSD Update code from the ports tree.
Next, the author questions the logic of having "64-byte keys" (actually, 64 hexadecimal digit keys) as file names, and suggests that this makes FreeBSD Update overly complex. Nothing could be further from the truth: In fact, as I described in my BSDCan'07 talk, the "Reference by [SHA256] hash" method makes both FreeBSD Update and Portsnap far simpler than they would otherwise be.
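To illustrate why naming files by hash keeps things simple, here is a minimal sketch (not FreeBSD Update's actual code; the function names are hypothetical). A file stored under the SHA256 hash of its contents always has a 64 hexadecimal digit name, and a client can check a download against the name it asked for without any additional metadata.

```python
import hashlib

def name_for(content: bytes) -> str:
    """Return the SHA256 hex digest used as the file's name (64 hex digits)."""
    return hashlib.sha256(content).hexdigest()

def verify(name: str, content: bytes) -> bool:
    """A download is good exactly when its hash matches the requested name."""
    return name_for(content) == name
```

Two files with identical contents get the same name, so the server stores them once, and a corrupted or truncated download is detected by the same check that identifies the file.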
The author then moves on to speaking of "a patch applied to a given release and patch level", thereby demonstrating a fundamental misunderstanding of how FreeBSD Update works. In the author's mind (apparently), to update a system from FreeBSD 6.2-RELEASE-p9 to FreeBSD 6.2-RELEASE-p10, FreeBSD Update downloads a (single) patch and applies it. Not so; rather, FreeBSD Update fetches a file which tells it what FreeBSD 6.2-RELEASE-p10 looks like. FreeBSD Update then makes the system look like that: It can leave files alone if they are already up to date (or if the user has asked it to leave those files alone); or it can download or generate the new versions of files. Put another way, in most patching systems, the server will answer the question "how do I get there from here?" -- with FreeBSD Update, the server merely answers the question "where should I be going?" and leaves it up to the FreeBSD Update client to figure out how to get there.
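The "where should I be going?" model can be sketched roughly as follows. This is an illustration under assumed data structures (plain mappings from path to SHA256 hash), not the real index format or client logic, and `plan_update` is a hypothetical name.

```python
def plan_update(target: dict, local: dict, keep: set) -> list:
    """Return the paths that must be fetched or generated.

    target and local map file paths to SHA256 hex digests; keep holds
    paths the user has asked to leave alone.
    """
    to_fetch = []
    for path, wanted in sorted(target.items()):
        if path in keep:
            continue            # user asked to leave this file alone
        if local.get(path) == wanted:
            continue            # already up to date
        to_fetch.append(path)   # needs a new version
    return to_fetch
```

The server's description of the target is the same no matter what state the client starts from; only the client's comparison differs.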
Related to this error is another mistake which immediately follows: The author asserts that the "full new binaries" are not available. In fact, every file which appears in a (recent) FreeBSD release, or in a FreeBSD release plus patches, is available via the FreeBSD Update server. (I was concerned that I might be technically violating the GPL on some files by doing this, until I remembered that the FreeBSD source code is also distributed via FreeBSD Update.) FreeBSD Update uses patches in exactly the same way as Portsnap: As I described in my BSDCan'07 talk (linked above), FreeBSD Update and Portsnap rely on "opportunistic patching" -- they start out by attempting to fetch patches and apply them, but if anything goes wrong (the patch isn't available, the file generated by patching has the wrong SHA256 hash, et cetera), they gracefully fall back to fetching the complete file.
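As a rough sketch of opportunistic patching (an illustration, not the real client): the three callables below are hypothetical stand-ins for the network fetches and for applying a bsdiff patch, and the only fixed rule is that the result must have the expected SHA256 hash.

```python
import hashlib

def obtain(wanted_hash, old_content, fetch_patch, apply_patch, fetch_full):
    """Try a patch first; fall back to the complete file on any failure.

    fetch_patch, apply_patch and fetch_full are caller-supplied callables
    (hypothetical stand-ins for network and patching operations).
    """
    try:
        patch = fetch_patch()
        if patch is not None:
            candidate = apply_patch(old_content, patch)
            if hashlib.sha256(candidate).hexdigest() == wanted_hash:
                return candidate
    except Exception:
        pass  # missing or unusable patch: fall through to the full file
    full = fetch_full()
    if hashlib.sha256(full).hexdigest() != wanted_hash:
        raise ValueError("downloaded file does not match expected hash")
    return full
```

Because the hash check decides success, a bad patch can cost some time but cannot leave the wrong file in place.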
Next, the author points out that the list of binary patches used for updating to FreeBSD 6.3 is publicly visible. Oops -- this is fixed now. I don't have any desire to keep this list of file names secret, but there are two very good practical reasons for turning off the directory indexing: First, Apache processes chew up lots of RAM when generating large directory listings; and second, I was having problems with robots ignoring my "don't crawl here" directives in robots.txt and loading down my server with large numbers of pointless requests.
Moving on, the author points to the approach of Red Hat, Debian, and Mandriva, of distributing entirely new package tarballs, as a model to be emulated. I don't know how fast the author's internet connection is, but I know one of the most frequent comments I hear about FreeBSD Update is how incredibly fast it is. This is what binary patches do for you -- provide a fifty-fold reduction in the bandwidth needed to download security updates. The tool I wrote for this purpose -- bsdiff -- is now used by Apple, Firefox, Sophos, and probably Amazon's Kindle (in this last case, I haven't heard from any developers, but they have bsdiff code on the device, so presumably they're using it) in addition to FreeBSD, and in the summer of 2006 I calculated that it had saved users upwards of 100 person-years of waiting for updates to download. Returning to downloading complete tarballs every time a small change is made might be simple, but it wouldn't be very popular with the many people who have to wait for said tarballs to download!
Finally, the author complains that he can't find the FreeBSD Update server code. As a comment to the blog entry points out, the server code is in the FreeBSD projects repository.