A packet trace of portsnap.

Out of curiosity and to make sure that everything was functioning as I expected, I used tcpdump to capture a trace of network activity while running portsnap. Portsnap updated its compressed snapshot from the 02:55:24 UTC version to the 14:04:14 UTC version; between those two times, 15 ports were modified and 5 new ports were added to the tree.

The first packet was sent at 15:49:56.569740 and the last packet was received at 15:50:03.316998, slightly less than seven seconds later. In total, 98 packets were sent and received: 5 outbound DNS requests (a total of 408 bytes including IP headers), 5 inbound DNS responses (1253 bytes including IP headers), 45 outbound TCP packets forming 5 HTTP connections and 32 HTTP requests (9152 bytes including IP headers), and 43 inbound TCP packets, from those same HTTP connections, containing 32 HTTP responses (36182 bytes including IP headers). The TCP:UDP ratios are 8.8:1 (packets) and 27.3:1 (bytes); the inbound:outbound ratios are 0.96:1 (packets) and 3.9:1 (bytes) -- slightly more outbound traffic than ideal for a client application (broadband internet access often has a 10:1 bandwidth ratio), but not too bad.
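
The totals and ratios above can be re-derived from the four per-direction counts. The Python below is only a check of that arithmetic; the table layout and the names in it are illustrative and are not output from tcpdump or portsnap.

```python
# (protocol, direction) -> (packets, bytes including IP headers), as reported above.
TRAFFIC = {
    ("udp", "out"): (5, 408),      # DNS requests
    ("udp", "in"): (5, 1253),      # DNS responses
    ("tcp", "out"): (45, 9152),    # outbound packets on the HTTP connections
    ("tcp", "in"): (43, 36182),    # inbound packets on the HTTP connections
}


def totals(position, value):
    """Sum (packets, bytes) over entries whose key has `value` at `position`."""
    rows = [counts for key, counts in TRAFFIC.items() if key[position] == value]
    return sum(p for p, _ in rows), sum(b for _, b in rows)


tcp, udp = totals(0, "tcp"), totals(0, "udp")
inbound, outbound = totals(1, "in"), totals(1, "out")

total_packets = tcp[0] + udp[0]
tcp_udp = (tcp[0] / udp[0], tcp[1] / udp[1])                    # (packets, bytes)
in_out = (inbound[0] / outbound[0], inbound[1] / outbound[1])   # (packets, bytes)

print(f"packets: {total_packets}")
print(f"TCP:UDP  {tcp_udp[0]:.1f}:1 (packets), {tcp_udp[1]:.1f}:1 (bytes)")
print(f"in:out   {in_out[0]:.2f}:1 (packets), {in_out[1]:.1f}:1 (bytes)")
```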

If each of the necessary files had been fetched separately instead of using pipelined HTTP, the same updating would have taken over twice as many packets (a minimum of 224 packets for 32 HTTP connections), at least 50% more time (and far longer than that if this system didn't happen to have an unusually short round-trip time of 30ms to the server), and roughly 8000 more bytes in TCP/IP overhead. For updates over a longer period of time or from a greater distance, a lack of pipelined HTTP could result in a factor of ten slowdown.
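
The packet comparison works out as follows. The figure of 7 packets per connection is not stated directly above; it is simply 224 divided by 32, and this sketch makes no assumption about how those seven packets break down.

```python
# Observed in the pipelined trace: outbound + inbound TCP packets.
PIPELINED_TCP_PACKETS = 45 + 43
REQUESTS = 32

# Implied by "a minimum of 224 packets for 32 HTTP connections".
MIN_PACKETS_PER_CONNECTION = 224 // 32


def unpipelined_tcp_packets(requests, per_connection=MIN_PACKETS_PER_CONNECTION):
    """Lower bound on TCP packets when every request gets its own connection."""
    return requests * per_connection


separate = unpipelined_tcp_packets(REQUESTS)
ratio = separate / PIPELINED_TCP_PACKETS

print(f"{separate} packets unpipelined vs {PIPELINED_TCP_PACKETS} pipelined "
      f"({ratio:.2f}x)")
```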

The approach of "fetch lots of independent small pieces over HTTP and knit them together" is very useful and increasingly popular; it's worth noting that Google Maps does exactly this with its tiled map, and in so doing manages to use far less bandwidth and be far more responsive than systems which fetch an entire new map from a server every time the user scrolls around. Whenever this approach is used, however, it is essential to carefully consider the round-trip time associated with each HTTP connection, and to make sure that pipelined HTTP is used where necessary. Unfortunately, over six years after RFC 2616 was published, client-side support for pipelined HTTP is still quite rare.
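
For readers who have not seen pipelining on the wire, here is a minimal sketch of the client side: every request is written to a single connection before any response is read. This is not how portsnap itself is implemented; the function names, the host, and the paths are hypothetical, and the responses come back as one unparsed byte string. A real client would read responses while it is still sending (writing everything first can stall once the batch is large) and would have to cope with servers that do not honour pipelining.

```python
import socket


def build_pipelined_requests(host, paths):
    """Return the bytes for one GET per path, back to back, for one connection."""
    requests = []
    for i, path in enumerate(paths):
        last = i == len(paths) - 1
        lines = [
            f"GET {path} HTTP/1.1",
            f"Host: {host}",
            # Ask the server to close after the final response so the reader sees EOF.
            "Connection: close" if last else "Connection: keep-alive",
            "",
            "",
        ]
        requests.append("\r\n".join(lines))
    return "".join(requests).encode("ascii")


def fetch_pipelined(host, paths, port=80, timeout=10.0):
    """Send every request up front, then read until the server closes."""
    payload = build_pipelined_requests(host, paths)
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)          # all requests go out without waiting
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)            # raw, concatenated responses (unparsed)
```

Calling `fetch_pipelined("www.example.com", ["/piece-0", "/piece-1"])` would pay the connection setup cost once rather than once per file, which is where the savings described above come from.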

It has been said that if you build a better mousetrap, the world will beat a path to your door; what people fail to mention is that whether your mousetrap is called "pipelined HTTP", "FreeBSD", "portsnap", or "the subset-sum self-initializing quadratic sieve", the path-beating is likely to take years if not decades before it is complete.

Posted at 2005-12-28 17:45