Tarsnap critical security bug
Tarsnap versions 1.0.22 through 1.0.27 have a critical security bug. It may be possible for me, Amazon, or US government agencies with access to Amazon's datacenters to decrypt data stored with those versions of Tarsnap. This is an absolutely unacceptable compromise of Tarsnap's security principles, and I sincerely apologize to everyone affected.
There's a lot to say about this, and it's entirely possible that I'll miss covering some important points in this post; if I've missed something, please email me or post a comment below and I'll do my best to add the necessary information.
The bug
Tarsnap archives data by first converting it into a series of "chunks" of average size 64 kB; next compressing and encrypting each chunk; and finally uploading those chunks. The encryption is performed using a per-session AES-256 key in CTR mode.
In versions 1.0.22 through 1.0.27 of Tarsnap, the CTR nonce value is not incremented after each chunk is encrypted. (The CTR counter is correctly incremented after each 16 bytes of data is processed, but this counter is reset to zero for each new chunk.)
How the bug happened
Up to version 1.0.21 of Tarsnap, AES-CTR was used in two places: First, to encrypt each chunk of data; and second, in the Tarsnap client-server protocol. In version 1.0.22 of Tarsnap, I introduced passphrase-protected key files, which used AES-CTR encryption (with a key computed using scrypt).
In order to simplify the Tarsnap code (and in the hopes of reducing the potential for bugs), I took this opportunity to "refactor" the AES-CTR code into a new file (lib/crypto/crypto_aesctr.c in the Tarsnap source code) and modified the existing places where AES-CTR was used to take advantage of these routines.
It was at this point that the bug slipped into the chunk-encryption code (crypto_file_enc in lib/crypto/crypto_file.c):
     /* Encrypt the data. */
-    aes_ctr(&encr_aes->key, encr_aes->nonce++, buf, len,
-        filebuf + CRYPTO_FILE_HLEN);
+    if ((stream =
+        crypto_aesctr_init(&encr_aes->key, encr_aes->nonce)) == NULL)
+        goto err0;
+    crypto_aesctr_stream(stream, buf, filebuf + CRYPTO_FILE_HLEN, len);
+    crypto_aesctr_free(stream);
The encr_aes->nonce++ turned into encr_aes->nonce, and as a result the same nonce value was used repeatedly. (The other places where Tarsnap uses AES-CTR, namely the client-server protocol and the handling of passphrase-protected key files, are not affected by this bug.)
Impact of the bug
As stated above: It may be possible for me, Amazon, or US government agencies with access to Amazon's datacenters to decrypt data stored with affected versions of Tarsnap. Other individuals/agencies are unlikely to be able to decrypt data for the simple reason of being unable to access the encrypted data: Amazon Web Services is considered to be sufficiently secure to handle medical records and credit cards, and while I often remind people that regulatory compliance is not at all the same thing as security, in this case I think they align fairly accurately. (Note that since the Tarsnap client-server protocol is encrypted, being able to intercept Tarsnap client-server traffic does not provide an attacker with access to the data.)
There are two ways of decrypting AES-CTR data when the nonce is reused: By comparing two ciphertexts, or by using a known plaintext. In the first case, the ciphertexts A xor C and B xor C are compared to yield the exclusive OR of the two plaintexts, A xor B. If the plaintexts are English text or otherwise have a small amount of entropy, this is usually enough to allow both plaintexts to be extracted; in fact, this is one of the methods which was used by British codebreakers in the Second World War. However, the blocks which Tarsnap encrypts do not have low entropy: Tarsnap compresses its chunks of data before encrypting them. While the compression is not perfect (there are, for instance, some predictable header bits), I do not believe that enough information is leaked to make such a ciphertext-only attack feasible.
Given a known plaintext, however (that is, if the attacker knows any block of data which was encrypted), the attack is trivial: They need only compare the plaintext against the corresponding ciphertext block to recover the AES-CTR keystream, which can then be used to decrypt other blocks of data. If Tarsnap is used to perform complete system backups, there will be many such plaintexts (files belonging to the operating system and the Tarsnap binary itself are obvious examples), but if Tarsnap is used selectively then it is possible that the attacker will have no such plaintext at their disposal.
Because Tarsnap uses per-session AES keys for encrypting blocks of data, this bug affects only data uploaded using affected versions of Tarsnap, and the known-plaintext attack will only endanger data uploaded while creating the same archive as the known plaintext; so it is possible that an attacker would be able to decrypt some data but not all.
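To make the two attacks concrete, here is a minimal sketch of what nonce reuse does to a CTR-style cipher. It is not Tarsnap's code: the names keystream, encrypt_chunk, and xor are hypothetical, the key and chunk contents are invented, and SHA-256 stands in for AES-256 only so that the example runs with the Python standard library. In practice the known plaintext would have to be the compressed chunk, since that is what Tarsnap encrypts.

```python
import hashlib


def keystream(key: bytes, nonce: int, length: int) -> bytes:
    """Toy CTR keystream: block i is SHA-256(key || nonce || i).

    SHA-256 is only a stand-in for AES-256 so the sketch needs no
    third-party library; the nonce/counter structure is what matters.
    """
    out = bytearray()
    counter = 0  # restarts at zero for every chunk, as in normal CTR use
    while len(out) < length:
        block = key + nonce.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_chunk(key: bytes, nonce: int, chunk: bytes) -> bytes:
    """Encrypt one chunk by XORing it with the keystream for this nonce."""
    return xor(chunk, keystream(key, nonce, len(chunk)))


session_key = bytes(range(32))  # hypothetical per-session key
known = b"bytes of a chunk the attacker already has, e.g. from an OS file"
secret = b"private data from the same archive"

# Buggy behaviour: the nonce is never incremented, so both chunks use nonce 0.
c_known = encrypt_chunk(session_key, 0, known)
c_secret = encrypt_chunk(session_key, 0, secret)

# Ciphertext-only view: the keystream cancels, leaving plaintext XOR plaintext.
assert xor(c_known, c_secret) == xor(known, secret)

# Known-plaintext attack: recover the keystream, then decrypt the other chunk.
recovered_keystream = xor(c_known, known)
recovered_secret = xor(c_secret, recovered_keystream)
print(recovered_secret == secret)

# Correct behaviour: a fresh nonce per chunk makes that keystream useless.
c_secret_fixed = encrypt_chunk(session_key, 1, secret)
print(xor(c_secret_fixed, recovered_keystream) == secret)
```

The recovered keystream only covers as many bytes as the known chunk, which is why a long known plaintext (such as a system binary) is the most useful one for an attacker.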
What Tarsnap users should do
Tarsnap users should immediately upgrade to version 1.0.28.
Tarsnap users who wish to re-encrypt their stored data should register a new machine using tarsnap-keygen, upload their data using the newly generated keys, and then delete the old data by running tarsnap --nuke with the old keys. (Note that creating a new archive with the same set of keys will not cause data to be re-encrypted and uploaded, since Tarsnap's de-duplication will recognize the duplicated data.) Anyone wishing to do this should contact me via email so that I can provide a Tarsnap account credit to cover the bandwidth fees which would otherwise be charged. (Of course, if the US government wants your data, re-encrypting it and deleting the old version from Tarsnap won't force them to delete any copies they have made, but it might help you if the US government doesn't realize that it wants your data yet.)
Tarsnap users who wish to stop using Tarsnap should delete their stored data by running tarsnap --nuke and contact me via email for a refund.
Tarsnap users with any other questions or concerns should contact me via email, Twitter, IRC, or any other convenient form of communication.
What I'm doing about this
After being contacted on Friday afternoon and confirming the bug, I immediately re-checked all of the Tarsnap cryptographic code; I found no other bugs. Of course, this can't guarantee that there are no subtle issues lurking; but at least it makes very unlikely the possibility that other similarly obvious problems exist.
I've also added "double-check all changes to critical security code, even if they are 'cosmetic' or 'refactoring' changes" to my pre-release checklist. When I wrote the original chunk-encryption code, I reviewed my work very carefully to make sure that I got it right, and it was right for two years, until I accidentally introduced this bug while making what I thought was an insignificant change. This is an important lesson to learn: Mistakes can happen any time a piece of code is modified.
Finally, I am instituting a Tarsnap bug bounty (complete details to follow in a later blog post). This bug was found and reported to me by someone who was reading the Tarsnap source code purely out of curiosity. I'm a great fan of curiosity, but I've also learned that money can help to encourage curiosity. While I hope that this is the last time I have to pay out a bounty for a security bug, if there are other bugs I hope this bounty will result in them being found sooner rather than later.
Final remarks
I will not attempt to decrypt and read your data. Amazon claims that it does not inspect Amazon Web Services users' data. And the US government is theoretically bound by a constitution which prohibits unreasonable searches. This is all, however, entirely irrelevant: The entire point of Tarsnap's security is to remove the need for such guarantees. You shouldn't need to trust me; you shouldn't need to trust Amazon; and you most certainly shouldn't need to trust the US government.
This was a very easy mistake to make. Anyone could have made it. It was also a very easy mistake to find. I should have found it, 19 months ago, before releasing version 1.0.22 of Tarsnap. I didn't, and I'm sorry.
I'd like to thank Taylor R Campbell for bringing this bug to my attention.
Q&A
Some questions I've been hearing, aggregated here so that people can stop asking them:
- Is the updated Tarsnap in the FreeBSD ports tree? Yes. It wasn't when this announcement first went up, but I committed the update at 21:23 UTC.
- Is there any way to download all the data for a machine, re-encrypt, and re-upload? This is theoretically possible, but needs some new code to be written, and I didn't want to delay announcing this bug for the time it would take to write that code. If you don't want to take the 'upload a new archive and nuke the old ones' approach (e.g., if you have important history to keep), you'll have to wait a few days at least. UPDATE: This can be done using the new tarsnap-recrypt utility in version 1.0.29 of the Tarsnap client code.
- How do I generate new keys?
  tarsnap-keygen --keyfile /root/new-tarsnap.key --user me@example.com --machine mybox
- So are my keys compromised now? This bug affected data stored on Tarsnap, not the keys used to encrypt it. If you delete all your data and then re-upload, it will be encrypted securely -- the only reason to need new keys is if you have data already stored and need to make sure that Tarsnap's deduplication doesn't prevent the data from being re-uploaded. One caveat to this: If your tarsnap keys were in an archive you stored, they might be compromised that way.
- I'm not worried about you, Amazon, or the US government reading my data; all I'm concerned about is keeping it safe from script kiddies. Do I need to worry about this? Script kiddies aren't going to be able to access Tarsnap's backing storage on S3, so this issue shouldn't affect you. (Whether your lack of worry about me, Amazon, and the US government is justified is another matter, but that's for you to judge, not me.)
- I don't want to create new keys; can I keep my existing key file and nuke first then upload? Yes. The purpose of creating a new key is to ensure that new data isn't deduplicated against old (insecurely encrypted) data, so if you delete all the old data first you'll be fine. Unless, of course, your computer dies between deleting the old archive and uploading the new ones...
Inequality in Equalland
Life in the nation of Equalland (population 80 million) is idyllic. Boring, but idyllic. By all measures, it is a wonderful place to live: Zero infant mortality; 100% high school graduation; 100% college graduation; zero unemployment; zero income inequality; a steadily rising stock market; no poverty; etc. There is one measure which raises some eyebrows, however: The wealthiest 20% of households own well over 50% of the nation's wealth.
Every resident of Equalland has the same life story. From birth until age 18, they live with their parents, earning nothing and spending nothing; their parents cover all their needs. At age 18, they graduate from high school, become independent of their parents, and go to college. The government of Equalland funds the post-secondary system well enough that it can provide free tuition to students, but the college students of Equalland want to study full-time, so they take out student loans to cover their living expenses. At age 22 they graduate from college, get married, and have children (in Equalland, women always give birth to pairs of twins, one male and one female, in order to keep the gender ratio fixed at 1:1). They immediately find jobs, and work until age 65, all earning the same constant (inflation-indexed) salary, gradually saving up enough money (which they invest in the stock market) to pay for their retirement. At age 65 they retire, and they gradually spend their retirement savings until age 80, when they die peacefully in their sleep and their few dollars of remaining retirement savings are spent on funeral costs.
To provide a more concrete picture of the household economics of Equalland, here are some more numbers (dollar values are inflation-adjusted, i.e., expressed in constant "2011 dollars"):
- The stock market rises at a consistent rate of inflation + 4% each year.
- Student loans carry with them an interest rate equal to the stock market's growth rate (since both are zero-risk investments, there is no reason for them to have different rates).
- While at college, each student spends $15,000 per year.
- From age 22 until 65, every person earns $50,000 per year (while on parental leave or when sick, enlightened government policies replace 100% of their income).
- Every 4-person household (parents aged between 22 and 40) spends $90,000 per year, thus saving $10,000 per year towards retirement.
- After their kids leave home, household expenses drop by $10,000 per year (kids are expensive!), and thus parents start saving $20,000 per year towards their retirement.
- Upon retiring, their expenses drop slightly further, to $75,000 per year, due to a lack of employment-related costs (e.g., bus/train fares).
- Upon dying when they (simultaneously) reach age 80, each couple has $5,452 remaining, which covers their funeral costs.
From these numbers, we can obtain a complete picture of the wealth of Equalland:
The most indebted households in Equalland ($130,133 in debt, to be precise) are those formed by 22-year-olds: not only have they spent 4 years taking out student loans while studying at college, but they have also just married, thus doubling their per-household debt. The average household, in contrast, has a comfortable $216,080 in retirement savings.
But what of the wealthiest? Out of the 33 million households in Equalland, the wealthiest 20% — 6.6 million households — are aged between 58 years 5 months and 71 years 8 months: In short, those who are either about to retire or recently retired. Their average wealth is $693,182 — over triple the average wealth — and between them, they hold 4.58 trillion dollars out of Equalland's total 7.13 trillion dollars of household wealth... or slightly over 64%. So much for equality.
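For readers who want to check the arithmetic, here is a rough sketch of the model in Python. The post does not say how interest is compounded; the sketch assumes continuous compounding at 4% with cash flows paid continuously, which appears to reproduce the $130,133 debt and the $5,452 remainder. It also assumes a uniform age distribution (one million people per year of age), with students counted as single-person households and everyone aged 22 to 80 as couples. The function names are illustrative only.

```python
import math

R = 0.04  # real return; continuous compounding is an assumption of this sketch


def grow(balance, flow, years, r=R):
    """Balance after `years` of compounding with a constant annual cash flow."""
    g = math.exp(r * years)
    return balance * g + flow * (g - 1) / r


def household_wealth(age):
    """Real wealth of a household whose adults are `age` years old (18 to 80)."""
    if age < 22:  # single student borrowing $15,000 per year
        return grow(0.0, -15_000, age - 18)
    w = 2 * grow(0.0, -15_000, 4)  # marriage pools two student debts
    if age <= 40:  # kids at home: saving $10,000 per year
        return grow(w, 10_000, age - 22)
    w = grow(w, 10_000, 18)
    if age <= 65:  # kids gone: saving $20,000 per year
        return grow(w, 20_000, age - 40)
    w = grow(w, 20_000, 25)
    return grow(w, -75_000, age - 65)  # retired: spending $75,000 per year


def population(step=0.001):
    """(wealth, household count) for thin age slices between 18 and 80."""
    cells = []
    for i in range(round(62 / step)):
        age = 18 + (i + 0.5) * step
        # One million people per year of age: singles below 22, couples above.
        per_year = 1_000_000 if age < 22 else 500_000
        cells.append((household_wealth(age), per_year * step))
    return cells


def top_share(cells, fraction=0.20):
    """Share of total wealth held by the wealthiest `fraction` of households."""
    total_households = sum(h for _, h in cells)
    total_wealth = sum(w * h for w, h in cells)
    remaining = fraction * total_households
    top_wealth = 0.0
    for w, h in sorted(cells, reverse=True):  # wealthiest slices first
        take = min(h, remaining)
        top_wealth += w * take
        remaining -= take
        if remaining <= 0:
            break
    return top_wealth / total_wealth


cells = population()
total_wealth = sum(w * h for w, h in cells)
print(f"debt at 22:     ${-household_wealth(22):,.0f}")
print(f"left at 80:     ${household_wealth(80):,.0f}")
print(f"total wealth:   ${total_wealth / 1e12:.2f} trillion")
print(f"top 20% share:  {top_share(cells):.1%}")
```

Changing the cash flows or the rate R in this sketch is a quick way to see how sensitive the top-20% share is to the assumptions listed above.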
Obviously no such country exists, and most countries have significantly higher wealth inequality — in the US, for example, the top 20% of households own 84% of the wealth. But consider this: Equalland is an idealized scenario. If the stock market didn't rise consistently, or some people lived beyond age 80, or some people had significant medical costs in their final years — or, god forbid, there was any variability in how much individuals earned — then the top 20% of the population would need to hold more than 64% of the country's wealth just to maintain their standard of living after retiring.
Is there too much inequality in the world? Sure. Is all inequality bad? Not if you hope to retire some day.