Chunking attacks on Tarsnap (and others)
Ten years ago I wrote that it would require someone smarter than me to extract information from the way that Tarsnap splits data into chunks. Well, I never claimed to be the smartest person in the world! Working with Boris Alexeev and Yan X Zhang, I've just uploaded a paper to the Cryptology ePrint Archive describing a chosen-plaintext attack which would allow someone with access to the Tarsnap server (aka me, Amazon, or the NSA) or potentially someone with sufficient ability to monitor network traffic (e.g. someone watching your wifi transmissions) to extract Tarsnap's chunking parameters. We also present both known and chosen plaintext attacks against BorgBackup, and known plaintext attacks against Restic.
EDIT: The paper is here. Once it's on the ePrint Archive I'll point to that copy instead, but I don't know exactly how long that will take.
And, of course, because Tarsnap is intended to be Online backups for the truly paranoid, I've released a new version of Tarsnap today (version 1.0.41) which contains mitigations for these attacks, bringing us back to "I can't see any computationally feasible attack"; but I'm also exploring possibilities for making the chunking provably secure.
I'm sure many people reading this right now are asking the same question: Are my secrets safe? To this I have to say "almost certainly yes". The attack we have to leak Tarsnap's chunking parameters is a chosen plaintext attack — you would have to archive data provided to you by the attacker — and the chosen plaintext has a particular signature (large blocks of "small alphabet" data) which would show up on the Tarsnap server (I can't see your data, but I can see block sizes, and this sort of plaintext is highly compressible). Furthermore, even after obtaining Tarsnap's chunking parameters, leaking secret data would be very challenging, requiring an interactive attack which mixes chosen plaintext with your secrets.
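To make the "highly compressible" remark concrete, here is a minimal sketch comparing how well "small alphabet" data compresses relative to data drawn from all 256 byte values. It is only an illustration: the two-symbol alphabet, the 64 KiB size, and the use of zlib are assumptions chosen for the demo, not details of the attack in the paper or of what Tarsnap does internally.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Fixed seed so the demo is repeatable; the size is an arbitrary choice.
rng = random.Random(0)
SIZE = 1 << 16

# Hypothetical "small alphabet" plaintext: random bytes, but only two distinct values.
small_alphabet = bytes(rng.choice(b"ab") for _ in range(SIZE))

# Ordinary high-entropy data for comparison: every byte value is equally likely.
full_alphabet = bytes(rng.randrange(256) for _ in range(SIZE))

if __name__ == "__main__":
    print(f"small alphabet ratio: {compression_ratio(small_alphabet):.3f}")
    print(f"full alphabet ratio:  {compression_ratio(full_alphabet):.3f}")
```

The small-alphabet buffer shrinks to a fraction of its size while the full-alphabet buffer barely shrinks at all, which is why large blocks of such plaintext would stand out to anyone who can observe stored block sizes.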
Leaking known data (e.g. answering the question "is this machine archiving a copy of the FreeBSD 13.5-RELEASE amd64 dvd1.iso file") is possible given knowledge of the chunking parameters; but this doesn't particularly enhance an attacker's capabilities since an attacker who can perform a chosen plaintext attack (necessary in order to extract Tarsnap's chunking parameters) can already determine if you have a file stored, by prompting you to store it again and using deduplication as an oracle.
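The "deduplication as an oracle" idea can be sketched with a toy store. This is a deliberately simplified model and not Tarsnap's design: the `DedupStore` class, the fixed 4096-byte blocks, and the unkeyed SHA-256 block identifiers are all assumptions made for illustration (Tarsnap splits data into variable-length chunks using secret parameters). The only point is that an observer who can see how much new data an archive operation uploads learns whether the data was already present.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store (hypothetical): keeps one copy of each distinct block."""

    def __init__(self) -> None:
        self._seen = set()

    def store(self, data: bytes, block_size: int = 4096) -> int:
        """Archive data and return how many bytes of new blocks had to be uploaded."""
        uploaded = 0
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            digest = hashlib.sha256(block).digest()
            if digest not in self._seen:
                # Only previously unseen blocks cost any upload traffic.
                self._seen.add(digest)
                uploaded += len(block)
        return uploaded


def already_stored(store: DedupStore, candidate: bytes) -> bool:
    """Oracle: get the victim to archive `candidate` and watch the upload size.

    Zero new bytes means every block was already in the store.
    Note that asking the question also stores the candidate.
    """
    return store.store(candidate) == 0


if __name__ == "__main__":
    store = DedupStore()
    known_file = hashlib.sha256(b"stand-in for a known file").digest() * 1024
    store.store(known_file)
    print(already_stored(store, known_file))
    print(already_stored(store, b"some other file" * 1000))
```

Because this oracle needs no knowledge of how the data is chunked, it is available to any attacker who can already mount a chosen plaintext attack.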
In short: Don't worry, but update to the latest version anyway.
Thanks to Boris Alexeev, Yan X Zhang, Kien Tuong Truong, Simon-Philipp Merz, Matteo Scarlata, Felix Günther and Kenneth G. Paterson for their assistance. It takes a village.