Oil changes, safety recalls, and software patches
Every few months I get an email from my local mechanic reminding me that it's time to get my car's oil changed. I generally ignore these emails; it costs time and money to get this done (I'm sure I could do it myself, but the time it would cost is worth more than the money it would save) and I drive little enough — about 2000 km/year — that I'm not too worried about the consequences of going for a bit longer than nominally advised between oil changes. I do get oil changes done... but typically once every 8-12 months, rather than the recommended 4-6 months. From what I've seen, I don't think I'm alone in taking a somewhat lackadaisical approach to routine oil changes.

On the other hand, there's another type of notification which elicits more prompt attention: Safety recalls. There are two good reasons for this: First, whether for vehicles, food, or other products, the risk of ignoring a safety recall is not merely that the product will break, but rather that the product will be actively unsafe; and second, when there's a safety recall you don't have to pay for the replacement or fix — the cost is covered by the manufacturer.
I started thinking about this distinction — and more specifically the difference in user behaviour — in the aftermath of the "WannaCry" malware. While WannaCry attracted widespread attention for its "ransomware" nature, the more concerning aspect of this incident is how it propagated: By exploiting a vulnerability in SMB for which Microsoft issued patches two months earlier. As someone who works in computer security, I find this horrifying — and I was particularly concerned when I heard that the NHS was postponing surgeries because they couldn't access patient records. Think about it: If the NHS couldn't access patient records due to WannaCry, it suggests that WannaCry had infiltrated the systems used to access patient records — meaning that someone else exploiting the same vulnerability could have accessed those records. The SMB subsystem in Windows was not merely broken; until patches were applied, it was actively unsafe.
I imagine that most people in my industry would agree that security patches should be treated in the same way as safety recalls — unless you're certain that you're not affected, take care of them as a matter of urgency — but it seems that far more users treat security patches like oil changes: something to be taken care of when convenient... or not at all, if not convenient. It's easy to say that such users are wrong; but as an industry it's time we thought about why they are wrong rather than merely blaming them for their problems.
There are a few factors which I think are major contributors to this problem. First, the sheer number of updates: When critical patches arrive frequently enough to become routine, alarm fatigue sets in and people cease to give updates the attention they deserve, even if on a conscious level they still recognize the importance of applying them. Easy problem to identify, hard problem to address: We need to start writing code with fewer security vulnerabilities.
Second, there is a long and sad history of patches breaking things. In a few cases this is because something only worked by accident — an example famous in the FreeBSD community is the SA-05:03.amd64 vulnerability, which accidentally made it possible to launch the X server while running as an unprivileged user — but more often it is simply the result of a mistake. While I appreciate that patches are often urgent and personnel limited (especially for open source software), releasing broken patches is something it is absolutely vital to avoid, because it not only breaks systems but also erodes trust in software updates. During my time as FreeBSD Security Officer, regardless of who on the security team was taking responsibility for preparing a patch and writing the advisory, I refused to sign and release advisories until I was convinced that our patch both fixed the problem and didn't accidentally break anything else; in some cases this meant that our advisories went out a few hours later, but in far more cases it ensured that we released one advisory rather than a first advisory followed by a "whoops, we broke something" follow-up a few days later. My target was always that our track record should be enough that FreeBSD users would be comfortable blindly downloading and installing updates on their production systems, without spending time looking at the code or deploying to test systems first — because some day there will be a security update which they don't have time to look over carefully before installing.
Both problems — the sheer volume of patches and their reputation for breaking things — are made worse by the fact that many systems use the same mechanism for distributing security fixes and other changes, namely bug fixes and new features. This has become a common pattern largely in the name of user friendliness — why force users to learn two systems when we can do everything through a single update mechanism? — but I worry that it is ultimately counterproductive, in that presenting updates through the same channel tends to conflate them in the minds of users, with the result that critical security updates end up receiving only the modest attention appropriate to a new feature update. Even if the underlying technology used for fetching and installing updates is the same, it may be that exposing different types of updates through different interfaces would result in better user behaviour. My bank sends me special offers in the mail but phones if my credit card usage trips fraud alarms; this is the sort of distinction in intrusiveness we should see for different types of software updates.
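To make that distinction concrete, here is a minimal sketch in Python of what routing by update type could look like on the client side. Everything in it is hypothetical — the manifest categories, the Update class, and the notify_urgently / queue_for_next_restart channels are illustrative stand-ins, not any real updater's API; the point is only that the fetch-and-install machinery can stay shared while the presentation differs.

```python
# Hypothetical sketch: route updates by type instead of pushing everything
# through one undifferentiated "updates available" notification.
from dataclasses import dataclass
from typing import List


@dataclass
class Update:
    name: str
    kind: str         # "security", "bugfix", or "feature" (illustrative categories)
    description: str


def notify_urgently(update: Update) -> None:
    # Stand-in for an interruptive channel (modal prompt, admin email, page):
    # the software equivalent of the bank phoning about a fraud alarm.
    print(f"[URGENT] Security fix available: {update.name} -- {update.description}")


def queue_for_next_restart(update: Update) -> None:
    # Stand-in for a quiet channel, batched with routine maintenance:
    # the software equivalent of a special offer arriving in the mail.
    print(f"[queued] {update.kind}: {update.name}")


def dispatch(updates: List[Update]) -> None:
    # Same underlying fetch/install mechanism; only the intrusiveness differs.
    for u in updates:
        if u.kind == "security":
            notify_urgently(u)
        else:
            queue_for_next_restart(u)


if __name__ == "__main__":
    dispatch([
        Update("smb-stack-fix", "security", "fixes a remotely exploitable vulnerability"),
        Update("ui-theme-2.0", "feature", "adds a dark mode"),
    ])
```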
Finally, I think there is a problem with the mental model most people have of computer security. Movies portray attackers as geniuses who can break into any system in minutes; journalists routinely warn people that "nobody is safe"; and insurance companies offer insurance against "cyberattacks" in much the same way as they offer insurance against tornadoes. Faced with this wall of misinformation, it's not surprising that people confuse 400-pound hackers sitting on beds with actual advanced persistent threats. Yes, if the NSA wants to break into your computer, they can probably do it — but most attackers are not the NSA, just like most burglars are not Ethan Hunt. You lock your front door, not because you think it will protect you from the most determined thieves, but because it's an easy step which dramatically reduces your risk from opportunistic attack; but users don't see applying security updates as the equivalent of locking their front door when they leave home.
Computer security is a mess; there's no denying that. Vendors publishing code with thousands of critical vulnerabilities and government agencies which stockpile these vulnerabilities rather than helping to fix them certainly do nothing to help. But WannaCry could have been completely prevented if users had taken the time to install the fixes provided by Microsoft — if they had seen the updates as being something critical rather than an annoyance to put off until the next convenient weekend.
As a community, it's time for computer security professionals to think about the complete lifecycle of software vulnerabilities. It's not enough for us to find vulnerabilities, figure out how to fix them, and make the updates available; we need to start thinking about the final step of how to ensure that end users actually install the updates we provide. Unless we manage to do that, there will be a lot more crying in the years to come.