Improving TOFU With Transparency
TOFU is an OK substitute when you have nothing better, but it’s never the best choice.
Before you get upset, I’m not talking about the coagulated soy-bean excretion often served fried, I’m talking about the Trust-On-First-Use authentication scheme. Just like the food, there are times to reach for TOFU, but too much of it might not be the best for your health. This post dives into when TOFU works, when it doesn’t, and some mitigations that make use of transparency logs.
What is TOFU?
TOFU is an authentication system used to bootstrap trust between two entities on an untrusted network. Pretend you’re at a crowded train station looking for someone you’re supposed to meet named Alice to exchange a sensitive briefcase. Depending on how well you and Alice know each other, this might be a hard problem. This section outlines a few options for finding Alice in the train station:
You can’t just trust the people you ask, because you know nothing about them and this is an untrusted network. An attacker could simply reply “Yes”:
Since you know nothing about Alice, you have to rely on everyone being nice and honest. Unless you’re looking to contact a specific member of the Nigerian Royal Family about an unexpected inheritance, I’d suggest against this technique — we can do better.
Another option is to use a third-party identity provider. You agree to trust a third-party, and this person issues identification cards that are hard to forge. Then, you can ask each individual to present this identification, and verify it before trusting them. In computer networks, this is referred to as Public Key Infrastructure, or PKI. If you’re looking for a deep dive, read this post from Mike Malone at SmallStep. It’s excellent. Going back to our ID card/train station analogy, it would look like:
Here, you check the ID cards for each person you encounter. You can verify that the identity card is authentic by checking with the third-party Identity Provider, then validate that it points to the correct person, Alice. This model is better, but it also requires a complicated third-party identity provider who can be trusted to issue IDs. In web security, these are referred to as Certificate Authorities. What if we don’t have one of those?
Despite JWT insisting otherwise, an alg: none scheme doesn’t work. PKI is great if you have it, but it’s hard, and it requires a third party you can trust. We’re arriving at our Goldilocks option: Trust-On-First-Use, which is just as boring as it sounds.
If we assume we’re going to be exchanging many secret briefcases with the same Alice over time, and we’re feeling lucky, we can remember the first person who claims to be Alice and trust them from then on. In network security, the ubiquitous SSH uses this technique. The first time a client contacts a specific server at an IP address, the client simply trusts that the connection has reached the correct server.
In this protocol, the client asks the server for its public key and saves a fingerprint of that key locally. Each time the client contacts the server later, it compares the server’s fingerprint to the saved one and trusts that the server hasn’t changed. This sounds pretty scary, but in practice it works out OK if it’s hard to predict the first time you’ll look for Alice at that particular train station, or connect to a specific IP address.
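As a rough sketch of this client-side logic (hypothetical names, not OpenSSH’s actual implementation), it looks something like:

```python
import hashlib

# Local pin store, playing the role of SSH's known_hosts file.
known_hosts: dict[str, str] = {}  # host -> pinned fingerprint

def fingerprint(public_key: bytes) -> str:
    # An SSH fingerprint is a hash of the server's public key.
    return hashlib.sha256(public_key).hexdigest()

def check_host(host: str, public_key: bytes) -> bool:
    fp = fingerprint(public_key)
    pinned = known_hosts.get(host)
    if pinned is None:
        # First use: trust whatever key the server presented, and remember it.
        known_hosts[host] = fp
        return True
    # Every later use: the key must match what we saw the first time.
    return pinned == fp
```

The scary part is that `pinned is None` branch: whoever answers first wins, forever.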
When TOFU Crumbles
If you cook TOFU too long, or let it get too old and dry out, it crumbles and falls apart. The rest of the time, though, it’s perfectly edible. SSH uses TOFU most of the time and it’s mostly OK, because clients typically connect to the same servers over and over again and can remember a lot of fingerprints for a long time. Trust-On-First-Use works well when the number of “first uses” is low relative to the total number of connections.
Another example of TOFU, where this assumption holds up much less well, is supply-chain security. Say you’re installing a new package from an open source package manager, and all you know is the name and version number. You want to make sure you get the right package, but you have no idea what the package actually looks like. We could use the alg: none scheme, but that’s not great. The package manager could return anything it wants, including malware:
We can TOFU this, and remember the hashes of all packages we install in a local database. Then, before installing a package, we compare it against this database.
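A minimal sketch of that local pinning database (names here are hypothetical, for illustration):

```python
import hashlib

# Local pin database: name@version -> hash of the package contents.
# This state must survive between installs; ephemeral CI machines lose it.
pinned_hashes: dict[str, str] = {}

def install_ok(name: str, version: str, payload: bytes) -> bool:
    digest = hashlib.sha256(payload).hexdigest()
    key = f"{name}@{version}"
    if key not in pinned_hashes:
        # First install: trust whatever the package manager returned.
        pinned_hashes[key] = digest
        return True
    # Reinstall: the contents must hash to the same value we pinned.
    return pinned_hashes[key] == digest
```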
This makes sense at first, since we can probably store a list of package names and their hashes. This doesn’t work in practice though, for a few reasons:
- If we can store package hashes and names, we can probably just store the entire packages. The hashes don’t really add much.
- We typically install new packages (or update) much more often than we connect to new servers over SSH. The risk here is magnified a bit because of the sheer number of chances we have to get it wrong.
- Not all builds happen on machines where we have local state! Ephemeral environments for CI builds are a common best practice, which makes caching state hard.
In these models, Trust on First Use can quickly become Trust on Every Use! So what can we do?
Enter Transparency Logs! These are really cool data structures that let us maintain an append-only, tamper-evident log. For our package manager example, we could store a list of hashes for specific package versions. The great part is that clients don’t actually have to trust this list to be tamper-proof: they can audit it themselves! Any entry in the log can be cross-checked with other clients, and we can make sure the log stays consistent over time. If you want to learn more about how these work, I’d suggest this blog post from Russ Cox.
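Real transparency logs use Merkle trees so that clients can verify inclusion and consistency with compact proofs; this toy hash chain (hypothetical, for illustration only) just shows the append-only, tamper-evident idea:

```python
import hashlib

class TamperEvidentLog:
    def __init__(self):
        self.entries: list[str] = []
        self.head = hashlib.sha256(b"empty log").hexdigest()

    def append(self, entry: str) -> str:
        self.entries.append(entry)
        # Each head commits to the previous head plus the new entry,
        # so rewriting any past entry changes every later head.
        self.head = hashlib.sha256((self.head + entry).encode()).hexdigest()
        return self.head

    def replay_head(self) -> str:
        # An auditor can recompute the head from the full entry list
        # and compare it against heads other clients observed.
        h = hashlib.sha256(b"empty log").hexdigest()
        for e in self.entries:
            h = hashlib.sha256((h + e).encode()).hexdigest()
        return h
```

If the operator silently edits an old entry, any auditor replaying the log gets a different head than the one previously published, and the lie is caught.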
In our package manager example, we move the cache of package hashes from our local database to the transparency log. We check that package versions are in this log before using them, and that the hash matches what we downloaded ourselves. If we get a mismatch, we know that either the log is lying or the package has been tampered with. And if the log is lying, anyone else will be able to see it too. Here’s what that looks like:
Entries can get added to this log explicitly, if you know who is responsible for publishing them, or the first time they’re retrieved. If you have an authentication mechanism for publishers, you might have enough PKI to not need this system anyway, so we’ll focus on the add-on-first-download mechanism:
If you’re paying close attention, you’ll notice that Add-On-First-Retrieval (AOFR) looks an awful lot like TOFU! The main difference is that with a Transparency Log, the TOFU becomes global. Instead of each (potentially stateless) client trusting the same package the first time it downloads it, all clients trust the first package that was ever retrieved by any client. This model was first put into common use in the Golang Module Transparency Log, where specific module versions are entered into the log and pinned forever the first time they are retrieved by any go get command.
Transparency improves the ratio of first retrievals to total retrievals, getting us back closer to the SSH case. An attacker would have to correctly guess the first time a package will be retrieved by anyone, and be ready to MITM that connection, wherever it comes from. This is obviously much harder than MITM’ing a specific, targeted connection. Better yet, if an attacker does somehow pull this off, there will be a public record of it in the transparency log that anyone can detect later.
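A sketch of this global version, again with hypothetical names: the pin lives in a shared log rather than in each client’s local state, so only the very first retrieval anywhere is trusted blindly.

```python
import hashlib

class PackageLog:
    """Stand-in for a shared transparency log of package pins."""
    def __init__(self):
        self.pins: dict[str, str] = {}  # name@version -> hash, append-only

    def get_or_pin(self, name: str, version: str, digest: str) -> str:
        key = f"{name}@{version}"
        # The first retrieval by *anyone* pins the hash forever.
        return self.pins.setdefault(key, digest)

def verify_download(log: PackageLog, name: str, version: str,
                    payload: bytes) -> bool:
    digest = hashlib.sha256(payload).hexdigest()
    # Stateless clients all compare against the same globally pinned hash.
    return log.get_or_pin(name, version, digest) == digest
```

Even a fresh CI machine with no local state gets the same answer as every other client, because the pin is in the log, not on disk.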
What Else Is There?
In Web of Trust, trust is built up collaboratively, out-of-band. It’s a really elegant solution that unfortunately doesn’t really work in the real world. Translating trust from meatspace into the metaverse is a slow process that requires actually meeting up in meatspace. It also requires every participant to maintain (and protect) a secret key representing their virtual identity. Losing a key means starting from scratch, and it turns out that, on average, humans are too bad at keeping secrets around long enough for the web to really work. Too many people are starting from scratch too often for the web to keep critical mass.
Blockchains are another really elegant answer to the problem here, but it turns out they’re actually quite similar to Transparency Logs. The main difference is that Transparency Logs don’t require a distributed consensus algorithm, which is typically the thing that demands all the wasteful electricity and complicated mining setups. Transparency Logs do require centralized infrastructure where blockchains are fully decentralized, but you don’t actually need to trust a Transparency Log to return correct results; you just need to trust it to remain running. My mental model of blockchains vs. transparency logs looks like:
Throwing in the other two (PKI and Web of Trust):
TOFU isn’t great, but it’s way better than nothing. If it’s all you have, you might be able to mitigate some of the risk by using a Transparency Log. The Rekor Log, which is part of the overall Sigstore Project, is designed to act as a global transparency log for any published artifacts. We’re hoping to make it easy to implement transparency in other software supply chains. Come get involved if this sounds fun to you!
Another area of future work is to integrate The Update Framework with Rekor. TUF provides a thorough mechanism for authenticating releases, but most implementations of it today rely on PKI or TOFU to retrieve the initial root policy. We recently added support for storing TUF roots in Rekor, so stateless clients can retrieve and trust TUF roots in a globally-consistent manner. I think this will make TUF usable in many broader contexts, which is sorely needed to secure our open source supply chains. One more for fun: