Envelopes and Wrappers and Formats, Oh My!
There are a bunch of confusing formats and specifications that describe how to, well, format signatures and related information. This topic is confusing because terminology is hard and each specification tackles things at slightly different levels of abstraction, from slightly different angles. This blog post tries to explain them all, with a few recommendations for specific use cases.
This is mostly focused on the use of signatures in the context of supply chain security. This means signing artifacts, metadata, and code. Some information might be useful here in other contexts, but I make no promises.
Digital Signature Basics
Before we get into this, let's describe signatures at a high level. A digital signature for a piece of data can be created using an asymmetric key pair (consisting of a public and a private key) and a signature algorithm. The private key must be kept secret and is used to create the signature itself.
The public key is distributed to users, who can then use it to verify the signature over the blob of signed data. If the data or the signature is tampered with, the signature will not match the data.
There are a handful of moving pieces involved in creating and verifying a signature. The result is an even bigger handful of formats for how to encapsulate, serialize, and transport all of this information around. Unfortunately these specifications are not all safe for general purpose use. Some are designed for specific use cases and only safe in those deployment modes. Some are outdated and generally insecure or hard to use. And some are just plain bad and should be avoided if at all possible.
At a high level there are four pieces of data to represent, but for each one things can get complicated:
- The data itself to sign
- The public key
- The signature
- The algorithm information
Let’s start with the data itself.
Data (To be signed)
Signature algorithms work with numbers, not complicated data structures. This means that the actual data you want to sign first needs to be converted or transformed to a form that can be passed to a signing algorithm, typically a stream of raw bytes.
The transformation here is trivial if you’re signing an arbitrary blob (say a file or other type of artifact). But things can get complicated if you’re signing a JSON object or other in-memory data structure. Care must be taken here to avoid the Cryptographic Doom Principle, since verifying a signature might require the untrusted data to be deserialized into an object before it has been verified as trusted. Serialization libraries are notorious for bugs, so the simpler this transformation can be, the better.
Some algorithms (like ed25519) work by signing the actual byte stream itself, while others require the data to be hashed first (which then requires the hash algorithm to be specified later). Depending on how the byte streams get transmitted, there may be even more encoding/decoding steps to and from something like base64.
The fewer transformation steps, the better. Each link in this chain can be, and has been, attacked via things like JSON serialization bugs, weak hash algorithms, and even simply forgetting to do checks!
I mentioned earlier that many signature algorithms require data to be hashed first. This requires both the choice of a strong cryptographic hash function, and the communication of this algorithm to the verifying individual. Using an insecure hash like MD5 that is subject to collision attacks would render even the strongest signature algorithms worthless, so care must be taken to not allow attackers to trick users into using a weak algorithm. This means the hash algorithm must be communicated carefully!
Some specifications allow for "agility" at the message level by including the algorithm as a "signed" or "protected" attribute, while others fix it at the protocol level. Not all signature algorithms even require a hashing step, so this may be a no-op.
Other Data Transformations
There are other potentially useful transformations to make to the "to be signed" data before hashing and signing it, beyond basic serialization. If the protocol or format needs to specify authenticated parameters (such as the signature algorithm, hash algorithm, or key information), this information can be packaged up as metadata or headers that are combined with the "to be signed" data before the result is fed into the rest of the signing protocol.
Again, care must be taken here to properly authenticate the required information before accepting the signature as valid. This gets tricky in practice when pieces of this information are required to understand how to validate the rest of the signature. This can be done securely, but practice has shown repeatedly that the more steps a user has to take to verify a signature, the more opportunities there are to mess up.
This technique can also help protect against another general class of attack, known as encoding confusion. If a specific key is used to sign many different types of data (say, protobuf and JSON), an attacker may be able to craft a message that can be interpreted validly in both encodings. Mark Lodato wrote a great PoC that explains this attack here. The SSH signature protocol uses this technique to prevent key reuse attacks as well.
Signatures In Practice
As illustrated above, there are a few moving parts to track in order to sign a file using a private key, then transmit the signature, file, and public key to an end user for verification. Grouping some of these pieces into wrappers or envelopes can simplify the process, making it easier to build command-line tools or libraries to handle this for you. This section outlines some of them!
Metablock
The Update Framework (TUF) and the in-toto project currently use a format called the "Metablock" to package up a signature and the signed JSON data. This format signs JSON objects rather than byte streams, so it relies on canonicalization. This is generally a smell, but this protocol has been audited several times and there are no known canonicalization attacks against it. The specification is here, and this is an example:
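A simplified, illustrative Metablock envelope is shown below; the key ID and signature values are made up and truncated:

```json
{
  "signed": {
    "_type": "link",
    "name": "build",
    "materials": {},
    "products": {}
  },
  "signatures": [
    {
      "keyid": "0c87432c",
      "sig": "3046022100e8..."
    }
  ]
}
```

Everything under "signed" is canonicalized and fed to the signature algorithm; each entry in "signatures" covers that same payload.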
The format supports multiple signatures over the same JSON data, and there is an optional, unauthenticated “key hint” used to help select the right key or signature during verification. Algorithm details (for both the signature and hash function) are specified out of band, and are typically attached to the public key itself in some way.
There is a proposal to switch from this signature format to DSSE, which I’ll explain next.
DSSE
DSSE is a new format designed to replace the Metablock for in-toto, TUF, and other projects. The main difference is that DSSE does not rely on canonicalization. Instead, DSSE supports a "protected" payload type attribute which allows a client to safely deserialize the byte stream. The specification is defined here, and here is an example implementation with test vectors. Here's an example:
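An illustrative DSSE envelope looks like this; the payload is the base64 encoding of an arbitrary byte stream (here, a tiny JSON document), and the key ID and signature values are made up and truncated:

```json
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "eyJmb28iOiJiYXIifQ==",
  "signatures": [
    {
      "keyid": "my-kms-key",
      "sig": "MEQCIC..."
    }
  ]
}
```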
The DSSE envelope supports multiple signatures over the same signed data, and also includes an optional, unauthenticated key hint used to help select the right key or signature from a list during verification. DSSE also specifies a very simple Pre-Auth-Encoding to make serialization and deserialization of untrusted data as safe as possible.
SSH Signatures
The SSH signature protocol is a feature of OpenSSH's ssh-keygen tool, which can be used to sign and verify files. For a deep dive, check out this other blog post I wrote on the topic. The SSH protocol supports many different signature and hash algorithms. The hash algorithm is stored in the signature object itself. The entire public key is also stored in the signature object, which can be used instead of a key hint. This is a little dangerous though, and care should be taken not to blindly trust this public key.
Because SSH keys can be used “online” for server authentication, the SSH protocol also prefixes the “to be signed” data with some headers before signing. These headers can protect against protocol attacks, where an attacker might trick someone into signing something as part of an authentication challenge.
The full protocol is defined here. This is a great choice if you or your users already have SSH keys setup and distributed, and you’d like to avoid building a full PKI. There’s some work going on now to allow SSH signatures in the Git tool, which would be great because most GitHub/GitLab users already authenticate pushes with SSH keys.
JWS/JWT
This is clearly the elephant in the room. The JOSE suite was first standardized in RFCs 7515–7519, then later amended in RFC 8725 with corrections and best practices. A JSON Web Token (JWT) is a compact representation of a JSON object that has been signed, encrypted, or HMAC'ed in some way. A JSON Web Signature (JWS) is a JWT that has been signed (or HMAC'ed, for some reason), rather than encrypted.
A JWS consists of a header (a JSON object), the JWT itself which is signed (another JSON object), and the signature. The signature protects the header and the JWT payload, so they must be concatenated before signing/verifying. These three pieces are all wrapped up into a single string of base64-encoded elements, concatenated with “.” characters for easy, URL-safe distribution.
JWTs/JWSs are rarely used by themselves. Instead, "profiles" of well-understood "claims" and "headers" are defined. For example, a JWS must contain two headers: the typ header, which specifies that this is a JWS (instead of JSON Web Encryption), and the alg header, which looks something like RS256 and specifies both the signature and hash algorithms.
This is where the problems come in. The algorithm should not be stored here, next to the signed data; it should be paired with the public key. Algorithm attacks against JWS happen constantly, where attackers either trick a verifier into using an HMAC algorithm instead of a signature algorithm to verify some data, or even worse, into using the none algorithm, which provides zero protection at all.
Unfortunately, JWS continues to see adoption, mostly because many people incorrectly assume that the RFC/IETF standardization process conveys some heightened level of security. The cryptography and info-sec communities have weighed in here as clearly as they can, and JWS should only be used when required for interoperability (mostly OIDC).
Otherwise, consider using something else where the algorithm value is stored correctly. There is a huge list of other problems with JWT/JWS, and I haven't even begun to scratch the surface here. JWS can be used safely if you're very careful, but you'll be in for major headaches if you're not in control of all of the libraries your users might use to implement your protocol.
PASETO
The PASETO specification was designed specifically to address the shortcomings of JWS/JWT. PASETO supports the same feature set as JWS/JWT, but the design mistakes around algorithm choices were corrected. PASETO supports signing arbitrary data (not just JSON), and headers are concatenated with the payload through the use of a PAE (Pre-Authentication Encoding), which is designed to make processing untrusted data as safe as possible.
Think of PASETO as a better JWS. Language/library support is growing quickly. PASETO is progressing toward an IETF standard, but it is important to note that this acceptance does not really matter from a security perspective. Many cryptographic systems are in wide use today without any ratified standards (the Signal protocol, for example), and many approved standards are now considered broken or deprecated.
GPG/PGP
GPG/PGP is an all-in-one, opinionated, cryptographic Swiss Army knife. Unfortunately for GPG/PGP, the cryptography community has been moving away from these "kitchen sink" style tools toward single-purpose tools that do one thing very well. GPG can be used to sign files and blobs, and supports a wide variety of algorithms and platforms.
The biggest difference between GPG/PGP and the other protocols described above is the built-in PKI system, called the "web of trust". Explaining this system fully is out of scope for this blog post. It's possible to use PGP without the web of trust, but there are no real benefits to this over some of the other protocols outlined above. If you're somehow already using WOT, please continue to do so! If you're not, it's best to stay away from this tooling.
If you’re using GPG for things other than signatures, consider the recommendations from this excellent blog post as replacements.
Signify and Minisign
These tools represent the opposite end of the spectrum from GPG/PGP. They're minimal, and designed only to sign and verify data. They do not support PKI or algorithm choices. They both use the excellent ed25519 algorithm, which does not require a hash algorithm choice, although minisign does support a "pre-hashed" mode as an option.
The signature formats are defined here for minisign, and here for signify. Minisign supports free-form "trusted comments", which are concatenated to the message before signing. Both tools support "untrusted comments", which are not part of the signed payload. These tools are in common use today in a few different Linux and BSD distributions, and are excellent replacements for GPG if you're not making use of the web of trust or keyservers.
PKCS#7
PKCS#7 is an older standard for storing signature data. It was developed by RSA Laboratories and first published as RFC 2315 in 1998, and has been updated several times, as recently as 2009 in RFC 5652 (CMS).
PKCS#7 is based on the ASN.1-style message encoding found throughout the x.509 and WebPKI ecosystems, which makes it fairly complex to parse and work with. Library support is poor, and this envelope is more commonly used to store lists of certificates or CRLs rather than signatures and signed data. I can’t really think of any reasons to use this over something else described above, unless you need to for some compatibility reason.
Nothing At All!
It’s worth mentioning one more time that you don’t actually need to use one of these formats. They’re handy for sticking all of your signature and signature-related data into a single file or string, but you might not need to do that! If you’re embedding signature data into something else, or transmitting it over a communication channel that already exists, you might be able to just send all these elements separately.
For example, in the Cosign project we evaluated all of these and decided not to use any. We’re embedding the signature data into an OCI Descriptor, which is already a JSON object that supports a flexible mapping of strings to bytes. We just embed each element (signature, payload, cert chain) as its own key, and don’t have to deal with fitting everything into one string or blob.
PKI
Signing data is easy — the verification is hard. Specifically, distributing the correct public keys to your users, and keeping them updated as you revoke or rotate keys. A few different PKI systems (public and private) exist that can be used with some of the specifications described above.
WebPKI-style x.509 certificates, CRLs and chains can technically be used with any of the formats described above, but some have more “built-in” support than others. Some JWS profiles (most notably OIDC) contain well-known header claims to refer to certificates or chains that should be used to verify a certificate as part of a signature. These are difficult to use correctly though, and often result in vulnerabilities.
Even for formats that don’t support x.509 certificates natively, it’s trivial to add this support at a different layer. Conceptually, the signature envelope needs to include an unauthenticated certificate (or chain of certificates). These do not need to be part of the signed payload, which is why they can be added to any of the above formats.
When verifying against a certificate chain instead of a public key, a client trusts some fixed “root” certificate (or pool of certificates). The signature envelope contains a chain of signed certificates, which can be verified against the root. The final leaf certificate contains a public key, which can then verify the signature itself. Again, it doesn’t matter where the certificate chain is stored, or whether it is inside or outside the signature envelope. Trust is established by walking the signature chain from the trusted roots, before the certificates are used.
TUF delegations and roots support a concept similar to certificate chains, but library support is limited (although improving). If you don't need x.509 compatibility for certificates, you should consider using TUF instead of x.509 certificate chains, mostly because TUF handles revocation and rotation much better than x.509 CRLs do. This brings us to our next topic: revocation.
Revocation
Revocation of certificates for web browsing is hard; revocation of certificates for code signing is nearly impossible. It can be done correctly with Timestamp Authorities, but that brings another system and set of keys to trust into the picture. TUF flips the model around by making everything expire with a relatively short TTL. Instead of revoking a public key, you effectively stop renewing it and instruct users to use a different one. The TTL can be configured to trade off between operational complexity and speed of revocation.
Rotation
There's a saying: "Your data is only as good as your last backup, and your backup is only as good as your ability to restore." The same is true for key rotation. If you can't effectively rotate a signing key, you haven't really solved the problem. Keys will get lost, deleted, or stolen. Even if you're convinced your hardware token is secure and the key can never be extracted, the full mug of coffee sitting next to it on your desk might have other ideas.
Plan for rotation. Test rotation. Do it frequently.
Signature Algorithm Choices
EdDSA is by far the best algorithm, and the main variant used in practice is ed25519. Performance is excellent and key sizes are tiny. Signatures are deterministic and do not require random number generation, making the algorithm very hard to screw up. Hashing is built in, although there are variants that support pre-hashing the data. It wouldn't even be worth discussing other algorithms, except for the one problem with ed25519: it is not FIPS-compliant, so many hardware devices and cloud providers do not support it natively. If you care about compliance, or compatibility with systems and people that do, you can't use ed25519 yet.
ECDSA (the Elliptic Curve Digital Signature Algorithm) is the next best choice. There are a few different curves you can choose from, including a set that has been approved and standardized by NIST. Performance is fast, key sizes are small, and security is fine. These curves are broadly supported in libraries, hardware, and services. The main downside to ECDSA is that signatures require a secure random number source. Several large-scale attacks have happened because of insecure random number generation, so be careful to get this part right if you use ECDSA. Use ECDSA if you can’t use ed25519. There are several deterministic variants of ECDSA, but they aren’t widely supported or standardized yet. These don’t really make sense to use because if you don’t care about standards, you should just use ed25519.
RSA is the oldest of the commonly supported signature algorithms, so it may be required for compatibility with legacy systems. There are two variants: RSA-PSS and RSA-PKCS#1 v1.5. PKCS#1 v1.5 is considered broken because it is subject to several known attacks, but many libraries still default to it rather than RSA-PSS. Both versions require a random number source, so all the drawbacks to ECDSA are present here. Only use this if you have to.
Hash Algorithm Choices
Any of the SHA-2 functions are fine. Pick one and use it everywhere, don’t try to design in agility at the protocol level. SHA256 is the most widely supported. SHA512 is actually faster and should probably be used more often than it is, but either one is fine. Don’t use MD5 or SHA-1.
This got much longer than I intended. Hopefully it provides a useful overview of a complex space. I’m happy we’re finally starting to talk about signing and verifying software, it means we’re beginning to take supply chain security seriously as an industry.