
Privacy-Preserving Digital Identity with Abhi Shelat | a16z crypto Research Series
Abhi Shelat introduced recent work on zero-knowledge credentials from ECDSA, which aims to address the challenges of digitizing identity and to improve existing digital credential protocols. The motivating example is Erica, who is tired of carrying a physical government-issued ID card and is concerned about the privacy implications of uploading photos of her ID online for KYC or age verification. The goal is to digitize this identity into a "digital credential" stored on the user's device and signed by the issuing government, with the data kept private to the user and the issuer.
Digital credentials of this type are already deployed in many US states and in countries such as Australia, Japan, South Korea, and India, and the EU is adopting a similar mechanism through its EUDI Wallet project. These systems aim to remove the need to carry physical cards and to enable selective disclosure of attributes (e.g., proving age over 18 without revealing birthdate or address).
The challenge is to improve these deployed protocols and their existing infrastructure. Current legacy protocols involve a user's device responding to a relying party's request (e.g., for age over 18) with an "MDOC" document, a pre-image for the desired attribute, and a signature. The relying party then performs several verification checks, including verifying the MDOC's signature by the state, the device's signature, and that the pre-image asserts the claimed attribute. The proposed solution is to replace these relying party checks with a zero-knowledge proof run on the user's device, with the relying party only verifying the proof.
A sample MDOC document is an encoded object signed by the issuer and stored on the user's phone. It includes "salted hashes" of attributes for selective disclosure. A salted hash is a SHA-256 hash of a pre-image containing a random salt, the attribute name (e.g., "age over 18"), and its value (e.g., "true"). To disclose an attribute, the user sends the MDOC, whose signed portion lists only the salted hashes rather than the attribute values, together with the pre-image for that specific attribute. The relying party hashes the pre-image and checks that the result appears in the MDOC's list of attribute hashes.
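The salted-hash mechanism can be sketched in a few lines. This is a minimal illustration, not the real MDOC format: the JSON pre-image encoding, the 16-byte salt length, and the function names `make_preimage`, `issue`, and `verify_disclosure` are assumptions made for the example, and the issuer's signature over the digest list is omitted.

```python
import hashlib
import json
import secrets


def make_preimage(name: str, value, salt: bytes) -> bytes:
    # Hypothetical encoding: JSON is used here only to keep the sketch
    # stdlib-only; it is not the encoding used by deployed credentials.
    return json.dumps(
        {"salt": salt.hex(), "name": name, "value": value}, sort_keys=True
    ).encode()


def salted_hash(preimage: bytes) -> bytes:
    return hashlib.sha256(preimage).digest()


def issue(attributes: dict):
    """Issuer side: build the digest list that would be signed, plus the
    per-attribute pre-images that stay on the user's device."""
    preimages = {
        name: make_preimage(name, value, secrets.token_bytes(16))
        for name, value in attributes.items()
    }
    digests = [salted_hash(p) for p in preimages.values()]
    return digests, preimages


def verify_disclosure(mdoc_digests, preimage: bytes, name: str, value) -> bool:
    """Relying party: hash the pre-image, check that the digest is listed,
    then check that the pre-image asserts the claimed attribute."""
    if salted_hash(preimage) not in mdoc_digests:
        return False
    claim = json.loads(preimage)
    return claim["name"] == name and claim["value"] == value


if __name__ == "__main__":
    digests, preimages = issue({"age_over_18": True, "family_name": "Example"})
    print(verify_disclosure(digests, preimages["age_over_18"], "age_over_18", True))
```

Because each attribute has its own random salt, the undisclosed digests reveal nothing useful about the other attributes, while the disclosed pre-image can be checked against the signed list.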
A critical concern with digital credentials is copying. To prevent this, "device binding" ties the digital credential to the phone's hardware. During issuance, the phone's secure element generates a signing key pair and keeps the secret key inside the hardware. The device public key, along with user information and live selfies, is sent to the state issuer. The issuer verifies the user's identity and signs the MDOC, importantly including the device public key in it. The signed MDOC is then sent back to the phone.
The problem with device binding is that current phone hardware (secure elements) supports only RSA and ECDSA signatures for a specific set of keys, not newer cryptographic primitives like bilinear pairings or lattice-based signatures. This limits the types of zero-knowledge systems that can be used. Furthermore, the device public key in the MDOC acts as a "super cookie," allowing tracking across different service providers, which is a significant privacy concern. Even issuing thousands of MDOCs with different device keys doesn't fully solve the problem, as the issuer can still collude with service providers to link user activity. This inherent privacy problem within device binding is what the zero-knowledge system aims to solve.
The proposed solution leverages a zero-knowledge system. Unlike blockchain applications that prioritize minimizing verifier time, this context prioritizes minimizing prover time on the user's small device, with the relying party (server) acting as a powerful verifier. Key constraints and considerations for this project include:
1. **Engineering Trade-offs:** The focus is on applying existing zero-knowledge techniques to current infrastructure rather than designing from scratch.
2. **Standards Coordination:** Achieving agreement on new standards, especially for trusted parameters (e.g., a half-gigabyte public parameter across 50 US states or 100 countries), is extremely difficult. This makes trusted parameters practically impossible to deploy.
3. **Legacy Systems and Emotional Ties:** Identity standards have a long history, and deployed systems are resistant to change, even for minor performance improvements.
4. **Issuer Limitations:** State governments have limited secure hardware and cannot scale to internet-level activity. Protocols should not require issuers to be involved in every online transaction (O(internet activity)), but rather only in user-specific issuance (O(number of users)).
5. **Post-Quantum Concerns:** Some organizations are unwilling to certify or deploy systems using elliptic curves due to post-quantum security concerns, despite expert opinions suggesting quantum computers won't break current cryptography in their lifetime. This is contradictory, as the legacy signatures being proven (ECDSA) are also elliptic curve-based.
6. **Complexity vs. User Benefit:** Optimal techniques might be too complex to implement and deploy if they don't offer a substantial user benefit.
The chosen zero-knowledge system uses a recipe based on the Hyrax paper: run an interactive protocol (IP) for the computation, commit to the transcript, and then give a specialized zero-knowledge proof that the commitment contains the transcript and that the IP verifier would have accepted the computation. This approach benefits from the transcript being much shorter than the computation being proven, so far less data has to be committed.
Specifically, they use:
* **Interactive Protocol:** A sum check or GKR-based protocol.
* **Commitment and Zero-Knowledge Scheme:** Ligero.
The high-level architecture involves the prover committing to the private witness and a "pad" (used to encrypt/commit to the sum check transcript) using Ligero. Then, a sum check protocol runs between prover and verifier. When the sum check sends a message, it's XORed with the pad, forming a commitment. The verifier sends random challenges and records the transcript without verifying it. Finally, a Ligero proof system is used to prove that the sum check verifier, run on the recorded transcript, witness, and pad, would have accepted the computation. This involves building the sum check, Ligero, and circuit definitions, then gluing them together.
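The flow above can be illustrated with a toy sum check over a multilinear function given as a table of values. This is a sketch under several assumptions: the field (the prime 2^61 - 1), the function names, and the table layout are chosen for the example, the pad is applied by XOR on 61-bit integers as the summary describes, and the final acceptance check is run in the clear, whereas the real system proves it inside Ligero without revealing the witness or the pad.

```python
import secrets

P = 2**61 - 1  # small prime field, for illustration only
BITS = 61


def make_pad(rounds: int):
    """One random mask per prover message (two field elements per round)."""
    return [(secrets.randbits(BITS), secrets.randbits(BITS)) for _ in range(rounds)]


def fold(table, r):
    """Fix the first variable of the multilinear table to the challenge r."""
    half = len(table) // 2
    return [(table[i] + r * (table[half + i] - table[i])) % P for i in range(half)]


def prove_and_record(table, pad):
    """Prover runs sum check; each message is XORed with the pad before it
    is sent. The verifier only samples challenges and records the transcript."""
    masked, challenges = [], []
    for p0, p1 in pad:
        half = len(table) // 2
        g0 = sum(table[:half]) % P   # round polynomial evaluated at 0
        g1 = sum(table[half:]) % P   # round polynomial evaluated at 1
        masked.append((g0 ^ p0, g1 ^ p1))
        r = secrets.randbelow(P)
        challenges.append(r)
        table = fold(table, r)
    return masked, challenges


def verifier_would_accept(claim, table, masked, pad, challenges) -> bool:
    """The predicate that the outer proof attests to: unmask the transcript
    and replay the sum check verifier on it."""
    for (m0, m1), (p0, p1), r in zip(masked, pad, challenges):
        g0, g1 = m0 ^ p0, m1 ^ p1
        if (g0 + g1) % P != claim:
            return False
        claim = (g0 + r * (g1 - g0)) % P  # g(r) for a degree-1 polynomial
        table = fold(table, r)
    return table[0] == claim  # final evaluation of the multilinear extension


if __name__ == "__main__":
    witness = [3, 1, 4, 1, 5, 9, 2, 6]  # 2^3 values, so 3 rounds
    pad = make_pad(3)
    masked, challenges = prove_and_record(witness, pad)
    print(verifier_would_accept(sum(witness) % P, witness, masked, pad, challenges))
```

The recorded transcript here is three pairs of field elements for a table of eight values; the gap between transcript size and witness size grows with the size of the computation.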
Key results and optimizations:
* **Legacy Compatibility:** The scheme works with existing legacy systems and is being deployed in Google Wallet. It's designed to be robust to new requirements.
* **Novelty Aspects:** Division of verification tasks over multiple fields and a simple MAC trick for consistency across witnesses in different fields.
* **Optimizations:** FFT optimizations for speed, reliance only on SHA-256, and using quadratic forms to make circuits smaller, reducing the number of variables in the sum check.
* **Ligero on Layered Sum Check:** Instead of committing to every multiplication in the circuit (as in pure Ligero), this approach only commits to inputs and the sum check transcript, significantly reducing commitment size.
**ECDSA Verification Example:**
Verifying an ECDSA signature involves an exponentiation with three bases and three scalars.
* The base field of P-256 is not friendly to the Number Theoretic Transform (NTT) because it lacks enough power-of-two roots of unity.
* Its quadratic extension, however, has ample roots of unity, allowing for efficient NTT.
* ECDSA verification (23,000 quadratic form terms, 50,000 multiplies/adds) has a 50x reduction in input size compared to the full circuit. This leads to a 23x reduction in commitment size.
* **Performance:** On a Pixel 6 Pro (a four-year-old phone), producing an ECDSA zk-proof takes ~80 milliseconds, and verification takes ~8 milliseconds. This is a significant improvement over previous techniques (~140 seconds) and recent papers (~10-15x overhead). On faster hardware like a Macintosh or modern iPhone, it's even quicker.
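The roots-of-unity point in the list above can be checked numerically. A field with q elements has a multiplicative group of order q - 1, so the largest power-of-two NTT it supports is determined by how many times 2 divides q - 1. The sketch below computes this for the P-256 base field and its quadratic extension; the helper name `two_adicity` is made up for the example, and the prime is the standard P-256 field prime.

```python
def two_adicity(n: int) -> int:
    """Largest k such that 2**k divides n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


# P-256 base field prime
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1

base_field = two_adicity(P256 - 1)        # group order of F_p is p - 1
extension = two_adicity(P256**2 - 1)      # group order of F_{p^2} is p^2 - 1

if __name__ == "__main__":
    print(base_field)  # 1
    print(extension)   # 97
```

The base field only has a square root of unity among the power-of-two roots, while the quadratic extension has roots of unity of order up to 2^97, which is far more than any practical transform size needs.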
**SHA-256 Pre-image Proof:**
* Proving knowledge of a SHA-256 pre-image for an N-block message: 1 block takes 11-20 milliseconds on a Pixel 6 Pro.
* Pure Ligero for one block takes ~273 milliseconds, demonstrating the performance gain from the layered sum check and reduced commitment size.
**Full Legacy Protocol ZK Theorem:**
The ZK theorem proves the existence of witnesses (MDOC string, hashes, values, nonces, timestamps) for public values, subject to predicates:
* Hashing the MDOC (most expensive part).
* Verifying issuer and device public key signatures.
* Parsing the document and checking attribute values (e.g., age over 18 is true).
* Computing more hashes.
* Performing range proofs for credential validity period against the current time.
**Deployment & Performance:**
* The proof on a Pixel 6 Pro takes about 1 second; on a modern iPhone, 300-400 milliseconds. This is within the human interaction time (e.g., face scan for secure element authentication) and connection setup, making the proof generation time effectively "hidden."
* The proving time is considered "fast enough" (target budget was 3 seconds). Further optimizations aim to tolerate "silly decisions" in credential design and provide a cushion for future requirements.
* **Proof Size:** About 400 kilobytes, acceptable given that current methods send 3 megabytes. Smaller size is always better, especially for in-person Bluetooth transfers, but hash-based regimes with Merkle paths make substantial reduction difficult.
**New Features & Robustness:**
The circuit model offers flexibility for new requirements:
1. **Revocation:** Proving non-revocation is crucial. The proposed scheme involves revoked credentials sorted by identifier. The revocation authority signs consecutive pairs of IDs (R_i, R_{i+1}). A user downloads the "bucket" containing their credential and gives a constant-size ZK proof of one signature (on a pair) that their ID falls between R_i and R_{i+1}. A version identifier ensures the latest revocation list is used. This adds 20-60ms to a proof.
2. **Pseudonyms:** Cryptographically bound pseudonyms (e.g., restricting a user to 2-3 accounts on a website) can be created by applying a PRF to a secret attribute and the current context (URL, date). This has almost zero cost.
3. **Hiding the Issuer:** An "OR" proof can show the MDOC is signed by one of many public keys (PK1, PK2, PK3...). This adds minimal complexity as it's a public parameter and uses simple multiplexers within the circuit.
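The revocation and pseudonym items above can be sketched as plain checks. Everything here is an illustrative assumption rather than the deployed design: HMAC-SHA256 with a demo key stands in for the revocation authority's public-key signature (a real deployment would use an actual signature scheme), HMAC-SHA256 also stands in for the unspecified PRF, the sentinel bounds and message encoding are invented for the example, and the checks run in the clear instead of inside a zero-knowledge circuit.

```python
import bisect
import hashlib
import hmac

AUTHORITY_KEY = b"demo-revocation-authority-key"  # hypothetical stand-in key


def sign_pair(lo: int, hi: int, version: int) -> bytes:
    # Stand-in for the authority's signature on (R_i, R_{i+1}) and the version.
    msg = f"{version}:{lo}:{hi}".encode()
    return hmac.new(AUTHORITY_KEY, msg, hashlib.sha256).digest()


def publish(revoked_ids, version: int, max_id: int = 2**64):
    """Authority: sort revoked IDs and sign every consecutive pair."""
    bounds = [-1] + sorted(revoked_ids) + [max_id]
    return [(lo, hi, sign_pair(lo, hi, version)) for lo, hi in zip(bounds, bounds[1:])]


def find_gap(gaps, cred_id: int):
    """User: locate the signed pair that strictly brackets cred_id,
    or return None if the credential is on the revoked list."""
    lows = [lo for lo, _, _ in gaps]
    i = bisect.bisect_left(lows, cred_id) - 1
    if i < 0:
        return None
    lo, hi, sig = gaps[i]
    return (lo, hi, sig) if lo < cred_id < hi else None


def non_revoked(gap, cred_id: int, version: int) -> bool:
    """The statement a constant-size proof would cover: one valid signature
    on a pair for the current version, and R_i < id < R_{i+1}."""
    lo, hi, sig = gap
    return hmac.compare_digest(sig, sign_pair(lo, hi, version)) and lo < cred_id < hi


def pseudonym(secret_attribute: bytes, url: str, date: str) -> str:
    """PRF applied to a secret attribute and the current context."""
    context = f"{url}|{date}".encode()
    return hmac.new(secret_attribute, context, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    gaps = publish([40, 10, 25], version=7)
    gap = find_gap(gaps, 17)
    print(non_revoked(gap, 17, 7))
    print(pseudonym(b"secret", "https://example.com", "2025-01-01")[:16])
```

The pseudonym is stable for a fixed context and unrelated across contexts, so a site can recognize a returning user without learning the secret attribute, and two sites cannot match their pseudonyms against each other.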
**Regulatory Approval:** This is the next major hurdle, involving discussions with cryptographic boards. The scheme is mentioned in the EU digital identity wallet webpage and is chosen for age verification in Europe.
**Comparison with Alternative Schemes:**
* **Crescent Scheme (Vampy):** A clever idea to rerandomize expensive proofs. However, it requires a trusted parameter, costly regulatory approval for two schemes, and a bilinear pairing (post-quantum concern). It also still needs a circuit for device binding.
* **BBS+ Family:** An anonymous credential scheme that has been in the literature for 20 years.
    * One variant requires pairing (post-quantum concern).
    * Another variant doesn't use pairing but requires the user to contact the issuer for every proof, which is not scalable for internet activity. This also goes against the "no phone home" privacy advocate community.
* **Index Disclosure Problem:** BBS+ signs a vector of messages (attributes). While it can prove an attribute's value, it does not hide the attribute's index in the vector. If issuers use different schemas (attribute ordering) or augment schemas over time, disclosing the index leaks privacy and allows linking, violating unlinkability. This would require all issuers worldwide to agree on an exact, unchangeable schema, which is deemed impossible.
* **Device Binding:** BBS+ doesn't inherently support device binding on existing phone hardware. There are proposals, but they face challenges, e.g., BLS curves for bilinear pairings are not large enough to embed public keys into BBS credentials.
The speaker concludes that their layered approach, while having the complexity of verifying circuits, is deployable, performant, and addresses the critical privacy and engineering concerns. The main arguments against it are complexity for average engineers/regulators, but efforts like formal verification are underway. The debate with BBS+ camps often centers on perceived simplicity versus full deployment complexity. While BBS+ might be theoretically faster in some aspects, its practical deployment issues (index disclosure, device binding, issuer involvement, trusted parameters) are significant. The current scheme is ready for deployment today, addressing immediate needs like EU age verification, whereas BBS+