
Mental Models for Real-World Cryptography and Trusted Execution Environments
The presentation by Itai Abraham discusses mental models for Trusted Execution Environments (TEEs), focusing on the intersection of distributed computing, cryptography, and game theory. Abraham begins by exploring the fundamental trade-off between security and friction in system design. He notes that most traditional systems prioritize low friction, often at the expense of high security, which can create an illusion of safety. Bitcoin is presented as an extreme example of high security with relatively high friction. The broader blockchain movement aims for high security with lower friction, striving to create a "global computer." Abraham predicts that AI will significantly reduce friction, enabling new points in this trade-off, making previously infeasible or inefficient security measures more viable.
A core question posed is whether secure hardware is necessary when cryptography is robust. Abraham argues that without some form of physical security, cryptography alone is insufficient. In a world with "no physical security," where an adversary can access any secret, most cryptographic applications like public key cryptography, secret keeping, and signing become impossible. This leads to the conclusion that all cryptographic systems implicitly rely on some root of physical trust. Therefore, if hardware must be trusted, the discussion shifts to which hardware to trust and whether this minimal trust can be extended to build more capable, trusted hardware that does more than just keep secrets. The ultimate question is what constitutes the minimal trusted hardware and how its trustworthiness can be guaranteed, especially against backdoors in devices like secure wallets.
Abraham defines physical trust by categorizing adversaries:
1. **Remote/Network Adversary:** Can access a device remotely (e.g., over the internet).
2. **Software Adversary:** Can run local software on the same machine as the secure hardware.
3. **Physical Adversary:** Has physical access to the secure hardware and can use specialized tools to extract information.
The discussion then introduces Trusted Platform Modules (TPMs) as an early form of secure hardware. TPMs are chips (sometimes virtualized) designed to:
1. Store secrets securely.
2. Provide a monotonic counter.
3. Measure other parts of the system in a trusted manner, such as what is running on the CPU.
The ability to measure and sign these measurements with an internal secret enables "attestation." This allows a verifier to confirm that a specific piece of code is running; the TPM can even release secrets only when a signed measurement matches a previously recorded one, ensuring code integrity. While TPMs are considered secure against network and software adversaries and offer relatively good physical security, they are not deemed physically unbreakable. A major challenge with TPMs, particularly in the "trusted boot" paradigm, was the large Trusted Computing Base (TCB) problem: verifying the entire system (BIOS, OS, applications) results in an enormous "circuit" to trust, making it highly susceptible to bugs. This led to systems that verify smaller parts, like those in phones, wallets, or Hardware Security Modules (HSMs).
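To make the measure-and-sign flow concrete, here is a minimal Python sketch of a toy TPM with a single PCR, a monotonic counter, and an internal secret. All names are illustrative, and HMAC stands in for the asymmetric attestation signature a real TPM would use:

```python
import hashlib
import hmac
import os

class ToyTPM:
    """Minimal sketch of a TPM: one PCR, a monotonic counter, and an
    internal secret that never leaves the chip."""

    def __init__(self):
        self._secret = os.urandom(32)   # internal signing secret, never exported
        self._pcr = b"\x00" * 32        # platform configuration register
        self._counter = 0               # monotonic counter, can only go up

    def extend(self, measurement: bytes) -> None:
        # PCR extend: pcr = H(pcr || H(measurement)); an order-sensitive hash chain
        self._pcr = hashlib.sha256(
            self._pcr + hashlib.sha256(measurement).digest()
        ).digest()

    def increment_counter(self) -> int:
        self._counter += 1
        return self._counter

    def quote(self, nonce: bytes) -> tuple[bytes, bytes]:
        # Sign the current PCR together with a verifier-chosen nonce (freshness).
        # A real TPM signs with an asymmetric attestation key; HMAC stands in here.
        tag = hmac.new(self._secret, nonce + self._pcr, hashlib.sha256).digest()
        return self._pcr, tag

# Trusted boot, reduced to its skeleton: each stage is measured before it runs.
tpm = ToyTPM()
for stage in (b"bios-image", b"bootloader-image", b"os-kernel-image"):
    tpm.extend(stage)
pcr, signature = tpm.quote(nonce=os.urandom(16))   # attestation evidence
```

Because the PCR is an order-sensitive hash chain, a verifier who recomputes it from the expected boot stages and checks the signed quote learns exactly which software stack booted.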
The rise of cloud computing and virtualization introduced new security challenges related to "outsourcing computation."
1. **Integrity Problem:** Ensuring the outsourced computation does what it's told, not something else.
2. **Privacy Problem:** Preventing the cloud provider from seeing the input data or the code being executed.
TEEs emerged to address these cloud computing problems, aiming to combine economic efficiency with security. The high-level idea is to provide each virtual machine (VM) with a separate, virtualized TPM for granular integrity proofs. Additionally, TEEs offer isolation, preventing one VM from accessing another's data. This isolation is achieved through a trusted memory manager and encryption. The memory manager holds the keys and ensures that only the currently running VM can access its specific memory, which is decrypted on demand. Other VMs' memory remains encrypted. The critical trust here lies in the memory manager's isolation code.
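A toy sketch of this gating logic, under heavy simplification: `ToyMemoryManager` (a hypothetical name) alone holds the per-VM keys, and a SHA-256 keystream stands in for the hardware AES memory-encryption engine a real TEE would use:

```python
import hashlib
import os

def keystream(key: bytes, page_id: int, length: int) -> bytes:
    # Toy keystream from SHA-256; real TEEs use a hardware memory-encryption engine.
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(
            key + page_id.to_bytes(8, "big") + ctr.to_bytes(8, "big")
        ).digest()
        ctr += 1
    return out[:length]

class ToyMemoryManager:
    """Sketch of the trusted memory manager: it alone holds the per-VM keys,
    and a page is only ever decrypted for the VM that owns it while it runs."""

    def __init__(self):
        self._keys = {}        # vm_id -> memory-encryption key (never leaves the manager)
        self._pages = {}       # (vm_id, page_id) -> ciphertext
        self.running_vm = None

    def create_vm(self, vm_id: str) -> None:
        self._keys[vm_id] = os.urandom(32)

    def _check(self, vm_id: str) -> None:
        if vm_id != self.running_vm:
            raise PermissionError("memory stays encrypted for everyone but the running owner")

    def write(self, vm_id: str, page_id: int, plaintext: bytes) -> None:
        self._check(vm_id)
        ks = keystream(self._keys[vm_id], page_id, len(plaintext))
        self._pages[(vm_id, page_id)] = bytes(a ^ b for a, b in zip(plaintext, ks))

    def read(self, vm_id: str, page_id: int) -> bytes:
        self._check(vm_id)
        ct = self._pages[(vm_id, page_id)]
        ks = keystream(self._keys[vm_id], page_id, len(ct))
        return bytes(a ^ b for a, b in zip(ct, ks))

mm = ToyMemoryManager()
mm.create_vm("vm-a")
mm.create_vm("vm-b")
mm.running_vm = "vm-a"
mm.write("vm-a", 0, b"secret state")
assert mm.read("vm-a", 0) == b"secret state"
# mm.read("vm-b", 0) would raise: vm-b is not the currently running VM
```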
TEEs are designed for low friction, aiming for single-digit overheads compared to regular VMs. They are primarily built to be secure against software attacks, assuming an adversary can compromise all software (BIOS, hypervisor, OS, other VMs) but cannot compromise certain firmware components like the memory manager.
Two key properties of TEEs are:
1. **Integrity:** Achieved through attestation. This requires trusting:
* The ability to measure the VM's code in a trusted manner.
* A component with a secret that can sign this measurement.
* A chain of certificates leading to a trusted root certificate from the hardware vendor.
If these are trusted, TEEs provide a publicly verifiable integrity proof, confirming what code is running on the VM (see the sketch after this list).
2. **Confidentiality/Isolation:** Protects against information leakage. This relies on a trusted memory manager that controls entry and exit from the TEE. When outside the TEE, memory is encrypted or blocked. When inside, only the attested code runs. This provides a strong notion of obfuscation, making it hard to know the exact code or data being processed.
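A minimal sketch of the verifier's side of attestation, assuming a one-link certificate chain and Ed25519 keys via the `cryptography` package (real chains have several links and vendor-specific formats):

```python
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Setup: the vendor root endorses the chip's attestation key ("certificate chain").
vendor_root = Ed25519PrivateKey.generate()       # verifiers know its public key
attestation_key = Ed25519PrivateKey.generate()   # lives inside the TEE hardware
att_pub = attestation_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
endorsement = vendor_root.sign(att_pub)          # the "certificate" on the chip key

# Inside the TEE: measure the VM image, then sign the measurement.
vm_image = b"my-confidential-vm-image"
measurement = hashlib.sha256(vm_image).digest()
quote = attestation_key.sign(measurement)

def verify_attestation(root_pub, att_pub, endorsement, measurement, quote, expected):
    """Accept only if the chain reaches the vendor root AND the measurement
    matches the code the verifier expected to be running."""
    try:
        root_pub.verify(endorsement, att_pub)                          # chain check
        Ed25519PublicKey.from_public_bytes(att_pub).verify(quote, measurement)
    except InvalidSignature:
        return False
    return measurement == expected

assert verify_attestation(vendor_root.public_key(), att_pub, endorsement,
                          measurement, quote, hashlib.sha256(vm_image).digest())
```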
However, TEEs face significant challenges:
* **Physical Adversaries:** Current TEEs are generally weak against physical attacks.
* **Side Channels:** Micro-architectural bugs in memory manager implementations have led to numerous side-channel attacks, though many are being fixed.
* **Oblivious Computation:** Designing applications to avoid leaking information through non-oblivious memory or storage access remains crucial.
Abraham discusses two paradigms for using TEEs:
1. **Defense in Depth:** TEEs are seen as a default, low-friction security layer, akin to using HTTPS instead of HTTP. The expectation is that VMs will eventually run inside TEEs by default because the marginal cost is small.
2. **Security-Critical Applications:** Applications fundamentally rely on TEE security. This implies trusting the cloud service provider for physical security and the hardware vendors for correct implementation, in addition to designing applications to prevent side-channel leaks.
TEEs have various applications in the blockchain space: wallets, bridges, enhanced anonymity/privacy for wallet reads, integrity proofs, block production, MEV protection, consensus protocols, private contracts, and confidential AI inference.
A significant portion of the talk addresses consensus protocols, particularly the 33% Byzantine fault tolerance (BFT) lower bound. This lower bound, known as the "split-brain lower bound," demonstrates that with three nodes (A, B, C) and one Byzantine adversary (B), B can make A and C disagree on a decision (e.g., A decides 1, C decides 0). This is achieved by B communicating different inputs to A and C, effectively "splitting its brain." Abraham points out that this attack doesn't even require a fully Byzantine adversary; a "reboot adversary" that simply reboots and changes its input can achieve the same outcome if its identity remains the same across reboots.
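The equivocation step can be illustrated with a toy one-round majority protocol; this is a deliberate simplification of the full indistinguishability argument behind the lower bound:

```python
def decide(own_input: int, received: dict) -> int:
    """Toy one-round rule: decide the majority of all three inputs.
    This is safe only if every node reports the same value to everyone."""
    votes = [own_input] + list(received.values())
    return 1 if sum(votes) >= 2 else 0

# Honest A has input 1, honest C has input 0, B is Byzantine.
# B "splits its brain": it reports 1 to A but 0 to C.
a_decision = decide(1, {"B": 1, "C": 0})   # A sees two 1s -> decides 1
c_decision = decide(0, {"B": 0, "A": 1})   # C sees two 0s -> decides 0
assert a_decision != c_decision            # agreement between A and C is broken
```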
TEEs can overcome this by ensuring "non-equivocation." If a private key is generated within a TEE and its public key is recorded in a trusted PKI (Public Key Infrastructure), a Byzantine software adversary (without physical access) becomes equivalent to an "omission adversary" (one that can only block messages, not forge them or see secrets). Because a TEE cannot send conflicting messages for the same round, consensus becomes possible with n = 2f+1 nodes tolerating f faults (rather than the usual n = 3f+1), even with mobile adversaries, O(n²) communication, and without Distributed Key Generation (DKG) in asynchrony.
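A sketch of how a TEE-held key can enforce non-equivocation, here using a per-round log (a real design would bind this to the monotonic counter and attested state; HMAC again stands in for the attestation-backed signature):

```python
import hashlib
import hmac
import os

class NonEquivocationSigner:
    """Sketch of a TEE-held key that signs at most one value per round,
    turning equivocation attempts into (detectable) omissions."""

    def __init__(self):
        self._key = os.urandom(32)   # generated inside the TEE, never exported
        self._log = {}               # round -> digest of the value already signed

    def sign(self, round_no: int, value: bytes) -> bytes:
        digest = hashlib.sha256(value).digest()
        if self._log.setdefault(round_no, digest) != digest:
            raise RuntimeError("equivocation refused: round already bound to another value")
        # HMAC stands in for the TEE's signature over (round, value)
        return hmac.new(self._key, round_no.to_bytes(8, "big") + digest,
                        hashlib.sha256).digest()

tee = NonEquivocationSigner()
tee.sign(7, b"block-X")
tee.sign(7, b"block-X")        # re-signing the same value is fine
try:
    tee.sign(7, b"block-Y")    # a conflicting message for the same round
except RuntimeError as e:
    print(e)
```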
However, the "reboot problem" persists: if a TEE reboots and uses the same private key, it's vulnerable to the reboot adversary. To prevent this, a TEE must generate new private/public keys upon reboot, effectively becoming a new node in the system. This requires a reconfiguration event, which can be handled by the existing system if enough non-faulty parties are present. If too many nodes reboot, potentially leading to liveness loss, an external "layer one trust anchor" is needed to record new public keys. This makes TEEs less ideal for Layer 1 blockchains but well-suited for Layer 2s, which can use the Layer 1 as their trust anchor.
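A toy sketch of the rekey-on-reboot flow, with `Layer1Anchor` as a hypothetical name for the external trust anchor and a placeholder byte string standing in for a real keypair:

```python
import os

class Layer1Anchor:
    """Toy trust anchor: an append-only registry of (node, epoch, public key)."""

    def __init__(self):
        self.registry = []

    def register(self, node_id: str, pubkey: bytes) -> int:
        epoch = len(self.registry)
        self.registry.append((node_id, epoch, pubkey))
        return epoch

def reboot(node_id: str, anchor: Layer1Anchor) -> int:
    # On reboot the TEE must NOT reuse its old key: it generates a fresh
    # keypair and rejoins as a logically new node via a reconfiguration event.
    new_pubkey = os.urandom(32)   # placeholder for a freshly generated public key
    return anchor.register(node_id, new_pubkey)

anchor = Layer1Anchor()
reboot("node-3", anchor)   # old key retired; peers learn the new one from the anchor
```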
Abraham highlights "confidential inference" as a promising Layer 3 application for TEEs. Companies like Microsoft, Google, and Apple are exploring TEE-based solutions for confidential AI, where sensitive queries are processed securely in the cloud. These systems often use a "transparency ledger" or a 2f+1 partial synchrony consensus protocol (often with a small number of TEEs, like 2-4) to record the most recent trusted software and models for the inference server, ensuring high availability, safety, and confidentiality of the inferences. WhatsApp also uses a similar ledger for its private cloud compute. The demand for Byzantine fault tolerance in these applications stems from the highly intimate nature of the data, requiring a higher level of trust even against cloud service providers.
The discussion then moves to TEEs and integrity proofs, drawing an analogy between integrity/liveness and confidentiality/safety. Safety and confidentiality are "N out of N" properties (one failure is bad), while liveness and integrity are "one of N" properties (one success is good enough to make progress or prove integrity). Challenges with Zero-Knowledge Proofs (ZKPs) or validity proofs mirror those of TEEs, particularly the large TCB problem when proving large circuits. Many ZKP systems rely on "security councils" (multi-signatures), which undermines the goal of a trustless system.
A proposed solution is "compositional integrity" using a "one out of two" approach. Every statement is processed by two different provers: one generates a ZKP, and the other provides a TEE attestation (a signed measurement traced to a hardware vendor's root of trust). The idea is that the probability of bugs in both systems is small. If both agree, the execution is committed; if they disagree, fallback mechanisms like fraud proofs, security council decisions, or timed delays can be invoked. This defense-in-depth approach adds minimal friction while significantly enhancing security by mitigating potential bugs in either proving system.
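The commit rule itself is tiny; a sketch, assuming boolean verdicts already obtained from the two independent provers:

```python
from enum import Enum

class Verdict(Enum):
    COMMIT = "commit"
    FALLBACK = "fallback"   # fraud proofs, security council, or a timed delay

def compose(zk_proof_ok: bool, tee_attestation_ok: bool) -> Verdict:
    """One-out-of-two compositional integrity: commit only when two
    independently built provers agree; any disagreement routes to the
    slower fallback path instead of accepting a possibly buggy proof."""
    if zk_proof_ok and tee_attestation_ok:
        return Verdict.COMMIT
    return Verdict.FALLBACK

# If the provers have independent bug probabilities p_zk and p_tee,
# a wrong commit requires bugs in BOTH: roughly p_zk * p_tee.
assert compose(True, True) is Verdict.COMMIT
assert compose(True, False) is Verdict.FALLBACK
```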
Finally, Abraham addresses "accountable privacy" in the context of threshold cryptography and Multi-Party Computation (MPC). He notes that while threshold cryptography is theoretically sound, it lacks traction due to the "cryptoeconomics" problem: parties can collude secretly without detection, undermining accountability. Blockchains achieve accountability (for safety and liveness) by cryptoeconomically punishing colluding validators.
To bring accountability to privacy in MPC, Abraham suggests a model where a human staker provides the stake, but the node's secrets are held within a TEE. This prevents the staker from colluding by accessing the secret. The remaining challenge is physical security. The ultimate vision involves TEE-equipped MPC servers that use "robotic AI" to sense their environment for physical threats. If a threat is detected, the TEE not only destroys the secret but also publishes a "slashing event" that punishes the staker, making them accountable for maintaining physical security. This "TEEs plus robotic AI" approach aims to enable MPC in the real world with accountable privacy.
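A sketch of the intended reaction logic, with hypothetical names and the on-chain slashing publication reduced to an in-memory list:

```python
class AccountableMPCNode:
    """Sketch: the TEE holds the MPC secret share; the human staker posts
    stake. A detected physical intrusion destroys the share AND slashes
    the staker, making physical security economically accountable."""

    def __init__(self, staker: str, stake: int, share: bytes):
        self.staker, self.stake = staker, stake
        self._share = share        # secret share, held only inside the TEE
        self.slashing_events = []  # would be published on-chain in a real system

    def on_physical_threat(self) -> None:
        self._share = None         # destroy the secret before anything else
        self.slashing_events.append(
            {"staker": self.staker, "slashed": self.stake, "reason": "physical intrusion"}
        )
        self.stake = 0             # the staker pays for weak physical security

node = AccountableMPCNode("alice", stake=1_000, share=b"\x01\x02")
node.on_physical_threat()   # e.g., the robotic sensors detect tampering
print(node.slashing_events)
```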
Key takeaways include the importance of publicly verifiable secure hardware, TEEs as a security anchor, and defense-in-depth strategies. Future directions include accountable privacy, confidential AI, and the eventual integration of robotics for physical security.