
On This Problem Rational People Do Worse
AI Summary
Newcomb’s Paradox presents a thought experiment that challenges our fundamental understanding of rationality, free will, and decision-making. The setup is straightforward: you enter a room containing a supercomputer and two boxes. Box A is transparent and clearly contains $1,000. Box B is opaque, and its contents are a mystery. You are given a choice: you can either take only Box B (one-boxing) or take both Box A and Box B (two-boxing).
The twist lies in the supercomputer’s role. It is a near-perfect predictor: it has forecast the choices of thousands of people before you with almost total accuracy. Before you even entered the room, the computer predicted what you would do. If it predicted you would take only the mystery box, it placed $1 million inside it. If it predicted you would take both boxes, it left the mystery box empty. Crucially, the prediction and the placement of the money happened in the past. The boxes are already set; nothing you do now can physically change whether the million dollars is inside Box B.
This problem divides people into two nearly equal camps, each convinced the other is acting irrationally. The "one-boxers" rely on Evidential Decision Theory. They argue that because the computer is incredibly accurate, your choice provides strong evidence of the outcome. If you choose one box, the probability is overwhelming that the computer predicted this and placed the million dollars inside. From this perspective, the expected utility of one-boxing is far higher than that of two-boxing. Since almost everyone who has one-boxed has walked away a millionaire, while almost everyone who two-boxed left with only $1,000, the "rational" choice is to take the single box.
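The one-boxer's arithmetic can be sketched in a few lines of Python. The payoffs come from the setup above; the 99% accuracy figure and the function names are assumptions made for illustration, since the summary only says the predictor is "near-perfect."

```python
# Payoffs from the setup: Box A always holds $1,000;
# Box B holds $1,000,000 only if one-boxing was predicted.
BOX_A = 1_000
BOX_B = 1_000_000


def evidential_expected_value(accuracy: float) -> dict[str, float]:
    """Expected payoff of each choice, treating the choice as evidence
    about the prediction. `accuracy` is an assumed predictor hit rate."""
    # One-boxing: you get the million only when the predictor was right.
    one_box = accuracy * BOX_B
    # Two-boxing: you get only Box A when the predictor was right,
    # and both amounts when it wrongly predicted one-boxing.
    two_box = accuracy * BOX_A + (1 - accuracy) * (BOX_A + BOX_B)
    return {"one_box": one_box, "two_box": two_box}


def break_even_accuracy() -> float:
    """Accuracy at which both choices have the same expected payoff.
    Solving p*B = A + (1 - p)*B for p gives p = (A + B) / (2*B)."""
    return (BOX_A + BOX_B) / (2 * BOX_B)


if __name__ == "__main__":
    print(evidential_expected_value(0.99))  # assumed 99% accuracy
    print(break_even_accuracy())
```

Under the assumed 99% accuracy, one-boxing is worth about $990,000 in expectation against about $11,000 for two-boxing. On this way of counting, the predictor only needs to be right slightly more than half the time (50.05%) for one-boxing to come out ahead.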
Conversely, the "two-boxers" adhere to Causal Decision Theory. They argue that because the computer made its prediction in the past, your current decision cannot causally influence the contents of the boxes. At the moment of your choice, the million dollars is either there or it isn’t. In either scenario, taking both boxes nets you an additional $1,000. If the million is there, you get $1,001,000; if it isn’t, you get $1,000. This is known as "strategic dominance"—one strategy is always better than the other, regardless of the external state of the world. Two-boxers view one-boxing as "wishful thinking," or an attempt to change the past through current thoughts.
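The two-boxer's dominance argument can also be written out as a small payoff table. This is a minimal sketch: the state and action labels and the helper names are hypothetical, and `prob_full` stands for whatever chance you assign to Box B being full, held fixed no matter what you choose (the causal assumption).

```python
BOX_A = 1_000
BOX_B = 1_000_000

# PAYOFF[state][action]: the state is what is already inside Box B,
# the action is what you do now.
PAYOFF = {
    "box_b_full": {"one_box": BOX_B, "two_box": BOX_A + BOX_B},
    "box_b_empty": {"one_box": 0, "two_box": BOX_A},
}


def strictly_dominates(a: str, b: str) -> bool:
    """True if action `a` pays more than action `b` in every fixed state."""
    return all(row[a] > row[b] for row in PAYOFF.values())


def causal_expected_value(prob_full: float) -> dict[str, float]:
    """Expected payoff when the chance that Box B is full does not
    depend on the action taken."""
    return {
        action: prob_full * PAYOFF["box_b_full"][action]
        + (1 - prob_full) * PAYOFF["box_b_empty"][action]
        for action in ("one_box", "two_box")
    }


if __name__ == "__main__":
    print(strictly_dominates("two_box", "one_box"))
    print(causal_expected_value(0.5))
```

Whatever value `prob_full` takes, two-boxing comes out exactly $1,000 ahead, which is the whole of the causal argument. The disagreement with the one-boxers is over whether that probability may be held fixed across the two choices at all.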
The paradox reveals a deep tension between these two theories of choice. It forces us to ask: what does it mean to be rational? If being "rational" (two-boxing) results in you being poorer than the "irrational" person (one-boxing), is the theory of rationality itself flawed? This is often called the "Why Ain’t You Rich?" argument. Some philosophers argue that if a game rewards "irrationality," then acting irrationally becomes the most successful—and therefore, in a sense, the most rational—path.
The problem also touches on the existence of free will. If a computer can predict your choices with near-total accuracy, it suggests that your "choice" might be predetermined. If the future is already written, the distinction between the "moment of choice" and the "past prediction" begins to blur. However, even if free will is an illusion, we must operate as if it is real to maintain a functioning society and legal system.
To understand how one might bridge the gap between these two camps, the transcript explores the concept of "pre-commitment." This is illustrated through the Prisoner’s Dilemma and the strategy of Mutually Assured Destruction (MAD). In a one-shot Prisoner’s Dilemma, the "rational" act is to defect (betray your partner). However, in an "iterated" version of the game, one played repeatedly against the same partner, strategies built on cooperation tend to be the most successful. Similarly, in nuclear geopolitics, a leader must pre-commit to a "worse" or even "irrational" option (retaliating with nuclear weapons even if it means global extinction) to prevent an attack from ever happening. The commitment itself creates the desired outcome.
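The contrast between the one-shot and the iterated Prisoner’s Dilemma can be illustrated with a short simulation. Everything specific here is an assumption for the sake of the sketch: the point values (5, 3, 1, 0) are the conventional textbook payoffs, not figures from the transcript, and "tit for tat" (cooperate first, then copy the partner's last move) is just one well-known cooperative strategy chosen as an example.

```python
# Assumed payoffs: (my_move, their_move) -> my points. "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def always_defect(opponent_history: list[str]) -> str:
    """The one-shot 'rational' move, repeated every round."""
    return "D"


def tit_for_tat(opponent_history: list[str]) -> str:
    """Cooperate first, then mirror the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds: int) -> tuple[int, int]:
    """Play `rounds` rounds and return the total score of each player."""
    seen_by_a: list[str] = []  # moves player A has seen from B
    seen_by_b: list[str] = []  # moves player B has seen from A
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b


if __name__ == "__main__":
    print(play(always_defect, tit_for_tat, 1))     # one-shot game
    print(play(always_defect, always_defect, 10))  # two defectors
    print(play(tit_for_tat, tit_for_tat, 10))      # two cooperators
```

In a single round the defector beats the cooperator 5 to 0. Over ten rounds, though, two tit-for-tat players earn 30 points each while two defectors earn only 10 each, so a pair committed to cooperation ends up far better off than a pair that makes the "rational" move every time.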
In Newcomb’s Paradox, the only way to "win" is to be the kind of person the computer identifies as a one-boxer. This suggests that rationality isn't just about the specific act you perform in the moment, but about the rules and character traits you choose to live by. If you "wire" yourself to be a person who honors commitments and follows certain rules of cooperation, you align yourself with the best possible version of your future self.
Ultimately, Newcomb’s Paradox isn't just a riddle about money; it’s a reflection of how we navigate a world where our choices are influenced by patterns, reputations, and the predictable nature of human behavior. Whether you take one box or two depends on whether you give more weight to what your choice reveals about the likely outcome or to the causal logic of the immediate moment. While there is no consensus on the "correct" answer, the problem suggests that sometimes the most successful way to live is to commit to a set of rules that might seem irrational in isolation but lead to the best results in the long run.