
Chapter 2 · Section 1

A Game-Theoretic Sketch



The attractor argument needs formal grounding. Here is a sketch using standard game theory.

The Communication Game

Consider $n$ agents playing an infinitely repeated game. At each round $t$, agents can:

  • Cooperate ($C$): share accurate information, cost $c$, mutual benefit $b > c$
  • Defect ($D$): share false information, short-term gain $g$, long-term penalty $p$
  • Silence ($S$): communicate nothing, zero cost, zero benefit

The payoff matrix for pairwise interaction:

$$\begin{pmatrix} & C & D & S \\ C & b-c & -c & 0 \\ D & b+g & g & g \\ S & 0 & -p & 0 \end{pmatrix}$$
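The matrix can be checked numerically. The sketch below assumes each entry is the row player's payoff against the column player's action, and picks illustrative values for $b$, $c$, $g$ and $p$ that are not taken from the text. Under those assumptions it asks which action, if any, strictly dominates the others in a single round.

```python
# Illustrative parameter values (assumptions, not from the text); they satisfy b > c.
b, c, g, p = 3.0, 1.0, 2.0, 4.0

ACTIONS = ["C", "D", "S"]

# Row player's payoff, indexed as PAYOFF[own_action][opponent_action].
# Reading the matrix this way is an assumption about its orientation.
PAYOFF = {
    "C": {"C": b - c, "D": -c, "S": 0.0},
    "D": {"C": b + g, "D": g, "S": g},
    "S": {"C": 0.0, "D": -p, "S": 0.0},
}


def strictly_dominates(action, other):
    """True if `action` pays strictly more than `other` against every opponent action."""
    return all(PAYOFF[action][opp] > PAYOFF[other][opp] for opp in ACTIONS)


# An action is strictly dominant if it strictly dominates every other action.
dominant = [
    a for a in ACTIONS
    if all(strictly_dominates(a, o) for o in ACTIONS if o != a)
]
print(dominant)  # → ['D']
```

With these values, defection is the only strictly dominant action in a single round, which is the baseline the repeated game has to overcome.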

The Iterated Result

In a one-shot game, $D$ dominates. But with infinite repetition and discount factor $\delta$:

$$V_C = \frac{b - c}{1 - \delta}, \qquad V_D = g + \delta \cdot V_S = g + 0$$

Cooperation is stable when $\frac{b-c}{1-\delta} > g$, i.e., when $\delta > 1 - \frac{b-c}{g}$.

For sufficiently patient agents ($\delta \to 1$), cooperation dominates.
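To make the patience condition concrete, here is a minimal sketch that evaluates both discounted values and the threshold on $\delta$. The function names and the numeric values of $b$, $c$ and $g$ are assumptions chosen for illustration; the formulas are the ones stated above.

```python
def cooperation_value(b, c, delta):
    """Discounted value of cooperating forever: (b - c) / (1 - delta)."""
    return (b - c) / (1.0 - delta)


def defection_value(g):
    """One-off gain g followed by zero payoffs, as in V_D = g + 0."""
    return g


def patience_threshold(b, c, g):
    """Discount factor above which cooperation beats defection: 1 - (b - c) / g."""
    return 1.0 - (b - c) / g


# Illustrative values (assumptions, not from the text).
b, c, g = 3.0, 1.0, 4.0
threshold = patience_threshold(b, c, g)
print("threshold:", threshold)

for delta in (0.3, 0.5, 0.7, 0.9):
    v_c = cooperation_value(b, c, delta)
    print(delta, round(v_c, 3), v_c > defection_value(g))
```

With these numbers the threshold is 0.5: agents with $\delta$ of 0.3 or exactly 0.5 do not strictly prefer cooperation, while agents with $\delta$ of 0.7 or 0.9 do.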

Adding Communication Capacity

Now add a meta-game: each round of cooperation increases the agents' shared vocabulary, that is, their ability to communicate more precise mental states. Let $\kappa_t$ be the communication capacity at time $t$:

$$\kappa_{t+1} = \kappa_t + \alpha \cdot \mathbf{1}[\text{both cooperate at } t]$$

As $\kappa$ increases, $b$ increases (more can be communicated), $c$ decreases (shared vocabulary reduces encoding cost), and $p$ increases (deception is more detectable in a high-fidelity channel).

This creates a positive feedback loop: cooperation → richer communication → more cooperation.
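The loop can be illustrated with a small simulation. Everything beyond the capacity update rule is an assumption made for this sketch: the functional forms for how the benefit and cost depend on capacity, the parameter values, and the decision rule that both agents cooperate whenever the discount factor exceeds the current threshold $1 - \frac{b-c}{g}$. The penalty $p$ is left out because it does not appear in that threshold.

```python
def simulate(delta, kappa0=0.0, alpha=0.1, rounds=50, b0=3.0, c0=1.0, g=4.0):
    """Track capacity and the cooperation threshold over time.

    The forms b = b0 * (1 + kappa) and c = c0 / (1 + kappa) are assumptions
    chosen only so that b rises and c falls as kappa grows.
    """
    kappa = kappa0
    history = []
    for _ in range(rounds):
        b = b0 * (1.0 + kappa)   # assumed: benefit grows with capacity
        c = c0 / (1.0 + kappa)   # assumed: encoding cost shrinks with capacity
        threshold = 1.0 - (b - c) / g
        both_cooperate = delta > threshold  # assumed decision rule
        history.append((kappa, threshold, both_cooperate))
        if both_cooperate:
            # kappa_{t+1} = kappa_t + alpha * 1[both cooperate at t]
            kappa += alpha
    return history


patient = simulate(delta=0.6)
impatient = simulate(delta=0.4)

print("patient:   final kappa", round(patient[-1][0], 2),
      "final threshold", round(patient[-1][1], 2))
print("impatient: final kappa", round(impatient[-1][0], 2),
      "final threshold", round(impatient[-1][1], 2))
```

In this toy setup, agents who start above the threshold keep cooperating and the threshold keeps falling, while agents who start below it never build any capacity. That dependence on the starting point is a property of these assumed forms, not a result about the general model.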

The Eventual Ethics conjecture is that this loop has a unique stable fixed point, and that it is the communication-maximizing equilibrium described in the previous section.

Whether this fixed point is reachable from arbitrary initial conditions is an open question. The Device claims only that it exists — that there is a coherent target to navigate toward.