*Peter Norvig*

# The Devil and the Coin Flip Game

If the Devil ever challenges me to a fiddle contest, I'm going down. But here is a contest where I'd have a better chance:

You're playing a game with the Devil, with your soul at stake. You're sitting at a circular table which has 4 coins, arranged in a diamond, at the 12, 3, 6, and 9 o'clock positions. You are blindfolded, and can never see the coins or the table.

Your goal is to get all 4 coins showing heads, by telling the devil the position(s) of some coins to flip. We call this a "move" on your part. The Devil must faithfully perform the requested flips, but may first sneakily rotate the table any number of quarter-turns, so that the coins are in different positions. You keep making moves, and the Devil keeps rotating and flipping, until all 4 coins show heads.

Example: You tell the Devil to flip the 12 o'clock and 6 o'clock positions. The Devil might rotate the table a quarter turn clockwise, and then flip the coins that have moved into the 12 o'clock and 6 o'clock positions (which were formerly at 3 o'clock and 9 o'clock). Or the Devil could have made any other rotation before flipping.

What is a shortest sequence of moves that is *guaranteed* to win, no matter what the initial state of the coins, and no matter what rotations the Devil applies?

(This same puzzle also appeared in The Riddler on 21 June 2019, with the role of the devil played by a banker. I'm not sure which is scarier.)

# Analysis

- We're looking for a "shortest sequence of moves" that reaches a goal. That's a shortest path search problem. I've done that before.
- Since the Devil gets to make moves too, you might think that this is a minimax problem: that we should choose the move that leads to the shortest path, given that the Devil has the option of making moves that lead to the longest path.
- But minimax only works when you know what moves the opponent is making: he did *that*, so I'll do *this*. In this problem the player is blindfolded; that makes it a partially observable problem (in this case, not observable at all, but it is traditional to say "partially").
- In such problems, we don't know for sure the true state of the world before or after any move. So we should represent what *is* known: *the set of states that we believe to be possible*. We call this a *belief state*. At the start of the game, each of the four coins could be either heads or tails, so that's $2^4 = 16$ possibilities in the initial belief state: `{HHHH, HHHT, HHTH, HHTT, HTHH, HTHT, HTTH, HTTT, THHH, THHT, THTH, THTT, TTHH, TTHT, TTTH, TTTT}`
- So we have a single-agent shortest-path search in the space of belief states (not the space of physical states of the coins). We search for a path from the initial belief state to the goal belief state, which is `{HHHH}` (meaning that 4 heads is the only possibility).
- A move updates the belief state as follows: for every four-coin sequence in the current belief state, rotate it in every possible way, and then flip the coins specified by the position(s) in the move. Collect all these results together to form the new belief state. The search space is small (just $2^{16}$ possible belief states), so run time will be fast.
- I'll Keep It Simple, and not worry about rotational symmetry (although we'll come back to that later).

# Basic Data Structures and Functions

What data structures will I be dealing with?

- `Coins`: a *coin sequence* (four coins, in order, on the table) is represented as a `str` of four characters, such as `'HTTT'`.
- `Belief`: a *belief state* is a `frozenset` of `Coins` (frozen so it can be hashed), like `{'HHHT', 'TTTH'}`.
- `Position`: an integer index into the coin sequence; position `0` selects the `H` in `'HTTT'`.
- `Move`: a set of positions to flip, such as `{0, 2}`.
- `Strategy`: an ordered list of moves. A blindfolded player has no feedback, thus there are no decision points in the strategy.

I take the coin sequence `'HTTT'` to mean there is an `'H'` at the 12 o'clock position, and that the 3, 6, and 9 o'clock positions, in that order, are `'T'`.

What basic functions do I need to manipulate these data structures?

- `all_moves()`: returns a list of every possible move a player can make.
- `all_coins()`: returns a belief state consisting of the set of all 16 possible coin sequences: `{'HHHH', 'HHHT', ...}`.
- `rotations(coins)`: returns a belief set of all 4 rotations of the coin sequence.
- `flip(coins, move)`: flips the specified positions within the coin sequence. (But leave `'HHHH'` alone, because it ends the game.)
- `update(belief, move)`: returns an updated belief state: all the coin sequences that could result from any rotation followed by the specified flips.

Let's try out these functions:

There are 16 coin sequences in the `all_coins` belief state. If we update this belief state by flipping all 4 positions, we should get a new belief state where we have eliminated the possibility of 4 tails (nothing can flip to 4 tails, because if there had been 4 heads, you would have already won), leaving 15 possible coin sequences:
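A minimal sketch of how the five functions described above could be written, assuming the representations listed earlier (`str` for coins, `frozenset` for beliefs, sets of positions for moves). The function bodies are an illustration under those assumptions, not necessarily the original definitions; the two `print` calls reproduce the checks just described.

```python
from itertools import combinations, product

Coins = ''.join       # a coin sequence is a str such as 'HTTT'
Belief = frozenset    # a belief state is a frozenset of coin sequences

def all_moves():
    "Every possible move: each subset of the positions {0, 1, 2, 3}."
    return [set(m) for r in range(5) for m in combinations(range(4), r)]

def all_coins():
    "The belief state containing all 16 coin sequences."
    return Belief(map(Coins, product('HT', repeat=4)))

def rotations(coins):
    "The set of all 4 rotations of a coin sequence."
    return {coins[r:] + coins[:r] for r in range(4)}

def flip(coins, move):
    "Flip the coins at the positions in move (but leave 'HHHH' alone)."
    if 'T' not in coins:
        return coins
    return Coins(('T' if c == 'H' else 'H') if i in move else c
                 for i, c in enumerate(coins))

def update(belief, move):
    "All coin sequences reachable by any rotation followed by the flips in move."
    return Belief(flip(c, move) for coins in belief for c in rotations(coins))

print(len(all_coins()))                        # 16
print(len(update(all_coins(), {0, 1, 2, 3})))  # 15
```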

Everything looks good so far.

# Search for a Winning Strategy

The generic function `search` does a breadth-first search starting from a `start` state, looking for a `goal` state, considering possible `actions` at each turn, and computing the `result` of each action (`result` is a function such that `result(state, action)` returns the new state that results from executing the action in the current state). `search` works by keeping a `queue` of unexplored possibilities, where each entry in the queue is a pair consisting of a *strategy* (sequence of moves) and a *state* that the strategy leads to. We also keep track of a set of `explored` states, so that we don't repeat ourselves. I've defined this function (or one just like it) multiple times before, for use in different search problems.

Note that `search` doesn't know anything about belief states; it is designed to work on plain-old physical states of the world. But amazingly, we can still use it to search over belief states: it just works, as long as we properly specify the start state, the goal state, and the means of moving between states.

The `coin_search` function calls `search` to solve our specific problem:
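A sketch of how `search` and `coin_search` might look, following the description above: a breadth-first search over a queue of (strategy, state) pairs with an `explored` set. The compact `flip` and `update` helpers are restated here only so the block runs on its own; all bodies are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations, product

def search(start, goal, actions, result):
    "Breadth-first search from start to goal; return a list of actions, or None."
    explored = {start}
    queue = deque([([], start)])          # entries are (strategy, state) pairs
    while queue:
        strategy, state = queue.popleft()
        if state == goal:
            return strategy
        for action in actions:
            state2 = result(state, action)
            if state2 not in explored:
                explored.add(state2)
                queue.append((strategy + [action], state2))
    return None

def flip(coins, move):
    "Flip the positions in move, leaving 'HHHH' alone because the game is over."
    if coins == 'HHHH':
        return coins
    return ''.join(('T' if c == 'H' else 'H') if i in move else c
                   for i, c in enumerate(coins))

def update(belief, move):
    "New belief state: every rotation of every sequence, then the flips."
    return frozenset(flip(c[r:] + c[:r], move) for c in belief for r in range(4))

def coin_search():
    "Search belief states for a shortest strategy that reaches {'HHHH'}."
    start = frozenset(map(''.join, product('HT', repeat=4)))
    moves = [set(m) for r in range(1, 5) for m in combinations(range(4), r)]
    return search(start, frozenset({'HHHH'}), moves, update)

strategy = coin_search()
print(len(strategy), strategy)
```

Because breadth-first search expands strategies in order of length, the first strategy that reaches the goal belief state is a shortest one.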

That's a 15-move strategy that is guaranteed to lead to a win. **Stop here** if all you want is the answer to the puzzle.

Or you can continue on ...

# Verifying the Winning Strategy

I don't have a proof, but I have some evidence that this strategy is the answer:

- Exploring with paper and pencil, it looks good.
- A colleague did the puzzle and got the same answer.
- It passes the `probably_wins` test below.

The call `probably_wins(strategy, k)` plays the strategy *k* times against each possible starting position, assuming a Devil that chooses rotations at random. Note this is dealing with concrete, individual states of the world, like `HTHH`, not belief states. If `probably_wins` returns `False`, then the strategy is *definitely* flawed. If it returns `True`, then the strategy is *probably* good, but that does not prove it will win every time (and either way `probably_wins` makes no claims about being a *shortest* strategy).
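A minimal sketch of such a randomized test, assuming a helper named `play` (a name chosen here for illustration) that runs one game against a Devil who rotates uniformly at random. The hard-coded strategy is the 15-move sequence listed in the proof section of this document (FFFF, FLFL, FFFF, FFLL, ...), rewritten as sets of positions.

```python
import random
from itertools import product

def flip(coins, move):
    "Flip the positions in move."
    return ''.join(('T' if c == 'H' else 'H') if i in move else c
                   for i, c in enumerate(coins))

def play(strategy, coins):
    "Play one game from a concrete starting state; return the final coins."
    for move in strategy:
        if 'T' not in coins:              # all heads: the game is already won
            break
        r = random.randrange(len(coins))  # the Devil's (random) rotation
        coins = flip(coins[r:] + coins[:r], move)
    return coins

def probably_wins(strategy, k=100):
    "Play k games from each of the 16 starting states; True if every game is won."
    starts = [''.join(c) for c in product('HT', repeat=4)]
    return all(play(strategy, coins) == 'HHHH'
               for coins in starts for _ in range(k))

ALL, OPP, ADJ, ONE = {0, 1, 2, 3}, {0, 2}, {0, 1}, {0}
strategy = [ALL, OPP, ALL, ADJ, ALL, OPP, ALL, ONE,
            ALL, OPP, ALL, ADJ, ALL, OPP, ALL]

print(probably_wins(strategy))   # True
print(probably_wins([ALL]))      # False: 'HHHT' can never become 'HHHH' in one move
```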

# Canonical Coin Sequences and Moves

Consider these coin sequences: `{'HHHT', 'HHTH', 'HTHH', 'THHH'}`. In a sense, these are all the same: they all denote the same sequence of coins with the table rotated to different degrees. Since the Devil is free to rotate the table any amount at any time, we could be justified in treating all four of these as equivalent, and collapsing them into one representative member. I will **redefine** `Coins` so that it still takes an iterable of `'H'` or `'T'` characters and joins them into a `str`, but I will make it consider all possible rotations of the resulting string and (arbitrarily) choose the one that comes first in alphabetical order (which would be `'HHHT'` for the four coin sequences mentioned here).

With `Coins` redefined, the result of `all_coins()` is different:
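One way the redefinition could look, as a sketch: `Coins` joins its argument into a string and returns the alphabetically smallest rotation. The bodies below are assumptions for illustration rather than the original definitions.

```python
from itertools import product

def rotations(coins):
    "The set of all rotations of a coin sequence."
    return {coins[r:] + coins[:r] for r in range(len(coins))}

def Coins(coins):
    "Join into a str, then return the rotation that comes first alphabetically."
    return min(rotations(''.join(coins)))

def all_coins():
    "The belief state of all canonical 4-coin sequences."
    return frozenset(Coins(c) for c in product('HT', repeat=4))

print(Coins('THHH'))         # HHHT
print(sorted(all_coins()))   # ['HHHH', 'HHHT', 'HHTT', 'HTHT', 'HTTT', 'TTTT']
```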

The starting belief set is down from 16 to 6, namely: {4 heads, 3 heads, 2 adjacent heads, 2 opposite heads, 1 head, and no heads}, respectively.

Now for canonical moves. The moves `{0}` and `{1}` should be considered the same, since they both say "flip one coin." To get that, look at the canonicalized set of `all_coins()`, and for each one pull out the set of positions that have an `H` in them and flip those positions. (The positions with a `T` should be symmetric, so we don't need them as well.)
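The recipe above can be sketched as follows, assuming the canonical `Coins` just described (restated so the block runs on its own). Each canonical coin sequence contributes one move: the set of positions where it shows `H`.

```python
from itertools import product

def Coins(coins):
    "Canonical form: the alphabetically first rotation of the joined string."
    s = ''.join(coins)
    return min(s[r:] + s[:r] for r in range(len(s)))

def all_coins():
    "All canonical 4-coin sequences."
    return frozenset(Coins(c) for c in product('HT', repeat=4))

def all_moves():
    "One move per canonical coin sequence: the positions that hold an 'H'."
    return [{i for i, c in enumerate(coins) if c == 'H'}
            for coins in sorted(all_coins())]

print(all_moves())
```

The list includes the empty move (from `'TTTT'`), which is harmless to the search because it never leads to a new belief state.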

So again we've gone down from 16 to 6.

Let's make sure we didn't break anything and that we still get the same 15-step solution:

# Winning Strategies for N Coins

What if there are 3 coins on the table arranged in a triangle? Or 6 coins in a hexagon? To answer that, I'll generalize all the functions that have a "4" in them: `all_moves`, `all_coins`, `rotations` and `coin_search`, as well as `probably_wins`. In each case the change is trivial.
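A sketch of what the generalized definitions might look like, with the number of coins passed as a parameter `N` and the canonical (rotation-collapsed) representation used throughout. The breadth-first search is written inline so the block is self-contained; all bodies are illustrative assumptions.

```python
from collections import deque
from itertools import product

def rotations(coins):
    "All rotations of a coin sequence of any length."
    return {coins[r:] + coins[:r] for r in range(len(coins))}

def Coins(coins):
    "Canonical form: the alphabetically first rotation."
    return min(rotations(''.join(coins)))

def all_coins(N):
    "All canonical coin sequences of length N."
    return frozenset(Coins(c) for c in product('HT', repeat=N))

def all_moves(N):
    "One canonical move per canonical coin sequence."
    return [{i for i, c in enumerate(coins) if c == 'H'}
            for coins in sorted(all_coins(N))]

def flip(coins, move):
    "Flip the positions in move (leaving all-heads alone) and canonicalize."
    if 'T' not in coins:
        return coins
    return Coins(('T' if c == 'H' else 'H') if i in move else c
                 for i, c in enumerate(coins))

def update(belief, move):
    "Rotate every sequence in every way, flip, and collect the results."
    return frozenset(flip(c, move) for coins in belief for c in rotations(coins))

def coin_search(N):
    "A shortest winning strategy for N coins, or None if there is none."
    start, goal, moves = all_coins(N), frozenset({'H' * N}), all_moves(N)
    explored, queue = {start}, deque([([], start)])
    while queue:
        strategy, belief = queue.popleft()
        if belief == goal:
            return strategy
        for move in moves:
            belief2 = update(belief, move)
            if belief2 not in explored:
                explored.add(belief2)
                queue.append((strategy + [move], belief2))
    return None

for N in range(1, 5):
    strategy = coin_search(N)
    print(N, None if strategy is None else len(strategy))
```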

Let's test the new definitions:

How many distinct canonical coin sequences are there for up to a dozen coins?
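A brute-force count is enough to answer this for small N. The sketch below uses a hypothetical helper name, `count_canonical`, and simply canonicalizes every one of the $2^N$ sequences.

```python
from itertools import product

def canonical(coins):
    "The alphabetically first rotation of a coin sequence."
    return min(coins[r:] + coins[:r] for r in range(len(coins)))

def count_canonical(N):
    "Number of distinct coin sequences of length N, up to rotation."
    return len({canonical(''.join(c)) for c in product('HT', repeat=N)})

for N in range(1, 13):
    print(N, count_canonical(N))
```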

On the one hand this is encouraging; there are only 352 canonical coin sequences of length 12, far fewer than the 4,096 non-canonical sequences. On the other hand, it is discouraging; since we are searching over belief states, that would be $2^{352}$ belief states, which is more than a googol. However, we should be able to easily handle up to N=7, because $2^{20}$ is only a million.

# Winning Strategies for N = 1 to 7 Coins

Too bad; there are no winning strategies for N = 3, 5, 6, or 7.

There *are* winning strategies for N = 1, 2, 4; they have lengths 1, 3, 15, respectively. Hmmm. That suggests ...

# A Conjecture

- For every N that is a power of 2, there will be a shortest winning strategy of length $2^N - 1$.
- For every N that is not a power of 2, there will be no winning strategy.

# Winning Strategy for 8 Coins

For N = 8, there are $2^{36} \approx 69$ billion belief states and if the conjecture is true there will be a shortest winning strategy with 255 steps. All the computations up to now have taken less than a second, but this one should take more than a minute. Let's see:

**Eureka!** That's evidence in favor of the conjecture. But not proof. And it leaves many questions unanswered:

- Can you show there are no winning strategies for *N* = 9? Currently, `coin_search(9)` should take about 20 million minutes.
- Can you show there are no winning strategies for *N* = 10, 11, ...?
- Can you prove there are no winning strategies for any *N* that is not a power of 2?
- Can you find a winning strategy of length 65,535 for *N* = 16 and verify that it works?
- Can you generate a winning strategy for any power of 2 (without proving it is shortest)?
- Can you prove there are no shorter winning strategies for *N* = 16?
- Can you prove the conjecture in general?
- Can you *understand* and *explain* how the strategy works, rather than just listing the moves?

# A Proof of the Conjecture (from John Lamping)

John Lamping came up with this proof of the conjecture:

First, consider n = 3. If the three coins start out different, the devil can guarantee that they are still different after any flip you call for. If you ask for 1 flip, the devil makes the flip apply to one of the two matching coins. If you ask for 2 flips, the devil makes the flip apply to a pair of non-matching coins. If you ask for all 3 flips, they will still disagree, as well. So there is no strategy for 3 coins.

We can generalize this to any prime number of coins, p, different from 2. If the p coins start out different, the devil can still make sure that any flip will leave the coins not matching. If the flip is of all coins, they will still disagree. Suppose it is not of all coins. The devil tries the flip on the current orientation of the table. If that leaves coins different, good. If not, the devil rotates the table by one position. We show that in that position, the coins will disagree after the flip. We know that there are at least two coins that the player asked to be flipped differently. With an odd number of coins, (LFFLF, for example) we also know there must be two adjacent coins that the player asked to be flipped the same. So there must be a subsequence that is either LFF or FLL. We know that the coins initially agreed on the last two positions, because the flip led to all the coins matching. After the rotation, those two positions will be flipped differently, leading to different coins.
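The claim for prime p can be checked exhaustively for small cases. The sketch below uses a hypothetical function name, `devil_can_keep_mixed`, and writes moves in the L/F notation of this proof. It tests only the whole-table statement: from every state in which the coins are not all alike, and for every move, some rotation leaves them not all alike.

```python
from itertools import product

def flip(coins, move):
    "Apply a move written as a string of 'F' (flip) and 'L' (leave alone)."
    return ''.join(('T' if c == 'H' else 'H') if m == 'F' else c
                   for c, m in zip(coins, move))

def devil_can_keep_mixed(N):
    "True if the Devil can always answer a move so the coins stay mixed."
    states = [''.join(s) for s in product('HT', repeat=N)]
    mixed = [s for s in states if len(set(s)) == 2]       # not all alike
    moves = [''.join(m) for m in product('LF', repeat=N)]
    return all(any(len(set(flip(s[r:] + s[:r], m))) == 2 for r in range(N))
               for s in mixed for m in moves)

for N in (2, 3, 4, 5, 7):
    print(N, devil_can_keep_mixed(N))
```

For N = 2 and N = 4 the check fails, as expected: for example, from `HTHT` the move `FLFL` makes all the coins match under every rotation.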

Now, consider any multiple of p (say 12, for example, 3 × 4). We can pick out p equally spaced points on the table (every 4th coin in the example), and apply the devil's strategy for p coins to those p coins: If those p start out different, the devil can always make sure they stay different.

So for any number of coins that has a factor other than 2, the devil wins.

Now we have to show that for any number that is a power of 2, the devil loses.
We proceed by induction. Suppose that we have a sequence of flip operations that guarantees a win for $N = 2^n$ coins. We assume that there are $2^N - 1$ of them, $o_1, \ldots, o_{2^N-1}$. Each operation specifies, for each of the $N$ coins, whether to flip it or to leave it alone, like LFFF. We create two new sequences, also of $2^N - 1$ operations, but operating on $2N = 2^{n+1}$ coins. The first, called $d$ for double, does $o$ on the first $N$ coins, and then repeats the same pattern on the second $N$ coins. So if $o_i$ is LFFF, $d_i$ is LFFFLFFF. The second, called $s$ for single, does $o$ on the first $N$ coins, and leaves the second $N$ coins alone. So if $o_i$ is LFFF, $s_i$ is LFFFLLLL. Then the following sequence of operations guarantees a win on $2N$ coins:

$d_1, \ldots, d_{2^N-1}, s_1, d_1, \ldots, d_{2^N-1}, s_2, \ldots, d_1, \ldots, d_{2^N-1}, s_{2^N-1}, d_1, \ldots, d_{2^N-1}$

That is, we do each operation of $s$, with a complete copy of $d$ between each $s$ operation, as well as at the beginning and at the end. There are $2^N$ copies of $d$, each of length $2^N - 1$, plus the $2^N - 1$ operations of $s$, so the total number of operations is $2^N \times (2^N - 1) + (2^N - 1) = 2^{2N} - 1$, which is the conjectured length for $2N$ coins.

Before explaining why this works, let's use it to generate the sequences we know. For one coin, we win with a sequence of a single operation, F. For two coins, $d$ is the single operation FF, while $s$ is the single operation FL. Combining them according to the procedure gives FF, FL, FF. For four coins, $d$ is FFFF, FLFL, FFFF, while $s$ is FFLL, FLLL, FFLL; combining them gives FFFF, FLFL, FFFF, FFLL, FFFF, FLFL, FFFF, FLLL, FFFF, FLFL, FFFF, FFLL, FFFF, FLFL, FFFF. That is the answer that we all came up with.
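The doubling procedure is easy to mechanize. The sketch below uses hypothetical names (`lamping`, `wins`): it builds the strategy recursively in the L/F notation, then checks it with the same belief-state update described in the Analysis section, without collapsing rotations.

```python
from itertools import product

def lamping(N):
    "Strategy for N coins (N a power of 2), as strings of 'F' and 'L'."
    if N == 1:
        return ['F']
    half = N // 2
    o = lamping(half)
    d = [op + op for op in o]            # double: same pattern on both halves
    s = [op + 'L' * half for op in o]    # single: second half left alone
    strategy = list(d)
    for s_i in s:                        # a full copy of d around every s_i
        strategy += [s_i] + d
    return strategy

def apply_op(coins, op):
    "Apply one operation, leaving an all-heads table alone (the game is over)."
    if 'T' not in coins:
        return coins
    return ''.join(('T' if c == 'H' else 'H') if f == 'F' else c
                   for c, f in zip(coins, op))

def wins(strategy, N):
    "True if the belief state shrinks to all-heads under every rotation choice."
    belief = {''.join(c) for c in product('HT', repeat=N)}
    for op in strategy:
        belief = {apply_op(c[r:] + c[:r], op) for c in belief for r in range(N)}
    return belief == {'H' * N}

for N in (1, 2, 4, 8):
    strategy = lamping(N)
    print(N, len(strategy), wins(strategy, N))
```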

To see why it works, consider all pairs of opposite coins on the table with $2^{n+1}$ coins, and consider the XOR of each pair of opposite coins. There are $2^n$ of these XORs, and there is a natural correspondence between each XOR pair and a single coin in the original game. Now, $s$ complements the XOR of a pair exactly when $o$ flips the corresponding coin in the original game. Meanwhile, every operation in $d$ leaves the XOR of each pair unchanged. So, since the sequence of $o$ operations explores all coin configurations, the sequence of $s$ operations explores all XOR states. And the interposed $d$ operations don't interfere with that exploration. In one of those states, the XORs will all be 0, meaning that each pair of opposite coins will agree. But in that state, when opposite pairs of coins agree, $d$ is equivalent to $o$, just operating on pairs of coins, rather than single coins. So it explores all possible combinations of heads and tails, given that opposites agree. That will include the state where all coins are heads.

# Visualizing Strategies

Can I understand what is going on by showing how belief states are whittled down?

The following table shows how moves change the belief state:
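A sketch of how such a table could be produced, using a hypothetical helper `belief_trace` that records the canonical belief state after each move. The strategy is the 15-move sequence from the proof above, in L/F notation; the printed layout is an illustration only.

```python
from itertools import product

def canon(coins):
    "Canonical form: the alphabetically first rotation."
    return min(coins[r:] + coins[:r] for r in range(len(coins)))

def update(belief, move):
    "Rotate every sequence in every way, apply the move, and canonicalize."
    result = set()
    for coins in belief:
        for r in range(len(coins)):
            c = coins[r:] + coins[:r]
            if 'T' in c:                  # an all-heads table is left alone
                c = ''.join(('T' if x == 'H' else 'H') if f == 'F' else x
                            for x, f in zip(c, move))
            result.add(canon(c))
    return result

def belief_trace(strategy, N=4):
    "List of (label, belief state) rows: the start, then one row per move."
    belief = {canon(''.join(c)) for c in product('HT', repeat=N)}
    rows = [('start', belief)]
    for i, move in enumerate(strategy, 1):
        belief = update(belief, move)
        rows.append((f'{i}: {move}', belief))
    return rows

strategy = ('FFFF FLFL FFFF FFLL FFFF FLFL FFFF FLLL '
            'FFFF FLFL FFFF FFLL FFFF FLFL FFFF').split()
for label, belief in belief_trace(strategy):
    print(f'{label:9} {" ".join(sorted(belief))}')
```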

We can see that every odd-numbered move flips all four coins to eliminate the possibility of `TTTT`, flipping it to `HHHH`. We can also see that moves 2, 4, and 6 flip two coins and have the effect of eventually eliminating the two "two heads" sequences from the belief state, and then move 8 eliminates the "three heads" and "one heads" sequences, while bringing back the "two heads" possibilities. Repeating moves 2, 4, and 6 in moves 10, 12, and 14 then re-eliminates the "two heads", and move 15 gets the belief state down to `{'HHHH'}`.

You could call `show(strategy, 8)`, but the results look bad unless you have a very wide (345 characters) screen to view it on. So instead I'll add a `verbose` parameter to `play` and play out some games with a trace of each move:
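A minimal sketch of a `play` function with a `verbose` flag, assuming a Devil who rotates at random and moves written in the L/F notation of the proof. The trace format is invented for illustration, and the example uses the 4-coin strategy to keep the output short; a strategy for 8 coins would be passed in the same way.

```python
import random

def play(strategy, coins, verbose=False):
    "Play one game against a randomly rotating Devil; return the final coins."
    for i, move in enumerate(strategy, 1):
        if 'T' not in coins:              # all heads: the game is over
            break
        r = random.randrange(len(coins))  # the Devil's rotation
        rotated = coins[r:] + coins[:r]
        flipped = ''.join(('T' if c == 'H' else 'H') if f == 'F' else c
                          for c, f in zip(rotated, move))
        if verbose:
            print(f'{i:3}: {coins} rotated to {rotated}, flip {move} gives {flipped}')
        coins = flipped
    return coins

strategy = ('FFFF FLFL FFFF FFLL FFFF FLFL FFFF FLLL '
            'FFFF FLFL FFFF FFLL FFFF FLFL FFFF').split()
play(strategy, 'HTTT', verbose=True)
```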