Miner Dilemma (puzzle)

The Problem

A miner finds himself trapped deep underground - but not without options. Three escape doors stand before him. One leads to freedom in just 3 hours.
The other two? One sends him in a loop costing 5 hours, the other a detour of 7 hours, both dropping him right back where he started.

The miner picks a door uniformly at random each time, with no memory of earlier attempts.

Question: How long, on average, will it take him to finally stumble into freedom?

The Math Bit

Let’s denote the expected escape time as E(T).

E(T) = E(T | safe) * P(safe)  
       + E(T | first-trap) * P(first-trap) 
       + E(T | second-trap) * P(second-trap) 

Since each door is equally likely to be chosen:

E(T) = 1/3 * ( E(T | safe) + E(T | first-trap) + E(T | second-trap) )

What happens if he opens a trap door? He ends up back at square one, having wasted a few additional hours, and faces the same choice again. The safe door simply costs its 3 hours.

E(T | safe) = 3
E(T | first-trap) = E(T) + 5
E(T | second-trap) = E(T) + 7

Substituting these into the original expression:

E(T) = 1/3 * ( 3 + E(T) + 5 + E(T) + 7 )
3 * E(T) = 15 + 2 * E(T)
E(T) = 15

It would take the miner 15 hours on average to escape.
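
The algebra above can also be checked with exact arithmetic. The following is a minimal sketch, not part of the original derivation; the function name `expected_escape_time` and its parameters are made up for illustration, and it assumes exactly one safe door and a uniform, memoryless choice.

```python
from fractions import Fraction


def expected_escape_time(safe_cost, trap_costs):
    """Solve E = (1/n) * (safe_cost + sum(c + E for c in trap_costs)) for E.

    Hypothetical helper: assumes one safe door and uniform random picks.
    """
    n = 1 + len(trap_costs)  # total number of doors
    # Multiply both sides by n and collect the E terms:
    #   n * E = safe_cost + sum(trap_costs) + len(trap_costs) * E
    #   (n - len(trap_costs)) * E = safe_cost + sum(trap_costs)
    return Fraction(safe_cost + sum(trap_costs), n - len(trap_costs))


print(expected_escape_time(3, [5, 7]))  # 15
```

With one safe door the denominator is 1, so the answer is just the sum of all door costs: 3 + 5 + 7 = 15.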

Simulation

import random
import statistics

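k = 100_000  # number of simulated escapes
trials = []
# Each door is (is_safe, cost in hours).
doors = [
  (True, 3),
  (False, 5),
  (False, 7)
]

for _ in range(k):
  t = 0
  # Keep picking uniformly at random until the safe door comes up.
  while True:
    is_safe, cost = random.choice(doors)
    t += cost
    if is_safe:
      break
  trials.append(t)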

average = statistics.mean(trials)
print(f"Expected time on average after {k} trials: {average:.4f} hours")

Expected time on average after 100000 trials: 15.0277 hours

Twist

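If the miner remembers which trap doors he has already tried and avoids choosing them again, the expected value becomes trickier to compute: the set of available doors shrinks after each failed attempt, so the choice probabilities change from step to step (1/3, then 1/2, then 1) and the recursion above no longer applies.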

Here’s a simulation to estimate the escape time under this new constraint:

import random
import statistics

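k = 1_000_000  # number of simulated escapes
trials = []
# Each door is (is_safe, cost in hours).
doors = [
  (True, 3),
  (False, 5),
  (False, 7)
]

for _ in range(k):
  t = 0
  # Indices of the doors not yet tried; reset for every trial.
  index_map = [0, 1, 2]
  while True:
    choice_index = random.choice(index_map)
    # Remove the chosen door so it cannot be picked again.
    index_map.remove(choice_index)

    is_safe, cost = doors[choice_index]
    t += cost
    if is_safe:
      break
  trials.append(t)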

average = statistics.mean(trials)
print(f"Expected time on average after {k} trials: {average:.4f} hours")

Expected time on average after 1000000 trials: 9.0063 hours

Since the miner never repeats a door, his sequence of choices is a uniformly random ordering of the three doors. Averaging the escape time over all 3! = 6 orderings gives exactly 9 hours: (3 + 3 + 8 + 10 + 15 + 15) / 6 = 9.
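
The exact value can be confirmed by enumerating the orderings directly instead of sampling them. This is a small sketch that reuses the `doors` list from the simulations; the helper `escape_time` is introduced here for illustration. It relies on the fact that drawing without replacement, uniformly at each step, makes every permutation equally likely.

```python
from fractions import Fraction
from itertools import permutations

# (is_safe, cost in hours), same layout as in the simulations
doors = [(True, 3), (False, 5), (False, 7)]


def escape_time(order):
    """Total hours spent when doors are tried in this order, stopping at the safe one."""
    total = 0
    for is_safe, cost in order:
        total += cost
        if is_safe:
            break
    return total


# One escape time per ordering of the three doors (6 in total).
times = [escape_time(order) for order in permutations(doors)]
exact_mean = Fraction(sum(times), len(times))

print(sorted(times))  # [3, 3, 8, 10, 15, 15]
print(exact_mean)     # 9
```

The simulated 9.0063 hours sits close to this exact mean, as expected for a million trials.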