
The Problem
A miner finds himself trapped deep underground - but not without options. Three escape doors stand before him. One leads to freedom in just 3 hours.
The other two? One sends him in a loop costing 5 hours, the other on a detour of 7 hours, both dropping him right back where he started. The miner picks a door uniformly at random each time, with no memory of his earlier choices.
Question: How long, on average, will it take him to finally stumble into freedom?
The Math Bit
Let’s denote the expected escape time as E(T). Conditioning on which door he picks first:
E(T) = E(T | safe) * P(safe)
+ E(T | first-trap) * P(first-trap)
+ E(T | second-trap) * P(second-trap)
Since each door is equally likely to be chosen:
E(T) = 1/3 * ( E(T | safe) + E(T | first-trap) + E(T | second-trap) )
What happens if he opens a trap door? He ends up back at square one, having wasted a few additional hours. The safe door simply costs him its 3 hours.
E(T | safe) = 3
E(T | first-trap) = E(T) + 5
E(T | second-trap) = E(T) + 7
Substituting these into the expression above:
E(T) = 1/3 * ( 3 + E(T) + 5 + E(T) + 7 )
3 * E(T) = 15 + 2 * E(T)
E(T) = 15
It would take the miner 15 hours on average to escape.
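The same algebra works for any set of doors, as long as at least one is safe and every trap returns the miner to the start. Solving E(T) = (1/n) * (sum of all costs + number of traps * E(T)) gives E(T) = (sum of all costs) / (number of safe doors). Here is a minimal sketch of that closed form; the function name `expected_escape_time` is just an illustrative choice, not part of the original post.

```python
from fractions import Fraction


def expected_escape_time(doors):
    """Exact expected escape time for a memoryless miner.

    doors: list of (is_safe, cost_in_hours) tuples, each picked with
    equal probability. Trap doors return the miner to the start.
    """
    n_safe = sum(1 for is_safe, _ in doors if is_safe)
    if n_safe == 0:
        raise ValueError("at least one safe door is required")
    total_cost = sum(cost for _, cost in doors)
    # From E = (1/n) * (total_cost + n_traps * E) it follows that
    # E * (n - n_traps) = total_cost, i.e. E = total_cost / n_safe.
    return Fraction(total_cost, n_safe)


doors = [(True, 3), (False, 5), (False, 7)]
print(expected_escape_time(doors))  # 15
```

For the three doors in this puzzle the total cost is 3 + 5 + 7 = 15 and there is one safe door, which reproduces the 15 hours derived above.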
Simulation
import random
import statistics
k = 100_000  # number of simulated escapes
trials = []
# Each door is (is_safe, cost in hours)
doors = [
    (True, 3),
    (False, 5),
    (False, 7)
]
for _ in range(k):
    t = 0
    while True:
        # Pick any of the three doors, with no memory of past picks
        is_safe, cost = random.choice(doors)
        t += cost
        if is_safe:
            break
    trials.append(t)
average = statistics.mean(trials)
print(f"Expected time on average after {k} trials: {average:.4f} hours")
Expected time on average after 100000 trials: 15.0277 hours
Twist
If the miner remembers the trap doors he has already tried and avoids choosing them again, the expected value becomes trickier to compute, because the chance of picking the safe door now changes from step to step (1/3, then 1/2, then 1).
Here’s a simulation to estimate the escape time under this new constraint:
import random
import statistics
k = 1_000_000  # number of simulated escapes
trials = []
# Each door is (is_safe, cost in hours)
doors = [
    (True, 3),
    (False, 5),
    (False, 7)
]
for _ in range(k):
    t = 0
    index_map = [0, 1, 2]  # indices of doors not yet tried
    while True:
        choice_index = random.choice(index_map)
        index_map.remove(choice_index)  # never pick this door again
        is_safe, cost = doors[choice_index]
        t += cost
        if is_safe:
            break
    trials.append(t)
average = statistics.mean(trials)
print(f"Expected time on average after {k} trials: {average:.4f} hours")
Expected time on average after 1000000 trials: 9.0063 hours
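Because he never repeats a door, his choices amount to a uniformly random ordering of the three doors. Averaging the escape time over all 3! = 6 equally likely orderings gives (3 + 3 + 8 + 10 + 15 + 15) / 6 = 9, so E(T) is exactly 9 hours.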
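The exact value can also be checked by brute force instead of simulation. The sketch below enumerates every door ordering with `itertools.permutations` and averages the time until the safe door is reached; the helper name `escape_time` is an illustrative choice, and the code assumes each ordering is equally likely, which holds when the miner picks uniformly among the doors he has not tried yet.

```python
from fractions import Fraction
from itertools import permutations

# Each door is (is_safe, cost in hours), as in the simulations above
doors = [(True, 3), (False, 5), (False, 7)]


def escape_time(order):
    """Hours spent when doors are tried in the given order, stopping at the safe one."""
    t = 0
    for is_safe, cost in order:
        t += cost
        if is_safe:
            return t
    raise ValueError("no safe door in this ordering")


# One escape time per ordering; all 6 orderings are equally likely
times = [escape_time(order) for order in permutations(doors)]
exact = Fraction(sum(times), len(times))

print(sorted(times))  # [3, 3, 8, 10, 15, 15]
print(exact)          # 9
```

Using `Fraction` keeps the average exact, so the result can be compared directly with the simulated estimate of roughly 9.006 hours.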