Dan Dascalescu's Homepage

# TED Talk #67 - Peter Donnelly shows how stats fool juries

The speaker presents an interesting problem: pick one of the following patterns and toss a coin until it appears (where H = heads, T = tails):

- HTH
- HTT

Which pattern takes fewer tosses, on average, to show up for the first time?

I admit I have no idea how to solve this mathematically, so I wrote a Perl script to brute-force it by simulation:

    #!/usr/local/bin/perl -w
    use strict;

    my $runs = 0;
    my $tosses_total = 0;
    $| = 1;

    while (1) {
        my $tosses = '';
        while (1) {
            my $rand = int(rand(2));
            $tosses .= $rand;
            if ($tosses =~ /010$/) {  # 010 = HTH, 011 = HTT
                $tosses_total += length $tosses;
                last;
            }
        }
        $runs++;
        if ($runs % 10000 == 0) {
            print "\r$runs: avg. tries to pattern is ", $tosses_total / $runs;
        }
    }

Run the script and press Ctrl+C once the running average has settled. To test the other pattern, change the regex from /010$/ to /011$/. The results bear the speaker out: HTH takes 10 tosses on average to appear, while HTT takes only 8.

The explanation of why HTT tends to show up sooner is pretty simple: imagine that you've tossed the coin twice and have 'HT' so far. When you toss the third time, one of the following outcomes will happen:

- You get a 'T'. If you were hoping for 'HTT', you won. If you were hoping for 'HTH', you have to start from scratch, because what you got was a 'T' and 'HTH' starts with an 'H'.
- You get an 'H'. If you were hoping for 'HTH', you won. If you were hoping for 'HTT', you lost this round, but you're already one third of the way to a potential 'HTT' because you just got an 'H'.

So intuitively, a near miss costs less when you're waiting for HTT than when you're waiting for HTH, which is why HTT tends to appear sooner.
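The intuition above can also be written down as a small set of equations. This is only a sketch, assuming a fair coin and independent tosses. The symbols E_0, E_H and E_HT are my own notation: the expected number of further tosses when the progress so far is nothing useful, 'H', or 'HT' respectively.

```latex
% Shared by both patterns (each toss costs 1, then we move to a new state):
\begin{aligned}
E_0 &= 1 + \tfrac12 E_H + \tfrac12 E_0 &&\Rightarrow\; E_0 = E_H + 2,\\
E_H &= 1 + \tfrac12 E_H + \tfrac12 E_{HT} &&\Rightarrow\; E_H = E_{HT} + 2.
\end{aligned}

% Waiting for HTT: from HT, a T finishes and an H falls back to state H.
\begin{aligned}
E_{HT} = 1 + \tfrac12 \cdot 0 + \tfrac12 E_H
\;\Rightarrow\; E_H = 3 + \tfrac12 E_H
\;\Rightarrow\; E_H = 6,\quad E_0 = 8.
\end{aligned}

% Waiting for HTH: from HT, an H finishes and a T falls back to the start.
\begin{aligned}
E_{HT} = 1 + \tfrac12 \cdot 0 + \tfrac12 E_0
\;\Rightarrow\; E_0 = 5 + \tfrac12 E_0
\;\Rightarrow\; E_0 = 10.
\end{aligned}
```

The only difference between the two systems is where a failed third toss sends you: back to 'H' for HTT, back to the very beginning for HTH. That single term accounts for the gap between 8 and 10.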

What boggles my mind at the moment, though, is why the average number of tosses to the first occurrence of any sequence always approaches an integer:

- H or T => 2 (obvious)
- HH or TT => 6
- HHH => 14
- HHHH => 30
- ... and so on: each value is twice the one for the previous combination, plus 2
- HHT => 8
- HHHT => 16
- HHHHT => 32
- HTH => 10
- HTHT => 20
- HTHTH => 42
- HTHTHT => 84
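The numbers in the list above can be re-checked with a small generalization of the script that takes the pattern as a parameter. This is a rough sketch rather than the exact script used for the list: the function name `average_tosses`, the fixed seed and the run count are all assumptions of mine, and the output is a simulation estimate that only hovers near the integers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Estimate the average number of tosses until $pattern (a non-empty string
# of 'H' and 'T') first appears, by simulating $runs independent games.
# Hypothetical helper, not part of the original script.
sub average_tosses {
    my ($pattern, $runs) = @_;
    my $len   = length $pattern;
    my $total = 0;
    for (1 .. $runs) {
        my $recent = '';    # only the last $len tosses matter
        my $count  = 0;
        while ($recent ne $pattern) {
            $recent .= rand() < 0.5 ? 'H' : 'T';
            $recent = substr($recent, -$len) if length($recent) > $len;
            $count++;
        }
        $total += $count;
    }
    return $total / $runs;
}

srand(42);    # arbitrary fixed seed, so repeated runs on one machine agree
for my $pattern (qw(H HH HHH HHT HTH HTT HTHT)) {
    printf "%-5s => %.2f\n", $pattern, average_tosses($pattern, 20_000);
}
```

Raising the run count tightens the estimates at the cost of run time. Longer patterns such as HTHTHT need proportionally more tosses per game, so they take noticeably longer to simulate.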

The problem is interesting and I haven't done any research online so as not to spoil it while I work on figuring it out. If you have any comments though, feel free to chip in.