Birthday Paradox

The Birthday Problem (also called the Birthday Paradox) asks for the probability that, in a group of n people, at least two of them share a birthday. For simplicity, it ignores leap years and the fact that births are seasonal.

Lex Fridman once posted the following tweet:

@lexfridman: In a room of 23 people, there's a 50% chance that two people have the same birthday.

I tried to calculate it using probabilities, but my brain usually finds it hard to do so. Of course, I could check the Wikipedia page and copy the calculations from it, but I prefer to use my favourite tool instead, the mighty Monte Carlo methods.

Fuck mathematics! Fuck probabilities! We have got cheap CPUs to do this stuff for us. I basically simulated the problem and calculated the probabilities myself.

We take 23 people, assign each of them a random birthday, and check whether two or more of them share one. And since we have our computers at our disposal, we can repeat this process a million times, who cares!

It took me a couple of minutes to confirm Fridman's tweet. To be precise, in a room of 23 people, there's about a 50.7% chance that at least two people have the same birthday.
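For anyone who does want to cross-check the simulated figure against the closed form, here is a minimal sketch. It assumes the same simplifications as above (365 equally likely days, no leap years), and the function name `exact_collision_probability` is made up for this example. The idea is to multiply the chances that each new person avoids every birthday already taken, then subtract that from 1.

```python
def exact_collision_probability(people=23, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print('{:.1%}'.format(exact_collision_probability()))
```

For 23 people this lands on the same 50.7% as the simulation.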

Here is the simulation code:

    import numpy as np

    rng = np.random.default_rng()
    trials = 1_000_000
    collisions = 0

    for _ in range(trials):
        # Give each of the 23 people a random birthday (day 1 to 365).
        birthdays = rng.integers(1, 366, size=23)
        # A set drops duplicates, so a smaller set means a shared birthday.
        if len(set(birthdays)) < len(birthdays):
            collisions += 1

    print('% of collisions: {:.1%}'.format(collisions / trials))

Can you parallelize it?
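One possible answer, as a rough sketch rather than the definitive way to do it: vectorize each batch of trials with NumPy, then spread the batches across worker processes with the standard library's `ProcessPoolExecutor`. The function names `count_collisions` and `parallel_estimate`, the default of 4 workers, and the fixed seed are all assumptions made for this example.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def count_collisions(trials, people=23, seed=None):
    """Count trials in which at least two of `people` share a birthday."""
    rng = np.random.default_rng(seed)
    # One row per trial, one column per person; days run from 1 to 365.
    birthdays = rng.integers(1, 366, size=(trials, people))
    # After sorting each row, any repeated birthday sits next to its twin.
    birthdays.sort(axis=1)
    has_collision = (birthdays[:, 1:] == birthdays[:, :-1]).any(axis=1)
    return int(has_collision.sum())


def parallel_estimate(total_trials=1_000_000, workers=4, people=23, seed=0):
    """Split the trials across worker processes and pool the counts."""
    chunk = total_trials // workers
    # Spawn one independent random stream per worker.
    seeds = np.random.SeedSequence(seed).spawn(workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(
            count_collisions, [chunk] * workers, [people] * workers, seeds
        )
        return sum(counts) / (chunk * workers)
```

To run it as a script, call `parallel_estimate()` from under an `if __name__ == "__main__":` guard, which process pools need on platforms that start workers by spawning. The vectorized `count_collisions` on its own already removes the Python-level loop, so it is worth timing that before adding processes.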


Tarek Amr - Mar 3, 2021

Translations: [NL], [AR]