The Birthday Problem (also known as the Birthday Paradox) asks for the probability that, in a group of n people, at least two of them share a birthday. For simplicity, it ignores leap years and the fact that births are seasonal.
Lex Fridman once posted the following tweet:
@lexfridman: In a room of 23 people, there's a 50% chance that two people have the same birthday.
I tried to calculate it using probabilities, but my brain usually finds that hard. Of course, I could check the Wikipedia page and copy the calculations from it, but I prefer to use my favourite tool instead: the mighty Monte Carlo method.
Fuck mathematics! Fuck probabilities! We have got cheap CPUs to do this stuff for us. I basically simulated the problem and calculated the probability myself.
We take 23 people, assign each of them a random birthday, and check whether two or more of them share one. And since we have computers at our disposal, we can repeat this process a million times. Who cares!
It took me a couple of minutes to confirm Fridman's tweet. To be a bit more precise, my simulation says that in a room of 23 people there's about a 50.7% chance that two people have the same birthday.
Here is the code:
import numpy as np
import pandas as pd

collisions = []

for i in range(1_000_000):
    # Give each of the 23 people a random birthday (day 1 to 365).
    birthdays = np.random.choice(range(1, 366), 23)
    # A set drops duplicates, so a shorter set means a shared birthday.
    collision = len(birthdays) > len(set(birthdays))
    collisions.append(collision)

print(
    '% of collisions: {:.1%}'.format(
        pd.Series(collisions).mean()
    )
)
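If you do want a sanity check against the maths after all, the exact value is short to compute: the chance that all 23 birthdays differ is 365/365 × 364/365 × ... × 343/365, and the collision probability is one minus that. The sketch below is only a cross-check on the simulation, under the same no-leap-year, uniform-birthday assumptions; the function name `exact_collision_probability` is my own.

```python
from math import prod


def exact_collision_probability(people=23, days=365):
    """Exact chance that at least two of `people` share a birthday."""
    # Person k (counting from 0) must avoid the k birthdays already taken.
    all_different = prod((days - k) / days for k in range(people))
    return 1 - all_different


print('exact: {:.1%}'.format(exact_collision_probability()))
```

It agrees with the simulated 50.7% to one decimal place.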
Can you parallelize it?
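Here is a rough sketch of one possible answer, not the only one. It swaps the Python loop for a vectorized NumPy check (sort each room's birthdays and look for equal neighbours), splits the trials into chunks, and hands the chunks to a thread pool. The function names, the 4 workers, and the seed are my own choices for illustration. How much threads actually help depends on how much of the work NumPy does with the GIL released, which I have not measured; for CPU-bound work, swapping in `ProcessPoolExecutor` is the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def count_collisions(seed, trials, people=23, days=365):
    """Count how many of `trials` simulated rooms contain a shared birthday."""
    rng = np.random.default_rng(seed)
    # One row per room, one column per person; days are numbered 1..days.
    birthdays = rng.integers(1, days + 1, size=(trials, people), dtype=np.int16)
    # After sorting a row, a shared birthday shows up as two equal neighbours.
    birthdays.sort(axis=1)
    shared = (birthdays[:, 1:] == birthdays[:, :-1]).any(axis=1)
    return int(shared.sum())


def estimate(total_trials=1_000_000, workers=4, seed=0):
    """Split the trials into equal chunks and run them on a thread pool."""
    chunk = total_trials // workers
    # Spawn independent child seeds so the chunks do not reuse random numbers.
    seeds = np.random.SeedSequence(seed).spawn(workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(count_collisions, seeds, [chunk] * workers)
        return sum(counts) / (chunk * workers)


print('% of collisions: {:.1%}'.format(estimate()))
```

Even on a single thread, the vectorized chunk avoids a million trips through the Python interpreter, so most of the gain probably comes from that rather than from the pool.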
Tarek Amr - Mar 3, 2021