Lesson 4: Probability Distributions


Varying

By this time, you should be familiar with the mathematical concept of a variable: a quantity that we either don't know or might decide to change, and that we denote with a letter or some other symbol. We use variables all the time in algebra class. We have equations to "solve for X", and we have formulas, which are equations with variables in them that we can plug whatever numbers we want into, like the Pythagorean Theorem.

Varying, randomly

Right now, we're working with probability — that is, randomness. Wouldn't it be great if we could combine that idea with variables somehow?[1] Mathematicians thought so, so they invented the idea of a random variable. A random variable is one whose value is affected by the result of one or more chance experiments. These can be very useful when we're reasoning about probabilities, and one obvious way we can use them with carnival games is with prize values: we can say the amount of money the player wins is a random variable.

We don't get to make up whatever value we want and assign it to a random variable (or it wouldn't be random), but we can decide what rules we use to assign it a value. For example, let's say we want to play the game Monopoly with our fellow probability and statistics students. In Monopoly, you decide how many spaces to move by rolling two six-sided dice. We can define a random variable as the number we rolled on die A, plus the number we rolled on die B:

R = D_a + D_b

Note that we're treating the two dice like variables themselves: we can do ordinary math with random variables, and the result will be a new random variable.
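To make the idea concrete, here is a minimal Python sketch of sampling R by simulating the two dice. The helper names roll_die and roll_r, the seed, and the number of samples are made up for this illustration; they are not part of the lesson.

```python
import random


def roll_die(rng):
    """Roll one fair six-sided die (hypothetical helper for this sketch)."""
    return rng.randint(1, 6)  # randint includes both endpoints


def roll_r(rng):
    """Sample the random variable R = D_a + D_b."""
    d_a = roll_die(rng)  # die A
    d_b = roll_die(rng)  # die B
    return d_a + d_b     # ordinary addition gives a new random variable


rng = random.Random(4)  # fixed seed (an arbitrary choice) so the run is repeatable
samples = [roll_r(rng) for _ in range(10)]
print(samples)
```

Each call to roll_r runs the chance experiment again, so the list holds ten separate observations of R, each somewhere between 2 and 12.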

Probability distributions

When we're working with a random variable, we usually want to figure out which values it might take on, and what the probability is that it takes on each value. For discrete random variables, where we have a finite number of outcomes [2], we can figure out the complete set of all possible values of the random variable, and assign each one a probability. The result is called a probability distribution, because it shows us how the probability is "distributed" among the possible values. Because all of the possible outcomes are present, the probabilities all need to add up to one.[3]
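As a small illustration, a discrete probability distribution can be written down as a table from values to probabilities. The sketch below does this for a single fair six-sided die, which is an assumed example chosen for simplicity, and uses exact fractions so the total can be checked without rounding error.

```python
from fractions import Fraction

# Distribution of one fair six-sided die: each face has probability 1/6.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Every possible outcome is listed, so the probabilities must add up to one.
total = sum(die_distribution.values())
print(total)  # 1
```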

Let's use our Monopoly game above as an example. There are 6^2 = 36 different combinations of the two dice that all have equal probability, but there are only eleven different values that our random variable R can possibly take on: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12. This means that at least some of the values have to correspond to more than one of the equal-probability combinations. It turns out that the closer the value is to 7, the more ways there are to add the two dice up to get that value. There's only one way to get 2, for example, so it has a probability of 1/36 — but there are six ways to get 7, so it has a probability of 6/36. It's easiest to show this probability distribution with a picture:

A picture of all the ways to add up two dice. It looks like a triangle, with six ways to build 7, five ways to build 6 and 8, four ways to build 5 and 9, three ways to build 4 and 10, two ways to build 3 and 11, and only one way to build 2 and 12.
Public-domain picture from Wikimedia Commons contributors.
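The same triangle can be rebuilt by listing all 36 equally likely combinations and counting how many add up to each value. The following is a rough Python sketch; the variable names are chosen only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 (die A, die B) combinations give each sum.
ways = Counter(d_a + d_b for d_a, d_b in product(faces, repeat=2))

# Each combination has probability 1/36, so P(R = value) = ways / 36.
distribution = {value: Fraction(count, 36) for value, count in sorted(ways.items())}

# Print a sideways version of the triangle, one row per value of R.
for value in distribution:
    print(f"{value:2d}  {'#' * ways[value]:<6}  {ways[value]}/36")
```

Note that Fraction reduces its values automatically (6/36 is stored as 1/6), which is why the printout uses the raw counts over 36 to match the numbers in the text.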

Review

Let's quickly review what we've learned: