Making good decisions is synonymous with being effective in the world. The art of accurate decision-making is at the intersection of epistemic rationality (understanding the world) and instrumental rationality (taking action).

There is no pre-workshop reading or assignment. All content on this page is for the cohort meeting.

Non-facilitators should avoid reading blocks marked as **Facilitator**, such as the one below, since they contain spoilers about the problems.

## Probability Distributions [15 minutes]

This week we will study probability distributions. Thus far, we have tried to make decision-making and forecasting easier by binning the relevant variables into a small number of possible outcomes. For example, when thinking about what kind of salary you might be able to get at a new job, you might draw bins such as:

| Bin | Probability | Result |
|---|---|---|
| 1 | 10% | <$55k |
| 2 | 70% | $55k-$70k |
| 3 | 20% | >$70k |

Large bins like these are usually a valid simplification, but in some cases, you may want to model uncertainty at a more detailed and granular level. This week we will focus on three specific distribution shapes that occur often in the real world: the *Normal distribution*, the *Pareto distribution* and the *Log-normal distribution*.
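To see how bins relate to a continuous distribution, here is a minimal Python sketch. It assumes, purely for illustration, that salary offers are normally distributed with a mean of $64k and a standard deviation of $7k; these parameters and the helper `bin_probabilities` are hypothetical and were chosen so that the result lands close to the 10% / 70% / 20% bins in the table above.

```python
from statistics import NormalDist

# Hypothetical assumption: salary offers are normally distributed with
# mean $64k and standard deviation $7k (illustrative values only).
salary = NormalDist(mu=64_000, sigma=7_000)


def bin_probabilities(dist, edges):
    """Return the probability mass in each bin defined by sorted edges.

    The first bin is everything below edges[0]; the last bin is
    everything above edges[-1].
    """
    cdf_values = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    return [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]


bins = bin_probabilities(salary, [55_000, 70_000])
labels = ["<$55k", "$55k-$70k", ">$70k"]
for label, p in zip(labels, bins):
    print(f"{label:>10}: {p:.1%}")
```

Changing the bin edges, or swapping in a different distribution object, shows how much detail a three-bin summary keeps or hides.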

### The Normal Distribution [5 minutes]

The mean, median and mode of a normal distribution are all the same, and normal distributions are symmetric around this mean value. The spread of the distribution around the mean is defined by the standard deviation. Normal distributions follow the convenient 68-95-99.7 rule, which states that about 68% of the values fall within one standard deviation of the mean, about 95% fall within two standard deviations, and about 99.7% fall within three standard deviations.
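The 68-95-99.7 rule can be checked directly from the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is ours, not part of any library.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The three shares come out to roughly 0.6827, 0.9545 and 0.9973, and they do not depend on the particular mean or standard deviation passed in.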

Commonly observed quantities that are normally distributed include:

- Height of adult humans
- Errors in measurements made with instruments
- IQ scores
- The size of grains in a sedimentary rock sample
- The breaking stress of iron nails in a large box of nails

#### Instructions [5 minutes]

With your cohort, come up with three other quantities that you think are normally distributed.

### The Pareto Distribution [5 minutes]

The Pareto distribution is a power-law distribution, and is sometimes associated with the 80-20 Rule, aka the Pareto Principle, which states that "80% of the outcomes are due to 20% of the causes."
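The 80-20 split corresponds to one particular Pareto shape parameter. For a Pareto (type I) distribution with shape `alpha > 1`, the top fraction `p` of the population holds a share `p ** (1 - 1/alpha)` of the total, so an exact 80-20 split needs `alpha = log(5) / log(4)`, about 1.16. The sketch below is illustrative only; the function names are ours, and the simulation relies on `random.paretovariate` from the standard library.

```python
import math
import random


def pareto_top_share(alpha, fraction):
    """Analytic share of the total held by the top `fraction` of a Pareto (type I) population."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1, so shares are undefined")
    return fraction ** (1 - 1 / alpha)


def sample_top_share(values, fraction):
    """Share of the total held by the largest `fraction` of the given values."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


def simulated_top_share(alpha, fraction=0.2, n=20_000, seed=0):
    """Estimate the top share from n simulated Pareto draws."""
    rng = random.Random(seed)
    draws = [rng.paretovariate(alpha) for _ in range(n)]
    return sample_top_share(draws, fraction)


# Shape parameter that gives exactly "20% of causes, 80% of outcomes".
ALPHA_80_20 = math.log(5) / math.log(4)
print(f"alpha for 80-20: {ALPHA_80_20:.3f}")
print(f"top 20% share at that alpha: {pareto_top_share(ALPHA_80_20, 0.2):.3f}")
```

Larger values of `alpha` give a less extreme split. Simulated shares for `alpha` close to 1 are very noisy, because a handful of enormous draws dominate the total.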

Commonly observed quantities that are Pareto distributed include:

- The size of cities
- The frequency of incomes in a population
- The size of companies
- The frequency of accidents by severity
- The frequency of earthquakes by strength

#### Instructions [5 minutes]

With your cohort, come up with three other quantities that you think are Pareto distributed.

### The Log-Normal Distribution [5 minutes]

The log-normal distribution is a useful tool for thinking about distributions which are highly skewed but not so skewed that they are better thought of as power-law (Pareto) distributions. For example, chess games tend to be short rather than long, but a chess game very rarely lasts fewer than, say, four moves.

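Log-normally distributed quantities tend to be the result of multiplicative processes. If a quantity is the product of many independent positive factors, its logarithm is the sum of many independent terms, which tends toward a normal distribution by the central limit theorem, so the quantity itself is approximately log-normally distributed. (The product of just two normally distributed factors is generally not normal, but it is not exactly log-normal either.) This is why stock prices are often modeled as log-normally distributed: they are the cumulative result of a large number of (relatively) small multiplicative changes.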
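A quick simulation illustrates the multiplicative mechanism. The sketch below multiplies 30 independent factors drawn uniformly from 0.7 to 1.3; the factor range, the number of factors and the function name `simulate_products` are arbitrary choices for illustration. It then compares mean and median on the raw scale and on the log scale.

```python
import math
import random
from statistics import fmean, median, pstdev


def simulate_products(n_samples=10_000, n_factors=30, seed=0):
    """Return n_samples products, each of n_factors independent positive factors."""
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.7, 1.3) for _ in range(n_factors))
        for _ in range(n_samples)
    ]


products = simulate_products()
logs = [math.log(p) for p in products]

# Right skew on the raw scale: the mean sits well above the median.
print(f"raw: mean={fmean(products):.3f} median={median(products):.3f}")
# Near symmetry on the log scale: mean and median almost coincide.
print(f"log: mean={fmean(logs):.3f} median={median(logs):.3f} sd={pstdev(logs):.3f}")
```

A mean well above the median on the raw scale, together with a near-symmetric log scale, is the signature of an approximately log-normal quantity.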

Commonly observed quantities that are log-normally distributed include:

- The permeability of rock samples
- The length of chess games
- The duration of surgeries in an operating room
- The size of raindrops

#### Instructions [5 minutes]

With your cohort, try to think of at least one other quantity that you think is log-normally distributed. Do not get too hung up on whether a quantity might be Pareto distributed instead. Some quantities can be relatively well-understood as either Pareto or log-normal!

## Shipments at Station Seven [20 minutes]

You are the assistant trade supervisor at Station Seven. You have built a model in GetGuesstimate in order to explain to your boss why the value of shipments fluctuates so wildly. Your boss has complained that the number of items on each trade ship is normally distributed, and the price of individual items on each ship is also normally distributed, and yet the quarterly trade revenue for the station is **not** normally distributed.

Why is the revenue not normally distributed? Should you expect it to be?

What would you expect to happen to the total trade revenues if the item count per shipment were log-normally distributed instead of normally distributed, **at the same mean value of item count**?

## Break [10 minutes]

Take a ten-minute break.

## Uncertainty Quantification [45 minutes]

Each cohort member should take turns completing all of the steps below. Complete all of the steps for a single person before moving to the next person. Try to avoid spending too much time on a single person — everyone should have a chance to go.

### 1) Choose a Decision

Put forth a decision, prediction, or uncertainty that you are thinking about. The problems can be big or small. This week, **find something involving an outcome or quantity that you are uncertain about, and think about what type of distribution fits that situation best.** Everyone should put forth *something*.

### 2) Research

Set a three-minute timer on your phone. For three minutes, each cohort member should gather any research they can find about how the quantity from step 1 tends to be distributed. If you can find good data, try to fit the data with the distribution shape of your choice. The aim is to come to a better understanding of the quantity or event that you are uncertain about.
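If you do find data, one rough way to compare candidate shapes is to fit each by maximum likelihood and compare the log-likelihoods. The sketch below does this for the normal and log-normal shapes only, using the Python standard library. The data is simulated stand-in data, not real measurements, and all function names are ours.

```python
import math
import random
from statistics import fmean, pstdev


def normal_loglik(data, mu, sigma):
    """Log-likelihood of data under a normal distribution with the given parameters."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma**2) - squared_error / (2 * sigma**2)


def fit_normal(data):
    """Maximum-likelihood normal fit: returns (mu, sigma, log-likelihood)."""
    mu, sigma = fmean(data), pstdev(data)
    return mu, sigma, normal_loglik(data, mu, sigma)


def fit_lognormal(data):
    """Maximum-likelihood log-normal fit; requires every value to be positive."""
    logs = [math.log(x) for x in data]
    mu, sigma = fmean(logs), pstdev(logs)
    # Change of variables: the log-normal density at x is the normal
    # density at ln(x) divided by x, hence the "- sum(logs)" term.
    return mu, sigma, normal_loglik(logs, mu, sigma) - sum(logs)


# Hypothetical stand-in data; replace with the values you found.
rng = random.Random(0)
data = [rng.lognormvariate(3.0, 0.5) for _ in range(2_000)]

_, _, ll_normal = fit_normal(data)
ln_mu, ln_sigma, ll_lognormal = fit_lognormal(data)
print(f"normal log-likelihood:     {ll_normal:.1f}")
print(f"log-normal log-likelihood: {ll_lognormal:.1f} (mu={ln_mu:.2f}, sigma={ln_sigma:.2f})")
```

The shape with the higher log-likelihood fits the data better. Both shapes have two parameters here, so the comparison is fair without any further adjustment.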

This step should be completed individually instead of in a group — each person should do their own research in silence. This helps prevent groupthink and anchoring on the first suggestion to be made. (Sharing the research is part of the next step.)

### 3) Integration

Share your findings with each other. Did the exercise shed any light on the subject? Did you learn anything that will help you in making your predictions about this subject in the future? Do you have any other insights into the use of distributions for modeling the world?

## Wrap up

If there's time remaining, decide whether what you've learned today would change the manner in which you would bin your chosen quantity when building a decision tree, and make a decision tree in line with your new understanding.