class: center, middle, inverse, title-slide .title[ # Bayes’ Theorem
🕵 ] .author[ ### S. Mason Garrison ] --- layout: true <div class="my-footer"> <span> <a href="https://psychmethods.github.io/coursenotes/" target="_blank">Methods in Psychological Research</a> </span> </div> --- class: middle # Bayes' Theorem (Revisted) --- # Introduction to Bayes' Rule - Bayes' Rule is a foundational concept in statistics that provides a framework for updating our beliefs based on new evidence. - It is widely used in fields such as medicine, finance, and social sciences to help make decisions under uncertainty. - The general formula for Bayes' Rule helps to calculate posterior probabilities by combining prior knowledge and new data. --- ## Bayes' Theorem Formula - Bayes' Theorem allows us to "flip" conditional probabilities: $$ P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)} $$ .pull-left[ It relates: - What we want to know: P(A|B) - What we know: P(B|A), P(A), and P(B) ] -- .pull-right[ <img src="data:image/png;base64,#Bayes_files/figure-html/unnamed-chunk-2-1.png" width="50%" style="display: block; margin: auto;" /> ] --- ## Bayes' Theorem: Intuition .pull-left-narrow[ - Starts with our initial belief about A: P(A) - Updates that belief based on new evidence B - Gives us a new, updated belief: P(A|B) This mirrors the scientific process of updating hypotheses based on evidence! ] .pull-right-wide[ <img src="data:image/png;base64,#../img/bayes_science.png" width="65%" style="display: block; margin: auto;" /> ] .footnote[van de Schoot, R., Depaoli, S., King, R. et al. Bayesian statistics and modelling. Nat Rev Methods Primers 1, 1 (2021). https://doi.org/10.1038/s43586-020-00001-2] --- class: middle # Gardner's Boy or Girl Paradox --- ## The Paradox Martin Gardner presented two seemingly similar problems in Scientific American (1959): 1. "Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?" 2. "Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?" - Gardner gave the answers as 1/2 and 1/3 respectively, leading to much debate. --- ## Problem 1: The Older Child is a Girl .pull-left[ "Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?" - Only the outcomes where the older child is a girl are considered: - Girl, Girl (GG) - Girl, Boy (GB) P(both girls | older is girl) = 1/2 ] .pull-right[ Sample space outcomes: | Option 1 | Option 2 | |----------|----------| | Girl | Girl | | Girl | Boy | | Option 3 | Option 4 | |----------|----------| | Boy | Girl | | Boy | Boy | ] --- ## Problem 2: At Least One Boy .pull-left[ "Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?" - Possible outcomes: - Boy, Boy (BB) - Boy, Girl (BG) - Girl, Boy (GB) P(both boys | at least one boy) = 1/3 ] .pull-right[ <table class="table" style="color: black; width: auto !important; margin-left: auto; margin-right: auto;"> <caption>All Outcomes</caption> <thead> <tr> <th style="text-align:left;"> Child 1 </th> <th style="text-align:left;"> Child 2 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Boy </td> <td style="text-align:left;"> Boy </td> </tr> <tr> <td style="text-align:left;"> Boy </td> <td style="text-align:left;"> Girl </td> </tr> <tr> <td style="text-align:left;"> Girl </td> <td style="text-align:left;"> Boy </td> </tr> <tr> <td style="text-align:left;"> Boy </td> <td style="text-align:left;"> Boy </td> </tr> </tbody> </table> ] --- ## The Paradox Explained The key difference lies in the information given: 1. In the first problem, we know the order (older child is a girl), which eliminates two possibilities (BB and BG). 2. In the second problem, we don't know which child is a boy, so we only eliminate one possibility (GG). This subtle difference in information leads to different sample spaces and thus different probabilities. --- ## Bayesian Perspective We can also approach these problems using Bayes' Theorem: For Problem 1: `$$P(GG \mid \text{older is G}) = \frac{P(\text{older is G} \mid GG) \cdot P(GG)}{P(\text{older is G})} = \frac{1 \times \frac{1}{4}}{\frac{1}{2}} = \frac{1}{2}$$` -- For Problem 2: `$$P(BB \mid \text{at least one B}) = \frac{P(\text{at least one B} \mid BB) \cdot P(BB)}{P(\text{at least one B})} = \frac{1 \times \frac{1}{4}}{\frac{3}{4}} = \frac{1}{3}$$` --- class: middle # Applying Bayes' Theorem --- ## Example: Testing for a Rare Disease Imagine a test for a disease that affects 1% of the population: - 99% accurate for people who have the disease (sensitivity) - 99% accurate for people who don't have the disease (specificity) .question[ If someone tests positive, what's the probability they actually have the disease? ] - We can use two approaches: a **tree diagram** to break down all possible outcomes and **Bayes' Theorem** mathematically. --- ## Setting Up the Problem Let's define our events: - D: Having the disease (Case of the Mondays) - T: Testing positive We want to know: P(D|T) -- We know: - P(D) = 0.01 (1% of population has the disease) - P(T|D) = 0.99 (99% sensitivity) - P(T|not D) = 0.01 (1 - 99% specificity) --- ## Applying Bayes' Theorem mathematically .pull-left[ `$$P(D \mid T) = \frac{P(T \mid D) \cdot P(D)}{P(T)}$$` `$$P(T) = P(T \mid D) \cdot P(D) + P(T \mid \neg D) \cdot P(\neg D) = 0.99 \times 0.01 + 0.01 \times 0.99 = 0.0198$$` `$$P(D \mid T) = \frac{0.99 \times 0.01}{0.0198} \approx 0.50 \text{ or 50\%}$$` ] -- .pull-right[ <br/> <br/> <br/> <br/> <br/> <br/> <img src="data:image/png;base64,#Bayes_files/figure-html/unnamed-chunk-5-1.png" width="90%" style="display: block; margin: auto;" /> ] --- ## Applying Tree Diagrams for Rare Disease Testing .pull-left[ - A **tree diagram** is a graphical way to represent all possible outcomes of an event, making it easier to understand conditional probabilities. - Let's define the population for simplicity: - **D**: Having the disease (1%) - **No D**: Not having the disease (99%) - We will branch out from each state (having the disease or not) to show the possible test outcomes (Positive or Negative), which depend on sensitivity and specificity. ] .pull-right[ <img src="data:image/png;base64,#tree_diagram.png" width="90%" style="display: block; margin: auto;" /> ] --- ## Interpretation Using the Tree Diagram .pull-left[ The **tree diagram** helps us visualize all possible outcomes: - **Path 1 (Disease -> Positive Test):** Probability is **0.01 * 0.99 = 0.0099** - **Path 2 (Disease -> Negative Test):** Probability is **0.01 * 0.01 = 0.0001** - **Path 3 (No Disease -> Positive Test):** Probability is **0.99 * 0.01 = 0.0099** - **Path 4 (No Disease -> Negative Test):** Probability is **0.99 * 0.99 = 0.9801** ] .pull-right[ Using Bayes' Theorem to find P(Disease | Positive Test): `$$P(D \mid T) = \frac{P(T \mid D) \cdot P(D)}{P(T)}$$` `$$P(T) = P(T \mid D) \cdot P(D) + P(T \mid Not D) \cdot P(Not D) = 0.0099 + 0.0099 = 0.0198$$` `$$P(D \mid T) = \frac{0.0099}{0.0198} \approx 0.50 \text{ or 50\%}$$` ] --- ## Wrapping Up - Both approaches—the tree diagram and Bayes' Theorem—help arrive at the same conclusion: there's a 50% chance that someone who tests positive actually has the disease. - The tree diagram provides a visual breakdown, while Bayes' Theorem offers a mathematical perspective. - This example demonstrates how rare diseases can lead to high rates of false positives, even when the test accuracy is high. --- class: middle # Bayesian Thinking in Psychology --- ## Fun Example: Updating a Belief .pull-left[ - A pet owner initially thinks their cat is limping due to an injury (80% sure). - After observing the cat immediately stop limping when the door opens, how should they update their belief? $$P(\text{injury} \mid \text{cat stops limping when door opens}) = $$ `$$\frac{P(\text{cat stops limping when door opens} \mid \text{injury}) \cdot P(\text{injury})}{P(\text{cat stops limping when door opens})}$$` ] -- .pull-right[
] --- ## Research Example: P-hacking - P-hacking: Trying multiple analyses until you get a "significant" p-value - Bayesian perspective: This is like ignoring your prior probabilities and only focusing on P(data | hypothesis) - Better approach: Consider P(hypothesis) and calculate P(hypothesis | data) --- ## Pros and Cons of Bayesian Approaches .pull-left[ Pros: - Allows incorporation of prior knowledge - Provides direct probabilities of hypotheses - Handles small samples and rare events well ] .pull-right[ Cons: - Selecting priors can be subjective - Computationally intensive for complex problems - Not as widely used (yet) in psychology ] --- Wrapping Up... --- ## Simulations - Simulations are used to understand probabilities in complex situations. - **Blackjack**: Understanding dealer's odds. - [Dealer’s Odds in Blackjack](http://demonstrations.wolfram.com/DealersOddsInBlackjack/) - **Rock Paper Scissors**: Using AI Player. - [Rock Paper Scissors with AI Player](http://demonstrations.wolfram.com/RockPaperScissorsWithAIPlayer/) - **Texas Hold'em**: Exploring probabilities of winning. - [Probabilities of Winning in Texas Hold'em](http://demonstrations.wolfram.com/ProbabilitiesOfWinningInTexasHoldEm/) - **Lottery (Adam Lamers)**: Running simulations to determine lottery odds. - [Lottery Simulator by Adam Lamers](http://adamlamers.com/lottery_simulator) - **Lottery (Ian Neath)**: Another lottery simulation. - [Lottery Simulator by Ian Neath](https://memory.psych.mun.ca/models/lottery/index.shtml) class: center, middle # Wrapping Up...