# Bayes' Theorem

## Overview

Bayes' Theorem provides a mathematical framework for updating probability estimates in light of new evidence. It forms the foundation of [[bayesian_inference|Bayesian inference]] and [[bayesian_statistics|Bayesian statistics]], offering a rigorous approach to reasoning under uncertainty.

## Mathematical Formulation

### Basic Formula

```math
P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}
```

where:

- P(A|B) is the posterior probability of A given B
- P(B|A) is the likelihood of B given A
- P(A) is the prior probability of A
- P(B) is the marginal probability of B

### Alternative Formulations

#### Using the Law of Total Probability

```math
P(A|B) = \frac{P(B|A) \times P(A)}{P(B|A) \times P(A) + P(B|A^c) \times P(A^c)}
```

#### Multiple Hypotheses

```math
P(H_i|E) = \frac{P(E|H_i) \times P(H_i)}{\sum_{j=1}^{n} P(E|H_j) \times P(H_j)}
```

## Conceptual Diagrams

```mermaid
flowchart LR
    A["Prior: P(A)"] --> C{"Bayes' Theorem"}
    B["Likelihood: P(B|A)"] --> C
    D["Evidence: P(B)"] --> C
    C --> E["Posterior: P(A|B)"]

    style A fill:#d4f1f9,stroke:#05386b
    style B fill:#dcedc1,stroke:#05386b
    style C fill:#ffcccb,stroke:#05386b
    style D fill:#ffd580,stroke:#05386b
    style E fill:#d8bfd8,stroke:#05386b
```

### Interpretation Process

```mermaid
graph TD
    A["Prior Beliefs<br>P(Hypothesis)"] --> B{"New Evidence<br>Observed"}
    B --> C["Calculate Likelihood<br>P(Evidence|Hypothesis)"]
    C --> D["Apply Bayes' Theorem"]
    D --> E["Update to Posterior<br>P(Hypothesis|Evidence)"]
    E --> F{More Evidence?}
    F -->|Yes| B
    F -->|No| G["Final Posterior<br>Becomes New Prior"]

    style A fill:#d4f1f9,stroke:#05386b
    style B fill:#dcedc1,stroke:#05386b
    style C fill:#ffcccb,stroke:#05386b
    style D fill:#ffd580,stroke:#05386b
    style E fill:#d8bfd8,stroke:#05386b
    style G fill:#d8bfd8,stroke:#05386b
```

### Relationship to Other Bayesian Concepts

```mermaid
graph TB
    A["Bayes' Theorem"] --> B[Bayesian Inference]
    A --> C[Bayesian Networks]
    A --> D[Bayesian Graph Theory]
    B --> E[Belief Updating]
    B --> F[Parameter Estimation]
    C --> G[Probabilistic Graphical Models]
    D --> G
    E --> H[Active Inference]
    F --> I[Bayesian Machine Learning]

    style A fill:#ff9999,stroke:#05386b
    style B fill:#99ccff,stroke:#05386b
    style C fill:#99ff99,stroke:#05386b
    style D fill:#ffcc99,stroke:#05386b
    style E fill:#cc99ff,stroke:#05386b
    style F fill:#ffff99,stroke:#05386b
    style G fill:#99ffff,stroke:#05386b
    style H fill:#ff99cc,stroke:#05386b
    style I fill:#ccff99,stroke:#05386b
```

## Interpretation and Components

### Components Explanation

| Component | Description | Role in Bayes' Theorem |
|-----------|-------------|------------------------|
| Prior | Initial belief before evidence | P(A) |
| Likelihood | Probability of evidence given hypothesis | P(B\|A) |
| Evidence | Observed data or information | P(B) |
| Posterior | Updated belief after evidence | P(A\|B) |
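To make the four components concrete, consider a standard worked example; every number here is assumed purely for illustration. Suppose a disease has prior prevalence P(D) = 0.01, a test detects it with likelihood (sensitivity) P(+|D) = 0.95, and the false-positive rate is P(+|D^c) = 0.05. The evidence term follows from the law of total probability:

```math
P(+) = 0.95 \times 0.01 + 0.05 \times 0.99 = 0.059, \qquad P(D|+) = \frac{0.95 \times 0.01}{0.059} \approx 0.161
```

Even with a seemingly accurate test, the posterior probability of disease is only about 16%, because the low prior dominates; the medical testing example below revisits this structure.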
### Bayesian vs. Frequentist Interpretation

```mermaid
graph LR
    subgraph Bayesian
        A1[Probabilities as Beliefs]
        B1[Prior Information Utilized]
        C1[Parameters Have Distributions]
    end
    subgraph Frequentist
        A2[Probabilities as Frequencies]
        B2[No Prior Information]
        C2[Parameters Are Fixed]
    end
    D["Bayes' Theorem"] --- Bayesian
    E[Sampling Theory] --- Frequentist

    style A1 fill:#d4f1f9,stroke:#05386b
    style B1 fill:#d4f1f9,stroke:#05386b
    style C1 fill:#d4f1f9,stroke:#05386b
    style A2 fill:#ffd580,stroke:#05386b
    style B2 fill:#ffd580,stroke:#05386b
    style C2 fill:#ffd580,stroke:#05386b
    style D fill:#ff9999,stroke:#05386b
    style E fill:#99ccff,stroke:#05386b
```

## Applications

### Core Application Areas

```mermaid
mindmap
  root((Bayes'<br>Theorem))
    Machine Learning
      Bayesian Networks
      Naive Bayes Classifiers
      Bayesian Optimization
    Statistics
      Parameter Estimation
      Hypothesis Testing
      Model Selection
    Medicine
      Diagnostic Testing
      Disease Prevalence
      Treatment Efficacy
    Finance
      Risk Assessment
      Portfolio Optimization
      Fraud Detection
    Information Theory
      Shannon Information
      Kullback-Leibler Divergence
      Coding Theory
```

### Example: Medical Testing

```mermaid
sequenceDiagram
    participant P as Prior (Disease Prevalence)
    participant L as Likelihood (Test Sensitivity)
    participant E as Evidence (Test Result)
    participant Post as Posterior (Diagnosis)

    P->>Post: Initial probability of disease
    L->>Post: How likely the test is positive if the patient has the disease
    E->>Post: Actually observed positive test result
    Post->>Post: Calculate P(Disease|Positive Test)
    Note over Post: P(D|+) = P(+|D)×P(D)/P(+)
```

## Implementation Methods

### 1. Direct Probability Calculation

```python
def bayes_theorem(prior, likelihood, evidence):
    """Return the posterior P(A|B) given P(A), P(B|A), and P(B)."""
    return (likelihood * prior) / evidence
```

### 2. Log-Space Calculation (for Numerical Stability)

```python
def bayes_theorem_log(log_prior, log_likelihood, log_evidence):
    """Return the log of the posterior probability.

    Working in log space avoids underflow when probabilities are very small.
    """
    return log_likelihood + log_prior - log_evidence
```

### 3. Recursive Bayesian Updating

```python
def recursive_bayesian_update(prior, likelihoods, evidences):
    """Update a belief sequentially as new evidence arrives.

    Each step applies Bayes' theorem, with the previous posterior
    serving as the prior for the next update.
    """
    posterior = prior
    for likelihood, evidence in zip(likelihoods, evidences):
        posterior = (likelihood * posterior) / evidence
    return posterior
```

A usage sketch of these helpers appears at the end of this note.

## Connection to Other Bayesian Methods

### Relationship to [[bayesian_networks|Bayesian Networks]]

Bayes' Theorem is the foundational calculation within Bayesian networks: the conditional probability tables (CPTs) supply the likelihood terms, and belief propagation amounts to repeated application of the theorem.

### Relationship to [[bayesian_graph_theory|Bayesian Graph Theory]]

In Bayesian graph theory, Bayes' Theorem governs the updates of probability distributions over graph structures, enabling probabilistic reasoning about complex relationships represented as graphs.

### Relationship to [[belief_updating|Belief Updating]]

Belief updating generalizes Bayes' Theorem to sequential, possibly hierarchical settings in which beliefs (priors) are continuously revised as new evidence arrives, forming the basis for active inference models.
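To show these helpers in action, here is a minimal usage sketch of the `bayes_theorem` function defined under Implementation Methods. The scenario and all numbers are assumed purely for illustration, and the two test results are treated as conditionally independent given the disease state:

```python
# Minimal usage sketch; all numbers are illustrative assumptions.
sensitivity = 0.95       # P(+|D), assumed
false_positive = 0.05    # P(+|not D), assumed
belief = 0.01            # prior P(D), assumed prevalence

for test_number in (1, 2):
    # Marginal probability of a positive result under the current belief,
    # via the law of total probability.
    evidence = sensitivity * belief + false_positive * (1 - belief)
    # One Bayesian update: the posterior becomes the prior for the next test.
    belief = bayes_theorem(belief, sensitivity, evidence)
    print(f"P(D | {test_number} positive test(s)) = {belief:.3f}")
# Prints roughly 0.161 after one positive test and 0.785 after two.
```

Each iteration recomputes the evidence term from the current belief; precomputing a sequence of such evidence terms is exactly what `recursive_bayesian_update` expects in its `evidences` argument.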
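The same first update can be repeated with `bayes_theorem_log`, which matters when priors or likelihoods are small enough to underflow ordinary floating point; the numbers are again illustrative:

```python
import math

# Log-space version of the first update above (illustrative numbers).
log_prior = math.log(0.01)                           # log P(D)
log_likelihood = math.log(0.95)                      # log P(+|D)
log_evidence = math.log(0.95 * 0.01 + 0.05 * 0.99)   # log P(+)

log_posterior = bayes_theorem_log(log_prior, log_likelihood, log_evidence)
print(f"P(D|+) = {math.exp(log_posterior):.3f}")  # ~0.161, matching the direct result
```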