# The Science Behind It

Belief Tracker isn't a wellness app or a productivity gimmick. It's a small implementation of a serious body of research on **calibrated probabilistic forecasting**, a field with about seven decades of evidence behind it. This page sketches the lineage.

---

## Foundational Work

**The Brier score.**

- Brier, G.W. (1950). Verification of forecasts expressed in terms of probability. *Monthly Weather Review*, 78(1), 1–3.
  The original paper introducing what's now known as the Brier score. Brier was a weather forecaster looking for a way to score probabilistic forecasts that rewarded both calibration and resolution.

**Calibration as a learnable skill.**

- Lichtenstein, S., Fischhoff, B., & Phillips, L.D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), *Judgment under uncertainty: Heuristics and biases* (pp. 306–334). Cambridge University Press.
  The canonical review of early calibration research. It establishes the widespread finding of overconfidence in human probability judgments.

---

## Modern Forecasting Research

**The Good Judgment Project.**

- Tetlock, P.E., & Gardner, D. (2015). *Superforecasting: The art and science of prediction.* Crown Publishers.
  The accessible book-length presentation of the Good Judgment Project's findings. It reports that a small group of "superforecasters" reliably outperformed intelligence-agency analysts on geopolitical predictions, and that the skill is learnable through practice and feedback.
- Mellers, B., et al. (2014). Psychological strategies for winning a geopolitical forecasting tournament. *Psychological Science*, 25(5), 1106–1115.
  The peer-reviewed paper underlying the superforecaster findings.
- Mellers, B., Stone, E., Atanasov, P., et al. (2015). The psychology of intelligence analysis: Drivers of prediction accuracy in world politics. *Journal of Experimental Psychology: Applied*, 21(1), 1–14.


---

## Cognitive Biases in Forecasting

**Overconfidence and the planning fallacy.**

- Kahneman, D., & Tversky, A. (1979). Intuitive prediction: Biases and corrective procedures. *TIMS Studies in Management Science*, 12, 313–327.
- Buehler, R., Griffin, D., & Ross, M. (1994). Exploring the "planning fallacy": Why people underestimate their task completion times. *Journal of Personality and Social Psychology*, 67(3), 366–381.

**Hindsight bias.**

- Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. *Journal of Experimental Psychology: Human Perception and Performance*, 1(3), 288–299.
  Documents the systematic tendency, once an outcome is known, to remember one's earlier predictions as more accurate than they were. This is the primary reason that writing predictions down before knowing the outcome matters at all.

---

## Proper Scoring Rules

**Why the Brier score works.**

- Gneiting, T., & Raftery, A.E. (2007). Strictly proper scoring rules, prediction, and estimation. *Journal of the American Statistical Association*, 102(477), 359–378.
  Lays out the mathematical theory of "proper" scoring rules, those that incentivize honest probability reporting. The Brier score is one of the two most widely used proper scoring rules (the other is the log score).

**Log score as an alternative.**

- Good, I.J. (1952). Rational decisions. *Journal of the Royal Statistical Society, Series B*, 14(1), 107–114.
  The logarithmic scoring rule. Belief Tracker uses Brier rather than log because Brier is more intuitive for users new to the concept; log penalizes confident-wrong predictions even more heavily.

---

## Calibration Training

**Does practice actually help?**

- Lichtenstein, S., & Fischhoff, B. (1980). Training for calibration. *Organizational Behavior and Human Performance*, 26(2), 149–171.
  One of the earliest demonstrations that calibration is improvable with practice and feedback.
- Moore, D.A., Tenney, E.R., & Haran, U. (2016). Overprecision in judgment. In G. Keren & G. Wu (Eds.), *The Wiley Blackwell Handbook of Judgment and Decision Making* (Vol. 2, pp. 182–209). Wiley.

---

## Applied & Adjacent Reading

| Source | What it offers |
|--------|---------------|
| *Superforecasting* (Tetlock & Gardner) | The single most accessible introduction to the field |
| *The Signal and the Noise* (Nate Silver) | Forecasting in practice, with case studies across domains |
| *Thinking, Fast and Slow* (Daniel Kahneman) | The broader landscape of judgment biases |
| Open Philanthropy's calibration training app | A different free tool for practicing on multiple-choice trivia |
| Manifold Markets, Metaculus | Public prediction platforms with track records you can study |

---

## A Note on Methodology

Belief Tracker uses the standard Brier formula:

$\text{Brier} = \frac{1}{N} \sum_{i=1}^{N} (p_i - o_i)^2$

where $N$ is the number of scored predictions, $p_i$ is the stated probability of prediction $i$, and $o_i$ is the binary outcome (1 if correct, 0 if incorrect). Lower is better: 0 is a perfect score, and always answering 50% yields 0.25.

Calibration bins are 10-percentage-point bands (50–59, 60–69, 70–79, 80–89, 90–99). The midpoint of each band is used as the "perfect calibration" reference. This binning is common in the calibration-training literature.

Belief Tracker does not currently apply Laplace smoothing or other adjustments to small-sample bins; bars in bins with few predictions are simply unreliable until more data accumulates. This is on the list of small improvements to make.

---

## Back to the Documentation Index

→ [Back to Introduction to Belief Tracker](Introduction%20to%20Belief%20Tracker.md)
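
As a worked illustration of the methodology note above, the following is a minimal Python sketch of the Brier formula and the 10-point calibration bands. It is not Belief Tracker's actual code: the function names `brier_score` and `calibration_table` are hypothetical, probabilities are assumed to be stored as fractions between 0 and 1, the band midpoint is assumed to be the band start plus 4.5 points (54.5 for 50–59), and predictions outside 50–99 are simply skipped.

```python
from collections import defaultdict


def brier_score(probabilities, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def calibration_table(probabilities, outcomes):
    """Group predictions into 10-point bands (50-59, 60-69, ..., 90-99).

    Returns {band_start: (reference, hit_rate, count)}, where reference is
    the assumed band midpoint expressed as a fraction.
    """
    bands = defaultdict(list)
    for p, o in zip(probabilities, outcomes):
        percent = round(p * 100)
        if not 50 <= percent <= 99:
            continue  # assumption: out-of-range predictions are ignored
        band_start = (percent // 10) * 10
        bands[band_start].append(o)
    return {
        start: ((start + 4.5) / 100, sum(hits) / len(hits), len(hits))
        for start, hits in sorted(bands.items())
    }


if __name__ == "__main__":
    probs = [0.9, 0.8, 0.7, 0.6]
    outs = [1, 1, 0, 1]
    print(round(brier_score(probs, outs), 3))
    for start, (reference, hit_rate, count) in calibration_table(probs, outs).items():
        print(f"{start}-{start + 9}: reference={reference:.3f} hit_rate={hit_rate:.2f} n={count}")
```

For the four sample predictions, the squared errors are 0.01, 0.04, 0.49, and 0.16, so the Brier score is 0.70 / 4 = 0.175. Each band here holds a single prediction, which is exactly the small-sample situation the note warns about: a hit rate of 0 or 1 from one data point says little about calibration.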