# The Science Behind It
Belief Tracker isn't a wellness app or a productivity gimmick. It's a small implementation of a serious body of research on **calibrated probabilistic forecasting** — a field with about seven decades of evidence behind it. This page sketches the lineage.
---
## Foundational Work
**The Brier score.**
- Brier, G.W. (1950). Verification of forecasts expressed in terms of probability. *Monthly Weather Review*, 78(1), 1–3. — The original paper introducing what's now known as the Brier score. Brier was a weather forecaster looking for a way to score probabilistic forecasts that rewarded both calibration and resolution.
**Calibration as a learnable skill.**
- Lichtenstein, S., Fischhoff, B., & Phillips, L.D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), *Judgment under uncertainty: Heuristics and biases* (pp. 306–334). Cambridge University Press. — The canonical review of early calibration research. Establishes the widespread finding of overconfidence in human probability judgments.
---
## Modern Forecasting Research
**The Good Judgment Project.**
- Tetlock, P.E., & Gardner, D. (2015). *Superforecasting: The art and science of prediction.* Crown Publishers. — The accessible book-length presentation of the Good Judgment Project's findings. Demonstrates empirically that a small group of "superforecasters" reliably outperform intelligence-agency analysts on geopolitical predictions, and that the skill is learnable through practice and feedback.
- Mellers, B., et al. (2014). Psychological strategies for winning a geopolitical forecasting tournament. *Psychological Science*, 25(5), 1106–1115. — The peer-reviewed paper underlying the superforecaster findings.
- Mellers, B., Stone, E., Atanasov, P., et al. (2015). The psychology of intelligence analysis: Drivers of prediction accuracy in world politics. *Journal of Experimental Psychology: Applied*, 21(1), 1–14.
---
## Cognitive Biases in Forecasting
**Overconfidence and the planning fallacy.**
- Kahneman, D., & Tversky, A. (1979). Intuitive prediction: Biases and corrective procedures. *TIMS Studies in Management Science*, 12, 313–327.
- Buehler, R., Griffin, D., & Ross, M. (1994). Exploring the "planning fallacy": Why people underestimate their task completion times. *Journal of Personality and Social Psychology*, 67(3), 366–381.
**Hindsight bias.**
- Fischhoff, B. (1975). Hindsight ≠ foresight: The effect of outcome knowledge on judgment under uncertainty. *Journal of Experimental Psychology: Human Perception and Performance*, 1(3), 288–299. — Documents the systematic tendency to remember past predictions as more accurate than they were. The primary reason writing predictions down before knowing the outcome matters at all.
---
## Proper Scoring Rules
**Why the Brier score works.**
- Gneiting, T., & Raftery, A.E. (2007). Strictly proper scoring rules, prediction, and estimation. *Journal of the American Statistical Association*, 102(477), 359–378. — Establishes the mathematical foundation of "proper" scoring rules — those that incentivize honest probability reporting. The Brier score is one of two foundational proper scoring rules (the other is the log score).
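A short worked derivation may make "proper" concrete. As an illustrative setup (the symbol $q$ is introduced here and does not appear elsewhere on this page), suppose your true belief that an event will happen is $q$, but you report $p$. The expected Brier penalty for that single prediction is

$\mathbb{E}\left[(p - o)^2\right] = q\,(p - 1)^2 + (1 - q)\,p^2$

Differentiating with respect to $p$ gives

$\frac{d}{dp}\,\mathbb{E}\left[(p - o)^2\right] = 2q\,(p - 1) + 2\,(1 - q)\,p = 2\,(p - q)$

which is zero only at $p = q$, and the second derivative is $2 > 0$. Under this setup, reporting your true belief is the unique way to minimize your expected penalty, so shading a forecast up or down cannot help on average.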
**Log score as an alternative.**
- Good, I.J. (1952). Rational decisions. *Journal of the Royal Statistical Society, Series B*, 14(1), 107–114. — The logarithmic scoring rule. Belief Tracker uses Brier rather than log because Brier is more intuitive for users new to the concept; log penalizes confident-wrong predictions even more heavily.
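To see how differently the two rules treat a confident miss, here is a minimal sketch that scores the same wrong prediction under both. The function names are illustrative and are not taken from Belief Tracker's code; the log penalty is written as a negative natural log so that, like Brier, lower is better.

```python
import math


def brier_penalty(p: float, outcome: int) -> float:
    """Squared error for one prediction; always between 0 and 1."""
    return (p - outcome) ** 2


def log_penalty(p: float, outcome: int) -> float:
    """Negative natural log of the probability given to what actually happened.

    Grows without bound as that probability approaches 0.
    """
    prob_of_what_happened = p if outcome == 1 else 1.0 - p
    return -math.log(prob_of_what_happened)


# Three predictions that all turned out wrong (outcome = 0),
# stated with increasing confidence.
for p in (0.6, 0.9, 0.99):
    print(f"p={p:.2f}  Brier={brier_penalty(p, 0):.4f}  log={log_penalty(p, 0):.4f}")
```

The Brier penalty approaches 1 but never exceeds it, while the log penalty keeps climbing as confidence in the wrong answer approaches certainty.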
---
## Calibration Training
**Does practice actually help?**
- Lichtenstein, S., & Fischhoff, B. (1980). Training for calibration. *Organizational Behavior and Human Performance*, 26(2), 149–171. — One of the earliest demonstrations that calibration is improvable with practice and feedback.
- Moore, D.A., Tenney, E.R., & Haran, U. (2016). Overprecision in judgment. In G. Keren & G. Wu (Eds.), *The Wiley Blackwell Handbook of Judgment and Decision Making* (Vol. 2, pp. 182–209). Wiley.
---
## Applied & Adjacent Reading
| Source | What it offers |
|--------|---------------|
| *Superforecasting* (Tetlock & Gardner) | The single most accessible introduction to the field |
| *The Signal and the Noise* (Nate Silver) | Forecasting in practice, with case studies across domains |
| *Thinking, Fast and Slow* (Daniel Kahneman) | The broader landscape of judgment biases |
| Open Philanthropy's calibration training app | A different free tool for practicing on multiple-choice trivia |
| Manifold Markets, Metaculus | Public prediction platforms with track records you can study |
---
## A Note on Methodology
Belief Tracker uses the standard Brier formula:
$\text{Brier} = \frac{1}{N} \sum_{i=1}^{N} (p_i - o_i)^2$
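where $p_i$ is the stated probability of prediction $i$ and $o_i$ is the binary outcome (1 if the prediction turned out correct, 0 if it did not). Lower is better: 0 is a perfect score, 1 is the worst possible, and answering 50% on everything yields 0.25.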
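The formula can be sketched in a few lines of Python. This is an illustration of the arithmetic, not Belief Tracker's actual implementation; the function name and the sample data are made up for the example.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between stated probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: four predictions, three of which turned out correct.
example_probabilities = [0.9, 0.7, 0.6, 0.8]
example_outcomes = [1, 1, 0, 1]

score = brier_score(example_probabilities, example_outcomes)
print(round(score, 4))  # 0.125
```

Most of the penalty in this example comes from the single miss at 60%: it contributes 0.36 of the 0.50 total squared error.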
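Calibration bins are 10-percentage-point bands (50–59, 60–69, 70–79, 80–89, 90–99). The midpoint of each band is used as the "perfect calibration" reference: if you are well calibrated, the share of predictions in a band that turn out correct should land close to that midpoint. Ten-point bands like these are common in the calibration-training literature.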
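A rough sketch of the binning follows. It rests on several assumptions that this page does not confirm: confidences are stored as whole percentages, the midpoint is computed as the average of a band's two endpoints (54.5 for 50–59), and values outside 50–99 are skipped. All names are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple

BAND_LOWER_BOUNDS = (50, 60, 70, 80, 90)


class BinSummary(NamedTuple):
    label: str
    reference: float            # band midpoint, in percent (assumed convention)
    count: int                  # number of predictions in the band
    hit_rate: Optional[float]   # percent correct, or None for an empty band


def calibration_bins(predictions: Iterable[Tuple[int, bool]]) -> List[BinSummary]:
    """Group (percent_confidence, was_correct) pairs into ten-point bands."""
    hits = {low: 0 for low in BAND_LOWER_BOUNDS}
    totals = {low: 0 for low in BAND_LOWER_BOUNDS}
    for percent, was_correct in predictions:
        if not 50 <= percent <= 99:
            continue  # assumption: values outside the listed bands are ignored
        low = (percent // 10) * 10
        totals[low] += 1
        hits[low] += int(was_correct)

    summaries = []
    for low in BAND_LOWER_BOUNDS:
        high = low + 9
        n = totals[low]
        hit_rate = 100.0 * hits[low] / n if n else None
        summaries.append(BinSummary(f"{low}–{high}", (low + high) / 2, n, hit_rate))
    return summaries


# Hypothetical prediction log.
sample = [(55, True), (58, False), (72, True), (75, True), (78, False), (95, True)]
table = calibration_bins(sample)
for row in table:
    print(row)
```

Comparing each band's `hit_rate` with its `reference` shows the direction of miscalibration: a hit rate below the reference suggests overconfidence in that band, and a hit rate above it suggests underconfidence.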
Belief Tracker does not currently apply Laplace smoothing or other adjustments to small-sample bins; bars in bins with few predictions are simply unreliable until more data accumulates. This is on the list of small improvements to make.
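For reference, one form such an adjustment could take is add-one (Laplace) smoothing, which pulls the observed hit rate of a sparse bin toward 50%. The sketch below is purely hypothetical: Belief Tracker does not do this today, and the function name and default pseudo-count are illustrative choices.

```python
def laplace_smoothed_rate(hits: int, total: int, pseudo_count: float = 1.0) -> float:
    """Hit rate after adding `pseudo_count` imaginary hits and misses.

    With the default of 1.0 this is the classic add-one estimate,
    (hits + 1) / (total + 2).
    """
    if total < 0 or not 0 <= hits <= total:
        raise ValueError("need 0 <= hits <= total")
    return (hits + pseudo_count) / (total + 2 * pseudo_count)


print(laplace_smoothed_rate(2, 2))      # 0.75, rather than a raw 100%
print(laplace_smoothed_rate(0, 0))      # 0.5 for an empty bin
print(laplace_smoothed_rate(90, 100))   # close to the raw 90%
```

The adjustment matters most when a bin holds only a handful of predictions and fades as the bin fills up, which is the behavior wanted for bars that are otherwise unreliable at small sample sizes.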
---
## Back to the Documentation Index
→ [Back to Introduction to Belief Tracker](Introduction%20to%20Belief%20Tracker.md)