202401211543
Status: #📚 #book #antilibrary #article
Tags: #reference
Links: ___

# Kahneman_et_al-Noise

* Author: [Daniel Kahneman, Olivier Sibony, and Cass R. Sunstein](https://www.amazon.com/Daniel-Kahneman/e/B001ILFNQG/ref=dp_byline_cont_ebooks_1)
* ASIN: B08LCZFJZ2
* Reference: https://www.amazon.com/dp/B08LCZFJZ2
* [Kindle link](kindle://book?action=open&asin=B08LCZFJZ2)

## Summary

Noise can be thought of as the follow-up to [[Daniel Kahneman]]’s seminal [[Thinking Fast and Slow]]. General idea is that errors in decision-making are typically the result of two factors: *bias* and *noise*. If TFaS is all about *bias*, then Noise is all about explaining what bias doesn’t cover. Bias describes a predictable and consistent tendency to favour one direction or another, while noise describes a less predictable (and less consistent) variation in judgements, especially in those judgements we typically deem to be consistent or mechanical. It's the inconsistency we find when we expect consistency, and can't explain easily with causal stories. While bias has been studied extensively and discussed at length (thanks largely to Kahneman’s work), noise is more of a silent killer. This is because bias better fits with our preference for finding causal narratives, whereas noise requires a more statistical mindset: considering any given judgement as a recurring judgement that's been made only once. The book disambiguates between *judgement* and *thinking*; judgement is a form of measurement in which the instrument is a human mind. In order to control for noise, an open-minded, statistical approach is necessary — along with a pretty big dose of humility. This includes conducting *noise audits* and sticking to a [[Bias Observation Checklist]].
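The book's error equation (MSE = Bias² + Noise²) can be sketched with a toy noise audit. A minimal sketch, with invented premium figures purely for illustration:

```python
import statistics

# Hypothetical noise audit: five underwriters quote a premium for the same
# case, whose "true" fair premium we stipulate for illustration only.
true_value = 100_000
quotes = [112_000, 95_000, 130_000, 104_000, 119_000]

mean_quote = statistics.fmean(quotes)
bias = mean_quote - true_value    # shared, systematic error of the system
noise = statistics.pstdev(quotes)  # scatter of the quotes around their mean
mse = statistics.fmean((q - true_value) ** 2 for q in quotes)

# The error equation: MSE = Bias² + Noise² (noise as the population SD).
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-3
print(f"bias={bias:,.0f}  noise={noise:,.0f}  MSE={mse:,.0f}")
```

Squaring is what makes bias and noise interchangeable contributors to overall error: reducing either by the same amount buys the same reduction in MSE.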
The book suggests the best way to think about singular judgments is to treat them as recurrent judgments that are made only once, taking something of an [[Expected Value]] or even [[Bayesian]] statistical approach, in accordance with the thinking of [[Game Theory]] and ideas like [[Annie Duke]]’s [[Resulting]]. In all decisions that are supposed to be mechanical (i.e. consistent), basic algorithms will generally outperform human judges, because they reduce noise. Therefore, we should look to automate our judgements once decisions have been made. Think [[Ulysses Contract]].

## Quotes

> “TK”
> — [[Author]] #quote

## Core Ideas

1. [[Noise vs Bias]]
2. [[MSE (Mean of Squared Errors)]]
3.
4.

## Worth it if…

### Further Reading, Listening, Watching

1. [Movie Name, Author]
2. [Documentary Name, Author]
3. [Podcast Name, Author]

## Chapters

### Introduction: Two Kinds of Error

#### Part I: Finding Noise
1. Crime and Noisy Punishment
2. A Noisy System
3. Singular Decisions

#### Part II: Your Mind Is a Measuring Instrument
4. Matters of Judgment
5. Measuring Error
6. The Analysis of Noise
7. Occasion Noise
8. How Groups Amplify Noise

#### Part III: Noise in Predictive Judgments
9. Judgments and Models
10. Noiseless Rules
11. Objective Ignorance
12. The Valley of the Normal

#### Part IV: How Noise Happens
13. Heuristics, Biases, and Noise
14. The Matching Operation
15. Scales
16. Patterns
17. The Sources of Noise

#### Part V: Improving Judgments
18. Better Judges for Better Judgments
19. Debiasing and Decision Hygiene
20. Sequencing Information in Forensic Science
21. Selection and Aggregation in Forecasting
22. Guidelines in Medicine
23. Defining the Scale in Performance Ratings
24. Structure in Hiring
25. The Mediating Assessments Protocol

#### Part VI: Optimal Noise
26. The Costs of Noise Reduction
27. Dignity
28. Rules or Standards?
#### Review and Conclusion: Taking Noise Seriously

#### Epilogue: A Less Noisy World

- Appendix A: How to Conduct a Noise Audit
- Appendix B: A Checklist for a Decision Observer
- Appendix C: Correcting Predictions

— location: [76](kindle://book?action=open&asin=B08LCZFJZ2&location=76) ^ref-35338

# Quotes

---

We call Team B biased because its shots are systematically off target. As the figure illustrates, the consistency of the bias supports a prediction. If one of the team’s members were to take another shot, we would bet on its landing in the same area as the first five.

— location: [116](kindle://book?action=open&asin=B08LCZFJZ2&location=116) ^ref-65073

---

We call Team C noisy because its shots are widely scattered. There is no obvious bias, because the impacts are roughly centered on the bull’s-eye. If one of the team’s members took another shot, we would know very little about where it is likely to hit.

— location: [119](kindle://book?action=open&asin=B08LCZFJZ2&location=119) ^ref-59513

---

We introduce several noise-reduction techniques that we collect under the label of decision hygiene.

— location: [189](kindle://book?action=open&asin=B08LCZFJZ2&location=189) ^ref-55485

---

mediating assessments protocol: a general-purpose approach to the evaluation of options that incorporates several key practices of decision hygiene and aims to produce less noisy and more reliable judgments.

— location: [192](kindle://book?action=open&asin=B08LCZFJZ2&location=192) ^ref-20345

---

wherever there is judgment, there is noise—and more of it than you think.

— location: [224](kindle://book?action=open&asin=B08LCZFJZ2&location=224) ^ref-12918

---

A noise audit—like the one conducted on federal judges with respect to sentencing—is a way to reveal noise. In such an audit, the same case is evaluated by many individuals, and the variability of their responses is made visible.
— location: [414](kindle://book?action=open&asin=B08LCZFJZ2&location=414) ^ref-36399

---

The noise audits suggested that respected professionals—and the organizations that employ them—maintained an illusion of agreement while in fact disagreeing in their daily professional judgments. To begin to understand how the illusion of agreement arises, put yourself in the shoes of an underwriter on a normal working day. You have more than five years of experience, you know that you are well regarded among your colleagues, and you respect and like them. You know you are good at your job. After thoroughly analyzing the complex risks faced by a financial firm, you conclude that a premium of $200,000 is appropriate.

— location: [482](kindle://book?action=open&asin=B08LCZFJZ2&location=482) ^ref-30263

---

Most of us, most of the time, live with the unquestioned belief that the world looks as it does because that’s the way it is.

— location: [490](kindle://book?action=open&asin=B08LCZFJZ2&location=490) ^ref-6356

---

These beliefs, which have been called naive realism, are essential to the sense of a reality we share with other people.

— location: [492](kindle://book?action=open&asin=B08LCZFJZ2&location=492) ^ref-51358

---

We hold a single interpretation of the world around us at any one time, and we normally invest little effort in generating plausible alternatives to it. One interpretation is enough, and we experience it as true. We do not go through life imagining alternative ways of seeing what we see.

— location: [493](kindle://book?action=open&asin=B08LCZFJZ2&location=493) ^ref-56109

---

While each case is in some sense unique, judgments like these are recurrent decisions. Doctors diagnosing patients, judges hearing parole cases, admissions officers reviewing applications, accountants preparing tax forms—these are all examples of recurrent decisions.
— location: [536](kindle://book?action=open&asin=B08LCZFJZ2&location=536) ^ref-48001

---

Unwanted variability is easy to define and measure when interchangeable professionals make decisions in similar cases. It seems much harder, or perhaps even impossible, to apply the idea of noise to a category of judgments that we call singular decisions.

— location: [538](kindle://book?action=open&asin=B08LCZFJZ2&location=538) ^ref-55234

---

From the perspective of noise reduction, a singular decision is a recurrent decision that happens only once.

— location: [594](kindle://book?action=open&asin=B08LCZFJZ2&location=594) ^ref-58465

---

Judgment can therefore be described as measurement in which the instrument is a human mind. Implicit in the notion of measurement is the goal of accuracy—to approach truth and minimize error.

— location: [608](kindle://book?action=open&asin=B08LCZFJZ2&location=608) ^ref-31906

---

In statistics, the most common measure of variability is standard deviation, and we will use it to measure noise in judgments.

— location: [630](kindle://book?action=open&asin=B08LCZFJZ2&location=630) ^ref-19910

---

The thought process you went through illustrates several features of the mental operation we call judgment: Of all the cues provided by the description (which are only a subset of what you might need to know), you attended to some more than others without being fully aware of the choices you made. Did you notice that Gambardi is an Italian name? Do you remember the school he attended? This exercise was designed to overload you so that you could not easily recover all the details of the case. Most likely, your recollection of what we presented would be different from that of other readers. Selective attention and selective recall are a source of variability across people. Then, you informally integrated these cues into an overall impression of Gambardi’s prospects. The key word here is informally. You did not construct a plan for answering the question.
Without being fully aware of what you were doing, your mind worked to construct a coherent impression of Michael’s strengths and weaknesses and of the challenges he faces. The informality allowed you to work quickly. It also produces variability: a formal process such as adding a column of numbers guarantees identical results, but some noise is inevitable in an informal operation. Finally, you converted this overall impression into a number on a probability scale of success. Matching a number between 0 and 100 to an impression is a remarkable process, to which we will return in chapter 14. Again, you do not know exactly why you responded as you did. Why did you choose, say, 65 rather than 61 or 69? Most likely, at some point, a number came to your mind. You checked whether that number felt right, and if it did not, another number came to mind. This part of the process is also a source of variability across people.

— location: [683](kindle://book?action=open&asin=B08LCZFJZ2&location=683) ^ref-17019

---

The Gambardi exercise is an example of a nonverifiable predictive judgment,

— location: [710](kindle://book?action=open&asin=B08LCZFJZ2&location=710) ^ref-61974

---

Just as we did when you used your stopwatch to measure laps, we can compute the standard deviation of the forecasts. As its name indicates, the standard deviation represents a typical distance from the mean. In this example, it is 10 percentage points. As is true for every normal distribution, about two-thirds of the forecasts are contained within one standard deviation on either side of the mean—in this example, between a 34% and a 54% market share.

— location: [834](kindle://book?action=open&asin=B08LCZFJZ2&location=834) ^ref-50015

---

we need a “scoring rule” for errors, a way to weight and combine individual errors into a single measure of overall error. Fortunately, such a tool exists.
It is the method of least squares, invented in 1795 by Carl Friedrich Gauss, a famous mathematical prodigy born in 1777, who began a career of major discoveries in his teens. Gauss proposed a rule for scoring the contribution of individual errors to overall error. His measure of overall error—called mean squared error (MSE)—is the average of the squares of the individual errors of measurement.

— location: [863](kindle://book?action=open&asin=B08LCZFJZ2&location=863) ^ref-23544

---

the median number, the measurement that sits between the two shorter measurements and the two longer ones.

— location: [876](kindle://book?action=open&asin=B08LCZFJZ2&location=876) ^ref-47811

---

the arithmetic mean, known in common parlance as the average,

— location: [877](kindle://book?action=open&asin=B08LCZFJZ2&location=877) ^ref-29021

---

The role of bias and noise in error is easily summarized in two expressions that we will call the error equations.

— location: [903](kindle://book?action=open&asin=B08LCZFJZ2&location=903) ^ref-21070

---

Error in a single measurement = Bias + Noisy Error

— location: [907](kindle://book?action=open&asin=B08LCZFJZ2&location=907) ^ref-18719

---

Overall Error (MSE) = Bias² + Noise²

— location: [911](kindle://book?action=open&asin=B08LCZFJZ2&location=911) ^ref-3594

---

The error equation is the intellectual foundation of this book.

— location: [963](kindle://book?action=open&asin=B08LCZFJZ2&location=963) ^ref-7368

---

While you may well be familiar with the term standard deviation, you may find a concrete description useful. Imagine that you randomly pick two judges and compute the difference between their judgments of a case. Now repeat, for all pairs of judges and all cases, and average the results. This measure, the mean absolute difference, should give you a sense of the lottery that faces the defendant in a federal courtroom.
Assuming that the judgments are normally distributed, it is 1.128 times the standard deviation, which implies that the average difference between two randomly chosen sentences of the same case will be 3.8 years.

— location: [1044](kindle://book?action=open&asin=B08LCZFJZ2&location=1044) ^ref-33212

---

As any defense lawyer will tell you, judges have reputations, some for being harsh “hanging judges,” who are more severe than the average judge, and others for being “bleeding-heart judges,” who are more lenient than the average judge. We refer to these deviations as level errors.

— location: [1058](kindle://book?action=open&asin=B08LCZFJZ2&location=1058) ^ref-32570

---

This difference indicates that there is more to system noise than differences in average severity across individual judges. We will call this other component of noise pattern noise.

— location: [1078](kindle://book?action=open&asin=B08LCZFJZ2&location=1078) ^ref-8940

---

We use the term pattern noise for the variability we just identified, because that variability reflects a complex pattern in the attitudes of judges to particular cases. One judge, for instance, may be harsher than average in general but relatively more lenient toward white-collar criminals. Another may be inclined to punish lightly but more severely when the offender is a recidivist. A third may be close to the average severity but sympathetic when the offender is merely an accomplice and tough when the victim is an older person. (We use the term pattern noise in the interest of readability. The proper statistical term for pattern noise is judge × case interaction—pronounced “judge-by-case.” We apologize to people with statistical training for imposing the burden of translation on them.)
— location: [1092](kindle://book?action=open&asin=B08LCZFJZ2&location=1092) ^ref-54539

---

System Noise² = Level Noise² + Pattern Noise²

— location: [1106](kindle://book?action=open&asin=B08LCZFJZ2&location=1106) ^ref-30211

---

If a judge is in a good mood because something nice happened to her daughter, or because a favorite sports team won yesterday, or because it is a beautiful day, her judgment might be more lenient than it would otherwise be. This within-person variability is conceptually distinct from the stable between-person differences that we have just discussed—but it is difficult to tell these sources of variability apart. Our name for the variability that is due to transient effects is occasion noise.

— location: [1119](kindle://book?action=open&asin=B08LCZFJZ2&location=1119) ^ref-58314

---

Two researchers, Edward Vul and Harold Pashler, had the idea of asking people to answer this question (and many similar ones) not once but twice. The subjects were not told the first time that they would have to guess again. Vul and Pashler’s hypothesis was that the average of the two answers would be more accurate than either of the answers on its own. The data proved them right. In general, the first guess was closer to the truth than the second, but the best estimate came from averaging the two guesses. Vul and Pashler drew inspiration from the well-known phenomenon known as the wisdom-of-crowds effect: averaging the independent judgments of different people generally improves accuracy.

— location: [1198](kindle://book?action=open&asin=B08LCZFJZ2&location=1198) ^ref-47838

---

Vul and Pashler wanted to find out if the same effect extends to occasion noise: can you get closer to the truth by combining two guesses from the same person, just as you do when you combine the guesses of different people? As they discovered, the answer is yes.
Vul and Pashler gave this finding an evocative name: the crowd within

— location: [1214](kindle://book?action=open&asin=B08LCZFJZ2&location=1214) ^ref-51165

---

Working independently of Vul and Pashler but at about the same time, two German researchers, Stefan Herzog and Ralph Hertwig, came up with a different implementation of the same principle. Instead of merely asking their subjects to produce a second estimate, they encouraged people to generate an estimate that—while still plausible—was as different as possible from the first one. This request required the subjects to think actively of information they had not considered the first time. The instructions to participants read as follows: First, assume that your first estimate is off the mark. Second, think about a few reasons why that could be. Which assumptions and considerations could have been wrong? Third, what do these new considerations imply? Was the first estimate rather too high or too low? Fourth, based on this new perspective, make a second, alternative estimate.

— location: [1222](kindle://book?action=open&asin=B08LCZFJZ2&location=1222) ^ref-1151

---

Like Vul and Pashler, Herzog and Hertwig then averaged the two estimates thus produced. Their technique, which they named dialectical bootstrapping, produced larger improvements in accuracy than did a simple request for a second estimate immediately following the first.

— location: [1229](kindle://book?action=open&asin=B08LCZFJZ2&location=1229) ^ref-35388

---

The propensity to find meaning in such statements is a trait known as bullshit receptivity. (Bullshit has become something of a technical term since Harry Frankfurt, a philosopher at Princeton University, published an insightful book, On Bullshit, in which he distinguished bullshit from other types of misrepresentation.)

— location: [1274](kindle://book?action=open&asin=B08LCZFJZ2&location=1274) ^ref-29417

---

mood.
In one study, researchers exposed subjects to the footbridge problem, a classic problem in moral philosophy. In this thought experiment, five people are about to be killed by a runaway trolley. Subjects are to imagine themselves standing on a footbridge, underneath which the trolley will soon pass. They must decide whether to push a large man off the footbridge and onto the tracks so that his body will stop the trolley. If they do so, they are told, the large man will die, but the five people will be saved.

— location: [1282](kindle://book?action=open&asin=B08LCZFJZ2&location=1282) ^ref-29992

---

We have described these studies of mood in some detail because we need to emphasize an important truth: you are not the same person at all times.

— location: [1295](kindle://book?action=open&asin=B08LCZFJZ2&location=1295) ^ref-36879

---

Another source of random variability in judgment is the order in which cases are examined. When a person is considering a case, the decisions that immediately preceded it serve as an implicit frame of reference. Professionals who make a series of decisions in sequence, including judges, loan officers, and baseball umpires, lean toward restoring a form of balance: after a streak, or a series of decisions that go in the same direction, they are more likely to decide in the opposite direction than would be strictly justified. As a result, errors (and unfairness) are inevitable. Asylum judges in the United States, for instance, are 19% less likely to grant asylum to an applicant when the previous two cases were approved. A person might be approved for a loan if the previous two applications were denied, but the same person might have been rejected if the previous two applications had been granted. This behavior reflects a cognitive bias known as the gambler’s fallacy: we tend to underestimate the likelihood that streaks will occur by chance.
— location: [1314](kindle://book?action=open&asin=B08LCZFJZ2&location=1314) ^ref-42621

---

They were testing for a particular driver of noise: social influence

— location: [1393](kindle://book?action=open&asin=B08LCZFJZ2&location=1393) ^ref-4240

---

As Salganik and his coauthors later demonstrated, group outcomes can be manipulated fairly easily, because popularity is self-reinforcing

— location: [1401](kindle://book?action=open&asin=B08LCZFJZ2&location=1401) ^ref-7403

---

Some of the studies we are describing involve informational cascades. Such cascades are pervasive. They help explain why similar groups in business, government, and elsewhere can go in multiple directions and why small changes can produce such different outcomes and hence noise.

— location: [1457](kindle://book?action=open&asin=B08LCZFJZ2&location=1457) ^ref-4624

---

However, the study of juries uncovers a distinct kind of social influence that is also a source of noise: group polarization. The basic idea is that when people speak with one another, they often end up at a more extreme point in line with their original inclinations.

— location: [1510](kindle://book?action=open&asin=B08LCZFJZ2&location=1510) ^ref-9621

---

A measure that captures this intuition is the percent concordant (PC), which answers a more specific question: Suppose you take a pair of employees at random. What is the probability that the one who scored higher on an evaluation of potential also performs better on the job? If the accuracy of the early ratings were perfect, the PC would be 100%: the ranking of two employees by potential would be a perfect prediction of their eventual ranking by performance. If the predictions were entirely useless, concordance would occur by chance only, and the “higher-potential” employee would be just as likely as not to perform better: PC would be 50%. We will discuss this example, which has been studied extensively, in chapter 9.
For a simpler example, PC for foot length and height in adult men is 71%. If you look at two people, first at their head and then at their feet, there is a 71% chance that the taller of the two also has the larger feet.

— location: [1572](kindle://book?action=open&asin=B08LCZFJZ2&location=1572) ^ref-30043

---

PC is an immediately intuitive measure of covariation, which is a large advantage, but it is not the standard measure that social scientists use. The standard measure is the correlation coefficient (r), which varies between 0 and 1 when two variables are positively related. In the preceding example, the correlation between height and foot size is about .60.

— location: [1579](kindle://book?action=open&asin=B08LCZFJZ2&location=1579) ^ref-62680

---

the fact that most judgments are made in a state of what we call objective ignorance, because many things on which the future depends can simply not be known. Strikingly, we manage, most of the time, to remain oblivious to this limitation and make predictions with confidence (or, indeed, overconfidence).

— location: [1593](kindle://book?action=open&asin=B08LCZFJZ2&location=1593) ^ref-34400

---

The informal approach you took to this problem is known as clinical judgment. You consider the information, perhaps engage in a quick computation, consult your intuition, and come up with a judgment. In fact, clinical judgment is the process that we have described simply as judgment in this book.

— location: [1612](kindle://book?action=open&asin=B08LCZFJZ2&location=1612) ^ref-64170

---

A standard statistical method answers these questions. In the present study, it yields an optimal correlation of .32 (PC = 60%), far from impressive but substantially higher than what clinical predictions achieved. This technique, called multiple regression, produces a predictive score that is a weighted average of the predictors.
It finds the optimal set of weights, chosen to maximize the correlation between the composite prediction and the target variable. The optimal weights minimize the MSE (mean squared error) of the predictions—a prime example of the dominant role of the least squares principle in statistics. As you might expect, the predictor that is most closely correlated with the target variable gets a large weight, and useless predictors get a weight of zero. Weights could also be negative: the candidate’s number of unpaid traffic tickets would probably get a negative weight as a predictor of managerial success.

— location: [1624](kindle://book?action=open&asin=B08LCZFJZ2&location=1624) ^ref-53507

---

The use of multiple regression is an example of mechanical prediction. There are many kinds of mechanical prediction, ranging from simple rules (“hire anyone who completed high school”) to sophisticated artificial intelligence models. But linear regression models are the most common (they have been called “the workhorse of judgment and decision-making research”). To minimize jargon, we will refer to linear models as simple models.

— location: [1632](kindle://book?action=open&asin=B08LCZFJZ2&location=1632) ^ref-17745

---

“People believe they capture complexity and add subtlety when they make judgments. But the complexity and the subtlety are mostly wasted—usually they do not add to the accuracy of simple models.”

— location: [1775](kindle://book?action=open&asin=B08LCZFJZ2&location=1775) ^ref-43783

---

“There is so much noise in judgment that a noise-free model of a judge achieves more accurate predictions than the actual judge does.”

— location: [1778](kindle://book?action=open&asin=B08LCZFJZ2&location=1778) ^ref-3665

---

Although nowadays these are the applications we have in mind when we hear the word algorithm, the term has a broader meaning.
In one dictionary’s definition, an algorithm is a “process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.” By this definition, simple models and other forms of mechanical judgment we described in the previous chapter are algorithms, too.

— location: [1786](kindle://book?action=open&asin=B08LCZFJZ2&location=1786) ^ref-37701

---

Robyn Dawes was another member of the Eugene, Oregon, team of stars that studied judgment in the 1960s and 1970s. In 1974, Dawes achieved a breakthrough in the simplification of prediction tasks. His idea was surprising, almost heretical: instead of using multiple regression to determine the precise weight of each predictor, he proposed giving all the predictors equal weights. Dawes labeled the equal-weight formula an improper linear model. His surprising discovery was that these equal-weight models are about as accurate as “proper” regression models, and far superior to clinical judgments.

— location: [1797](kindle://book?action=open&asin=B08LCZFJZ2&location=1797) ^ref-59259

---

The immediate implication of Dawes’s work deserves to be widely known: you can make valid statistical predictions without prior data about the outcome that you are trying to predict. All you need is a collection of predictors that you can trust to be correlated with the outcome.

— location: [1827](kindle://book?action=open&asin=B08LCZFJZ2&location=1827) ^ref-35466

---

For example, large data sets make it possible to deal mechanically with broken-leg exceptions. This somewhat cryptic phrase goes back to an example that Meehl imagined: Consider a model that was designed to predict the probability that people will go to the movies tonight. Regardless of your confidence in the model, if you happen to know that a particular person just broke a leg, you probably know better than the model what their evening will look like.
— location: [1871](kindle://book?action=open&asin=B08LCZFJZ2&location=1871) ^ref-64317

---

“When there is a lot of data, machine-learning algorithms will do better than humans and better than simple models. But even the simplest rules and algorithms have big advantages over human judges: they are free of noise, and they do not attempt to apply complex, usually invalid insights about the predictors.”

— location: [1975](kindle://book?action=open&asin=B08LCZFJZ2&location=1975) ^ref-49911

---

Research in managerial decision making has shown that executives, especially the more senior and experienced ones, resort extensively to something variously called intuition, gut feel, or, simply, judgment (used in a different sense from the one we use in this book).

— location: [1988](kindle://book?action=open&asin=B08LCZFJZ2&location=1988) ^ref-11726

---

One review of intuition in managerial decision making defines it as “a judgment for a given course of action that comes to mind with an aura or conviction of rightness or plausibility, but without clearly articulated reasons or justifications—essentially ‘knowing’ but without knowing why.” We propose that this sense of knowing without knowing why is actually the internal signal of judgment completion that we mentioned in chapter 4.

— location: [1992](kindle://book?action=open&asin=B08LCZFJZ2&location=1992) ^ref-3493

---

What makes the internal signal important—and misleading—is that it is construed not as a feeling but as a belief. This emotional experience (“the evidence feels right”) masquerades as rational confidence in the validity of one’s judgment (“I know, even if I don’t know why”).

— location: [1999](kindle://book?action=open&asin=B08LCZFJZ2&location=1999) ^ref-53178

---

Confidence is no guarantee of accuracy, however, and many confident predictions turn out to be wrong. While both bias and noise contribute to prediction errors, the largest source of such errors is not the limit on how good predictive judgments are.
It is the limit on how good they could be. This limit, which we call objective ignorance, is the focus of this chapter.

— location: [2001](kindle://book?action=open&asin=B08LCZFJZ2&location=2001) ^ref-18701

---

What we said of noise in predictive judgments can also be said of objective ignorance: wherever there is prediction, there is ignorance, and more of it than you think.

— location: [2034](kindle://book?action=open&asin=B08LCZFJZ2&location=2034) ^ref-8443

---

A good friend of ours, the psychologist Philip Tetlock, is armed with a fierce commitment to truth and a mischievous sense of humor. In 2005, he published a book titled Expert Political Judgment.

— location: [2036](kindle://book?action=open&asin=B08LCZFJZ2&location=2036) ^ref-41159

---

The book became famous for its arresting punch line: “The average expert was roughly as accurate as a dart-throwing chimpanzee.”

— location: [2042](kindle://book?action=open&asin=B08LCZFJZ2&location=2042) ^ref-15295

---

A more precise statement of the book’s message was that experts who make a living “commenting or offering advice on political and economic trends” were not “better than journalists or attentive readers of the New York Times in ‘reading’ emerging situations.”

— location: [2043](kindle://book?action=open&asin=B08LCZFJZ2&location=2043) ^ref-17136

---

Tetlock’s experts barely exceeded this very low standard. On average, they assigned slightly higher probabilities to events that occurred than to those that did not, but the most salient feature of their performance was their excessive confidence in their predictions. Pundits blessed with clear theories about how the world works were the most confident and the least accurate.

— location: [2050](kindle://book?action=open&asin=B08LCZFJZ2&location=2050) ^ref-17201

---

Our conclusion, then, is that pundits should not be blamed for the failures of their distant predictions.
They do, however, deserve some criticism for attempting an impossible task and for believing they can succeed in it. — location: [2058](kindle://book?action=open&asin=B08LCZFJZ2&location=2058) ^ref-57068 --- The team discovered that short-term forecasting is difficult but not impossible, and that some people, whom Tetlock and Mellers called superforecasters, are consistently better at it than most others, including professionals in the intelligence community. — location: [2061](kindle://book?action=open&asin=B08LCZFJZ2&location=2061) ^ref-56659 --- People who believe themselves capable of an impossibly high level of predictive accuracy are not just overconfident. They don’t merely deny the risk of noise and bias in their judgments. Nor do they simply deem themselves superior to other mortals. They also believe in the predictability of events that are in fact unpredictable, implicitly denying the reality of uncertainty. In the terms we have used here, this attitude amounts to a denial of ignorance. — location: [2102](kindle://book?action=open&asin=B08LCZFJZ2&location=2102) ^ref-27206 --- The challenge is that the “price” in this situation is not the same. Intuitive judgment comes with its reward, the internal signal. People are prepared to trust an algorithm that achieves a very high level of accuracy because it gives them a sense of certainty that matches or exceeds that provided by the internal signal. But giving up the emotional reward of the internal signal is a high price to pay when the alternative is some sort of mechanical process that does not even claim high validity. — location: [2120](kindle://book?action=open&asin=B08LCZFJZ2&location=2120) ^ref-37096 --- If, as you read the first part of this chapter, you asked yourself what drives evictions and other life outcomes among fragile families, you engaged in the same sort of thinking as that of the researchers whose efforts we described. 
You applied statistical thinking: you were concerned with ensembles, such as the population of fragile families, and with the statistics that describe them, including averages, variances, correlations, and so on. You were not focused on individual cases. A different mode of thinking, which comes more naturally to our minds, will be called here causal thinking. Causal thinking creates stories in which specific events, people, and objects affect one another. To experience causal thinking, picture yourself as a social worker who follows the cases of many underprivileged families. You have just heard that one of these families, the Joneses, has been evicted. Your reaction to this event is informed by what you know about the Joneses. — location: [2223](kindle://book?action=open&asin=B08LCZFJZ2&location=2223) ^ref-46528 --- In the valley of the normal, events unfold just like the Joneses’ eviction: they appear normal in hindsight, although they were not expected, and although we could not have predicted them. This is because the process of understanding reality is backward-looking. — location: [2246](kindle://book?action=open&asin=B08LCZFJZ2&location=2246) ^ref-17314 --- Causal thinking helps us make sense of a world that is far less predictable than we think. It also explains why we view the world as far more predictable than it really is. In the valley of the normal, there are no surprises and no inconsistencies. The future seems as predictable as the past. And noise is neither heard nor seen. — location: [2291](kindle://book?action=open&asin=B08LCZFJZ2&location=2291) ^ref-46087 --- “Correlation does not imply causation, but causation does imply correlation.” — location: [2295](kindle://book?action=open&asin=B08LCZFJZ2&location=2295) ^ref-58289 --- Substitution of one question for another is not restricted to similarity and probability. Another example is the replacement of a judgment of frequency by an impression of the ease with which instances come to mind. 
For example, the perception of the risk of airplane crashes or hurricanes rises briefly after well-publicized instances of such events. In theory, a judgment of risk should be based on a long-term average. In reality, recent incidents are given more weight because they come more easily to mind. Substituting a judgment of how easily examples come to mind for an assessment of frequency is known as the availability heuristic. — location: [2411](kindle://book?action=open&asin=B08LCZFJZ2&location=2411) ^ref-64549 --- The substitution of an easy judgment for a hard one is not limited to these examples. In fact, it is very common. Answering an easier question can be thought of as a general-purpose procedure for answering a question that could stump you. Consider how we tend to answer each of the following questions by using its easier substitute: Do I believe in climate change? Do I trust the people who say it exists? Do I think this surgeon is competent? Does this individual speak with confidence and authority? Will the project be completed on schedule? Is it on schedule now? Is nuclear energy necessary? Do I recoil at the word nuclear? Am I satisfied with my life as a whole? What is my mood right now? — location: [2415](kindle://book?action=open&asin=B08LCZFJZ2&location=2415) ^ref-19444 --- This example illustrates a different type of bias, which we call conclusion bias, or prejudgment. Like Lucas, we often start the process of judgment with an inclination to reach a particular conclusion. — location: [2439](kindle://book?action=open&asin=B08LCZFJZ2&location=2439) ^ref-35545 --- Prejudgments are evident wherever we look. Like Lucas’s reaction, they often have an emotional component. The psychologist Paul Slovic terms this the affect heuristic: people determine what they think by consulting their feelings. We like most things about politicians we favor, and we dislike even the looks and the voices of politicians we dislike. 
— location: [2451](kindle://book?action=open&asin=B08LCZFJZ2&location=2451) ^ref-49708 --- A subtler example of a conclusion bias is the anchoring effect, which is the effect that an arbitrary number has on people who must make a quantitative judgment. In a typical demonstration, you might be presented with a number of items whose price is not easy to guess, such as an unfamiliar bottle of wine. You are asked to jot down the last two digits of your Social Security number and indicate whether you would pay that amount for the bottle. Finally, you are asked to state the maximum amount you would be willing to pay for it. The results show that anchoring on your Social Security number will affect your final buying price. In one study, people whose Social Security numbers generated a high anchor (more than eighty dollars) stated that they were willing to pay about three times more than those with a low anchor (less than twenty dollars). — location: [2458](kindle://book?action=open&asin=B08LCZFJZ2&location=2458) ^ref-24474 --- This experiment illustrates excessive coherence: we form coherent impressions quickly and are slow to change them. In this example, we immediately developed a positive attitude toward the candidate, in light of little evidence. Confirmation bias—the same tendency that leads us, when we have a prejudgment, to disregard conflicting evidence altogether—made us assign less importance than we should to subsequent data. (Another term to describe this phenomenon is the halo effect, because the candidate was evaluated in the positive “halo” of the first impression. We will see in chapter 24 that the halo effect is a serious problem in hiring decisions.) — location: [2483](kindle://book?action=open&asin=B08LCZFJZ2&location=2483) ^ref-12603 --- This chapter focuses on the role of the response scale as a pervasive source of noise. People may differ in their judgments, not because they disagree on the substance but because they use the scale in different ways. 
— location: [2729](kindle://book?action=open&asin=B08LCZFJZ2&location=2729) ^ref-1227 --- Variance of Judgments = Variance of Just Punishments + (Level Noise)² + (Pattern Noise)² — location: [2794](kindle://book?action=open&asin=B08LCZFJZ2&location=2794) ^ref-1595 --- The legendary Harvard psychologist S. S. Stevens discovered the surprising fact that people share strong intuitions about the ratios of intensity of many subjective experiences and attitudes. They can adjust a light so that it appears “twice as bright” as another, and they agree that the emotional significance of a ten-month prison sentence is not nearly ten times as bad as that of a sentence of one month. Stevens called scales that draw on such intuitions ratio scales. — location: [2819](kindle://book?action=open&asin=B08LCZFJZ2&location=2819) ^ref-24577 --- The authors of the study named the persistent effect of a single anchor “coherent arbitrariness.” — location: [2836](kindle://book?action=open&asin=B08LCZFJZ2&location=2836) ^ref-32688 --- When do you feel confident in a judgment? Two conditions must be satisfied: the story you believe must be comprehensively coherent, and there must be no attractive alternatives. Comprehensive coherence is achieved when all the details of the chosen interpretation fit with the story and reinforce each other. Of course, you can also achieve coherence, albeit less elegantly, by ignoring or explaining away whatever does not fit. It is the same with alternative interpretations. The true expert who has “solved” a judgment problem knows not only why her explanatory story is correct; she is equally fluent in explaining why other stories are wrong. Here again, a person can gain confidence of equal strength but poorer quality by failing to consider alternatives or by actively suppressing them.
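The variance decomposition quoted above (Variance of Judgments = Variance of Just Punishments + (Level Noise)² + (Pattern Noise)²) can be checked on a toy noise-audit table. This is a minimal sketch, not from the book: the judgments are invented, and the variance of the case means stands in for the "variance of just punishments" (i.e. it assumes no shared bias across judges).

```python
from statistics import mean, pvariance

# Hypothetical noise-audit data: rows = judges, columns = cases
# (e.g. sentences in years). Values are made up for illustration.
matrix = [
    [5.0, 7.0, 3.0, 9.0],
    [6.0, 9.0, 4.0, 9.0],
    [4.0, 6.0, 2.0, 8.0],
]
judges, cases = len(matrix), len(matrix[0])
grand = mean(v for row in matrix for v in row)
case_means = [mean(matrix[j][c] for j in range(judges)) for c in range(cases)]
judge_means = [mean(row) for row in matrix]

# Residual after removing the case effect and each judge's overall level
residuals = [matrix[j][c] - case_means[c] - judge_means[j] + grand
             for j in range(judges) for c in range(cases)]

var_judgments = pvariance([v for row in matrix for v in row])
var_cases = pvariance(case_means)        # "variance of just punishments" proxy
level_noise_sq = pvariance(judge_means)  # (level noise)²
pattern_noise_sq = pvariance(residuals)  # (pattern noise)²

# The additive identity holds exactly for a complete judges-by-cases table
assert abs(var_judgments - (var_cases + level_noise_sq + pattern_noise_sq)) < 1e-9
```

This is the standard two-way decomposition of variance: over a complete balanced table, the case effects, judge effects, and residuals are orthogonal, so their variances add exactly.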
— location: [2912](kindle://book?action=open&asin=B08LCZFJZ2&location=2912) ^ref-59927 --- For a simple case of stable pattern noise, consider recruiters who predict the future performance of executives on the basis of a set of ratings. In chapter 9 we spoke of a “model of the judge.” The model of an individual recruiter assigns a weight to each rating, which corresponds to its importance in that recruiter’s judgments. The weights vary among recruiters: leadership may count more for one recruiter, communication skills for another. Such differences produce variability in the recruiters’ ranking of candidates—an instance of what we call stable pattern noise. — location: [2941](kindle://book?action=open&asin=B08LCZFJZ2&location=2941) ^ref-51876 --- The figure illustrates three successive breakdowns of error: error into bias and system noise, system noise into level noise and pattern noise, pattern noise into stable pattern noise and occasion noise. — location: [3026](kindle://book?action=open&asin=B08LCZFJZ2&location=3026) ^ref-31575 --- Earlier in this book, we noted that we easily make sense of events in hindsight, although we could not have predicted them before they happened. In the valley of the normal, events are unsurprising and easily explained. The same can be said of judgments. Like other events, judgments and decisions mostly happen in the valley of the normal; they usually do not surprise us. For one thing, judgments that produce satisfactory outcomes are normal, and seldom questioned. When the shooter who is picked for the free kick scores the goal, when the heart surgery is successful, or when a start-up prospers, we assume that the reasons the decision makers had for their choices must have been the right ones. After all, they have been proven right. Like any other unsurprising story, a success story explains itself once the outcome is known. 
— location: [3123](kindle://book?action=open&asin=B08LCZFJZ2&location=3123) ^ref-42232 --- A well-documented psychological bias called the fundamental attribution error is a strong tendency to assign blame or credit to agents for actions and outcomes that are better explained by luck or by objective circumstances. Another bias, hindsight, distorts judgments so that outcomes that could not have been anticipated appear easily foreseeable in retrospect. — location: [3132](kindle://book?action=open&asin=B08LCZFJZ2&location=3132) ^ref-37949 --- The confidence we have in these experts’ judgment is entirely based on the respect they enjoy from their peers. We call them respect-experts. The term respect-expert is not meant to be disrespectful. The fact that some experts are not subject to an evaluation of the accuracy of their judgments is not a criticism; it is a fact of life in many domains. Many professors, scholars, and management consultants are respect-experts. Their credibility depends on the respect of their students, peers, or clients. In all these fields, and many more, the judgments of one professional can be compared only with those of her peers. — location: [3238](kindle://book?action=open&asin=B08LCZFJZ2&location=3238) ^ref-58153 --- Many debates and misunderstandings arise in discussions of measures of intelligence or of general mental ability (GMA, the term now used in preference to intelligence quotient, or IQ). There are lingering misconceptions about the innate nature of intelligence; in fact, tests measure developed abilities, which are partly a function of heritable traits and partly influenced by the environment, including educational opportunities. Many people also have concerns about the adverse impact of GMA-based selection on identifiable social groups and the legitimacy of using GMA tests for selection purposes. We need to separate these concerns about the use of tests from the reality of their predictive value. 
Since the US Army started using tests of mental ability more than a century ago, thousands of studies have measured the link between cognitive test scores and subsequent performance. The message that emerges from this mass of research is unambiguous. As one review put it, “GMA predicts both occupational level attained and performance within one’s chosen occupation and does so better than any other ability, trait, or disposition and better than job experience.” Of course, other cognitive abilities matter too (more on this later). So do many personality traits—including conscientiousness and grit, defined as perseverance and passion in the pursuit of long-term goals. And yes, there are various forms of intelligence that GMA tests do not measure, such as practical intelligence and creativity. Psychologists and neuroscientists distinguish between crystallized intelligence, the ability to solve problems by relying on a store of knowledge about the world (including arithmetical operations), and fluid intelligence, the ability to solve novel problems. Yet for all its crudeness and limitations, GMA, as measured by standardized tests containing questions on verbal, quantitative, and spatial problems, remains by far the best single predictor of important outcomes. As the previously mentioned review adds, the predictive power of GMA is “larger than most found in psychological research.” The strength of the association between general mental ability and job success increases, quite logically, with the complexity of the job in question: intelligence matters more for rocket scientists than it does for those with simpler tasks. For jobs of high complexity, the correlations that can be observed between standardized test scores and job performance are in the .50 range (PC = 67%). As we have noted, a correlation of .50 indicates a very strong predictive value by social-science standards. 
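The "PC = 67%" figure attached to a .50 correlation above can be reproduced with the standard conversion between a correlation and the percent of concordant pairs (the chance that, of two random people, the one higher on the predictor is also higher on the outcome). Assuming bivariate normality, PC = 50% + arcsin(r)/π; this is a sketch of that conversion, not the book's own code:

```python
from math import asin, pi

def percent_concordant(r: float) -> float:
    """Chance that the person higher on the predictor is also higher
    on the outcome, for a random pair (bivariate-normal assumption)."""
    return 0.5 + asin(r) / pi

# A correlation of .50 corresponds to roughly 67% concordant pairs,
# matching the PC = 67% quoted for high-complexity jobs.
print(round(percent_concordant(0.50) * 100))  # → 67
```

Note how flat the curve is: even a "very strong" correlation by social-science standards moves the pairwise hit rate only from the 50% of pure chance to about two-thirds.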
— location: [3275](kindle://book?action=open&asin=B08LCZFJZ2&location=3275) ^ref-62834 --- However, this measure fails to capture differences in achievement within these groups. Even among the top 1% of people as measured by cognitive ability (evaluated at age thirteen), exceptional outcomes are strongly correlated with GMA. Compared with those who are in the bottom quartile of this top 1%, those who are in the top quartile are two to three times more likely to earn a doctoral-level degree, publish a book, or be granted a patent. In other words, not only does the difference in GMA matter between the 99th percentile and the 80th or 50th, but it still matters—a lot!—between the 99.88th percentile and the 99.13th. — location: [3303](kindle://book?action=open&asin=B08LCZFJZ2&location=3303) ^ref-37235 --- In another striking illustration of the link between ability and outcomes, a 2013 study focused on the CEOs of Fortune 500 companies and the 424 American billionaires (the top 0.0001% of Americans by wealth). It found, predictably, that these hyper-elite groups are composed of people drawn from the most intellectually able. But the study also found that within these groups, higher education and ability levels are related to higher compensation (for CEOs) and net worth (for billionaires). Incidentally, famous college dropouts who become billionaires, such as Steve Jobs, Bill Gates, and Mark Zuckerberg, are the trees that hide the forest: whereas about one-third of American adults have earned a college degree, 88% of billionaires did so. — location: [3308](kindle://book?action=open&asin=B08LCZFJZ2&location=3308) ^ref-23363 --- The conclusion is clear. GMA contributes significantly to the quality of performance in occupations that require judgment, even within a pool of high-ability individuals. The notion that there is a threshold beyond which GMA ceases to make a difference is not supported by the evidence. 
This conclusion in turn strongly suggests that if professional judgments are unverifiable but assumed to reach for an invisible bull’s-eye, then the judgments of high-ability people are more likely to be close. If you must pick people to make judgments, picking those with the highest mental ability makes a lot of sense. — location: [3313](kindle://book?action=open&asin=B08LCZFJZ2&location=3313) ^ref-63290 --- Regardless of mental ability, people differ in their cognitive style, or their approach to judgment tasks. Many instruments have been developed to capture cognitive styles. Most of these measures correlate with GMA (and with one another), but they measure different things. — location: [3325](kindle://book?action=open&asin=B08LCZFJZ2&location=3325) ^ref-9894 --- One such measure is the cognitive reflection test (CRT), made famous by the now-ubiquitous question about the ball and the bat: “A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?” Other questions that have been proposed to measure cognitive reflection include this one: “If you’re running a race and you pass the person in second place, what place are you in?” CRT questions attempt to measure how likely people are to override the first (and wrong) answer that comes to mind (“ten cents” for the ball-and-bat question, and “first” for the race example). Lower CRT scores are associated with many real-world judgments and beliefs, including belief in ghosts, astrology, and extrasensory perception. The scores predict whether people will fall for blatantly inaccurate “fake news.” They are even associated with how much people will use their smartphones. — location: [3327](kindle://book?action=open&asin=B08LCZFJZ2&location=3327) ^ref-9594 --- Thinking Assessment, which focuses on critical thinking skills, including both a disposition toward rational thinking and a set of learnable skills. 
Taking this assessment, you would be asked questions like this: “Imagine that a friend asks you for advice about which of two weight-loss programs to choose. Whereas one program reports that clients lose an average of twenty-five pounds, the other program reports that they lose an average of thirty pounds. What questions would you like to have answered before choosing one of the programs?” If you answered, for instance, that you would want to know how many people lost this much weight and whether they maintained that weight loss for a year or more, you would score points for applying critical thinking. People who score well on the Adult Decision Making Competence scale or on the Halpern assessment seem to make better judgments in life: they experience fewer adverse life events driven by bad choices, such as needing to pay late fees for a movie rental and experiencing an unwanted pregnancy. — location: [3349](kindle://book?action=open&asin=B08LCZFJZ2&location=3349) ^ref-30224 --- As we use the term, judgment should not be confused with “thinking.” It is a much narrower concept: judgment is a form of measurement in which the instrument is a human mind. Like other measurements, a judgment assigns a score to an object. The score need not be a number. — location: [5263](kindle://book?action=open&asin=B08LCZFJZ2&location=5263) ^ref-40157 --- Many people earn a living by making professional judgments, and everyone is affected by such judgments in important ways. Professional judges, as we call them here, include football coaches and cardiologists, lawyers and engineers, Hollywood executives and insurance underwriters, and many more. — location: [5268](kindle://book?action=open&asin=B08LCZFJZ2&location=5268) ^ref-9546 --- Some judgments are predictive, and some predictive judgments are verifiable; we will eventually know whether they were accurate. 
This is generally the case for short-term forecasts of outcomes such as the effects of a medication, the course of a pandemic, or the results of an election. But many judgments, including long-term forecasts and answers to fictitious questions, are unverifiable. The quality of such judgments can be assessed only by the quality of the thought process that produces them. Furthermore, many judgments are not predictive but… — location: [5272](kindle://book?action=open&asin=B08LCZFJZ2&location=5272) ^ref-4137 --- people who make judgments behave as if a true value exists, regardless of whether it does. They think and act as if there were an invisible bull’s-eye at which to aim, one that they and others should not miss by much. The phrase judgment call implies both the possibility of disagreement and the expectation that it will be… — location: [5277](kindle://book?action=open&asin=B08LCZFJZ2&location=5277) ^ref-12988 --- We say that bias exists when most errors in a set of judgments are in the same direction. Bias is the average error, as, for example, when a team of shooters consistently hits below and to the left of the target; when executives are too optimistic about sales, year after year; or when a company keeps reinvesting money in failing projects that it should write off. Eliminating bias from a set of judgments will not eliminate all error. The errors that remain when bias is removed are not shared. They are the unwanted divergence of judgments, the unreliability of the measuring instrument we apply to reality. They are noise. Noise is variability in judgments that should be identical. We use the term system noise for the noise observed in… — location: [5282](kindle://book?action=open&asin=B08LCZFJZ2&location=5282) ^ref-2692 --- The mean of squared errors (MSE) has been the standard of accuracy in scientific measurement for two hundred years. 
The main features of MSE are that it yields the sample mean as an unbiased estimate of the population mean, treats positive and… — location: [5290](kindle://book?action=open&asin=B08LCZFJZ2&location=5290) ^ref-54057 --- MSE does not reflect the real costs of judgment errors, which are often asymmetric. However, professional decisions always require accurate predictions. For a city facing a hurricane, the costs of under- and overestimating the threat are clearly not the same, but you would not want these costs to influence the meteorologists’ forecast of the storm’s speed and trajectory. MSE is the… — location: [5292](kindle://book?action=open&asin=B08LCZFJZ2&location=5292) ^ref-14859 --- Bias and noise make equal contributions to overall error (MSE) when the mean of errors (the bias) is equal to the standard deviations of errors (the noise). When the distribution of judgments is normal (the standard bell-shaped curve), the effects of bias and noise are equal when 84% of judgments are above (or below) the true value. This is a substantial bias, which will often be detectable in a professional context. When the bias is smaller than one standard deviation, noise is the bigger source of overall error. — location: [5308](kindle://book?action=open&asin=B08LCZFJZ2&location=5308) ^ref-41902 --- The large role of noise in error contradicts a commonly held belief that random errors do not matter, because they “cancel out.” This belief is wrong. If multiple shots are scattered around the target, it is unhelpful to say that, on average, they hit the bull’s-eye. — location: [5319](kindle://book?action=open&asin=B08LCZFJZ2&location=5319) ^ref-36670 --- System noise can be broken down into level noise and pattern noise. Some judges are generally more severe than others, and others are more lenient; some forecasters are generally bullish and others bearish about market prospects; some doctors prescribe more antibiotics than others do. 
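The MSE claims in these passages — error splits into bias plus noise, and the two contribute equally when the bias equals the standard deviation of the errors — can be verified numerically. A minimal sketch with made-up judgment values (nothing here is from the book):

```python
from statistics import mean, pstdev

def error_decomposition(judgments, true_value):
    """Decompose mean squared error into bias² + noise², where bias is the
    mean error and noise is the population standard deviation of the errors."""
    errors = [j - true_value for j in judgments]
    mse = mean(e * e for e in errors)
    bias = mean(errors)
    noise = pstdev(errors)
    return mse, bias, noise

# Hypothetical forecasts of a quantity whose true value is 100
mse, bias, noise = error_decomposition([104, 98, 110, 96, 102], 100)
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-9
```

Here the bias is 2 but the noise is about 4.9, so noise accounts for most of the overall error — the situation the text describes whenever the bias is smaller than one standard deviation.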
— location: [5326](kindle://book?action=open&asin=B08LCZFJZ2&location=5326) ^ref-10375 --- Level noise is the variability of the average judgments made by different individuals. The ambiguity of judgment scales is one of the sources of level noise. Words such as likely or numbers (e.g., “4 on a scale of 0 to 6”) mean different things to different people. — location: [5328](kindle://book?action=open&asin=B08LCZFJZ2&location=5328) ^ref-21886 --- System noise includes another, generally larger component. Regardless of the average level of their judgments, two judges may differ in their views of which crimes deserve the harsher sentences. Their sentencing decisions will produce a different ranking of cases. We call this variability pattern noise (the technical term is statistical interaction). — location: [5331](kindle://book?action=open&asin=B08LCZFJZ2&location=5331) ^ref-24712 --- The main source of pattern noise is stable: it is the difference in the personal, idiosyncratic responses of judges to the same case. — location: [5333](kindle://book?action=open&asin=B08LCZFJZ2&location=5333) ^ref-23512 --- This stable pattern noise reflects the uniqueness of judges: their response to cases is as individual as their personality. The subtle differences among people are often enjoyable and interesting, but the differences become problematic when professionals operate within a system that assumes consistency. — location: [5339](kindle://book?action=open&asin=B08LCZFJZ2&location=5339) ^ref-44035 --- Pattern noise also has a transient component, called occasion noise. We detect this kind of noise if a radiologist assigns different diagnoses to the same image on different days or if a fingerprint examiner identifies two prints as a match on one occasion but not on another. — location: [5343](kindle://book?action=open&asin=B08LCZFJZ2&location=5343) ^ref-8913 --- The judges’ cognitive flaws are not the only cause of errors in predictive judgments. 
Objective ignorance often plays a larger role. Some facts are actually unknowable—how many grandchildren a baby born yesterday will have seventy years from now, or the number of a winning lottery ticket in a drawing to be held next year. Others are perhaps knowable but are not known to the judge. People’s exaggerated confidence in their predictive judgment underestimates their objective ignorance as well as their biases. — location: [5348](kindle://book?action=open&asin=B08LCZFJZ2&location=5348) ^ref-50082 --- There is a limit to the accuracy of our predictions, and this limit is often quite low. Nevertheless, we are generally comfortable with our judgments. What gives us this satisfying confidence is an internal signal, a self-generated reward for fitting the facts and the judgment into a coherent story. — location: [5352](kindle://book?action=open&asin=B08LCZFJZ2&location=5352) ^ref-39633 --- Psychological biases are, of course, a source of systematic error, or statistical bias. Less obviously, they are also a source of noise. When biases are not shared by all judges, when they are present to different degrees, and when their effects depend on extraneous circumstances, psychological biases produce noise. For instance, if half the managers who make hiring decisions are biased against women and half are biased in their favor, there will be no overall bias, but system noise will cause many hiring errors. Another example is the disproportionate effect of first impressions. — location: [5360](kindle://book?action=open&asin=B08LCZFJZ2&location=5360) ^ref-2653 --- Large individual differences emerge when a judgment requires the weighting of multiple, conflicting cues. Looking at the same candidate, some recruiters will give more weight to evidence of brilliance or charisma; others will be more influenced by concerns about diligence or calm under pressure. 
When cues are inconsistent and do not fit a coherent story, different people will inevitably give more weight to certain cues and ignore others. Pattern noise will result. — location: [5371](kindle://book?action=open&asin=B08LCZFJZ2&location=5371) ^ref-1082 --- Whenever something goes wrong, we look for a cause—and often find it. In many cases, the cause will appear to be a bias. Bias has a kind of explanatory charisma, which noise lacks. If we try to explain, in hindsight, why a particular decision was wrong, we will easily find bias and never find noise. Only a statistical view of the world enables us to see noise, but that view does not come naturally—we prefer causal stories. — location: [5379](kindle://book?action=open&asin=B08LCZFJZ2&location=5379) ^ref-23668 --- In most fields, a judgment may never be evaluated against a true value and will at most be subjected to vetting by another professional who is considered a respect-expert. Only occasionally will professionals be faced with a surprising disagreement, and when that happens, they will generally find reasons to view it as an isolated case. — location: [5387](kindle://book?action=open&asin=B08LCZFJZ2&location=5387) ^ref-57013 --- There is reason to believe that some people make better judgments than others do. Task-specific skill, intelligence, and a certain cognitive style—best described as being actively open-minded—characterize the best judges. — location: [5392](kindle://book?action=open&asin=B08LCZFJZ2&location=5392) ^ref-50864 --- One strategy for error reduction is debiasing. Typically, people attempt to remove bias from their judgments either by correcting judgments after the fact or by taming biases before they affect judgments. We propose a third option, which is particularly applicable to decisions made in a group setting: detect biases in real time, by designating a decision observer to identify signs of bias (see appendix B). 
— location: [5396](kindle://book?action=open&asin=B08LCZFJZ2&location=5396) ^ref-27188 --- Our main suggestion for reducing noise in judgment is decision hygiene. — location: [5399](kindle://book?action=open&asin=B08LCZFJZ2&location=5399) ^ref-26825 --- Decision hygiene is as unglamorous as its name and certainly less exciting than a victorious fight against predictable biases. There may be no glory in preventing an unidentified harm, but it is very much worth doing. — location: [5402](kindle://book?action=open&asin=B08LCZFJZ2&location=5402) ^ref-374 --- The goal of judgment is accuracy, not individual expression. This statement is our candidate for the first principle of decision hygiene in judgment. — location: [5408](kindle://book?action=open&asin=B08LCZFJZ2&location=5408) ^ref-28314 --- To be clear, personal values, individuality, and creativity are needed, even essential, in many phases of thinking and decision making, including the choice of goals, the formulation of novel ways to approach a problem, and the generation of options. But when it comes to making a judgment about these options, expressions of individuality are a source of noise. — location: [5412](kindle://book?action=open&asin=B08LCZFJZ2&location=5412) ^ref-58871 --- Think statistically, and take the outside view of the case. We say that a judge takes the outside view of a case when she considers it as a member of a reference class of similar cases rather than as a unique problem. This approach diverges from the default mode of thinking, which focuses firmly on the case at hand and embeds it in a causal story. — location: [5421](kindle://book?action=open&asin=B08LCZFJZ2&location=5421) ^ref-28709 --- Structure judgments into several independent tasks. This divide-and-conquer principle is made necessary by the psychological mechanism we have described as excessive coherence, which causes people to distort or ignore information that does not fit a preexisting or emerging story. 
Overall accuracy suffers when impressions of distinct aspects of a case contaminate each other. — location: [5430](kindle://book?action=open&asin=B08LCZFJZ2&location=5430) ^ref-53964

---

Resist premature intuitions. We have described the internal signal of judgment completion that gives decision makers confidence in their judgment. — location: [5439](kindle://book?action=open&asin=B08LCZFJZ2&location=5439) ^ref-63845

---

This principle inspires our recommendation to sequence the information: professionals who make judgments should not be given information that they don’t need and that could bias them, even if that information is accurate. — location: [5444](kindle://book?action=open&asin=B08LCZFJZ2&location=5444) ^ref-27843

---

Obtain independent judgments from multiple judges, then consider aggregating those judgments. The requirement of independence is routinely violated in the procedures of organizations, notably in meetings in which participants’ opinions are shaped by those of others. — location: [5448](kindle://book?action=open&asin=B08LCZFJZ2&location=5448) ^ref-30610

---

Favor relative judgments and relative scales. Relative judgments are less noisy than absolute ones, because our ability to categorize objects on a scale is limited, while our ability to make pairwise comparisons is much better. — location: [5455](kindle://book?action=open&asin=B08LCZFJZ2&location=5455) ^ref-50121

---

Similarly, the best way to think about singular judgments is to treat them as recurrent judgments that are made only once. That is why decision hygiene should improve them, too. — location: [5463](kindle://book?action=open&asin=B08LCZFJZ2&location=5463) ^ref-5789

---

How to Conduct a Noise Audit — location: [5499](kindle://book?action=open&asin=B08LCZFJZ2&location=5499) ^ref-27489

---

A noise audit requires a substantial amount of work and much attention to detail because its credibility will surely be questioned if its findings reveal significant flaws.
— location: [5506](kindle://book?action=open&asin=B08LCZFJZ2&location=5506) ^ref-15539

---

Alongside the consultant (who may be external or internal), the relevant cast of characters includes the following: — location: [5509](kindle://book?action=open&asin=B08LCZFJZ2&location=5509) ^ref-22987

---

Project team. The project team will be responsible for all phases of the study. If the consultants are internal, they will form the core of the project team. If the consultants are external, an internal project team will work closely with them. — location: [5511](kindle://book?action=open&asin=B08LCZFJZ2&location=5511) ^ref-58552

---

Clients. A noise audit will only be useful if it leads to significant changes, which requires early involvement of the leadership of the organization, which is the “client” of the project. You can expect clients to be initially skeptical about the prevalence of noise. — location: [5515](kindle://book?action=open&asin=B08LCZFJZ2&location=5515) ^ref-25646

---

Judges. The clients will designate one or more units to be audited. The selected unit should consist of a substantial number of “judges,” the professionals who make similar judgments and decisions on behalf of the company. — location: [5519](kindle://book?action=open&asin=B08LCZFJZ2&location=5519) ^ref-39535

---

Project manager. A high-level manager in the administrative staff should be designated as project manager. — location: [5524](kindle://book?action=open&asin=B08LCZFJZ2&location=5524) ^ref-21850

---

Construction of Case Materials

The subject matter experts who are part of the project team should have recognized expertise in the task of the unit (e.g., setting premiums for risks or evaluating the potential of possible investments). They will be in charge of developing the cases that will be used in the audit.
— location: [5529](kindle://book?action=open&asin=B08LCZFJZ2&location=5529) ^ref-37562

---

A questionnaire should be prepared for each case, to provide a deeper understanding of the reasoning that led each judge to a judgment of that case. The questionnaire should be administered only after the completion of all cases. It should include:

* Open questions about the key factors that led the participant to her response.
* A list of the facts of the case, allowing the participant to rate their importance.
* Questions that call for an “outside view” of the category to which the case belongs. For instance, if the cases call for dollar valuations, participants should provide an estimate of how much below or above average the case is compared to all valuations for cases of the same category.

— location: [5538](kindle://book?action=open&asin=B08LCZFJZ2&location=5538) ^ref-50451

---

Once the executives accept the design of the noise audit, the project team should ask them to state their expectations about the results of the study. They should discuss questions such as:

* “What level of disagreement do you expect between a randomly selected pair of answers to each case?”
* “What is the maximum level of disagreement that would be acceptable from a business perspective?”
* “What is the estimated cost of getting an evaluation wrong in either direction (too high or low) by a specified amount (e.g., 15%)?”

The answers to these questions should be documented to ensure that they are remembered and believed when the actual results of the audit come in. — location: [5550](kindle://book?action=open&asin=B08LCZFJZ2&location=5550) ^ref-28502

---

Administration of the Study

The managers of the audited unit should be, from the beginning, informed in general terms that their unit has been selected for special study.
However, it is important that the term noise audit not be used to describe the project. The words noise and noisy should be avoided, especially as descriptions of people. A neutral term such as decision-making study should be used instead. — location: [5556](kindle://book?action=open&asin=B08LCZFJZ2&location=5556) ^ref-10490

---

The intent of the exercise should be described to the participants in general terms, as in “The organization is interested in how [decision makers] reach their conclusions.” It is essential to reassure the professionals who participate in the study that individual answers will not be known to anyone in the organization, including the project team. If necessary, an outside firm may be hired to anonymize the data. It is also important to stress that there will be no specific consequences for the unit, which was merely selected as representative of units that perform judgment tasks on behalf of the organization. — location: [5561](kindle://book?action=open&asin=B08LCZFJZ2&location=5561) ^ref-7004

---

[[Bias Observation Checklist]]

1. APPROACH TO JUDGMENT

1a. Substitution
* “Did the group’s choice of evidence and the focus of their discussion indicate substitution of an easier question for the difficult one they were assigned?”
* “Did the group neglect an important factor (or appear to give weight to an irrelevant one)?”

1b. Inside view
* “Did the group adopt the outside view for part of its deliberations and seriously attempt to apply comparative rather than absolute judgment?”

1c. Diversity of views
* “Is there any reason to suspect that members of the group share biases, which could lead their errors to be correlated? Conversely, can you think of a relevant point of view or expertise that is not represented in this group?”

2. PREJUDGMENTS AND PREMATURE CLOSURE

2a. Initial prejudgments
* “Do (any of) the decision makers stand to gain more from one conclusion than another?”
* “Was anyone already committed to a conclusion? Is there any reason to suspect prejudice?”
* “Did dissenters express their views?”
* “Is there a risk of escalating commitment to a losing course of action?”

2b. Premature closure; excessive coherence
* “Was there accidental bias in the choice of considerations that were discussed early?”
* “Were alternatives fully considered, and was evidence that would support them actively sought?”
* “Were uncomfortable data or opinions suppressed or neglected?”

3. INFORMATION PROCESSING

3a. Availability and salience
* “Are the participants exaggerating the relevance of an event because of its recency, its dramatic quality, or its personal relevance, even if it is not diagnostic?”

3b. Inattention to quality of information
* “Did the judgment rely heavily on anecdotes, stories, or analogies? Did the data confirm them?”

3c. Anchoring
* “Did numbers of uncertain accuracy or relevance play an important role in the final judgment?”

3d. Nonregressive prediction
* “Did the participants make nonregressive extrapolations, estimates, or forecasts?”

4. DECISION

4a. Planning fallacy
* “When forecasts were used, did people question their sources and validity? Was the outside view used to challenge the forecasts?”
* “Were confidence intervals used for uncertain numbers? Are they wide enough?”

4b. Loss aversion
* “Is the risk appetite of the decision makers aligned with that of the organization? Is the decision team overly cautious?”

4c. Present bias
* “Do the calculations (including the discount rate used) reflect the organization’s balance of short- and long-term priorities?”

— location: [5589](kindle://book?action=open&asin=B08LCZFJZ2&location=5589) ^ref-55186

---
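The book's error equation (see [[MSE (Mean of Squared Errors)]] in Core Ideas) is worth sketching numerically: overall error decomposes exactly into bias squared plus noise squared, which is why reducing noise improves accuracy even when bias is untouched. A minimal illustration in Python; the judgment values and the true value are invented for the example:

```python
import statistics

def error_decomposition(judgments, true_value):
    """Split mean squared error into its bias and noise components.

    Bias is the average error of the judgments; noise is the
    standard deviation of the judgments around their own mean.
    """
    bias = statistics.fmean(judgments) - true_value
    noise = statistics.pstdev(judgments)
    mse = statistics.fmean((j - true_value) ** 2 for j in judgments)
    return mse, bias, noise

# Five hypothetical judges valuing a case whose true value is 100.
judgments = [104, 97, 110, 92, 101]
mse, bias, noise = error_decomposition(judgments, 100)

# The identity MSE = bias^2 + noise^2 holds exactly.
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-9

# Averaging the judgments cancels noise but not bias: the aggregate
# misses the true value only by the (small) shared bias.
aggregate = statistics.fmean(judgments)
```

This is also why the "obtain independent judgments, then aggregate" principle quoted above works: aggregation attacks only the noise term, so it helps most when noise, not bias, dominates the error.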