* What you are measuring may not be exactly what you think you are measuring
* Your measurement method may omit a portion of the item you seek to quantify, or include a bit of some other item.
* Underestimating the amount of error and noise in the data
* Every measurement includes errors, both systematic and random (“noise”). You must separate these errors from the item you wish to characterize (“signal”). If you do not, you are likely to mistake the change in the noise (especially random variations) for a change in the signal of interest.
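A minimal sketch of this point, using hypothetical numbers: a constant signal measured with Gaussian noise. Two single readings differ even though nothing changed, while averaging repeated readings shrinks the random error and recovers the signal.

```python
import random

random.seed(0)

# Hypothetical numbers for illustration: a constant signal with
# Gaussian measurement noise.
TRUE_VALUE = 100.0
NOISE_SD = 5.0

def measure():
    """One measurement = signal + random error ("noise")."""
    return TRUE_VALUE + random.gauss(0, NOISE_SD)

# Two single measurements can differ noticeably even though the
# signal never changed; that difference is pure noise.
a, b = measure(), measure()

# Repeating and averaging shrinks the random error (roughly as
# 1/sqrt(n)), which helps separate the signal from the noise.
n = 10_000
mean = sum(measure() for _ in range(n)) / n
```

Mistaking the gap between `a` and `b` for a change in the signal is exactly the error described above.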
* Failing to understand that the answer from the measurement might just be wrong
* The answer provided by a measurement or a test might just be wrong; there are false positives (the test says “yes” when the correct answer is “no”) and false negatives (the test says “no” when the correct answer is “yes”).
* Missing that the answer depends on a conditional probability
* You base your analysis and eventual decision on too few parameters, failing to notice the essential interconnection between parameters. This may come about through discarding some parameters, or failing to measure them.
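A standard illustration of how a test's false-positive and false-negative rates combine with a base rate through conditional probability. The numbers here are illustrative, not from any real test: even a test that is right 99% of the time on sick patients can be wrong for most of the people it flags when the condition is rare.

```python
# Illustrative rates for a hypothetical screening test.
prevalence = 0.001    # 1 in 1000 truly have the condition
sensitivity = 0.99    # P(test+ | condition); misses are false negatives
specificity = 0.95    # P(test- | no condition); 5% false positives

# Total probability of a positive result.
p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' rule: probability the condition is present GIVEN a positive test.
p_cond_given_pos = sensitivity * prevalence / p_pos
# Under these numbers, under 2% of positives are true positives.
```

The answer depends on the conditional structure: P(condition | test+) is very different from P(test+ | condition).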
* Assuming independence between measurements, ignoring sequential effects
* The expected outcome for 1000 people who each bet once in a casino is very different from that for a single person who bets 1000 times, because in the latter the initial conditions change with each instance.
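A Monte Carlo sketch of this contrast, with illustrative numbers (even-money bets at a 49% win probability, bankroll of 10). The ensemble average over many one-time bettors sits near the per-bet expectation, but a sequential bettor faces an absorbing state: once the bankroll hits zero, play stops, so the initial conditions really do change with each bet.

```python
import random

random.seed(1)

P_WIN = 0.49  # even-money bet with a slight house edge (illustrative)

def bet(bankroll, stake=1):
    """One bet: win or lose the stake."""
    return bankroll + stake if random.random() < P_WIN else bankroll - stake

# Ensemble view: 1000 people each bet once from a bankroll of 10.
# The average outcome stays close to the per-bet expectation.
people = [bet(10) for _ in range(1000)]
ensemble_mean = sum(people) / len(people)

# Time view: 1000 gamblers each bet up to 1000 times in sequence.
# Ruin is absorbing -- a broke gambler cannot keep playing -- so
# most sequential gamblers end up ruined despite the tiny edge.
ruined = 0
for _ in range(1000):
    bank = 10
    for _ in range(1000):
        bank = bet(bank)
        if bank <= 0:
            ruined += 1
            break
```

The ensemble statistic says "you lose about 2% per bet"; the sequential reality for most individual gamblers is total loss.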
* Using ineffective or weak statistics as a basis for evaluation
* The most common statistic in use is comparing a current measurement to a single prior measurement, and inferring a trend (and a cause for that trend) from that change.
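A quick sketch of why that statistic is weak, using an invented process with no trend at all: a constant signal of 50 with noise of standard deviation 5. A rule that compares each measurement to the single prior one and calls any sizable change a "trend" fires constantly on pure noise.

```python
import random

random.seed(2)

# Hypothetical process with NO real trend: constant signal plus noise.
def measure():
    return 50.0 + random.gauss(0, 5.0)

# Naive rule: compare the current measurement to the single prior
# one, and declare a "trend" whenever the change exceeds 2 units.
trials = 10_000
flagged = sum(abs(measure() - measure()) > 2.0 for _ in range(trials))
false_trend_rate = flagged / trials  # most trials flagged, all of it noise
```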
* Using data outside its range of applicability
* We often collect real data, and then we extrapolate that result to different conditions. This works sometimes, but only within the range the data actually cover. For example, based on an experiment where you cooled water from 70°F to 60°F to 50°F to 40°F, what would you predict about continuing to cool the water to 30°F? A simple extrapolation misses that the water freezes at 32°F; the relationship you measured no longer applies.
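The cooling example can be sketched numerically. The cooling times below are hypothetical, chosen so the in-range data are perfectly linear; the hand-rolled least-squares fit is fine between 40°F and 70°F, yet its extrapolation to the next step is physically wrong because the water freezes first.

```python
# Hypothetical cooling data: temperature (degF) observed each hour.
hours = [0, 1, 2, 3]
temps = [70, 60, 50, 40]

# Least-squares line fit by hand (no libraries needed).
n = len(temps)
mx = sum(hours) / n
my = sum(temps) / n
slope = sum((x - mx) * (y - my) for x, y in zip(hours, temps)) / \
        sum((x - mx) ** 2 for x in hours)
intercept = my - slope * mx

# Extrapolating one step past the data predicts 30 degF of liquid
# water -- but at 32 degF the water freezes, and the linear model's
# range of applicability has already ended.
predicted_at_4h = slope * 4 + intercept
```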
* Not collecting repeated data on a meaningful time frame
* There is no point in making measurements at a rate significantly faster than the underlying phenomena can actually change. If you did so, almost all of the change that you detected would just be noise, not an actual change in the signal of interest.
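A small sketch with invented numbers: a signal drifting by only 0.001 units per sample, read through noise with standard deviation 1. Sampled this fast, the step-to-step differences are dominated by noise, hundreds of times larger than the true per-sample change.

```python
import random

random.seed(3)

# Hypothetical slowly drifting signal (0.001 units/sample) read
# through Gaussian noise (sd = 1.0); each reading is independent.
def sample(t):
    return 0.001 * t + random.gauss(0, 1.0)

mean_true_change = 0.001  # actual signal change per sample
diffs = [sample(t + 1) - sample(t) for t in range(10_000)]
mean_observed_step = sum(abs(d) for d in diffs) / len(diffs)
# The observed step is ~1000x the real change: nearly all noise.
```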
* Poor selection of data
* Selecting data that are not truly representative of your operating conditions.
* Changing the data or the measurement approach during collection
* If you change the way you collect the data mid-stream, there may be no valid way to compare data collected by the first method with data collected by the second. And, of course, allowing the data themselves to change (e.g., letting the temperature at which samples are collected vary) invalidates comparison too.
* The limit of the utility of examples (the problem of induction)
* In the real world, no number of observations of a positive phenomenon constitutes a proof that that phenomenon is always true. Yet a single negative observation constitutes a proof that it is not always true.
* Attribution bias
* We attribute our successes to skills and knowledge, rather than to random chance. Conversely, we attribute failures to random events, rather than lack of skill or knowledge.
* Path dependence
* We “fall in love” with the path we used to arrive at an answer, and refuse to adjust that answer even when there is contradictory evidence.
* The fallacy of the silent evidence
* We see what appears to be a compelling set of evidence in favor of a proposition, but we fail to notice that all of the contradictory evidence has been omitted from our sample.
* Round-trip error
* The tendency to confuse the condition “no evidence of flaws in our system” with the condition “there is evidence that there are no flaws in our system.” For example, this latter condition would correspond to a patient truly being “cancer-free,” a very different condition than that of merely no longer presenting any visible indications of having cancer.
* The narrative fallacy
* Correlation does not prove causation. The tendency to create stories for everything, even when not justified, is called the narrative fallacy. Just because a story is appealing does nothing to establish its correctness; an appealing story may be completely wrong.
* Failing to recognize the existence and significance of outliers: the problem of scale
* Many engineered systems experience sampling or input rates that are orders of magnitude beyond normal human experience; people have no useful intuitions about such large sample sets. Many statistical techniques call for discarding outliers; but we must focus on the potential of outliers and, through design strategies, try to prevent them from disrupting our system.
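The scale problem can be made concrete with a standard normal model (the choice of 5 sigma as the "outlier" threshold is illustrative). An event a human would never expect to witness becomes a routine occurrence at machine-scale input rates.

```python
import math

# Probability that a standard normal sample lies beyond 5 sigma,
# computed from the complementary error function.
p_5sigma = math.erfc(5 / math.sqrt(2))  # roughly 6e-7

# At human scale (say, 1000 observations), such an outlier is
# effectively never seen...
expected_in_1000 = 1_000 * p_5sigma

# ...but a system processing a billion inputs should *expect*
# hundreds of 5-sigma events, and must be designed for them.
expected_in_1e9 = 1_000_000_000 * p_5sigma
```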
* The tendency to believe what you want; the tendency to ignore evidence, and to explain away evidence that tells the “wrong story”
* Humans have a strong tendency to interpret all new evidence in a way that supports their selected explanation. They are also quick to discount and eliminate evidence that appears to contradict their selected explanation.