Texas sharpshooter fallacy

Written by Guido Jansen in December 2017

About a year ago, a friend of mine was looking for a house to buy somewhere around Amsterdam. The search turned into a great hobby of his, and he sometimes spent hours (and hours) on housing websites. During one of these searches he found a single street with 6 houses for sale. So he told me: “That is not a good sign, there is probably something wrong with that neighborhood.”

Statistics is not something everyone is equally enthusiastic about, and it is not something that comes naturally to us all. What does come naturally, and what is much easier for our brains to understand, is stories.

To help my team understand and remember statistical concepts, I've started telling them stories about statistics, and the first one is about the Texas sharpshooter.

The human brain is basically a pattern recognition machine: even when we have incomplete data, our brain tries to find patterns so it can base decisions on them. This strategy made sense in a world where we lived in caves and didn’t have that much data. Our brain had a “better safe than sorry” strategy. But that doesn’t mean our brain is right. The clustering of houses for sale could also just be a coincidence without a cause.

This line of reasoning is known as the Texas sharpshooter fallacy. This 'false cause' fallacy is named after a marksman (who I assume was in or from Texas) who shoots randomly at a barn and then paints a bullseye around the spot where the most bullet holes appear, making it look as if he's a really good shot. But clusters naturally appear by chance and don't necessarily indicate a causal relationship.

Photo by Tanner Boriack on Unsplash

The same thing happened to my friend: his brain saw the clustering of houses for sale, assumed a pattern and even conjured up a reason for it, namely that there would be something wrong with the street. But just as with the bullet holes on the barn, the clustering of houses for sale could also be a coincidence. And even if it’s not, there can be many reasons for it other than a bad neighborhood.
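To get a feel for how easily such a cluster shows up by chance, here is a minimal simulation sketch. All numbers in it (1000 streets, 1500 houses for sale, every street equally likely to get a listing) are assumptions I made up for illustration, not real housing data. It scatters the listings over the streets completely at random and counts how often at least one street ends up with 6 or more of them.

```python
import random
from collections import Counter

# All numbers below are made up for illustration; they are not real housing data.
N_STREETS = 1000   # hypothetical number of streets in the search area
N_LISTINGS = 1500  # hypothetical number of houses for sale
CLUSTER_SIZE = 6   # the "suspicious" number of listings on one street


def busiest_street_count(n_streets, n_listings, rng):
    """Put every listing on a uniformly random street and return the
    number of listings on the street that ends up with the most."""
    counts = Counter(rng.randrange(n_streets) for _ in range(n_listings))
    return max(counts.values())


def share_of_cities_with_cluster(runs=300, seed=42):
    """Fraction of simulated cities in which at least one street has
    CLUSTER_SIZE or more listings, purely by chance."""
    rng = random.Random(seed)
    hits = sum(
        busiest_street_count(N_STREETS, N_LISTINGS, rng) >= CLUSTER_SIZE
        for _ in range(runs)
    )
    return hits / runs


if __name__ == "__main__":
    share = share_of_cities_with_cluster()
    print(f"Share of random cities with a {CLUSTER_SIZE}-listing street: {share:.2f}")
```

Under these assumptions, the large majority of simulated cities contain at least one such street, even though nothing is wrong with any of them. Different assumptions will give different shares, so treat this as an illustration of the idea rather than an estimate for any real city.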

So this is the Texas sharpshooter fallacy, also known as the clustering illusion. And it appears everywhere around us. You’ve probably seen many of these cases in the newspaper. For example, when journalists find a cluster of people with cancer, they often quickly assume the cause has to be something in the environment, like water or air pollution. Or when McDonald's shows you research saying that 3 of their top 5 countries are among the 10 healthiest countries on earth, and concludes that their food must therefore be healthy too.

To our brain, this makes perfect sense, which makes this a very tricky fallacy. Without any hypothesis up front, these journalists take data, look for clusters and make up a reason. But as with all data, outliers and clusters will always appear at random, and that doesn’t necessarily mean that there is a significant link.

So don’t go around looking for bullet holes and painting bullseyes around them. If you work in conversion optimization, this means for example that when you look at Google Analytics, there will always be outliers and clusters. If you had no hypothesis and weren’t looking for these clusters beforehand, you will always need a follow-up study with new data to confirm whether you are looking at a significant connection or just a random fluke.
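Why new data matters can be shown with another small sketch. It assumes 20 hypothetical segments that all share the same true conversion rate of 5%, with 500 visitors each, so any "winning" segment is a fluke by construction. These numbers and function names are mine, purely for illustration. The code picks the best-looking segment from exploratory data and then measures that same segment again on fresh data.

```python
import random

# Hypothetical numbers: every segment has the SAME true conversion rate,
# so any "winning" segment is a fluke by construction.
TRUE_RATE = 0.05
N_SEGMENTS = 20
VISITORS = 500


def observed_rate(visitors, rate, rng):
    """Simulate visitors who each convert with probability `rate`
    and return the observed conversion rate."""
    conversions = sum(rng.random() < rate for _ in range(visitors))
    return conversions / visitors


def explore_then_confirm(repeats=100, seed=7):
    """Repeatedly pick the best-looking segment in exploratory data, then
    measure that segment again on fresh data. Returns the average
    exploratory rate and the average follow-up rate of the 'winner'."""
    rng = random.Random(seed)
    explored, confirmed = [], []
    for _ in range(repeats):
        rates = [observed_rate(VISITORS, TRUE_RATE, rng) for _ in range(N_SEGMENTS)]
        explored.append(max(rates))  # the painted bullseye
        # Fresh visitors for the winning segment; its true rate is still TRUE_RATE.
        confirmed.append(observed_rate(VISITORS, TRUE_RATE, rng))
    return sum(explored) / repeats, sum(confirmed) / repeats


if __name__ == "__main__":
    explored, confirmed = explore_then_confirm()
    print(f"Average rate of the 'winner' in exploratory data: {explored:.3f}")
    print(f"Average rate of the same segment in new data:     {confirmed:.3f}")
```

In this setup the winner looks clearly better than 5% in the exploratory data, but falls back to roughly the true rate once it is measured again. A real segment with a genuine effect would be expected to hold up in the follow-up, which is exactly what the second measurement is there to check.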

So I hope this story will help you be critical about the data you see in your work. Next time I will have another fallacy story for you.

And by the way: when my friend asked the estate agent about the street with 6 houses for sale, she started laughing and said: “It’s a very long street” :)
