I’ve got to admit, I never realized the problem was as big as it is. With the amount of content that is produced every single day, and the high velocity at which some of it spreads across social media, we need eye-catching headlines to attract attention. To generate these ‘clickbait’ headlines we have found ourselves a perfect partner: data visualizations. All it takes nowadays is a single visualization of data from a less-than-reputable source. You release the beast into the wild (read: share the visual with a list of followers), and the world is polluted with a false narrative. The most disturbing thing is that the data doesn’t even have to be bad. The only thing you have to do is present it in a misleading way. Easy as pie!
A quick Google search provides you with a Wikipedia page, a Reddit community, and hundreds, if not thousands, of articles about how graphs can be used to misinform you and me. What is hard to grasp is that the study of how graphs, charts, maps, and diagrams can be used to deceive has remained within the boundaries of academic circles in statistics, cartography, and computer science (Bihanic, 2015). However, visual journalists and information graphics designers need to be part of this debate. We need good data visualizations because it is easier for the brain to comprehend an image than words or numbers (Cukier, 2010).
Of course, I do understand that creating a high-quality visual is a profession in its own right. The creators might not be actively trying to deceive you. The “misuse of graphical material” might have been completely unintentional. Still, according to professional ethics codes in journalism and graphic design, knowing the truth and hiding it, or conveying it in a way that distorts it, is outright inadmissible (Bihanic, 2015).
The Enliven Project created the infographic shown below. The aim is to provide insight into the relatively small number of false rape accusations. By assuming ‘one rape per rapist’ and representing each rape as a single ‘man’ symbol, the chart damages its own cause, when in fact a rapist has an average of up to six victims. You could also argue that the number of unreported rapes is overestimated. No wonder this data visualization went viral. Sarah Beaulieu dedicated a whole page on her website to elaborating on the matter.
So far, so good. As the example shows, unintentionally misleading visualizations do exist, but what about intentionally misleading visualizations? As I’ve mentioned before, the internet is full of articles and videos that teach you how to defend yourself against misleading statistics. But why do we need those?
Graphs can be fundamentally misleading about underlying data, and design choices can skew viewers’ perceptions, leading them toward incorrect conclusions (Jones, 2006). Take for example the results of a study by Beattie and Jones (2002). They indicate that sub-optimal slope parameters may produce distorted judgments of corporate financial performance by users. The researchers found that financial graphs with large slope parameters in particular are likely to be perceived as portraying stronger growth than those with small slope parameters. What this means is that with a small, almost unnoticeable tweak, you are able to convey a totally different meaning.
THE LIE FACTOR
The “Lie Factor” is a value that describes the ratio between the size of the effect shown in a graphic and the size of the effect shown in the data. The closer this value is to 1, the better the visualization represents the actual effect. Conversely, when the lie factor is greater than 1, the visualization exaggerates the actual effect.
Have a look at the image below from the New South Wales Ministry of Health that shows the increase in the number of nurses from 2008 to March 2013. One could argue that a ministry is a trustworthy institution, right? Do you also notice something weird going on?
At first glance the image portrays a huge growth in the recruitment of nurses. When you take a closer look, you notice that four stick people represent 43,000 nurses, while 28 stick people stand in for an additional 3,000 nurses. This makes a 7 percent increase look like a 700 percent increase. I won’t bore you with the calculations. Nevertheless, it is worth mentioning that the lie factor for this visualization is a staggering 95.9!
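For readers who do want to see the arithmetic, here is a minimal sketch of the lie factor calculation. It uses only the rounded numbers quoted above (4 and 28 stick people, 43,000 and 3,000 nurses), and the function names are my own. With these rounded inputs the result comes out at roughly 100 rather than 95.9; my assumption is that the 95.9 figure was calculated from the exact counts in the original infographic.

```python
def percent_change(first: float, second: float) -> float:
    """Relative change from first to second, as a fraction of first."""
    return (second - first) / first


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    """Size of the effect shown in the graphic divided by the size of the effect in the data."""
    return (percent_change(graphic_first, graphic_second)
            / percent_change(data_first, data_second))


# Rounded figures from the text: 4 stick people stand for 43,000 nurses,
# and 28 more stick people stand for 3,000 more nurses.
graphic_effect = percent_change(4, 4 + 28)             # growth drawn in the graphic
data_effect = percent_change(43_000, 43_000 + 3_000)   # growth present in the data
nurses_lie_factor = lie_factor(4, 4 + 28, 43_000, 43_000 + 3_000)

print(f"graphic: {graphic_effect:.0%}")
print(f"data: {data_effect:.1%}")
print(f"lie factor: {nurses_lie_factor:.1f}")
```

A value of 1 would mean the picture grows exactly as much as the numbers do. Anything near 100 means the drawing overstates the change by about two orders of magnitude.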
WHAT TO DO ABOUT IT?
It should be clear to you by now that it is actually quite easy to create a misleading visualization. Even ‘trustworthy’ institutions make use of these shady practices. And there I was, wondering whether we need advice on how to defend ourselves against misleading visualizations… I would like to give you a few recommendations to stay sharp when you’re presented with graphs, charts, maps, or diagrams.
MANIPULATING THE Y-AXIS
Make sure you check the y-axis of a graph, as manipulating it is one of the most common ways data is distorted in visuals. It makes a difference that is not significant look substantial. This is called a ‘truncated graph’. One of the first things to be removed is the baseline or the y-axis itself, as can be seen in the example, to fool you into thinking February experienced a drastic increase in conversion rate.
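To get a feeling for how much a truncated y-axis inflates a difference, the sketch below compares the drawn heights of two bars with and without a zero baseline. The conversion rates and the baseline of 9.5 are hypothetical values chosen for illustration, not the numbers from the example chart, and `drawn_height_ratio` is a helper name of my own.

```python
def drawn_height_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """How many times taller the larger bar is drawn when the y-axis starts at baseline."""
    if baseline >= smaller:
        raise ValueError("baseline must lie below both values")
    return (larger - baseline) / (smaller - baseline)


# Hypothetical conversion rates in percent (not taken from the example chart).
january, february = 10.0, 11.0

honest = drawn_height_ratio(january, february)                   # y-axis starts at 0
truncated = drawn_height_ratio(january, february, baseline=9.5)  # y-axis starts at 9.5

print(f"zero baseline: February bar is {honest:.1f}x as tall as January")
print(f"truncated axis: February bar is {truncated:.1f}x as tall as January")
```

The underlying numbers are identical in both cases. Only the starting point of the axis changes, and with it the visual impression of how big the jump is.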
Be prepared for blown-out scales that are used to minimize or maximize a change. This phenomenon is called ‘axis changing’ and is almost the opposite of truncating data. Axes and baselines are included but are stretched so much that they lose meaning. Look at the next example about climate change. Why are temperatures from -10 degrees up to 110 degrees included? Of course, to make the line as flat as possible and make you believe climate change is not a real thing. If you have a look at the ‘fixed’ graph next to it, you get a better understanding of what is actually happening.
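The flattening effect can be put into a number: the share of the plot’s height that the data actually occupies. The sketch below is a rough illustration under stated assumptions. The axis range of -10 to 110 comes from the example above, while the temperature range of 57 to 59 degrees and the ‘fitted’ axis of 56.5 to 59.5 are hypothetical values I picked for the calculation.

```python
def vertical_share(data_min: float, data_max: float,
                   axis_min: float, axis_max: float) -> float:
    """Fraction of the y-axis height that the data range occupies."""
    return (data_max - data_min) / (axis_max - axis_min)


# Hypothetical temperature range; only the -10 to 110 axis is from the example.
data_min, data_max = 57.0, 59.0

stretched = vertical_share(data_min, data_max, axis_min=-10.0, axis_max=110.0)
fitted = vertical_share(data_min, data_max, axis_min=56.5, axis_max=59.5)

print(f"stretched axis: data uses {stretched:.1%} of the plot height")
print(f"fitted axis: data uses {fitted:.1%} of the plot height")
```

When the data occupies only a sliver of the vertical space, any trend inside it is visually squashed into a flat line, no matter how meaningful the change is.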
CHERRY PICKING DATA
Be aware of visuals that mislead you through skewed data, where only the parts that shed a positive light on the creator’s viewpoint are included. Take a look at the two examples below. The left graph appears to cover a long period because it contains many points in time. In reality it covers only 10 years. You might think that the UK national debt is at an all-time high. However, when you analyze the right example, you know you’re being fooled!
USING THE WRONG GRAPH
Watch out for the use of the wrong graph as it can create a misleading data visualization. Most probably Microsoft made this mistake on purpose to give us the feeling that Microsoft Edge is almost 50% faster than Firefox and 25% faster than Chrome. When you check the right graph you notice that in reality the difference in browser speed is only marginal.
GOING AGAINST CONVENTIONS
Don’t be surprised to see misleading graphs and charts that alter long-held conventions or associations. Think of using green for losses and red for profits. It would make no sense to anyone. In the example below they used a light color for high levels and a dark color for low levels. With some common sense you would rather flip this color scheme around.
I hope you’ve been taught at school that correlation doesn’t imply causation. Nevertheless, because of all the clickbait headlines you’ve read over the last couple of years, it might have fallen out of your head. The next graph might be an obvious one, but still, you’ve been warned!
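A small numerical reminder of why correlation alone proves nothing: any two quantities that happen to grow over the same period will correlate strongly, even when they have nothing to do with each other. The two series below are made-up numbers for illustration, and `pearson_r` is a plain textbook implementation of the Pearson correlation coefficient written out by hand.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Two hypothetical, unrelated yearly series that both happen to trend upward.
series_a = [10, 12, 15, 15, 18, 21, 22, 25]
series_b = [200, 230, 220, 260, 270, 300, 310, 330]

r = pearson_r(series_a, series_b)
print(f"correlation: {r:.2f}")
```

The coefficient comes out above 0.9, yet nothing in the calculation says that one series drives the other. A shared upward trend is enough to produce it.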
WHAT IS THE POINT OF IT ALL?
The internet is bloated with examples of misleading data visualization. The recommendations you’ve just read might actually come in handy one day. Don’t you think it is frustrating that we have to teach ourselves how to guard against misleading data visualizations? Shouldn’t we expect professionals and institutions to be trustworthy instead of spreading misleading narratives? Frustratingly, we can’t expect this misconduct to never happen again. I suggest that we immediately start teaching kids in high school how to defend themselves against future misuse of data visualization. What do you think? Or can you think of a better solution?
Beattie, V., & Jones, M. J. (2002). The impact of graph slope on rate of change judgments in corporate reports. Abacus, 38(2), 177–199. https://doi.org/10.1111/1467-6281.00104
Bihanic, D. (2015). New challenges for data design. New Challenges for Data Design, 1–447. https://doi.org/10.1007/978-1-4471-6596-5
Cukier, K. (2010). A special report on managing information. The Economist, 394(8671), 3–18.
Jones, G. E. (2006). How to Lie with Charts (2nd ed.). Santa Monica, CA: LaPuerta.