A few days ago, I came across this ‘infographic’ done by the people at Hunch and Column Five Media that compared “PC People” with “Mac People.”
The same ‘infographic’ soon started making the rounds across the internet, popping up in my Google Reader and Twitter feeds and showing up on Reddit. I found myself thinking more and more about it, and kept finding myself bothered by it.
After dwelling on it for a while, I realized what it was that bothered me: it was the entire ‘infographic.’ It was pretty much wrong. No, I’m not talking about the data or the findings. Those are correct as far as I can tell, or at least as correct as they can be, assuming the creators made no errors in collecting and reporting the information. What was wrong was the representation of the data.
Why the infographic is wrong
Again, let me reiterate: I’m not complaining about the data. I understand that Hunch, the company, created a sort of web survey and collected responses from hundreds of thousands of its users, took this information, did some calculations, and came to several conclusions. I’m not going to talk about the lack of transparency in the process. There is an obvious selection bias in the respondents, but I’m not even being that picky. Also, I swear I’m not trying to be a nerd and nitpick at how they might or might not have violated some sort of academic standard by not conducting confidence tests or something like that. (These are the sort of things I could talk about for days if you got me going. But I’m not going to.)
Instead of all of that, I want to talk about two things: usefulness and framing.
Usefulness
First off, let me define what I mean by usefulness in this context. There are several factors that I believe make a statistic useful. One example is testability. A statistic is testable if, given the right context, you can go out, run your own experiments, and easily verify whether or not you can support that statistic. Another factor is explanatory power: in other words, whether or not you can use a statistic to make inferences about the world, and perhaps make predictions about how the world behaves.
Several statistics shown in this ‘infographic’ fail in this respect. Let me point you to some examples of what I’m talking about. Take this gem: “PC people are 23% more likely to say they seldom throw parties. Mac people are 50% more likely than PC people to say they throw frequent parties.”
First, let’s evaluate its testability. If you’re anything like me, you probably had difficulty reconstructing how they acquired their results. The wording alone is difficult enough to understand. How are you supposed to interpret a statement like: “PC people are more likely to seldom throw parties?” Thinking on it a little bit, you might eventually come to the conclusion that the question “How often do you throw parties?” was presented to self-identified PC and Mac users, with two possible choices: “seldom,” and “frequently.” Eventually, after some mental gymnastics, you might be able to reproduce the test and duplicate the results to some extent. Regardless, it wasn’t the easiest process.
As for the explanatory power, it’s pretty clear that this statistic falls flat. The only meaningful bit of information you can glean from this quote is that PC people are more likely to respond that they seldom throw parties, etc. The 23% figure is pretty meaningless if you think about it, i.e. the fact that “PC people are 23% more likely to say they seldom throw parties” has very little application to real life. Again, if you were willing to perform some mental gymnastics, you may be able to produce something meaningful, but to most people, 23% is just a number that does very little to explain relevant information.
Instead, an example of a testable and explanatory statistic would be the following: “Mac users throw twice as many parties as PC users.” As a reader, I can look at that statistic and think to myself, “No, I don’t think that’s quite right,” and conduct my own experiment to verify or discredit the results. I would just have to find on average how often Mac and PC users throw parties. In addition, the statistic can be directly applied to real life: I can go, “Oh, I’m a PC user, and I throw 2 parties a year. If I extrapolate, that means that based on this statistic Mac users throw 4 parties a year.”
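To make the point about the 23% figure a bit more concrete, here is a rough sketch in Python. The response rates are made up by me for illustration and are not taken from Hunch's data, and relative_increase is just a helper name I picked. It shows that "23% more likely" is consistent with wildly different underlying situations, which is why the number on its own explains so little.

```python
def relative_increase(p_group, p_baseline):
    """Return how much more likely the group is than the baseline, as a fraction."""
    return p_group / p_baseline - 1


# Hypothetical shares of people answering "seldom" (PC share, Mac share).
# These numbers are invented for illustration only.
scenarios = {
    "rare answer": (0.0492, 0.04),
    "common answer": (0.861, 0.70),
}

for name, (pc, mac) in scenarios.items():
    lift = relative_increase(pc, mac)
    gap = pc - mac
    print(
        f"{name}: PC {pc:.1%}, Mac {mac:.1%}, "
        f"relative increase {lift:.0%}, absolute gap {gap:.1%}"
    )
```

Both scenarios produce the same relative figure, but in one almost nobody gives the "seldom" answer and in the other almost everybody does. Without the base rates, a reader can't tell which world they are in.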
Framing
Really, though, all that stuff about usefulness is kind of pointless in comparison to what is perhaps the bigger issue with this particular ‘infographic’: the framing.
John Gruber of Daringfireball.net reposted this a few weeks ago. He said, about the ‘infographic’: “Nothing terribly surprising, but interesting results nonetheless.” In other words, the findings of the ‘infographic’ basically challenged none of his beliefs, and yet he found it so interesting that he felt the need to share it with his audience of at least 6,000/day. His response, as well as the response of many others who shared this ‘infographic,’ indicated to me that something was going on. People obviously looked past the faults of the ‘infographic’ and found it interesting enough to forward on. Francis Bacon wrote once that: “The human understanding when it has once adopted an opinion (either as being the received opinion or as being agreeable to itself) draws all things else to support and agree with it.”
In other words, people were eating the ‘infographic’ up because it confirms a lot of stereotypes that are held by many internet users, particularly Mac enthusiasts. I’ve noticed that certain Mac enthusiasts, whether they admit it or not, like to imagine the world as some sort of distribution graph on a single axis of coolness, sophistication, and intelligence, and they imagine the graph to look something like this:

The funny thing is, they seem to forget that Mac users are vastly outnumbered by PC users. If anything, the graph, if it exists, looks something a bit more like this:

As a matter of fact, the graph would probably be infinitely more accurate if we revealed an underlying variable here:

The axis could have easily been age, as most older computer users are more used to PCs, which would also explain away the political and social aspects of the entire ‘infographic.’ Or it could be income, because people with more money tend to purchase more expensive computers which have high production values (Macs) rather than cheaper ubiquitous computers (PCs). All of a sudden, the infographic becomes a lot less interesting. Old people tend to be more conservative than young people. They also aren’t as tech savvy as young people. People with higher incomes are more likely to have completed a 4-year college degree.
Doing even a little bit of data journalism would probably begin to explain away nearly all of the differences outlined. It just doesn’t have the same impact, does it? Instead of confirming our suspicions and beliefs of the superiority of Mac users, the statistics presented by this chart are instead mundane facts that basically profile people by socioeconomic groupings. The original ‘infographic’ had been framed in such a way that it presented one version of reality that wasn’t painting the whole picture.
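Here is a small sketch of how an underlying variable like age could produce this kind of result all on its own. Every number below is hypothetical and chosen by me for illustration; none of it comes from Hunch. In this made-up population, Mac and PC users within the same age group are equally likely to be conservative, yet the combined numbers still make PC users look more conservative, simply because older people are assumed to own fewer Macs.

```python
# Hypothetical population: (age group, size, share owning a Mac, share conservative).
# Within each age group the conservative share is the same for Mac and PC users.
groups = [
    ("young", 1000, 0.40, 0.30),
    ("old", 1000, 0.10, 0.60),
]


def conservative_share(groups, platform):
    """Overall share of conservatives among users of the given platform."""
    users = 0.0
    conservatives = 0.0
    for _, size, mac_share, cons_share in groups:
        # Number of people in this age group who use the platform.
        owners = size * (mac_share if platform == "mac" else 1 - mac_share)
        users += owners
        conservatives += owners * cons_share
    return conservatives / users


mac = conservative_share(groups, "mac")
pc = conservative_share(groups, "pc")
print(f"Mac users: {mac:.0%} conservative, PC users: {pc:.0%} conservative")
print(f"PC users look {pc / mac - 1:.0%} more likely to be conservative")
```

The gap between the two platforms here comes entirely from the age mix, not from anything about the computers. Swap age for income and the same arithmetic applies.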
Without realizing this, people who are usually otherwise discerning and intelligent see this infographic, arrogantly chuckle to themselves at how superior they are, and forward it along to their social network.
Conclusion
It peeves me to see the following in Hunch’s footnotes: “Yes, Poindexter, we know that correlation does not imply causation.” As important as it is for readers to always view things with a critical eye, responsible representation of data is a task that should lie with the creator. If this were not the case, we would be bombarded with so much misinformation we wouldn’t know who or what to believe.
As designers, we are constantly looking down on things from our high horses, being exasperated at clueless clients and ridiculing non-designers for their lack of design understanding or aesthetic taste. But perhaps in the same way that someone with a copy of Photoshop and Dreamweaver shouldn’t immediately start making websites without the proper experience, designers with access to an Excel spreadsheet should not be so quick to create an ‘infographic.’ Good data journalism requires a good understanding of statistics and data analysis and presentation, and unless you can be confident that what you are creating is correct, right, or useful, perhaps you should take a minute and evaluate your creation.