Saturday 1 November 2014

The misuse of statistics

"All things are subject to interpretation. Whichever interpretation prevails at a given time is a function of power, not truth" - Friedrich Nietzche

We are surrounded by numbers and statistics. They inform us, they impress us... but do we truly understand the information we are being impressed by? In my experience, statistics is taught as an add-on to other subjects - especially maths or the social sciences - and is rarely treated as a useful subject in its own right. In fact, statistics are often dismissed as nothing but lies. It has taken me until nearly my mid-twenties to truly appreciate the power that statistics can have - in particular, the interpretations of them proffered by the media.

In this post I am going to try to show why statistics deserve not only more attention than they receive, but also more credibility. First, I'll outline a few common ways in which statistics are used to misrepresent information. Then I'll give a concrete example, focusing on a poll I came across earlier in the week that made some startling claims based on the data collected. Finally, I want to make it clear that this is not a problem facing only the general public; it occurs in professional life too - and, arguably, where it matters most: in research itself. I don't want anyone to finish this post still believing that statistics are useless lies, despite their misuse. Instead, I hope to show that statistics are a fantastic tool when presented in a transparent and useful manner, and that a bit of cynicism about the information presented to you is one of the most powerful tools at your disposal.

Examples of misusing statistics
One of my pet hates is the average being presented as a definitive, meaningful data point. The average often doesn't correspond to any actual data point in the dataset, it is sensitive to outliers, and its representativeness depends on how large the sample is. Take these two sets of numbers, each of which sums to 20:

2 + 2 + 3 + 1 + 3 + 2 + 2 + 3 + 2 = 20

1 + 3 + 2 + 1 + 1 + 6 + 1 + 4 + 1 = 20

The average of each of these sets is the same (2.222..., which is not a number found in either data set), but this single number is now representing two very different data sets - we need a standard deviation or some such to tell us more about what the data look like. It is worth being sceptical of averages, not only because they are the most commonly presented statistic, but because techniques such as those described below are sometimes used to manipulate the averages you are presented with before you even get a chance to be sceptical of the data itself.
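To make that concrete, here is a minimal Python sketch using the two sets above (everything else is purely illustrative): the mean hides the difference, while the standard deviation reveals it.

```python
from statistics import mean, stdev

set_a = [2, 2, 3, 1, 3, 2, 2, 3, 2]  # the first set above
set_b = [1, 3, 2, 1, 1, 6, 1, 4, 1]  # the second set above

print(mean(set_a), mean(set_b))    # both 2.22..., a value found in neither set
print(stdev(set_a), stdev(set_b))  # ~0.67 vs ~1.79: very different spreads
```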

Discarding unfavourable data
In essence, this means that collected data which didn't fit the desired or predicted outcome has been neglected. It may have been removed from the final presentation that made its way to the public, or dropped earlier in the analysis process. An even simpler example is when several studies are conducted but only those that produce the desired effect are published. If 20 studies are conducted and 16 produce a 'positive' outcome, then strictly speaking the so-called 'success rate' is 80% - but those 4 studies that produced a 'negative' outcome may never make it into the public eye, and the published record ends up looking perfect, as the sketch below shows.
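A toy Python illustration using exactly the hypothetical numbers above (20 studies, 16 'positive'): the true success rate is 80%, but the published record suggests 100%.

```python
# Hypothetical illustration: 20 studies, of which 16 came out 'positive'.
studies = ["positive"] * 16 + ["negative"] * 4

true_rate = studies.count("positive") / len(studies)

# The 'file drawer' effect: negative results are never published.
published = [s for s in studies if s == "positive"]
apparent_rate = published.count("positive") / len(published)

print(f"True success rate:     {true_rate:.0%}")      # 80%
print(f"Apparent success rate: {apparent_rate:.0%}")  # 100%
```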

This concept is related to cherry picking, although this term generally refers to discarding alternative interpretations or theories in favour of those which support an idea you already hold or are arguing for. This often happens when literature reviews are being written, or political claims are being made.

Loaded questions

Questions which prime someone to answer in a particular manner are called loaded questions, and they are a real problem in the social sciences that rely on questionnaires to gather data - social psychology and criminology are two examples where questionnaire data is very important. It is very difficult to interpret a result if those who filled out the questionnaire were being nudged in a certain direction by the wording of the questions. Some of the most common culprits are phrases such as "do you agree..." and "is it true that...", which of course prime the individual to want to agree with the statement. One of the questions suggested for the recent Scottish independence referendum was "Do you agree that Scotland should be an independent country?". It was quickly thrown out in favour of the final wording, to avoid this exact problem.


Overgeneralization
This happens when a study is conducted using a sample from which the results cannot be generalised to others, but they are generalised anyway. If I study attitudes to government among members of the Young Greens, a group made up of adolescents and young people who support the Green Party of England and Wales, then any results I obtain cannot be generalised to members of the Conservative Party aged 50+. It would be naive to claim that the results of my study represent attitudes to government of anyone beyond the specific population I chose, and yet this happens a lot. It is always worth considering who took part in a study, and whether they can be said to be generally representative.

Biased samples
This is related to the above issue, except that here a sample is chosen that is more or less likely to produce the outcome considered desirable. If I have a hunch that the general population is angry with the political class of the day because they feel their environmental and energy policies are failing, I would be committing the error of a biased sample if I specifically sought out the Young Greens to take part in my study, at the expense of including members of the Conservative Party. Again, this happens a lot.

False causality
It is important to remember what the statistics being used are actually telling you. Correlation is a very useful tool - for example, there is a very high correlation between smoking and lung cancer - but correlation alone does not establish that smoking causes lung cancer. Other types of studies can show this; by itself the correlation is not enough, and you should be wary of accepting it as such. A related issue arises when a third variable - a confounder, sometimes conflated with mediating or moderating variables (these are actually different things, but that is unimportant for now) - explains a good deal of the relationship between two things and is ignored. For example, people in the UK buy more ice cream when it is summer in Italy. This is a correlation between two things that needs a middle step to explain the relationship - in this case, of course, it is the shared sunshine and hot weather driving both.
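Here is a hedged Python simulation of this idea: a made-up daily temperature drives both 'UK ice cream sales' and 'Italian sunshine hours', two quantities that never influence each other directly, yet end up strongly correlated. All the numbers are invented for illustration.

```python
# A toy confounder: temperature (the hidden third variable) drives both
# quantities; they never influence each other directly.
# Note: statistics.correlation requires Python 3.10+.
import random
import statistics

random.seed(42)
temperature = [random.uniform(0, 30) for _ in range(365)]             # degrees C, made up
uk_ice_cream = [2.0 * t + random.gauss(0, 8) for t in temperature]    # hypothetical sales
italy_sunshine = [0.3 * t + random.gauss(0, 1) for t in temperature]  # hypothetical hours

r = statistics.correlation(uk_ice_cream, italy_sunshine)
print(f"Correlation: {r:.2f}")  # strongly positive, despite no direct causal link
```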

Proof of the null hypothesis
Testing for a null result comes with all sorts of issues. In fact, I will leave this one out for now and write a full post about it at a later date; it is a quagmire of issues, and a worthy endeavour for those interested, but my aim in this post is to explain the general misuse of statistics, not to discuss statistical theory. Suffice to say, trying to prove that something didn't happen because it shouldn't have happened (and that this absence is itself an interesting effect) is generally less convincing as evidence, because of the way standard hypothesis testing works.

A recent example in the media
This post was inspired by a poll I found while reading the news a couple of days ago. Before going into this example, I want to make it abundantly clear that SciPhi is not affiliated with any political party; we each hold our own political views. All videos and posts are made in an impartial manner, and we are not interested in changing any of our viewers' or readers' minds on political matters. For this post, however, political examples are very useful, because they highlight where statistics appear to be used in strange ways to support a given agenda. We are a UK-based group, and as such these examples are all centred on UK politics.


So, the poll claimed that 54 of the 59 Westminster seats allocated to Scottish Members of Parliament (MPs) would be filled by Scottish National Party (SNP) candidates after next year's UK general election. This was based on an Ipsos MORI poll commissioned by Scottish TV (STV), in which 1029 participants were asked who they would vote for if the election were held tomorrow. Now, of course I don't have access to the full details of the study; however, based on the information I do have, I would raise several points:



1. Leading questions: the question forces participants into a snap decision about a hypothetical election tomorrow, which is not representative of how people actually decide who to vote for in a general election.


2. 1029 is a small sample: yes, it is about the same size as other polls of this kind, but it is still not enough. When the eligible voting population of Scotland is over 3.5 million, one must wonder whether 1029 people can truly represent that diversity (see the margin-of-error sketch after this list).

3. Where did these people live? Westminster seats go to the candidate who wins the most votes in a given constituency, not to whoever wins an overall national majority. Unless the respondents were spread evenly across all 59 available seats, claiming that a pattern of seat allocation can be derived from this study is plain wrong. Also see point 4...

4. The poll put the SNP at only 52% of the vote: we have a first-past-the-post system, which in itself skews the number of seats allocated to parties (the government in Westminster is not representative of the overall vote share for exactly this reason), so a headline vote share cannot tell you how many seats a party will win.
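For point 2, a quick back-of-the-envelope check in Python. This assumes simple random sampling and worst-case proportions, which real polls complicate with quotas and weighting, so treat it as a rough sketch only:

```python
import math

# 95% margin of error for n = 1029, assuming simple random sampling
# and worst-case p = 0.5.
n = 1029
moe = 1.96 * math.sqrt(0.5 * 0.5 / n)
print(f"Margin of error: +/-{moe:.1%}")   # roughly +/-3.1%

# And if respondents were spread evenly over the 59 constituencies:
print(f"Respondents per seat: {n / 59:.2f}")  # ~17.44 - far too few per seat
```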

The main issues here are over-generalisation and misrepresentation. STV cannot claim anything about the geographical distribution of the responses, and therefore cannot claim anything about the allocation of seats - so they shouldn't have. Even if they did have a roughly even number of respondents in each constituency, that number would be tiny - 17.44 people per available seat (which shows another interesting thing about averages: they don't have to be whole numbers - can you have 0.44 of a person?) - and the results would still be unreliable. The presentation is therefore misrepresentative, because the results do not show what STV say they show. However, STV were biased towards a Yes vote during the independence campaign (in contrast to the British Broadcasting Corporation (BBC)), so perhaps this is no real surprise. Again, be cautious and consider the agenda of the body presenting the data.

Examples in research

This will be a short section, although I will come back to it in my planned future post concerning null results. It is not entirely uncommon for researchers to use the wrong statistical analysis in their work, which can begin a chain of misinterpretation. If you read academic journals or any scientific literature, try to get a sense of the statistical analysis that has been done. Statistics are the backbone of the argument being put forward, and the interpretation must be compatible with the data presented. Examples of issues include:


1. The statistics do not follow the experimental design. More about this another time - it is a stats-intensive argument, and this is not the right place to discuss it in full.


2. The interpretation does not follow the analysis: if something has not been shown to be statistically significant, do not claim that it is when discussing the results. If something approaches statistical significance, discuss why, but do not make grand claims about the data or the result.

3. Effect sizes are not reported: these tell us how much practical importance a result has, and it is important to include them. If something is not statistically significant but has a large effect size, it is still an interesting effect to discuss (see the sketch below). Some journals now require effect sizes to be reported.
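As a rough illustration of point 3, here is a minimal sketch of one common effect size, Cohen's d, computed on two made-up groups of scores. The data are invented purely for illustration, not taken from any real study:

```python
from statistics import mean, stdev
import math

# Two small, hypothetical groups of scores (illustrative only).
group_a = [5.1, 6.3, 5.8, 7.0, 6.1]
group_b = [4.0, 4.8, 5.2, 4.5, 5.0]

# Cohen's d: difference in means divided by the pooled standard deviation.
pooled_sd = math.sqrt((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2)
d = (mean(group_a) - mean(group_b)) / pooled_sd
print(f"Cohen's d: {d:.2f}")  # by convention, d >= 0.8 counts as a 'large' effect
```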

I included this section only to remind everyone that statistics are powerful things, and can change the outcome of a study - or at least the outcome presented to a wider audience - in profound ways. That said, it should be appreciated that the majority of researchers are highly professional and understand the importance of providing useful and representative information. The scientific community has recognised that the misuse of statistics is a problem, and there have been several discussions about how to tackle it (read more here and here).


I'll leave it there

This post has been a bit heavy, but I hope I have demonstrated the importance of being critical of statistical evidence. Statistics is underrated as a discipline, and although it can appear tedious to learn, that understanding puts you in a position of great power. If anyone has any examples of the misuse of statistics, please post them in the comments below. Some discussion of the issues raised here can be found here and here. I said at the beginning that I wanted to make it clear that statistics are not just lies. By highlighting where they can be abused, I hope it is clear that statistical analysis can also powerfully demonstrate the truth of a matter. Like any tool, it is about the implementation, not the tool itself. Statistics are robust and scientifically credible ways of making sense of data. Provided the data have been collected responsibly and the statistics applied properly, they are among the most credible additions to an argument. It is the misuse of statistics we must be wary of, not statistics as a whole.