I read Tim Harford’s book, How to Make the World Add Up, with great interest. I am a mathematician at heart and enjoy working with, reading, and looking at the numbers that make up the world. Yet I know that, too often, these numbers are misrepresented.
Harford’s book starts by telling us how to lie with statistics. I thought that was a clever way to introduce the topic, approaching it from a slightly askance angle. He describes how numbers can confuse and work against us. The book covers ten rules, and he wraps up with a final Golden Rule that summarises them all. (Spoiler alert: the Golden Rule is “be curious”.)
Each chapter is written with great stories, from experience and from history, that illustrate the challenges and difficulties of statistics. He discusses how they can misrepresent and mislead and, importantly, how they can also inform.
The book asks us to search our feelings and consider whether we believe the numbers to be real or false, which sets us on a journey of validating their accuracy (beware confirmation bias). By referring to our own experience we can question what appears right, what appears wrong, and what needs to be checked. As Harford says, “Often, that will require us to ask a few smart questions”. We should be checking for all three: what appears right, what appears wrong, and what is uncertain.
He talks in detail about jumping to the numbers too quickly – premature enumeration. He notes how numbers can hide detail as easily as they can expose it, and asks us to consider what is actually being measured by the statistics we are faced with. That brought home to me what happens when we do not understand the numbers, or when we do not understand the basis of the numbers. We learn quickly how these insights invite us to question the basis of almost every claim.
For example, during the Brexit campaign the lobby group Leave Means Leave called for a five-year freeze on “unskilled” immigration. It was hard to tell which numbers they were using to support this policy. Even the definitions aren’t clear. What does “unskilled” actually mean? Of course, political arguments often use loose language in order to present statistics. In this case, “unskilled” turned out to mean somebody who did not have a job offer for a role that would pay at least £35,000 a year. That’s a level that would exclude most nurses, primary school teachers, paralegals, many research scientists, and so on. Is that really what “unskilled” means to the ordinary person?
In another rule Harford invites us to seek the backstory to understand exactly what was being tested. He uses an example of the difference between offering many versus fewer choices. What seemed to arise in the numbers was an indication that fewer choices result in more sales. Yet there are many other factors at play, which only emerge when we dig underneath the story and look more deeply at the underlying truth. He also reminds us, in rule six, how our choices can be affected by other people; we tend to follow the herd, rightly or wrongly.
He also shows us how easy it is to make mistakes by taking results from a specific group and extrapolating them to a more general population. For example, a 1950s experiment relied on American college students as its test subjects. When its findings were extrapolated to the general population, to suggest a truth for everyone, the reality diverged. Of course, the general population includes American college students, but is not made up exclusively of them. The consequence is that we draw the wrong conclusions. It’s also important to consider how to address missing data. Even the national census, which seeks responses from the whole population precisely to avoid the extrapolation issue, still does not include every member of the population. There are groups, such as the very sick and the homeless, for whom completing a census is either difficult or impossible. So when government draws conclusions about the whole population from census data, it is not quite the same thing. How likely is it that public policy will take appropriate consideration of the very sick, or the homeless, if it relies on census data alone?
Harford moves on to algorithms and the impact they have on the presentation of data. The algorithms are often opaque. The collection of rules has become so complicated that it isn’t necessarily possible to understand all of the choices being made by computers. There was a real clarity for me around the difference between alchemy and science. In alchemy the methodology is hidden, and because it is hidden it cannot be scrutinised. A key part of science is the ability to scrutinise and repeat, and for that the methodology has to be explicit and transparent. In today’s world the algorithms decide how we will be presented with data, which data to present, and even the conclusions we should draw. That methodology is alchemical in nature.
Harford then looks at some of the typical assumptions that can be made, which take “statistical bedrock” for granted. For example, just because a statistic comes from an organisation that has historically produced good results does not mean that this particular result is a good one. The Office for National Statistics in the UK is one such organisation: independent, specialist, but not always right. Worse, such organisations are likely to defend the data they present. Even so, the presence of such independent organisations is really important in the world, both for cross-checking and for holding others to account for their narratives.
Harford also comments on the seductive beauty of misinformation, as well as the beauty of accurate and informative presentation. He uses examples from Florence Nightingale and William Farr. This chapter also looks at the ways that data can be presented to make it more powerful.
I was struck by the chapter that discusses Irving Fisher. Fisher was perhaps one of the greatest economists who ever lived, yet one of the least well known. Harford compares his career to that of John Maynard Keynes. The key difference between them was that Keynes was willing to change his mind when his information changed, while Fisher stuck to his guns. In the depression of the 1930s, Fisher was wiped out, and Keynes was shown to be worth listening to. For his ability to both predict and adapt to change, Keynesian economics remains a key part of the foundations of economic theory, even today.
Harford also discusses an experiment run by the Canadian-born psychologist Philip Tetlock. Tetlock looked at predictions from those who are in a position to make, or ought to be required to make, good predictions in politics, geopolitics, and economics. He designed questions whose outcomes would come to pass within a set period, so that he could validate the forecasts with hindsight. What he discovered was that those in a position where they are required to make predictions tend to stick to them, just like Irving Fisher, even when the evidence starts to move against them. In general, a group drawn at random from the population is at least as good, if not better, at making predictions. They are also better at acknowledging that they got it wrong, and at using that ‘miss’ to adapt and develop their predictions going forward. It was this work that led to the identification of so-called ‘super-forecasters’.
Finally, in the last chapter, comes the Golden Rule: be curious. Always look under the surface of those beautiful charts and graphs (with each of the ten ‘rules’ in mind), and keep an open mind, adapting to the answers as they appear.
This is not a statistics manual at all. The stories are illuminating and made it a delightful, informative, and entertaining read. It was creative in the manner in which it was written. I wasn’t surprised by that. Having listened to Tim Harford over many years on the Radio 4 statistics show “More or Less”, I know that he has been putting these rules into practice himself.
I thoroughly recommend the book and hope you enjoy it as much as I did.