A short(ish) intro
I love sport. Watching, reading about, talking about, and thinking about sport takes up more of my time and emotional energy than I like to admit. What I really love is trying to understand how sports work: not in individual games, but over the long term. How can I learn more about what makes teams win and what makes teams lose?
Sport by the Numbers is a newsletter all about using data to better understand those things. Away from lazy narratives and over-analysing random outcomes, how can we use long-term data to uncover trends, patterns, and insights that influence the games that we love? I will be focusing on the sports that I seriously follow: football, cricket, American football, Formula 1, and golf. The first post will be coming on Sunday February 20th, focusing on the performance of England’s Test cricket team.
I hope to publish new pieces approximately every month. I say “hope to” because, honestly, I don’t know how this is going to go. I might find that I don’t have the knowledge to produce answers to the questions that I want to investigate. Either way, I hope this is an interesting and informative exercise for me and for you, the readers.
If you’re reading this, I’d love to hear from you! If you have suggestions for how the newsletter can improve, corrections to my code (spoiler alert: it won’t be spot on), or anything else, please get in touch. I would particularly love any suggestions for topics and questions to investigate. Comment on my posts or tweet/DM me @UtdTait.
Keep reading for a longer intro focused on my view of the problems with conventional sports analysis, and more about my motivation for starting this newsletter. Enjoy and please subscribe!
Long Intro - The State of Sports Analysis
Judging by the coverage of sport on mainstream TV channels, sports fans can be forgiven for thinking that science has no place in analysing their favourite team. After every bad performance, we hear that the other team just “wanted it more”, that the talented, mercurial player is “too lazy”, or that the team simply didn’t show enough “heart and fight.” Statistics, especially those that go beyond the surface level, are largely absent.
While these mental factors no doubt play a part in explaining the events of any particular game, they are not everything. Leaning on them is an easy and, frankly, lazy way of explaining the varied, and often random, sequences of events that determine the outcome of sports. As anyone who has ever played a sport knows, no one goes onto the pitch deciding not to put in their full effort that day. To the extent that mental blocks matter, they are often caused by events that play out during the game, rather than the other way around. We focus on them because they are easy to understand and easy to talk about, creating easily digestible content on TV or online. After all, media companies have an incentive to get views and clicks, not necessarily to provide in-depth information.
Randomness is bad…or good?
This form of analysis also places too much emphasis on any single game. For all of our human desire to understand every event immediately after it takes place, sport is inherently beset by randomness. Almost every element of a single game is random. Whether a player shoots an inch inside the post or an inch outside can determine whether a football team wins or draws. Whether a wide receiver gets both feet down in bounds or has one toe skimming the sideline can determine the Super Bowl (and has). A cricket team using up all of its reviews can be denied a Test match win after an umpire’s misjudged LBW decision (sorry, Australians).
As much as we might try, single games are extremely hard to make sense of because they are, at heart, a set of random events. Without randomness, the more talented team would win every game and we wouldn’t bother watching. It is that randomness, the knowledge that anything can happen, that makes sport such an engaging spectacle. And yet it is also this randomness that makes analysing single games so difficult.
That is not to say that we shouldn’t review and analyse individual games. Every game is filled with extraordinary moments that deserve to be looked at repeatedly. Extrapolating from those moments, though, and drawing broad conclusions from them, is far more difficult.
Making Sense of a Random World
To make sense of sport, we have to look more broadly, at long-term patterns. While that is more difficult and more time-consuming, it is ultimately how we can best understand what leads to wins and losses. Single games are beset by randomness, but over a long enough run those random individual events tend to even out, and the trends that remain can help us make sense of the sports that we love. Data and analytics in sport seek to do just that.
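For anyone who wants to see that evening-out in action, here is a minimal Python sketch. It uses no real data. It assumes a hypothetical stronger team that wins any single game with probability 0.6 (a number invented purely for illustration, with draws ignored), and compares how often that team comes out ahead over one game versus over longer runs of games.

```python
import random


def share_of_runs_won(win_prob, n_games, n_runs=10_000, seed=42):
    """Fraction of simulated runs in which the stronger team wins
    more than half of its games.

    win_prob is an assumed, made-up single-game win probability.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    ahead = 0
    for _ in range(n_runs):
        # Each game is an independent coin flip weighted by win_prob.
        wins = sum(rng.random() < win_prob for _ in range(n_games))
        if wins > n_games / 2:
            ahead += 1
    return ahead / n_runs


# Odd run lengths avoid ties between wins and losses.
for n_games in (1, 11, 101):
    share = share_of_runs_won(0.6, n_games)
    print(f"{n_games:>3} games: stronger team ahead in {share:.1%} of runs")
```

With these made-up numbers, the stronger team comes out ahead in roughly 60% of single games, and that share climbs as the run gets longer. That is the sense in which long-term data cuts through single-game noise.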
In recent years, there has been a huge rise in the use of data and analytics to analyse sports. Much of this work stemmed from the Moneyball revolution in baseball, which showed the power of using data, rather than prevailing and unquestioned conventional wisdom, to make decisions. Baseball is a sport ideally suited to using data because each pitch can effectively be thought of as an independent experiment. Every pitch can be added to a huge database of pitches, from which patterns and insights can be observed.
Other, more continuous sports like football make it harder to gather data. But people and companies much smarter than me have made amazing headway in creating and using statistics that capture a team or player’s underlying performance, rather than relying on random outcomes. Mainstream coverage of sports like football, cricket, and, particularly, American football is increasingly using this data to inform viewers, but progress is slow.
So What am I Subscribing to, Anyway?
This is a newsletter about using this type of data to shed light on some of the biggest questions that we have about our favourite sports. I will be looking at trends in sports and trying to make sense of what is happening, away from media narratives and received wisdom, towards evidence that is based on numbers. I don’t claim to be an expert here; I have never played sport at an elite level, nor am I a whizz when it comes to data science. What I am is fanatical about sport, competent with data, and keen to investigate the questions that consume my thoughts about the sports that I spend my time watching. And if I’m thinking about these questions anyway, I thought that I might as well do something productive with those thoughts, find out the answers for myself, and share them with others.
Approximately every month I will publish a new piece using data to shed light on an interesting question. Sometimes this will be testing whether a widely accepted theory is true. Other times it will be testing how accurate statistics that we rely on really are. The unifying theme will be an unflinching focus on using numbers to find out.
Alongside every new article, I will publish any code that I have written to produce my work. Fair warning: I am farrrrr from an expert at coding. In fact, my foray into using Python only started in the last few months. In truth, part of my desire to start this newsletter is to give myself an opportunity to practise coding. So it won’t be perfect! If you see any errors, please let me know. I won’t be offended; this is a work in progress for me and I don’t expect to get it right.
Other than pointing out holes in my code, get in touch to suggest new topics for me to write about! I can’t promise that I will be able to investigate every single one, but I will do my best. The main sports I am focusing on are those that I spend my time watching: football, cricket, American football, Formula 1, and golf. If there are any topics in those sports that you would like me to investigate, let me know! A suggestion can be about a specific team, a specific player, or as broad as you like, as long as it can be investigated using data. Comment on a post or tweet/DM me @UtdTait.
Please subscribe to Sport by the Numbers if you haven’t already! If you enjoy reading my stuff, please consider listening to my takes on sport, too. I am a big Manchester United fan and co-host a weekly podcast about them (named the Manchester United Weekly Podcast. I know, creative, right?). If you made it this far, thank you! (Looking at you, Mum). I hope you all enjoy.