Sports Analytics: An Introduction
Sports analytics is becoming an incredibly popular avenue for my students to break into the industry.
And it doesn’t seem like this firehose of excitement is going to end anytime soon. Case in point: I spend countless Saturday mornings during the Fall and Spring semesters meeting with potential students and their parents during Visitation Events in an effort to convince them that Kutztown University of Pennsylvania is the right educational fit for them.
Over the course of the last couple of years, the same questions have been asked multiple times, by different families and students, at EVERY.SINGLE.EVENT: Do you teach sports analytics in the program? And how do I break into the field of sports analytics?
The answer is always the same: we are creating the course (I am currently looking at the paperwork for it on my desk).
The point here is this: interest in sports analytics as a field of study and a path of employment is higher now than ever.
But, despite this incredible growth in interest, there are still plenty of question marks regarding how one should break into the field and make a name for themselves.
Sports Analytics Education/Degree Options
The way I see it, there are three ways to go about getting yourself an education in sports analytics: (1.) getting a formal college degree; (2.) taking a few courses as part of a different degree; (3.) self-teaching.
There are certainly pros and cons to each option. And I will be honest and tell you this: I am part of the “self-taught” group, meaning I never received formal training in sports analytics.
Rather, I purchased some books, taught myself how to code in the R language, and was off to the races after that.
But, of course, that route might not be the best for everybody. And as competition increases for jobs in sports analytics, it is likely that a more formal training in the field will be necessary.
With that in mind, let’s take a closer look at each option above.
Sports Analytics: Formal Education
Options for receiving a formal education in sports analytics are limited at this point.
This is the case because there are just a few colleges/universities in the nation that provide an undergraduate degree in sports analytics.
The biggest one, of course, is Syracuse University. I am actually quite jealous of their program and what they have been able to build from the ground up.
For example, as part of your formal education, you will take classes in:
- Sport Data Analysis
- Sport Economics
- Price Theory in Sports
- Database and Programming
- Web Scraping
Those courses all sound amazing. But do not forget that any sports analytics education is going to be grounded in mathematics and statistics (as it should be!).
You can see the full list of courses you would be required to take in this program here: Sports Analytics Major Requirements.
Among them: Calculus I, Calculus II, Calculus III, Probability and Statistics, and Macroeconomics.
As you can see, the coursework for a sports analytics degree isn’t exactly on the light side. And that is important to keep in mind because, personally, I hate calculus. I understand it. I can do it. But I hate it.
Like, loathe it.
I am not convinced I could sit through three different courses of calculus on my way to a sports analytics degree.
And, remember, there is literally one sports analytics program in the entire country right now. (I will keep this updated as new programs form.) If you don’t feel like moving to snowy Syracuse, New York, then you are likely out of luck until more colleges and universities follow the sports analytics trend.
Taking a Few, Select Courses
If you are not currently a student at Syracuse University, then you are most likely going to fall within this option.
And it is tough for me to talk about this with any kind of specifics, because there are so many variations in programs.
That said: you could piece together an informal analytics education by picking and choosing your free electives wisely. Take statistics, calculus, etc. Take a few programming courses from the computer science department.
If you are lucky, the sport management program at your school might offer an introductory analytics course where you can be introduced to R and Python.
Doing it in this fashion may not be as immersive as a full-blown program, but it is honestly better than nothing.
If this still isn’t for you though – perhaps you graduated from college a decade ago like myself – then you probably have to go the route of self-teaching, though there are still avenues for an education that I will touch upon.
I will be totally honest here: this is the route that I am most partial to because it is the route I took to carve out a spot in the sports analytics field.
It may seem overwhelming to go this route, but I am hoping to lay out an easy-to-follow path for you. As well, every product suggestion I make here is simply that: a suggestion.
But the courses and books I suggest here fast-tracked my learning process.
And I say that because they forced me to just start coding.
The courses and books forced me to think analytically about sports.
And the courses and books gave me the ability to get data to work and practice with.
Going forward, we are going to explore several things about the self-teaching route:
- Helpful books
- Helpful courses
Before charting your own course in the world of analytics, it is important to ground yourself in some of the basics.
Thankfully there are some outstanding books to help you get started.
That said: if you go onto Amazon and start searching for books on the topics of analytics in RStudio or sports analytics, you will likely be quickly overwhelmed.
I know I was. And I have a bookshelf full of underwhelming books about R and analytics to show for it.
Because of that, I have distilled my list of recommended books down to just four. Each one, I feel, provides an outstanding education on different elements of RStudio and sports analytics.
I want to provide some thoughts about each book as well.
R for Data Science, by Hadley Wickham (O'Reilly Media, 2017, 520 pages)
R for Data Science is the holy bible of learning the R programming language. It is written by Hadley Wickham, who is the Chief Scientist at RStudio. It simply does not get any more authoritative than that.
What is lovely about this book is that it provides all the data sets you need to get up and running. At first, you work with flight schedule data as well as automobile data.
It isn’t exactly sport. And it might not be the most exciting thing to dive into as you learn.
But the data is perfect for a beginner and Hadley does an outstanding job of walking you through the earliest stages of data science.
Of course, the book moves on to more advanced things, especially using the tidyverse collection of packages.
However, when it comes to getting a beginner’s education, there is simply no other book that does the job quite like R for Data Science.
R Graphics Cookbook, by Winston Chang (O'Reilly Media, 2013, 416 pages)
The next step after learning the basic functions of the R language is to be able to take your data and put it into graphical form.
That is the true beauty of R.
In any case, you will be using a package called ggplot2. The R Graphics Cookbook by Winston Chang is an outstanding source to really dive into the nitty-gritty of plotting in the R language.
The purpose of sports analytics is to break down complex situations into easy-to-understand snippets of text and graphics.
Thankfully ggplot makes that easy … and this “cookbook” helps you along the way.
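To give a sense of what that looks like in practice, here is a minimal sketch of a ggplot2 scatter plot. It is not taken from either book; it simply assumes the ggplot2 package is installed and uses mpg, the automobile data set that ships with it. The axis labels and title are my own choices.

```r
# Assumes ggplot2 is installed: install.packages("ggplot2")
library(ggplot2)

# mpg is the automobile data set bundled with ggplot2
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(
    title = "Engine size vs. highway fuel economy",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon"
  )

# Draw the plot
print(p)
```

Swap in a sports data frame and two of its columns and the same three pieces (data, aesthetics, geom) still apply.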
Analyzing Baseball Data with R, by Max Marchi, Jim Albert, and Benjamin Baumer (Chapman and Hall/CRC, 2018, 360 pages)
This is my personal favorite, because it is what got me started in the world of sports and the R programming language.
First, I need to say that Analyzing Baseball Data with R is written by some of the foremost experts in the field.
Max Marchi is a baseball analyst for the Cleveland Indians.
Jim Albert is a professor of statistics at Bowling Green.
Benjamin Baumer teaches statistics and data science at Smith College and was previously an analyst for the New York Mets.
The book they have produced takes you step-by-step through the world of analyzing baseball data with the R programming language.
Some of the small guides I have put together were inspired by this book. For example, my lesson on creating a graphical pitching spray chart. Or my overview of using ggplot to make a simple, but elegant, graph.
In short, if there is one book you should purchase, it is this one. Analyzing Baseball Data with R made the process simple.
That said: it is just a tad bit dated. A lot of the data the book uses is older. As well, it covers a lot of the Lahman Database when, really, most of the data you would want to use can easily be grabbed off of Baseball Savant or FanGraphs. It is a small issue to have to deal with.
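If you have never touched the Lahman Database, here is a rough sketch of what working with it can look like. This is my own illustration rather than an excerpt from the book, and it assumes the Lahman package from CRAN is installed; the names hr and top are just placeholders.

```r
# Assumes the Lahman package is installed: install.packages("Lahman")
library(Lahman)

# Batting has one row per player, season, and stint, so add up
# home runs across stints to get one total per player-season
hr <- aggregate(HR ~ playerID + yearID, data = Batting, FUN = sum)

# Sort from most to fewest home runs and keep the top five seasons
top <- head(hr[order(-hr$HR), ], 5)
print(top)
```

It is only a few lines, but it is the kind of "just start coding" exercise that makes the data feel less intimidating.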
As for courses, there is one from DataCamp that I simply cannot recommend enough.
To close this out, I will ask that you simply promote yourself and the work that you do.
Create a Twitter, for example. Put some cool graphs together, add some hashtags to a Tweet, and start putting yourself out there.
Indeed, tag me in a Tweet with your work (@BradCongelio) and I will be sure to retweet it to my audience as well.