Sabermetrics: Using a ggplot Image To Make Beautiful Graphs

Adding images to a ggplot can make a boring, mundane plot in RStudio truly pop off the page. While doing so may seem exceedingly difficult, I promise you that it is not.

In my case, I am a big Pittsburgh Pirates fan and I wanted to throw together a graph showcasing the silly season that Steven Brault is currently having by doing a quick ‘geom_bar’ graph of staff ERA color-coded by pitching role (i.e., reliever or starter).

By using the FanGraphs leaderboard search, I quickly pulled the information for all of the Pirates’ starters. And, as mentioned, Steven Brault is sitting there with a 0.00 ERA despite three starts and seven innings pitched.

His ERA as a reliever is considerably worse.

Clearly, his talent and stats don’t reflect that 0.00 ERA. It is simply a combination of fluke circumstances that have made it possible.

To start the process, I put together the following dataset in Excel: Pirates ERA on my GitHub.

As you can see, it is straightforward: a player’s name, their role, a URL to their headshot image, and a number indicating their ERA.
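For anyone who prefers to see that layout in code, here is a minimal sketch of the same four columns built directly in R. The column names (Name, role, url, era) match the ones used in the plotting code; the pitcher names, the example.com URLs, and the ERA values are placeholders, not the real data from the spreadsheet.

```r
# Illustrative layout only: names, URLs, and ERA values are placeholders,
# not the real numbers from the spreadsheet.
piratesera <- data.frame(
  Name = c("Pitcher A", "Pitcher B", "Pitcher C"),
  role = c("Starter", "Reliever", "Starter"),
  url  = c("https://example.com/pitcher-a.png",
           "https://example.com/pitcher-b.png",
           "https://example.com/pitcher-c.png"),
  era  = c(0.00, 3.50, 5.25),
  stringsAsFactors = FALSE
)

# Inspect the structure: three character columns and one numeric column
str(piratesera)
```

Keeping era numeric matters, because both reorder() and the y axis in the plot rely on it being a number rather than text.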

For further reference, I pulled the headshots from this site: MLB Player Headshots.

Unfortunately, there is not an easy way to scrape the pictures. There is an API that USA Today uses but it is extremely pricey. Because of that, I do it manually.

Once the data was imported into RStudio, I used the following code:

library(ggplot2)
library(ggimage)  # provides geom_image()

ggplot(piratesera, aes(x = reorder(Name, -era), y = era, fill = role)) +
  geom_bar(stat = "identity") +
  geom_image(aes(image = url), size = 0.05, nudge_y = .4) +
  scale_y_continuous(limits = c(0.0, 10.00)) +
  theme_bw() +
  scale_fill_manual(values = c("#27251F", "#FDB827")) +
  theme(panel.background=element_rect(fill="#bcbcbc")) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.grid.minor = element_blank()) +
  theme(plot.background=element_rect(fill="#ffffff")) +
  theme(panel.border=element_rect(colour="#ffffff")) +
  theme(axis.text.x=element_text(angle = 90, vjust = 0.5, size=11,colour="#535353",face="bold")) +
  theme(axis.text.y=element_text(size=11,colour="#000000",face="bold")) +
  theme(axis.title.y=element_text(size=11,colour="#000000",face="bold",vjust=1.5)) +
  theme(axis.title.x=element_text(size=11,colour="#000000",face="bold",vjust=-.5)) +
  labs(title = "Pittsburgh Pirates - Staff ERA",
       caption = "Min. 5 IP  |  Data: FanGraphs",
       fill = "Role") +
  theme(plot.title=element_text(face="bold",hjust=-.03,vjust= 0,colour="#3C3C3C",size=20)) +
  ylab("ERA") +
  xlab("Player")

If you are a regular ggplot user, you will recognize right away the use of ‘geom_image.’ This function comes from the ggimage package, which is a vital part of the process. So, if you don’t have it, install it.

install.packages("ggimage")
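If you would rather not reinstall packages that are already on your machine, a small check like the one below can help. It is only a sketch: missing_packages is a hypothetical helper name, not something that ships with ggimage, and it leans on base R's requireNamespace().

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded
# (usually because they are not installed)
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

needed <- missing_packages(c("ggplot2", "ggimage"))
print(needed)

# Uncomment to install whatever is missing
# if (length(needed) > 0) install.packages(needed)
```

The install line is left commented out so that running the snippet never changes your library without you asking it to.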

When you run that code, you come up with this plot:

There are a few things that I immediately don’t like about it:

  1. I had to grey the background of the panel in order to match the background of the headshot. I thought it would be OK, but it looks like garbage. As well, Richard Rodriguez’s background is a different color grey than the rest. And that drives me nuts.
  2. I don’t like the grey background in general. I like clean, crisp plots so I ideally would like it white.

Part of the ggplot process is problem-solving in order to make the graphs look exactly like you want them to.

In this instance, the problem is the picture background. Luckily, this site makes it super easy to remove the background (without the use of Photoshop/GIMP): Photo Background Remover.

After running all the photos through that to remove the grey background, I quickly re-uploaded the photos and changed the URLs in the dataset.
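Changing the URLs does not have to happen by hand in the spreadsheet. As a rough sketch, assuming the cleaned images keep their original file names and are simply re-uploaded under a new base URL, the column could be rewritten in R. Both base URLs and the pitcher names below are made-up placeholders.

```r
# Placeholder data: the real dataset has one headshot URL per pitcher
piratesera <- data.frame(
  Name = c("Pitcher A", "Pitcher B"),
  url  = c("https://example.com/original/pitcher-a.png",
           "https://example.com/original/pitcher-b.png"),
  stringsAsFactors = FALSE
)

# Assumed locations; swap in wherever the images actually live
old_base <- "https://example.com/original/"
new_base <- "https://example.com/no-background/"

# Replace the old prefix with the new one, leaving the file name untouched
piratesera$url <- sub(old_base, new_base, piratesera$url, fixed = TRUE)
print(piratesera$url)
```

Using fixed = TRUE treats the old prefix as literal text, so the dots and slashes in the URL are not interpreted as a regular expression.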

As well, the following changes were made to the code:

ggplot(piratesera, aes(x = reorder(Name, -era), y = era, fill = role)) +
  geom_bar(stat = "identity") +
  geom_image(aes(image = url), size = 0.05, nudge_y = .4) +
  scale_y_continuous(limits = c(0.0, 10.00)) +
  theme_bw() +
  scale_fill_manual(values = c("#27251F", "#FDB827")) +
  # Changed the panel background to white
  theme(panel.background=element_rect(fill="#ffffff")) +
  # Deleted the panel.grid.major and panel.grid.minor element_blank() lines
  # so the default theme_bw() grid is placed back on the plot
  theme(plot.background=element_rect(fill="#ffffff")) +
  theme(panel.border=element_rect(colour="#ffffff")) +
  theme(axis.text.x=element_text(angle = 90, vjust = 0.5, size=11,colour="#535353",face="bold")) +
  theme(axis.text.y=element_text(size=11,colour="#000000",face="bold")) +
  theme(axis.title.y=element_text(size=11,colour="#000000",face="bold",vjust=1.5)) +
  theme(axis.title.x=element_text(size=11,colour="#000000",face="bold",vjust=-.5)) +
  labs(title = "Pittsburgh Pirates - Staff ERA",
       caption = "Min. 5 IP  |  Data: FanGraphs",
       fill = "Role") +
  theme(plot.title=element_text(face="bold",hjust=-.03,vjust= 0,colour="#3C3C3C",size=20)) +
  ylab("ERA") +
  xlab("Player")

After those couple of changes, the completed plot is as follows:

As you can see, it is a much better looking plot.

Adding images to any ggplot is made very simple through the use of the geom_image function from the ggimage package. By simply adding a column to the dataset that holds either a file path on your computer or a URL, you can instruct ggplot to add the images.
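To illustrate the local-path case, here is a small self-contained sketch. It assumes ggplot2 and ggimage are installed and that your R build can write PNG files. It draws two throwaway colored squares into a temporary folder to stand in for headshots; the pitcher names and ERA values are placeholders.

```r
library(ggplot2)
library(ggimage)

# Create two stand-in "headshots" as small PNG files in a temporary folder
img_paths <- file.path(tempdir(), c("pitcher-a.png", "pitcher-b.png"))
square_colors <- c("#27251F", "#FDB827")
for (i in seq_along(img_paths)) {
  png(img_paths[i], width = 100, height = 100)
  par(mar = c(0, 0, 0, 0))
  plot.new()
  rect(0, 0, 1, 1, col = square_colors[i], border = NA)
  dev.off()
}

# Placeholder data: the image column holds local file paths instead of URLs
demo_era <- data.frame(
  Name  = c("Pitcher A", "Pitcher B"),
  era   = c(2.50, 4.75),
  image = img_paths,
  stringsAsFactors = FALSE
)

# Same idea as the Pirates plot: bars first, then one image per row
p <- ggplot(demo_era, aes(x = Name, y = era)) +
  geom_col(fill = "#bcbcbc") +
  geom_image(aes(image = image), size = 0.1) +
  theme_bw()

# print(p) would render the plot in the RStudio Plots pane
```

The only difference from the URL version is what the image column contains; the geom_image call itself does not change.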

Brad Congelio

An Assistant Professor in the College of Business at Kutztown University of Pennsylvania, Brad Congelio uses data science and analytics to investigate the sport industry.
