Can MLB pitching statistics be used to visualize differences in pitch types?
Major League Baseball (MLB) teams can have up to 13 pitchers on their roster during the regular season, and four to six of those are usually starting pitchers. Our goal is to use MLB data to see if there are measurable differences between types of pitches and between pitchers.
Specifically, we wondered if we could look at MLB pitching statistics to visualize the differences in the types of pitches thrown by two successful starting pitchers, Yu Darvish and Jon Lester.
As background, MLB pitchers specialize in different types of pitches (e.g. fastball, slider, curveball). Pitch types differ in part by the speed at which the baseball is thrown and by the rate of spin (“spin rate”) of the baseball.
To answer our question we:
- Used data from a GitHub user who gathered pitching data for the 2012-20 baseball seasons from the website baseball-reference.com.
- Created scatter plots, a type of data visualization that shows the relationship between two variables (or sets of data) by plotting each observation as a point using Cartesian coordinates.
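As a rough illustration of the plotting step, the sketch below groups pitches by type so that each type can be drawn as its own set of points, with speed on one axis and spin rate on the other. It is not the code from our walk-through: the sample rows are invented for illustration, the function names `group_by_pitch_type` and `draw_scatter` are hypothetical, the choice of axes is an assumption, and matplotlib is used here only as one common plotting library.

```python
from collections import defaultdict

# Invented sample rows for illustration only; these are NOT real measurements.
# Each row is (pitch_type, speed_mph, spin_rate_rpm).
SAMPLE_PITCHES = [
    ("four-seam fastball", 95.0, 2400),
    ("four-seam fastball", 96.5, 2450),
    ("slider", 84.0, 2700),
    ("slider", 85.5, 2650),
    ("curveball", 78.0, 2900),
]


def group_by_pitch_type(pitches):
    """Return {pitch_type: ([speeds], [spin_rates])}, ready for plotting."""
    grouped = defaultdict(lambda: ([], []))
    for pitch_type, speed, spin in pitches:
        speeds, spins = grouped[pitch_type]
        speeds.append(speed)
        spins.append(spin)
    return dict(grouped)


def draw_scatter(grouped, title):
    """Draw one labelled set of points per pitch type (needs matplotlib)."""
    # Imported here so the grouping step above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for pitch_type, (speeds, spins) in grouped.items():
        ax.scatter(speeds, spins, label=pitch_type)
    ax.set_xlabel("Speed (MPH)")
    ax.set_ylabel("Spin rate (RPM)")
    ax.set_title(title)
    ax.legend()
    return fig


grouped = group_by_pitch_type(SAMPLE_PITCHES)
```

If matplotlib is installed, calling `draw_scatter(grouped, "Sample pitcher")` returns a figure with one colour per pitch type. Grouping first keeps the plotting loop simple, because each pitch type becomes a single `scatter` call with its own legend entry.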
In the first plot, we can see that Yu Darvish has thrown 10 different pitch types. These vary in both speed and spin rate, with his fastest pitches (two-seam and four-seam fastballs) reaching 98.8 miles per hour (MPH), and his highest-spin pitch (a change-up) reaching 3,042 revolutions per minute (RPM).
In the second plot, we can see that Jon Lester has thrown five different pitch types. He doesn’t throw quite as hard as Yu Darvish, with his four-seam fastball reaching 96.1 MPH. The maximum spin rate achieved by a Jon Lester pitch is 3,482 RPM. This is higher than any of Yu Darvish’s pitches, but the value looks abnormal and may be an outlier.
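One common way to check whether a value such as the 3,482 RPM reading might be an outlier is the interquartile range (IQR) rule, which flags values far outside the middle half of the data. The sketch below is only an illustration of that rule, not a step from our analysis: apart from the 3,482 RPM maximum quoted above, the spin rates in the list are invented, and the function name `iqr_outliers` and the multiplier of 1.5 are assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented spin rates (RPM) for illustration, plus the 3,482 RPM maximum
# mentioned in the text. These are NOT the real pitch-by-pitch data.
spin_rates = [2250, 2300, 2320, 2350, 2380, 2400, 2410, 2450, 3482]

flagged = iqr_outliers(spin_rates)
print(flagged)
```

With these sample values, only the 3,482 RPM reading falls outside the limits. A flagged value is not automatically an error, so it is still worth checking the original data before removing it.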
Reflect on what you see
Look at and interact with the scatter plots above. When you hover over a point, you’ll notice more information appears.
Think about the following questions:
- What do you notice about the scatter plots?
- What do you wonder about the data?
Use the fill-in-the-blank prompts to summarize your thoughts:
- “I used to think_______”
- “Now I think_______”
- “I wish I knew more about_______”
- “These data visualizations remind me of _______”
Share your reflections
Below are some ways to share your thoughts about the data visualization. If you want to publish your reflections online, make sure you check in with your teacher or guardian.
- Blogging (there are many free blogging tools)
- Creating videos (here’s a guide on creating YouTube videos)
- Drawings (Google Drawings is one tool you could explore)
Learn how we visualized the data
Go to our walk-through (in Jupyter notebook format) to see how we used the data science process to create the scatter plots. (This process included formulating the question, gathering the data, analyzing the data with code, and creating the visualizations.)