This Week in Sports Analytics:
In a nod to one of my favorite shows as a kid, This Week in Baseball, I wanted to take some time every week to review interesting developments in Sports Analytics.
I wrote this up before I realized StatSheetStuffer.com compiled their own weekly roundup. I will do my best to avoid too much overlap, and I will certainly refer to theirs every week as well!
At MLB we’re looking to hire a manager for the Machine Learning team. We have a lot of exciting projects coming up and we’re putting together a great team (biased here) of data scientists, machine learning engineers, and computer vision experts. It’s a great opportunity to help build the future of Statcast.
Justin Jacobs took a look at using manifold learning to analyze player styles. He walks through the rebounding responsibilities for the Detroit Pistons.
The CMU Sports Analytics club, in a project led by Sarah Mallepalle, released next-gen scrapy, an NFL tracking data web scraping repository. It gives the public access to all of the passing tracking data, and the club plans to add wide receiver route locations in the future.
Sean J. Taylor, manager of the Core Stats team at Facebook, took a quick look at the play-calling efficiency of NFL coaches. As he demonstrates, many coaches are too conservative on second down, often optimizing for first downs when they should be optimizing for yardage gained.
Joe Gallagher, Data Scientist at vTime, took a look at adjusting xG per possession, with an eye towards improving how possessions containing multiple shots are handled. His analysis of the England/Tunisia World Cup game using free data from StatsBomb was particularly interesting.
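To see why multi-shot possessions need special handling, here is a minimal sketch of one common adjustment. It is not necessarily the method used in the post above. It assumes the shots in a possession are independent (a simplification), and the function name and xG values are made up for illustration. Since a possession can produce at most one goal, simply summing shot xG can overstate its value; the sketch instead computes the probability that at least one shot scores.

```python
from math import prod


def possession_xg(shot_xgs):
    """Probability that at least one shot in the possession scores.

    Assumes shots are independent, which is a simplification.
    """
    return 1.0 - prod(1.0 - xg for xg in shot_xgs)


# Hypothetical possession: a 0.30 xG shot followed by a 0.50 xG rebound.
shots = [0.30, 0.50]

naive_total = sum(shots)         # simple sum of shot xG
combined = possession_xg(shots)  # 1 - (0.70 * 0.50)

print(f"summed xG: {naive_total:.2f}, possession xG: {combined:.2f}")
```

With these made-up numbers the summed value is 0.80, while the possession-level value is 0.65.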
Hockey Graphs had a magnificent three-part explainer on how they developed their WAR statistic for hockey.
Part 1 discussed the history of WAR, Part 2 discussed their statistical plus-minus based strategy, and Part 3 reviewed their replacement level assumptions and finalized methodology. The data for the model can be found on their Evolving Hockey website.
Rocío Joo et al. compiled a list of 57 movement tracking packages within R and provided recommendations. While the wealth of packages is impressive, they recommend “integration over proliferation” to encourage researchers to evaluate how a new package fits within the ecosystem (h/t Luke Bornn).