Install via:
devtools::install_github("Puntalytics/puntr")
library(puntr)
library(tidyverse) # always a good idea to do this toopuntr with NFL data
For speed, we’ve already scraped (using nflfastR) and saved punting data for the 1999-2020 seasons. The easiest thing to do is download the puntr-data repo here, and then point puntr::import_punts() to your local copy of the data. You can also download the data directly each time; this takes around 15 minutes.
Import, clean, and calculate as follows:
#punts_raw <- import_punts(1999:2020, local=TRUE, path=your_local_path) # recommended
punts_raw <- import_punts(2018:2020) # This takes ~15 minutes
punts_cleaned <- trust_the_process(punts_raw) # clean
punts <- calculate_all(punts_cleaned) # calculate custom Puntalytics metrics ## Calculated RERUN: 0.455 sec elapsed
## Calculated SHARP: 4.643 sec elapsed
## Calculated pEPA: 1.137 sec elapsed
You now have a dataframe punts where each row is a punt, and each column is a stat relevant to punting (including our custom metrics).
puntr calculates stats using 3-year rolling averages, to avoid any artifacts of anomalous seasons. For this reason, puntr::calculate_all() requires a dataframe containing at least 3 seasons and 1000 punts. Note in the above example that three seasons are used. Note in the below example that three seasons are used for the calculation, after which all but the most recent seasons are filtered out.
The kind folks at nflfastR have set up a convenient SQL-y way to scrape data for in-progress seasons. Rather than redo all of that work, we’ll just share here the code we use for this purpose:
#install.packages("DBI")
#install.packages("RSQLite")
library(DBI)
library(RSQLite)
library(nflfastR)
library(puntr)
update_db()
connection <- dbConnect(SQLite(), "./pbp_db")
pbp <- tbl(connection, "nflfastR_pbp")
punts <- pbp %>% filter(punt_attempt==1) %>%
filter(season %in% 2019:2021)
collect() %>%
trust_the_process() %>%
calculate_all() %>%
filter(season == 2021)
dbDisconnect(connection)Note: New in puntr 1.3, the functions
are now deprecated, in favor of
To compare punters, use
punters <- by_punters(punts)to get a dataframe where each row is a punter, and each column is an average stat for that punter. The most common standard and Punt Runts stats are included by default, but you can add whatever you like by passing additional arguments to dplyr::summarize(). For example:
punters_custom <- by_punters(punts, longest_punt = max(GrossYards))Let’s take a look at some of the columns in this data frame:
punters %>%
arrange(desc(pEPA)) %>%
select(punter_player_name, Gross, Net, pEPA) %>%
rmarkdown::paged_table()To compare punter seasons, instead use
punter_seasons <- by_punter_seasons(punts)And finally, to compare punter games, use
punter_games <- by_punter_games(punts)which gives every unique punter game a row.
Note: If a career, season, or game you’re looking for is missing from your dataframe, try changing the threshold = parameter to require fewer punts.
These dataframes - punts, punters, punter_seasons and punter_games - should serve as a good starting point for any custom analysis you’d like to do, be that using built-in puntr metrics, or your own.
puntr with college data
NOTE: puntr was successfully migrated from cfbscrapR to cfbfastR in version 1.2.2 NOTE: The by_ family of summary functions have not yet been tested for cfbfastR data, but might work.
puntr can also handle punting data for college football, piggybacking off of the scraping abilities of the cfbfastR package. You need at least 3 seasons worth of data to run calculate_all(). Import and clean as follows:
college_punts <- import_college_punts(2019:2021) %>% # import (calls cfbfastR behind the scenes)
college_to_pro() %>% # rename columns to those used by nflfastR
calculate_all() # calculate as with NFL dataNow we can use the same create_miniY as above to compare college punter seasons (create_mini and create_miniG would also work here, of course.)
miniY_college <- create_miniY(college_punts)