Install via:
devtools::install_github("Puntalytics/puntr")
library(puntr)
library(tidyverse) # always a good idea to do this too
puntr with NFL data
For speed, we’ve already scraped (using nflfastR) and saved punting data for the 1999-2020 seasons. The easiest thing to do is download the puntr-data repo, and then point puntr::import_punts() to your local copy of the data. You can also download the data directly each time; this takes around 15 minutes.
Import, clean, and calculate as follows:
#punts_raw <- import_punts(1999:2020, local=TRUE, path=your_local_path) # recommended
punts_raw <- import_punts(2018:2020) # This takes ~15 minutes
punts_cleaned <- trust_the_process(punts_raw) # clean
punts <- calculate_all(punts_cleaned) # calculate custom Puntalytics metrics
## Calculated RERUN: 0.455 sec elapsed
## Calculated SHARP: 4.643 sec elapsed
## Calculated pEPA: 1.137 sec elapsed
You now have a dataframe punts where each row is a punt, and each column is a stat relevant to punting (including our custom metrics).
puntr calculates stats using 3-year rolling averages, to avoid any artifacts of anomalous seasons. For this reason, puntr::calculate_all() requires a dataframe containing at least 3 seasons and 1000 punts. The example above uses three seasons (2018-2020). The example below also uses three seasons for the calculation, and then filters out all but the most recent one.
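A quick pre-check can catch an undersized dataframe before calling puntr::calculate_all(). The helper below is a hypothetical sketch, not part of puntr: the name has_enough_punts() and its arguments are made up for illustration, and it assumes that "3 seasons" means three distinct values in the season column.

```r
# Hypothetical helper (not part of puntr): test the documented minimums
# of at least 3 seasons and at least 1000 punts.
has_enough_punts <- function(punts, min_seasons = 3, min_punts = 1000) {
  n_seasons <- length(unique(punts$season))  # assumes a `season` column
  n_seasons >= min_seasons && nrow(punts) >= min_punts
}

# Toy data: three seasons with 400 rows each
toy <- data.frame(season = rep(2018:2020, each = 400))

has_enough_punts(toy)                                      # enough seasons and rows
has_enough_punts(toy[toy$season == 2020, , drop = FALSE])  # a single season only
```

If the check fails, widening the range of seasons passed to import_punts() is the obvious remedy.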
The kind folks at nflfastR have set up a convenient SQL-y way to scrape data for in-progress seasons. Rather than redo all of that work, we’ll just share here the code we use for this purpose:
#install.packages("DBI")
#install.packages("RSQLite")
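library(DBI)
library(RSQLite)
library(dplyr) # provides tbl(), filter(), collect() and %>%
library(nflfastR)
library(puntr)
update_db() # build or update the local play-by-play database
connection <- dbConnect(SQLite(), "./pbp_db")
pbp <- tbl(connection, "nflfastR_pbp")
punts <- pbp %>%
  filter(punt_attempt == 1) %>%
  filter(season %in% 2019:2021) %>% # three seasons are needed for calculate_all()
  collect() %>% # pull the filtered rows into R
  trust_the_process() %>%
  calculate_all() %>%
  filter(season == 2021) # keep only the in-progress season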
dbDisconnect(connection)
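The filters in the pipeline above sit before collect() for a reason: tbl() returns a lazy reference to the database table, so filter() is translated to SQL and only the matching rows are pulled into R when collect() runs. Here is a minimal sketch of that behaviour on an in-memory SQLite database. The rows are made up for illustration; only the table and column names are taken from the code above.

```r
library(DBI)
library(RSQLite)
library(dplyr) # the database backend also requires the dbplyr package to be installed

# Toy stand-in for the play-by-play table (made-up rows)
toy_pbp <- data.frame(
  season       = c(2018, 2019, 2020, 2021, 2021),
  punt_attempt = c(1, 1, 0, 1, 1)
)

con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "nflfastR_pbp", toy_pbp)

lazy_punts <- tbl(con, "nflfastR_pbp") %>%
  filter(punt_attempt == 1) %>%
  filter(season %in% 2019:2021)   # still a lazy query; no rows are in R yet

toy_punts <- collect(lazy_punts)  # runs the query and returns a tibble
dbDisconnect(con)

nrow(toy_punts)                   # number of punts from the 2019-2021 seasons
```

Functions that need an in-memory dataframe, such as trust_the_process() and calculate_all(), therefore belong after collect().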
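Note: New in puntr 1.3, the functions create_mini(), create_miniY() and create_miniG() are now deprecated, in favor of by_punters(), by_punter_seasons() and by_punter_games().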
To compare punters, use
punters <- by_punters(punts)
to get a dataframe where each row is a punter, and each column is an average stat for that punter. The most common standard and Punt Runts stats are included by default, but you can add whatever you like by supplying additional arguments, which are passed on to dplyr::summarize(). For example:
punters_custom <- by_punters(punts, longest_punt = max(GrossYards))
Let’s take a look at some of the columns in this data frame:
punters %>%
arrange(desc(pEPA)) %>%
select(punter_player_name, Gross, Net, pEPA) %>%
rmarkdown::paged_table()
To compare punter seasons, instead use
punter_seasons <- by_punter_seasons(punts)
And finally, to compare punter games, use
punter_games <- by_punter_games(punts)
which gives every unique punter game a row.
Note: If a career, season, or game you’re looking for is missing from your dataframe, try changing the threshold = parameter to require fewer punts.
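To make the roles of the extra summary arguments and of threshold = concrete, here is a rough sketch in plain dplyr on toy data. This is not puntr's implementation: summarize_punters_sketch() is a hypothetical stand-in, the n_punts column and the default threshold of 1 are assumptions, and the real by_ functions compute many more columns.

```r
library(dplyr)

# Hypothetical stand-in for a by_-style summary; NOT puntr's implementation.
summarize_punters_sketch <- function(punts, ..., threshold = 1) {
  punts %>%
    group_by(punter_player_name) %>%
    summarize(
      n_punts = n(),             # punts per punter (assumed column name)
      Gross = mean(GrossYards),  # one default stat, for illustration
      ...,                       # extra name = expression pairs, forwarded to summarize()
      .groups = "drop"
    ) %>%
    filter(n_punts >= threshold) # drop punters with too few punts
}

# Made-up punts: one punter with three punts, one with a single punt
toy_punts <- tibble(
  punter_player_name = c("A.Punter", "A.Punter", "A.Punter", "B.Kicker"),
  GrossYards = c(40, 50, 60, 45)
)

summarize_punters_sketch(toy_punts, longest_punt = max(GrossYards), threshold = 2)
```

With threshold = 2, the punter with a single punt is dropped. Lowering the threshold to 1 brings that row back, which is the effect the note above describes.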
These dataframes - punts, punters, punter_seasons and punter_games - should serve as a good starting point for any custom analysis you’d like to do, be that using built-in puntr metrics, or your own.
puntr with college data
NOTE: puntr was successfully migrated from cfbscrapR to cfbfastR in version 1.2.2.
NOTE: The by_ family of summary functions has not yet been tested on cfbfastR data, but might work.
puntr can also handle punting data for college football, piggybacking off of the scraping abilities of the cfbfastR package. You need at least 3 seasons’ worth of data to run calculate_all(). Import and clean as follows:
college_punts <- import_college_punts(2019:2021) %>% # import (calls cfbfastR behind the scenes)
college_to_pro() %>% # rename columns to those used by nflfastR
calculate_all() # calculate as with NFL data
Now we can use create_miniY to compare college punter seasons (create_mini and create_miniG would also work here, of course).
miniY_college <- create_miniY(college_punts)