Youth hockey stats template excel spreadsheet

By Dave MacPherson | Posted Aug 19, 2020

In July 2019, I launched Pick224, a hockey analytics website containing data from over 15 hockey leagues across the world with a focus on junior hockey and the NHL draft. To compile this data, I first scrape all the individual game summaries of each league, then clean the data, total up all the stats, and join the results with each player’s biographical information.

Since launching, I’ve written blogs on how to scrape hockey data, whether from HTML web pages or sites that use JSON. I’ve since gotten a lot of questions from scouts and others who work in hockey asking about the next steps: cleaning and summarizing data.

If you have some experience with Excel or Google Sheets and don't want to start learning a programming language like R or Python, but still want to get shit done, this post is for you. Excel and Google Sheets are powerful tools, and it’s okay if you’re not ready to learn a programming language. After you get a good understanding of the things you can do with data, you can then decide if you would like to do those things more efficiently and effectively using R or Python. You may also want to start with R or Python if you're working on a project with more data than Google Sheets or Excel can handle.

In the following sections, I’ll walk through the whole process of turning raw hockey data into the summarized player totals you see on Pick224, all using Google Sheets. When I first launched Pick224, my database was entirely built using the methods below.

I recommend following along and working through the formulas. Here’s a file containing only the raw data that you can make a copy of and follow along with, and here’s a copy of a final file, after having worked through everything below.

The raw data

To prepare for this write-up, I wrote some scrapers to pull everything I needed from the Czech U20 league’s 2020 season. This data includes: