Before you can train a model you need data. Before you have data you need footage. This is how I got it.
Why highlight reels, not full broadcasts
The obvious source for hockey footage is full game broadcasts. Three hours of video, multiple goals per game, all the context you could want.
The problem is signal density. In a three-hour broadcast, a goal happens maybe five or six times. Each goal animation is visible for roughly five seconds. That’s maybe thirty seconds of positive footage in ten thousand seconds of broadcast - and you still have to find it.
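The arithmetic is worth making explicit. A quick sanity check, using assumed round numbers (roughly five goals per game, roughly five seconds of animation per goal):

```python
# Back-of-the-envelope class balance for a full broadcast.
# All counts are assumptions: ~5 goals per game, ~5 s of animation each.
broadcast_seconds = 3 * 60 * 60   # ~3-hour broadcast = 10,800 s
goals_per_game = 5
animation_seconds = 5

positive_seconds = goals_per_game * animation_seconds   # ~25 s of signal
ratio = positive_seconds / broadcast_seconds
print(f"{positive_seconds} s positive in {broadcast_seconds} s ({ratio:.2%})")
# → 25 s positive in 10800 s (0.23%)
```

Well under one percent of the footage is signal, before you even account for having to find it.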
Highlight reels solve this. A ten-minute Sportsnet highlight reel contains every goal from a game, plus replays. The ratio of useful footage to total footage is much better. And they’re easy to find - YouTube has them for almost every NHL game going back years.
The tradeoff is variety. Highlight reels are edited — cuts, replays, and pacing that don’t reflect a real broadcast — but for getting a first working model they’re the right call. I can expand to full broadcasts later if I need to.
Why Sportsnet
I focused on Sportsnet highlights for a few reasons.
First, it’s what I watch. The goal light is for my living room and I watch most games on Sportsnet. The model needs to work on the feed I actually use.
Second, Sportsnet has a distinctive goal animation that fires consistently. It’s visually loud — hard to miss, hard to confuse with other broadcast elements. That’s a good detection target.
Third, I also have access to Prime Video games and downloaded Prime highlights, but Prime coverage was still relatively new when I started collecting data: not all teams had played on Prime yet, so the footage was uneven. Sportsnet has years of consistent footage across every team, so that's where I focused.
The download script
The core tool is yt-dlp, a command-line video downloader. I wrapped it in a small Python script to keep the footage organized. You can install it with:
pip install yt-dlp
The bare command to download a video is:
yt-dlp -f "best[ext=mp4]" -o "output.mp4" "https://youtube.com/..."
But I was downloading sixty-something videos across two broadcasters and needed them named consistently for the frame extractor to work properly. So I built a small interactive script around it:
import subprocess
import os

# Menu choice -> broadcaster folder name
BROADCASTERS = {
    "1": "Sportsnet",
    "2": "Prime",
}

def select_broadcaster():
    # Re-prompt until the user picks a valid option
    while True:
        print("Select broadcaster:")
        print("  1) Sportsnet")
        print("  2) Prime")
        choice = input("Enter 1 or 2: ").strip()
        if choice in BROADCASTERS:
            return BROADCASTERS[choice]
        print("\n❌ Invalid choice. Enter 1 or 2.\n")

def main():
    print("=== YouTube Highlight Downloader ===")
    broadcaster = select_broadcaster()
    # Spaces become underscores so filenames stay shell-friendly
    team1 = input("Enter Team 1: ").strip().replace(" ", "_")
    team2 = input("Enter Team 2: ").strip().replace(" ", "_")
    date = input("Enter Date (YYYY-MM-DD): ").strip()
    url = input("Enter YouTube Video URL: ").strip()

    # Highlights/<Broadcaster>/<Team1>_<Team2>_<YYYY-MM-DD>.mp4
    output_dir = os.path.join("Highlights", broadcaster)
    os.makedirs(output_dir, exist_ok=True)
    filename = f"{team1}_{team2}_{date}.mp4"
    output_path = os.path.join(output_dir, filename)

    command = [
        "yt-dlp",
        "-f", "best[ext=mp4]",
        "-o", output_path,
        url,
    ]
    print("\nRunning command:")
    print(" ".join(command))
    print()

    try:
        subprocess.run(command, check=True)
        print(f"\n✅ Download complete! Saved as: {output_path}")
    except subprocess.CalledProcessError as e:
        print("\n❌ Error downloading video")
        print(e)

if __name__ == "__main__":
    main()
It prompts for broadcaster, teams, date, and URL, then downloads the video into a structured folder — Highlights/Sportsnet/ or Highlights/Prime/ — with a consistent filename. That naming convention matters because the frame extractor reads from these folders directly.
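To show why the convention is convenient downstream, here is a sketch of how a script could recover metadata from one of these paths. The `parse_highlight_path` helper is hypothetical, not part of the actual frame extractor, and it assumes the naming scheme above:

```python
import os

def parse_highlight_path(path):
    """Recover (broadcaster, matchup, date) from a path like
    Highlights/Sportsnet/Maple_Leafs_Canucks_2024-01-13.mp4.
    Hypothetical helper illustrating the naming convention."""
    broadcaster = os.path.basename(os.path.dirname(path))
    stem, _ = os.path.splitext(os.path.basename(path))
    # The date is always the final underscore-separated field.
    matchup, date = stem.rsplit("_", 1)
    return broadcaster, matchup, date

print(parse_highlight_path("Highlights/Sportsnet/Maple_Leafs_Canucks_2024-01-13.mp4"))
# → ('Sportsnet', 'Maple_Leafs_Canucks', '2024-01-13')
```

Note that because team names themselves contain underscores, the two teams can't be split apart unambiguously; the sketch keeps them as one matchup string.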
By the end of this process I had 60 Sportsnet highlight reels and 8 Prime reels sitting in organized folders, ready to process.
That’s the easy part done.
Next: extracting 210,000 frames from 68 videos — and why I’m only saving a 400×100 pixel crop of each one.