Announcing the Winners of ‘The NFL Fantasy Football’ Data Challenge

This data challenge used NFL player performance data and fantasy points from the last 6 seasons to forecast the points players would score in the 2023 NFL season, which began Sept. 7, 2023.

Ocean Protocol Team
Ocean Protocol

--

AI / ML offers tools that give a competitive edge in predictive analytics, business intelligence, and performance metrics. Fantasy Football is a popular pastime for many people around the world, so we gathered player performance data from the past 6 seasons to see what our community of data scientists could create.

The task at hand was the following:

  • Conduct EDA (Exploratory Data Analysis)
  • Implement Feature Engineering
  • Build a Model to predict next season’s scores
  • Evaluate the Model’s performance with metrics such as RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error)
  • Submit a Final Prediction
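
For readers new to the evaluation metrics named in the task list, here is a minimal sketch of how MAE and RMSE can be computed. The point totals are made-up illustrative numbers, not challenge data, and the function names are our own.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: the average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large misses more heavily than MAE."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Hypothetical season point totals for four players (illustrative only).
actual_points = [250.0, 180.0, 95.0, 310.0]
predicted_points = [240.0, 200.0, 100.0, 280.0]

print(f"MAE:  {mae(actual_points, predicted_points):.2f}")
print(f"RMSE: {rmse(actual_points, predicted_points):.2f}")
```

RMSE is never smaller than MAE on the same predictions, so a large gap between the two suggests a few players were missed by a wide margin.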

Winners

The winning reports are available to download in the Ocean Market and are summarized below:

1st place ($2500): Mohammad Jamali

Report + Prediction is available for download via: https://market.oceanprotocol.com/asset/did:op:79219d53c70e285c0ad607fa71fadcc170491f972220de91747520c71416daa4

Mohammad took gold in this challenge to predict the scores of NFL players in fantasy football. The report combined the data set provided in the challenge with external data feeds and alternative sources. At the link above, you will find detailed data visualizations, explanations of the scripts, the use of neural networks, and several iterations of predictive models for each category of NFL player.

2nd Place ($1500): Goblin

This report is available for download via: https://market.oceanprotocol.com/asset/did:op:3d642c18c5d3e044a2b9c501af7b5bb91f0258e48f22c9eaaa63f464bc51b803

Prediction dataset: https://market.oceanprotocol.com/asset/did:op:f38ec66f009d8ff88f8a10d394d38804da3d51bc5610ae6b21166547e50f7f2f

Goblin’s report stood out from the start by immediately identifying missing values in the dataset provided for this challenge. The predictions above use players' years of experience, age, BMI, and team as factors when predicting player performance scores. The report also provides yearly positional averages for the dominant point scorers on a given fantasy team (Quarterbacks, Running Backs, Wide Receivers), which support the prediction dataset available for download above. With this approach, Goblin was able to detect a quantifiable difference between mobile and pocket quarterbacks, dual-threat running backs, and versatile wide receivers.
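
To illustrate what a yearly positional average looks like in practice, here is a small sketch. It is not Goblin's code; the field names (season, position, fantasy_points) and the numbers are assumptions made up for the example.

```python
from collections import defaultdict

def yearly_positional_averages(rows):
    """Average fantasy points for each (season, position) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of points, player count]
    for row in rows:
        key = (row["season"], row["position"])
        totals[key][0] += row["fantasy_points"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical player-season rows (illustrative numbers only).
rows = [
    {"season": 2021, "position": "QB", "fantasy_points": 300.0},
    {"season": 2021, "position": "QB", "fantasy_points": 260.0},
    {"season": 2021, "position": "RB", "fantasy_points": 210.0},
    {"season": 2022, "position": "QB", "fantasy_points": 280.0},
    {"season": 2022, "position": "WR", "fantasy_points": 190.0},
    {"season": 2022, "position": "WR", "fantasy_points": 230.0},
]

averages = yearly_positional_averages(rows)
for (season, position), avg in sorted(averages.items()):
    print(season, position, avg)
```

Comparing a player's points against the average for the same position and season is one simple way to see who outperforms their peers.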

3rd Place ($100): Goat

Report: https://market.oceanprotocol.com/asset/did:op:efa3dc4a52022d75f8dec153e1f63a2537ff49b72af1ab8f00b0bd6819f4e5ab

Predictions:

https://market.oceanprotocol.com/asset/did:op:c65bb5e735e670f63b100b1c26844b335f6b8f105a07358de505d81be87f62e6

The visuals from this report show two key findings:

1) Distribution of Fantasy Points Per Game: Most players score few fantasy points per game, with the distribution peaking around 0–5 points. The long tail indicates that only a few players score exceptionally high points per game.

2) Player Experience (Age) vs. Fantasy Points Per Game: There’s no clear linear relationship between age and fantasy points per game. However, points are widely distributed across all age groups, with younger players (around age 25) posting some of the highest scores.

Goat displayed fascinating feature engineering in this data challenge and chose to base the predictive analytics model on the following attributes:

  • completions_per_game and attempts_per_game
  • passing_yards_per_game
  • passing_tds_per_game
  • interceptions_per_game
  • passing_air_yards_per_game
  • passing_yards_after_catch_per_game (YAC)
  • rushing_attempts_per_game
  • rushing_yards_per_game
  • rushing_tds_per_game
  • receptions_per_game
  • targets_per_game
  • receiving_yards_per_game
  • receiving_tds_per_game
  • receiving_yards_after_catch_per_game
  • receiving_air_yards_per_game
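
Per-game attributes like these are typically derived by dividing season totals by games played. The sketch below shows the idea. It is not Goat's implementation; the input keys (games, completions, and so on) and the sample numbers are assumptions for illustration.

```python
def per_game_features(season_totals, stat_names):
    """Turn season totals into per-game rates, e.g. passing_yards -> passing_yards_per_game."""
    games = season_totals["games"]
    if games <= 0:
        # A player with no games played gets zero rates instead of a division error.
        return {f"{name}_per_game": 0.0 for name in stat_names}
    return {f"{name}_per_game": season_totals[name] / games for name in stat_names}

# Hypothetical quarterback season totals (illustrative numbers only).
qb_season = {
    "games": 16,
    "completions": 352,
    "attempts": 544,
    "passing_yards": 4000,
    "passing_tds": 32,
    "interceptions": 8,
}
stat_names = ["completions", "attempts", "passing_yards", "passing_tds", "interceptions"]

features = per_game_features(qb_season, stat_names)
print(features)
```

Normalizing by games played keeps players who missed part of a season comparable with those who played every week.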

To assess performance, Goat first ran 5-fold cross-validation on the data spanning 2016 to 2022. This method was chosen to rigorously assess and fine-tune each model’s performance across a comprehensive range of hyperparameters. Cross-validation ensured that the assessment did not rely on a single data split. Instead, each model was evaluated across multiple subsets, which makes the resulting performance metrics more robust and reliable.
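
As a rough sketch of how 5-fold cross-validation works, the example below splits a small data set into five folds, holds each fold out in turn, and scores a placeholder model on it. This is not the model from the report: the "model" here simply predicts the training mean, and the points values are made up for illustration.

```python
import math
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the row indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate_rmse(targets, k=5, seed=0):
    """Return one RMSE score per fold, each computed on data the model did not see."""
    scores = []
    for held_out in k_fold_indices(len(targets), k, seed):
        held_out_set = set(held_out)
        train = [targets[i] for i in range(len(targets)) if i not in held_out_set]
        # Placeholder "model" (an assumption): predict the training mean for everyone.
        prediction = sum(train) / len(train)
        mse = sum((targets[i] - prediction) ** 2 for i in held_out) / len(held_out)
        scores.append(math.sqrt(mse))
    return scores

# Hypothetical fantasy points per game for ten players (illustrative only).
points = [4.2, 11.5, 7.8, 15.1, 2.3, 9.9, 18.4, 6.0, 12.7, 3.5]

fold_scores = cross_validate_rmse(points, k=5)
print("RMSE per fold:", [round(s, 2) for s in fold_scores])
print("Mean RMSE:", round(sum(fold_scores) / len(fold_scores), 2))
```

In a hyperparameter search, the same loop would be repeated for each candidate setting, and the setting with the best average score across folds would be kept.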

The conclusions and predictions found in this report can be most beneficial to people who are looking for a competitive edge in their fantasy football league or player performance wagers.

Summary

This challenge was a great experiment in applying machine learning techniques to a real-world entertainment industry. Do you think other sports entertainment industries can benefit from predictive analytics developed through a data challenge with Ocean Protocol?

Reach out to us in our Discord #data-science-hub channel: https://discord.gg/yFRPH9PCN4

To see past, current, and future data challenges sponsored by Ocean, please visit https://oceanprotocol.com/earn/data-challenges

About Ocean Protocol

Ocean was founded to level the playing field for AI and data. Ocean tools enable people to privately & securely publish, exchange, and consume data.

Follow Ocean on Twitter or Telegram to keep up to date. Chat directly with the Ocean community on Discord. Or, track Ocean progress directly on GitHub.
