GLOBAL RESEARCH SYNDICATE

Grantchester United, Creating FPL Predictions With Data Science

by globalresearchsyndicate
September 24, 2020
in Data Analysis

Fantasy Premier League is kind of a big deal here at The18. Friendships are strained over trash talk and sometimes it seems like people put more effort into perfecting each week’s lineup than their day-to-day duties. 

When I joined The18 this year I had plenty of soccer knowledge but knew nothing about the arbitrary rules of Fantasy Premier League. But I had an idea: using the latest in data science and machine learning to derive the best possible fantasy team to conquer The18’s FPL league.

For those not in the know, Fantasy Premier League is a massive online competition where you essentially pick your best Premier League team (according to some constraints of course) and score points when your players do good things (goals, assists, clean sheets, etc.) and lose points when your players do bad things (yellow/red cards, missed penalties, a ton of goals conceded, etc.). The objective is to obtain as many points as possible on a weekly and season-long basis. Pretty simple, right?

It’s actually not. In addition to the abundance of rules set out that restrict your ability to pick the entire Liverpool or Man City squad, you have to make decisions such as who to captain (double points), who to put on the bench, and who to transfer and/or drop from your team. Lots of responsibility, I know.

Anyways, this will serve as the first installment in a blog-like series where I will chronicle my attempt to build a machine learning (or “artificial intelligence” if you want to really bring out the buzzwords) system that will dominate Fantasy Premier League. I’m coming into this entirely new to FPL, so join me for the ride as I attempt to bring a data-driven perspective to the beautiful game.

On a brief side note, perhaps the hardest part of this whole process is coming up with a witty yet not try-hard team name that provides a small chuckle yet doesn’t come across as researched. My original team name of “G-Money” understandably received a bit of blowback, so now I’m Grantchester United for the sole purpose of using my name as a pun.

Just a quick heads up — for those of you with a more technical background or who prefer to stare at hastily written Python code, the project is on Github as well.

Week 1 — Just Me and My Intuition

This was not the best of starts and perhaps even more motivation for building a system that can pick my team for me. I picked my first week’s team with absolutely zero statistical reasoning and managed to captain Bruno Fernandes (who didn’t have a game in Week 1), start Kyle Walker (also no game in Week 1) and drop a pretty penny on Firmino, who scored me a whopping one point. Time to whip out the math.

Week 2 — Doing Stuff With Numbers

Project Details and Data Exploration

A good way to kick off any data science project is locating viable sources of data — without this fun little step I should just call it quits now. In an effort to quickly produce a working minimum viable product, I’ll start with this tremendous FPL repository, which contains historical and consistently updated data sets with all the FPL statistics you can imagine. However, moving forward, I won’t restrict myself to just this data source but will also be considering any potential data that exists on the web that I believe may help in building this system.

Speaking of the system, the objective today is simply to build something that can predict total points scored for the next week.

Generalized Information System

Pretty much anything can be structured into the above paradigm (inputs go into a system, which produces outputs). Now that we have a clearly defined output (total points for next week) and a roughly defined input (data of some type), it’s time to engineer a system that can map the inputs to outputs.

So, the proposed system for this week is a linear model that predicts points for next week based solely on FPL data from the previous week. I’ll track performance in relation to a baseline model, which, in layman’s terms, is one with near-stupid reasoning that we use to compare our more sophisticated model to. Essentially, if our ML/AI model is not beating this simple baseline, why are we even going through all this trouble?
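To make "predict next week from the previous week" concrete, here is a minimal sketch of how the training examples could be assembled: each player's stats in gameweek t are paired with his points in gameweek t + 1. The row layout, field names and numbers are made up for illustration and are not the schema of the FPL repository or the code in the project.

```python
# Hypothetical per-gameweek rows; field names and values are illustrative only.
rows = [
    {"name": "Player A", "gameweek": 1, "goals_scored": 1, "total_points": 8},
    {"name": "Player A", "gameweek": 2, "goals_scored": 0, "total_points": 2},
    {"name": "Player A", "gameweek": 3, "goals_scored": 2, "total_points": 13},
    {"name": "Player B", "gameweek": 1, "goals_scored": 0, "total_points": 1},
    {"name": "Player B", "gameweek": 2, "goals_scored": 0, "total_points": 6},
]


def make_training_pairs(rows):
    """Pair each player's stats in gameweek t with his points in gameweek t + 1."""
    by_key = {(r["name"], r["gameweek"]): r for r in rows}
    pairs = []
    for (name, week), row in sorted(by_key.items()):
        following = by_key.get((name, week + 1))
        if following is None:
            continue  # no following gameweek recorded, so there is no target
        # Everything except the identifiers becomes an input feature.
        features = {k: v for k, v in row.items() if k not in ("name", "gameweek")}
        pairs.append((name, week, features, following["total_points"]))
    return pairs


pairs = make_training_pairs(rows)
for name, week, features, target in pairs:
    print(name, week, features, "->", target)
```

Note that the last recorded gameweek for each player produces no training example, because there is nothing to predict yet.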

Also, a linear model just means that the variables I use as inputs will each be multiplied by some positive or negative number before all being added up. This sum will serve as the prediction for a player’s total points in the following week (games are generally played on a weekly basis by the way).
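In code, that weighted sum looks roughly like the sketch below. The coefficient values and variable names are invented purely to show the arithmetic; they are not the fitted values from this project.

```python
# Hypothetical coefficients, chosen only to illustrate the arithmetic.
intercept = 1.0
coefficients = {"goals_scored": 0.5, "red_cards": -2.0, "ict_index": 0.25}


def predict_points(player_stats):
    """Multiply each input by its coefficient and add everything up."""
    total = intercept
    for name, weight in coefficients.items():
        total += weight * player_stats.get(name, 0.0)
    return total


last_week = {"goals_scored": 2, "red_cards": 0, "ict_index": 8.0}
print(predict_points(last_week))  # 1.0 + 0.5*2 + (-2.0)*0 + 0.25*8.0 = 4.0
```

A positive coefficient pushes the prediction up as its variable grows, while a negative one (like the made-up red card weight here) pulls it down.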

Before we get into predictions, let’s first take a look at the data and see what insights it holds. For clarity, data is provided for each player and includes notable variables such as:

  • Player Name
  • Bonus Points
  • Clean Sheet
  • Creativity Score
  • Influence Score
  • Threat Score
  • Goals Conceded
  • Goals Scored
  • Assists
  • Own Goals
  • ICT Index Score
  • Penalties Saved
  • Penalties Missed
  • Red Cards
  • Number Selected by
  • Total Points in the Week Before
  • Transfers In
  • Transfers Out
  • Value
  • Home/Away Game

We can visualize the relationships in the data with a correlation matrix. Values near 1 mean two variables are strongly positively correlated, values near -1 mean they are strongly negatively correlated, and values around 0 indicate little or no linear relationship.
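For anyone curious what sits behind such a matrix, here is a minimal sketch that computes Pearson correlations between every pair of columns. The three columns and their values are made up, and it assumes every column varies at least a little (a constant column would cause a division by zero).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(data):
    """Correlation of every column with every other column, as nested dicts."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


# Made-up columns, one value per player.
data = {
    "ict_index": [2.0, 4.0, 6.0, 8.0],
    "red_cards": [1, 0, 0, 0],
    "next_week_points": [1, 3, 5, 7],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Reading along the `next_week_points` row (or column) shows which inputs move most closely with the thing we want to predict.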

The above image is giving me a seizure, so for the sake of my health and in order to get a better feel for the variable we’re predicting, let’s look at scatter plots of the four variables most correlated with total points earned in the following week (the y-axis in all the plots).

Although nothing here is overly eye-catching, it’s still interesting to see how different variables relate to points in the following week. The ICT Index, which is the second-most highly correlated input variable, is a metric designed by Fantasy Premier League to give insight into a player’s value and is essentially just an aggregation of three other scores: Influence, Creativity and Threat.

Model Building

Summary statistics are cool and all I guess, but the real fun lies in predicting stuff. The details of the model fitting process can be found in the Github repository, but I found the best model (so far) to be a simple Multiple Linear Regression using the full set of variables. Linear Regression just finds some coefficient value for every input variable, so we can loosely think of the coefficient values as measures of “importance” (loosely, because the size of a coefficient also depends on the scale of its variable). This model beats the two simple baselines I put out: predicting last week’s points as next week’s points, and predicting the average points across all players as the point prediction for every individual player.
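As a rough illustration of what "finding a coefficient for every input variable" means, here is a small ordinary least squares fit written from scratch with the normal equations. This is a sketch, not the code from the repository (which would more likely lean on a library), and the five training rows are made up so that the true relationship is points = 2 + 1 × goals - 3 × red cards.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    X = [[1.0] + list(row) for row in features]  # leading 1.0 is the intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, targets)) for i in range(k)]
    return solve(xtx, xty)


# Made-up rows: (goals scored, red cards) last week -> points next week.
features = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)]
targets = [2.0, 3.0, 4.0, -1.0, 0.0]

weights = fit_linear_regression(features, targets)
print([round(w, 6) for w in weights])  # intercept, goals coefficient, red card coefficient
```

Because the toy data follows the relationship exactly, the fit recovers it; real FPL data is far noisier, so the coefficients there are only a best compromise across all players.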

Now that it’s been proven to capture at least some insight, let’s take a look at the coefficients (aka “importances”): 

 

A lot of this is pretty intuitive. For example, red cards have a large negative coefficient value and are the most influential variable in predicting next week’s points. If a player gets a red card in the prior week, he’ll obviously score zero points the next week because he is suspended and can’t play. Similarly, missing a penalty or having one saved in the prior week negatively impacts points in the following week. This could be due to a player being benched after having whiffed a PK, or perhaps the player’s confidence is just a lot lower after a game where he missed a penalty or two. Interestingly, total points in the previous week are not really influential in the prediction of points in the following week.

Not to get too deep, but this might actually say something about human psychology as well. It would seem as if bad events (missing penalties and own goals) have much more of an impact on future performance than positive events (such as goals scored and assists). As a player, and especially a player in arguably the best league on the planet, confidence is an extremely fragile thing. It’s not hard to imagine how a bad game could have a lasting impact on the next week (or even further).

Predicting in the 2020-2021 Season

Given my overly simplified model, let’s start applying this freshly trained thing to this year’s data. I’ll use the first week of the 2020-21 season to predict this past week’s points. The input for this model is the data from the first game week of this EPL season, and I’m attempting to predict the points for Week 2 (which already happened but the model doesn’t know that).

First off, my model has a mean absolute error of about 1.4 points, while a simple baseline that predicts a player’s Week 1 points as his Week 2 points misses the mark by roughly 1.7 points on average. Although not an astounding difference, it’s still nice to know that the model has “learned” something from the data.
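The comparison itself is simple to sketch: take the absolute difference between each prediction and the actual score, then average. The points and predictions below are invented for four imaginary players and will not reproduce the 1.4 and 1.7 figures from the real data.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the miss, ignoring whether it was too high or too low."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers for four players.
week1_points = [2, 8, 1, 6]
week2_points = [3, 5, 2, 2]
model_predictions = [3.0, 4.0, 2.5, 3.5]  # hypothetical model output for Week 2

# Baseline 1: assume every player repeats his Week 1 score.
last_week_baseline = week1_points
# Baseline 2: predict the Week 1 average across all players for everyone.
average = sum(week1_points) / len(week1_points)
average_baseline = [average] * len(week2_points)

model_mae = mean_absolute_error(week2_points, model_predictions)
last_week_mae = mean_absolute_error(week2_points, last_week_baseline)
average_mae = mean_absolute_error(week2_points, average_baseline)

print("model:", model_mae)
print("last week baseline:", last_week_mae)
print("average baseline:", average_mae)
```

A model is only worth keeping if its error comes out lower than both baselines on data it has not seen during training.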

Take a look at my predictions compared to the actual points scored:

We can see that the model is generally predicting in the right direction, but is lacking some nuance and not capturing all the fluctuations. Honestly, this could turn out to be an impossible task because there are so many factors that we can’t account for and the game of soccer is largely a matter of chance and luck. However, I’m going to press forward in an effort to unlock the mystery that is the Fantasy Premier League.

In Conclusion

Google definitely won’t be knocking at my door anytime soon with this approach, but nevertheless I feel like I’m walking away with at least a bit more intuition for what’s going on. 

In the future, I plan on adding in variables like opponent difficulty and recent performances and potentially predicting things like monthly total points instead of a single week at a time since FPL doesn’t allow unlimited transfers. Maybe I’ll also ramp up the sophistication and borrow concepts from Game Theory, Optimization and Reinforcement Learning to find optimal playing strategies. I could also just get bored or receive a ton of blowback from this and just drop my FPL dreams altogether, so who knows really. 

Thanks for following along — see ya next time.
