Using Machine Learning to predict playoff likelihood of any NFL team based on their salary allocation

maxsealey/NFL-MachineLearning-Capstone

NFL Playoff Prediction Machine Learning Application

Senior Capstone Project by Max Sealey

(B.S. in Computer Science conferred 06/26/24)


Table of Contents

  1. Project Overview
    • Project Requirements
    • Project Proposal
    • Machine Learning Overview
    • Future Improvements
    • Tech Stack
      • Languages
      • Libraries
      • Dataset
      • IDE
  2. Local Environment Setup Instructions
  3. Command Line Interface (CLI) Version
  4. Graphical User Interface (GUI) Version (in progress)

Project Overview

Project Requirements


Design and develop a fully functional data product (application) addressing your identified business problem or organizational need.

The deliverables include the application and a written report, also located in this repository. The report contains a Letter of Transmittal to Commissioner Goodell, a project proposal plan, and a post-implementation report.

Data Methods – provide one descriptive method that discerns relationships and characteristics of the past data in at least three forms of visualization. Also, provide one non-descriptive method from which a decision or trend could be inferred. The descriptive method should be in the domains of cluster or association analysis, and the non-descriptive method could be a pruning algorithm, discriminant analysis, regression analysis (linear, logistic), Bayesian methods, a neural network, or support vector machines.
Datasets – The use of dataset(s) is a critical element and involves the gathering and measuring of information on targeted variables in a systematic fashion. This could be student collected (Please consider IRB ramifications.) or publicly accessible such as websites (e.g. Kaggle.com), governmental (e.g. Department of Labor), or software related (e.g. GitHub.com).
Analytics – Using the given data, your application needs to enable decisions to be formulated or support for given trends to be provided.
Data Cleaning – if applicable, create a function that will make the data usable prior to actually being used by the application. This includes tasks such as featuring, parsing, cleaning, and wrangling the datasets.
Data Visualization – You need at least three real-time (e.g. using the GUI/dashboard) formats to visualize the data in a graphic format. Look at things like charting, mapping, color theory, plots, diagrams, or other methods (tables must include heat mapping).
Real-Time Queries – As part of your GUI enable users to access and manipulate data real-time including data maintenance. This does not deal with data “freshness” but with the query response time being in seconds.
Adaptive Element – if appropriate for the business need, provide the implementation of machine-learning methods and algorithms to enable the application to improve with experience.
Outcome Accuracy – provide functionalities that evaluate the accuracy of the information/outcomes given by the application. What are the parameters for valid output data and how will those be checked by the application?
Dashboard – include a user-friendly, functional dashboard that enables the query and display of the data, as well as other functionality described in this section. This could be stand-alone, CLI, Web-based, or a mobile application interface.

Project Proposal


My application aims to utilize machine learning to assist the NFL in predicting the playoff likelihood of any NFL team based on how they allocate their salaries by position. The dataset consists of salary cap data from 2013-2022.
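To illustrate what "salary allocation by position" means in practice, here is a minimal sketch of how allocation percentages could be derived from position-level cap figures. The cap numbers, the position groupings, and the names `cap_by_position`, `allocation_percentages`, `qb_pct`, `offense_pct`, and `defense_pct` are all assumptions made for illustration; they are not taken from the project's dataset or code.

```python
# Hypothetical cap figures (in millions of dollars) for one made-up team-season.
# These numbers are illustrative only and do not come from the project's dataset.
cap_by_position = {
    "QB": 30.0,
    "RB": 10.0,
    "WR": 25.0,
    "OL": 35.0,
    "DL": 40.0,
    "LB": 30.0,
    "DB": 30.0,
}

# Assumed grouping of positions into offense and defense.
OFFENSE = {"QB", "RB", "WR", "OL"}
DEFENSE = {"DL", "LB", "DB"}


def allocation_percentages(cap):
    """Return the share of total cap spent on QB, offense, and defense (in percent)."""
    total = sum(cap.values())

    def share(positions):
        spent = sum(value for pos, value in cap.items() if pos in positions)
        return 100.0 * spent / total

    return {
        "qb_pct": share({"QB"}),
        "offense_pct": share(OFFENSE),
        "defense_pct": share(DEFENSE),
    }


features = allocation_percentages(cap_by_position)
print(features)
```

Note that the QB share is also counted inside the offense share in this sketch; whether the real dataset treats it that way is not stated here.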

Machine Learning Overview


The ML model chosen is a Random Forest classifier, a supervised learning method. The model uses three features: the percentage of the cap allocated to the QB position, the percentage allocated to the offense as a whole, and the percentage allocated to the defense as a whole. I split the data into training and testing subsets (70/30 split) and fit a RandomForestClassifier imported from scikit-learn on the training subset. I then had the model make predictions on random samples of the testing data; the results were stored and used to compute the accuracy score, classification report, and confusion matrix.
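The workflow described above can be sketched roughly as follows. This is not the project's actual code: the data is synthetic random noise standing in for the real salary cap dataset, and the feature ordering, `random_state` values, and `n_estimators=100` are assumptions chosen only to make the example reproducible.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic placeholder data, NOT the real salary cap dataset.
rng = np.random.default_rng(seed=0)
n_rows = 320
X = np.column_stack([
    rng.uniform(0.00, 0.20, n_rows),  # share of cap spent on QB (hypothetical)
    rng.uniform(0.35, 0.60, n_rows),  # share of cap spent on offense (hypothetical)
    rng.uniform(0.30, 0.55, n_rows),  # share of cap spent on defense (hypothetical)
])
y = rng.integers(0, 2, n_rows)  # 1 = made the playoffs, 0 = missed (random labels)

# 70/30 train/test split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Fit the classifier on the training subset only.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on the held-out data and compute the three evaluation outputs.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, labels=[0, 1], zero_division=0)
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"Accuracy: {accuracy:.3f}")
print(report)
print(matrix)
```

Because the labels here are random, the accuracy printed by this sketch carries no meaning; it only shows where each metric comes from.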

Future Improvements


The main improvement I want to make is to increase the accuracy of the application, which will likely involve introducing more data and adjusting the parameters of the ML model.

Other improvements include:
  1. Switching to a Regression model
  2. Developing the project into a web application (frontend UI and backend API)
  3. Introducing additional ensemble methods
  4. Adding more methods to monitor reliability
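One possible way to approach "adjusting the parameters of the ML model" is a cross-validated grid search with scikit-learn's GridSearchCV. The sketch below is only an assumption about how that could look: the data is synthetic, and the parameter grid (`n_estimators`, `max_depth`) and `cv=3` are arbitrary choices for illustration, not settings used by this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data, NOT the real salary cap dataset.
rng = np.random.default_rng(seed=1)
X = rng.uniform(0.0, 1.0, size=(200, 3))  # three hypothetical allocation features
y = rng.integers(0, 2, size=200)          # random playoff labels (0 or 1)

# Small, arbitrary grid of candidate hyperparameters.
param_grid = {
    "n_estimators": [10, 50],
    "max_depth": [3, None],
}

# Evaluate every combination with 3-fold cross-validation on accuracy.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```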

Tech Stack


Languages: Python, SQL (database)

Libraries: Pandas, Scikit-learn, Matplotlib, NumPy, Seaborn, SQLite

Dataset: NFL Salary Cap Spending 2013-2022

IDE: PyCharm 2023.1.12 (Community)

Local Environment Setup Instructions


These instructions assume that git is installed on your computer and that you have a basic knowledge of git and terminal navigation.

1. Clone the repository to your local machine.
git clone <repository SSH URL>
2. Open in your chosen IDE. I recommend PyCharm since that is what was used to develop this program.

3. Install 'pip' if you don't already have it.

4. Navigate to the project directory and run the following command:
pip install scikit-learn matplotlib numpy seaborn pandas
5. Run the program by executing main.py (for example, python main.py).
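If you want to confirm that step 4 succeeded before running the program, a small check like the following may help. It is a hypothetical helper, not part of this repository; the names `REQUIRED` and `missing_packages` are made up for this sketch. It relies only on the standard library's `importlib.util.find_spec`, and note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the libraries installed in step 4.
# scikit-learn is installed as "scikit-learn" but imported as "sklearn".
REQUIRED = ["sklearn", "matplotlib", "numpy", "seaborn", "pandas"]


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```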

Command Line Interface (CLI)


Welcome, and thank you for using my program.

View the classification report, accuracy score, and confusion matrix.
[Screenshot: Classification Report & Accuracy Score]
[Screenshot: Confusion Matrix]

The program displays the Main Menu for user interaction:
Option 1: Make a prediction
Option 2: Pie Chart Visualization
[Screenshot: Pie Chart Example]
Option 3: Bar Chart Visualization
[Screenshot: Bar Chart Example]
Option 4: Line Graph Visualization
[Screenshot: Line Graph Example]
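For readers curious how a menu like the one above might be wired up, here is a minimal sketch of a menu loop. It is not the code in main.py: the function names `handle_choice` and `main_menu`, the `q` quit key, and the prompt text are all assumptions, and the options only echo their labels instead of running a prediction or drawing a chart.

```python
# Menu labels taken from the options listed above.
MENU_OPTIONS = {
    "1": "Make a prediction",
    "2": "Pie Chart Visualization",
    "3": "Bar Chart Visualization",
    "4": "Line Graph Visualization",
}


def handle_choice(choice):
    """Return the label for a valid menu choice, or None if it is invalid."""
    return MENU_OPTIONS.get(choice.strip())


def main_menu(read=input, write=print):
    """Show the menu until the user enters 'q'; return the labels selected.

    The read and write callables are injectable so the loop can be tested
    without a real terminal. The 'q' quit key is an assumption of this sketch.
    """
    selected = []
    while True:
        for key, label in MENU_OPTIONS.items():
            write(f"Option {key}: {label}")
        choice = read("Choose an option (q to quit): ")
        if choice.strip().lower() == "q":
            return selected
        label = handle_choice(choice)
        if label is None:
            write("Invalid option, please try again.")
        else:
            write(f"Selected: {label}")
            selected.append(label)
```

Calling `main_menu()` with no arguments would run the loop interactively in a terminal; in a real program each option would call the corresponding prediction or plotting function instead of just reporting the selection.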
