A comprehensive Jupyter notebook project that uses Support Vector Machines (SVM) for the classification of breast tumors into malignant or benign categories. The notebook includes data exploration, visualization, model training, and evaluation, providing insights into breast cancer diagnosis using machine learning.

mouraffa/Cancer_Classification_SVM

Breast Cancer Classification using SVM

Project Overview

This repository contains a Jupyter notebook that classifies breast tumors as malignant or benign using a Support Vector Machine (SVM). The notebook walks through the stages of a machine learning pipeline: data exploration, visualization, model training, and evaluation.

Dataset

The dataset comes from the UCI Machine Learning Repository (the Breast Cancer Wisconsin (Diagnostic) dataset). It consists of 569 samples, each with 30 features. The features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass and describe characteristics of the cell nuclei present in the image.
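To make the dataset description concrete, here is a minimal loading sketch. It assumes scikit-learn's bundled copy of the same UCI data (`load_breast_cancer`); the notebook itself may load the data differently, for example from a CSV file.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: use scikit-learn's bundled copy of the Breast Cancer Wisconsin
# (Diagnostic) data rather than whatever file the notebook reads.
data = load_breast_cancer(as_frame=True)
X = data.data    # DataFrame with one row per sample and one column per feature
y = data.target  # Series; in scikit-learn's encoding 0 = malignant, 1 = benign

print(X.shape)                  # (569, 30)
print(list(data.target_names))  # ['malignant', 'benign']
print(y.value_counts())         # number of samples per class
```

Note that the label encoding (0 for malignant, 1 for benign) is specific to scikit-learn's copy; a raw download from UCI uses the letters M and B instead.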

Steps Covered

  1. Problem Statement: Understanding the objective of the project.
  2. Data Importing: Loading necessary libraries and the dataset.
  3. Data Visualization: Exploring data distributions and relationships through plots.
  4. Model Training: Using the SVM algorithm to train the model on the dataset.
  5. Model Evaluation: Assessing the model's accuracy and performance on test data.
  6. Model Improvement: Suggestions and steps to improve the model's accuracy.
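The training, evaluation, and improvement steps above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's exact code: the 80/20 stratified split, the default RBF-kernel `SVC`, and feature standardization as the improvement step are all choices made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Data importing (assumption: scikit-learn's bundled copy of the dataset).
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the samples for testing; stratify to keep class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Model training: a baseline SVM on the raw, unscaled features.
baseline = SVC()
baseline.fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)

# Model improvement (one common option): standardize features before the SVM.
# The scaler is fitted on the training split only, inside the pipeline.
scaled = make_pipeline(StandardScaler(), SVC())
scaled.fit(X_train, y_train)
scaled_acc = scaled.score(X_test, y_test)

# Model evaluation on the held-out test data.
y_pred = scaled.predict(X_test)
print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"scaled accuracy:   {scaled_acc:.3f}")
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=["malignant", "benign"]))
```

SVMs are sensitive to feature scale, and the 30 features here span very different ranges, so scaling is often the first thing worth trying. Tuning `C` and `gamma` (for example with `GridSearchCV`) is another common next step.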

Tools and Libraries

  • Python: The project is entirely written in Python.
  • Pandas: For data manipulation and analysis.
  • NumPy: For numerical operations.
  • Matplotlib & Seaborn: For data visualization.
  • Scikit-learn: For implementing the SVM algorithm and other related machine learning operations.
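As a rough illustration of how these libraries fit together for the visualization step, the sketch below plots the class balance and one pair of features. The choice of plots, the feature pair, and the output filename `breast_cancer_overview.png` are assumptions for this example, not taken from the notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy of the dataset, loaded as a DataFrame.
data = load_breast_cancer(as_frame=True)
df = data.frame  # 30 feature columns plus a numeric "target" column
df["diagnosis"] = df["target"].map({0: "malignant", 1: "benign"})

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Left panel: how many samples fall into each class.
sns.countplot(data=df, x="diagnosis", ax=axes[0])
axes[0].set_title("Class balance")

# Right panel: two features plotted against each other, colored by class.
sns.scatterplot(
    data=df, x="mean radius", y="mean texture",
    hue="diagnosis", alpha=0.7, ax=axes[1],
)
axes[1].set_title("Mean radius vs. mean texture")

fig.tight_layout()
fig.savefig("breast_cancer_overview.png", dpi=100)  # hypothetical output filename
```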

How to Use

  1. Clone this repository to your local machine.
  2. Ensure you have the required libraries installed. You can install them using pip:
       pip install pandas numpy matplotlib seaborn scikit-learn
  3. Open the Jupyter notebook to view and run the project.

Future Enhancements

  • Implement other classification algorithms to compare performance.
  • Explore feature engineering to improve accuracy.
  • Use techniques like cross-validation for more robust model evaluation.
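Two of the enhancements above, cross-validation and comparing against another classifier, could look roughly like this. It is a sketch under assumptions: the 5-fold stratified split and the choice of logistic regression as the comparison model are illustrative only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled copy of the dataset.
X, y = load_breast_cancer(return_X_y=True)

# 5-fold stratified cross-validation; each fold keeps the class proportions.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Candidate models; scaling lives inside each pipeline so it is refitted on
# the training part of every fold and never sees the validation part.
candidates = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC()),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = scores
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")
```

Reporting the mean and spread across folds gives a more stable estimate than a single train/test split, which can look better or worse depending on which samples land in the test set.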

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
