
Wine Quality Prediction with Machine Learning

  • Writer: Priank Ravichandar
  • Sep 16, 2024
  • 1 min read

Assigning scores to wines by implementing machine learning algorithms to evaluate the factors influencing wine quality.




Summary of Machine Learning

Dataset

The data was downloaded from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wine+quality
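
As a rough illustration of reading the downloaded data, the sketch below parses a wine file with Python's standard `csv` module. It assumes the files are semicolon-delimited with a header row and an integer `quality` column. The helper name `load_wine_rows` and the in-memory sample rows are made up for this example and are not taken from the original project.

```python
import csv
import io


def load_wine_rows(handle):
    """Parse a semicolon-delimited wine file into a list of dicts.

    Every column is read as a float, except `quality`, which is kept
    as an integer score.
    """
    reader = csv.DictReader(handle, delimiter=";")
    rows = []
    for record in reader:
        row = {name: float(value) for name, value in record.items()}
        row["quality"] = int(row["quality"])
        rows.append(row)
    return rows


# Made-up rows in the assumed layout, standing in for a downloaded file.
SAMPLE = io.StringIO(
    '"fixed acidity";"alcohol";"quality"\n'
    "7.0;9.5;5\n"
    "6.5;11.2;6\n"
)

rows = load_wine_rows(SAMPLE)
print(len(rows), "rows loaded")
```

To read a real file, the same function could be called with `open(path, newline="")` in place of the in-memory sample.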


The two datasets are related to red and white variants of the Portuguese “Vinho Verde” wine. Find more details here.


These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.
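
The class imbalance mentioned above can be checked before any model is trained. The following is a minimal sketch that computes the share of each quality score. The function name `quality_distribution` and the list of scores are illustrative assumptions, not values from the actual datasets.

```python
from collections import Counter


def quality_distribution(scores):
    """Return the share of wines at each quality score, ordered by score."""
    counts = Counter(scores)
    total = len(scores)
    return {score: counts[score] / total for score in sorted(counts)}


# Made-up scores: most wines sit in the middle, very few at the extremes.
scores = [5, 5, 6, 6, 6, 6, 7, 5, 6, 3]

dist = quality_distribution(scores)
for score, share in dist.items():
    print(f"quality {score}: {share:.0%}")
```

A distribution where one or two scores dominate is a hint that plain accuracy may hide poor performance on the rare classes.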


Goal

The objective is to classify the data into the various quality score categories. Three machine learning models will be trained and tested to determine which will yield the best results:

  • Support Vector Machines (SVM)

  • Decision Trees

  • Random Forest
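
A comparison of the three models listed above could be set up along the following lines with scikit-learn. This is a sketch under stated assumptions, not the original project's code: the data is synthetic (generated with `make_classification` as a stand-in for the 11 wine features), and the split ratio, the scaling step for the SVM, and all hyperparameters are choices made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the wine data: 11 features, 3 imbalanced classes.
X, y = make_classification(
    n_samples=600,
    n_features=11,
    n_informative=6,
    n_redundant=2,
    n_classes=3,
    weights=[0.2, 0.6, 0.2],
    random_state=0,
)

# Stratify so each split keeps the same class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    # SVMs are sensitive to feature scale, so standardize first.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

for name, value in sorted(accuracy.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.3f}")
```

Scoring on a held-out split keeps the comparison fair, because no model is evaluated on rows it was fitted on.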


Tools

Python, Support Vector Machines (SVM), Decision Tree, Random Forest


Insights

Given a random sample of 100 wines from each dataset, the Random Forest model performed best on the filtered wine dataset across all quality categories. This is expected, since the model was trained on a subset of that data.
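
The kind of check described above, drawing 100 wines at random and scoring predictions per quality category, might look like the sketch below. The `(actual, predicted)` pairs are simulated with a seeded random generator purely for illustration, and `per_class_accuracy` is a hypothetical helper, not a function from the original project.

```python
import random
from collections import defaultdict


def per_class_accuracy(actual, predicted):
    """Return the fraction of correct predictions for each actual score."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(actual, predicted):
        totals[truth] += 1
        hits[truth] += int(truth == guess)
    return {score: hits[score] / totals[score] for score in sorted(totals)}


rng = random.Random(0)

# Simulated (actual, predicted) pairs standing in for a scored dataset:
# the prediction matches the actual score about 80% of the time.
population = [
    (score, score if rng.random() < 0.8 else score + 1)
    for score in rng.choices([4, 5, 6, 7, 8], k=1000)
]

# Draw 100 wines without replacement and score them per category.
sample = rng.sample(population, 100)
actual, predicted = zip(*sample)
report = per_class_accuracy(actual, predicted)

for score, value in report.items():
    print(f"quality {score}: {value:.2f}")
```

Reporting accuracy per category rather than as a single number makes it visible when a model does well only on the most common scores.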


The model excludes wines with quality scores of 3 and 9. These wines can be treated as outliers, since there are too few examples of them to predict those scores accurately with a machine learning model.
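
Excluding the rarest scores before training could be done with a simple filter like the one below. This is a minimal sketch: the name `filter_rare_scores` and the example rows are assumptions made for illustration, and only the excluded scores (3 and 9) come from the text above.

```python
EXCLUDED_SCORES = {3, 9}  # scores with too few examples, per the text above


def filter_rare_scores(rows, excluded=EXCLUDED_SCORES):
    """Drop rows whose quality score is in `excluded`.

    Returns the kept rows and the number of rows that were dropped.
    """
    kept = [row for row in rows if row["quality"] not in excluded]
    return kept, len(rows) - len(kept)


# Made-up rows for illustration only.
wines = [
    {"alcohol": 9.4, "quality": 5},
    {"alcohol": 12.1, "quality": 9},
    {"alcohol": 10.0, "quality": 6},
    {"alcohol": 8.9, "quality": 3},
]

filtered, dropped = filter_rare_scores(wines)
print(f"kept {len(filtered)} wines, dropped {dropped}")
```

Returning the dropped count alongside the kept rows makes it easy to confirm how little data the exclusion actually removes.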


Overall, the Random Forest model predicts wine quality well, regardless of the type of wine.
