Supervised Learning

What supervised learning means, how to build a model, and the most common types you’ll come across.

What is Supervised Learning?

The goal is for the model to learn a relationship or mapping from inputs to outputs, so it can accurately predict the label when given new, unseen inputs.

For example, if you’re building a model to predict whether someone will buy a product, the training data might include previous user interaction data, including features like:

Age
Gender
Income browsing history
Along with a label showing whether that person actually made a purchase

Once trained, the model can be given a new user (with their age, income, and browsing behavior), and it will output a prediction on how likely that user is to buy the product.

Basic Steps for Creating a Supervised ML Model:

Data Cleansing and Preparation
1. Start by handling missing values, removing errors or outliers, and transforming raw inputs into useful features. (See the Data Preprocessing page for details.)

Split your data into training and testing sets
1. Create a data partition that uses 70%-80% of the data for training and 20-30% for testing.
2. Even better, use cross validation to get a stronger sense of how your model generalizes on the whole dataset.

Train your chosen model on the training data
1. Choose a suitable supervised learning algorithm and fit it to your training data so it can learn the relationships between the different features and labels.

Evaluate the Model on the Testing Data
1. Use your trained model to make predictions on the test set and evaluate its performance using metrics like accuracy, AUC, or F1 score. (For more on this, see the Model Evaluation page.)

Tune and Improve the Model
1. To improve your model, you can tweak its hyperparameters, try out different types of models, or remove features that aren’t helpful.
2. The goal is to ensure your model performs well on new, unseen data—not just the training data—so it doesn’t overfit or become too tailored to the specific patterns in the training set