Decision Trees Explained
A concise overview of decision trees as a supervised learning classification algorithm.
Understanding Decision Trees
A decision tree is a classification model that predicts a class for a data point by walking through a series of questions about its known features.
For example, if we want to predict whether a user will purchase the new iPhone, the decision tree will ask a series of questions about the person’s attributes—such as age, income, or whether they already own an iPhone, and use those answers to reach a decision.
This process is based on patterns learned from training data that show which types of users have purchased an iPhone in the past.


Creating a Decision Tree
We start out by calculating which attribute has the highest information gain, meaning, which attribute is most predictive of whether or not someone will buy an iPhone (Outcome #1).
In this case, it is the attribute: owns an Iphone — this will become our root node.




Because every person who currently owns an iPhone went on to purchase the new one in our training data, we can conclude that, according to the model, there is a 100% likelihood that someone who owns an iPhone will buy the new one.
On the other hand, since only 1 out of 4 people who don’t own an iPhone ended up buying the new one, this group shows mixed outcomes.
That means there’s still uncertainty, so the decision tree needs to keep splitting—asking more questions to better separate the results and improve the prediction.
Next, we split the data again using the feature with the second-highest information gain: age.




Since everyone who doesn’t own an iPhone and is over 35 chose not to buy the new one, and the one person under 35 did, this split results in pure groups. That means the decision tree reaches its final nodes here: if a user doesn’t own an iPhone and is over 35, predict they won’t buy the new one; if they’re under 35, predict that they will.
In larger datasets, this process can continue across many more features and layers, allowing the tree to capture increasingly complex patterns in the data before arriving at a final decision.
Connect with Me!
LinkedIn: caroline-rennier
Email: caroline@rennier.com