2.8.3 Decision Trees

🎯 Learning objectives

You will be able to

  • interpret a decision tree and translate it into if-conditions and vice versa
  • explain the ID3 algorithm as pseudocode
  • 🤓 calculate the information gain of a data split

🧠 Example: The Decision to play tennis

  • The knowledge we got from the data is captured in a tree structure
  • You can read the tree as a set of if-conditions
    if outlook == "Overcast":
        play_tennis = True
    
  • easy for humans to interpret
https://www.cs.cmu.edu/afs/cs/project/theo-20/www/mlbook/ch3.pdf
  • Each internal node tests an attribute (predictor, feature, column)
  • Each branch corresponds to an attribute value (e.g., a category such as Sunny)
  • Each leaf node (terminal node) assigns a classification (e.g., No/Yes)
https://www.cs.cmu.edu/afs/cs/project/theo-20/www/mlbook/ch3.pdf

✍️ Task

  • What is the predicted variable classified in this tree?
  • What are the predictors (features in $X$) used in this tree?
  • How would a table with the predictors and the predicted variable look? Create a table with one observation for each leaf.
  • Write down the decision tree as an if-condition in Python
https://www.cs.cmu.edu/afs/cs/project/theo-20/www/mlbook/ch3.pdf

🧠 Task

  • What is the predicted variable classified in this tree?

    • Decision whether to play tennis (binary classification)
  • What are the predictors used in this tree?

    • Weather outlook, Humidity, Wind
  • Example Data

| Outlook  | Humidity | Wind   | Play? |
|----------|----------|--------|-------|
| Sunny    | High     | -      | No    |
| Sunny    | Normal   | -      | Yes   |
| Overcast | -        | -      | Yes   |
| Rain     | -        | Strong | No    |
| Rain     | -        | Weak   | Yes   |
  • 🧠 How would you implement the Wind node in Python?
    if outlook == "Sunny":
        if humidity == "High":
            play = "No"
        elif humidity == "Normal":
            play = "Yes"
    elif outlook == "Overcast":
        play = "Yes"
    elif outlook == "Rain":
        if wind == "Strong":
            play = "No"
        elif wind == "Weak":
            play = "Yes"
    
Decision Trees also work with continuous features

Decision Boundaries

  • we want to classify the red and blue dots based on two continuous features $x_1$ and $x_2$
  • decision boundaries split the data based on the feature values; a tree produces axis-parallel boundaries (a plotting sketch follows below)

https://towardsdatascience.com/decision-tree-models-934474910aec
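A hedged sketch of how such a plot could be produced (toy data; `DecisionBoundaryDisplay` requires scikit-learn ≥ 1.1 and matplotlib):

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.inspection import DecisionBoundaryDisplay
    from sklearn.tree import DecisionTreeClassifier

    # toy 2-D points, class 0 = blue, class 1 = red
    X = np.array([[1, 1], [2, 1], [1, 2], [4, 4], [5, 4], [4, 5]])
    y = [0, 0, 0, 1, 1, 1]

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

    # trees yield axis-parallel decision boundaries
    DecisionBoundaryDisplay.from_estimator(tree, X, response_method="predict")
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr")
    plt.show()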

🧠 How to Build a Decision Tree from Data

  • ID3 algorithm (by J. Ross Quinlan)
  • top-down
  • greedy approach (compare gradient descent)
    • greedy heuristics only optimize the next decision, so they often end up in local optima
1. Select the best attribute (feature in X) to split the data → A
2. Assign A as the decision attribute (test case) for the node
3. For each value (category) of A, create a new descendant of the node
4. Sort the training examples to the appropriate descendant nodes
5. If the examples are perfectly classified, stop; else iterate over the new leaf nodes (a Python sketch follows below)
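A minimal recursive sketch of ID3, assuming categorical features given as rows of values; `entropy`, `information_gain`, and `id3` are illustrative names, not library functions:

    import math
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (in bits) of a list of class labels."""
        n = len(labels)
        return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(rows, labels, attr):
        """Entropy reduction achieved by splitting on column index attr."""
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        after = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - after

    def id3(rows, labels, attrs):
        """Build a tree of nested dicts; leaves are class labels."""
        if len(set(labels)) == 1:          # examples perfectly classified -> stop
            return labels[0]
        if not attrs:                      # no attributes left -> majority vote
            return Counter(labels).most_common(1)[0][0]
        best = max(attrs, key=lambda a: information_gain(rows, labels, a))
        branches = {}
        for value in {row[best] for row in rows}:
            keep = [i for i, row in enumerate(rows) if row[best] == value]
            branches[value] = id3([rows[i] for i in keep],
                                  [labels[i] for i in keep],
                                  [a for a in attrs if a != best])
        return {best: branches}

    # e.g. id3([["Sunny", "Weak"], ["Rain", "Strong"]], ["No", "Yes"], [0, 1])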
🧠 What is the best attribute?
  • the one with the highest information gain
  • a measure that expresses how well an attribute splits the data into groups based on the classification
  • higher purity regarding the classes, i.e., less information entropy within the classes (see the formula below)
https://www.hackerearth.com/practice/machine-learning/machine-learning-algorithms/ml-decision-tree/tutorial/
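In symbols, with entropy $H$ as defined on the next slide, $S$ the examples at the node, and $S_v$ the subset where attribute $A$ takes the value $v$:

$$\operatorname{Gain}(S, A) = H(S) - \sum_{v \in \operatorname{values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$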

🤓 Information Entropy

  • the entropy of a random variable is the average level of information, surprise, or uncertainty inherent to the variable's possible outcomes

  • Given a discrete random variable $X$, which takes values in the alphabet $\mathcal{X}$ and is distributed according to $p: \mathcal{X} \to [0, 1]$:

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x)$$

Here's a great explanation: https://www.youtube.com/watch?v=b6VdGHSV6qg
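The definition translates directly into Python (a small sketch; the name `entropy` is illustrative):

    import math

    def entropy(probs):
        """Shannon entropy in bits of a discrete distribution."""
        return -sum(p * math.log2(p) for p in probs if p > 0)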
🤓 Example: Tossing a fair Coin
  • We have two possible outcomes in $\mathcal{X}$: heads ($H$) and tails ($T$). For a fair coin both have a probability of $\frac{1}{2}$
  • a fair coin toss has the maximum level of surprise: $H(X) = -\left(\frac{1}{2}\log_2\frac{1}{2} + \frac{1}{2}\log_2\frac{1}{2}\right) = 1$ bit
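Checking with the sketch from above:

    entropy([0.5, 0.5])   # 1.0 bit, the maximum for two outcomes
    entropy([0.9, 0.1])   # ≈ 0.47 bits, a biased coin is less surprising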
  • We can use information entropy as a metric to find the attribute to split on that yields the highest information gain (the highest purity of the classes)
🤓 Example: The Data
  • We want to predict whether people play based on the other features:
| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 1   | Sunny   | Hot         | High     | Weak   | No    |
| 2   | Cloudy  | Hot         | High     | Weak   | Yes   |
| 3   | Sunny   | Mild        | Normal   | Strong | Yes   |
| 4   | Cloudy  | Mild        | High     | Strong | Yes   |
| 5   | Rainy   | Mild        | High     | Strong | No    |
| 6   | Rainy   | Cool        | Normal   | Strong | No    |
| 7   | Rainy   | Mild        | High     | Weak   | Yes   |
| 8   | Sunny   | Hot         | High     | Strong | No    |
| 9   | Cloudy  | Hot         | Normal   | Weak   | Yes   |
| 10  | Rainy   | Mild        | High     | Strong | No    |

🤓 Splitting on Wind: Before the Split

  • if we split the data based on the Wind attribute:
    • before the split we have 5 No and 5 Yes in the predicted variable
    • $H = -\left(\frac{5}{10}\log_2\frac{5}{10} + \frac{5}{10}\log_2\frac{5}{10}\right) = 1$
  • like in the coin toss (see plot before)

🤓 Weak Wind

  • after the split, in the new node with only weak winds:

| Day | Weather | Temperature | Humidity | Wind | Play? |
|-----|---------|-------------|----------|------|-------|
| 1   | Sunny   | Hot         | High     | Weak | No    |
| 2   | Cloudy  | Hot         | High     | Weak | Yes   |
| 7   | Rainy   | Mild        | High     | Weak | Yes   |
| 9   | Cloudy  | Hot         | Normal   | Weak | Yes   |

  • 1 No and 3 Yes in the predicted variable: $H = -\left(\frac{1}{4}\log_2\frac{1}{4} + \frac{3}{4}\log_2\frac{3}{4}\right) \approx 0.81$

🤓 Strong Wind

  • after the split, in the new node with only strong winds:

| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 3   | Sunny   | Mild        | Normal   | Strong | Yes   |
| 4   | Cloudy  | Mild        | High     | Strong | Yes   |
| 5   | Rainy   | Mild        | High     | Strong | No    |
| 6   | Rainy   | Cool        | Normal   | Strong | No    |
| 8   | Sunny   | Hot         | High     | Strong | No    |
| 10  | Rainy   | Mild        | High     | Strong | No    |

  • 4 No and 2 Yes in the predicted variable: $H = -\left(\frac{4}{6}\log_2\frac{4}{6} + \frac{2}{6}\log_2\frac{2}{6}\right) \approx 0.92$


🤓 Information Gain of the Wind Split

  • we build the weighted average of the child entropies based on the number of instances in the child nodes: $H_{split} = \frac{4}{10} \cdot 0.81 + \frac{6}{10} \cdot 0.92 \approx 0.88$
  • the information gain is $Gain = 1 - 0.88 \approx 0.12$
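A quick self-contained check in Python (redefining the `entropy` sketch from above):

    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    h_before = entropy([5/10, 5/10])   # 1.0
    h_weak   = entropy([1/4, 3/4])     # ≈ 0.811
    h_strong = entropy([4/6, 2/6])     # ≈ 0.918
    gain = h_before - (4/10 * h_weak + 6/10 * h_strong)
    print(round(gain, 3))              # 0.125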
🤓✍️ Task
  • What is the information gain if we split at Weather instead?
  • to calculate $\log_2(x)$ in Python use
    import math
    math.log2(x)
    

⌛ 10 minutes


🤓 Sunny

| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 1   | Sunny   | Hot         | High     | Weak   | No    |
| 3   | Sunny   | Mild        | Normal   | Strong | Yes   |
| 8   | Sunny   | Hot         | High     | Strong | No    |

  • 2 Nos and 1 Yes in the predicted variable: $H = -\left(\frac{2}{3}\log_2\frac{2}{3} + \frac{1}{3}\log_2\frac{1}{3}\right) \approx 0.92$

🤓 Cloudy

| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 2   | Cloudy  | Hot         | High     | Weak   | Yes   |
| 4   | Cloudy  | Mild        | High     | Strong | Yes   |
| 9   | Cloudy  | Hot         | Normal   | Weak   | Yes   |

  • 0 Nos and 3 Yes in the predicted variable: the node is pure, so $H = 0$

🤓 Rainy

| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 5   | Rainy   | Mild        | High     | Strong | No    |
| 6   | Rainy   | Cool        | Normal   | Strong | No    |
| 7   | Rainy   | Mild        | High     | Weak   | Yes   |
| 10  | Rainy   | Mild        | High     | Strong | No    |

  • 3 Nos and 1 Yes in the predicted variable: $H = -\left(\frac{3}{4}\log_2\frac{3}{4} + \frac{1}{4}\log_2\frac{1}{4}\right) \approx 0.81$

🤓 Comparing the Splits

  • $H_{split} = \frac{3}{10} \cdot 0.92 + \frac{3}{10} \cdot 0 + \frac{4}{10} \cdot 0.81 \approx 0.60$, so $Gain = 1 - 0.60 \approx 0.40$
  • we achieve a higher information gain than with Wind ($0.40 > 0.12$)
  • we have a pure node after weather = "Cloudy"
  • to find a good tree, we must calculate the information gain of many possible splits
  • especially if we work with continuous features, as we then have to find not only the attribute but also the threshold where to split it (e.g., a numeric cut-off on temperature)

Over-fitting

  • trees become very deep if we continue training until we reach pure leaves
  • then they are likely to over-fit the data, which can be prevented in two ways (see the sketch below)
    • limit the maximum depth (or the number of nodes)
    • pruning
https://www.kaggle.com/code/dansbecker/underfitting-and-overfitting
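Both options in a hedged scikit-learn sketch (toy, already label-encoded data; `ccp_alpha` triggers cost-complexity pruning, the pruning variant scikit-learn ships):

    from sklearn.tree import DecisionTreeClassifier

    # toy, label-encoded features (outlook, humidity, wind) and target (play)
    X = [[0, 0, 0], [0, 1, 0], [1, 0, 0], [2, 0, 1], [2, 0, 0]]
    y = [0, 1, 1, 0, 1]

    # option 1: never grow beyond two levels
    shallow = DecisionTreeClassifier(max_depth=2).fit(X, y)

    # option 2: grow fully, then prune weak branches away
    pruned = DecisionTreeClassifier(ccp_alpha=0.02).fit(X, y)

    print(shallow.get_depth(), pruned.get_depth())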
Pruning
  • removing nodes after the initial training of the tree
  • decreases the fit on the training set
  • the tree is more likely to generalize to other data
  • Reduced error pruning: on a validation set, replace subtrees by a leaf with the majority class and check whether the error increases
https://www.semanticscholar.org/paper/Study-of-Various-Decision-Tree-Pruning-Methods-with-Patel-Upadhyay/025b8c109c38dc115024e97eb0ede5ea873fffdb
✍️ Task
  • Draw the tree based on the following order of nodes:
    • Level 1: Temperature
    • Level 2: Wind
  • Mark the number of decisions to play or not in the final leaves
  • Which arm of the tree would you prune first?
| Day | Weather | Temperature | Humidity | Wind   | Play? |
|-----|---------|-------------|----------|--------|-------|
| 1   | Sunny   | Hot         | High     | Weak   | No    |
| 2   | Cloudy  | Hot         | High     | Weak   | Yes   |
| 3   | Sunny   | Mild        | Normal   | Strong | Yes   |
| 4   | Cloudy  | Mild        | High     | Strong | Yes   |
| 5   | Rainy   | Mild        | High     | Strong | No    |
| 6   | Rainy   | Cool        | Normal   | Strong | No    |
| 7   | Rainy   | Mild        | High     | Weak   | Yes   |
| 8   | Sunny   | Hot         | High     | Strong | No    |
| 9   | Cloudy  | Hot         | Normal   | Weak   | Yes   |


When to use trees?

  • Instances are describable by attribute-value pairs
  • The predicted variable is discrete valued (regression is also possible)
  • Possibly noisy training data
  • Humans should be able to interpret the decisions
  • Plain trees are not competitive with the best supervised learning approaches
  • Advanced trees like random forests and gradient boosting are very competitive

Examples:

  • Equipment or medical diagnosis
  • Credit risk analysis
  • Modeling calendar scheduling preferences

🤓 2.8.4 Supervised Learning: Advanced Trees

🎯 Learning objectives

You will be able to

  • perform pruning
  • describe the difference between bagging and boosting
  • explain the idea behind a maximal margin classifier

🤓 Ensemble methods

  • Why use one model when you can use many?

https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/

🤓 How to create different learners?

https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/
🤓 Random Sampling (Bagging)

  • Training data is randomly sampled with replacement into a bag; the remaining rows are out-of-bag and used for validation
  • The Out-of-Bag error is the out-of-sample error of the tree trained with the respective bag
https://upload.wikimedia.org/wikipedia/commons/3/36/Sampling_with_replacement_and_out-of-bag_dataset_-_medical_context.jpg
🤓 Boosting
  • with bagging, we randomly draw the bags
  • with boosting, we take the previous classifiers' success into account
  • data that was misclassified before is emphasized (see the sketch below)

https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/
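A minimal sketch with scikit-learn's `AdaBoostClassifier` (≥ 1.2 for the `estimator` argument), which re-weights misclassified samples between rounds; data and parameters are illustrative:

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [2, 0]]
    y = [0, 0, 1, 1, 1, 0]

    # 50 shallow trees; each round puts more weight on the rows
    # the ensemble so far got wrong
    boost = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=1),
        n_estimators=50,
    ).fit(X, y)
    print(boost.predict([[2, 1]]))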
🤓 Prediction

https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/

🤓 Random Forest (Bagging)

  • random forests are a bagging approach to trees
  • different trees are trained on different data
  • the classification is based on the majority vote of the trees
  • they also include feature bagging: each split only considers a random subset of the features (see the sketch below)
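A hedged scikit-learn sketch (toy data; `max_features` controls the feature bagging, `oob_score=True` reuses the out-of-bag rows as a built-in validation set):

    from sklearn.ensemble import RandomForestClassifier

    X = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1], [2, 0, 1], [2, 1, 0]]
    y = [0, 1, 1, 1, 0, 0]

    forest = RandomForestClassifier(
        n_estimators=100,      # number of bagged trees
        max_features="sqrt",   # feature bagging at every split
        oob_score=True,        # estimate the out-of-sample error for free
    ).fit(X, y)
    print(forest.oob_score_)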
🤓 Variable importance
  • parametric models (e.g., linear regression) have coefficients we can interpret
  • non-parametric models (e.g., $k$-NN) are hard to interpret
  • random forests are non-parametric models, but it is possible to calculate the feature importance (see the sketch below):
    • to measure the importance of the $i$-th feature after training, the values of the $i$-th feature are permuted among the training data
    • the out-of-bag error is again computed on the original and on the perturbed data set
    • features whose permutation degrades the performance the most are ranked as more important
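scikit-learn exposes this idea as `permutation_importance` (a sketch; scored here on the training data for brevity, a held-out set would be preferable):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    X = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1], [2, 0, 1], [2, 1, 0]]
    y = [0, 1, 1, 1, 0, 0]

    forest = RandomForestClassifier(n_estimators=100).fit(X, y)

    # shuffle each feature column in turn and record how much the score drops
    result = permutation_importance(forest, X, y, n_repeats=10)
    print(result.importances_mean)  # one value per feature; larger = more important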

🤓 eXtreme Gradient Boosting (Boosting)

  • an open-source implementation of gradient boosting based on trees
  • usually outperforms a random forest and is a good benchmark for other models (a usage sketch follows below)
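A minimal sketch using the `xgboost` package's scikit-learn-style wrapper (data and hyperparameters are illustrative):

    from xgboost import XGBClassifier

    X = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1], [2, 0, 1], [2, 1, 0]]
    y = [0, 1, 1, 1, 0, 0]

    # each new tree is fitted to the errors (gradients) of the current ensemble
    model = XGBClassifier(n_estimators=50, max_depth=2, learning_rate=0.1)
    model.fit(X, y)
    print(model.predict([[2, 1, 0]]))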

```mermaid
graph TD
    A[Temperature]
    A -->|Hot| B[Wind]
    A -->|Mild| C[Wind]
    A -->|Cool| D[Wind]
    B -->|Weak| E("Yes: 2<br>No: 1")
    B -->|Strong| F("Yes: 0<br>No: 1")
    C -->|Weak| G("Yes: 1<br>No: 0")
    C -->|Strong| H("Yes: 2<br>No: 1")
    D -->|Weak| I("Yes: 0<br>No: 0")
    D -->|Strong| J("Yes: 0<br>No: 1")
```