id	body height	femur length	sex	ethnicity
1	185	49	male	caucasian
2	176	43	female	asian
3	179	33	female	african american
	...	...	...	...

id	body height	femur length	is_female	is_asian	is_caucasian
1	185	49	0	0	1
2	176	43	1	1	0
3	179	33	1	0	0
	...	...	...	...	...

id	body height	femur length	is_female	is_male	is_asian	is_caucasian	is_african_american
1	185	49	0	1	0	1	0
2	176	43	1	0	1	0	0
3	179	33	1	0	0	0	1
	...	...	...	...	...	...	...
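The three tables above show the raw data, a dummy encoding (one level per variable is dropped and serves as the baseline) and a one-hot encoding (every level gets its own column). A minimal sketch of how the ethnicity columns could be produced in plain Python follows; the helper names `indicator_name` and `encode` are hypothetical and only for illustration.

```python
# Rows copied from the first table above (the elided "..." rows are left out).
rows = [
    {"id": 1, "sex": "male", "ethnicity": "caucasian"},
    {"id": 2, "sex": "female", "ethnicity": "asian"},
    {"id": 3, "sex": "female", "ethnicity": "african american"},
]


def indicator_name(level):
    """Build a column name such as 'is_african_american' from a level."""
    return "is_" + level.replace(" ", "_")


def encode(rows, column, levels, baseline=None):
    """Return one dict of 0/1 indicators per row.

    With baseline=None every level gets a column (one-hot encoding).
    With a baseline, that level's column is dropped (dummy encoding),
    so the baseline is represented by all remaining indicators being 0.
    """
    kept = [lvl for lvl in levels if lvl != baseline]
    return [
        {indicator_name(lvl): int(row[column] == lvl) for lvl in kept}
        for row in rows
    ]


ethnicities = ["asian", "caucasian", "african american"]

dummy = encode(rows, "ethnicity", ethnicities, baseline="african american")
one_hot = encode(rows, "ethnicity", ethnicities)

for row, d, o in zip(rows, dummy, one_hot):
    print(row["id"], d, o)
```

In a regression with an intercept, the dummy encoding is usually the safer choice, because the one-hot columns of a variable always sum to 1 and are therefore perfectly collinear with the intercept.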

Imagine you are researching penguins in the Antarctic and the cold has caused your scale to freeze and break. However, you still want to estimate the penguins' weight.

Luckily, your colleagues left you a data set (train) containing all the other variables you can measure.

Create a multiple linear regression model to predict the penguins' weight. Compare the mean squared error (MSE) of your model on the test set with your colleagues' results.
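The workflow of the exercise (fit on the training set, evaluate the MSE on the test set) could look roughly like the sketch below. The penguin data is not reproduced here, so the example uses made-up synthetic data with two numeric predictors; the coefficient values, the sample sizes and the helper `add_intercept` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the train/test data: two numeric predictors
# and a weight-like target generated from made-up coefficients.
n = 200
X = rng.normal(size=(n, 2))
true_beta = np.array([4200.0, 600.0, 150.0])  # intercept and two slopes (made up)
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=100.0, size=n)

# Split into a training and a test part.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]


def add_intercept(X):
    """Prepend a column of ones so the first coefficient is the intercept."""
    return np.column_stack([np.ones(len(X)), X])


# Least-squares fit on the training set only.
beta_hat, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)

# Mean squared error on the held-out test set.
y_pred = add_intercept(X_test) @ beta_hat
mse_test = np.mean((y_test - y_pred) ** 2)

print("estimated coefficients:", beta_hat)
print("test MSE:", mse_test)
```

The test set is used only for the final MSE, never for fitting, so the reported error reflects how the model performs on unseen observations.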

https://www.scoopnest.com/user/AFP/1035147372572102656-do-you-know-your-gentoo-from-your-adelie-penguins-infographic-on-10-of-the-world39s-species-after

2.4 Multiple Regression

2.4.1 Multiple Linear Regression

Learning objectives

Need for Multiple Regression

Interpreting Multiple Regression

Take care when comparing different models

Visualization

2.4.2 Variable Selection

Model Selection

Collinearity

https://medium.com/analytics-vidhya/new-aspects-to-consider-while-moving-from-simple-linear-regression-to-multiple-linear-regression-dad06b3449ff

Variance inflation factor (VIF): $\mathrm{VIF}(\hat{\beta}_j) = \frac{1}{1 - R_j^2}$

$R_j^2$ is the $R^2$ obtained by regressing predictor $X_j$ on all other predictors
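A minimal sketch of how the VIF can be computed directly from this definition, using the standard formula $1 / (1 - R_j^2)$. The predictors are made up for illustration, and the helpers `r_squared` and `vif` are hypothetical names, not part of any library.

```python
import numpy as np


def r_squared(y, X):
    """R^2 of a least-squares regression of y on X (intercept included)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2)


def vif(X):
    """Variance inflation factor 1 / (1 - R_j^2) for every column of X."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)  # all predictors except column j
        factors.append(1.0 / (1.0 - r_squared(X[:, j], others)))
    return np.array(factors)


# Made-up predictors: x2 is nearly a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)
x3 = rng.normal(size=500)

vif_values = vif(np.column_stack([x1, x2, x3]))
print(vif_values)
```

The two nearly identical predictors receive large VIF values, while the unrelated one stays close to 1, the smallest value a VIF can take.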

Case Study

2.4.3 Qualitative Predictors and further Extensions

Learning objectives

Qualitative Predictors

Predictors with only Two Levels

Interpretation of $\beta_2$
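The model behind this slide title is not reproduced here. As an illustration only, assume a model with one numeric predictor $x_{i1}$ and one two-level predictor coded as a dummy variable $x_{i2}$ (1 for one level, 0 for the baseline level); the symbols below are assumptions under that setup.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i,
\qquad
x_{i2} =
\begin{cases}
1 & \text{if observation } i \text{ belongs to the non-baseline level} \\
0 & \text{otherwise}
\end{cases}

E[y_i \mid x_{i1}, x_{i2} = 1] - E[y_i \mid x_{i1}, x_{i2} = 0]
= (\beta_0 + \beta_1 x_{i1} + \beta_2) - (\beta_0 + \beta_1 x_{i1})
= \beta_2
```

Under this assumption, $\beta_2$ is the expected difference in the response between the two levels when the numeric predictor is held fixed.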

Qualitative Predictors with more than two Levels

Example: Input Features that are not numerical

Dummy Encoding

One-Hot Encoding

Example of a Regression Table with Dummy Encoding

Case Study

2.4.4 Extensions of the Linear Model

Removing the Additive Assumption

by adding Interaction Terms
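A short sketch of what adding an interaction term means in practice: the product $x_1 x_2$ enters the design matrix as an extra column, so the effect of $x_1$ is allowed to depend on $x_2$. The data and coefficients are made up, and `fit_mse` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 10, size=n)
# Made-up data in which the effect of x1 depends on x2 (coefficient 0.5 on x1*x2).
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2 + rng.normal(scale=1.0, size=n)


def fit_mse(columns, y):
    """Least-squares fit with intercept; returns coefficients and training MSE."""
    A = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, np.mean((y - A @ coef) ** 2)


coef_add, mse_add = fit_mse([x1, x2], y)            # additive model
coef_int, mse_int = fit_mse([x1, x2, x1 * x2], y)   # with interaction term

print("additive MSE:", mse_add)
print("interaction MSE:", mse_int, "interaction coefficient:", coef_int[3])
```

Both errors here are training errors on the same data; they show the gain in fit from the interaction term, not out-of-sample performance.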

Interactions with qualitative Predictors

Modeling non-linear Relationships with Linear Models

Linear Regression Models can capture non-linear Relationships
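One common way to do this is to add transformed predictors such as $x^2$: the model remains linear in its coefficients even though the fitted curve is not a straight line. The sketch below assumes a made-up quadratic relationship for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 100)
# Made-up quadratic relationship plus noise.
y = 2.0 - 1.0 * x + 0.5 * x**2 + rng.normal(scale=0.2, size=x.size)

# The design matrix contains x and x^2; the model stays linear in the betas.
A = np.column_stack([np.ones_like(x), x, x**2])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

print(beta_hat)
```

The same least-squares machinery is used as for a straight-line fit; only the columns of the design matrix change.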

Case Study: Data Science Project with Linear Regression
