This repository began as a Udacity Nanodegree project working on the Boston Housing Data. Final work for this early work in machine learning can be viewed here.
Later, this was reworked with a deeper emphasis on linear models.
A baseline model was assessed against three models:
- linear regression with no regularization (ordinary least squares)
- linear regresison with
$\ell_1$ regularization (LASSO) - linear regression with
$\ell_2$ regularization (Ridge Regression)
A simple grid search was performed over the regularized models to identify an optimal coefficient for the regularization.
The results of this were
alpha | model | test_score | train_score |
---|---|---|---|
NaN | linear regression | 0.711009 | 0.743956 |
0.00001 | lasso | 0.711009 | 0.743956 |
0.01000 | ridge | 0.711016 | 0.743956 |
A standardized model was assessed against the same three models:
- linear regression with no regularization (ordinary least squares)
- linear regresison with
$\ell_1$ regularization (LASSO) - linear regression with
$\ell_2$ regularization (Ridge Regression)
A simple grid search was performed over the regularized models to identify an optimal coefficient for the regularization.
The results of this were
alpha | model | test_score | train_score |
---|---|---|---|
NaN | linear regression | 0.711009 | 0.743956 |
0.00001 | lasso | 0.711215 | 0.743880 |
0.01000 | ridge | 0.711298 | 0.742562 |
Note that standardization has no effect on the non-regularized linear regression.
A skew-normal, standardized model was assessed against the same three models:
- linear regression with no regularization (ordinary least squares)
- linear regresison with
$\ell_1$ regularization (LASSO) - linear regression with
$\ell_2$ regularization (Ridge Regression)
A simple grid search was performed over the regularized models to identify an optimal coefficient for the regularization.
The results of this were
alpha | model | test_score | train_score |
---|---|---|---|
NaN | linear regression | 0.751304 | 0.778260 |
0.00001 | lasso | 0.751307 | 0.778260 |
0.01000 | ridge | 0.751436 | 0.778242 |
Note that skew-normalization boosts both train and test performance for all three models.