Deiv101/Linear-Regression-Model

Introduction

Linear Regression using Boston Housing Dataset

Linear Regression with Boston Housing Dataset

Goal

To predict the median value of houses in several Boston neighborhoods in the 1970s from features such as crime rate, proximity to the Charles River, and highway accessibility.
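As a rough illustration of what "predicting from a feature" means here, the following is a minimal sketch of a one-variable least-squares fit. The function names `fit_simple_ols` and `predict` and the numbers are made up for the example and are not taken from this repository or from the dataset.

```python
def fit_simple_ols(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target for a single feature value."""
    return intercept + slope * x


# Hypothetical values: average rooms per dwelling vs. median home value.
rooms = [5.0, 6.0, 7.0, 8.0]
values = [15.0, 22.0, 30.0, 37.0]

slope, intercept = fit_simple_ols(rooms, values)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"predicted value for 6.5 rooms: {predict(6.5, slope, intercept):.2f}")
```

A model on the full dataset would use all the features at once, but the idea is the same: choose coefficients that minimise the squared prediction error.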

It helps to know what the column names in this dataset stand for, since they have been shortened so that they display neatly in a dataframe.

  • CRIM - per capita crime rate by town.
  • ZN - Proportion of residential land zoned for lots over 25,000 sq.ft.
  • INDUS - Proportion of non-retail business acres per town.
  • CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise).
  • NOX - Nitric oxides concentration (parts per 10 million).
  • RM - Average number of rooms per dwelling.
  • AGE - Proportion of owner-occupied units built prior to 1940.
  • DIS - Weighted distances to five Boston employment centres.
  • RAD - Index of accessibility to radial highways.
  • TAX - Full-value property-tax rate per $10,000.
  • PTRATIO - Pupil-teacher ratio by town.
  • B - 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town.
  • LSTAT - % lower status of the population.

We will then look at the following challenges:

  • How to treat missing values;
  • How to treat outliers;
  • Understand which variables drive the price of homes in Boston.
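
The three challenges above can be sketched roughly as follows. This is a minimal illustration that assumes median imputation for missing values, clipping at 1.5 × IQR fences for outliers, and Pearson correlation with the target as a first look at which variables matter. None of these choices is confirmed as the approach used in this repository, the helper names are hypothetical, and the toy numbers are made up rather than taken from the dataset. It uses only the Python standard library.

```python
import math
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Pull values outside the IQR fences back onto the nearest fence."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up toy data; MEDV stands for the median home value (the target).
data = {
    "RM": [6.6, 6.4, None, 7.0, 7.1, 6.4, 6.0, 6.2],
    "CRIM": [0.01, 0.03, 0.03, 0.03, 0.07, 0.03, 0.09, 8.0],
    "MEDV": [24.0, 21.6, 34.7, 33.4, 36.2, 28.7, 22.9, 27.1],
}

# 1. Missing values, 2. outliers: clean every feature column.
features = {
    name: clip_outliers(impute_median(col))
    for name, col in data.items()
    if name != "MEDV"
}

# 3. Drivers: rank features by absolute correlation with the target.
ranking = sorted(
    ((name, pearson(col, data["MEDV"])) for name, col in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

Correlation only measures linear association one feature at a time, so it is a starting point for the third challenge rather than a final answer. The fitted regression coefficients give a fuller picture.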
