This is the second project of term 2 of Udacity's Data Analyst Nanodegree program. In this project, I used R and exploratory data analysis techniques to explore relationships ranging from a single variable to multiple variables, and to examine the data set for distributions, outliers, and anomalies.
In order to complete the project, you will need to install R. You can download and install R from the Comprehensive R Archive Network (CRAN).
After installing R, you will need to download and install RStudio. Choose the appropriate installer for your operating system.
Exploratory Data Analysis (EDA) is the numerical and graphical examination of data characteristics and relationships before formal, rigorous statistical analyses are applied.
EDA can lead to insights, which may raise further questions and eventually lead to predictive models. It is also an important “line of defense” against bad data and an opportunity to notice when your assumptions or intuitions about a data set are violated.
After completing the project, I have:
- Understood how to examine the distribution of a variable and check for anomalies and outliers
- Learned how to quantify and visualize individual variables within a data set by using appropriate plots such as scatter plots, histograms, bar charts, and box plots
- Explored variables to identify the most important variables and relationships within a data set before building predictive models, calculated correlations, and investigated conditional means
- Learned powerful methods and visualizations for examining relationships among multiple variables, such as reshaping data frames and using aesthetics like color and shape to uncover more information
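Several of the techniques listed above can be sketched in a few lines of base R. The example below is only an illustration: the `toy` data frame and its values are invented for this sketch and are not taken from the project's data, and the 1.5 * IQR rule is one common convention for flagging outliers, not necessarily the one used in the project.

```r
# Toy data, invented purely for illustration (not the project's data set).
toy <- data.frame(
  alcohol = c(9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 18.0),
  quality = c(5, 5, 6, 6, 6, 7, 7, 6)
)

# Univariate look: minimum, quartiles, median, mean, and maximum.
print(summary(toy$alcohol))

# Flag outliers with the common 1.5 * IQR rule (an assumed convention).
q <- quantile(toy$alcohol, c(0.25, 0.75))
fence <- 1.5 * IQR(toy$alcohol)
is_outlier <- toy$alcohol < q[1] - fence | toy$alcohol > q[2] + fence
outliers <- toy$alcohol[is_outlier]

# Bivariate look: Pearson correlation between the two variables.
r <- cor(toy$alcohol, toy$quality)

# Conditional means: mean alcohol within each quality rating.
cond_means <- aggregate(alcohol ~ quality, data = toy, FUN = mean)
print(cond_means)
```

Plotting calls such as `hist(toy$alcohol)` or `boxplot(alcohol ~ quality, data = toy)` would show the same information graphically.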
I performed exploratory data analysis on the white wine quality data set. This tidy data set contains 4,898 white wines with 11 variables quantifying the chemical properties of each wine. At least 3 wine experts rated the quality of each wine on a scale from 0 (very bad) to 10 (very excellent).
Which chemical properties influence the quality of white wines?
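One possible first pass at this question is to rank each chemical property by the strength of its correlation with quality. The sketch below assumes a data frame with one numeric column per property plus a `quality` column. The `wines` data frame, its column names, and its values are hypothetical stand-ins made up for illustration, not the real data.

```r
# Hypothetical stand-in for the wine data; column names and values are made up.
wines <- data.frame(
  alcohol = c(9.0, 9.5, 10.0, 11.0, 12.0, 12.5),
  density = c(0.999, 0.998, 0.997, 0.994, 0.992, 0.991),
  pH      = c(3.2, 3.0, 3.3, 3.1, 3.4, 3.2),
  quality = c(5, 5, 6, 6, 7, 7)
)

# Every column except the rating is treated as a candidate predictor.
predictors <- setdiff(names(wines), "quality")

# Pearson correlation of each predictor with quality.
cors <- sapply(wines[predictors], function(x) cor(x, wines$quality))

# Rank by absolute correlation, strongest first.
ranked <- cors[order(abs(cors), decreasing = TRUE)]
print(round(ranked, 3))
```

A strong correlation found this way only suggests where to look next. It does not show that a property causes higher or lower ratings.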