xinyaoq/Data-Cleaning

81af230 · May 3, 2023

Data-Cleaning

Data cleaning for a movie dataset downloaded from Kaggle: https://www.kaggle.com/datasets/bharatnatrayn/movies-dataset-for-feature-extracion-prediction?select=movies.csv

Step 1: Identify missing or incomplete values in each column.
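
One way to approach Step 1 is sketched below with pandas. The small DataFrame and its column names (`title`, `rating`, `genre`) are made up for illustration and are not the actual columns of `movies.csv`. The sketch assumes that empty or whitespace-only strings should count as missing.

```python
import pandas as pd

# Toy data standing in for the real file; column names are hypothetical.
df = pd.DataFrame({
    "title": ["A", "B", None, "D"],
    "rating": [7.1, None, 6.5, None],
    "genre": ["Drama", "", "Comedy", "Action"],
})

# Treat empty or whitespace-only strings as missing values.
df = df.replace(r"^\s*$", float("nan"), regex=True)

# Number of missing values per column.
missing_counts = df.isna().sum()

# Rows that have at least one missing value.
rows_with_missing = df[df.isna().any(axis=1)]

print(missing_counts)
```

Counting per column first helps decide whether a column is worth keeping before choosing to drop or fill individual rows.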

Step 2: Identify any duplicate records and remove them from the dataset.
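
A minimal pandas sketch of Step 2 follows. The toy DataFrame and its column names are assumptions for illustration. It treats two rows as duplicates only when every column matches; pass `subset=[...]` to `duplicated` and `drop_duplicates` to compare on selected columns instead.

```python
import pandas as pd

# Toy data; column names are hypothetical.
df = pd.DataFrame({
    "title": ["A", "B", "A", "C"],
    "year": [2020, 2021, 2020, 2019],
})

# Count rows that repeat an earlier row exactly.
n_dupes = int(df.duplicated().sum())

# Keep the first occurrence of each row and renumber the index.
deduped = df.drop_duplicates().reset_index(drop=True)

print(n_dupes, len(deduped))
```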

Step 3: Split cells that contain multiple values so that each value gets its own record.
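
Step 3 could be done with `str.split` followed by `DataFrame.explode`, as in the sketch below. The toy data, the `genre` column name, and the comma delimiter are assumptions; the real file may use different columns or separators.

```python
import pandas as pd

# Toy data; column names and the comma delimiter are assumptions.
df = pd.DataFrame({
    "title": ["A", "B"],
    "genre": ["Action, Drama", "Comedy"],
})

# Split each cell into a list, then give every list item its own row.
long_df = df.assign(genre=df["genre"].str.split(",")).explode("genre")

# Remove the whitespace left around each value after splitting.
long_df["genre"] = long_df["genre"].str.strip()
long_df = long_df.reset_index(drop=True)

print(long_df)
```

The other columns (here `title`) are repeated on each new row, so every record still identifies the movie it came from.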

Step 4: Standardize inconsistently formatted data so that values are consistent across the dataset.
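
What Step 4 involves depends on the column. The sketch below shows three common fixes on made-up data: trimming stray whitespace, pulling a four-digit year out of a bracketed string, and removing thousands separators so counts become numbers. The column names and value formats are assumptions, not taken from this repository.

```python
import pandas as pd

# Toy data; column names and formats are hypothetical.
df = pd.DataFrame({
    "title": ["  Movie A\n", "movie b"],
    "year": ["(2021)", "(2015-2022)"],
    "votes": ["21,062", "885"],
})

# Trim leading and trailing whitespace, including newlines.
df["title"] = df["title"].str.strip()

# Keep the first four-digit number as the start year.
df["start_year"] = pd.to_numeric(
    df["year"].str.extract(r"(\d{4})", expand=False)
)

# Drop thousands separators, then convert to a numeric type.
df["votes"] = pd.to_numeric(df["votes"].str.replace(",", "", regex=False))

print(df)
```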

Step 5: Identify the categorical data and decide whether to convert it into numerical data, using label encoding, for use in statistical visualization and analysis.
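
Label encoding for Step 5 can be sketched with pandas' built-in `category` dtype, which assigns each distinct value an integer code. The toy `genre` column is an assumption for illustration; scikit-learn's `LabelEncoder` is another common option.

```python
import pandas as pd

# Toy data; the column name is hypothetical.
df = pd.DataFrame({"genre": ["Drama", "Action", "Drama", "Comedy"]})

# Convert to a categorical type; categories are sorted alphabetically.
cat = df["genre"].astype("category")

# Integer code for each row.
df["genre_code"] = cat.cat.codes

# Lookup table from code back to the original label.
mapping = dict(enumerate(cat.cat.categories))

print(df)
print(mapping)
```

The codes carry no real ordering, so they suit grouping and plotting; for analyses that would read the integers as magnitudes, one-hot encoding (for example `pd.get_dummies`) may be a better fit.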