Data-Cleaning

Data cleaning for a movie dataset downloaded from Kaggle: https://www.kaggle.com/datasets/bharatnatrayn/movies-dataset-for-feature-extracion-prediction?select=movies.csv

Step 1: Identify missing or incomplete values in each column.
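
One way to approach Step 1 with pandas is sketched below. The small DataFrame and its column names (`title`, `rating`, `genre`) are hypothetical stand-ins, not the actual columns of movies.csv, and treating whitespace-only strings as missing is an assumption about what "incomplete" means here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real file would be loaded with pd.read_csv.
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "rating": [7.5, np.nan, 6.1],
    "genre": ["Drama", "  ", None],
})

# Assumption: cells containing only whitespace count as incomplete.
normalized = df.replace(r"^\s*$", np.nan, regex=True)

# Number and share of missing values per column.
missing_counts = normalized.isna().sum()
missing_share = normalized.isna().mean()

print(missing_counts)
```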

Step 2: Identify any duplicate records and remove them from the dataset.
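
A minimal sketch of Step 2, assuming a duplicate means a row whose values match an earlier row in every column. The sample data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with one fully duplicated row.
df = pd.DataFrame({
    "title": ["A", "B", "A"],
    "year": [2010, 2015, 2010],
})

# Count rows that repeat an earlier row exactly.
n_duplicates = int(df.duplicated().sum())

# Keep the first occurrence of each record and renumber the rows.
deduped = df.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(deduped))
```

If only some columns define a record, `drop_duplicates` accepts a `subset` argument listing them.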

Step 3: Split cells that hold multiple values so that each record holds a single value.
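
Step 3 could be done by splitting a multi-value column and expanding it to one row per value, as in this sketch. The `genre` column and the comma delimiter are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data: one cell lists two genres.
df = pd.DataFrame({
    "title": ["A", "B"],
    "genre": ["Action, Drama", "Comedy"],
})

# Split on commas into lists, then expand each list element into its own row.
long_df = (
    df.assign(genre=df["genre"].str.split(","))
      .explode("genre", ignore_index=True)
)

# Remove the whitespace left around each value by the split.
long_df["genre"] = long_df["genre"].str.strip()

print(long_df)
```

This produces a long format in which the other columns are repeated for each value. Splitting into several columns instead (for example with `str.split(",", expand=True)`) is an alternative when the number of values per cell is fixed.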

Step 4: Reformat non-standardized data so that values are consistent across the dataset.
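
The sketch below shows the kind of standardization Step 4 might involve. The specific problems it handles (stray whitespace in titles, years wrapped in parentheses, thousands separators in vote counts) and the column names are assumptions for illustration, not a description of what movies.csv actually contains.

```python
import pandas as pd

# Hypothetical sample data with inconsistent formatting.
df = pd.DataFrame({
    "title": ["  A ", "B\n"],
    "year": ["(2019)", "(2015-2020)"],
    "votes": ["1,234", "56"],
})

clean = df.copy()

# Trim leading and trailing whitespace, including newlines.
clean["title"] = clean["title"].str.strip()

# Keep the first four-digit year found in each cell and make it numeric.
first_year = clean["year"].str.extract(r"(\d{4})", expand=False)
clean["year"] = pd.to_numeric(first_year, errors="coerce")

# Drop thousands separators and convert to integers.
clean["votes"] = clean["votes"].str.replace(",", "", regex=False).astype(int)

print(clean)
```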

Step 5: Identify the categorical data and decide whether to convert it into numerical data with label encoding, for use in statistical visualization and analysis.
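
Label encoding as described in Step 5 can be sketched with pandas categorical codes. This is one possible approach rather than necessarily the one used in this project, and the `genre` column is a hypothetical example.

```python
import pandas as pd

# Hypothetical sample data with one categorical column.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["Drama", "Comedy", "Drama", "Action"],
})

# A low number of distinct values suggests a column is categorical.
distinct_counts = df.select_dtypes(include="object").nunique()

# Convert to a categorical dtype; categories are sorted alphabetically.
genre_cat = df["genre"].astype("category")

# Each category is replaced by its integer position in the category list.
df["genre_code"] = genre_cat.cat.codes

# Mapping from code back to label, useful for reading plots and tables.
code_to_label = dict(enumerate(genre_cat.cat.categories))

print(df)
print(code_to_label)
```

The integer codes imply an ordering that the categories may not have, so they suit counting and plotting better than models that treat the numbers as magnitudes.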