Contains all 9 projects (Beginner to Advanced) from my Data Science and Analytics internship at the Sparks Foundation.
> Problem Statement:
- Predict the percentage score of a student based on the number of study hours.
- This is a simple linear regression task as it involves just 2 variables.
- You can use R, Python, SAS Enterprise Miner or any other tool.
- What will be the predicted score if a student studies for 9.25 hrs/day?
- Here is the dataset : Dataset.csv
> Solution: Prediction using Supervised ML
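As a rough illustration of the task above (not the code from the linked solution), a simple linear regression can be fitted with NumPy's `polyfit`. The toy arrays below are made up and deliberately exactly linear so the result is easy to check; they are an assumption and do not come from Dataset.csv. The helper names `fit_line` and `predict` are hypothetical too.

```python
import numpy as np

# Hypothetical toy data, NOT the task's Dataset.csv.
# Scores follow score = 10 * hours + 5 exactly, to keep the example checkable.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
scores = np.array([15.0, 25.0, 35.0, 45.0, 55.0, 65.0, 75.0, 85.0])


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)


def predict(hours_studied, slope, intercept):
    """Predict a score from study hours using the fitted line."""
    return slope * hours_studied + intercept


slope, intercept = fit_line(hours, scores)
predicted = predict(9.25, slope, intercept)
print(f"Predicted score for 9.25 hrs/day on the toy data: {predicted:.2f}")
```

With the real dataset, the two arrays would be replaced by the hours and scores columns read from the CSV, and the predicted value would differ from the toy result.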
Problem Statement:
- From the given ‘Iris’ dataset, predict the optimum number of clusters and
represent it visually.
- Use R or Python to perform this task
- Here is the dataset :
Solution: Prediction using Unsupervised ML
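One common way to estimate the optimum number of clusters is the elbow method: run k-means for several values of k and look at where the within-cluster sum of squares (WCSS) stops dropping sharply. The sketch below assumes scikit-learn is installed and uses its bundled copy of Iris as a stand-in for the task's dataset; it is an illustration, not the linked solution. The helper `wcss_for_k` is a hypothetical name.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for the task's dataset.
X = load_iris().data  # 150 samples, 4 numeric features


def wcss_for_k(data, k_values):
    """Fit k-means for each k and collect the within-cluster sum of squares."""
    wcss = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0)
        model.fit(data)
        wcss.append(model.inertia_)  # inertia_ is scikit-learn's name for WCSS
    return wcss


k_values = list(range(1, 11))
wcss = wcss_for_k(X, k_values)

for k, value in zip(k_values, wcss):
    print(f"k={k}: WCSS={value:.2f}")
```

Plotting `k_values` against `wcss` (for example with matplotlib) gives the visual representation the task asks for; the "elbow" of that curve suggests the number of clusters.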
Problem Statement:
- Perform ‘Exploratory Data Analysis’ on dataset ‘Retail(Dataset).csv’
- As a business manager, try to find out the weak areas where you can work to
make more profit.
- What business problems can you derive by exploring the data?
- You can use any tool of your choice
(Python/R/Tableau/PowerBI/Excel/SAP/SAS)
- Here is the dataset :
Solution: Exploratory Data Analysis-Retail
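A typical first step for finding weak areas is to aggregate sales and profit by a business dimension and sort by profit. The sketch below uses pandas on a few made-up rows; the column names `Category`, `Sales`, and `Profit`, the values, and the helper `profit_by` are all assumptions and may not match ‘Retail(Dataset).csv’.

```python
import pandas as pd

# Hypothetical rows and column names; the real retail dataset may differ.
df = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology", "Office Supplies", "Technology"],
    "Sales": [200.0, 150.0, 500.0, 80.0, 300.0],
    "Profit": [-20.0, 5.0, 120.0, 10.0, 60.0],
})


def profit_by(frame, column):
    """Total sales and profit per group, with profit margin, weakest group first."""
    summary = frame.groupby(column)[["Sales", "Profit"]].sum()
    summary["Margin"] = summary["Profit"] / summary["Sales"]
    return summary.sort_values("Profit")


summary = profit_by(df, "Category")
weakest = summary.index[0]  # group with the lowest total profit
print(summary)
print("Weakest area in the toy data:", weakest)
```

The same helper can be pointed at other columns (region, segment, and so on, if the dataset has them) to see where profit is lowest.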
Problem Statement:
- Create the Decision Tree classifier and visualize it graphically.
- The purpose is that if we feed any new data to this classifier, it should be able to
predict the right class accordingly.
- Use R or Python to perform this task
- Here is the dataset :
Solution: Prediction using Decision Tree Algorithm
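The workflow described above (train a classifier, inspect the tree, classify new data) might look roughly like this with scikit-learn. The dataset for this task is not named here, so the sketch assumes scikit-learn's bundled Iris data as a stand-in; the new sample values are made up. It is not the linked solution.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: Iris is used as a stand-in; the task's actual dataset is not named here.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

# Train the decision tree on the training split.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Text rendering of the learned splits.
tree_rules = export_text(clf, feature_names=list(iris.feature_names))
print(tree_rules)

# Accuracy on held-out data, then a prediction for one made-up new sample.
accuracy = clf.score(X_test, y_test)
new_sample = [[5.0, 3.4, 1.5, 0.2]]  # sepal length, sepal width, petal length, petal width (cm)
predicted_class = str(iris.target_names[clf.predict(new_sample)[0]])
print(f"Test accuracy: {accuracy:.2f}, predicted class: {predicted_class}")
```

`export_text` prints the tree as indented rules; for a graphical view, `sklearn.tree.plot_tree` draws the same tree with matplotlib.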
Problem Statement:
- Perform ‘explore Business Analytics’ on dataset ‘superstore.csv’
- What business problems can you derive by exploring the data?
- You can use any tool of your choice
(Python/R/Tableau/PowerBI/Excel/SAP/SAS)
- Here is the dataset : Dataset.csv
Solution: To explore Business Analytics
Problem Statement:
- Perform ‘Exploratory Data Analysis’ on dataset ‘Global Terrorism’
- As a security/defense analyst, try to find out the hot zone of terrorism.
- What security issues and insights can you derive by EDA?
- You can use any tool of your choice
- Here is the dataset :
Solution: Exploratory Data Analysis - Terrorism
Problem Statement:
- Perform ‘Exploratory Data Analysis’ on dataset ‘Indian Premier League’
- As a sports analyst, find out the most successful teams, players and factors
contributing to the win or loss of a team.
- Suggest teams or players a company should endorse for its products.
- You can use any tool of your choice
- Here is the dataset :
Solution: Exploratory Data Analysis - Sports
Problem Statement:
- Objective: Create a hybrid model for stock price/performance prediction
using numerical analysis of historical stock prices, and sentiment analysis of
news headlines
- Stock to analyze and predict - SENSEX (S&P BSE SENSEX)
- Use either R or Python, or both for separate analysis and then combine the
findings to create a hybrid model
- You are free to select a different stock to analyze and news dataset as well
while not changing the objective of the task.
- Here is the dataset :
- Download historical stock prices from
- Download textual (news) data from
Solution: Stock Market Prediction using Numerical and Textual Analysis
Demo: Stock Market Prediction using Numerical and Textual Analysis
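The "combine" step of a hybrid model can be sketched as joining price-derived features with a per-day headline sentiment score on the date. Everything in the example below is illustrative: the prices and headlines are made up, the tiny word lists are an assumption standing in for a real sentiment model, and the helpers `sentiment_score` and `build_features` are hypothetical names. It only builds the joint feature table and does not train a predictor, and it is not the linked solution.

```python
import pandas as pd

# Hypothetical toy inputs; real prices and headlines come from the task's sources.
prices = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "close": [100.0, 102.0, 101.0],
})
headlines = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"]),
    "headline": [
        "Markets gain on strong earnings",
        "Stocks fall amid weak demand",
        "Strong rally lifts index",
        "Investors fear loss",
    ],
})

# Tiny illustrative word lists: an assumption, not a real sentiment lexicon.
POSITIVE = {"gain", "strong", "rally"}
NEGATIVE = {"fall", "weak", "fear", "loss"}


def sentiment_score(text):
    """Count positive words minus negative words in one headline."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def build_features(price_frame, headline_frame):
    """Join daily returns (numerical) with mean daily headline sentiment (textual)."""
    daily_sentiment = (
        headline_frame.assign(sentiment=headline_frame["headline"].map(sentiment_score))
        .groupby("date", as_index=False)["sentiment"]
        .mean()
    )
    features = price_frame.sort_values("date").copy()
    features["return"] = features["close"].pct_change()
    return features.merge(daily_sentiment, on="date", how="left")


features = build_features(prices, headlines)
print(features)
```

The resulting table has one row per trading day with both kinds of features, which is the shape a downstream regression or classification model would consume.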
Problem Statement:
- Create a storyboard showing spread of Covid-19 cases in your country or any region (Asia, Europe, BRICS etc) using Tableau, Power BI or SAP
- Identify interesting patterns and possible reasons helping Covid-19 spread with basic as well as advanced charts
- Here is the dataset :
- Dataset: Daily updated .csv file on
Solution: Timeline Analysis : Covid-19
Let's connect! Find me on the web.
If you have any queries or suggestions, feel free to reach out to me.