divya14mishra/bigdataproject

Big Data Project

Project contribution by:
Divya Mishra - [email protected]
Parshad Suthar - [email protected]
Bhanu venkat Kasani - [email protected]

We worked on these 4 use cases in this project.

  1. Covid Data Visualization: For this use case we used a COVID-19 dataset from 2019, when the outbreak first started. Using this dataset we created different kinds of bar charts showing data such as the number of active cases, the mortality rate, the COVID growth rate, etc.

  2. Self Covid Test: In this use case we created a Machine Learning model (Logistic Regression) using health data from about 3000 people. Based on this data we predict whether a person has COVID or not.

  3. Twitter Data Visualization: Here we extracted approximately 65,000 tweets into jsonl files using twarc. We then used Python to extract the necessary data from those files into another json file. The updated json file is loaded into Spark using the pyspark libraries. With the help of Spark SQL we fetched the data and used the matplotlib library to create a bar chart showing unique hashtag counts.

  4. Sentiment Analysis: In this use case we live-stream Twitter data based on hashtags and counts. Authentication, authorization, and the connection to Twitter are all handled with the Python tweepy module.
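The hashtag counting step in use case 3 can be illustrated with a minimal sketch. It uses plain Python (`json` and `collections.Counter`) instead of pyspark and Spark SQL, so it shows the idea rather than the project's actual pipeline. It assumes each jsonl line is a tweet object with hashtags stored under `entities.hashtags[*].text`; the real layout of the twarc output may differ. The function name `count_hashtags` and the sample records are hypothetical.

```python
import json
from collections import Counter


def count_hashtags(jsonl_lines):
    """Count hashtag occurrences (case-insensitive) across tweets, one JSON object per line."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        # Assumed layout: tweet["entities"]["hashtags"] is a list of {"text": ...}
        hashtags = (tweet.get("entities") or {}).get("hashtags") or []
        for tag in hashtags:
            text = tag.get("text")
            if text:
                counts[text.lower()] += 1
    return counts


# Hypothetical sample records standing in for lines of a twarc output file
sample_lines = [
    '{"id": 1, "entities": {"hashtags": [{"text": "COVID19"}, {"text": "StaySafe"}]}}',
    '{"id": 2, "entities": {"hashtags": [{"text": "covid19"}]}}',
    '{"id": 3, "entities": {"hashtags": []}}',
]

# The ten most frequent hashtags as (hashtag, count) pairs
top_hashtags = count_hashtags(sample_lines).most_common(10)
print(top_hashtags)
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of `sample_lines`. The resulting (hashtag, count) pairs can then be split into labels and heights for a matplotlib bar chart.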
