In this program we learn the skills required to organise data, uncover patterns and insights, draw meaningful conclusions and communicate critical findings. We accomplish this using Python (along with NumPy, Pandas, and Matplotlib) and SQL.
In this project, I analysed local and global temperature data and compared the temperature trend where I live with the overall global trend.
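One common way to compare a local temperature series with a global one is to smooth both with a moving average before looking at the gap between them. The sketch below is only a minimal illustration of that idea and not necessarily the method used in this project; the function name `moving_average`, the window size, and the temperature values are all made up for the example.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[end - window:end]) / window
        for end in range(window, len(values) + 1)
    ]


# Hypothetical yearly average temperatures in degrees Celsius (not real data).
local_temps = [15.0, 16.0, 17.0, 16.0, 18.0]
global_temps = [8.0, 8.5, 9.0, 8.5, 9.5]

local_trend = moving_average(local_temps, window=3)
global_trend = moving_average(global_temps, window=3)

# Print the smoothed series side by side with the local-minus-global gap.
for local, world in zip(local_trend, global_trend):
    print(f"local={local:.2f}  global={world:.2f}  gap={local - world:.2f}")
```

A larger window gives a smoother line at the cost of hiding short-term swings, so the window size is a judgment call.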
This project is based on a dataset that collects information from 100k medical appointments in Brazil and focuses on the question of whether or not patients show up for their appointment. Each row includes a number of characteristics about the patient.
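A question like this can be approached by computing the no-show rate for each value of a patient characteristic. The following is a rough sketch under assumed names: the keys `sms_received` and `no_show`, the helper `no_show_rate_by`, and the sample rows are hypothetical and do not come from the actual dataset.

```python
from collections import defaultdict


def no_show_rate_by(rows, key):
    """Return {group value: share of appointments that were missed}."""
    totals = defaultdict(int)
    missed = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        if row["no_show"]:
            missed[group] += 1
    return {group: missed[group] / totals[group] for group in totals}


# Hypothetical appointment records (not real data).
appointments = [
    {"sms_received": True, "no_show": False},
    {"sms_received": True, "no_show": True},
    {"sms_received": False, "no_show": False},
    {"sms_received": False, "no_show": False},
]

rates = no_show_rate_by(appointments, "sms_received")
print(rates)
```

A difference in rates between groups is only a starting point; it does not by itself show that the characteristic causes patients to miss appointments.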
A/B tests are very commonly performed by data analysts and data scientists, so it is important to get some practice working with the difficulties of these tests.
For this project, I worked to understand the results of an A/B test run by an e-commerce website. My goal was to help the company understand whether they should implement the new page, keep the old page, or perhaps run the experiment longer before making their decision.
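One standard way to judge such an experiment is a two-proportion z-test on the conversion rates of the old and new pages. The sketch below assumes a one-sided test (the null hypothesis is that the new page converts no better than the old one) and uses invented counts; the function name `two_proportion_ztest` is hypothetical, and this is not necessarily the exact procedure followed in the project.

```python
import math


def two_proportion_ztest(conv_old, n_old, conv_new, n_new):
    """One-sided z-test. H0: p_new <= p_old, H1: p_new > p_old."""
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    # Pooled conversion rate under the null hypothesis.
    p_pool = (conv_old + conv_new) / (n_old + n_new)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical conversion counts (not the real experiment results).
z, p_value = two_proportion_ztest(
    conv_old=1200, n_old=10000, conv_new=1180, n_new=10000
)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

With a large p-value there is no evidence that the new page converts better, which would point toward keeping the old page or running the experiment longer.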
The dataset that I wrangled (and analyzed and visualized) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.
My goal: wrangle WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations. The Twitter archive is great, but it only contains very basic tweet information. Additional gathering, followed by assessing and cleaning, is required for "Wow!"-worthy analyses and visualizations.
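Part of the cleaning involves pulling the numerator and denominator of a rating such as 13/10 out of the tweet text. The sketch below shows one simple way this might be done with a regular expression; the helper `extract_rating` and the sample tweets are made up for illustration and are not taken from the archive.

```python
import re

# Matches the first "number/number" pattern in a piece of text.
RATING_PATTERN = re.compile(r"(\d+)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) from the first rating found, else None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical tweet texts (not real archive entries).
tweets = [
    "This is Bella. 13/10 would pet again",
    "Just a regular update, no score today",
]

for tweet in tweets:
    print(extract_rating(tweet))
```

A pattern this simple takes the first fraction-like match, so text containing dates or decimal ratings would still need to be assessed and cleaned by hand.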
This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, and many others. A data dictionary explains the variables in the data set. The project is not expected to explore all of the variables in the dataset; instead, the exploration focuses on about 10-15 of them.
- Jupyter notebook
- Pandas
- NumPy
- Seaborn
- Matplotlib
- requests
- Twitter API