Skip to content

KristiGourlay/Subreddit_classification

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Executive Summary

For this project, I successfully scrape two sub-reddits and than used NLP and a Bayes Classifier to predict from which sub-reddit the post came. The two sub-reddits that I chose to compare are:

          -TalesFromYourServer
          -TalesFromTheCustomer

Mainly all of the posts located in 'TalesFromYourServer' have to do with stories being shared by servers and bartenders about their experiences in the industry. The 'TalesFromTheCustomer' are primarily about customers sharing their experiences in restaurants, but also features stories about other service areas. My objective in this project was to train a model to predict from which sub-reddit each comment, labeled 'selftext', belonged. The scraping of the reddit site was performed in the notebook, labeled "scraping". I successfully scraped 40 pages from each sub-reddit and concatenated the two data frames together. 'TalesFromYourServer' was labeled 0 and 'TalesFromTheCustomer' was labeled 1 for classification purposes. The subsequent modeling takes place in the notebook, labeled "modeling".

About

Determine which sub-reddit a post belongs to.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published