The purpose of this project is to build an AI-powered system that detects sarcasm or fake news in news headlines.
The News Headlines Dataset for Sarcasm Detection is collected from two news websites. The Onion aims at producing sarcastic versions of current events. Real (non-sarcastic) news headlines are collected from HuffPost. Sources: The Onion (https://www.theonion.com/) and HuffPost (https://www.huffpost.com/).
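As a rough illustration of how such a dataset could be read, the sketch below parses one JSON record per line into headlines and binary labels. The field names `headline`, `is_sarcastic`, and `article_link` and the one-record-per-line layout are assumptions about the file format, not something stated in this README, and the sample records are made-up placeholders rather than real data.

```python
import json
from typing import Iterable, List, Tuple


def parse_headlines(lines: Iterable[str]) -> Tuple[List[str], List[int]]:
    """Parse JSON-lines records into (headlines, labels).

    Assumes each non-empty line is a JSON object with a "headline" string
    and an "is_sarcastic" flag (1 = sarcastic, 0 = not sarcastic).
    """
    headlines: List[str] = []
    labels: List[int] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        headlines.append(record["headline"])
        labels.append(int(record["is_sarcastic"]))
    return headlines, labels


# Placeholder records for illustration only (not taken from the dataset).
sample_lines = [
    '{"headline": "placeholder satirical headline", "is_sarcastic": 1, '
    '"article_link": "https://www.theonion.com/"}',
    '{"headline": "placeholder news headline", "is_sarcastic": 0, '
    '"article_link": "https://www.huffpost.com/"}',
]
headlines, labels = parse_headlines(sample_lines)
print(list(zip(headlines, labels)))
```

Because the function accepts any iterable of strings, an open file handle can be passed in directly in place of `sample_lines`.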
- NLP
- Data Preprocessing
- TF-IDF
- Logistic Regression
- FastText
- CNN
- LSTM
- NNLM Universal Embedding
- Universal Sentence Encoder
- BERT
- Python
The project first preprocessed the headline data, then applied several vectorization and classification techniques to build binary classifiers on the vectorized data, each predicting whether a headline is sarcasm or fake news. The performance of each method is reported by printing a classification report.
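The workflow described above can be sketched for one of the listed combinations, TF-IDF with Logistic Regression, using scikit-learn. This is a minimal sketch and not the project's actual code: the `preprocess` helper, the cleaning rules, the hyperparameters, and the toy headlines are all assumptions made for illustration.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline


def preprocess(headline: str) -> str:
    """Hypothetical cleaning step: lowercase, drop punctuation, collapse spaces."""
    text = headline.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Invented toy examples (1 = sarcastic, 0 = not sarcastic); not real data.
headlines = [
    "Area man heroically finishes entire sandwich",
    "Local cat announces run for city council",
    "Nation's dogs demand longer walks immediately",
    "Scientists confirm Monday is still the worst",
    "Senate passes budget bill after long debate",
    "City council approves new public transit plan",
    "Study links regular exercise to better sleep",
    "Company reports quarterly earnings above forecast",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

cleaned = [preprocess(h) for h in headlines]

# Vectorizer and classifier chained so that raw text goes in, labels come out.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(cleaned, labels)

# Predicting on the training data only demonstrates the API; a real
# evaluation would use a held-out test split.
predictions = model.predict(cleaned)
report = classification_report(
    labels,
    predictions,
    target_names=["not sarcastic", "sarcastic"],
    zero_division=0,
)
print(report)
```

Swapping the vectorizer or the classifier inside `make_pipeline` is one way to compare other methods under the same reporting step, although the neural models in the list would need their own training code.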