egemenberk/sarcasm_detection

 
 


Sarcasm Detection using Spark

Mert Tunç, Egemen Berk Galatalı


The dataset consists of 1.3 million Reddit comments, each labeled as sarcastic or not. Only the comment text and the label are used; all other columns are ignored. Several preprocessing methods, feature extraction techniques, and ML models are combined to find the best-performing configuration. The code is written in Scala.
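
To make the idea of combining preprocessing, feature extraction, and a model concrete, here is a minimal sketch of one such combination as a Spark ML `Pipeline`. It is not the project's actual code: the README does not name the stages used, so the choice of `Tokenizer`, `StopWordsRemover`, `HashingTF`, `IDF`, and `LogisticRegression`, the column names `comment` and `label`, the hyperparameters, and the toy rows are all assumptions made for illustration.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, IDF, StopWordsRemover, Tokenizer}
import org.apache.spark.sql.SparkSession

object SarcasmPipelineSketch {

  // Assemble one hypothetical combination: tokenize, drop stop words,
  // compute TF-IDF features, then classify with logistic regression.
  def buildPipeline(): Pipeline = {
    val tokenizer = new Tokenizer()
      .setInputCol("comment")   // assumed column name
      .setOutputCol("tokens")
    val remover = new StopWordsRemover()
      .setInputCol("tokens")
      .setOutputCol("filtered")
    val tf = new HashingTF()
      .setInputCol("filtered")
      .setOutputCol("tf")
      .setNumFeatures(1 << 12)  // illustrative value, not tuned
    val idf = new IDF()
      .setInputCol("tf")
      .setOutputCol("features")
    val lr = new LogisticRegression()
      .setLabelCol("label")     // assumed column name
      .setFeaturesCol("features")
      .setMaxIter(20)

    new Pipeline().setStages(Array(tokenizer, remover, tf, idf, lr))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sarcasm-pipeline-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy rows standing in for the Reddit data (1.0 = sarcastic).
    val data = Seq(
      ("Oh great, another Monday morning meeting", 1.0),
      ("Yeah, because that always works so well", 1.0),
      ("The library closes at nine tonight", 0.0),
      ("Thanks for sharing the link to the docs", 0.0)
    ).toDF("comment", "label")

    val model = buildPipeline().fit(data)
    val predictions = model.transform(data)

    // Accuracy on the same toy rows; a real run would use a held-out split.
    val accuracy = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(predictions)
    println(s"Accuracy on toy data: $accuracy")

    spark.stop()
  }
}
```

Swapping any stage (for example, a different feature extractor or classifier) only requires changing the array passed to `setStages`, which is what makes this structure convenient for comparing combinations.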

Currently, the best combination achieves 77% accuracy.

About

Ceng790 Term Project


Languages

  • Scala 100.0%