Research paper about SMS spam classification done for 'Text Analysis and Retrieval' course

marinkreso/SMS-spam-classification-research-paper

Using NLP Techniques and Model Selection to Improve Performance of SMS Spam Classification

ABSTRACT:

Recent reports clearly indicate dramatic growth in the volume of SMS spam messages. SMS spam classification is a challenging problem, as messages of this kind are rife with idioms and abbreviations. The most common baseline solution is the Multinomial Naive Bayes algorithm with bag-of-words term frequencies as features. As an alternative, we propose a pipeline approach that uses NLP (Natural Language Processing) techniques, extracts new features, and performs hyperparameter optimization for the most popular machine learning classification algorithms. Our results on the SMS Spam Collection dataset show that, by incorporating our proposed pipeline approach, an SMS spam classification system can yield a statistically significant performance gain compared to the baseline.
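
To make the baseline concrete, here is a minimal sketch of a Multinomial Naive Bayes classifier over bag-of-words term frequencies, written with only the Python standard library. It is not the paper's code: the tokenizer, the `MultinomialNB` class, the smoothing value `alpha=1.0`, and the four toy messages are all assumptions made for illustration, and the messages are not taken from the SMS Spam Collection dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase, then keep runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


class MultinomialNB:
    """Illustrative Multinomial Naive Bayes over raw term counts."""

    def __init__(self, alpha=1.0):
        # alpha is the additive (Laplace) smoothing strength; a tunable hyperparameter.
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.term_counts = defaultdict(Counter)   # term frequencies per class
        for text, label in zip(texts, labels):
            self.term_counts[label].update(tokenize(text))
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, text):
        # Terms never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log term likelihoods.
            score = math.log(doc_count / n_docs)
            total = sum(self.term_counts[label].values()) + self.alpha * len(self.vocab)
            for t in tokens:
                score += math.log((self.term_counts[label][t] + self.alpha) / total)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy messages invented for this sketch (hypothetical data).
train_texts = [
    "WIN a free prize now, txt CLAIM to 80080",
    "URGENT! You have won a free cash prize, call now",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNB(alpha=1.0).fit(train_texts, train_labels)
print(model.predict("free prize, call now to claim"))
print(model.predict("see you at lunch today"))
```

In a setup like this, the smoothing strength `alpha` is one example of a value that hyperparameter optimization could tune, and the tokenizer is where additional NLP preprocessing would plug in.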
