Naive Bayes Spam Filter

This is my first personal project related to Data Science/Machine Learning. Basically, it is a email spam filter using Naive Bayes. The program is written in Python and uses the scikit-learn library to implement Machine Learning.

The filter.py program does a simple training over emails that are known to be spam or not("ham"), utilizing some basic text analysis, the program counts the words over all emails and trains a Naive Bayes model over these words and the amount of times they show in spams and hams.

As of November 8 of 2018, the model needs to be adjusted a little more. Testing only over "ling-spam", the program predicts with an accuracy of ~60% using a Bernoulli based Naive Bayes method, which can certainly be improved on.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
Emails		Emails
README.md		README.md
filter.py		filter.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Naive Bayes Spam Filter

About

Releases

Packages

Languages

danielferragut/NaiveBayesSpamFilter

Folders and files

Latest commit

History

Repository files navigation

Naive Bayes Spam Filter

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages