A Multinomial Naive Bayes classifier is trained and used to predict sentiment (it can also be used for multi-class text classification, not just classes 1 and 0).
A tokenizer removes all HTML tags and special characters, as well as English stopwords.
Naive Bayes is used to predict whether the sentiment of a review is positive or negative (a 2-class predictor).
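A minimal sketch of this kind of pipeline is shown below. The CSV name, column names, train/test split, and exact cleaning steps (regex-based tag/character stripping plus scikit-learn's built-in English stopword list) are assumptions for illustration and may differ from what sentiment_predictor.py actually does.

```python
# Sketch only: the file name, column names and cleaning details are assumptions,
# not necessarily what sentiment_predictor.py does.
import re

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def clean(text):
    """Strip HTML tags and special characters; stopwords are removed by the vectorizer."""
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # drop special characters
    return text.lower()

df = pd.read_csv("data/IMDB Dataset.csv")                 # hypothetical file name
X = df["review"].apply(clean)
y = (df["sentiment"] == "positive").astype(int)           # 1 = positive, 0 = negative

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)

vectorizer = CountVectorizer(stop_words="english")        # drops English stopwords
model = MultinomialNB()
model.fit(vectorizer.fit_transform(X_train), y_train)

print(model.predict(vectorizer.transform(["What a great movie!"])))  # e.g. [1]
```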
Accuracy (86%):
|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| 0            | 0.89      | 0.81   | 0.85     | 2481    |
| 1            | 0.83      | 0.90   | 0.86     | 2519    |
| accuracy     |           |        | 0.86     | 5000    |
| macro avg    | 0.86      | 0.86   | 0.86     | 5000    |
| weighted avg | 0.86      | 0.86   | 0.86     | 5000    |
Confusion matrix: 460 negative reviews were wrongly predicted as positive and 262 positive reviews were wrongly predicted as negative on the test data.
Finally, the held-out test data was used to see how well the model predicts sentiment on unseen data.
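The numbers above can be reproduced with scikit-learn's reporting helpers; the sketch below reuses the model, vectorizer, and test split from the earlier sketch, so the exact values will depend on your data and split.

```python
# Evaluate on the held-out test set (model, vectorizer, X_test and y_test
# come from the training sketch above, not from the repo itself).
from sklearn.metrics import classification_report, confusion_matrix

y_pred = model.predict(vectorizer.transform(X_test))

print(classification_report(y_test, y_pred))  # per-class precision/recall/f1 plus accuracy
print(confusion_matrix(y_test, y_pred))       # rows = actual class, columns = predicted class
```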
The output is shown here:
- Make sure you have numpy, seaborn, pandas, scikit-learn, and wordcloud installed (if not, just `pip install <libname>`); seaborn and wordcloud are used for the visualizations sketched after this list
- Clone the repo
- Download new data from Kaggle (or anywhere else), or use the data provided in the `data` folder
- Run `sentiment_predictor.py`
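Since seaborn and wordcloud are listed as dependencies, the script presumably also produces plots; the sketch below is a hypothetical example of that kind of output, reusing names from the earlier sketches rather than the repo's actual plotting code.

```python
# Hypothetical visualization step; y_test, y_pred and X_train come from the
# earlier sketches, and the repo's actual plots may look different.
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
from wordcloud import WordCloud

# Confusion matrix as a seaborn heatmap
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt="d",
            xticklabels=["negative", "positive"],
            yticklabels=["negative", "positive"])
plt.title("Confusion matrix")
plt.show()

# Word cloud of the (cleaned) training reviews; a subset keeps it quick
wc = WordCloud(width=800, height=400, background_color="white").generate(" ".join(X_train[:1000]))
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.show()
```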