Implementations of several information retrieval methods, written by me in pure Java. Some of them still need performance improvements. To run the code, create a TextFolder directory and put the data in it. For more information, see the readFile method of the Retrieval class.
- Blocked sort-based indexing
- Boolean retrieval
- Naive Bayes
- Positional posting list
- Posting list
- Tf-idf
- Rocchio (not completed)
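
To give a rough idea of how two of the listed methods fit together, here is a minimal sketch of a posting list index and a Boolean AND query over it. It is not the code from this repository: the class `BooleanAndSketch` and its methods `buildIndex`, `intersect`, and `and` are hypothetical names chosen for illustration, and tokenization is assumed to be simple lowercasing and whitespace splitting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the Retrieval class of this repository.
final class BooleanAndSketch {

    // Build a map from term to its posting list (sorted document IDs).
    static Map<String, List<Integer>> buildIndex(List<String> docs) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            // Assumption: lowercase and split on whitespace; no stemming or stop words.
            for (String token : docs.get(docId).toLowerCase().trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                List<Integer> postings = index.computeIfAbsent(token, k -> new ArrayList<>());
                // Documents are visited in increasing ID order, so checking the
                // last entry is enough to avoid duplicate IDs.
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    // Merge-style intersection of two sorted posting lists.
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i);
            int y = b.get(j);
            if (x == y) {
                result.add(x);
                i++;
                j++;
            } else if (x < y) {
                i++;
            } else {
                j++;
            }
        }
        return result;
    }

    // Boolean AND: documents that contain both terms.
    static List<Integer> and(Map<String, List<Integer>> index, String t1, String t2) {
        List<Integer> empty = Collections.emptyList();
        return intersect(index.getOrDefault(t1.toLowerCase(), empty),
                         index.getOrDefault(t2.toLowerCase(), empty));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "New York is large", "York is old", "a new car in york");
        Map<String, List<Integer>> index = buildIndex(docs);
        System.out.println(and(index, "new", "york")); // prints [0, 2]
    }
}
```

Because both posting lists are kept sorted, the intersection walks each list once instead of comparing every pair of document IDs.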
Data can be reached from here.
Test queries: New York, np vp advp pp nnp, etc.
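
A query such as New York is a phrase query, which is what a positional posting list is for. The following is a minimal sketch of that idea, not the implementation in this repository: the class `PhraseQuerySketch` and its methods `buildIndex` and `phraseQuery` are hypothetical, the sample documents are made up, and tokens are assumed to be lowercased and split on whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the Retrieval class of this repository.
final class PhraseQuerySketch {

    // Positional index: term -> (document ID -> positions of the term in that document).
    static Map<String, Map<Integer, List<Integer>>> buildIndex(List<String> docs) {
        Map<String, Map<Integer, List<Integer>>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            String[] tokens = docs.get(docId).toLowerCase().trim().split("\\s+");
            for (int pos = 0; pos < tokens.length; pos++) {
                if (tokens[pos].isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(tokens[pos], k -> new TreeMap<>())
                     .computeIfAbsent(docId, k -> new ArrayList<>())
                     .add(pos);
            }
        }
        return index;
    }

    // Return the IDs of documents in which the terms of the phrase occur consecutively.
    static List<Integer> phraseQuery(Map<String, Map<Integer, List<Integer>>> index,
                                     String phrase) {
        String[] terms = phrase.toLowerCase().trim().split("\\s+");
        List<Integer> hits = new ArrayList<>();
        Map<Integer, List<Integer>> first =
                index.getOrDefault(terms[0], Collections.emptyMap());
        for (Map.Entry<Integer, List<Integer>> entry : first.entrySet()) {
            int docId = entry.getKey();
            // Try every occurrence of the first term as the start of the phrase.
            for (int start : entry.getValue()) {
                boolean match = true;
                for (int i = 1; i < terms.length && match; i++) {
                    List<Integer> positions =
                            index.getOrDefault(terms[i], Collections.emptyMap()).get(docId);
                    // The i-th term must appear exactly i positions after the start.
                    match = positions != null && positions.contains(start + i);
                }
                if (match) {
                    hits.add(docId);
                    break; // one match per document is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "New York is large", "York is new", "a new car in york", "I love new york");
        Map<String, Map<Integer, List<Integer>>> index = buildIndex(docs);
        System.out.println(phraseQuery(index, "New York")); // prints [0, 3]
    }
}
```

Documents 1 and 2 contain both words but not next to each other in that order, so a plain Boolean AND would return them while the positional check does not.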