##Website
##PreProcessing Main Files:
- anime.json
  - contains all the anime/manga from MyAnimeList at the time we obtained the data
  - 42k records of anime/manga
- users.json.zip
  - contains users and the ratings they gave different anime/manga
  - 200 million records
To obtain our data, we generate the following files:
- anime-min.json
  - a filtered version of anime.json containing only anime
- labels-min.json
  - produced by running all pictures from anime-min.json through i2v and retrieving labels for each picture
- labels2-min.json
  - produced by running all pictures from test-2.json through i2v and retrieving labels for each picture
- test-2.json
  - extra pictures crawled with scrapy
- users-200.min.json
  - 200 records extracted from users.json.zip (see the extraction sketch after this list)
- users-1000.min.json
  - 1000 records extracted from users.json.zip
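The users-*.min.json files are just truncated extracts of the full ratings dump. A minimal sketch of that extraction, assuming users.json.zip wraps a single users.json file containing one JSON array (the inner filename and record shape are assumptions, not the project's actual script):

```python
import json
import zipfile

# Rough sketch: keep the first n user records from the zipped dump.
# Assumption: users.json.zip contains a single users.json holding a JSON array.
def extract_users(zip_path, out_path, n):
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open("users.json") as f:
            users = json.load(f)
    with open(out_path, "w") as out:
        json.dump(users[:n], out)

extract_users("users.json.zip", "users-200.min.json", 200)
extract_users("users.json.zip", "users-1000.min.json", 1000)
```

For the full 200-million-record dump a streaming parser would be kinder to memory; the snippet only shows the idea.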
###How to generate our data
#####******* NOTE: All directory paths must be changed to fit your setup *******
First, run preProcessData.py:

    python preProcessData.py

This will create anime-min.json.
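For context, preProcessData.py boils down to filtering anime.json to anime-only entries. A minimal sketch of that idea, assuming each record carries a "type" field (the field name and its values are assumptions, not the script's actual schema):

```python
import json

# Load the full anime/manga dump.
with open("anime.json") as f:
    records = json.load(f)

# Keep only anime entries; the "type" field is an assumed schema detail.
anime_only = [r for r in records if r.get("type", "").lower() == "anime"]

with open("anime-min.json", "w") as f:
    json.dump(anime_only, f)
```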
In order to execute retrievePicTag.py, you must first clone the i2v repo and install its dependencies:

    pip install scikit-image
    pip install numpy
    pip install scipy
    pip install Pillow
    pip install chainer

Then download the following files from the i2v website:
- tag_list.json
- illust2vec_tag_ver200.caffemodel

You will then have to move "retrievePicTag.py" inside the cloned repo. Running retrievePicTag.py will generate labels-min.json:

    python retrievePicTag.py
The command above will take different amounts of time depending on your CPU:
- i5: 5min per picture
- i7: 500ms per picture
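For reference, tagging a single picture with i2v follows roughly the pattern below. This is a sketch based on illustration2vec's chainer interface; the image path and threshold are placeholders, not necessarily what retrievePicTag.py uses:

```python
import i2v
from PIL import Image

# Load the tag model and tag list downloaded above.
illust2vec = i2v.make_i2v_with_chainer(
    "illust2vec_tag_ver200.caffemodel", "tag_list.json")

# Placeholder image path; retrievePicTag.py loops over every picture from anime-min.json.
img = Image.open("images/example.jpg")

# Returns one dict of plausible tags (with confidences) per input image.
tags = illust2vec.estimate_plausible_tags([img], threshold=0.5)
print(tags)
```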
We must then retrieve all the extra pictures using scrapy.
#####Install

    pip install scrapy

#####Execute
This will take about 8 hrs:

    cd myAnimeList
    scrapy crawl myAnimeList --set DOWNLOAD_DELAY=8 -o test-2.json
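The spider itself lives in the myAnimeList project; the command above just runs it with a polite download delay and dumps the scraped items to test-2.json. As an illustration only, a spider in that shape might look like the sketch below (the seed URL, selectors, and item fields are assumptions, not the project's actual spider):

```python
import scrapy

class MyAnimeListSpider(scrapy.Spider):
    # Name matches the "scrapy crawl myAnimeList" command above.
    name = "myAnimeList"
    # Assumed seed page; the real spider's start URLs may differ.
    start_urls = ["https://myanimelist.net/topanime.php"]

    def parse(self, response):
        # Assumed selectors: yield each anime's page link and cover image URL.
        for row in response.css("tr.ranking-list"):
            yield {
                "url": row.css("a.hoverinfo_trigger::attr(href)").extract_first(),
                "image": row.css("img::attr(data-src)").extract_first(),
            }
```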
You will need the data from the step above for the step below.
Executing all cells in "notebooks/retrieve labels round 2.ipynb" will generate labels2-min.json.
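Conceptually, the round-2 notebook repeats the i2v step over the crawled items in test-2.json. A self-contained sketch of that loop, where the "image" field name and the output shape are assumptions about the notebook rather than its actual code:

```python
import io
import json
import urllib.request

import i2v
from PIL import Image

# Same model files as in the retrievePicTag.py step.
illust2vec = i2v.make_i2v_with_chainer(
    "illust2vec_tag_ver200.caffemodel", "tag_list.json")

# Assumption: test-2.json is a JSON array of crawled items with an "image" URL.
with open("test-2.json") as f:
    items = json.load(f)

labels = {}
for item in items:
    # Download each crawled picture and tag it with i2v.
    with urllib.request.urlopen(item["image"]) as resp:
        img = Image.open(io.BytesIO(resp.read()))
    labels[item["image"]] = illust2vec.estimate_plausible_tags([img], threshold=0.5)

with open("labels2-min.json", "w") as f:
    json.dump(labels, f)
```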
###Team
- Nicolas Botello
- Yang Yang
- Austin Tang
- Connor Flatt