Code for the Relevant Word Generator (RWG) + Dual Vocabulary Sequence-to-Sequence (DV_Seq2Seq) generative framework.
As presented in the paper: A Deep Generative Approach to Search Extrapolation and Recommendation.
The repository contains an implementation of the RWG model, an implementation of the DV_Seq2Seq model, a driver file for training/testing the RWG model, and a driver file for training/testing the DV_Seq2Seq model.
A pre-trained word vector file is needed. Put the file inside the libs/ folder and pass its name with the command-line argument -word_vec_file.
The word vector file must be in a format that gensim.models.KeyedVectors.load_word2vec_format(...) can load.
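Before starting a long training run, it can help to sanity-check the word vector file. The sketch below is not part of the repository; it assumes the file uses the plain-text (non-binary) word2vec layout, where the first line holds the vocabulary size and the vector dimension and every following line holds a word followed by its values. The function names and the toy file are hypothetical and only serve as illustration.

```python
import os
import tempfile


def write_toy_vectors(path):
    """Write a tiny word2vec-style text file (hypothetical demo data)."""
    lines = [
        "2 3",                 # header: vocabulary size, vector dimension
        "search 0.1 0.2 0.3",  # word followed by its vector values
        "query 0.4 0.5 0.6",
    ]
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


def check_word2vec_text_header(path):
    """Return (vocab_size, dim) and verify that the first vector matches dim."""
    with open(path, encoding="utf-8") as f:
        header = f.readline().split()
        if len(header) != 2:
            raise ValueError("expected a header line: '<vocab_size> <dim>'")
        vocab_size, dim = int(header[0]), int(header[1])
        first = f.readline().split()
        if len(first) != dim + 1:
            raise ValueError("first vector does not have %d values" % dim)
        # Raises ValueError if any value is not a number.
        [float(value) for value in first[1:]]
    return vocab_size, dim


if __name__ == "__main__":
    toy_path = os.path.join(tempfile.mkdtemp(), "toy_vectors.txt")
    write_toy_vectors(toy_path)
    print(check_word2vec_text_header(toy_path))
```

If the file is in the binary word2vec format instead, this check does not apply; loading the file directly with gensim is then the simplest test.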
Due to privacy concerns, the original data cannot be released. Put your own data inside data/:
The expected data format for the RWG model (one line per instance):
Input words | target relevant words
The expected data format for the DV_Seq2Seq model (one line per instance):
Input words | relevant words | target output words
Words must be separated by a single space.
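The two line formats above can be checked with a small parser before training. This is a minimal sketch, not the repository's own loading code: it assumes the literal "|" character separates the fields and that no word contains "|". The function name parse_instance and the sample lines are hypothetical.

```python
def parse_instance(line, num_fields):
    """Split one data line into num_fields lists of words.

    Use num_fields=2 for RWG lines (input | relevant words) and
    num_fields=3 for DV_Seq2Seq lines (input | relevant | output).
    """
    fields = line.rstrip("\n").split("|")
    if len(fields) != num_fields:
        raise ValueError(
            "expected %d fields, found %d: %r" % (num_fields, len(fields), line)
        )
    # Drop the padding around each '|' and split the words on whitespace.
    return tuple(field.strip().split() for field in fields)


if __name__ == "__main__":
    print(parse_instance("cheap flights | airline ticket deals", 2))
    print(parse_instance("cheap flights | airline ticket | book cheap airline tickets", 3))
```

Running it over every line of the train, validation and test files is a quick way to catch lines with a missing or extra field.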
Each model needs a train data file, a validation data file and a test data file; the filenames can be changed inside the driver files.
Requires PyTorch 0.4.1+ and Python 3.