Sequence encoder decoder #17

basma-b opened this issue Dec 6, 2017 · 10 comments

basma-b commented Dec 6, 2017

I am using your decoder to implement my sequence encoder/decoder, but I don't know how to get the decoder output into the same shape as my input. My input is (None, MAX_SEQ_LENGTH); these are my dialogue turns, which I encode and then decode to get the next dialogue turn, which has the same shape, i.e. (None, MAX_SEQ_LENGTH). However, the decoder returns a 3D tensor because of return_sequences=True.

Can you please help me with this?

basma-b commented Dec 6, 2017

Here is a small code snippet:

```python
from keras.models import Sequential
from keras.layers import Embedding, LSTM
# AttentionDecoder is the custom recurrent layer from this repository

seq2seq = Sequential()  # my turn shape=(None, MAX_SEQUENCE_LENGTH)
seq2seq.add(Embedding(output_dim=args.emb_dim,
                      input_dim=MAX_NB_WORDS,
                      input_length=MAX_SEQUENCE_LENGTH,
                      weights=[embedding_matrix],
                      mask_zero=True,
                      trainable=True))
seq2seq.add(LSTM(units=args.hidden_size, return_sequences=True))
seq2seq.add(AttentionDecoder(args.hidden_size, args.emb_dim))  # the decoded shape=(None, MAX_SEQUENCE_LENGTH, args.emb_dim)
seq2seq.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
```

zafarali commented Dec 6, 2017 via email

basma-b commented Dec 6, 2017

Not sure, no. This is just what is actually being done inside; it's only there to explain, and you can ignore it.

zafarali commented Dec 6, 2017 via email

basma-b commented Dec 6, 2017

No, but can you give me more hints, please?

zafarali commented Dec 6, 2017 via email

basma-b commented Dec 6, 2017

Well, I think the problem was not clear, but I have now found a way to transform my decoder output into classes instead of probabilities using argmax. So now my output has the shape (MAX_SEQUENCE_LENGTH,), the same as my inputs (leaving out the batch size here).

Now I have another question: since my outputs are classes instead of probabilities, which loss function could I use in this case?
Example:
output labels = [0, 0, 1, 12, 165, 3]
predicted_labels = [0, 0, 13, 166, 3]
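
For reference, a minimal sketch of the argmax conversion described above; `decoder_probs` and the dimensions below are hypothetical, chosen only to make the snippet runnable:

```python
import numpy as np

# Hypothetical (MAX_SEQUENCE_LENGTH, vocab_size) softmax output of the decoder
# for one sample, e.g. MAX_SEQUENCE_LENGTH=6 and a vocabulary of 200 words.
decoder_probs = np.random.rand(6, 200)

# argmax over the vocabulary axis turns probabilities into class indices.
predicted_labels = np.argmax(decoder_probs, axis=-1)
print(predicted_labels.shape)  # (6,) -- same shape as one input turn
```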

zafarali commented Dec 6, 2017 via email

basma-b commented Dec 6, 2017

> You will need to convert your predicted labels into one-hot vectors representing the class

And how can I do this?

Ap1075 commented May 28, 2018

For one-hot encoding you could use the `to_categorical` function in Keras; look it up in the documentation. To link your encoder to the attention decoder, you could use the `RepeatVector` layer to change the dimensions according to your requirements. Hope this helps.
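
A minimal sketch of both suggestions; the vocabulary size, sequence length, and label values below are hypothetical, just to keep the snippet self-contained:

```python
from keras.utils import to_categorical
from keras.layers import RepeatVector

MAX_NB_WORDS = 200        # hypothetical vocabulary size, for illustration only
MAX_SEQUENCE_LENGTH = 6   # hypothetical sequence length

# One target turn as integer class indices, converted to one-hot vectors,
# which is the target format categorical_crossentropy expects.
labels = [0, 0, 1, 12, 165, 3]
one_hot = to_categorical(labels, num_classes=MAX_NB_WORDS)
print(one_hot.shape)  # (6, 200)

# RepeatVector repeats a 2D tensor (batch, features) along a new time axis,
# which is one way to feed a fixed-length encoder summary into a sequence decoder:
#   model.add(LSTM(units=hidden_size))             # return_sequences=False -> 2D output
#   model.add(RepeatVector(MAX_SEQUENCE_LENGTH))   # -> (batch, MAX_SEQUENCE_LENGTH, hidden_size)
```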
