Commit
1 parent f9ff24d, commit 79afbf1
Showing 86 changed files with 69 additions and 98 deletions.
Empty file modified
0
site/_posts/2017-04-28-Simple Baseline for Visual Question Answering.md
100644 → 100755
Empty file modified
0
...in VQA Matter - Elevating the Role of Image Understanding in Visual Question Answering.md
100644 → 100755
Empty file modified
0
site/_posts/2017-06-03-A Fast and Accurate Dependency Parser using Neural Networks.md
100644 → 100755
Empty file modified
0
site/_posts/2017-06-17-A Decomposable Attention Model for Natural Language Inference.md
100644 → 100755
Empty file modified
0
site/_posts/2017-06-26-Two-Too Simple Adaptations of Word2Vec for Syntax Problems.md
100644 → 100755
97 changes: 0 additions & 97 deletions
97
...-09-Ask Me Anything: Dynamic Memory Networks for Natural Language Processing.md
This file was deleted.
Empty file modified
0
...sts/2017-07-17-Principled Detection of Out of Distribution Examples in Neural Networks.md
100644 → 100755
Empty file modified
0
site/_posts/2017-07-24-ReasoNet - Learning to Stop Reading in Machine Comprehension.md
100644 → 100755
Empty file modified
0
site/_posts/2017-08-07-R-NET - Machine Reading Comprehension with Self-matching Networks.md
100644 → 100755
Empty file modified
0
site/_posts/2017-08-21-Learning to Compute Word Embeddings On the Fly.md
100644 → 100755
67 changes: 67 additions & 0 deletions
67
...Source Representations with Relation Networks for Neural Machine Translation.md
@@ -0,0 +1,67 @@
---
layout: post
title: Refining Source Representations with Relation Networks for Neural Machine Translation
comments: True
excerpt:
tags: ['2017', 'Relational Network', 'Representation Learning', AI, NLP, NMT]
---

## Introduction

* The paper introduces the Relation Network (RN), which refines the encoder's representation of the given source document (or sentence).
* This refined source representation can then be used in Neural Machine Translation (NMT) systems to counter the problem of RNNs forgetting old information.
* [Link to the paper](https://arxiv.org/abs/1709.03980)

## Limitations of existing NMT models

* The RNN encoder-decoder architecture is the standard choice for NMT systems, but RNNs are prone to forgetting old information.
* In NMT models, attention is computed over individual words, while phrases (instead of words) would be a better unit.
* While NMT systems might capture certain relationships between words, they are not explicitly designed to capture such information.

## Contributions of the paper

* Learn the relationships between the source words using their context (neighboring words).
* Relation Networks (RNs) build pairwise relations between source words using the representations generated by the RNNs. The RN sits between the encoder and the attention layer of the encoder-decoder framework, thereby keeping the main architecture unaffected.

## Relation Network

* A neural network designed for relational reasoning.
* Given a set of inputs *O = {o<sub>1</sub>, ..., o<sub>n</sub>}*, the RN is formed as a composition of functions over pairs of inputs: RN(O) = f(Σ<sub>i,j</sub> g(o<sub>i</sub>, o<sub>j</sub>)), where *f* and *g* are the functions (feed-forward networks) used to learn the relations.
* *g* learns how the objects are related, hence the name "relation".
* **Components**:
    * CNN Layer
        * Extracts information from the words surrounding the given word (context).
        * The final output of this layer is a sequence of vectors for the different kernel widths.

    * Graph Propagation (GP) Layer
        * Connects all the words with each other in the form of a graph.
        * Each output vector from the CNN corresponds to a node in the graph, and there is an edge between every pair of nodes.
        * Information flows between the nodes of the graph in a message-passing fashion (graph propagation) to obtain a new vector for each node.

    * Multi-Layer Perceptron (MLP) Layer
        * The representation from the GP layer is fed to the MLP layer.
        * The layer uses residual connections from previous layers in the form of concatenation.
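
The composition RN(O) = f(Σ g(o<sub>i</sub>, o<sub>j</sub>)) and the fully connected message passing described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the paper's implementation: `make_dense`, `relation_network`, and `graph_propagation` are hypothetical names, the weights are random rather than trained, each pair is passed to *g* as a concatenation, and the mean-based update in `graph_propagation` is an assumed aggregation rule that this summary does not specify.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Sequence[float]], Vector]


def make_dense(in_dim: int, out_dim: int, rng: random.Random) -> Layer:
    """Return a one-layer feed-forward map x -> tanh(Wx) with random weights.

    Hypothetical stand-in for the trained networks f and g.
    """
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def dense(x: Sequence[float]) -> Vector:
        if len(x) != in_dim:
            raise ValueError("unexpected input dimension")
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return dense


def relation_network(objects: Sequence[Vector], g: Layer, f: Layer) -> Vector:
    """Compute RN(O) = f(sum over all pairs (i, j) of g(o_i, o_j)).

    Assumption: the pair (o_i, o_j) is fed to g as one concatenated vector.
    """
    if not objects:
        raise ValueError("need at least one object")
    total = None
    for o_i in objects:
        for o_j in objects:
            relation = g(list(o_i) + list(o_j))
            if total is None:
                total = relation
            else:
                total = [a + b for a, b in zip(total, relation)]
    return f(total)


def graph_propagation(nodes: Sequence[Vector]) -> List[Vector]:
    """One round of message passing on a fully connected graph.

    Assumed update rule: each node is averaged with the mean of all other nodes.
    """
    n = len(nodes)
    if n < 2:
        return [list(node) for node in nodes]
    dim = len(nodes[0])
    updated = []
    for i, node in enumerate(nodes):
        incoming = [sum(nodes[j][d] for j in range(n) if j != i) / (n - 1)
                    for d in range(dim)]
        updated.append([0.5 * (a + b) for a, b in zip(node, incoming)])
    return updated


# Usage: three 2-dimensional "word" vectors, g maps pairs (4 dims) to 3 dims,
# f maps the summed relation vector (3 dims) to a 2-dimensional output.
rng = random.Random(0)
g = make_dense(4, 3, rng)
f = make_dense(3, 2, rng)
words = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
refined = relation_network(graph_propagation(words), g, f)
```

Because the pair outputs are summed before *f* is applied, the result does not depend on the order of the input objects (up to floating-point rounding).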

## Datasets

* IWSLT Data - 44K sentences from the tourism and travel domain.
* NIST Data - 1M Chinese-English parallel sentence pairs.

## Models

* MOSES - Open source translation system - http://www.statmt.org/moses/
* NMT - Attention-based NMT
* NMT+ - NMT with improved decoder
* TRANSFORMER - Google's new NMT
* RNMT+ - Relation Network integrated with NMT+

## Evaluation Metric

* Case-insensitive 4-gram BLEU score
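
A case-insensitive 4-gram BLEU score can be sketched as below. This is a minimal sketch, assuming whitespace tokenization, one reference per hypothesis, and no smoothing; `ngram_counts` and `corpus_bleu` are hypothetical helper names, and the exact evaluation script used in the paper is not given in this summary.

```python
import math
from collections import Counter
from typing import List, Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count all n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU with clipped n-gram precision and a brevity penalty.

    Assumptions: lowercasing makes the score case-insensitive, tokens are
    split on whitespace, and each hypothesis has exactly one reference.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = hyp.lower().split()
        ref_tokens = ref.lower().split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(count, ref_counts[gram])
                                  for gram, count in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with no match gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Usage: casing differences do not affect the score.
score = corpus_bleu(["The cat sat on a mat"], ["the cat sat on the mat"])
```

The geometric mean over the four n-gram orders is why the metric is called 4-gram BLEU; a hypothesis shorter than its reference is additionally penalized by the brevity penalty.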

## Observations

* As sentences get longer (more than 50 words), RNMT+ clearly outperforms the other baselines.
* Qualitative evaluation shows that the RNMT+ model captures word alignment better than the NMT+ model.
* Similarly, the NMT+ system tends to miss some information from the source sentence (more so for longer sentences). While both CNNs and RNNs are weak at capturing long-term dependencies, using the relation layer mitigates this issue to some extent.
Submodule _site updated from 694ba0 to ff5079
Empty file modified
0
site/public/font-awesome-4.7.0/fonts/fontawesome-webfont.woff2
100644 → 100755