Papers by category

Neural Machine Translation

Year	Authors	Conf.	Title	Links
2016	Yang et al.	NAACL-HLT'16	Hierarchical Attention Networks for Document Classification	[pdf]
2016	Zoph et al.	arXiv	Multi-Source Neural Translation	[pdf]
2017	Vaswani et al.	NIPS'17	Attention Is All You Need	[pdf] [github]
2017	Xia et al.	NIPS'17	Deliberation Networks: Sequence Generation Beyond One-Pass Decoding	[pdf] [github]
2018	Miculicich et al.	EMNLP'18	Document-Level Neural Machine Translation with Hierarchical Attention Networks	[pdf]
2018	Devlin et al.	arXiv	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding	[pdf] [github]
2018	Yang et al.	NAACL-HLT'18	Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets	[pdf]
2018	Wu et al.	ACML'18	Adversarial Neural Machine Translation	[pdf]
2019	Dai et al.	ACL'19	Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context	[pdf] [github]
2019	Yang et al.	arXiv	XLNet: Generalized Autoregressive Pretraining for Language Understanding	[pdf] [github]
2019	Liu et al.	ACL'19	Hierarchical Transformers for Multi-Document Summarization	[pdf] [github]
2019	Pourdamghani et al.	ACL'19	Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation	[pdf]
2019	Zhou et al.	arXiv	Synchronous Bidirectional Neural Machine Translation	[pdf] [github]

Multimodal Language Models

Year	Authors	Conf.	Title	Links
2011	Jia et al.	ICCV'11	Learning Cross-modality Similarity for Multinomial Data	[pdf]
2014	Mao et al.	arXiv	Explain Images with Multimodal Recurrent Neural Networks	[pdf]
2014	Kiros et al.	arXiv	Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models	[pdf]
2015	Ma et al.	ICCV'15	Multimodal Convolutional Neural Networks for Matching Image and Sentence	[pdf]
2015	Mao et al.	ICLR'15	Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)	[pdf] [github]
2016	Yang et al.	NIPS'16	Review Networks for Caption Generation	[pdf] [github]
2016	You et al.	CVPR'16	Image Captioning with Semantic Attention	[pdf]
2016	Lu et al.	NIPS'16	Hierarchical Question-Image Co-Attention for Visual Question Answering	[pdf] [github]
2018	Anderson et al.	CVPR'18	Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering	[pdf]
2018	Nguyen et al.	CVPR'18	Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering	[pdf]
2018	Wang et al.	NAACL-HLT'18	Object Counts! Bringing Explicit Detections Back into Image Captioning	[pdf]
2019	Qin et al.	CVPR'19	Look Back and Predict Forward in Image Captioning	[pdf]
2019	Li et al.	AAAI'19	Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering	[pdf]
2019	Yu et al.	CVPR'19	Deep Modular Co-Attention Networks for Visual Question Answering	[pdf]

Multimodal Machine Translation

Year	Authors	Conf.	Title	Links
2016	Caglayan et al.	WMT'16	Does Multimodality Help Human and Machine for Translation and Image Captioning?	[pdf]
2016	Caglayan et al.	arXiv	Multimodal Attention for Neural Machine Translation	[pdf]
2016	Huang et al.	WMT'16	Attention-based Multimodal Neural Machine Translation	[pdf]
2017	Nakayama et al.	arXiv	Zero-resource Machine Translation by Multimodal Encoder-decoder Network with Multimedia Pivot	[pdf]
2017	Delbrouck et al.	ICLR'17	Multimodal Compact Bilinear Pooling for Multimodal Neural Machine Translation	[pdf]
2017	Lala et al.	PBML'17	Unraveling the Contribution of Image Captioning and Neural Machine Translation for Multimodal Machine Translation	[pdf]
2017	Chen et al.	arXiv	A Teacher-Student Framework for Zero-Resource Neural Machine Translation	[pdf]
2017	Elliott et al.	arXiv	Imagination improves Multimodal Translation	[pdf]
2017	Elliott et al.	WMT'17	Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description	[pdf]
2017	Calixto et al.	arXiv	Doubly-Attentive Decoder for Multi-modal Neural Machine Translation	[pdf] [github]
2017	Libovicky et al.	ACL'17	Attention Strategies for Multi-Source Sequence-to-Sequence Learning	[pdf]
2017	Calixto et al.	EMNLP'17	Incorporating Global Visual Features into Attention-Based Neural Machine Translation	[pdf]
2018	Barrault et al.	WMT'18	Findings of the Third Shared Task on Multimodal Machine Translation	[pdf]
2018	Caglayan et al.	WMT'18	LIUM-CVC Submissions for WMT18 Multimodal Translation Task	[pdf]
2018	Gronroos et al.	WMT'18	The MeMAD Submission to the WMT18 Multimodal Translation Task	[pdf]
2018	Gwinnup et al.	WMT'18	The AFRL-Ohio State WMT18 Multimodal System: Combining Visual with Traditional	[pdf]
2018	Helcl et al.	WMT'18	CUNI System for the WMT18 Multimodal Translation Task	[pdf]
2018	Lala et al.	WMT'18	Sheffield Submissions for WMT18 Multimodal Translation Shared Task	[pdf]
2018	Zheng et al.	WMT'18	Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Translation System Report	[pdf]
2018	Delbrouck et al.	WMT'18	UMONS Submission for WMT18 Multimodal Translation Task	[pdf] [github]
2018	Libovicky et al.	WMT'18	Input Combination Strategies for Multi-Source Transformer Decoder	[pdf]
2018	Shin et al.	WMT'18	Multi-encoder Transformer Network for Automatic Post-Editing	[pdf]
2018	Zhou et al.	ACL'18	A Visual Attention Grounding Neural Model for Multimodal Machine Translation	[pdf]
2018	Qian et al.	arXiv	Multimodal Machine Translation with Reinforcement Learning	[pdf]
2019	Caglayan et al.	NAACL-HLT'19	Probing the Need for Visual Context in Multimodal Machine Translation	[pdf]
2019	Su et al.	CVPR'19	Unsupervised Multi-modal Neural Machine Translation	[pdf]
2019	Ive et al.	ACL'19	Distilling Translations with Visual Awareness	[pdf] [github]
2019	Calixto et al.	ACL'19	Latent Variable Model for Multi-modal Translation	[pdf]
2019	Chen et al.	IJCAI'19	From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots	[pdf]
2019	Hirasawa et al.	ACL'19	Debiasing Word Embedding Improves Multimodal Machine Translation	[pdf]
2019	Mogadala et al.	arXiv	Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods	[pdf]
2019	Calixto et al.	Springer	An Error Analysis for Image-based Multi-modal Neural Machine Translation	[pdf]
2019	Hirasawa et al.	arXiv	Multimodal Machine Translation with Embedding Prediction	[pdf] [github]
2020.01	Park et al.	WACV'20	MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding	[pdf] [repo]

Datasets

Dataset	Authors	Paper	Links
Flickr30K	Young et al.	From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions	[pdf] [web]
Flickr30K Entities	Plummer et al.	Flickr30K Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models	[pdf] [web] [github]
Multi30K	Elliott et al.	Multi30K: Multilingual English-German Image Descriptions	[pdf] [github]
IAPR-TC12	Grubinger et al.	The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems	[pdf] [web]
VATEX	Wang et al.	VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research	[pdf] [web]

Metrics

Metric	Authors	Paper	Links
BLEU	Papineni et al.	BLEU: a Method for Automatic Evaluation of Machine Translation	[pdf]
METEOR	Banerjee et al.	METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments	[pdf] [web]
METEOR 1.5	Denkowski et al.	METEOR Universal: Language Specific Translation Evaluation for Any Target Language	[pdf] [web]
TER	Snover et al.	A Study of Translation Edit Rate with Targeted Human Annotation	[pdf]

Tutorials

Year	Authors	Title	Links
2016	Elliott et al.	Multimodal Learning and Reasoning	[pdf]
2017	Lucia Specia	Multimodal Machine Translation	[pdf]
2018	Loic Barrault	Introduction to Multimodal Machine Translation	[pdf]
