Added some more papers
shagunsodhani committed Mar 25, 2018
1 parent 8191a01 commit 7216d80
Showing 4 changed files with 106 additions and 1 deletion.
3 changes: 3 additions & 0 deletions README.md
@@ -10,9 +10,12 @@ I am trying a new initiative - a-paper-a-week. This repository will hold all tho
* [Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning](https://shagunsodhani.in/papers-I-read/Improving-Information-Extraction-by-Acquiring-External-Evidence-with-Reinforcement-Learning)
* [An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks](https://shagunsodhani.in/papers-I-read/An-Empirical-Investigation-of-Catastrophic-Forgetting-in-Gradient-Based-Neural-Networks)
* [Learning a SAT Solver from Single-Bit Supervision](https://shagunsodhani.in/papers-I-read/Learning-a-SAT-Solver-from-Single-Bit-Supervision)
* [Neural Relational Inference for Interacting Systems](https://shagunsodhani.in/papers-I-read/Neural-Relational-Inference-for-Interacting-Systems)
* [Stylistic Transfer in Natural Language Generation Systems Using Recurrent Neural Networks](https://shagunsodhani.in/papers-I-read/Stylistic-Transfer-in-Natural-Language-Generation-Systems-Using-Recurrent-Neural-Networks)
* [Get To The Point: Summarization with Pointer-Generator Networks](https://shagunsodhani.in/papers-I-read/Get-To-The-Point-Summarization-with-Pointer-Generator-Networks)
* [StarSpace - Embed All The Things!](https://shagunsodhani.in/papers-I-read/StarSpace-Embed-All-The-Things)
* [Emotional Chatting Machine - Emotional Conversation Generation with Internal and External Memory](https://shagunsodhani.in/papers-I-read/Emotional-Chatting-Machine-Emotional-Conversation-Generation-with-Internal-and-External-Memory)
* [Exploring Models and Data for Image Question Answering](https://shagunsodhani.in/papers-I-read/Exploring-Models-and-Data-for-Image-Question-Answering)
* [How transferable are features in deep neural networks](https://shagunsodhani.in/papers-I-read/How-transferable-are-features-in-deep-neural-networks)
* [Distilling the Knowledge in a Neural Network](https://shagunsodhani.in/papers-I-read/Distilling-the-Knowledge-in-a-Neural-Network)
* [Revisiting Semi-Supervised Learning with Graph Embeddings](https://shagunsodhani.in/papers-I-read/Revisiting-Semi-Supervised-Learning-with-Graph-Embeddings)
@@ -3,7 +3,7 @@ layout: post
title: Task-Oriented Query Reformulation with Reinforcement Learning
comments: True
excerpt: The paper introduces a query reformulation system that rewrites a query to maximise the number of "relevant" documents that are extracted from a given black box search engine.
- tags: ['2017', 'EMNLP 2017', Information Retrieval', AI, EMNLP, NLP, RL]
+ tags: ['2017', 'EMNLP 2017', 'Information Retrieval', AI, EMNLP, NLP, RL]
---

## Introduction
@@ -0,0 +1,59 @@
---
layout: post
title: Exploring Models and Data for Image Question Answering
comments: True
excerpt: Given an image, answer a given question about the image.
tags: ['2015', 'NIPS 2015', AI, CV, Dataset, NIPS, NLP, VQA]
---

## Introduction

* **Problem Statement**: Given an image, answer a given question about the image.

* [Link to the paper](https://arxiv.org/abs/1505.02074)

* **Assumptions**:
* The answer is assumed to be a single word, thereby bypassing the evaluation issues of multi-word generation tasks.

## VIS+LSTM Model

* Treat the input image as the first word in the question.
* Obtain the vector representation (skip-gram) for words in the question.
* Obtain the VGG Net embedding of the image and use a linear transformation (a dimensionality-reduction weight matrix) to match the dimension of the word embeddings.
* Keep the image embedding frozen during training and use an LSTM to combine the word vectors.
* The LSTM outputs are fed into a softmax layer which generates the answer (a minimal sketch of this pipeline follows).
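
A minimal PyTorch sketch of this pipeline, under assumed dimensions (4096-d VGG features, 300-d skip-gram embeddings, 512-d LSTM state); all names and sizes here are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

class VisLSTM(nn.Module):
    """Sketch of VIS+LSTM: the projected image embedding is fed to the
    LSTM as if it were the first word of the question."""

    def __init__(self, vocab_size, num_answers,
                 img_dim=4096, emb_dim=300, hid_dim=512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)  # init from skip-gram vectors
        self.img_proj = nn.Linear(img_dim, emb_dim)        # dimensionality-reduction matrix
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.classifier = nn.Linear(hid_dim, num_answers)  # softmax over one-word answers

    def forward(self, img_feat, question_ids):
        # img_feat: (B, img_dim) precomputed, frozen VGG features
        # question_ids: (B, T) word indices of the question
        img_tok = self.img_proj(img_feat).unsqueeze(1)     # (B, 1, emb_dim)
        words = self.word_emb(question_ids)                # (B, T, emb_dim)
        seq = torch.cat([img_tok, words], dim=1)           # image acts as the first word
        _, (h_n, _) = self.lstm(seq)
        return self.classifier(h_n[-1])                    # answer logits
```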

## Dataset

* DAtaset for QUestion Answering on Real-world images (DAQUAR)
* 1300 images and 7000 questions with 37 object classes.
* The downside is that even guesswork can yield good results.
* The paper proposes an algorithm for generating questions using the MS-COCO dataset.
* Perform preprocessing steps like breaking up long sentences and changing indefinite determiners to definite ones.
* *object* questions, *number* questions, *colour* questions and *location* questions can be generated by searching for nouns, numbers, colours and prepositions respectively (a toy sketch of such rules follows this list).
* The resulting dataset has ~120K questions across the above 4 semantic types.
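
To make the rule-based generation concrete, here is a toy sketch using NLTK's POS tagger; the patterns and question templates are my own simplification of the idea, not the paper's actual rules:

```python
import nltk  # assumes the 'punkt' and 'averaged_perceptron_tagger' data are downloaded

COLOURS = {"red", "blue", "green", "white", "black", "brown", "yellow", "grey"}

def generate_questions(caption):
    """Toy generation of *colour* and *number* questions from one caption."""
    tagged = nltk.pos_tag(nltk.word_tokenize(caption.lower()))
    qa_pairs = []
    for i, (word, tag) in enumerate(tagged[:-1]):
        next_word, next_tag = tagged[i + 1]
        if word in COLOURS and next_tag.startswith("NN"):
            qa_pairs.append((f"what is the colour of the {next_word}?", word))
        if tag == "CD" and next_tag.startswith("NN"):
            qa_pairs.append((f"how many {next_word} are there?", word))
    return qa_pairs

print(generate_questions("two apples sit on a white table"))
# roughly: [('how many apples are there?', 'two'),
#           ('what is the colour of the table?', 'white')]
```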

## Models

* VIS+LSTM - explained above.
* 2-VIS+BLSTM - Add the image features twice, at the beginning and at the end (using different linear transformations), plus use a bidirectional LSTM.
* IMG+BOW - Multinomial logistic regression on image features without dimensionality reduction + bag of words (averaging the word vectors); a sketch follows this list.
* FULL - Simple average of the above 3 models.
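
A sketch of the IMG+BOW baseline under the same assumed dimensions; it reduces to multinomial logistic regression on the concatenation of the raw VGG features and the averaged word vectors:

```python
import torch
import torch.nn as nn

class ImgBow(nn.Module):
    """IMG+BOW: multinomial logistic regression on the concatenation of
    raw image features and the averaged question word vectors."""

    def __init__(self, vocab_size, num_answers, img_dim=4096, emb_dim=300):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.classifier = nn.Linear(img_dim + emb_dim, num_answers)

    def forward(self, img_feat, question_ids):
        bow = self.word_emb(question_ids).mean(dim=1)      # bag-of-words average
        return self.classifier(torch.cat([img_feat, bow], dim=1))

# FULL would then average the softmax outputs of the individual models, e.g.:
# probs = (vis_lstm_probs + two_vis_blstm_probs + img_bow_probs) / 3
```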

### Baseline

* Includes models where the answer is guessed, where only image or question features are used, or where image features are used along with prior knowledge about the objects.
* Also includes a KNN model where the system finds the nearest (image, question) pairs in the training set (a rough sketch follows this list).
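
A rough sketch of such a nearest-neighbour baseline; the way the two distances are combined here is an assumption, not the paper's exact formulation:

```python
import numpy as np

def knn_baseline(img_feat, q_feat, train_img, train_q, train_answers):
    """Copy the answer of the closest training (image, question) pair,
    combining distances in the two feature spaces by a plain sum."""
    dists = (np.linalg.norm(train_img - img_feat, axis=1)
             + np.linalg.norm(train_q - q_feat, axis=1))
    return train_answers[int(np.argmin(dists))]
```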

### Metrics

* Accuracy
* Wu-Palmer similarity measure (see the sketch below)
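
The Wu-Palmer measure can be computed from WordNet; a minimal per-word sketch using NLTK, where the max-over-senses choice is my assumption about scoring single-word answers, not necessarily the paper's exact protocol:

```python
from nltk.corpus import wordnet as wn  # assumes the WordNet corpus is downloaded

def wup_score(predicted, ground_truth):
    """Wu-Palmer similarity between two single-word answers, maximised
    over all sense pairs; 0.0 when a word is missing from WordNet."""
    synsets_a, synsets_b = wn.synsets(predicted), wn.synsets(ground_truth)
    if not synsets_a or not synsets_b:
        return 0.0
    return max((a.wup_similarity(b) or 0.0)
               for a in synsets_a for b in synsets_b)

print(wup_score("cat", "dog"))  # high but below 1.0: related, not identical
```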

## Observations

* The VIS+LSTM model outperforms the baselines, while the FULL model benefits from averaging across all the models.
* Some useful information seems to be lost when downsizing the VGG vectors.
* Fine-tuning the word vectors helps with performance.
* Normalising the CNN hidden image features to zero mean and unit variance leads to faster training.
* The model does not perform well on tasks that involve reasoning about spatial relations between multiple objects or counting objects when multiple objects are present.
@@ -0,0 +1,43 @@
---
layout: post
title: Stylistic Transfer in Natural Language Generation Systems Using Recurrent Neural Networks
comments: True
excerpt: The paper explores the problem of style transfer in natural language generation.
tags: ['2016', 'ACL 2016', ACL, AI, NLG, NLP, Workshop]
---

## Introduction

* [This workshop paper](https://aclweb.org/anthology/W/W16/W16-6010.pdf) explores the problem of style transfer in natural language generation (NLG).
* One possible manifestation would be rewriting technical articles in an easy-to-understand manner.

## Challenges

* Identifying relevant stylistic cues and using them to control text generation in NLG systems.
* Absence of a large amount of training data.

## Pitch

* Using Recurrent Neural Networks (RNNs) to disentangle the style from semantic content.
* Autoencoder model with two components - one for learning style and another for learning content.
* This allows the "style" component to be replaced while keeping the "content" component the same, resulting in a style transfer.
* One way to think about this is: the encoder generates a 100-dimensional vector in which the first 50 entries correspond to the "style" component and the remaining to the "content" component.
* The proposal is that the loss function should be modified to include a cross-covariance term for ensuring disentanglement.
* I think one way of doing this is to have two loss functions:
* The **first loss** function ensures that the input sentence is decoded properly into the target sentence. This loss is computed for each sentence.
* The **second loss** is the cross-covariance term: it ensures that the first 50 ("style") entries of the encoded representations are decorrelated from the remaining ("content") entries. This loss operates at the batch level.
* The **total loss** is the weighted sum of these two losses (a sketch of the batch-level penalty follows this list).
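
A sketch of this batch-level penalty in PyTorch, assuming the illustrative 50/50 split from above; the squared-Frobenius form and the loss weight are my assumptions, not the paper's exact formulation:

```python
import torch

def cross_covariance_penalty(codes, style_dim=50):
    """Penalise cross-covariance between the 'style' and 'content' halves
    of a batch of encoder outputs. codes: (batch, 100)."""
    style, content = codes[:, :style_dim], codes[:, style_dim:]
    style = style - style.mean(dim=0, keepdim=True)        # centre each entry
    content = content - content.mean(dim=0, keepdim=True)
    cross_cov = style.t() @ content / codes.size(0)        # (style_dim, content_dim)
    return (cross_cov ** 2).sum()                          # squared Frobenius norm

def total_loss(reconstruction_loss, codes, weight=0.1):
    # weighted sum of the per-sentence loss and the batch-level penalty
    return reconstruction_loss + weight * cross_covariance_penalty(codes)
```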

## Possible Datasets

* [Complete works of Shakespeare](http://norvig.com/ngrams/shakespeare.txt)
* [Wikipedia Kaggle dataset](https://www.kaggle.com/c/wikichallenge/data)
* [Oxford Text Archive](https://ota.ox.ac.uk/)
* Twitter data

## Possible Metrics

* Soundness - whether the generated text is entailed by the input sentence.
* Coherence - freedom from grammatical errors, proper word usage, etc.
* Effectiveness - how effective the style transfer was.
* Since some of these metrics are subjective, human evaluators also need to be employed.
