We examine different approaches to improving the performance of Bidirectional Encoder Representations from Transformers (BERT) on three downstream tasks: Sentiment Analysis, Paraphrase Detection, and Semantic Textual Similarity (STS). Throughout our experimentation we leveraged a variety of fine-tuning strategies and advanced techniques, including Projected Attention Layers (PALs), multi-GPU training, unsupervised SimCSE (Simple Contrastive Learning of Sentence Embeddings), additional relational layers, hyperparameter tuning, and fine-tuning on additional datasets. We found that the combination of PALs, unsupervised SimCSE, and additional relational layers yielded the largest improvements in accuracy.
You can download the report here
You can watch the presentation here
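For context, the unsupervised SimCSE objective mentioned above treats two dropout-perturbed encodings of the same sentence as a positive pair and the other sentences in the batch as negatives. The sketch below is a minimal illustration of that objective only; the `encoder` callable, its signature, and the temperature value are illustrative assumptions rather than this repository's actual API.

```python
# Minimal sketch of the unsupervised SimCSE objective (dropout-based positives,
# in-batch negatives). `encoder` is assumed to map tokenized inputs to one
# fixed-size embedding per sentence (e.g. BERT's [CLS] vector); it is a
# placeholder, not this project's actual implementation.
import torch
import torch.nn.functional as F

def simcse_unsup_loss(encoder, input_ids, attention_mask, temperature=0.05):
    # Two forward passes over the same batch: dropout is active in train mode,
    # so the two views of each sentence differ slightly.
    z1 = encoder(input_ids, attention_mask)  # (batch, hidden)
    z2 = encoder(input_ids, attention_mask)  # (batch, hidden)

    # Pairwise cosine similarities between the two sets of views, scaled by the
    # temperature: entry (i, j) compares sentence i's first view with
    # sentence j's second view.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature

    # Each sentence's positive is its own second view, i.e. the diagonal.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```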
- Set up a virtual environment: `conda create -n cs224n_dfp python`
- Activate the virtual environment: `conda activate cs224n_dfp`
- Install the requirements: `pip install -r requirements.txt`
- Unzip `data.zip`, which contains the data sources used for fine-tuning and evaluation
- Download the BERT Base model weights from BERT's official repository: Repo Link || File Link
- Unzip the contents of the zip file into the `uncased_L-12_H-768_A-12` folder
- Convert the TensorFlow checkpoint to a PyTorch `pytorch_model.bin` with the command below:
  ```
  transformers-cli convert --model_type bert \
    --tf_checkpoint uncased_L-12_H-768_A-12/bert_model.ckpt \
    --config uncased_L-12_H-768_A-12/bert_config.json \
    --pytorch_dump_output uncased_L-12_H-768_A-12/pytorch_model.bin
  ```
- Fine-tune the BERT model: `python src/multitask_classifier.py --fine-tune-mode full-model --lr 1e-5`
- Running the training on a multi-GPU cluster is recommended so that it finishes faster (see the sketch below)
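As a rough sketch of single-node multi-GPU training, the model can be wrapped in `torch.nn.DataParallel` so that each batch is split across the visible GPUs. The placeholder model and batch below stand in for the multitask BERT classifier and its inputs; this is one possible approach, not necessarily the exact mechanism used by `src/multitask_classifier.py`.

```python
# Minimal DataParallel sketch; the Linear layer and random batch are placeholders
# for the multitask BERT model and its sentence embeddings.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(768, 5).to(device)  # placeholder model

if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and splits each batch across them.
    model = nn.DataParallel(model)

batch = torch.randn(32, 768, device=device)  # placeholder batch
logits = model(batch)  # forward pass is sharded across the GPUs
```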