
# ALBERT.jl

A Julia implementation of ALBERT (A Lite BERT for Self-Supervised Learning of Language Representations), built on top of Transformers.jl.

## Key Ideas in ALBERT

  1. SOP (sentence-order prediction) loss: the original BERT creates is-not-next (negative) sentence pairs by random sampling from the corpus; ALBERT instead builds negative examples from the same two consecutive segments with their order swapped.
  2. Cross-layer parameter sharing: ALBERT shares the attention and FFN (feed-forward network) parameters across layers to reduce the total parameter count.
  3. Factorized embedding parameterization: ALBERT splits the embedding matrix (V×D) into two smaller matrices, V×E and E×D.
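Points 2 and 3 above can be sketched in Julia. The sizes are the published ALBERT-base values, but the `SharedEncoder` type is an illustrative assumption, not the actual ALBERT.jl API:

```julia
# Factorized embedding: ALBERT-base sizes (V = vocab, D = hidden, E = embedding).
V, D, E = 30_000, 768, 128
bert_style   = V * D          # 23_040_000 parameters for a full V×D embedding
albert_style = V * E + E * D  #  3_938_304 parameters for V×E plus E×D

# Cross-layer parameter sharing: one block's weights reused at every depth.
struct SharedEncoder
    layer        # a single transformer block (attention + FFN)
    nlayers::Int
end

function (enc::SharedEncoder)(x)
    for _ in 1:enc.nlayers
        x = enc.layer(x)  # identical parameters applied at each layer
    end
    return x
end
```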

## Pre-trained BSON

Pre-trained TensorFlow checkpoint files released by google-research, converted to the Julia pre-trained model format (i.e. BSON):

Version-1 of ALBERT models

Version-2 of ALBERT models
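Once converted, a checkpoint can be loaded with the BSON.jl package. The filename below is a hypothetical example, not a released artifact name:

```julia
using BSON

# Hypothetical path to a converted checkpoint produced by tfckpt2bsonforalbert.jl.
weights = BSON.load("albert_base_v2.bson")  # returns a Dict of the saved objects
```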

## File Structure

src/albert.jl - Wrapper for the ALBERT transformer, implemented on top of Transformers.jl.

src/alberttokenizer.jl - ALBERT tokenizer, implemented on top of WordTokenizers.jl; it tokenizes text into words before wordpiece or sentencepiece segmentation.
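As a rough sketch of that pre-tokenization step, WordTokenizers.jl splits raw text into words ahead of any sub-word segmentation (the input sentence is just an example):

```julia
using WordTokenizers

# Split raw text into word-level tokens before wordpiece/sentencepiece.
tokens = tokenize("ALBERT shares parameters across layers.")
```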

src/model.jl - Model structure of the original ALBERT model released by google-research.

src/sentencepiece.jl - Currently contains the wordpiece model (taken directly from Transformers.jl); the plan is to replace it with a complete sentencepiece model.

tfckpt2bsonforalbert.jl - Converts a TensorFlow checkpoint file to a raw BSON file.

## Status

The code is still under development.

## Checklist

  • File to convert a TensorFlow checkpoint to BSON
  • Space tokenizer (planned to be updated as needed by sentencepiece)
  • SentencePiece (code can be found in WordTokenizers)
  • Wrapper for the ALBERT transformer
  • Model file containing the structure of ALBERT

Most functions are still under development and will be available soon.

A more polished implementation and documentation can be found here.

## Demo

The demo file contains a tutorial for pretraining.