
Fine-tuning HuggingFace Models

Overview

This project compares the performance of pre-trained and fine-tuned LLMs on BigPatent (loaded via the Python `datasets` package), a summarization dataset whose specialized patent vocabulary approximates new, unseen text. The models chosen for the project are smaller variants of their originals (e.g. distilbart-xsum-12-1), designed for more affordable tuning and lower latency.
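
As a quick orientation, below is a minimal sketch of loading BigPatent with the `datasets` package. The "a" config (one of the dataset's CPC sections) and the 1% training slice are illustrative choices, not necessarily the subsets used in the paper.

```python
from datasets import load_dataset

# Load a small slice of BigPatent; "a" is one of the CPC-section configs
# ("a"-"h", "y", or "all"). Newer `datasets` releases may require
# trust_remote_code=True for script-based datasets such as this one.
dataset = load_dataset("big_patent", "a", split="train[:1%]")

sample = dataset[0]
print(sample["description"][:300])  # full patent text (model input)
print(sample["abstract"])           # gold summary (generation target)
```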

The results show that fine-tuning significantly improves performance, and that smaller models can be advantageous for NLP practitioners who favor speed over quality. Read the full paper here.
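
For concreteness, here is a minimal fine-tuning sketch using the transformers Seq2SeqTrainer; the checkpoint, data slice, and hyperparameters are illustrative assumptions, and the paper's actual training setup may differ.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "sshleifer/distilbart-xsum-12-1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Tiny training slice for illustration only.
dataset = load_dataset("big_patent", "a", split="train[:1000]")

def preprocess(batch):
    # The patent description is the input; the abstract is the summary target.
    inputs = tokenizer(batch["description"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["abstract"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="distilbart-bigpatent",
        per_device_train_batch_size=2,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```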


Architecture

Baseline Models

  • Lead 3: the first three sentences of the source document are taken as the summary (see the sketch after this list)
  • BigBird: BigBird-Pegasus fine-tuned on BigPatent
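
A minimal sketch of the Lead-3 baseline, assuming NLTK's sentence tokenizer (the README does not specify how sentences were split):

```python
import nltk

nltk.download("punkt", quiet=True)  # on newer NLTK releases: "punkt_tab"

def lead_3(document: str) -> str:
    """Return the first three sentences of a document as its summary."""
    sentences = nltk.sent_tokenize(document)
    return " ".join(sentences[:3])

print(lead_3("First sentence. Second one. Third one. This one is dropped."))
```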

Variant Models

  • BART: distilbart-xsum-12-1
  • Pegasus: distill-pegasus-xsum-16-4
  • T5: t5-small
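
All three can be run through the transformers summarization pipeline, as sketched below; the `sshleifer/` checkpoints are the standard Hub uploads matching the short names above, and the generation settings are illustrative.

```python
from transformers import pipeline

# Swap in any of the three variants:
#   sshleifer/distilbart-xsum-12-1, sshleifer/distill-pegasus-xsum-16-4, t5-small
summarizer = pipeline("summarization", model="sshleifer/distilbart-xsum-12-1")

text = "A method and apparatus for cooling a turbine blade ..."  # e.g. a BigPatent description
result = summarizer(text, max_length=64, min_length=10, truncation=True)
print(result[0]["summary_text"])
```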
