Feature request
I would like to request that BetterTransformer not be deprecated.
Motivation
I have come to rely heavily on BetterTransformer for accelerating RoBERTa and BERT models. In my experience, BetterTransformer-transformed RoBERTa and BERT models are much faster than the same models using SDPA alone (I previously used my own implementation, and I know SDPA has since been added to transformers's RoBERTa implementation, but its speed still does not compare to BetterTransformer). I believe BetterTransformer's fusing of layers into a single BetterTransformer encoder block is probably responsible for the added performance gains.
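As a rough illustration of that fusing (a minimal sketch; 'roberta-base' is just a stand-in checkpoint, and `to_bettertransformer()` requires the optimum package to be installed), the layer swap can be observed by inspecting the encoder modules before and after the transform:

```python
from transformers import RobertaModel

model = RobertaModel.from_pretrained('roberta-base')
print(type(model.encoder.layer[0]).__name__)  # the stock RobertaLayer

# to_bettertransformer() delegates to optimum and replaces each encoder
# layer with a fused BetterTransformer equivalent.
model = model.to_bettertransformer()
print(type(model.encoder.layer[0]).__name__)  # the fused replacement layer
```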
Removing BetterTransformer would be a net negative if those performance gains are not replicated in the native transformers implementation of RoBERTa and BERT.
To emphasise how important BetterTransformer is to me, if it were deprecated, I would create my own private library to preserve its features.
My preference, however, is for it to remain.
Your contribution
This request.
Here is an example I whipped up to illustrate just how valuable BetterTransformer is:
```python
import torch
from transformers import RobertaModel, RobertaTokenizerFast

# BEGIN CONFIG #
MODEL_NAME = 'umarbutler/emubert'
EXAMPLE_INPUT = """\
The Parliament shall, subject to this Constitution,\
have power to make laws for the peace, order, and good\
government of the Commonwealth with respect to:\
 (i) trade and commerce with other countries, and among\
 the States;\
 (ii) taxation; but so as not to discriminate between"""
# END CONFIG #

sdpa_model = RobertaModel.from_pretrained(MODEL_NAME, attn_implementation='sdpa').to(torch.bfloat16).to('cuda').eval()
bettertransformer_model = RobertaModel.from_pretrained(MODEL_NAME).to(torch.bfloat16).to_bettertransformer().to('cuda').eval()
tokenizer = RobertaTokenizerFast.from_pretrained(MODEL_NAME)
input_tokens = tokenizer(EXAMPLE_INPUT, return_tensors='pt').to('cuda')

with torch.inference_mode():
    # Do unbenched forward passes to control for potential caching effects.
    for _ in range(10):
        bettertransformer_model(**input_tokens)
        sdpa_model(**input_tokens)

    # Benchmark the models.
    %timeit bettertransformer_model(**input_tokens)
    %timeit sdpa_model(**input_tokens)
```
On my 4090, BetterTransformer achieves 1.93 ms ± 104 μs and SDPA achieves 3.64 ms ± 259 μs. BetterTransformer is almost 2x faster (1.88x)...
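For anyone reproducing this outside IPython (where `%timeit` is unavailable), a rough equivalent under the same assumptions, reusing the `bettertransformer_model`, `sdpa_model` and `input_tokens` defined above, is to time the forward passes manually with explicit CUDA synchronisation so that asynchronous kernel launches are actually measured:

```python
import time
import torch

def bench(model, inputs, iters: int = 100) -> float:
    """Return the mean forward-pass latency in milliseconds."""
    with torch.inference_mode():
        for _ in range(10):  # warm-up passes, as in the snippet above
            model(**inputs)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(**inputs)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1000

print(f'BetterTransformer: {bench(bettertransformer_model, input_tokens):.2f} ms')
print(f'SDPA: {bench(sdpa_model, input_tokens):.2f} ms')
```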
umarbutler changed the title from "Please do not deprecate BetterTransformer" to "Please don't kill BetterTransformer — 1.88x faster inference than SDPA" on Oct 29, 2024