It would be useful to me if I could use the model's own embedding to compute similarity smart tags. For my particular use case, semantic embeddings are not useful.
Proposal
config.json
```json
{
  "similarity": { "faiss_encoder": "model" }  // could be 'self-similar'?
}
```
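To make the proposal concrete, here is a hypothetical sketch of how such a config option could be dispatched. The function and the string values it returns are illustrative only, not Azimuth's actual API; only the `similarity.faiss_encoder` key comes from the proposal above (the `//` comment is dropped so the JSON parses).

```python
import json

# Hypothetical config carrying the proposed option.
config = json.loads('{"similarity": {"faiss_encoder": "model"}}')

def choose_encoder(config):
    """Decide which encoder family to use for the FAISS index (illustrative names)."""
    encoder = config.get("similarity", {}).get("faiss_encoder", "sentence-transformer")
    if encoder == "model":
        # Proposed behavior: reuse the task model's own base encoder.
        return "use the task model's own base encoder"
    # Current behavior: a dedicated sentence embedding model.
    return "use a dedicated sentence embedding model"

print(choose_encoder(config))
```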
One can get the embedding of their Hugging Face model with:

```python
inputs = tokenizer(...)
model: BertForSequenceClassification = ...
embedding = model.base_model(**inputs).last_hidden_state[:, 0, :]  # take the first ([CLS]) token embedding
```
An alternative approach is to load the same model with the feature-extraction task, but that might be more involved.
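For context, once such embeddings are available, the "similarity" part reduces to a vector similarity such as cosine similarity (which a FAISS index computes at scale). A minimal stdlib-only sketch, with hypothetical 3-d embeddings standing in for real model outputs:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings of two utterances.
emb_1 = [1.0, 2.0, 2.0]
emb_2 = [2.0, 4.0, 4.0]
print(cosine_similarity(emb_1, emb_2))  # 1.0: same direction, maximally similar
```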
I didn't realize that sentence-transformers can now load models directly from the Hub and convert them. The only caveat is that it uses mean pooling instead of the first token. Not a terrible issue AFAIK.
So if your model name is cardiffnlp/twitter-roberta-base-sentiment-latest, Sentence Transformers will extract the RoBERTa base when you do

```python
model = SentenceTransformer('cardiffnlp/twitter-roberta-base-sentiment-latest')
```
It also uses authentication, so it works with private repos as well.
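The pooling caveat mentioned above can be sketched in plain Python. All numbers here are hypothetical token embeddings; the point is that first-token ("CLS") pooling and mean pooling read different parts of the model output, and mean pooling must exclude padding via the attention mask:

```python
# Hypothetical token embeddings for a 4-token sequence (dim=3); the last
# token is padding, indicated by attention_mask = 0.
token_embeddings = [
    [1.0, 0.0, 2.0],   # first ([CLS]) token
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [9.0, 9.0, 9.0],   # padding: must be excluded from mean pooling
]
attention_mask = [1, 1, 1, 0]

# First-token ("CLS") pooling: just take the first row.
cls_embedding = token_embeddings[0]

# Mean pooling: average only the rows where attention_mask == 1.
n = sum(attention_mask)
mean_embedding = [
    sum(tok[d] for tok, m in zip(token_embeddings, attention_mask) if m) / n
    for d in range(3)
]
print(cls_embedding)   # [1.0, 0.0, 2.0]
print(mean_embedding)  # [2.0, 2.0, 1.0]
```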
EDIT: One can also do

```python
from transformers import pipeline

pipe = pipeline('feature-extraction', model='your_model', truncation=True)
```