
Not hardcoded embedding models #99

Closed · jamescalam opened this issue Aug 16, 2023 · 8 comments
Labels: enhancement (New feature or request)

@jamescalam (Contributor)

Hi, I'd like to propose a fix to allow us to set the embedding model used to create the vector representations of rails. My reason is primarily that I'd like to be able to use a service like OpenAI for ada-002, as an easy fix for the large container sizes when deploying anything containing guardrails code, and additionally to allow us to select different Hugging Face models if preferred.

In an initial version, I'd propose supporting OpenAI and Hugging Face models, but I'm very open to suggestions, and I'd also be happy to work on this. Would there be any preference on how to set an embedding model? I figure we could do something in the config.yaml like:

models:
  - type: main
    engine: openai
    model: text-davinci-003
  - type: embedding
    engine: openai
    model: text-embedding-ada-002

Or, with Hugging Face:

models:
  - type: main
    engine: openai
    model: text-davinci-003
  - type: embedding
    engine: huggingface
    model: sentence-transformers/all-MiniLM-L6-v2
@drazvan (Collaborator) commented Aug 16, 2023

Hi @jamescalam! Your suggestion works.

There is actually minimal support for changing the embedding model, but only to a different SentenceTransformer one:
https://github.com/NVIDIA/NeMo-Guardrails/blob/main/nemoguardrails/actions/llm/generation.py#L73

See #97 as well.

models:
  ...
  - type: embedding
    engine: SentenceTransformer
    model: all-MiniLM-L6-v2

We have work in progress on a private branch to enable the integration of any embedding and search provider. A few things are changing, including the interface for EmbeddingsIndex, which needs to become async. I reckon we should be able to push this to the repo next week. If you can, you could help by adding support for OpenAI as an EmbeddingSearchProvider (a rough sketch of the idea follows below).

Thanks a lot!
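For illustration, a minimal sketch of what an async OpenAI embedding provider along these lines might look like. The class name, the encode method, and how it would plug into EmbeddingsIndex are assumptions here, not the final API from the private branch:

# Illustrative sketch only -- not the final interface.
from typing import List

import openai  # assumes the pre-1.0 openai SDK and OPENAI_API_KEY in the environment


class OpenAIEmbeddingModel:
    """Computes embeddings remotely via the OpenAI API instead of a local model."""

    def __init__(self, model: str = "text-embedding-ada-002"):
        self.model = model

    async def encode(self, documents: List[str]) -> List[List[float]]:
        # One batched request per list of documents.
        response = await openai.Embedding.acreate(model=self.model, input=documents)
        return [item["embedding"] for item in response["data"]]

Using a remote service this way also addresses the original concern: the container no longer has to ship a local SentenceTransformer model.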

@jamescalam (Contributor, Author)

@drazvan created a PR here: #101

@dom-vaz commented Sep 22, 2023

Hi there,

I think this approach will work in a lot of cases, but in a more complicated enterprise environment it may not be the best solution.

We can already pass a LangChain LLM as the main model to LLMRails. Why not add a second variable to pass a LangChain embedding model (see the sketch below)? This would open up a lot of new options without having to implement all of them. OpenAI support is already good, but in an enterprise setup it will most likely be Azure OpenAI anyway...

Kind regards,
Dominik
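For illustration, a minimal sketch of what this suggestion would look like in use. The embeddings keyword argument below is hypothetical (only llm can be passed today), and OpenAIEmbeddings stands in for any LangChain embedding model, including an Azure OpenAI-backed one:

# Sketch of the proposal -- the `embeddings` parameter does NOT exist yet.
from langchain.embeddings import OpenAIEmbeddings

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
embeddings = OpenAIEmbeddings()  # or any other LangChain Embeddings instance

# A LangChain LLM can already be passed as `llm=`; the idea is a sibling
# `embeddings=` argument:
rails = LLMRails(config, embeddings=embeddings)

The appeal of this design is that every provider LangChain already supports (Azure OpenAI, Cohere, local models, ...) would work without guardrails having to implement each one.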

@jamescalam (Contributor, Author)

This issue was solved by #101.

@krannnn commented Nov 23, 2023

> This issue was solved by #101.

@jamescalam thanks for the contribution. Could you please elaborate, or provide the configuration or code snippet, showing how this fix supports Azure OpenAI for embeddings?

Cheers!

@NirnayK commented Jun 7, 2024

So, has support for custom embedding models been added? I have been searching the repo for an example but couldn't find anything.

@drazvan (Collaborator) commented Jun 7, 2024

It will be added by #548 and be part of the 0.10.0 release.

@NirnayK commented Jun 7, 2024

> It will be added by #548 and be part of the 0.10.0 release.

Let me change my question. Let's say I have my own Qdrant DB that already holds embeddings and their corresponding chunks, where the vectors were generated with e5-mistral-7b-instruct (4096 dimensions). How can I use this with guardrails? Is it possible to configure a custom Embedding Search Provider such that, when the async search function is called, it calls my embedding model and uses the returned embedding to search my Qdrant DB, returning a list of IndexItem? (See the sketch below.)
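For illustration, a custom provider along those lines might look roughly like this, assuming the EmbeddingsIndex and IndexItem interface from nemoguardrails.embeddings.index; the method signatures, the embed_fn hook, and the Qdrant payload layout are all assumptions:

# Rough sketch only: base-class method names and payload fields are assumptions.
from typing import List

from qdrant_client import QdrantClient

from nemoguardrails.embeddings.index import EmbeddingsIndex, IndexItem


class QdrantSearchProvider(EmbeddingsIndex):
    """Searches an existing Qdrant collection built with e5-mistral-7b-instruct."""

    def __init__(self, collection: str, embed_fn):
        self.client = QdrantClient(url="http://localhost:6333")
        self.collection = collection
        # Async callable mapping a list of texts to 4096-dim vectors.
        self.embed_fn = embed_fn

    async def add_items(self, items: List[IndexItem]):
        # The collection is already populated out of band, so nothing to add.
        pass

    async def search(self, text: str, max_results: int) -> List[IndexItem]:
        # Embed the query with the same model that built the collection.
        query_vector = (await self.embed_fn([text]))[0]
        hits = self.client.search(
            collection_name=self.collection,
            query_vector=query_vector,
            limit=max_results,
        )
        # Assumes each point's payload stores the chunk under a "text" key.
        return [IndexItem(text=hit.payload["text"], meta=hit.payload) for hit in hits]

The key property here is that the index never embeds stored items itself: the 4096-dim vectors already live in Qdrant, so only the query needs to be embedded at search time.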
