Add RAG demo to the app #4
Conversation
Force-pushed from ead1ef4 to d7cb0ba
src/chat.py
Outdated
```python
with C level executives in a professional setting."""},]
self.embeddings = SentenceTransformerEmbeddings(model_name="BAAI/bge-base-en-v1.5")
```
Is this going to download a model for embedding unless it already exists in the cache?
Can we have our embedding model be part of the image in such a way that it's not coming from the internet every time we instantiate a Chat object? Like read it from a local file instead that we downloaded ahead of time?
Yes, certainly. It's going to add ~500 MiB to the image.
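For reference, a minimal sketch of how that could work, assuming the Containerfile pre-populates a local `models/` directory at build time (the build command shown in the comments is an assumption, not the recipe's actual Containerfile):

```python
# Hypothetical sketch: bake the embedding model into the image at build time,
# then load it from the local cache at runtime with no network access.
#
# At build time, e.g. in a RUN step of the Containerfile (assumed, not the
# actual recipe), populate the cache:
#   python -c "from sentence_transformers import SentenceTransformer; \
#              SentenceTransformer('BAAI/bge-base-en-v1.5', cache_folder='models/')"

from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings

# At runtime, cache_folder points at the pre-populated directory; since the
# model files already exist there, nothing is fetched from the internet.
embeddings = SentenceTransformerEmbeddings(
    model_name="BAAI/bge-base-en-v1.5",
    cache_folder="models/",
)
```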
data/chroma.sqlite3
Outdated
This file has been pre-vectorized, right? What would the code look like if I had a new set of docs I wanted to use instead?
Force-pushed from 10ed08a to fa202f5
Playing with it 👀
Tested on my Mac with applehv and it worked great. I can see the difference when using or not using the RAG.
```python
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Load the raw text and split it into sentence-sized chunks.
raw_documents = TextLoader("data/fake_meeting.txt").load()
text_splitter = CharacterTextSplitter(separator=".", chunk_size=150, chunk_overlap=0)
docs = text_splitter.split_documents(raw_documents)

# Embed the chunks and persist them to a local Chroma store.
e = SentenceTransformerEmbeddings(model_name="BAAI/bge-base-en-v1.5", cache_folder="models/")
db = Chroma.from_documents(docs, e, persist_directory="./data/chromaDB")
```
I copy/pasted this script to run the sample, but can we save it somewhere? And maybe have it accept the document URL to load?
Yeah, sure. I can work on a Python script that can just do this for a user, given the document's location. Let's do this in a follow-up PR though, if that is ok.
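A rough sketch of what that follow-up script could look like (the argument names and defaults are assumptions, not the actual follow-up):

```python
# Hypothetical sketch: vectorize an arbitrary document path passed on the
# command line, instead of the hard-coded data/fake_meeting.txt.
import argparse

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

parser = argparse.ArgumentParser(description="Populate the vector DB from a document")
parser.add_argument("document", help="path to the text file to vectorize")
parser.add_argument("--persist-dir", default="./data/chromaDB")
args = parser.parse_args()

# Same pipeline as the sample script, parameterized on the user's document.
raw_documents = TextLoader(args.document).load()
splitter = CharacterTextSplitter(separator=".", chunk_size=150, chunk_overlap=0)
docs = splitter.split_documents(raw_documents)
e = SentenceTransformerEmbeddings(model_name="BAAI/bge-base-en-v1.5", cache_folder="models/")
Chroma.from_documents(docs, e, persist_directory=args.persist_dir)
```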
Sure, ok for me 👍
```python
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

raw_documents = TextLoader("data/fake_meeting.txt").load()
```
It accepts any document, right? Any limitations (file size, format, ...)?
I think this will just read in a text file (of arbitrary size), but I have not fully tested it out. However, the package it comes from, langchain, has many different document loaders if we want to use a different sort of data.
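A couple of those loaders, as a non-authoritative sketch (the file name and URL are made up; `PyPDFLoader` needs the `pypdf` package and `WebBaseLoader` needs `beautifulsoup4`):

```python
from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader

# Same .load() interface as TextLoader, different source formats.
pdf_docs = PyPDFLoader("data/some_report.pdf").load()
web_docs = WebBaseLoader("https://example.com/meeting-notes").load()
```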
Ok, cool 👍 Thanks
### Download the embedding model

To encode our additional data and populate our vector database, we need an embedding model (a second language model) for this workflow. Here we will use `BAAI/bge-large-en-v1.5`; all the necessary model files can be found and downloaded from https://huggingface.co/BAAI/bge-large-en-v1.5.
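One way to script that download, as a sketch (assuming the `huggingface_hub` package is available; the target directory is an assumption):

```python
from huggingface_hub import snapshot_download

# Fetch all files from the model repo into a local directory ahead of time.
snapshot_download(repo_id="BAAI/bge-large-en-v1.5", local_dir="models/bge-large-en-v1.5")
```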
So will bge always be the one to use, or could there be different models? Does it depend on the primary model type we use? E.g., this only works with llama2 and mistral?
No, there could be different models. But I wanted to keep this simple for now and not provide the additional option of choosing an embedding model as well. But maybe it's better to make it another user option now?
It does not depend on the primary model. It only matters that whatever embedding model was used to create your vector database is the same one used at runtime with the vector database.
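To illustrate that constraint, a minimal sketch of reopening the persisted store (paths match the script above; the query string is made up):

```python
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Must be the SAME model that built the store, or the query vectors will not
# live in the same embedding space as the stored document vectors.
e = SentenceTransformerEmbeddings(model_name="BAAI/bge-base-en-v1.5")
db = Chroma(persist_directory="./data/chromaDB", embedding_function=e)
results = db.similarity_search("What did we decide in the meeting?")
```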
From a desktop perspective, I thought that when users want to build/play with a RAG sample, we make them pick a model, a RAG, and then proceed with the usual tasks (download, build, run, ...).
Yeah, there would just be an additional step, so it would be like: pick a model, select the RAG recipe (RAG requires an embedding model), select an embedding model, then proceed as usual.
```yaml
- name: rag-demo-service
  contextdir: model_services
  containerfile: base/Containerfile
  model-service: true
  backend:
    - llama
  arch:
    - arm64
    - amd64
```
I think we should add a property so we know that it supports RAG, and maybe the type, if there could be alternatives (e.g. this works with BAAI/bge-base-en-v1.5).
Sure, what would that look like?

```yaml
backend:
  - llama
rag:
  - True
```

What would that be used for? Right now this example uses an in-memory vector database in the same container. However, I would like to extend this example (in a follow-up PR) so that the vector database is a standalone container. How would that impact the ai-studio.yaml file?
In that case we should add a completely new container to the list. We can do it later when you push the follow-up PR.
An example interaction
To do