
Hallucination rail not working properly #227

Closed
shima-khoshraftar opened this issue Dec 14, 2023 · 9 comments · Fixed by #311
Labels: question (Further information is requested)

Comments

@shima-khoshraftar

Hi, I am writing a RAG flow and want to add the hallucination rail to it, using the check_hallucination action that is provided as a sample in the repo. At the beginning of this action there are two calls:

bot_response = context.get("bot_message")
last_bot_prompt_string = context.get("_last_bot_prompt")

They both return empty, so the rest of the code cannot check for hallucination. I am wondering if these values are extracted properly. I was able to get the bot response by changing "bot_message" to "last_bot_message", but I am not sure how to get the prompt string.

Here is my RAG flow:

define flow llama
  user ask llama
  $contexts = execute retrieve(query=$last_user_message)
  $answer = execute rag(query=$last_user_message, contexts=$contexts)
  bot $answer
  $accurate = execute check_hallucination
  if not $accurate:
    bot remove last message
    bot inform answer unknown

Looking forward to your response, thanks very much.

@drazvan
Collaborator

drazvan commented Dec 19, 2023

Hi @shima-khoshraftar !

This is tricky. The current implementation for the hallucination self-check needs to know the exact prompt that was used to generate the original message. The _last_bot_prompt gets set behind the scenes when generating the bot message. Since you're generating the response "yourself" by executing the rag action, that variable will not be set. I'm not sure if there's an easy way to extract the exact prompt that was used, as you pointed out. Is the rag action a LangChain chain?

I'll need to investigate why you had to change bot_message to last_bot_message.

drazvan self-assigned this Dec 19, 2023
drazvan added the question (Further information is requested) label Dec 19, 2023
@shima-khoshraftar
Author

Hi @drazvan,

Thanks for your reply. My rag action is not a LangChain chain. Here is the code:

import openai

async def rag(query: str, contexts: list) -> str:
    print("> RAG Called")  # we'll add this so we can see when this is being used
    context_str = "\n".join(contexts)
    # place query and contexts into RAG prompt
    prompt = f"""You are a helpful assistant, below is a query from a user and
some relevant contexts. Answer the question given the information in those
contexts. If you cannot find the answer to the question, say "I don't know".

Contexts:
{context_str}

Query: {query}

Answer: """
    # generate answer
    res = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        temperature=0.0,
        max_tokens=100,
    )
    return res['choices'][0]['text']

I see, so maybe I should get the prompt from the rag action myself and pass it to the check_hallucination action. I will try that and update you.

I also have an issue with the check_facts action from the GitHub repo. Again, evidence = context.get("relevant_chunks", []) is empty, which I can fix by passing the relevant_chunks myself. However, even though the answer the LLM returns given the evidence is correct, the check_facts action says that the response does not entail the evidence. I can open a new issue for this if that's better. I really appreciate your response.

@drazvan
Collaborator

drazvan commented Dec 20, 2023

@shima-khoshraftar : yes, let's split the two. Try setting the prompt yourself for the hallucination rail and let me know how it goes.
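
Something along these lines should do it (just an untested sketch; build_rag_prompt and generate_answer are placeholders for whatever your rag action already does):

from nemoguardrails.actions.actions import ActionResult

async def rag(query: str, contexts: list) -> ActionResult:
    # build the prompt exactly as before, but keep a reference to it
    prompt = build_rag_prompt(query, contexts)
    answer = await generate_answer(prompt)
    return ActionResult(
        return_value=answer,
        # expose the prompt so the hallucination rail can reuse it
        context_updates={"_last_bot_prompt": prompt},
    )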

For the fact-checking one, what LLM are you using?

@shima-khoshraftar
Author

I set the prompt for the check_hallucination action myself by passing the prompt as a parameter, but it was still not able to tell whether the response is a hallucination or not. I am using Azure OpenAI, gpt-35-turbo. Here is an example of the output I get:

question:when this agreement is made?

RAG Called
bot_response: The Sublease Agreement was made on the 8th day of February, 2021.
last_bot_prompt_string: You are a helpful assistant, below is a query from a user and
some relevant contexts. Answer the question given the information in those
contexts. If you cannot find the answer to the question, say "I don't know". Contexts: 'Exhibit 10.1\n\n\n SUBLEASE AGREEMENT\n\n\n THIS SUBLEASE AGREEMENT (the "Sublease") is made as of this 8th day of February, 2021,.....
Query: when this agreement is made?
{'role': 'assistant', 'content': "I don't know the answer that."}

Thanks for your help.

@niels-garve
Contributor

Hi @drazvan and @shima-khoshraftar ,

I faced the same issue and would like to share my solution. I'm using a LangChain chain, but the framework is irrelevant to the solution. Here's the action I'm using:

from langchain_core.language_models.llms import BaseLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from nemoguardrails.actions.actions import ActionResult

output_parser = StrOutputParser()


async def rag(question: str, llm: BaseLLM) -> ActionResult:
    context_updates = {}
    prompt_template_variables = {
        "input": question,
        "context": "..."
    }

    # 💡 Store the context for NeMo-Guardrails
    context_updates["relevant_chunks"] = prompt_template_variables["context"]

    prompt_template = PromptTemplate.from_template("...")
    # 💡 Store prompt for NeMo-Guardrails
    context_updates["_last_bot_prompt"] = prompt_template.format(**prompt_template_variables)

    # Execute my own chain
    chain = prompt_template | llm | output_parser
    answer = await chain.ainvoke(prompt_template_variables)

    return ActionResult(
        return_value=answer,
        context_updates=context_updates
    )

The core of the solution is the return type ActionResult. Here, I return the answer generated using my own context and prompt, and alongside it I return context updates in which my context is assigned to relevant_chunks and my prompt to _last_bot_prompt.
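
For completeness, this is roughly how I wire the action up (the config path and the action name are just placeholders for my setup):

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # placeholder path to the guardrails config
rails = LLMRails(config)
# register the custom action so that `execute rag(...)` in the Colang flow resolves to it
rails.register_action(rag, name="rag")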

What do you think? I hope it helps!

Kind regards,
Niels

@drazvan
Collaborator

drazvan commented Jan 24, 2024

Thanks for sharing this @niels-garve ! This is actually a good solution. It would be great if you could turn that into an example configuration e.g. custom_rag. Setting the relevant_chunks and _last_bot_prompt like this has the advantage that both the fact-checking and hallucination detection rails would work. Let me know if you have a bit of time to create a PR. Thanks!

@niels-garve
Contributor

Thanks for your feedback @drazvan ! Sure, I'll open a PR, just give me 1 or max. 2 weeks.

@shima-khoshraftar
Author

@niels-garve Thanks for your solution. This way of returning the context is good; I was able to get it working similarly. But the main issue I have is with the check_hallucination action, which cannot detect whether an LLM's answer is a hallucination or not, even in cases where I can manually tell that it is not one. You can see an example of this in my previous post. I would really appreciate it if anyone could help me with that. Thanks.

@niels-garve
Contributor

Hi @drazvan, how are you?

So, here's my pull request: #311

@shima-khoshraftar , please enable verbose logging and try to find the part where it logs hallucination checks. It should say something like:

...
execute self_check_facts
# The result was 1.0
execute check_hallucination
# The result was False
...

I'm missing that part in your previous post.
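
If you're using the Python API, enabling verbose mode should look roughly like this (the config path is a placeholder; the CLI also accepts a --verbose flag, if I remember correctly):

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # placeholder path to your guardrails config
rails = LLMRails(config, verbose=True)      # verbose=True prints the detailed execution trace, including rail results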
