
Hallucination rail not working properly #227

Closed
shima-khoshraftar opened this issue Dec 14, 2023 · 9 comments · Fixed by #311
Labels: question (Further information is requested)

Comments

@shima-khoshraftar

Hi, I am writing a RAG flow and want to add the hallucination rail to it, using the check_hallucination action that is provided as a sample in the repo. At the beginning of this action there are two calls:

bot_response = context.get("bot_message")
last_bot_prompt_string = context.get("_last_bot_prompt")

They both return empty, so the rest of the code cannot check for hallucination. I am wondering if these values are extracted properly. I was able to get the bot response by changing "bot_message" to "last_bot_message", but I am not sure how to get the prompt string.

Here is my RAG flow:

define flow llama
  user ask llama
  $contexts = execute retrieve(query=$last_user_message)
  $answer = execute rag(query=$last_user_message, contexts=$contexts)
  bot $answer
  $accurate = execute check_hallucination
  if not $accurate:
    bot remove last message
    bot inform answer unknown

Looking forward to your response, thanks very much.

@drazvan
Collaborator

drazvan commented Dec 19, 2023

Hi @shima-khoshraftar !

This is tricky. The current implementation for the hallucination self-check needs to know the exact prompt that was used to generate the original message. The _last_bot_prompt gets set behind the scenes when generating the bot message. Since you're generating the response "yourself" by executing the rag action, that variable will not be set. I'm not sure if there's an easy way to extract the exact prompt that was used, as you pointed out. Is the rag action a LangChain chain?

I'll need to investigate why you had to change bot_message to last_bot_message.

drazvan self-assigned this Dec 19, 2023
drazvan added the question (Further information is requested) label Dec 19, 2023
@shima-khoshraftar
Author

Hi @drazvan,

Thanks for your reply. My rag action is not a LangChain chain. Here is the code:

import openai

async def rag(query: str, contexts: list) -> str:
    print("> RAG Called")  # we'll add this so we can see when this is being used
    context_str = "\n".join(contexts)
    # place query and contexts into RAG prompt
    prompt = f"""You are a helpful assistant, below is a query from a user and
some relevant contexts. Answer the question given the information in those
contexts. If you cannot find the answer to the question, say "I don't know".

Contexts:
{context_str}

Query: {query}

Answer: """
    # generate answer
    res = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        temperature=0.0,
        max_tokens=100,
    )
    return res['choices'][0]['text']

I see, so maybe I should get the prompt from the rag action myself and pass it to the check_hallucination action. I will try that and update you.

I also have an issue with the check_facts action from the GitHub repo. Again, evidence = context.get("relevant_chunks", []) is empty, which I can fix by passing the relevant_chunks myself. However, even though the answer the LLM returns given the evidence is correct, the check_facts action says that the response does not entail the evidence. I can open a new issue for this if that's better. I really appreciate your response.

@drazvan
Collaborator

drazvan commented Dec 20, 2023

@shima-khoshraftar : yes, let's split the two. Try setting the prompt yourself for the hallucination rail and let me know how it goes.
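
Something along these lines should do it (just an untested sketch; build_rag_prompt and generate_answer are placeholders for whatever your rag action already does):

from nemoguardrails.actions.actions import ActionResult

async def rag(query: str, contexts: list) -> ActionResult:
    # build the prompt exactly as before, but keep a reference to it
    prompt = build_rag_prompt(query, contexts)
    answer = await generate_answer(prompt)
    return ActionResult(
        return_value=answer,
        # expose the prompt so the hallucination rail can reuse it
        context_updates={"_last_bot_prompt": prompt},
    )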

For the fact-checking one, what LLM are you using?

@shima-khoshraftar
Author

I set the prompt for the check_hallucination action myself by passing the prompt as a parameter, but it was still not able to tell whether the response is a hallucination or not. I am using Azure OpenAI, gpt-35-turbo. Here is an example of the output I get:

question:when this agreement is made?

RAG Called
bot_response: The Sublease Agreement was made on the 8th day of February, 2021.
last_bot_prompt_string: You are a helpful assistant, below is a query from a user and
some relevant contexts. Answer the question given the information in those
contexts. If you cannot find the answer to the question, say "I don't know". Contexts: 'Exhibit 10.1\n\n\n SUBLEASE AGREEMENT\n\n\n THIS SUBLEASE AGREEMENT (the "Sublease") is made as of this 8th day of February, 2021,.....
Query: when this agreement is made?
{'role': 'assistant', 'content': "I don't know the answer that."}

Thanks for your help.

@niels-garve
Contributor

Hi @drazvan and @shima-khoshraftar ,

I faced the same issue and would like to share my solution. I'm using a LangChain chain, but the framework is irrelevant to the solution. Here's the action I'm using:

from langchain_core.language_models.llms import BaseLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from nemoguardrails.actions.actions import ActionResult

output_parser = StrOutputParser()


async def rag(question: str, llm: BaseLLM) -> ActionResult:
    context_updates = {}
    prompt_template_variables = {
        "input": question,
        "context": "..."
    }

    # 💡 Store the context for NeMo-Guardrails
    context_updates["relevant_chunks"] = prompt_template_variables["context"]

    prompt_template = PromptTemplate.from_template("...")
    # 💡 Store prompt for NeMo-Guardrails
    context_updates["_last_bot_prompt"] = prompt_template.format(**prompt_template_variables)

    # Execute my own chain
    chain = prompt_template | llm | output_parser
    answer = await chain.ainvoke(prompt_template_variables)

    return ActionResult(
        return_value=answer,
        context_updates=context_updates
    )

The core of the solution is the return type ActionResult. Here, I return the answer generated using my own context and prompt, and alongside it I return context updates in which my context is assigned to relevant_chunks and my prompt to _last_bot_prompt.
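
For completeness, this is roughly how I wire the action up (the config path and the action name are just placeholders for my setup):

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # placeholder path to the guardrails config
rails = LLMRails(config)
# register the custom action so that `execute rag(...)` in the Colang flow resolves to it
rails.register_action(rag, name="rag")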

What do you think? I hope it helps!

Kind regards,
Niels

@drazvan
Collaborator

drazvan commented Jan 24, 2024

Thanks for sharing this @niels-garve ! This is actually a good solution. It would be great if you could turn that into an example configuration e.g. custom_rag. Setting the relevant_chunks and _last_bot_prompt like this has the advantage that both the fact-checking and hallucination detection rails would work. Let me know if you have a bit of time to create a PR. Thanks!

@niels-garve
Contributor

Thanks for your feedback @drazvan ! Sure, I'll open a PR, just give me 1 or max. 2 weeks.

@shima-khoshraftar
Author

@niels-garve Thanks for your solution. This way of returning the context is good; I was able to get it working similarly. But the main issue I have is with the check_hallucination action, which cannot detect whether an LLM's answer is a hallucination or not, even in cases where I can manually tell that it is not one. You can see an example of this in my previous post. I would really appreciate it if anyone could help me with that. Thanks.

@niels-garve
Contributor

Hi @drazvan, how are you?

So, here's my pull request: #311

@shima-khoshraftar , please enable verbose logging and try to find the part where it logs hallucination checks. It should say something like:

...
execute self_check_facts
# The result was 1.0
execute check_hallucination
# The result was False
...

I'm missing that part in your previous post.
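
If you're using the Python API, enabling verbose mode should look roughly like this (the config path is a placeholder; the CLI also accepts a --verbose flag, if I remember correctly):

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # placeholder path to your guardrails config
rails = LLMRails(config, verbose=True)      # verbose=True prints the detailed execution trace, including rail results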
