[Elasticsearch] BM25 retrieval is too restrictive #124

Closed
anakin87 opened this issue Dec 19, 2023 · 0 comments · Fixed by #125
Labels
bug Something isn't working

To Reproduce

from haystack import Document
from elasticsearch_haystack.bm25_retriever import ElasticsearchBM25Retriever
from elasticsearch_haystack.document_store import ElasticsearchDocumentStore

document_store = ElasticsearchDocumentStore(hosts="http://localhost:9200/")

documents = [
    Document(content="There are over 7,000 languages spoken around the world today."),
    Document(content="Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors."),
    Document(content="In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves."),
]
# Index the documents so the retriever has something to search.
document_store.write_documents(documents)

retriever = ElasticsearchBM25Retriever(document_store=document_store)
print(retriever.run(query="How much self awareness do elephants have?"))
# {'documents': []}

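This should return the second Document, but it does not, because the BM25 query combines the query terms with the AND operator. A document then matches only if it contains every query term, including words such as "how" and "much" that the second Document lacks.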

For comparison, see the same query in Haystack 1.x:
https://github.com/deepset-ai/haystack/blob/c812250453ab7da35f526a5f2a53e18c058fe2ff/haystack/document_stores/search_engine.py#L1100
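
To illustrate why AND is too strict here, below is a minimal sketch that needs no Elasticsearch instance. It is not the retriever's code: `tokenize` and `match` are hypothetical helpers, the tokenizer only roughly approximates a default analyzer (lowercase, split on non-alphanumerics, no stop-word removal), and no BM25 scoring is done. It only shows which of the three Documents pass the term filter under each operator.

```python
import re

def tokenize(text):
    # Rough stand-in for a default text analyzer (assumption):
    # lowercase and split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match(query, docs, operator="OR"):
    # AND: every query term must occur in the document.
    # OR: at least one query term must occur in the document.
    query_terms = tokenize(query)
    combine = all if operator == "AND" else any
    return [d for d in docs if combine(t in tokenize(d) for t in query_terms)]

docs = [
    "There are over 7,000 languages spoken around the world today.",
    "Elephants have been observed to behave in a way that indicates a high level of self-awareness, such as recognizing themselves in mirrors.",
    "In certain parts of the world, like the Maldives, Puerto Rico, and San Diego, you can witness the phenomenon of bioluminescent waves.",
]
query = "How much self awareness do elephants have?"

and_hits = match(query, docs, operator="AND")
or_hits = match(query, docs, operator="OR")

print(len(and_hits))  # 0: no document contains "how", "much" and "do"
print(len(or_hits))   # 1: only the elephants document shares terms with the query
```

With OR, the elephants Document matches on "elephants", "have", "self" and "awareness", and BM25 scoring would then rank it; with AND it is filtered out before scoring.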
