This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

About dataset generation #238

Open
Heisenberg-Yin opened this issue Jan 11, 2023 · 0 comments

Comments

@Heisenberg-Yin

I am new to the dense retrieval task, and I have a question about the dataset, in which each entry consists of a question, positives, hard negatives, and negatives.

I am not sure how the positives, hard negatives, and negatives are obtained. As I understand it, each query already comes with its positives, the hard negatives are retrieved with BM25, and the negatives are selected randomly.
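To make that guess concrete, here is a minimal sketch of the procedure as I picture it. It is only an illustration of my assumption, not the repository's actual pipeline. The names `bm25_scores` and `build_example`, the whitespace tokenizer, the `k1`/`b` values, and the output keys are all hypothetical.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def bm25_scores(query, passages, k1=1.5, b=0.75):
    """Score each passage against the query with a plain Okapi BM25."""
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avg_len = (sum(len(d) for d in docs) / n) or 1.0
    # Document frequency: in how many passages each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative (Lucene-style variant).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_example(question, positive_ids, passages, n_hard=1, n_random=1, seed=0):
    """Assemble one entry: given positives, BM25 hard negatives, random negatives."""
    rng = random.Random(seed)
    scores = bm25_scores(question, passages)
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    positives = set(positive_ids)
    # Hard negatives: top-scoring passages that are not labeled positive.
    hard = [i for i in ranked if i not in positives and scores[i] > 0][:n_hard]
    # Plain negatives: random draw from everything that is left.
    pool = [i for i in range(len(passages)) if i not in positives and i not in hard]
    rand = rng.sample(pool, min(n_random, len(pool)))
    return {
        "question": question,
        "positives": [passages[i] for i in positive_ids],
        "hard_negatives": [passages[i] for i in hard],
        "negatives": [passages[i] for i in rand],
    }


passages = [
    "Paris is the capital of France",
    "The capital of Germany is Berlin",
    "Bananas are rich in potassium",
    "France is famous for its cheese and wine",
]
example = build_example("what is the capital of France", [0], passages)
print(example)
```

In this toy corpus the passage about Berlin shares the most query terms without being a positive, so it is the one picked as the hard negative, while the plain negative is drawn at random from the remaining passages.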

Am I right or not?

Best Wishes.
