NQ - File datasets/nq/qrels/train.tsv not present #179

Open
GiacoL opened this issue Jul 29, 2024 · 6 comments

Comments

@GiacoL

GiacoL commented Jul 29, 2024

I downloaded the NQ dataset and the tsv file for the train set appears to be missing.

@2020uce0047

Hi @GiacoL,
The train set ships in a separate zip file, "nq-train".
All available datasets are listed here: https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/
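
A minimal sketch of fetching the separate train archive and checking that the train qrels are there. It uses only the Python standard library. The archive name `nq-train.zip`, the extracted folder name `nq-train/`, and the `qrels/train.tsv` layout are assumptions based on the comment above and the path in the issue title, not something confirmed in this thread. The helper names are hypothetical.

```python
import urllib.request
import zipfile
from pathlib import Path

# Base URL taken from the comment above.
BASE_URL = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/"


def dataset_url(name: str) -> str:
    """Build the archive URL, assuming archives are named <name>.zip."""
    return f"{BASE_URL}{name}.zip"


def find_missing_files(dataset_dir, expected=("corpus.jsonl", "queries.jsonl", "qrels/train.tsv")):
    """Return the expected relative paths that are not present under dataset_dir."""
    root = Path(dataset_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


def download_and_unzip(name: str, out_dir: str = "datasets") -> Path:
    """Download <name>.zip into out_dir (if not cached) and extract it.

    Assumes the archive unpacks into a folder called <name>.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    zip_path = out / f"{name}.zip"
    if not zip_path.exists():
        urllib.request.urlretrieve(dataset_url(name), zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out)
    return out / name
```

Calling `find_missing_files(download_and_unzip("nq-train"))` should return an empty list if the train archive has the assumed layout. Running `find_missing_files("datasets/nq")` on the regular `nq.zip` contents would list `qrels/train.tsv`, which matches the report in this issue.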

@Gerry-j

Gerry-j commented Aug 14, 2024

https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/ contains the NQ dataset, but I didn't find train.csv in nq.zip.

@Gerry-j

Gerry-j commented Aug 14, 2024

> I downloaded the NQ dataset and the tsv file for the train set appears to be missing.

Please, where did you finally download train.csv from?

@Gerry-j

Gerry-j commented Aug 15, 2024 via email

@orionw

orionw commented Sep 7, 2024

Hi all, thanks for this info. Are the corpora not the same between the two? I see 18,060,996 lines in the corpus from this link, but the BEIR NQ corpus for test has 2,681,468. Perhaps the train set has the unfiltered corpus while the test set has the filtered version.

The train qrels also seem to reference documents up to 18 million, so it appears one would have to index the train corpus separately to use them.
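
One way to check the mismatch described above is to compare the document IDs referenced by the train qrels against the IDs in a given corpus file. The sketch below assumes the usual BEIR file layout: a qrels TSV with a header row and the document ID in the second column, and a `corpus.jsonl` with one JSON object per line carrying an `_id` field. The function names are hypothetical.

```python
import csv
import json
from typing import Iterable, Set


def qrels_doc_ids(lines: Iterable[str]) -> Set[str]:
    """Collect document IDs from qrels lines (tab-separated, header row first)."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header row
    return {row[1] for row in reader if len(row) >= 2}


def corpus_doc_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the `_id` of every non-empty JSON line in a corpus file."""
    return {json.loads(line)["_id"] for line in lines if line.strip()}


def missing_from_corpus(qrels_lines: Iterable[str], corpus_lines: Iterable[str]) -> Set[str]:
    """Return document IDs that the qrels reference but the corpus lacks."""
    return qrels_doc_ids(qrels_lines) - corpus_doc_ids(corpus_lines)
```

Both arguments can be open file handles, for example `open("qrels/train.tsv", encoding="utf-8")` and `open("corpus.jsonl", encoding="utf-8")`. A non-empty result for the train qrels against the test corpus would support indexing the train corpus separately. Note that this only checks ID overlap: an ID present in both corpora is not guaranteed to refer to the same passage, and nothing in this thread settles that.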

4 participants