
# Large-scale Complex Question Answering

LC-QuAD v1.0 and v2.0 are large-scale question answering (QA) datasets of complex questions over knowledge graphs.

## Table of contents

- [LC-QuAD v1](#lc-quad-v1)
- [LC-QuAD v2](#lc-quad-v2)
- [LC-QuAD v2 + QALD-9](#lc-quad-v2--qald-9)
- [References](#references)

## LC-QuAD v1

The Large-Scale Complex Question Answering Dataset 1.0 (LC-QuAD 1.0) [1] is a question answering dataset of 5,000 pairs of questions and their corresponding SPARQL queries. The target knowledge base is DBpedia, specifically the April 2016 version. Please see the original paper for details about the dataset creation process and framework.

This dataset can be downloaded via the link.
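
Each entry pairs a natural-language question with an executable SPARQL query over DBpedia. A minimal loading sketch (the file name and the field names `corrected_question` and `sparql_query` are assumptions about the released JSON; adjust them to the actual download):

```python
import json

# Path and field names are assumptions about the released JSON;
# adjust them to match the downloaded file.
with open("train-data.json", encoding="utf-8") as f:
    dataset = json.load(f)

# Print a few question/SPARQL pairs.
for entry in dataset[:3]:
    print(entry["corrected_question"])
    print(entry["sparql_query"])
    print("-" * 60)
```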

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Accuracy | Language | Reported by |
|---|---|---|---|---|---|---|---|
| T5-Base | 2022 | - | - | 91 | - | EN | Banerjee et al. |
| T5-Small | 2022 | - | - | 90 | - | EN | Banerjee et al. |
| PGN-BERT-BERT | 2022 | - | - | 88 | - | EN | Banerjee et al. |
| mBERT | 2021 | 73 | - | 85.50 | - | EN | Zhou Y. et al. |
| ValueNet4SPARQL | 2023 | 86 | 84 | 85 | - | EN | Kosten et al. |
| SubQG | 2019 | - | - | 85 | - | EN | Banerjee et al. |
| BART | 2022 | - | - | 84 | - | EN | Banerjee et al. |
| Stage-I No Noise | 2022 | 83.11 | 83.04 | 83.08 | - | EN | Purkayastha et al. |
| mBERT | 2021 | - | - | 82.40 | - | DE | Zhou Y. et al. |
| LAMA | 2019 | - | - | 81.60 | - | EN | Radoev et al. |
| mBERT | 2021 | - | - | 80.90 | - | NL | Zhou Y. et al. |
| CompQA | 2018 | - | - | 77 | - | EN | Banerjee et al. |
| mBERT | 2021 | - | - | 76.10 | - | ES | Zhou Y. et al. |
| AQG-net | 2021 | 76 | 75 | 76 | - | EN | Liu et al. |
| HGNet | 2021 | 75.82 | 75.22 | 75.10 | - | EN | Chen et al. |
| SQG | 2018 | - | - | 75 | - | EN | Banerjee et al. |
| O-Ranking | 2021 | 75.54 | 74.95 | 74.81 | - | EN | Chen et al. |
| AQG-net | 2021 | - | - | 74.80 | - | EN | Chen et al. |
| mBERT | 2021 | - | - | 74.50 | - | RU | Zhou Y. et al. |
| mBERT | 2021 | - | - | 74 | - | PT | Zhou Y. et al. |
| mBERT | 2021 | - | - | 73.20 | - | FR | Zhou Y. et al. |
| mBERT | 2021 | - | - | 72.60 | - | RO | Zhou Y. et al. |
| mBERT | 2021 | - | - | 72.30 | - | IT | Zhou Y. et al. |
| DAM | 2021 | - | - | 72 | - | EN | Chen et al. |
| GSM | 2021 | 71 | 73 | 72 | - | EN | Liu et al. |
| mBERT | 2021 | - | - | 71.90 | - | HI_IN | Zhou Y. et al. |
| mBERT | 2021 | - | - | 71.70 | - | FA | Zhou Y. et al. |
| GGNN | 2022 | 66 | 78 | 71 | - | EN | Liu et al. |
| DAM | 2022 | 65 | 77 | 71 | - | EN | Liu et al. |
| Slot-Matching | 2021 | - | - | 71 | - | EN | Chen et al. |
| G. Maheshwari et al. Pairwise | 2019 | 66 | 77 | 71 | - | EN | G. Maheshwari et al. |
| G. Maheshwari et al. Pointwise | 2019 | 65 | 76 | 70 | - | EN | G. Maheshwari et al. |
| HR-BiLSTM | 2021 | - | - | 70 | - | EN | Chen et al. |
| S-Ranking | 2021 | 65.89 | 75.30 | 69.53 | - | EN | Chen et al. |
| STAGG | 2021 | - | - | 69 | - | EN | Chen et al. |
| QA Sparql | 2023 | 88 | 56 | 68 | - | EN | Kosten et al. |
| Liang et al. | 2021 | 88 | 56 | 68 | - | EN | Liang et al. |
| PGN-BERT | 2018 | - | - | 67 | - | EN | Banerjee et al. |
| STaG-QA_pre | 2021 | 74.50 | 54.80 | 53.60 | - | EN | Ravishankar et al. |
| KGQAn | 2023 | 58.07 | 47.12 | 52.03 | - | EN | Omar et al. |
| STaG-QA | 2021 | 76.50 | 52.80 | 51.40 | - | EN | Ravishankar et al. |
| sparql-qa | 2021 | 49.50 | 49.20 | 49.10 | - | EN | M. Borroto et al. |
| NLIWOD | 2018 | - | - | 48 | - | EN | Banerjee et al. |
| BART | 2021 | 48.01 | 49.19 | 47.62 | - | EN | Chen et al. |
| SYGMA | 2021 | 47 | 48 | 47 | - | EN | S. Neelam et al. |
| NHGG | 2021 | 46.93 | 48.36 | 46.12 | - | EN | Chen et al. |
| WDAqua-core1 | 2021 | 59 | 38 | 46 | - | EN | Liang et al. |
| NSQA | 2023 | 45 | 46 | 45 | - | EN | Kosten et al. |
| NSQA | 2021 | 44.80 | 45.80 | 44.40 | - | EN | Ravishankar et al. |
| Stage-I Part Noise | 2022 | 42.40 | 42.26 | 42.33 | - | EN | Purkayastha et al. |
| Stage-II w/ type | 2022 | 37.03 | 37.06 | 37.05 | - | EN | Purkayastha et al. |
| QASparql | 2021 | - | - | 34 | - | EN | Orogat et al. |
| DTQA | 2021 | 33.94 | 34.99 | 33.72 | - | EN | Abdelaziz et al. |
| QAmp | 2021 | 25 | 50 | 33.33 | - | EN | Purkayastha et al. |
| DTQA | 2021 | 33 | 34 | 33 | - | EN | Chen et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Steinmetz et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Chen et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Abdelaziz et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Ravishankar et al. |
| QAmp | 2021 | 25 | 50 | 33 | - | EN | Kapanipathi et al. |
| Stage-II w/o type | 2022 | 32.17 | 32.20 | 32.18 | - | EN | Purkayastha et al. |
| Ensemble BR framework | 2023 | 27.40 | 41.30 | 28.70 | - | EN | Chen et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Abdelaziz et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Purkayastha et al. |
| WDAqua-core1 | 2021 | 22 | 38 | 28 | - | EN | Steinmetz et al. |
| WDAqua-core0 | 2021 | 22 | 38 | 28 | - | EN | Ravishankar et al. |
| Stage-I Full Noise | 2022 | 25.54 | 25.64 | 25.59 | - | EN | Purkayastha et al. |
| SINA | 2015 | - | - | 24 | - | EN | Banerjee et al. |
| Frankenstein | 2021 | 20 | 21 | 20 | - | EN | Liang et al. |
| WDAqua-core0 | 2021 | - | - | 15 | - | EN | Orogat et al. |
| AskNow | 2021 | - | - | 11 | - | EN | Orogat et al. |
| Qanary (TM+DP+QB) | 2021 | - | - | 1 | - | EN | Orogat et al. |
| Entity Type Tags Modified | 2022 | - | - | - | 72 | EN | Lin and Lu |
| SPARQL Generator | 2022 | - | - | - | 71.27 | EN | Lin and Lu |
| Diomedi and Hogan | 2022 | - | - | - | 14 | EN | Lin and Lu |
| Yin et al. | 2022 | - | - | - | 8 | EN | Lin and Lu |
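
For rows that report both Precision and Recall, the F1 column is their harmonic mean. A quick sanity check in Python against two rows above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# QA Sparql (Kosten et al.): P = 88, R = 56
print(round(f1(88, 56), 1))        # 68.4, reported as 68
# Stage-I No Noise (Purkayastha et al.): P = 83.11, R = 83.04
print(round(f1(83.11, 83.04), 2))  # 83.07, reported as 83.08
                                   # (inputs are already rounded)
```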

## LC-QuAD v2

The Large-Scale Complex Question Answering Dataset 2.0 (LC-QuAD 2.0) [2] is a question answering dataset of 30,000 pairs of questions and their corresponding SPARQL queries. The target knowledge bases are Wikidata and DBpedia, specifically the 2018 versions. Please see the original paper for details about the dataset creation process and framework.

This dataset can be downloaded via the link.
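
Unlike v1, each v2 question targets both knowledge bases. An illustrative entry sketch (the field names `sparql_wikidata` and `sparql_dbpedia18`, and the question/queries themselves, are assumptions for illustration, not taken verbatim from the release):

```python
# Illustrative LC-QuAD 2.0-style entry; field names and content
# are assumptions, not copied from the released JSON.
entry = {
    "question": "Who is the spouse of Barack Obama?",
    # Wikidata: wd:Q76 = Barack Obama, wdt:P26 = spouse
    "sparql_wikidata": "SELECT ?ans WHERE { wd:Q76 wdt:P26 ?ans . }",
    # DBpedia (2018 snapshot)
    "sparql_dbpedia18": (
        "SELECT ?ans WHERE { dbr:Barack_Obama dbo:spouse ?ans . }"
    ),
}
print(entry["question"])
print(entry["sparql_wikidata"])
```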

### Leaderboard for systems that require gold entities and/or relations as input

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| T5-Small | 2022 | - | - | 92 | EN | Banerjee et al. |
| T5-Base | 2022 | - | - | 91 | EN | Banerjee et al. |
| SGPT_Q,K [1] | 2022 | - | - | 89.04 | EN | Al Hasan Rony et al. |
| PGN-BERT-BERT | 2022 | - | - | 86 | EN | Banerjee et al. |
| PGN-BERT | 2022 | - | - | 77 | EN | Banerjee et al. |
| NSpM [2] | 2022 | - | - | 66.47 | EN | Al Hasan Rony et al. |
| BART | 2022 | - | - | 64 | EN | Banerjee et al. |
| Zou et al. + Bert | 2021 | - | - | 59.30 | EN | Zou et al. |
| CLC | 2021 | - | - | 59 | EN | Banerjee et al. |
| Multi-hop QGG | 2020 | - | - | 53 | EN | Banerjee et al. |
| Zou et al. + Tencent Word | 2021 | - | - | 52.90 | EN | Zou et al. |
| Multi-hop QGG | 2021 | - | - | 52.60 | EN | Zou et al. |
| AQG-net | 2021 | - | - | 44.90 | EN | Zou et al. |

- [1] [2] Token-wise matching of the query string is performed; answers are not fetched from the KG.
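
Per the note above, these two rows are scored at the string level rather than by executing the queries. A sketch of one plausible token-level F1 over SPARQL strings (the cited papers may tokenize and aggregate differently):

```python
from collections import Counter

def token_f1(gold: str, pred: str) -> float:
    """Token-level F1 between two SPARQL strings (whitespace tokens).

    One plausible formulation of "token-wise match"; the cited
    papers may tokenize or aggregate differently.
    """
    gold_toks, pred_toks = Counter(gold.split()), Counter(pred.split())
    overlap = sum((gold_toks & pred_toks).values())
    if overlap == 0:
        return 0.0
    p = overlap / sum(pred_toks.values())
    r = overlap / sum(gold_toks.values())
    return 2 * p * r / (p + r)

# One wrong predicate out of eight tokens -> F1 = 0.875
print(token_f1("SELECT ?x WHERE { wd:Q76 wdt:P26 ?x }",
               "SELECT ?x WHERE { wd:Q76 wdt:P22 ?x }"))
```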

### Leaderboard for systems that do not require gold entities and/or relations as input

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| SGPT_Q [3] | 2022 | - | - | 83.45 | EN | Al Hasan Rony et al. |
| ChatGPT | 2023 | - | - | 42.76 | EN | Tan et al. |
| GPT-3.5v3 | 2023 | - | - | 39.04 | EN | Tan et al. |
| GPT-3.5v2 | 2023 | - | - | 33.77 | EN | Tan et al. |
| GPT-3 | 2023 | - | - | 33.04 | EN | Tan et al. |
| FLAN-T5 | 2023 | - | - | 30.14 | EN | Tan et al. |
| GETT-QA [4] | 2023 | 40.3 | - | - | EN | Banerjee et al. |
| UNIQORN | 2021 | 33.1 | - | - | EN | Pramanik et al. |
| QAnswer | 2020 | 30.80 | - | - | EN | Pramanik et al. |
| ElNeuQA-ConvS2S [1] | 2021 | 26.90 | 27 | 26.90 | EN | Diomedi, Hogan |
| GraftNet | 2018 | 19.7 | - | - | EN | Christmann P. et al. |
| GRAFT-Net + Clocq [2] | 2022 | 19.70 | - | - | EN | Christmann P. et al. |
| Platypus | 2018 | 3.6 | - | - | EN | Pramanik et al. |
| Pullnet | 2019 | 1.1 | - | - | EN | Pramanik et al. |
| UNIK-QA | 2020 | 0.5 | - | - | EN | Pramanik et al. |

- [1] Discarded 2,502 (8.2%) of the 30,226 instances due to quality issues.
- [2] 2k dev, 8k test; more complex questions derived from the original LC-QuAD 2.0.
- [3] Token-wise matching of the query string is performed; answers are not fetched from the KG.
- [4] With truncated KG embeddings.
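
Apart from the string-matched row [3], systems here are typically scored on the answers retrieved from the KG: the gold and the predicted query are both executed and their result sets compared. A minimal sketch of that protocol against the public Wikidata SPARQL endpoint (the endpoint URL is real; the helper functions are illustrative, not from any cited system):

```python
import requests

WIKIDATA = "https://query.wikidata.org/sparql"

def answer_set(query: str) -> set[str]:
    """Execute a SPARQL query and return its bindings as a flat set."""
    resp = requests.get(
        WIKIDATA,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    resp.raise_for_status()
    rows = resp.json()["results"]["bindings"]
    return {v["value"] for row in rows for v in row.values()}

def answer_f1(gold_query: str, pred_query: str) -> float:
    """F1 between the answer sets of the gold and predicted queries."""
    gold, pred = answer_set(gold_query), answer_set(pred_query)
    overlap = len(gold & pred)
    if overlap == 0:
        return 0.0
    p, r = overlap / len(pred), overlap / len(gold)
    return 2 * p * r / (p + r)
```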

## LC-QuAD v2 + QALD-9

### Leaderboard

| Model / System | Year | Precision | Recall | F1 | Language | Reported by |
|---|---|---|---|---|---|---|
| mBERT [1] | 2021 | - | - | 70 | PT_BR | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 66.7 | EN | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 65.9 | NL | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 63.6 | FR | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 63.5 | RU | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 63.5 | PT | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 62.6 | HI_IN | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 62.2 | DE | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 62.1 | RO | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 60 | FA | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 58.8 | ES | Zhou Y. et al. |
| mBERT [1] | 2021 | - | - | 57.7 | IT | Zhou Y. et al. |

- [1] All rows: trained on LC-QuAD 1.0 and tested on data combining QALD-4 through QALD-9, with some out-of-scope questions filtered out.

## References

[1] Trivedi, Priyansh, Gaurav Maheshwari, Mohnish Dubey, and Jens Lehmann. "LC-QuAD: A Corpus for Complex Question Answering over Knowledge Graphs." In International Semantic Web Conference, pp. 210-218. Springer, Cham, 2017.

[2] Dubey, Mohnish, Debayan Banerjee, Abdelrahman Abdelkawi, and Jens Lehmann. "LC-QuAD 2.0: A Large Dataset for Complex Question Answering over Wikidata and DBpedia." In International Semantic Web Conference, pp. 69-78. Springer, Cham, 2019.

Go back to the README