- Description:
With system performance on existing reading comprehension benchmarks nearing or surpassing human performance, we need a new, hard dataset that improves systems' capabilities to actually read paragraphs of text. DROP is a crowdsourced, adversarially created, 96k-question benchmark, in which a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than was necessary for prior datasets.
- Additional Documentation: Explore on Papers With Code
- Homepage: https://allennlp.org/drop
- Source code: tfds.text.drop.Drop
- Versions:
  - 1.0.0: Initial release.
  - 2.0.0 (default): Add all options for the answers.
- Download size: 7.92 MiB
- Dataset size: 116.24 MiB
- Auto-cached (documentation): Yes
- Splits:

| Split | Examples |
|---|---|
| 'dev' | 9,536 |
| 'train' | 77,409 |
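
The splits above can be loaded directly through the TFDS API. A minimal sketch, assuming the builder is registered under the name `'drop'` (matching the `tfds.text.drop.Drop` source class); omitting the version resolves to the 2.0.0 default:

```python
import tensorflow_datasets as tfds

# 'drop' resolves to the default version (2.0.0);
# 'drop:1.0.0' would pin the initial release instead.
train_ds = tfds.load('drop', split='train')
dev_ds = tfds.load('drop', split='dev')

print(train_ds.cardinality().numpy())  # expected: 77409
print(dev_ds.cardinality().numpy())    # expected: 9536
```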
- Feature structure:

```python
FeaturesDict({
    'answer': Text(shape=(), dtype=string),
    'passage': Text(shape=(), dtype=string),
    'query_id': Text(shape=(), dtype=string),
    'question': Text(shape=(), dtype=string),
    'validated_answers': Sequence(Text(shape=(), dtype=string)),
})
```
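
Each loaded example is a dict matching this structure. A short sketch of decoding the scalar `Text` features and the variable-length `validated_answers` sequence (again assuming the `'drop'` name):

```python
import tensorflow_datasets as tfds

ds = tfds.load('drop', split='dev')

for example in ds.take(1):
    # Text features arrive as scalar tf.string tensors.
    question = example['question'].numpy().decode('utf-8')
    passage = example['passage'].numpy().decode('utf-8')
    answer = example['answer'].numpy().decode('utf-8')
    # Sequence(Text) arrives as a 1-D tf.string tensor.
    validated = [v.decode('utf-8')
                 for v in example['validated_answers'].numpy()]
    print(question, '->', answer, validated)
```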
- Feature documentation:

| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| answer | Text | | string | |
| passage | Text | | string | |
| query_id | Text | | string | |
| question | Text | | string | |
| validated_answers | Sequence(Text) | (None,) | string | |
- Supervised keys (See as_supervised doc): None
- Figure (tfds.show_examples): Not supported.
- Examples (tfds.as_dataframe): see the sketch below.
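
Because the supervised keys are None, `as_supervised=True` is not available for this dataset; examples always come back as feature dicts. A dataframe preview like the one the catalog would render can be produced with `tfds.as_dataframe`; a sketch under the same `'drop'` name assumption:

```python
import tensorflow_datasets as tfds

ds = tfds.load('drop', split='dev')

# Render a few examples as a pandas DataFrame.
df = tfds.as_dataframe(ds.take(3))
print(df[['question', 'answer']])
```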
- Citation:

```
@inproceedings{Dua2019DROP,
  author={Dheeru Dua and Yizhong Wang and Pradeep Dasigi and Gabriel Stanovsky and Sameer Singh and Matt Gardner},
  title={ {DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs},
  booktitle={Proc. of NAACL},
  year={2019}
}
```