- Description:
BoolQ is a question answering dataset for yes/no questions containing 15,942 examples. These questions are naturally occurring: they are generated in unprompted and unconstrained settings.
Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context. The text-pair classification setup is similar to existing natural language inference tasks.
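The text-pair framing can be sketched in plain Python. This is a minimal illustration, not part of the dataset's tooling: the helper name `to_text_pair`, the choice to prepend the title to the passage, and the sample record are all assumptions made for the example.

```python
from typing import Optional, Tuple


def to_text_pair(
    question: str,
    passage: str,
    answer: bool,
    title: Optional[str] = None,
) -> Tuple[str, str, int]:
    """Turn one (question, passage, answer) triplet into an NLI-style pair.

    The passage plays the role of the premise and the question the role of
    the hypothesis. Prepending the title is an illustrative choice only.
    """
    premise = f"{title}: {passage}" if title else passage
    label = int(answer)  # True -> 1 ("yes"), False -> 0 ("no")
    return premise, question, label


# Hypothetical record, written by hand for illustration (not from the dataset).
sample = {
    "question": "is the sky blue",
    "passage": "On a clear day the sky appears blue.",
    "title": "Sky",
    "answer": True,
}

premise, hypothesis, label = to_text_pair(
    sample["question"], sample["passage"], sample["answer"], sample["title"]
)
```

Passing `title=None` leaves the passage unchanged, which matches the description of the title as optional context.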
Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/google-research-datasets/boolean-questions
Source code: tfds.datasets.bool_q.Builder
Versions:
1.0.0 (default): No release notes.
Download size: 8.36 MiB
Dataset size: 8.51 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'train' | 9,427 |
'validation' | 3,270 |
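A short sketch of loading both splits and checking their sizes against the table above. It assumes the dataset is registered under the name `bool_q` (inferred from the builder path `tfds.datasets.bool_q.Builder`); the helper names `load_boolq` and `check_split_sizes` are hypothetical. TensorFlow Datasets is imported lazily so the size check can be used on its own.

```python
# Split sizes as listed in the table above.
EXPECTED_SPLITS = {"train": 9427, "validation": 3270}


def load_boolq():
    """Load both splits plus dataset metadata.

    Assumes `tensorflow_datasets` is installed and that the dataset
    name is `bool_q`.
    """
    import tensorflow_datasets as tfds  # lazy import: only needed here

    names = list(EXPECTED_SPLITS)
    datasets, info = tfds.load("bool_q", split=names, with_info=True)
    return dict(zip(names, datasets)), info


def check_split_sizes(actual):
    """Return the expected sizes of any splits that are missing or differ."""
    return {
        name: size
        for name, size in EXPECTED_SPLITS.items()
        if actual.get(name) != size
    }
```

A possible usage, once the data has been downloaded: call `splits, info = load_boolq()`, then `check_split_sizes({name: info.splits[name].num_examples for name in splits})`, which should return an empty dict if the sizes match the table.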
- Feature structure:
FeaturesDict({
'answer': bool,
'passage': Text(shape=(), dtype=string),
'question': Text(shape=(), dtype=string),
'title': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
 | FeaturesDict | | | |
answer | Tensor | | bool | |
passage | Text | | string | |
question | Text | | string | |
title | Text | | string | |
Supervised keys (See as_supervised doc): None
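Because no supervised keys are defined, examples arrive as feature dictionaries, and an (inputs, label) pair has to be built by hand. The sketch below shows one way to do that; the function names `to_supervised` and `make_supervised` are hypothetical, and treating `question` and `passage` as the inputs is an assumption about the intended task rather than something the dataset prescribes.

```python
def to_supervised(example):
    """Map one feature dict to an (inputs, label) pair.

    Works on a plain Python dict and, under the assumption that `example`
    is a dict of tensors, as the function passed to `tf.data.Dataset.map`.
    """
    inputs = {
        "question": example["question"],
        "passage": example["passage"],
    }
    return inputs, example["answer"]


def make_supervised(ds):
    """Apply `to_supervised` to every element of a dict-structured dataset."""
    return ds.map(to_supervised)


# Hypothetical record, written by hand for illustration (not from the dataset).
record = {
    "question": "is water wet",
    "passage": "Water is a liquid at room temperature.",
    "title": "Water",
    "answer": False,
}
features, target = to_supervised(record)
```

The `title` feature is dropped here for brevity; it could be added to `inputs` in the same way if the extra context is wanted.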
Figure (tfds.show_examples): Not supported.
- Citation:
@inproceedings{clark2019boolq,
title = {BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions},
author = {Clark, Christopher and Lee, Kenton and Chang, Ming-Wei and Kwiatkowski, Tom and Collins, Michael and Toutanova, Kristina},
booktitle = {NAACL},
year = {2019},
}