openbookqa
The dataset contains 5,957 4-way multiple-choice questions. The authors also
provide 5,167 crowd-sourced common-knowledge facts and an expanded version of
the train/dev/test questions, in which each question is associated with its
originating core fact, a human accuracy score, a clarity score, and an
anonymized crowd-worker ID.
| Split | Examples |
|----------------|----------|
| 'test' | 500 |
| 'train' | 4,957 |
| 'validation' | 500 |
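The split sizes above can be checked against the 5,957-question total given in the description. The sketch below hard-codes the counts from the table and, as an assumption, shows how the three splits might be requested through `tfds.load`. The helper name `load_openbookqa_splits` is hypothetical, and the TFDS import is deferred so the arithmetic check runs without TensorFlow Datasets installed.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 4957, "validation": 500, "test": 500}


def total_examples(split_sizes):
    """Return the number of examples summed over all splits."""
    return sum(split_sizes.values())


def load_openbookqa_splits():
    """Hypothetical helper: request all three splits from TensorFlow Datasets.

    Assumes `tensorflow_datasets` is installed and the data can be downloaded.
    """
    import tensorflow_datasets as tfds  # deferred: only needed when called

    # Passing a list of split names returns one dataset per name, in order.
    return tfds.load("openbookqa", split=["train", "validation", "test"])


if __name__ == "__main__":
    print(total_examples(SPLIT_SIZES))  # 4957 + 500 + 500 = 5957
```

Calling `load_openbookqa_splits()` would trigger the download (about 1.38 MiB according to this page), so it is kept out of the default code path.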
FeaturesDict({
    'answerKey': ClassLabel(shape=(), dtype=int64, num_classes=4),
    'clarity': float32,
    'fact1': Text(shape=(), dtype=string),
    'humanScore': float32,
    'question': FeaturesDict({
        'choice_A': Text(shape=(), dtype=string),
        'choice_B': Text(shape=(), dtype=string),
        'choice_C': Text(shape=(), dtype=string),
        'choice_D': Text(shape=(), dtype=string),
        'stem': Text(shape=(), dtype=string),
    }),
    'turkIdAnonymized': Text(shape=(), dtype=string),
})
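To make the nesting concrete, here is a minimal sketch that reads one record shaped like the feature structure above and resolves the integer `answerKey` to the text of the selected choice. It assumes the four classes map to choices A through D in order (0 is A, 3 is D), which the structure itself does not state. The record uses placeholder strings rather than real dataset content, and `answer_text` is a hypothetical helper.

```python
# Assumed label order: class indices 0..3 correspond to choices A..D.
CHOICE_LABELS = ("A", "B", "C", "D")


def answer_text(example):
    """Return the text of the choice selected by the integer `answerKey`."""
    label = CHOICE_LABELS[int(example["answerKey"])]
    return example["question"]["choice_" + label]


# Placeholder record mirroring the feature structure (not real dataset content).
example = {
    "answerKey": 2,
    "clarity": 0.0,
    "fact1": "<core fact>",
    "humanScore": 0.0,
    "question": {
        "choice_A": "<option A>",
        "choice_B": "<option B>",
        "choice_C": "<option C>",
        "choice_D": "<option D>",
        "stem": "<question stem>",
    },
    "turkIdAnonymized": "<worker id>",
}

print(answer_text(example))  # prints "<option C>"
```

Records loaded through TFDS typically carry `Text` features as byte strings, so a real pipeline would likely need a decoding step before display.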
| Feature | Class | Shape | Dtype | Description |
|-------------------|--------------|-------|---------|-------------|
| | FeaturesDict | | | |
| answerKey | ClassLabel | | int64 | |
| clarity | Tensor | | float32 | |
| fact1 | Text | | string | |
| humanScore | Tensor | | float32 | |
| question | FeaturesDict | | | |
| question/choice_A | Text | | string | |
| question/choice_B | Text | | string | |
| question/choice_C | Text | | string | |
| question/choice_D | Text | | string | |
| question/stem | Text | | string | |
| turkIdAnonymized | Text | | string | |
@article{mihaylov2018can,
title={Can a suit of armor conduct electricity? a new dataset for open book question answering},
author={Mihaylov, Todor and Clark, Peter and Khot, Tushar and Sabharwal, Ashish},
journal={arXiv preprint arXiv:1809.02789},
year={2018}
}
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2022-12-15 UTC.
Additional Documentation: Explore on Papers With Code (https://paperswithcode.com/dataset/openbookqa)
Homepage: https://leaderboard.allenai.org/open_book_qa/submissions/get-started
Source code: tfds.datasets.openbookqa.Builder (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/openbookqa/openbookqa_dataset_builder.py)
Versions: 0.1.0 (default): No release notes.
Download size: 1.38 MiB
Dataset size: 2.40 MiB
Auto-cached (documentation: https://www.tensorflow.org/datasets/performances#auto-caching): Yes
Supervised keys (see the as_supervised doc: https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): ('question', 'answerKey')
Figure (tfds.show_examples: https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples): Not supported.
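The supervised keys `('question', 'answerKey')` name the feature used as model input and the feature used as label. The sketch below imitates that pairing on a plain dictionary and, as an assumption, shows the corresponding `tfds.load(..., as_supervised=True)` call inside a deferred helper. The names `to_supervised` and `load_supervised_train` are hypothetical, and the record holds placeholder strings, not real dataset content.

```python
# Supervised keys as listed on this page: (input feature, label feature).
SUPERVISED_KEYS = ("question", "answerKey")


def to_supervised(example, keys=SUPERVISED_KEYS):
    """Return the (input, label) pair selected by the supervised keys."""
    input_key, label_key = keys
    return example[input_key], example[label_key]


def load_supervised_train():
    """Hypothetical helper; assumes `tensorflow_datasets` is installed."""
    import tensorflow_datasets as tfds  # deferred: only needed when called

    # With as_supervised=True each element is an (input, label) tuple.
    return tfds.load("openbookqa", split="train", as_supervised=True)


# Placeholder record (not real dataset content).
example = {
    "answerKey": 1,
    "question": {
        "choice_A": "<option A>",
        "choice_B": "<option B>",
        "choice_C": "<option C>",
        "choice_D": "<option D>",
        "stem": "<question stem>",
    },
}

features, label = to_supervised(example)
print(sorted(features), label)
```

Note that the supervised pair omits `fact1`, `clarity`, `humanScore`, and `turkIdAnonymized`; loading without `as_supervised` keeps the full feature dictionary.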