# natural_questions_open
The NQ-Open task, introduced by Lee et al. (2019), is an open-domain question
answering benchmark derived from Natural Questions. The goal is to predict an
English answer string for an input English question. All questions can be
answered using the contents of English Wikipedia.
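The dataset ships as a TFDS builder, so it can be loaded with the standard `tfds.load` API. A minimal sketch (the first call downloads and prepares the data; split names are the `'train'` and `'validation'` splits of this dataset):

```python
import tensorflow_datasets as tfds  # pip install tensorflow-datasets

# First call downloads (~8.50 MiB) and prepares the dataset locally.
ds_train, ds_val = tfds.load('natural_questions_open',
                             split=['train', 'validation'])

for example in ds_train.take(1):
    print(example['question'])  # scalar tf.string tensor
    print(example['answer'])    # variable-length tf.string tensor
```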
**Splits**:

| Split          | Examples |
|----------------|----------|
| `'train'`      | 87,925   |
| `'validation'` | 3,610    |
**Feature structure**:

    FeaturesDict({
        'answer': Sequence(string),
        'question': string,
    })
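Concretely, each decoded example pairs one question string with a variable-length list of acceptable answer strings. A sketch of what such a record looks like (the specific question and answers shown are illustrative, not guaranteed to appear verbatim in the data):

```python
# A hypothetical decoded example, mirroring the FeaturesDict above:
# 'question' is a single string; 'answer' is a variable-length
# sequence of acceptable answer strings.
example = {
    "question": "when was the last time anyone was on the moon",
    "answer": ["December 1972", "14 December 1972"],
}

assert isinstance(example["question"], str)
assert all(isinstance(a, str) for a in example["answer"])
```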
**Feature documentation**:

| Feature  | Class            | Shape   | Dtype  | Description |
|----------|------------------|---------|--------|-------------|
|          | FeaturesDict     |         |        |             |
| answer   | Sequence(Tensor) | (None,) | string |             |
| question | Tensor           |         | string |             |
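Because `answer` holds several acceptable strings, systems on NQ-Open are conventionally scored by exact match against *any* reference after light normalization (lowercasing, stripping punctuation and articles). A minimal sketch of that convention; the normalization recipe here is the common SQuAD-style one and is an assumption, not part of this dataset:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(a) for a in answers)

print(exact_match("The Moon!", ["moon"]))      # True
print(exact_match("mars", ["moon", "luna"]))   # False
```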
**Citation**:

@inproceedings{orqa,
  title = {Latent Retrieval for Weakly Supervised Open Domain Question Answering},
  author = {Lee, Kenton and Chang, Ming-Wei and Toutanova, Kristina},
  year = {2019},
  month = {01},
  pages = {6086--6096},
  booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
  doi = {10.18653/v1/P19-1612}
}
@article{47761,
  title = {Natural Questions: a Benchmark for Question Answering Research},
  author = {Tom Kwiatkowski and Jennimaria Palomaki and Olivia Redfield and Michael Collins and Ankur Parikh and Chris Alberti and Danielle Epstein and Illia Polosukhin and Matthew Kelcey and Jacob Devlin and Kenton Lee and Kristina N. Toutanova and Llion Jones and Ming-Wei Chang and Andrew Dai and Jakob Uszkoreit and Quoc Le and Slav Petrov},
  year = {2019},
  journal = {Transactions of the Association for Computational Linguistics}
}
**Homepage**:
<https://github.com/google-research-datasets/natural-questions/tree/master/nq_open>

**Source code**:
[`tfds.datasets.natural_questions_open.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/natural_questions_open/natural_questions_open_dataset_builder.py)

**Versions**:
`1.0.0` (default): No release notes.

**Download size**: `8.50 MiB`

**Dataset size**: `8.70 MiB`

**Auto-cached** ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)): Yes

**Supervised keys** (see [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)): `None`

**Figure** ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)): Not supported.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2022-12-14 UTC.