snli
The SNLI corpus (version 1.0) is a collection of 570k human-written English
sentence pairs manually labeled for balanced classification with the labels
entailment, contradiction, and neutral, supporting the task of natural language
inference (NLI), also known as recognizing textual entailment (RTE).
| Split          | Examples |
|----------------|----------|
| `'test'`       | 10,000   |
| `'train'`      | 550,152  |
| `'validation'` | 10,000   |
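As a quick consistency check, the split sizes listed above can be summed and compared against the "570k" figure in the description. The snippet below is a minimal sketch; the names `SPLIT_SIZES` and `split_summary` are illustrative and not part of any library.

```python
# Split sizes copied from the listing above.
SPLIT_SIZES = {"train": 550_152, "validation": 10_000, "test": 10_000}


def split_summary(sizes):
    """Return the total example count and each split's share of that total."""
    total = sum(sizes.values())
    shares = {name: count / total for name, count in sizes.items()}
    return total, shares


total, shares = split_summary(SPLIT_SIZES)
print(f"total examples: {total:,}")
for name, share in shares.items():
    print(f"{name:>10}: {share:.2%}")
```

The three splits add up to 570,152 pairs, which matches the rounded 570k in the description.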
FeaturesDict({
'hypothesis': Text(shape=(), dtype=string),
'label': ClassLabel(shape=(), dtype=int64, num_classes=3),
'premise': Text(shape=(), dtype=string),
})
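Given this feature structure, each example is a dictionary with two text fields and an integer label. The following is a minimal sketch of how such examples might be loaded and printed with `tfds.load`. The helper names `format_example` and `preview` are illustrative, and the handling of negative labels rests on an assumption (that pairs without a gold label are encoded that way), not on anything stated on this page. The label names are read from the dataset metadata rather than hard-coded, since the integer-to-name order is not given here.

```python
def format_example(example, label_names):
    """Turn one decoded SNLI example (a dict) into a readable string.

    `example` is expected to hold `premise` and `hypothesis` as bytes and
    `label` as an integer, matching the feature structure above.
    """
    label = int(example["label"])
    # Assumption: a label outside the valid range marks a pair with no gold label.
    name = label_names[label] if 0 <= label < len(label_names) else "unlabeled"
    premise = example["premise"].decode("utf-8")
    hypothesis = example["hypothesis"].decode("utf-8")
    return f"[{name}] {premise} => {hypothesis}"


def preview(n=3):
    """Load the train split and print its first `n` examples."""
    # Imported lazily so the helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load("snli", split="train", with_info=True)
    label_names = info.features["label"].names
    for example in tfds.as_numpy(ds.take(n)):
        print(format_example(example, label_names))
```

Calling `preview()` downloads and prepares the dataset on first use, so it is left out of the block itself.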
| Feature    | Class        | Shape | Dtype  | Description |
|------------|--------------|-------|--------|-------------|
|            | FeaturesDict |       |        |             |
| hypothesis | Text         |       | string |             |
| label      | ClassLabel   |       | int64  |             |
| premise    | Text         |       | string |             |
@inproceedings{snli:emnlp2015,
Author = {Bowman, Samuel R. and Angeli, Gabor and Potts, Christopher and Manning, Christopher D.},
Booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
Publisher = {Association for Computational Linguistics},
Title = {A large annotated corpus for learning natural language inference},
Year = {2015}
}
Last updated 2023-01-13 UTC.
Additional documentation: Explore on Papers With Code (https://paperswithcode.com/dataset/snli)
Homepage: https://nlp.stanford.edu/projects/snli/
Source code: tfds.datasets.snli.Builder (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/snli/snli_dataset_builder.py)
Versions: 1.1.0 (default): No release notes.
Download size: 90.17 MiB
Dataset size: 87.00 MiB
Auto-cached (https://www.tensorflow.org/datasets/performances#auto-caching): Yes
Supervised keys (see the as_supervised doc, https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): None
Figure (tfds.show_examples): Not supported.