conll2003
The shared task of CoNLL-2003 concerns language-independent named entity
recognition and concentrates on four types of named entities: persons,
locations, organizations and names of miscellaneous entities that do not belong
to the previous three groups.
| Split   | Examples |
|---------|----------|
| 'dev'   | 3,251    |
| 'test'  | 3,454    |
| 'train' | 14,042   |
FeaturesDict({
'chunks': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=23)),
'ner': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=9)),
'pos': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=47)),
'tokens': Sequence(Text(shape=(), dtype=string)),
})
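The 9 classes of the `ner` feature are consistent with a BIO-style tag set over the four entity types (2 x 4 begin/inside tags plus one "outside" tag). The following is a minimal sketch of turning a sequence of `ner` ids into entity spans under that reading. The label order in `NER_LABELS` is an assumption, not something stated on this page, and `extract_entities` is a hypothetical helper name; check the dataset's own `ClassLabel` names before relying on the mapping.

```python
from typing import List, Optional, Sequence, Tuple

# ASSUMPTION: this label order is illustrative only. Verify it against the
# class names reported by the dataset before using it on real data.
NER_LABELS = [
    "O",
    "B-PER", "I-PER",
    "B-ORG", "I-ORG",
    "B-LOC", "I-LOC",
    "B-MISC", "I-MISC",
]


def extract_entities(
    tokens: Sequence[str],
    ner_ids: Sequence[int],
    labels: Sequence[str] = NER_LABELS,
) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, label_id in zip(tokens, ner_ids):
        tag = labels[label_id]
        if tag == "O":
            prefix, ent_type = "O", None
        else:
            prefix, ent_type = tag.split("-", 1)

        # A span starts on a B- tag, or on an I- tag whose type does not
        # match the span that is currently open.
        starts_new = prefix == "B" or (prefix == "I" and ent_type != current_type)

        # Close the open span when we leave it or when a new one begins.
        if current_tokens and (prefix == "O" or starts_new):
            entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

        if prefix != "O":
            current_tokens.append(token)
            current_type = ent_type

    # Flush a span that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities
```

The sketch takes tokens as Python strings; tokens read from the dataset may arrive as bytes and would need decoding first.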
| Feature | Class                | Shape   | Dtype  | Description |
|---------|----------------------|---------|--------|-------------|
|         | FeaturesDict         |         |        |             |
| chunks  | Sequence(ClassLabel) | (None,) | int64  |             |
| ner     | Sequence(ClassLabel) | (None,) | int64  |             |
| pos     | Sequence(ClassLabel) | (None,) | int64  |             |
| tokens  | Sequence(Text)       | (None,) | string |             |
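Since `chunks`, `ner`, and `pos` are stored as integer ids and `tokens` as strings, a reader usually wants to map the ids back to class names. The sketch below shows one way this might be done with `tfds.load`. The helper names `decode_example` and `iter_decoded` are hypothetical, and reading the class names through `info.features[...].feature.names` is an assumption about how the `Sequence(ClassLabel)` features expose their labels; treat it as a starting point rather than the library's prescribed usage.

```python
from typing import Any, Dict, Iterator, List, Mapping, Sequence


def decode_example(
    example: Mapping[str, Any],
    label_names: Mapping[str, Sequence[str]],
) -> Dict[str, List[str]]:
    """Convert one raw example (integer ids, possibly bytes tokens) to strings.

    `label_names` maps a feature name such as 'ner' to its list of class names.
    """
    decoded: Dict[str, List[str]] = {
        "tokens": [
            t.decode("utf-8") if isinstance(t, bytes) else str(t)
            for t in example["tokens"]
        ]
    }
    for feature, names in label_names.items():
        decoded[feature] = [names[int(i)] for i in example[feature]]
    return decoded


def iter_decoded(split: str = "train") -> Iterator[Dict[str, List[str]]]:
    """Yield decoded examples from one split ('train', 'dev', or 'test')."""
    # Imported lazily so decode_example works without TensorFlow installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load("conll2003", split=split, with_info=True)
    # ASSUMPTION: each label feature is a Sequence wrapping a ClassLabel,
    # whose class names are available as `.feature.names`.
    label_names = {
        name: info.features[name].feature.names
        for name in ("chunks", "ner", "pos")
    }
    for example in tfds.as_numpy(ds):
        yield decode_example(example, label_names)
```

Calling `next(iter_decoded("dev"))` would download the dataset on first use and return one sentence as parallel lists of tokens and tag names.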
@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,
title = "Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition",
author = "Tjong Kim Sang, Erik F. and
De Meulder, Fien",
booktitle = "Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003",
year = "2003",
url = "https://www.aclweb.org/anthology/W03-0419",
pages = "142--147",
}
conll2003/conll2003 (default config)
Last updated 2022-12-22 UTC.
Homepage: https://www.aclweb.org/anthology/W03-0419/
Source code: tfds.text.conll2003.Conll2003 (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/text/conll2003/conll2003.py)
Versions: 1.0.0 (default): Initial release.
Download size: 959.94 KiB
Dataset size: 3.87 MiB
Auto-cached (https://www.tensorflow.org/datasets/performances#auto-caching): Yes
Supervised keys (see the as_supervised doc, https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): None
Figure (tfds.show_examples): Not supported.