emnist
The EMNIST dataset is a set of handwritten characters (letters and digits) derived from the
NIST Special Database 19 and converted to a 28x28 pixel image format and dataset
structure that directly matches the MNIST dataset.
Note: Like the original EMNIST data, images provided here are inverted
horizontally and rotated 90° anti-clockwise. You can use tf.transpose within
ds.map to convert the images to a human-friendlier format.
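
The note above can be checked with a small sketch. It assumes the horizontal flip is applied before the anti-clockwise rotation; under that assumption the stored image is the transpose of the upright one, so transposing again restores it. The helper names, including `load_upright`, are hypothetical and not part of TFDS; `load_upright` imports TensorFlow lazily so the pure-Python part runs without it.

```python
def flip_horizontal(img):
    """Mirror each row of a row-major 2-D list left-to-right."""
    return [row[::-1] for row in img]


def rotate_90_anticlockwise(img):
    """Rotate a row-major 2-D list by 90 degrees anti-clockwise."""
    # Transpose, then reverse the row order: the last column becomes the first row.
    return [list(col) for col in zip(*img)][::-1]


def transpose(img):
    """Swap rows and columns; applying it twice gives back the input."""
    return [list(col) for col in zip(*img)]


# Assumed storage convention: flip first, then rotate.
upright = [[1, 2, 3],
           [4, 5, 6]]
stored = rotate_90_anticlockwise(flip_horizontal(upright))
restored = transpose(stored)


def load_upright(config="emnist/balanced", split="train"):
    """Hypothetical helper: load one config and transpose every image."""
    # Imported here so the list-based check above needs no TensorFlow install.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    ds = tfds.load(config, split=split, as_supervised=True)
    # Images have shape (28, 28, 1); swap height and width, keep the channel axis.
    return ds.map(lambda image, label: (tf.transpose(image, perm=[1, 0, 2]), label))
```

If the rotation were applied before the flip instead, a plain transpose would not be the inverse, so it is worth inspecting a few images (for example with tfds.show_examples) before relying on this.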
@article{cohen_afshar_tapson_schaik_2017,
  title={EMNIST: Extending MNIST to handwritten letters},
  DOI={10.1109/ijcnn.2017.7966217},
  journal={2017 International Joint Conference on Neural Networks (IJCNN)},
  author={Cohen, Gregory and Afshar, Saeed and Tapson, Jonathan and Schaik, Andre Van},
  year={2017}
}
emnist/byclass (default config)
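Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 116,323  |
| 'train' | 697,932  |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=62),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |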
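
The 62 classes of this config correspond to 10 digits, 26 uppercase letters, and 26 lowercase letters. The page does not state the label order, so the sketch below assumes the order used by the original EMNIST ByClass mapping (digits, then uppercase, then lowercase). `byclass_label_to_char` is a hypothetical helper, not a TFDS function; verify the mapping against a few decoded examples before using it.

```python
import string

# Assumed label order: 0-9 digits, 10-35 uppercase A-Z, 36-61 lowercase a-z.
BYCLASS_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase


def byclass_label_to_char(label: int) -> str:
    """Map an integer ByClass label to its character under the assumed order."""
    if not 0 <= label < len(BYCLASS_CHARS):
        raise ValueError(f"label must be in [0, {len(BYCLASS_CHARS) - 1}], got {label}")
    return BYCLASS_CHARS[label]
```

This mapping applies only to emnist/byclass; the other configs use different class counts and would need their own tables.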
emnist/bymerge
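Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 116,323  |
| 'train' | 697,932  |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=47),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |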
emnist/balanced
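Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 18,800   |
| 'train' | 112,800  |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=47),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |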
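
As a quick arithmetic check on the numbers above: if the 47 classes are exactly balanced, as the config name suggests (an assumption, since the page gives no per-class counts), the split sizes should divide evenly by the class count. `per_class_count` is a hypothetical helper written for this sketch.

```python
def per_class_count(num_examples: int, num_classes: int) -> int:
    """Return examples per class, assuming a perfectly balanced split."""
    count, remainder = divmod(num_examples, num_classes)
    if remainder:
        raise ValueError(
            f"{num_examples} examples do not divide evenly into {num_classes} classes"
        )
    return count


# Split sizes and class count taken from the emnist/balanced tables.
train_per_class = per_class_count(112_800, 47)  # 2400
test_per_class = per_class_count(18_800, 47)    # 400
```

Both splits divide evenly, which is consistent with a balanced config but does not prove it; counting labels in the loaded dataset is the reliable check.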
emnist/letters
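Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 14,800   |
| 'train' | 88,800   |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=37),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |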
emnist/digits
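Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 40,000   |
| 'train' | 240,000  |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |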
emnist/mnist
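Splits:

| Split   | Examples |
|---------|----------|
| 'test'  | 10,000   |
| 'train' | 60,000   |

Feature structure:

    FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    })

Feature documentation:

| Feature | Class        | Shape       | Dtype | Description |
|---------|--------------|-------------|-------|-------------|
|         | FeaturesDict |             |       |             |
| image   | Image        | (28, 28, 1) | uint8 |             |
| label   | ClassLabel   |             | int64 |             |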
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License , and code samples are licensed under the Apache 2.0 License . For details, see the Google Developers Site Policies . Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-06-01 UTC.
Additional dataset information:

- Additional documentation: Explore on Papers With Code (https://paperswithcode.com/dataset/emnist)
- Homepage: https://www.nist.gov/itl/products-and-services/emnist-dataset
- Source code: tfds.image_classification.EMNIST (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/image_classification/mnist.py)
- Versions:
  - 3.0.0: New split API (https://tensorflow.org/datasets/splits)
  - 3.1.0 (default): Updated broken download URL
- Download size: 535.73 MiB
- Supervised keys (see the as_supervised doc, https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): ('image', 'label')
- Dataset size and auto-caching per config (auto-caching documentation: https://www.tensorflow.org/datasets/performances#auto-caching):

| Config          | Config description | Dataset size | Auto-cached |
|-----------------|--------------------|--------------|-------------|
| emnist/byclass  | EMNIST ByClass     | 349.16 MiB   | No          |
| emnist/bymerge  | EMNIST ByMerge     | 349.16 MiB   | No          |
| emnist/balanced | EMNIST Balanced    | 56.63 MiB    | Yes         |
| emnist/letters  | EMNIST Letters     | 44.14 MiB    | Yes         |
| emnist/digits   | EMNIST Digits      | 120.32 MiB   | Yes         |
| emnist/mnist    | EMNIST MNIST       | 30.09 MiB    | Yes         |