cifar10_n
Warning: Manual download required. See instructions below.
A re-labeled version of CIFAR-10 with real human annotation errors. For every
pair (image, label) in the original CIFAR-10 train set, it provides several
additional labels given by real human annotators.
Homepage : https://ucsc-real.soe.ucsc.edu:1995/Home.html/
Source code : tfds.image_classification.cifar10_n.Cifar10N
Versions :
  1.0.0: Initial release.
  1.0.1: Fixed typo in worse_label key.
  1.0.2: Fixed correspondence between annotations and images.
  1.0.3: Fixed files in MANUAL_DIR.
  1.0.4 (default): Fixed loading of side information.
Download size : 162.17 MiB
Dataset size : 147.91 MiB
Manual download instructions : This dataset requires you to
download the source data manually into download_config.manual_dir
(defaults to ~/tensorflow_datasets/downloads/manual/):
Download 'side_info_cifar10N.csv', 'CIFAR-10_human_ordered.npy' and
'image_order_c10.npy' from https://github.com/UCSC-REAL/cifar-10-100n
Then convert 'CIFAR-10_human_ordered.npy' into a CSV file
'CIFAR-10_human_annotations.csv'. This can be done with the following code:
import numpy as np
from tensorflow_datasets.core.utils.lazy_imports_utils import pandas as pd
from tensorflow_datasets.core.utils.lazy_imports_utils import tensorflow as tf

human_labels_np_path = '<local_path>/CIFAR-10_human_ordered.npy'
human_labels_csv_path = '<local_path>/CIFAR-10_human_annotations.csv'

# The file holds a pickled Python object, hence allow_pickle=True.
with tf.io.gfile.GFile(human_labels_np_path, "rb") as f:
  human_annotations = np.load(f, allow_pickle=True)

# Indexing with [()] unwraps the 0-d object array into the object it holds.
df = pd.DataFrame(human_annotations[()])

# Write the annotations as CSV without the DataFrame index column.
with tf.io.gfile.GFile(human_labels_csv_path, "w") as f:
  df.to_csv(f, index=False)
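The `human_annotations[()]` indexing in the conversion code can look surprising. The following is a minimal sketch of why it is needed, assuming (as the use of `allow_pickle=True` suggests) that the `.npy` file stores a pickled Python dict of columns. The file names, column names, and values below are made up for illustration, and the standard `csv` module stands in for pandas so the sketch only depends on NumPy.

```python
import csv
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the real annotation file: a dict of
# equal-length columns. Keys and values are illustrative only.
fake_annotations = {
    "clean_label": [3, 8, 0],
    "random_label1": [3, 8, 2],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "fake_human_ordered.npy")
    csv_path = os.path.join(tmp_dir, "fake_human_annotations.csv")

    # Saving a dict wraps it in a 0-d object array, pickled on disk.
    np.save(npy_path, fake_annotations, allow_pickle=True)

    # Loading returns that 0-d array, not the dict itself.
    loaded = np.load(npy_path, allow_pickle=True)

    # Indexing with an empty tuple extracts the single stored object.
    columns = loaded[()]

    # Write one header row, then one row per annotated image.
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns.keys())
        writer.writerows(zip(*columns.values()))

    # Read the CSV back to inspect the result.
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
```

Here `loaded` has shape `()` and dtype `object`, so passing it directly to a table constructor would not expose the columns; `columns` is the plain dict.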
Auto-cached : Yes
Splits :

| Split   | Examples |
|---------|----------|
| 'test'  | 10,000   |
| 'train' | 50,000   |
Feature structure :

FeaturesDict({
    'aggre_label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'id': Text(shape=(), dtype=string),
    'image': Image(shape=(32, 32, 3), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'random_label1': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'random_label2': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'random_label3': ClassLabel(shape=(), dtype=int64, num_classes=10),
    'worker1_id': int64,
    'worker1_time': float32,
    'worker2_id': int64,
    'worker2_time': float32,
    'worker3_id': int64,
    'worker3_time': float32,
    'worse_label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})
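One common first step with a noisy-label dataset is to measure how often each annotated label disagrees with the reference label. The sketch below assumes that `label` holds the original CIFAR-10 label and that the other `ClassLabel` features are the human annotations; the helper name `noise_rates` and the toy examples are hypothetical, and real examples would come from iterating over the loaded train split.

```python
import numpy as np

# Human-annotated label keys, taken from the feature structure.
NOISY_KEYS = (
    "aggre_label",
    "random_label1",
    "random_label2",
    "random_label3",
    "worse_label",
)


def noise_rates(examples):
    """Returns, per noisy key, the fraction of examples that differ from `label`."""
    examples = list(examples)  # allow generators; we iterate more than once
    clean = np.array([ex["label"] for ex in examples])
    rates = {}
    for key in NOISY_KEYS:
        noisy = np.array([ex[key] for ex in examples])
        rates[key] = float(np.mean(noisy != clean))
    return rates


# Toy data for illustration only; these are not values from the dataset.
toy_examples = [
    {"label": 0, "aggre_label": 0, "random_label1": 0,
     "random_label2": 0, "random_label3": 1, "worse_label": 1},
    {"label": 2, "aggre_label": 2, "random_label1": 2,
     "random_label2": 2, "random_label3": 2, "worse_label": 2},
    {"label": 5, "aggre_label": 3, "random_label1": 3,
     "random_label2": 3, "random_label3": 5, "worse_label": 3},
    {"label": 7, "aggre_label": 7, "random_label1": 7,
     "random_label2": 9, "random_label3": 7, "worse_label": 9},
]

rates = noise_rates(toy_examples)
```

On the toy data, `rates["worse_label"]` is 0.75 and `rates["aggre_label"]` is 0.25; the real dataset's rates would have to be computed from the actual examples.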
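Feature documentation :

| Feature       | Class        | Shape       | Dtype   | Description |
|---------------|--------------|-------------|---------|-------------|
|               | FeaturesDict |             |         |             |
| aggre_label   | ClassLabel   |             | int64   |             |
| id            | Text         |             | string  |             |
| image         | Image        | (32, 32, 3) | uint8   |             |
| label         | ClassLabel   |             | int64   |             |
| random_label1 | ClassLabel   |             | int64   |             |
| random_label2 | ClassLabel   |             | int64   |             |
| random_label3 | ClassLabel   |             | int64   |             |
| worker1_id    | Tensor       |             | int64   |             |
| worker1_time  | Tensor       |             | float32 |             |
| worker2_id    | Tensor       |             | int64   |             |
| worker2_time  | Tensor       |             | float32 |             |
| worker3_id    | Tensor       |             | int64   |             |
| worker3_time  | Tensor       |             | float32 |             |
| worse_label   | ClassLabel   |             | int64   |             |

Supervised keys : None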
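The catalog entry for this dataset lists its supervised keys as `None`, so an `(image, label)` pair most likely has to be built by hand from each example dict. The sketch below shows one way to do that; `to_supervised` is a hypothetical helper, not part of TFDS, and the example dict is synthetic.

```python
import numpy as np

# Label keys available in each example, per the feature table.
LABEL_KEYS = (
    "label",
    "aggre_label",
    "random_label1",
    "random_label2",
    "random_label3",
    "worse_label",
)


def to_supervised(example, label_key="aggre_label"):
    """Hypothetical helper: maps an example dict to an (image, label) pair."""
    if label_key not in LABEL_KEYS:
        raise ValueError(f"Unknown label key: {label_key!r}")
    return example["image"], example[label_key]


# Synthetic example with the documented image shape and dtype.
example = {
    "id": "train_00000",  # made-up id format
    "image": np.zeros((32, 32, 3), dtype=np.uint8),
    "label": 3,
    "aggre_label": 3,
    "random_label1": 3,
    "random_label2": 5,
    "random_label3": 3,
    "worse_label": 5,
}

image, noisy_label = to_supervised(example, label_key="worse_label")
```

Choosing the label key explicitly makes it easy to train the same model against different noise levels. With a `tf.data.Dataset` of example dicts, the same function could presumably be applied through `ds.map(...)`.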
Citation :

@inproceedings{wei2022learning,
  title={Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations},
  author={Jiaheng Wei and Zhaowei Zhu and Hao Cheng and Tongliang Liu and Gang Niu and Yang Liu},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=TBWA6PLJZQm}
}
Last updated 2023-08-11 UTC.