cifar10_1

Description:

The CIFAR-10.1 dataset is a new test set for CIFAR-10. CIFAR-10.1 contains roughly 2,000 new test images that were sampled after multiple years of research on the original CIFAR-10 dataset. The data collection for CIFAR-10.1 was designed to minimize distribution shift relative to the original dataset. We describe the creation of CIFAR-10.1 in the paper "Do CIFAR-10 Classifiers Generalize to CIFAR-10?". The images in CIFAR-10.1 are a subset of the TinyImages dataset. There are currently two versions of the CIFAR-10.1 dataset: v4 and v6.

Homepage: https://github.com/modestyachts/CIFAR-10.1
Source code: tfds.image_classification.Cifar10_1
Versions:
- 1.1.0 (default): No release notes.
Auto-cached (documentation): Yes
Feature structure:

FeaturesDict({
    'image': Image(shape=(32, 32, 3), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(32, 32, 3)	uint8
label	ClassLabel		int64

Supervised keys (See as_supervised doc): ('image', 'label')
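The supervised keys above determine what `as_supervised=True` returns: `(image, label)` tuples instead of feature dictionaries. The following is a minimal sketch of loading one config that way, assuming `tensorflow_datasets` is installed and the data can be downloaded. The helper names `builder_name` and `load_test_split` are hypothetical and exist only for this illustration; `tfds.load` and its `split`, `as_supervised`, and `with_info` arguments are the real API.

```python
# Test-set sizes as listed for each config on this page.
EXPECTED_TEST_EXAMPLES = {"v4": 2021, "v6": 2000}


def builder_name(config: str = "v4") -> str:
    """Build the 'dataset/config' string that tfds.load expects (hypothetical helper)."""
    if config not in EXPECTED_TEST_EXAMPLES:
        raise ValueError(f"unknown config {config!r}; expected 'v4' or 'v6'")
    return f"cifar10_1/{config}"


def load_test_split(config: str = "v4"):
    """Load the only split ('test') as (image, label) pairs (hypothetical helper)."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(
        builder_name(config),
        split="test",
        as_supervised=True,  # yields (image, label) tuples per the supervised keys
        with_info=True,      # also return the DatasetInfo object
    )
    return ds, info
```

Calling `ds, info = load_test_split("v4")` and then reading `info.splits["test"].num_examples` is one way to compare the downloaded split against the example count in the Splits table.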
Citation:

@article{recht2018cifar10.1,
  author = {Benjamin Recht and Rebecca Roelofs and Ludwig Schmidt and Vaishaal Shankar},
  title = {Do CIFAR-10 Classifiers Generalize to CIFAR-10?},
  year = {2018},
  note = {\url{https://arxiv.org/abs/1806.00451}},
}

@article{torralba2008tinyimages,
  author = {Antonio Torralba and Rob Fergus and William T. Freeman},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title = {80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition},
  year = {2008},
  volume = {30},
  number = {11},
  pages = {1958-1970}
}

cifar10_1/v4 (default config)

Config description: v4 is the first version of our dataset on which we tested any classifier, which makes the v4 dataset independent of the classifiers we evaluate. The numbers reported in the main sections of our paper use this version of the dataset. It was built from the top 25 TinyImages keywords for each class, which led to a slight class imbalance. The largest difference is that ships make up only 8% of the test set instead of 10%. v4 contains 2,021 images.
Download size: 5.93 MiB
Dataset size: 4.46 MiB
Splits:

Split	Examples
`'test'`	2,021
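The class imbalance described for v4 can be inspected by computing each class's share of the labels. The following is a minimal sketch in plain Python; `class_fractions` is a hypothetical helper, and the toy labels are made up for illustration and are not CIFAR-10.1 data.

```python
from collections import Counter


def class_fractions(labels, num_classes=10):
    """Return each class's share of the examples, ordered by class index."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    # Classes that never occur get a share of 0.0.
    return [counts.get(c, 0) / total for c in range(num_classes)]


# Toy labels, invented for this example only (3 classes, 10 examples).
toy_labels = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
toy_fractions = class_fractions(toy_labels, num_classes=3)
```

Applied to the integer labels of the v4 test split with the default `num_classes=10`, a perfectly balanced set would give 0.1 for every class, so any class below that share is under-represented.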

cifar10_1/v6

Config description: v6 is derived from a slightly improved keyword allocation that is exactly class-balanced. This version of the dataset corresponds to the results in Appendix D of our paper. v6 contains 2,000 images.
Download size: 5.87 MiB
Dataset size: 4.40 MiB
Splits:

Split	Examples
`'test'`	2,000
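With 2,000 images and 10 classes, exact class balance in v6 means 200 images per class. The following is a minimal sketch of a balance check over a list of integer labels; `is_exactly_balanced` is a hypothetical helper written for this illustration, not part of TFDS.

```python
from collections import Counter


def is_exactly_balanced(labels, num_classes=10):
    """Return True if every class index occurs the same number of times."""
    labels = list(labels)
    # A total that does not divide evenly can never be exactly balanced.
    if len(labels) % num_classes != 0:
        return False
    per_class = len(labels) // num_classes
    counts = Counter(labels)
    return all(counts.get(c, 0) == per_class for c in range(num_classes))


# Per-class count implied by the figures on this page: 2,000 images, 10 classes.
expected_per_class = 2000 // 10
```

Passing the labels of the v6 test split should return `True` if the split matches its description, while the v4 split, with 2,021 images, cannot be exactly balanced across 10 classes.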
