• Description:

The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset, with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided for learning image models prior to supervised training. The primary challenge is to use the unlabeled data (which comes from a similar but different distribution than the labeled data) to build a useful prior. All images were acquired from labeled examples on ImageNet.

Split          Examples
'test'         8,000
'train'        5,000
'unlabelled'   100,000
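
To make the split sizes concrete, the following is a minimal sketch that derives the number of labeled training images per class and the share of labeled images among all images available for training. The names `SPLIT_SIZES`, `labeled_train_per_class`, `labeled_fraction`, and `load_split` are hypothetical helpers, not part of any library. The per-class figure assumes balanced classes, and `load_split` assumes the dataset is registered in TensorFlow Datasets under the name "stl10"; neither is stated on this page.

```python
# Split sizes as listed in the table above.
SPLIT_SIZES = {"train": 5_000, "test": 8_000, "unlabelled": 100_000}
NUM_CLASSES = 10  # num_classes of the 'label' feature


def labeled_train_per_class(sizes=SPLIT_SIZES, num_classes=NUM_CLASSES):
    """Labeled training images per class (assumes balanced classes)."""
    return sizes["train"] // num_classes


def labeled_fraction(sizes=SPLIT_SIZES):
    """Share of labeled images among 'train' plus 'unlabelled'."""
    return sizes["train"] / (sizes["train"] + sizes["unlabelled"])


def load_split(split):
    """Load one split with TensorFlow Datasets.

    Assumption: the dataset name is "stl10". The import is deferred so
    the helpers above work without TensorFlow Datasets installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load("stl10", split=split)


if __name__ == "__main__":
    print(labeled_train_per_class())
    print(f"{labeled_fraction():.2%}")
```

Under these assumptions, fewer than 5% of the images available before testing carry a label, which is why the unlabeled split dominates any training pipeline built on this dataset.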
  • Feature structure:
    FeaturesDict({
        'image': Image(shape=(96, 96, 3), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    })
  • Feature documentation:
Feature   Class        Shape         Dtype    Description
image     Image        (96, 96, 3)   uint8
label     ClassLabel                 int64
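
As an illustration of the feature specification above, here is a small NumPy sketch that checks one decoded example against the listed shape, dtype, and class count, then scales the pixels to floating point. The functions `check_example` and `to_float` are hypothetical helpers, and the image is a synthetic random array standing in for real data. The label range check is meant for the labeled splits; how the 'unlabelled' split encodes its label is not stated here.

```python
import numpy as np

IMAGE_SHAPE = (96, 96, 3)  # height, width, channels
NUM_CLASSES = 10


def check_example(image, label):
    """Raise ValueError if an example does not match the feature spec."""
    if image.shape != IMAGE_SHAPE or image.dtype != np.uint8:
        raise ValueError(f"bad image: shape={image.shape}, dtype={image.dtype}")
    if not 0 <= int(label) < NUM_CLASSES:
        raise ValueError(f"bad label: {label}")


def to_float(image):
    """Scale uint8 pixels to float32 values in [0, 1]."""
    return image.astype(np.float32) / 255.0


# Synthetic stand-in for one decoded example (random pixels, not real data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=IMAGE_SHAPE, dtype=np.uint8)
label = np.int64(3)

check_example(image, label)
scaled = to_float(image)
```

Validating shape and dtype before scaling catches resized or already-normalized inputs early, which would otherwise silently produce wrong value ranges.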


  • Citation:
  title={ {An Analysis of Single Layer Networks in Unsupervised Feature Learning} },
  author={Coates, Adam and Ng, Andrew and Lee, Honglak},
  note = {\url{} },