  • Description:

The iNaturalist 2021 dataset contains a total of 10,000 species. The full training dataset contains nearly 2.7M images. To make the dataset more accessible, we have also created a "mini" training dataset with 50 examples per species, for a total of 500K images. The full train split overlaps with the mini split. The val split contains 10 validation images for each species (100K in total). The public_test split (listed as 'test' in the split table) contains 500,000 test images without ground-truth labels.

  • Splits:

Split      Examples
'mini'     500,000
'test'     500,000
'train'    2,686,843
'val'      100,000
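
The split sizes can be checked against the per-species figures in the description, and a single split can be loaded on its own. The following is a minimal sketch, not an official recipe: the dataset name `i_naturalist2021` is an assumption (this section does not state it), and `per_species` and `load_split` are hypothetical helpers introduced for illustration.

```python
# Split sizes as listed in the split table.
SPLIT_SIZES = {"mini": 500_000, "test": 500_000, "train": 2_686_843, "val": 100_000}
NUM_SPECIES = 10_000  # number of species stated in the description


def per_species(split: str) -> float:
    """Average number of examples per species in the given split."""
    return SPLIT_SIZES[split] / NUM_SPECIES


def load_split(split: str = "mini", name: str = "i_naturalist2021"):
    """Load one split with TensorFlow Datasets.

    Assumption: the dataset is registered under `name`; adjust if it differs.
    Returns a (dataset, info) tuple because with_info=True.
    """
    import tensorflow_datasets as tfds  # lazy import; requires tensorflow-datasets

    return tfds.load(name, split=split, with_info=True)


if __name__ == "__main__":
    # 'test' is skipped because it has no ground-truth labels.
    for split in ("mini", "val", "train"):
        print(split, per_species(split))
```

Starting with the 'mini' split keeps the download and iteration cost well below that of the full 'train' split. The `split` argument also accepts the TFDS slicing syntax, for example `'mini[:1%]'`.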
  • Feature structure:
    FeaturesDict({
        'id': Text(shape=(), dtype=object),
        'image': Image(shape=(None, None, 3), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10000),
        'supercategory': ClassLabel(shape=(), dtype=int64, num_classes=11),
    })
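
Because the image shape is `(None, None, 3)`, images vary in height and width and cannot be batched until they are resized to a common shape. The sketch below shows one way to do this. The 224×224 target size is an arbitrary assumption, and `labels_in_range` and `preprocess` are hypothetical helpers rather than part of any library.

```python
NUM_SPECIES = 10_000        # num_classes of the 'label' feature
NUM_SUPERCATEGORIES = 11    # num_classes of the 'supercategory' feature


def labels_in_range(label: int, supercategory: int) -> bool:
    """Return True if both class indices fall inside the declared ranges."""
    return 0 <= label < NUM_SPECIES and 0 <= supercategory < NUM_SUPERCATEGORIES


def preprocess(example, size=(224, 224)):
    """Resize a variable-sized uint8 image to a fixed shape so examples can be batched.

    `example` is a dict with the keys listed in the feature structure.
    """
    import tensorflow as tf  # lazy import; requires TensorFlow

    image = tf.image.resize(example["image"], size)  # bilinear resize, returns float32
    image = image / 255.0                            # scale pixel values to [0, 1]
    return image, example["label"]
```

Given a dataset `ds` loaded from one of the labeled splits, `ds.map(preprocess).batch(32)` would then yield fixed-size image batches paired with species labels.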
  • Feature documentation:
Feature        Class       Shape            Dtype   Description
id             Text                         object
image          Image       (None, None, 3)  uint8
label          ClassLabel                   int64
supercategory  ClassLabel                   int64


  • Citation:
    Howpublished = {~\url{} },
    Title = {{iNaturalist} 2021 competition dataset.},
    Year = {2021},
    key = {{iNaturalist} 2021 competition dataset},