pass

PASS is a large-scale image dataset that does not include any humans, human parts, or other personally identifiable information. It can be used for high-quality self-supervised pretraining while significantly reducing privacy concerns.

PASS contains 1,439,719 unlabeled images sourced from YFCC-100M.

All images in this dataset are licenced under the CC-BY licence, as is the dataset itself. For YFCC-100M see http://www.multimediacommons.org/

| Split   | Examples  |
|---------|-----------|
| 'train' | 1,439,719 |
  • Feature structure:
FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image/creator_uname': Text(shape=(), dtype=tf.string),
    'image/date_taken': Text(shape=(), dtype=tf.string),
    'image/gps_lat': tf.float32,
    'image/gps_lon': tf.float32,
    'image/hash': Text(shape=(), dtype=tf.string),
})
  • Feature documentation:
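| Feature             | Class        | Shape           | Dtype      | Description |
|---------------------|--------------|-----------------|------------|-------------|
|                     | FeaturesDict |                 |            |             |
| image               | Image        | (None, None, 3) | tf.uint8   |             |
| image/creator_uname | Text         |                 | tf.string  |             |
| image/date_taken    | Text         |                 | tf.string  |             |
| image/gps_lat       | Tensor       |                 | tf.float32 |             |
| image/gps_lon       | Tensor       |                 | tf.float32 |             |
| image/hash          | Text         |                 | tf.string  |             |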
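
The feature structure above can be exercised with a short loading sketch. It assumes the dataset is registered in TensorFlow Datasets under the name `pass` (taken from this page's title) and that `tensorflow_datasets` is installed. The helper names `load_pass_train` and `check_example` are hypothetical and exist only for illustration.

```python
# Feature keys copied from the FeaturesDict listed on this page.
PASS_FEATURE_KEYS = (
    "image",
    "image/creator_uname",
    "image/date_taken",
    "image/gps_lat",
    "image/gps_lon",
    "image/hash",
)


def load_pass_train(data_dir=None):
    """Return the 'train' split as a tf.data.Dataset of feature dicts.

    Assumption: the dataset name is "pass", as in the page title.
    The import lives inside the function so this module can be imported
    without TensorFlow Datasets installed. The first call downloads and
    prepares the full dataset (about 1.4 million images).
    """
    import tensorflow_datasets as tfds

    return tfds.load("pass", split="train", data_dir=data_dir)


def check_example(example):
    """Return True if every documented feature key is present.

    Raises KeyError listing the missing keys otherwise.
    """
    missing = [key for key in PASS_FEATURE_KEYS if key not in example]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return True


# Usage (commented out because it triggers the download):
# ds = load_pass_train()
# for example in ds.take(1):
#     check_example(example)
#     print(example["image"].shape, example["image/hash"])
```

Because the dataset has no labels, each element is a plain dictionary of the features listed above rather than an `(image, label)` pair.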

  • Citation:
@Article{asano21pass,
author = "Yuki M. Asano and Christian Rupprecht and Andrew Zisserman and Andrea Vedaldi",
title = "PASS: An ImageNet replacement for self-supervised pretraining without humans",
journal = "NeurIPS Track on Datasets and Benchmarks",
year = "2021"
}