wake_vision

  • Description:

Wake Vision is a large, high-quality dataset of over 6 million images, exceeding the scale and diversity of current tinyML datasets (100x larger). Each image is annotated with whether it contains a person. The dataset also includes a fine-grained benchmark for assessing fairness and robustness, covering perceived gender, perceived age, subject distance, lighting conditions, and depictions. The Wake Vision labels are derived from Open Images annotations, which are licensed by Google LLC under the CC BY 4.0 license. The images are listed as having a CC BY 2.0 license. Note from Open Images: "while we tried to identify images that are licensed under a Creative Commons Attribution license, we make no representations or warranties regarding the license status of each image and you should verify the license for each image yourself."

  • Splits:
Split             Examples
'test'            55,763
'train_large'     5,760,428
'train_quality'   1,248,230
'validation'      18,582
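
As a minimal sketch of how one of these splits might be loaded: this assumes the dataset is registered in TensorFlow Datasets under the name `wake_vision` (the title of this page) and that the `tensorflow-datasets` package is installed. `SPLIT_SIZES` and `load_split` are illustrative names, not part of any library.

```python
# Split names and example counts copied from the splits table above.
SPLIT_SIZES = {
    "test": 55_763,
    "train_large": 5_760_428,
    "train_quality": 1_248_230,
    "validation": 18_582,
}


def load_split(split="validation"):
    """Load one Wake Vision split as a tf.data.Dataset of feature dicts.

    Hypothetical helper: it validates the split name, then defers to
    tfds.load. The first call downloads and prepares the dataset, which
    can take a long time given its size.
    """
    if split not in SPLIT_SIZES:
        raise ValueError(
            f"unknown split {split!r}; expected one of {sorted(SPLIT_SIZES)}"
        )
    # Imported lazily so SPLIT_SIZES is usable without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load("wake_vision", split=split)
```

Calling `load_split("validation")` should yield dictionaries keyed by the feature names listed under Feature structure.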
  • Feature structure:
FeaturesDict({
    'age_unknown': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'body_part': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'bright': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'dark': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'depiction': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'far': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'filename': Text(shape=(), dtype=string),
    'gender_unknown': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'medium_distance': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'middle_age': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'near': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'non-person_depiction': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'non-person_non-depiction': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'normal_lighting': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'older': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'person': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'person_depiction': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'predominantly_female': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'predominantly_male': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'young': ClassLabel(shape=(), dtype=int64, num_classes=2),
})
  • Feature documentation:
Feature                   Class         Shape            Dtype   Description
                          FeaturesDict
age_unknown               ClassLabel    ()               int64
body_part                 ClassLabel    ()               int64
bright                    ClassLabel    ()               int64
dark                      ClassLabel    ()               int64
depiction                 ClassLabel    ()               int64
far                       ClassLabel    ()               int64
filename                  Text          ()               string
gender_unknown            ClassLabel    ()               int64
image                     Image         (None, None, 3)  uint8
medium_distance           ClassLabel    ()               int64
middle_age                ClassLabel    ()               int64
near                      ClassLabel    ()               int64
non-person_depiction      ClassLabel    ()               int64
non-person_non-depiction  ClassLabel    ()               int64
normal_lighting           ClassLabel    ()               int64
older                     ClassLabel    ()               int64
person                    ClassLabel    ()               int64
person_depiction          ClassLabel    ()               int64
predominantly_female      ClassLabel    ()               int64
predominantly_male        ClassLabel    ()               int64
young                     ClassLabel    ()               int64
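
The binary flag features can be used to slice person-detection accuracy by subgroup, which is the kind of fairness and robustness check the description mentions. The following is a rough sketch, not the benchmark's official evaluation code: `subgroup_accuracy` and `LIGHTING_FLAGS` are hypothetical names, the records are hand-made toy data, and it assumes that a flag value of 1 means the attribute applies (the class-index convention is not stated here and should be verified).

```python
# Lighting-related flag features named in the feature structure.
LIGHTING_FLAGS = ("bright", "dark", "normal_lighting")


def subgroup_accuracy(examples, predictions, flags=LIGHTING_FLAGS):
    """Return {flag: accuracy} over the examples where that flag is set.

    Assumption: a flag value of 1 means the attribute applies to the image,
    and 'person' == 1 means a person is present. Flags with no matching
    examples map to None.
    """
    hits = {flag: 0 for flag in flags}
    totals = {flag: 0 for flag in flags}
    for example, predicted in zip(examples, predictions):
        correct = int(example["person"]) == int(predicted)
        for flag in flags:
            if int(example[flag]) == 1:
                totals[flag] += 1
                hits[flag] += int(correct)
    return {f: (hits[f] / totals[f] if totals[f] else None) for f in flags}


# Hand-made toy records (not real dataset rows) with the same keys.
toy = [
    {"person": 1, "bright": 1, "dark": 0, "normal_lighting": 0},
    {"person": 0, "bright": 1, "dark": 0, "normal_lighting": 0},
    {"person": 1, "bright": 0, "dark": 1, "normal_lighting": 0},
]
print(subgroup_accuracy(toy, predictions=[1, 1, 0]))
# → {'bright': 0.5, 'dark': 0.0, 'normal_lighting': None}
```

The same function works for other flag groups by passing, for example, `flags=("near", "medium_distance", "far")`.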


@article{banbury2024wake,
  title={Wake Vision: A Large-scale, Diverse Dataset and Benchmark Suite for TinyML Person Detection},
  author={Banbury, Colby and Njor, Emil and Stewart, Matthew and Warden, Pete and Kudlur, Manjunath and Jeffries, Nat and Fafoutis, Xenofon and Reddi, Vijay Janapa},
  journal={arXiv preprint arXiv:2405.00892},
  year={2024}
}