• Description:

The NYU-Depth V2 data set comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.

Split         Examples
'train'       47,584
'validation'  654
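The splits above could be loaded roughly as follows. This is a minimal sketch that assumes the data set is registered in TensorFlow Datasets under the name `nyu_depth_v2`. The helpers `load_split`, `total_examples`, and `validation_fraction` are hypothetical names introduced here for illustration; only the split sizes come from the table.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 47_584, "validation": 654}


def load_split(split="validation"):
    """Return a tf.data.Dataset for one split (hypothetical helper).

    Assumes the TFDS name "nyu_depth_v2". The import is done lazily so the
    rest of this module works without TensorFlow installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load("nyu_depth_v2", split=split)


def total_examples(sizes=SPLIT_SIZES):
    """Total number of examples across all splits."""
    return sum(sizes.values())


def validation_fraction(sizes=SPLIT_SIZES):
    """Share of all examples that belong to the validation split."""
    return sizes["validation"] / total_examples(sizes)


if __name__ == "__main__":
    print(f"total examples: {total_examples()}")
    print(f"validation share: {validation_fraction():.2%}")
```

Calling `load_split("train")` triggers a download and preparation of the full data set on first use, so the smaller validation split is the more practical starting point for a quick inspection.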
  • Feature structure:
    'depth': Tensor(shape=(480, 640), dtype=float16),
    'image': Image(shape=(480, 640, 3), dtype=uint8),
  • Feature documentation:
Feature  Class   Shape          Dtype    Description
depth    Tensor  (480, 640)     float16
image    Image   (480, 640, 3)  uint8
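As a rough illustration of what the feature table implies, the sketch below estimates the decoded, in-memory size of one example from the listed shapes and dtypes. The names `FEATURES`, `BYTES_PER_ELEMENT`, `feature_nbytes`, and `example_nbytes` are hypothetical. The estimate ignores on-disk compression and any framework overhead, so it is only an approximate lower bound on memory use.

```python
from math import prod

# Shapes and dtypes copied from the feature table above.
FEATURES = {
    "depth": {"shape": (480, 640), "dtype": "float16"},
    "image": {"shape": (480, 640, 3), "dtype": "uint8"},
}

# Standard element widths in bytes for the two dtypes used here.
BYTES_PER_ELEMENT = {"float16": 2, "uint8": 1}


def feature_nbytes(spec):
    """Bytes needed to hold one decoded feature with the given shape and dtype."""
    return prod(spec["shape"]) * BYTES_PER_ELEMENT[spec["dtype"]]


def example_nbytes(features=FEATURES):
    """Bytes needed to hold one decoded example (all features together)."""
    return sum(feature_nbytes(spec) for spec in features.values())


if __name__ == "__main__":
    per_example = example_nbytes()
    print(f"bytes per example: {per_example}")
    # 47,584 is the size of the 'train' split listed earlier.
    print(f"decoded train split: {per_example * 47_584 / 1e9:.2f} GB")
```

Under these assumptions a depth map takes 614,400 bytes and an image 921,600 bytes, so one decoded example occupies about 1.5 MB.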


  • Citation:
  author    = {Nathan Silberman, Derek Hoiem, Pushmeet Kohli and Rob Fergus},
  title     = {Indoor Segmentation and Support Inference from RGBD Images},
  booktitle = {ECCV},
  year      = {2012}

  author    = {Wofk, Diana and Ma, Fangchang and Yang, Tien-Ju and Karaman, Sertac and Sze, Vivienne},
  title     = {FastDepth: Fast Monocular Depth Estimation on Embedded Systems},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2019}