

Scene parsing segments an image into regions, each associated with a semantic category such as sky, road, person, or bed. The MIT Scene Parsing Benchmark (SceneParse150) provides a standard training and evaluation platform for scene parsing algorithms.
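
As a rough illustration of how the dataset might be consumed, the sketch below loads one split through TensorFlow Datasets and center-crops an image together with its annotation. The features listed for this dataset have shape (None, None, 3), meaning height and width vary per example, so some fixed-size step is usually needed before batching. The registered name `scene_parse150` is assumed here, and the helper names `load_scene_parse150`, `center_crop_pair`, and `preview_first_example` are hypothetical, not part of any library.

```python
import numpy as np


def load_scene_parse150(split="train"):
    """Load one split via TFDS (assumes the dataset name 'scene_parse150').

    The import is lazy so the cropping helper below can be used without
    TensorFlow installed. The first call downloads the data.
    """
    import tensorflow_datasets as tfds
    return tfds.load("scene_parse150", split=split)


def center_crop_pair(image, annotation, size):
    """Center-crop an image and its annotation to (size, size, channels).

    Both arrays get the same crop window so pixels stay aligned with
    their labels. Hypothetical helper for illustration only.
    """
    if image.shape[:2] != annotation.shape[:2]:
        raise ValueError("image and annotation must share height and width")
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("crop size exceeds image dimensions")
    top = (height - size) // 2
    left = (width - size) // 2
    window = (slice(top, top + size), slice(left, left + size))
    return image[window], annotation[window]


def preview_first_example(split="train", size=256):
    """Print the cropped shapes of the first example that is large enough."""
    import tensorflow_datasets as tfds
    dataset = load_scene_parse150(split)
    for example in tfds.as_numpy(dataset):
        image, annotation = example["image"], example["annotation"]
        if min(image.shape[:2]) < size:
            continue  # skip examples smaller than the crop window
        cropped_image, cropped_annotation = center_crop_pair(image, annotation, size)
        print(cropped_image.shape, cropped_annotation.shape)
        break
```

Calling `preview_first_example()` triggers the download on first use. Cropping is only one option: resizing or padding would also give fixed shapes, and an annotation should be resized with nearest-neighbor interpolation so label values are not blended.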

  • Splits:
Split     Examples
'test'    2,000
'train'   20,210
  • Feature structure:
    'annotation': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
  • Feature documentation:
Feature      Class   Shape             Dtype      Description
annotation   Image   (None, None, 3)   tf.uint8
image        Image   (None, None, 3)   tf.uint8
  • Citation:
title={Scene Parsing through ADE20K Dataset},
author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},