# kitti
- **Description**:

Kitti contains a suite of vision tasks built using an autonomous driving
platform. The full benchmark covers many tasks such as stereo, optical flow,
and visual odometry. This dataset contains only the object detection subset,
including the monocular images and bounding boxes: 7,481 training images
annotated with 3D bounding boxes. A full description of the annotations can be
found in the readme of the object development kit on the Kitti homepage.

- **Additional Documentation**:
  [Explore on Papers With Code](https://paperswithcode.com/dataset/kitti)

- **Homepage**: <http://www.cvlibs.net/datasets/kitti/>

- **Source code**:
  [`tfds.datasets.kitti.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/kitti/kitti_dataset_builder.py)

- **Versions**:
  - `3.1.0`: No release notes.
  - `3.2.0`: Devkit updated.
  - **`3.3.0`** (default): Added labels for the `occluded` feature.

- **Download size**: `11.71 GiB`

- **Dataset size**: `5.27 GiB`

- **Auto-cached**
  ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)):
  No

- **Splits**:
| Split          | Examples |
|----------------|----------|
| `'test'`       | 711      |
| `'train'`      | 6,347    |
| `'validation'` | 423      |
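The split sizes above, along with the version and size metadata, can be
verified programmatically. A minimal sketch using the standard TFDS builder
API (the printed values follow the tables on this page):

    import tensorflow_datasets as tfds

    # Pin the default version explicitly; '3.1.0' and '3.2.0' can be
    # requested the same way.
    builder = tfds.builder('kitti:3.3.0')
    builder.download_and_prepare()  # ~11.71 GiB download, ~5.27 GiB on disk

    print(builder.info.version)                             # 3.3.0
    print(builder.info.splits['train'].num_examples)        # 6347
    print(builder.info.splits['validation'].num_examples)   # 423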
- **Feature structure**:

    FeaturesDict({
        'image': Image(shape=(None, None, 3), dtype=uint8),
        'image/file_name': Text(shape=(), dtype=string),
        'objects': Sequence({
            'alpha': float32,
            'bbox': BBoxFeature(shape=(4,), dtype=float32, description=2D bounding box of object in the image),
            'dimensions': Tensor(shape=(3,), dtype=float32, description=3D object dimensions: height, width, length (in meters)),
            'location': Tensor(shape=(3,), dtype=float32, description=3D object location x,y,z in camera coordinates (in meters)),
            'occluded': ClassLabel(shape=(), dtype=int64, num_classes=4),
            'rotation_y': float32,
            'truncated': float32,
            'type': ClassLabel(shape=(), dtype=int64, num_classes=8),
        }),
    })
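To see how this structure surfaces in practice, the sketch below loads one
example with the standard `tfds.load` API and prints a few of the features;
the shape shown in the comment is illustrative, since image sizes vary.

    import tensorflow_datasets as tfds

    ds = tfds.load('kitti', split='train')

    for example in ds.take(1):
        print(example['image'].shape)      # e.g. (375, 1242, 3); sizes vary
        print(example['image/file_name'])  # original KITTI file name
        objects = example['objects']       # one entry per annotated object
        print(objects['type'])             # int64 class ids ('Car', 'Van', ...)
        print(objects['bbox'])             # float32, shape (num_objects, 4)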
- **Feature documentation**:

| Feature            | Class        | Shape           | Dtype   | Description                                                                                                             |
|--------------------|--------------|-----------------|---------|-------------------------------------------------------------------------------------------------------------------------|
|                    | FeaturesDict |                 |         |                                                                                                                         |
| image              | Image        | (None, None, 3) | uint8   |                                                                                                                         |
| image/file_name    | Text         |                 | string  |                                                                                                                         |
| objects            | Sequence     |                 |         |                                                                                                                         |
| objects/alpha      | Tensor       |                 | float32 | Observation angle of object, ranging [-pi..pi]                                                                         |
| objects/bbox       | BBoxFeature  | (4,)            | float32 | 2D bounding box of object in the image                                                                                 |
| objects/dimensions | Tensor       | (3,)            | float32 | 3D object dimensions: height, width, length (in meters)                                                                |
| objects/location   | Tensor       | (3,)            | float32 | 3D object location x,y,z in camera coordinates (in meters)                                                             |
| objects/occluded   | ClassLabel   |                 | int64   | Integer (0,1,2,3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown |
| objects/rotation_y | Tensor       |                 | float32 | Rotation ry around Y-axis in camera coordinates [-pi..pi]                                                              |
| objects/truncated  | Tensor       |                 | float32 | Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving image boundaries           |
| objects/type       | ClassLabel   |                 | int64   | The type of object, e.g. 'Car' or 'Van'                                                                                |

- **Supervised keys** (See
  [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):
  `None`
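The 3D fields above combine according to the KITTI devkit conventions:
`dimensions` is (height, width, length), `location` is the bottom-center of
the box in camera coordinates (with Y pointing down), and `rotation_y` is the
yaw around the camera Y axis. As a minimal sketch of how a 3D box can be
reconstructed from these fields, the hypothetical helper below computes the 8
box corners; the bottom-center and axis conventions are assumptions taken from
the KITTI devkit, not stated on this page.

    import numpy as np

    def box3d_corners(dimensions, location, rotation_y):
        """Return the 8 corners of a 3D box in camera coordinates.

        Assumes KITTI devkit conventions: dimensions = (height, width,
        length) in meters, location = bottom-center of the box in camera
        coordinates, rotation_y = yaw around the camera Y (down) axis.
        """
        h, w, l = dimensions
        # Corners in the object frame, origin at the bottom-center of the
        # box. Y points down in camera coordinates, so the top is at y = -h.
        x = np.array([l/2, l/2, -l/2, -l/2, l/2, l/2, -l/2, -l/2])
        y = np.array([0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h])
        z = np.array([w/2, -w/2, -w/2, w/2, w/2, -w/2, -w/2, w/2])
        # Rotate around the Y axis, then translate into the camera frame.
        c, s = np.cos(rotation_y), np.sin(rotation_y)
        rot = np.array([[c, 0.0, s],
                        [0.0, 1.0, 0.0],
                        [-s, 0.0, c]])
        return (rot @ np.vstack([x, y, z])).T + np.asarray(location)

Note that the 2D `bbox`, per TFDS's `BBoxFeature` convention, is stored as
normalized `[ymin, xmin, ymax, xmax]` coordinates, so drawing it on the image
requires scaling by the image height and width.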
- **Citation**:
    @inproceedings{Geiger2012CVPR,
      author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
      title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
      booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
      year = {2012}
    }