# cmu_play_fusion
- **Description**:

The robot plays with three complex scenes: a grill with many cooking objects (toaster, pan, etc.), where it has to pick, open, place, and close; a table it has to set by moving plates, cups, and utensils; and a sink/dishwasher area where it has to place dishes, hang cups, etc.

- **Homepage**:
  <https://play-fusion.github.io/>

- **Source code**:
  [`tfds.robotics.rtx.CmuPlayFusion`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/robotics/rtx/rtx.py)

- **Versions**:

  - **`0.1.0`** (default): Initial release.

- **Download size**: `Unknown size`

- **Dataset size**: `6.68 GiB`

- **Auto-cached**
  ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)):
  No

- **Splits**:

| Split     | Examples |
|-----------|----------|
| `'train'` | 576      |
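
The split above can be materialized with the standard TFDS API. The sketch below is illustrative and assumes a `tensorflow_datasets` release that ships the robotics/RT-X builders (see the source code link above).

```python
import tensorflow_datasets as tfds

# Build (or fetch) the dataset and open the single 'train' split.
builder = tfds.builder('cmu_play_fusion')
builder.download_and_prepare()
print(builder.info.splits['train'].num_examples)  # expected: 576

ds = builder.as_dataset(split='train')  # one element per episode
```
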
- **Feature structure**:

```
FeaturesDict({
    'episode_metadata': FeaturesDict({
        'file_path': Text(shape=(), dtype=string),
    }),
    'steps': Dataset({
        'action': Tensor(shape=(9,), dtype=float32, description=Robot action, consists of [7x delta eef (pos + quat), 1x gripper open/close (binary), 1x terminate episode].),
        'discount': Scalar(shape=(), dtype=float32, description=Discount if provided, default to 1.),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'language_embedding': Tensor(shape=(512,), dtype=float32, description=Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5),
        'language_instruction': Text(shape=(), dtype=string),
        'observation': FeaturesDict({
            'image': Image(shape=(128, 128, 3), dtype=uint8, description=Main camera RGB observation.),
            'state': Tensor(shape=(8,), dtype=float32, description=Robot state, consists of [7x robot joint angles, 1x gripper position].),
        }),
        'reward': Scalar(shape=(), dtype=float32, description=Reward if provided, 1 on final step for demos.),
    }),
})
```
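
To make the nested layout concrete, here is a minimal iteration sketch: each top-level element is one episode, and its `steps` field is itself a nested `tf.data.Dataset` of per-timestep dictionaries. Field accesses follow the structure above; availability of the builder in your `tensorflow_datasets` install is assumed.

```python
import tensorflow_datasets as tfds

ds = tfds.load('cmu_play_fusion', split='train')

for episode in ds.take(1):
    print(episode['episode_metadata']['file_path'].numpy())
    # 'steps' is a nested dataset of per-timestep dicts.
    for step in episode['steps'].take(3):
        image = step['observation']['image']          # (128, 128, 3) uint8
        state = step['observation']['state']          # (8,) float32
        action = step['action']                       # (9,) float32
        instruction = step['language_instruction']    # scalar string tensor
        print(instruction.numpy().decode('utf-8'), action.shape)
```
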
- **Feature documentation**:

| Feature                    | Class        | Shape         | Dtype   | Description |
|----------------------------|--------------|---------------|---------|-------------|
|                            | FeaturesDict |               |         |             |
| episode_metadata           | FeaturesDict |               |         |             |
| episode_metadata/file_path | Text         |               | string  | Path to the original data file. |
| steps                      | Dataset      |               |         |             |
| steps/action               | Tensor       | (9,)          | float32 | Robot action, consists of [7x delta eef (pos + quat), 1x gripper open/close (binary), 1x terminate episode]. |
| steps/discount             | Scalar       |               | float32 | Discount if provided, default to 1. |
| steps/is_first             | Tensor       |               | bool    |             |
| steps/is_last              | Tensor       |               | bool    |             |
| steps/is_terminal          | Tensor       |               | bool    |             |
| steps/language_embedding   | Tensor       | (512,)        | float32 | Kona language embedding. See https://tfhub.dev/google/universal-sentence-encoder-large/5 |
| steps/language_instruction | Text         |               | string  | Language Instruction. |
| steps/observation          | FeaturesDict |               |         |             |
| steps/observation/image    | Image        | (128, 128, 3) | uint8   | Main camera RGB observation. |
| steps/observation/state    | Tensor       | (8,)          | float32 | Robot state, consists of [7x robot joint angles, 1x gripper position]. |
| steps/reward               | Scalar       |               | float32 | Reward if provided, 1 on final step for demos. |
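
For convenience, the documented 9-dim action and 8-dim state can be unpacked into named parts. This is only an illustrative sketch: the table says the first 7 action entries are a delta end-effector pose (pos + quat) but does not state the position/quaternion ordering, so the 3+4 split below is an assumption.

```python
import tensorflow as tf

def split_action(action: tf.Tensor):
    """Unpack a (..., 9) action per the documented layout (ordering assumed)."""
    delta_pos = action[..., 0:3]   # assumed: delta end-effector position
    delta_quat = action[..., 3:7]  # assumed: delta end-effector orientation (quaternion)
    gripper = action[..., 7]       # gripper open/close (binary)
    terminate = action[..., 8]     # terminate-episode flag
    return delta_pos, delta_quat, gripper, terminate

def split_state(state: tf.Tensor):
    """Unpack a (..., 8) state: 7 robot joint angles + 1 gripper position."""
    return state[..., 0:7], state[..., 7]

# The per-step 'language_embedding' is documented as the 512-d output of
# https://tfhub.dev/google/universal-sentence-encoder-large/5 and can be
# regenerated with tensorflow_hub if needed, e.g.:
#   import tensorflow_hub as hub
#   use = hub.load('https://tfhub.dev/google/universal-sentence-encoder-large/5')
#   vec = use(['open the grill'])  # shape (1, 512)
```
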
- **Supervised keys** (See
  [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):
  `None`

- **Figure**
  ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)):
  Not supported.

- **Examples**
  ([tfds.as_dataframe](https://www.tensorflow.org/datasets/api_docs/python/tfds/as_dataframe)):

- **Citation**:

```
@inproceedings{chen2023playfusion,
  title={PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play},
  author={Chen, Lili and Bahl, Shikhar and Pathak, Deepak},
  booktitle={CoRL},
  year={2023}
}
```