rlu_dmlab_seekavoid_arena01
RL Unplugged is a suite of benchmarks for offline reinforcement learning. RL
Unplugged is designed around the following consideration: to facilitate ease of
use, we provide the datasets with a unified API, which makes it easy for the
practitioner to work with all data in the suite once a general pipeline has been
established.
The datasets follow the RLDS format
to represent steps and episodes.
The DeepMind Lab dataset has several levels from the challenging, partially
observable DeepMind Lab suite. The DeepMind Lab
dataset is collected by training distributed R2D2 agents
(Kapturowski et al., 2018)
from scratch on individual tasks. We recorded the experience across all actors
during entire training runs, a few times for every task. The details of the
dataset generation process are described in
Gulcehre et al., 2021.
We release datasets for five different DeepMind Lab levels:
seekavoid_arena_01, explore_rewards_few, explore_rewards_many,
rooms_watermaze, and rooms_select_nonmatching_object. We also release
snapshot datasets for the seekavoid_arena_01 level, which we generated
from a trained R2D2 snapshot with different values of epsilon for the
epsilon-greedy algorithm when evaluating the agent in the environment.
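The epsilon-greedy rule mentioned above can be sketched as follows. This is a minimal illustration, not the code used to generate the snapshot datasets: the function name `epsilon_greedy`, the four-action value list, and the sample count are assumptions made for the example. Only the epsilon values (0.0, 0.01, 0.25) come from this page, where they appear in the snapshot config names.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of action-value estimates.

    With probability `epsilon` a uniformly random action is returned;
    otherwise the action with the highest estimated value is returned.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical action values for a four-action agent (illustrative numbers only).
q = [0.1, 0.7, 0.2, 0.0]

# Epsilon values taken from the snapshot config names on this page.
for eps in (0.0, 0.01, 0.25):
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    actions = [epsilon_greedy(q, eps, rng) for _ in range(1000)]
    greedy_fraction = actions.count(1) / len(actions)
    print(f"epsilon={eps}: greedy action chosen {greedy_fraction:.1%} of the time")
```

With epsilon set to 0.0 the rule is purely greedy, so larger epsilon values should yield more varied, and typically lower-return, behavior in the recorded episodes.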
The DeepMind Lab dataset is fairly large-scale. We recommend trying it if you
are interested in large-scale offline RL models with memory.
Feature structure:

    FeaturesDict({
        'episode_id': int64,
        'episode_return': float32,
        'steps': Dataset({
            'action': int64,
            'discount': float32,
            'is_first': bool,
            'is_last': bool,
            'is_terminal': bool,
            'observation': FeaturesDict({
                'last_action': int64,
                'last_reward': float32,
                'pixels': Image(shape=(72, 96, 3), dtype=uint8),
            }),
            'reward': float32,
        }),
    })
Feature documentation:

| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| episode_id | Tensor | | int64 | |
| episode_return | Tensor | | float32 | |
| steps | Dataset | | | |
| steps/action | Tensor | | int64 | |
| steps/discount | Tensor | | float32 | |
| steps/is_first | Tensor | | bool | |
| steps/is_last | Tensor | | bool | |
| steps/is_terminal | Tensor | | bool | |
| steps/observation | FeaturesDict | | | |
| steps/observation/last_action | Tensor | | int64 | |
| steps/observation/last_reward | Tensor | | float32 | |
| steps/observation/pixels | Image | (72, 96, 3) | uint8 | |
| steps/reward | Tensor | | float32 | |
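To make the episode and step layout above concrete, here is a small pure-Python sketch that builds a toy episode with the same field names and checks the step flags. It does not read the real dataset. The helpers `make_toy_episode`, `flags_are_consistent`, and `summed_reward` are hypothetical names introduced for this example, the action, discount, and observation values are placeholders, and treating `episode_return` as the undiscounted sum of step rewards is an assumption that this page does not confirm.

```python
def make_toy_episode(rewards, episode_id=0):
    """Build an in-memory episode shaped like the feature structure above.

    The 'pixels' field is omitted to keep the sketch light; in the real data
    each observation also carries a (72, 96, 3) uint8 image.
    """
    last = len(rewards) - 1
    steps = []
    for t, reward in enumerate(rewards):
        steps.append({
            "action": 0,            # placeholder action index
            "discount": 1.0,        # placeholder value
            "is_first": t == 0,     # true only on the first step
            "is_last": t == last,   # true only on the final step
            "is_terminal": False,   # toy episode is treated as truncated
            "observation": {
                "last_action": 0,   # placeholder
                "last_reward": 0.0, # placeholder
            },
            "reward": reward,
        })
    return {
        "episode_id": episode_id,
        # Assumption: the return is the undiscounted sum of step rewards.
        "episode_return": sum(rewards),
        "steps": steps,
    }


def flags_are_consistent(steps):
    """Check that is_first / is_last mark the ends and is_terminal implies is_last."""
    if not steps:
        return False
    last = len(steps) - 1
    for t, step in enumerate(steps):
        if step["is_first"] != (t == 0):
            return False
        if step["is_last"] != (t == last):
            return False
        if step["is_terminal"] and not step["is_last"]:
            return False
    return True


def summed_reward(steps):
    """Add up the per-step rewards of one episode."""
    return sum(step["reward"] for step in steps)


episode = make_toy_episode([0.0, 1.0, 0.0, 2.0])
print("flags consistent:", flags_are_consistent(episode["steps"]))
print("summed reward matches episode_return:",
      summed_reward(episode["steps"]) == episode["episode_return"])
```

The same checks could be applied to real episodes after converting each step to Python values, which may be a useful sanity test before training on data of this size.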
Citation:

@article{gulcehre2021rbve,
title={Regularized Behavior Value Estimation},
author={ {\c{C} }aglar G{\"{u} }l{\c{c} }ehre and
Sergio G{\'{o} }mez Colmenarejo and
Ziyu Wang and
Jakub Sygnowski and
Thomas Paine and
Konrad Zolna and
Yutian Chen and
Matthew W. Hoffman and
Razvan Pascanu and
Nando de Freitas},
year={2021},
journal = {CoRR},
url = {https://arxiv.org/abs/2103.09575},
eprint={2103.09575},
archivePrefix={arXiv},
}
rlu_dmlab_seekavoid_arena01/training_0 (default config)
Dataset size: 356.86 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 134,707 |
rlu_dmlab_seekavoid_arena01/training_1
Dataset size: 337.09 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 128,472 |
rlu_dmlab_seekavoid_arena01/training_2
Dataset size: 355.62 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 133,545 |
rlu_dmlab_seekavoid_arena01/snapshot_0_eps_0.0
Dataset size: 89.16 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
rlu_dmlab_seekavoid_arena01/snapshot_1_eps_0.0
Dataset size: 89.03 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
rlu_dmlab_seekavoid_arena01/snapshot_0_eps_0.01
Dataset size: 89.12 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
rlu_dmlab_seekavoid_arena01/snapshot_1_eps_0.01
Dataset size: 89.02 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
rlu_dmlab_seekavoid_arena01/snapshot_0_eps_0.25
Dataset size: 88.57 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
rlu_dmlab_seekavoid_arena01/snapshot_1_eps_0.25
Dataset size: 88.51 GiB
Splits:

| Split | Examples |
|---|---|
| 'train' | 33,340 |
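The config listing above can be turned into dataset names and rough storage estimates with a short script. This is a sketch under stated assumptions: the `CONFIGS` table copies the sizes and episode counts from this page, while `tfds_name` and `load_config` are hypothetical helper names. `load_config` calls `tfds.load` with a `dataset/config` name and the `train` split, and imports TensorFlow Datasets lazily so the rest of the script runs without it. How the data is fetched or prepared is not covered here.

```python
DATASET = "rlu_dmlab_seekavoid_arena01"

# (config name, size in GiB, number of 'train' episodes), copied from this page.
CONFIGS = [
    ("training_0", 356.86, 134_707),
    ("training_1", 337.09, 128_472),
    ("training_2", 355.62, 133_545),
    ("snapshot_0_eps_0.0", 89.16, 33_340),
    ("snapshot_1_eps_0.0", 89.03, 33_340),
    ("snapshot_0_eps_0.01", 89.12, 33_340),
    ("snapshot_1_eps_0.01", 89.02, 33_340),
    ("snapshot_0_eps_0.25", 88.57, 33_340),
    ("snapshot_1_eps_0.25", 88.51, 33_340),
]


def tfds_name(config):
    """Return the 'dataset/config' name for one of the configs listed above."""
    if config not in {name for name, _, _ in CONFIGS}:
        raise ValueError(f"unknown config: {config!r}")
    return f"{DATASET}/{config}"


def load_config(config="training_0"):
    """Load the 'train' split of one config (requires tensorflow_datasets).

    The import is done lazily so the size summary below runs without it.
    """
    import tensorflow_datasets as tfds
    return tfds.load(tfds_name(config), split="train")


total_gib = sum(size for _, size, _ in CONFIGS)
total_episodes = sum(count for _, _, count in CONFIGS)

for name, size, count in CONFIGS:
    mib_per_episode = size * 1024 / count
    print(f"{tfds_name(name):50s} {size:8.2f} GiB  {mib_per_episode:5.2f} MiB/episode")
print(f"all configs: {total_gib:.2f} GiB, {total_episodes:,} episodes")
```

Printing the per-episode size before loading anything can help decide which config fits the available disk and bandwidth. The snapshot configs are roughly a quarter the size of each training config.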
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2022-11-23 UTC.
Homepage: https://github.com/deepmind/deepmind-research/tree/master/rl_unplugged
Source code: tfds.rl_unplugged.rlu_dmlab_seekavoid_arena01.RluDmlabSeekavoidArena01 (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/rl_unplugged/rlu_dmlab_seekavoid_arena01/rlu_dmlab_seekavoid_arena01.py)
Versions:
1.0.0: Initial release.
1.1.0: Added is_last.
1.2.0 (default): BGR -> RGB fix for pixel observations.
Download size: Unknown size
Auto-cached (documentation: https://www.tensorflow.org/datasets/performances#auto-caching): No
Supervised keys (see the as_supervised doc: https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): None
Figure (tfds.show_examples: https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples): Not supported.