locomotion

  • Description:

The datasets were created with a SAC agent trained on the environment reward of MuJoCo locomotion tasks. They are used in "What Matters for Adversarial Imitation Learning?" (Orsini et al., 2021).

The datasets follow the RLDS format to represent steps and episodes.
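As a minimal sketch of what the RLDS layout means in practice: each dataset element is one episode, and its `steps` field is itself a nested dataset of per-step dictionaries. The snippet assumes `tensorflow_datasets` is installed and that the config names listed on this page are available under the `locomotion/` prefix. `load_episodes` and `episode_summary` are hypothetical helper names introduced for illustration only.

```python
def load_episodes(config="ant_sac_1M_single_policy_stochastic", split="train"):
    """Load one locomotion config as a dataset of episodes.

    Assumes tensorflow_datasets is installed. The import is kept local so
    that episode_summary below can be used without it.
    """
    import tensorflow_datasets as tfds

    return tfds.load(f"locomotion/{config}", split=split)


def episode_summary(steps):
    """Return (length, undiscounted return) for a single episode.

    `steps` is any iterable of per-step dicts with a 'reward' entry,
    for example the nested `episode["steps"]` dataset of an RLDS episode.
    """
    length = 0
    total_reward = 0.0
    for step in steps:
        length += 1
        total_reward += float(step["reward"])
    return length, total_reward
```

With the dataset downloaded, passing `episode["steps"]` for each `episode` in `load_episodes()` to `episode_summary` should give one (length, return) pair per episode. The helper is deliberately independent of TensorFlow so it can be checked on plain Python dictionaries.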

@article{orsini2021matters,
  title={What Matters for Adversarial Imitation Learning?},
  author={Orsini, Manu and Raichuk, Anton and Hussenot, L{\'e}onard and Vincent, Damien and Dadashi, Robert and Girgin, Sertan and Geist, Matthieu and Bachem, Olivier and Pietquin, Olivier and Andrychowicz, Marcin},
  journal={International Conference on Machine Learning},
  year={2021}
}

locomotion/ant_sac_1M_single_policy_stochastic (default config)

  • Config description: Dataset generated by a SAC agent trained for 1M steps for Ant.

  • Download size: 6.49 MiB

  • Dataset size: 23.02 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split    Examples
'train'  50
  • Feature structure:
FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(8,), dtype=tf.float32),
        'discount': tf.float32,
        'is_first': tf.bool,
        'is_last': tf.bool,
        'is_terminal': tf.bool,
        'observation': Tensor(shape=(111,), dtype=tf.float32),
        'reward': tf.float32,
    }),
})
  • Feature documentation:
Feature            Class         Shape    Dtype       Description
                   FeaturesDict
steps              Dataset
steps/action       Tensor        (8,)     tf.float32
steps/discount     Tensor                 tf.float32
steps/is_first     Tensor                 tf.bool
steps/is_last      Tensor                 tf.bool
steps/is_terminal  Tensor                 tf.bool
steps/observation  Tensor        (111,)   tf.float32
steps/reward       Tensor                 tf.float32
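For imitation-learning use, per-step records with these fields are often regrouped into (observation, action, next observation) transitions. The following is a sketch under stated assumptions, not the procedure used by the dataset authors. `to_transitions` is a hypothetical helper, and it assumes the common RLDS convention that the step flagged `is_last` mainly supplies the final observation, so its own action and reward are not used.

```python
def to_transitions(steps):
    """Pair consecutive steps of one episode into transition dicts.

    Assumption (RLDS convention, not verified for this dataset): the final
    step of the episode only contributes its observation, so its action and
    reward are dropped.
    """
    steps = list(steps)
    transitions = []
    for current, following in zip(steps[:-1], steps[1:]):
        transitions.append({
            "observation": current["observation"],
            "action": current["action"],
            "reward": float(current["reward"]),
            "next_observation": following["observation"],
            # True only if the next step is a terminal state, as opposed to
            # an episode that was cut off (truncated).
            "terminal": bool(following["is_terminal"]),
        })
    return transitions
```

Keeping `terminal` separate from the end of the episode lets downstream code bootstrap from the last observation of a truncated episode while treating true terminal states as absorbing.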

locomotion/hopper_sac_1M_single_policy_stochastic

  • Config description: Dataset generated by a SAC agent trained for 1M steps for Hopper.

  • Download size: 2.26 MiB

  • Dataset size: 2.62 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split    Examples
'train'  50
  • Feature structure:
FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(3,), dtype=tf.float32),
        'discount': tf.float32,
        'is_first': tf.bool,
        'is_last': tf.bool,
        'is_terminal': tf.bool,
        'observation': Tensor(shape=(11,), dtype=tf.float32),
        'reward': tf.float32,
    }),
})
  • Feature documentation:
Feature            Class         Shape    Dtype       Description
                   FeaturesDict
steps              Dataset
steps/action       Tensor        (3,)     tf.float32
steps/discount     Tensor                 tf.float32
steps/is_first     Tensor                 tf.bool
steps/is_last      Tensor                 tf.bool
steps/is_terminal  Tensor                 tf.bool
steps/observation  Tensor        (11,)    tf.float32
steps/reward       Tensor                 tf.float32

locomotion/halfcheetah_sac_1M_single_policy_stochastic

  • Config description: Dataset generated by a SAC agent trained for 1M steps for HalfCheetah.

  • Download size: 4.49 MiB

  • Dataset size: 4.93 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split    Examples
'train'  50
  • Feature structure:
FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=tf.float32),
        'discount': tf.float32,
        'is_first': tf.bool,
        'is_last': tf.bool,
        'is_terminal': tf.bool,
        'observation': Tensor(shape=(17,), dtype=tf.float32),
        'reward': tf.float32,
    }),
})
  • Feature documentation:
Feature            Class         Shape    Dtype       Description
                   FeaturesDict
steps              Dataset
steps/action       Tensor        (6,)     tf.float32
steps/discount     Tensor                 tf.float32
steps/is_first     Tensor                 tf.bool
steps/is_last      Tensor                 tf.bool
steps/is_terminal  Tensor                 tf.bool
steps/observation  Tensor        (17,)    tf.float32
steps/reward       Tensor                 tf.float32

locomotion/walker2d_sac_1M_single_policy_stochastic

  • Config description: Dataset generated by a SAC agent trained for 1M steps for Walker2d.

  • Download size: 4.35 MiB

  • Dataset size: 4.91 MiB

  • Auto-cached (documentation): Yes

  • Splits:

Split    Examples
'train'  50
  • Feature structure:
FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=tf.float32),
        'discount': tf.float32,
        'is_first': tf.bool,
        'is_last': tf.bool,
        'is_terminal': tf.bool,
        'observation': Tensor(shape=(17,), dtype=tf.float32),
        'reward': tf.float32,
    }),
})
  • Feature documentation:
Feature            Class         Shape    Dtype       Description
                   FeaturesDict
steps              Dataset
steps/action       Tensor        (6,)     tf.float32
steps/discount     Tensor                 tf.float32
steps/is_first     Tensor                 tf.bool
steps/is_last      Tensor                 tf.bool
steps/is_terminal  Tensor                 tf.bool
steps/observation  Tensor        (17,)    tf.float32
steps/reward       Tensor                 tf.float32

locomotion/humanoid_sac_15M_single_policy_stochastic

  • Config description: Dataset generated by a SAC agent trained for 15M steps for Humanoid.

  • Download size: 192.78 MiB

  • Dataset size: 300.94 MiB

  • Auto-cached (documentation): No

  • Splits:

Split    Examples
'train'  200
  • Feature structure:
FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(17,), dtype=tf.float32),
        'discount': tf.float32,
        'is_first': tf.bool,
        'is_last': tf.bool,
        'is_terminal': tf.bool,
        'observation': Tensor(shape=(376,), dtype=tf.float32),
        'reward': tf.float32,
    }),
})
  • Feature documentation:
Feature            Class         Shape    Dtype       Description
                   FeaturesDict
steps              Dataset
steps/action       Tensor        (17,)    tf.float32
steps/discount     Tensor                 tf.float32
steps/is_first     Tensor                 tf.bool
steps/is_last      Tensor                 tf.bool
steps/is_terminal  Tensor                 tf.bool
steps/observation  Tensor        (376,)   tf.float32
steps/reward       Tensor                 tf.float32
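The per-config shapes listed on this page can be collected into a small lookup, which may help when sizing policy networks or sanity-checking loaded data. The numbers below are copied from the feature structures and split tables above. `LOCOMOTION_SPECS` and `check_step` are hypothetical names used for illustration and are not part of TFDS.

```python
# (observation_dim, action_dim, number of 'train' episodes) per config,
# copied from the listings on this page.
LOCOMOTION_SPECS = {
    "ant_sac_1M_single_policy_stochastic": (111, 8, 50),
    "hopper_sac_1M_single_policy_stochastic": (11, 3, 50),
    "halfcheetah_sac_1M_single_policy_stochastic": (17, 6, 50),
    "walker2d_sac_1M_single_policy_stochastic": (17, 6, 50),
    "humanoid_sac_15M_single_policy_stochastic": (376, 17, 200),
}


def check_step(config, step):
    """Return True if a step's observation and action match the listed sizes.

    `step` is a dict with 1-D 'observation' and 'action' entries (lists,
    NumPy arrays, or anything else that supports len()).
    """
    obs_dim, act_dim, _ = LOCOMOTION_SPECS[config]
    return len(step["observation"]) == obs_dim and len(step["action"]) == act_dim
```

For example, a Hopper step should carry an 11-dimensional observation and a 3-dimensional action, so `check_step("hopper_sac_1M_single_policy_stochastic", step)` would return `False` for a step loaded from the wrong config.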