scan

  • Description:

SCAN tasks with various splits.

SCAN is a set of simple language-driven navigation tasks for studying compositional learning and zero-shot generalization.

Most splits are described at https://github.com/brendenlake/SCAN. For the MCD splits, please see https://arxiv.org/abs/1912.09713.pdf.

Basic usage:

import tensorflow_datasets as tfds

data = tfds.load('scan/length')
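
The same 'scan/<config>' pattern applies to each predefined config listed further down this page. A minimal sketch, assuming only the config names shown on this page; the helper `scan_dataset_name` is hypothetical and not part of TFDS:

```python
# Config names copied from the per-config sections of this page.
SCAN_CONFIGS = (
    'simple', 'addprim_jump', 'addprim_turn_left',
    'filler_num0', 'filler_num1', 'filler_num2', 'filler_num3',
    'length', 'template_around_right', 'template_jump_around_right',
    'template_opposite_right', 'template_right',
    'mcd1', 'mcd2', 'mcd3',
)


def scan_dataset_name(config):
    """Return the 'scan/<config>' string expected by tfds.load.

    Hypothetical helper (not a TFDS function): it only validates the
    config against the names listed on this page.
    """
    if config not in SCAN_CONFIGS:
        raise ValueError(f'Unknown SCAN config: {config!r}')
    return f'scan/{config}'


print(scan_dataset_name('mcd1'))
```

The returned string can be passed straight to `tfds.load`, for example `tfds.load(scan_dataset_name('mcd1'))`.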

More advanced example:

import tensorflow_datasets as tfds
from tensorflow_datasets.datasets.scan import scan_dataset_builder

data = tfds.load(
    'scan',
    builder_kwargs=dict(
        config=scan_dataset_builder.ScanConfig(
            name='simple_p8', directory='simple_split/size_variations')))

  • Feature structure:

FeaturesDict({
    'actions': Text(shape=(), dtype=string),
    'commands': Text(shape=(), dtype=string),
})
  • Feature documentation:

Feature   Class         Shape  Dtype   Description
          FeaturesDict
actions   Text                 string
commands  Text                 string
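
Both features are plain strings. When records are read through `tfds.as_numpy`, string features arrive as `bytes`, so a small decoding step is usually needed before tokenizing. A minimal sketch, assuming whitespace-separated tokens; `decode_example` is a hypothetical helper, and the sample record is illustrative rather than copied from the dataset:

```python
def decode_example(example):
    """Turn one SCAN record with bytes values into lists of tokens.

    Hypothetical helper, not part of TFDS. Assumes both fields are
    UTF-8 encoded and whitespace-separated.
    """
    commands = example['commands'].decode('utf-8').split()
    actions = example['actions'].decode('utf-8').split()
    return commands, actions


# Illustrative record shaped like the FeaturesDict above.
sample = {'commands': b'jump twice', 'actions': b'I_JUMP I_JUMP'}
commands, actions = decode_example(sample)
print(commands)  # ['jump', 'twice']
print(actions)   # ['I_JUMP', 'I_JUMP']
```

With the real dataset, the same function could be applied to each record yielded by `tfds.as_numpy(tfds.load('scan/simple', split='train'))`.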

  • Citation:

@inproceedings{Lake2018GeneralizationWS,
  title={Generalization without Systematicity: On the Compositional Skills of
         Sequence-to-Sequence Recurrent Networks},
  author={Brenden M. Lake and Marco Baroni},
  booktitle={ICML},
  year={2018},
  url={https://arxiv.org/pdf/1711.00350.pdf},
}

@inproceedings{Keysers2020,
  title={Measuring Compositional Generalization: A Comprehensive Method on
         Realistic Data},
  author={Daniel Keysers and Nathanael Sch\"{a}rli and Nathan Scales and
          Hylke Buisman and Daniel Furrer and Sergii Kashubin and
          Nikola Momchev and Danila Sinopalnikov and Lukasz Stafiniak and
          Tibor Tihon and Dmitry Tsarkov and Xiao Wang and Marc van Zee and
          Olivier Bousquet},
  note={Additional citation for MCD splits},
  booktitle={ICLR},
  year={2020},
  url={https://arxiv.org/abs/1912.09713.pdf},
}

scan/simple (default config)

  • Download size: 17.82 MiB

  • Dataset size: 4.47 MiB

  • Splits:

Split Examples
'test' 4,182
'train' 16,728

scan/addprim_jump

  • Download size: 17.82 MiB

  • Dataset size: 4.53 MiB

  • Splits:

Split Examples
'test' 7,706
'train' 14,670

scan/addprim_turn_left

  • Download size: 17.82 MiB

  • Dataset size: 4.58 MiB

  • Splits:

Split Examples
'test' 1,208
'train' 21,890

scan/filler_num0

  • Download size: 17.82 MiB

  • Dataset size: 3.20 MiB

  • Splits:

Split Examples
'test' 1,173
'train' 15,225

scan/filler_num1

  • Download size: 17.82 MiB

  • Dataset size: 3.51 MiB

  • Splits:

Split Examples
'test' 1,173
'train' 16,290

scan/filler_num2

  • Download size: 17.82 MiB

  • Dataset size: 3.84 MiB

  • Splits:

Split Examples
'test' 1,173
'train' 17,391

scan/filler_num3

  • Download size: 17.82 MiB

  • Dataset size: 4.17 MiB

  • Splits:

Split Examples
'test' 1,173
'train' 18,528

scan/length

  • Download size: 17.82 MiB

  • Dataset size: 4.47 MiB

  • Splits:

Split Examples
'test' 3,920
'train' 16,990

scan/template_around_right

  • Download size: 17.82 MiB

  • Dataset size: 4.17 MiB

  • Splits:

Split Examples
'test' 4,476
'train' 15,225

scan/template_jump_around_right

  • Download size: 17.82 MiB

  • Dataset size: 4.17 MiB

  • Splits:

Split Examples
'test' 1,173
'train' 18,528

scan/template_opposite_right

  • Download size: 17.82 MiB

  • Dataset size: 4.22 MiB

  • Splits:

Split Examples
'test' 4,476
'train' 15,225

scan/template_right

  • Download size: 17.82 MiB

  • Dataset size: 4.26 MiB

  • Splits:

Split Examples
'test' 4,476
'train' 15,225

scan/mcd1

  • Download size: 17.89 MiB

  • Dataset size: 1.89 MiB

  • Splits:

Split Examples
'test' 1,045
'train' 8,365

scan/mcd2

  • Download size: 17.89 MiB

  • Dataset size: 1.84 MiB

  • Splits:

Split Examples
'test' 1,045
'train' 8,365

scan/mcd3

  • Download size: 17.89 MiB

  • Dataset size: 1.87 MiB

  • Splits:

Split Examples
'test' 1,045
'train' 8,365
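
The configs above differ noticeably in how much data they hold out for testing. A minimal sketch that computes the test share for a few of them, with counts copied from the split tables on this page; `held_out_fraction` is a hypothetical helper:

```python
# (train, test) example counts copied from the split tables on this page.
SPLIT_COUNTS = {
    'simple': (16728, 4182),
    'addprim_jump': (14670, 7706),
    'addprim_turn_left': (21890, 1208),
    'length': (16990, 3920),
    'mcd1': (8365, 1045),
}


def held_out_fraction(train, test):
    """Share of all examples in a config that belong to the test split."""
    return test / (train + test)


for name, (train, test) in SPLIT_COUNTS.items():
    print(f'{name}: {held_out_fraction(train, test):.1%} test')
```

For scan/simple this works out to 20% test, while scan/addprim_turn_left holds out only about 5%.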