bigbench

References:

abstract_narrative_understanding

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/abstract_narrative_understanding')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 3000
'train' 2400
'validation' 600
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

anachronisms

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/anachronisms')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 230
'train' 184
'validation' 46
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

analogical_similarity

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/analogical_similarity')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 323
'train' 259
'validation' 64
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

analytic_entailment

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/analytic_entailment')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 70
'train' 54
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

arithmetic

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/arithmetic')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 15023
'train' 12019
'validation' 3004
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

ascii_word_recognition

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/ascii_word_recognition')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 5000
'train' 4000
'validation' 1000
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

authorship_verification

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/authorship_verification')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 880
'train' 704
'validation' 176
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

auto_categorization

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/auto_categorization')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 328
'train' 263
'validation' 65
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

auto_debugging

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/auto_debugging')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 34
'train' 18
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bbq_lite_json

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/bbq_lite_json')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 16076
'train' 12866
'validation' 3210
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bridging_anaphora_resolution_barqa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/bridging_anaphora_resolution_barqa')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 648
'train' 519
'validation' 129
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

causal_judgment

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/causal_judgment')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 190
'train' 152
'validation' 38
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cause_and_effect

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/cause_and_effect')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 153
'train' 123
'validation' 30
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

checkmate_in_one

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/checkmate_in_one')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 3498
'train' 2799
'validation' 699
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

chess_state_tracking

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/chess_state_tracking')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 6000
'train' 4800
'validation' 1200
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

chinese_remainder_theorem

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/chinese_remainder_theorem')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 500
'train' 400
'validation' 100
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cifar10_classification

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/cifar10_classification')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 20000
'train' 16000
'validation' 4000
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

code_line_description

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/code_line_description')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 60
'train' 44
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

codenames

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/codenames')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 85
'train' 68
'validation' 17
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

color

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/color')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 4000
'train' 3200
'validation' 800
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

common_morpheme

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/common_morpheme')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 50
'train' 34
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

conceptual_combinations

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/conceptual_combinations')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 103
'train' 84
'validation' 19
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

conlang_translation

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/conlang_translation')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 164
'train' 132
'validation' 32
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

contextual_parametric_knowledge_conflicts

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/contextual_parametric_knowledge_conflicts')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 17528
'train' 14023
'validation' 3505
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

crash_blossom

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/crash_blossom')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 38
'train' 22
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

crass_ai

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/crass_ai')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 44
'train' 28
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cryobiology_spanish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/cryobiology_spanish')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 146
'train' 117
'validation' 29
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cryptonite

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/cryptonite')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 26157
'train' 20926
'validation' 5231
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cs_algorithms

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/cs_algorithms')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 1320
'train' 1056
'validation' 264
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

dark_humor_detection

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/dark_humor_detection')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 80
'train' 64
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

date_understanding

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/date_understanding')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 369
'train' 296
'validation' 73
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

disambiguation_qa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/disambiguation_qa')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 258
'train' 207
'validation' 51
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

discourse_marker_prediction

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/discourse_marker_prediction')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 857
'train' 686
'validation' 171
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

disfl_qa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/disfl_qa')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 8000
'train' 6400
'validation' 1600
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

dyck_languages

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/dyck_languages')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 1000
'train' 800
'validation' 200
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

elementary_math_qa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/elementary_math_qa')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 38160
'train' 30531
'validation' 7629
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

emoji_movie

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/emoji_movie')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 100
'train' 80
'validation' 20
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

emojis_emotion_prediction

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/emojis_emotion_prediction')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 131
'train' 105
'validation' 26
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

empirical_judgments

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/empirical_judgments')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 99
'train' 80
'validation' 19
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

english_proverbs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/english_proverbs')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 34
'train' 18
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

english_russian_proverbs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/english_russian_proverbs')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 80
'train' 64
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

entailed_polarity

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/entailed_polarity')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 148
'train' 119
'validation' 29
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

entailed_polarity_hindi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/entailed_polarity_hindi')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 138
'train' 111
'validation' 27
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

epistemic_reasoning

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/epistemic_reasoning')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 2000
'train' 1600
'validation' 400
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

evaluating_information_essentiality

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/evaluating_information_essentiality')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 68
'train' 52
'validation' 16
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_scores": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fact_checker

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:bigbench/fact_checker')
  • Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split Examples
'default' 7154
'train' 5724
'validation' 1430
  • Features:
{
    "idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "inputs": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice_targets": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "multiple_choice