abstract_narrative_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/abstract_narrative_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 3000 |
'train' | 2400 |
'validation' | 600 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
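As a minimal sketch of the load command above, the snippet below builds the dataset name from a task name and requests a single split. The helper names `bigbench_tfds_name` and `load_bigbench_split` are hypothetical, and the sketch assumes the `tensorflow_datasets` package is installed and can reach the Hugging Face-backed datasets; the split names are taken from the Splits table.

```python
def bigbench_tfds_name(task: str) -> str:
    """Build the TFDS dataset name in the form used on this page (hypothetical helper)."""
    return f"huggingface:bigbench/{task}"


def load_bigbench_split(task: str, split: str = "train"):
    """Load one split of a BIG-bench task through TFDS.

    Assumes tensorflow_datasets is installed. The import is deferred so that
    the name helper above can be used without it.
    """
    import tensorflow_datasets as tfds

    return tfds.load(bigbench_tfds_name(task), split=split)


# Example usage (requires tensorflow_datasets and network access):
# ds = load_bigbench_split("abstract_narrative_understanding", split="validation")
```

Keeping the name construction separate from the actual load makes it easy to loop over several task names without repeating the `huggingface:bigbench/` prefix.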
anachronisms
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/anachronisms')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 230 |
'train' | 184 |
'validation' | 46 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
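The split counts listed so far suggest that 'default' is the union of 'train' and 'validation' (for example, 184 + 46 = 230). That reading is an inference from the numbers, not something this page states. A small sketch that checks it for the two tasks above, with the counts copied from their Splits tables and hypothetical helper names:

```python
# Split sizes copied from the Splits tables of the two tasks listed above.
SPLITS = {
    "abstract_narrative_understanding": {"default": 3000, "train": 2400, "validation": 600},
    "anachronisms": {"default": 230, "train": 184, "validation": 46},
}


def splits_are_consistent(counts: dict) -> bool:
    """True when train + validation adds up to the 'default' count."""
    return counts["train"] + counts["validation"] == counts["default"]


def validation_fraction(counts: dict) -> float:
    """Share of the 'default' examples that fall in 'validation'."""
    return counts["validation"] / counts["default"]


for task, counts in SPLITS.items():
    print(task, splits_are_consistent(counts), round(validation_fraction(counts), 3))
```

If the inference holds, loading 'default' together with 'train' or 'validation' would count the same examples twice, so it is probably safer to pick one or the other.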
analogical_similarity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/analogical_similarity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 323 |
'train' | 259 |
'validation' | 64 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
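To make the feature schema more concrete, here is a sketch of how a single record with these fields might be read. The record is invented for illustration and is not taken from the dataset. The reading that a positive entry in `multiple_choice_scores` marks the matching entry of `multiple_choice_targets` as correct is an assumption, as is the helper name `correct_choices`.

```python
from typing import Dict, List

# A made-up record shaped like the Features schema; not real dataset content.
example = {
    "idx": 0,
    "inputs": "Q: placeholder question\nA:",
    "targets": ["choice B"],
    "multiple_choice_targets": ["choice A", "choice B", "choice C"],
    "multiple_choice_scores": [0, 1, 0],
}


def correct_choices(record: Dict) -> List[str]:
    """Return the choices whose score is positive (assumed to mean 'correct')."""
    targets = record["multiple_choice_targets"]
    scores = record["multiple_choice_scores"]
    if len(targets) != len(scores):
        raise ValueError("multiple_choice_targets and multiple_choice_scores differ in length")
    return [target for target, score in zip(targets, scores) if score > 0]


print(correct_choices(example))
```

The length check is there because the two sequences are declared independently in the schema, so nothing in the feature spec itself guarantees that they line up.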
analytic_entailment
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/analytic_entailment')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 70 |
'train' | 54 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
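Because the Features block is plain JSON, it can be parsed and used to sanity-check records. The sketch below does this with the standard library only. The function names are hypothetical, the mapping from `dtype` strings to Python types is an assumption, and `"length": -1` is treated as "variable length", which follows the Hugging Face `datasets` convention as far as I know.

```python
import json

# The Features spec from this page, written compactly.
FEATURES_JSON = """
{
  "idx": {"dtype": "int32", "id": null, "_type": "Value"},
  "inputs": {"dtype": "string", "id": null, "_type": "Value"},
  "targets": {"feature": {"dtype": "string", "id": null, "_type": "Value"},
              "length": -1, "id": null, "_type": "Sequence"},
  "multiple_choice_targets": {"feature": {"dtype": "string", "id": null, "_type": "Value"},
                              "length": -1, "id": null, "_type": "Sequence"},
  "multiple_choice_scores": {"feature": {"dtype": "int32", "id": null, "_type": "Value"},
                             "length": -1, "id": null, "_type": "Sequence"}
}
"""

# Assumed mapping from schema dtypes to Python types.
PY_TYPES = {"int32": int, "string": str}


def matches(spec: dict, value) -> bool:
    """Check one value against a 'Value' or 'Sequence' spec."""
    if spec["_type"] == "Value":
        # bool is a subclass of int in Python, so reject it explicitly.
        return isinstance(value, PY_TYPES[spec["dtype"]]) and not isinstance(value, bool)
    if spec["_type"] == "Sequence":
        if not isinstance(value, list):
            return False
        if spec["length"] != -1 and len(value) != spec["length"]:
            return False
        return all(matches(spec["feature"], item) for item in value)
    raise ValueError(f"unsupported feature type: {spec['_type']}")


def validate(record: dict, features: dict) -> bool:
    """True when the record has exactly the schema's keys and every value matches."""
    return record.keys() == features.keys() and all(
        matches(features[key], record[key]) for key in features
    )


features = json.loads(FEATURES_JSON)
```

A record such as `{"idx": 3, "inputs": "text", "targets": ["yes"], "multiple_choice_targets": ["yes", "no"], "multiple_choice_scores": [1, 0]}` would pass `validate`, while one whose scores are strings would not.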
arithmetic
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/arithmetic')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 15023 |
'train' | 12019 |
'validation' | 3004 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ascii_word_recognition
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/ascii_word_recognition')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 5000 |
'train' | 4000 |
'validation' | 1000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
authorship_verification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/authorship_verification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 880 |
'train' | 704 |
'validation' | 176 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
auto_categorization
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/auto_categorization')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 328 |
'train' | 263 |
'validation' | 65 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
auto_debugging
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/auto_debugging')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 34 |
'train' | 18 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
bbq_lite_json
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/bbq_lite_json')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 16076 |
'train' | 12866 |
'validation' | 3210 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
bridging_anaphora_resolution_barqa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/bridging_anaphora_resolution_barqa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 648 |
'train' | 519 |
'validation' | 129 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
causal_judgment
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/causal_judgment')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 190 |
'train' | 152 |
'validation' | 38 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cause_and_effect
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cause_and_effect')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 153 |
'train' | 123 |
'validation' | 30 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
checkmate_in_one
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/checkmate_in_one')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 3498 |
'train' | 2799 |
'validation' | 699 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
chess_state_tracking
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/chess_state_tracking')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 6000 |
'train' | 4800 |
'validation' | 1200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
chinese_remainder_theorem
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/chinese_remainder_theorem')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 500 |
'train' | 400 |
'validation' | 100 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cifar10_classification
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cifar10_classification')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 20000 |
'train' | 16000 |
'validation' | 4000 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
code_line_description
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/code_line_description')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 60 |
'train' | 44 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
codenames
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/codenames')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 85 |
'train' | 68 |
'validation' | 17 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
color
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/color')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 4000 |
'train' | 3200 |
'validation' | 800 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
common_morpheme
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/common_morpheme')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 50 |
'train' | 34 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
conceptual_combinations
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/conceptual_combinations')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 103 |
'train' | 84 |
'validation' | 19 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
conlang_translation
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/conlang_translation')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 164 |
'train' | 132 |
'validation' | 32 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
contextual_parametric_knowledge_conflicts
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/contextual_parametric_knowledge_conflicts')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 17528 |
'train' | 14023 |
'validation' | 3505 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
crash_blossom
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/crash_blossom')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 38 |
'train' | 22 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
crass_ai
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/crass_ai')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 44 |
'train' | 28 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cryobiology_spanish
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cryobiology_spanish')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 146 |
'train' | 117 |
'validation' | 29 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cryptonite
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cryptonite')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 26157 |
'train' | 20926 |
'validation' | 5231 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
cs_algorithms
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/cs_algorithms')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 1320 |
'train' | 1056 |
'validation' | 264 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
dark_humor_detection
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/dark_humor_detection')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 80 |
'train' | 64 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
date_understanding
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/date_understanding')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 369 |
'train' | 296 |
'validation' | 73 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
disambiguation_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/disambiguation_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 258 |
'train' | 207 |
'validation' | 51 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
discourse_marker_prediction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/discourse_marker_prediction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 857 |
'train' | 686 |
'validation' | 171 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
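Every entry in this listing uses the same load pattern, 'huggingface:bigbench/<task>'. A small sketch of wrapping that pattern is shown below. It assumes `tensorflow_datasets` (and the Hugging Face `datasets` package that community datasets depend on) is installed and that the split names from the tables can be passed as `split`. The helper names `tfds_name` and `load_task` are hypothetical.

```python
def tfds_name(task: str) -> str:
    """Build the TFDS community-dataset name used throughout this listing."""
    if not task or "/" in task:
        raise ValueError(f"unexpected task name: {task!r}")
    return f"huggingface:bigbench/{task}"


def load_task(task: str, split: str = "train"):
    """Load one split of a task.

    Assumption: network access and the required packages are available.
    The import is deferred so the name helper works without TFDS installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(task), split=split)


print(tfds_name("discourse_marker_prediction"))
```

Calling `load_task("discourse_marker_prediction", split="validation")` would then be expected to return the 171-example split listed above, provided the download succeeds.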
disfl_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/disfl_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 8000 |
'train' | 6400 |
'validation' | 1600 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
dyck_languages
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/dyck_languages')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 1000 |
'train' | 800 |
'validation' | 200 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
elementary_math_qa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/elementary_math_qa')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 38160 |
'train' | 30531 |
'validation' | 7629 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
emoji_movie
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/emoji_movie')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 100 |
'train' | 80 |
'validation' | 20 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
emojis_emotion_prediction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/emojis_emotion_prediction')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 131 |
'train' | 105 |
'validation' | 26 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
empirical_judgments
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/empirical_judgments')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 99 |
'train' | 80 |
'validation' | 19 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
english_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/english_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 34 |
'train' | 18 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
english_russian_proverbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/english_russian_proverbs')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 80 |
'train' | 64 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
entailed_polarity
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/entailed_polarity')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 148 |
'train' | 119 |
'validation' | 29 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
entailed_polarity_hindi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/entailed_polarity_hindi')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 138 |
'train' | 111 |
'validation' | 27 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
epistemic_reasoning
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/epistemic_reasoning')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 2000 |
'train' | 1600 |
'validation' | 400 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
evaluating_information_essentiality
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/evaluating_information_essentiality')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 68 |
'train' | 52 |
'validation' | 16 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_scores": {
"feature": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
fact_checker
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:bigbench/fact_checker')
- Description:
The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to
probe large language models, and extrapolate their future capabilities.
- License: Apache License 2.0
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'default' | 7154 |
'train' | 5724 |
'validation' | 1430 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"inputs": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice_targets": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"multiple_choice