References:
adjunct_island
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/adjunct_island')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
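The feature spec above can be turned into a quick sanity check on individual examples. The following is a minimal sketch, assuming each example has already been converted to plain Python values (strings decoded from bytes, scalars unwrapped). The names `BLIMP_FEATURES`, `validate_record`, and the `example` record are hypothetical and are not part of TFDS or the dataset.

```python
# Field names and Python types transcribed from the feature spec above
# ("string" -> str, "bool" -> bool, "int32" -> int).
BLIMP_FEATURES = {
    "sentence_good": str,
    "sentence_bad": str,
    "field": str,
    "linguistics_term": str,
    "UID": str,
    "simple_LM_method": bool,
    "one_prefix_method": bool,
    "two_prefix_method": bool,
    "lexically_identical": bool,
    "pair_id": int,
}


def _matches(value, expected):
    """isinstance check that does not let a bool pass as an int."""
    if expected is int:
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, expected)


def validate_record(record):
    """Return a list of problems in one example (empty if it conforms)."""
    problems = []
    for name, expected in BLIMP_FEATURES.items():
        if name not in record:
            problems.append(f"{name}: missing")
        elif not _matches(record[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    for name in record:
        if name not in BLIMP_FEATURES:
            problems.append(f"{name}: unexpected field")
    return problems


# Hypothetical record: the values are placeholders, not taken from the dataset.
example = {
    "sentence_good": "placeholder acceptable sentence",
    "sentence_bad": "placeholder unacceptable sentence",
    "field": "syntax",
    "linguistics_term": "island_effects",
    "UID": "adjunct_island",
    "simple_LM_method": True,
    "one_prefix_method": False,
    "two_prefix_method": False,
    "lexically_identical": True,
    "pair_id": 0,
}
print(validate_record(example))
```

Because every configuration on this page lists the same ten features, the same check should apply to any of them.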
anaphor_gender_agreement
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/anaphor_gender_agreement')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
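Minimal pairs such as these are typically used by checking whether a model scores `sentence_good` higher than `sentence_bad`. The sketch below illustrates that comparison under stated assumptions: `minimal_pair_accuracy` is a hypothetical helper, `toy_score` is a stand-in for a real language-model log-probability, and the two pairs are invented for illustration rather than drawn from the dataset.

```python
def minimal_pair_accuracy(pairs, score):
    """Fraction of pairs where score(sentence_good) > score(sentence_bad).

    `pairs` is an iterable of dicts with "sentence_good" and "sentence_bad";
    `score` maps a sentence to a number (higher means more acceptable).
    """
    total = 0
    correct = 0
    for pair in pairs:
        total += 1
        if score(pair["sentence_good"]) > score(pair["sentence_bad"]):
            correct += 1
    if total == 0:
        raise ValueError("no pairs given")
    return correct / total


def toy_score(sentence):
    """Stand-in scorer (assumption): only penalizes one agreement error."""
    return 0.0 if "cats sleeps" in sentence else 1.0


# Invented pairs for illustration; not examples from BLiMP.
toy_pairs = [
    {"sentence_good": "The cats sleep.", "sentence_bad": "The cats sleeps."},
    {"sentence_good": "She saw herself.", "sentence_bad": "She saw himself."},
]

# The toy scorer separates the first pair but ties on the second.
print(minimal_pair_accuracy(toy_pairs, toy_score))
```

A tie counts as a miss here, which is one possible convention; a real evaluation would replace `toy_score` with the model's own sentence score.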
anaphor_number_agreement
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/anaphor_number_agreement')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
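Every load command on this page follows the same `huggingface:blimp/<config>` pattern, so several configurations can be loaded in a loop. This is a rough sketch, assuming `tensorflow_datasets` is installed along with whatever it needs to read Hugging Face datasets. `blimp_tfds_name` and `load_blimp_configs` are hypothetical helper names.

```python
BLIMP_PREFIX = "huggingface:blimp"


def blimp_tfds_name(config):
    """Build the TFDS name for one BLiMP configuration, e.g. 'causative'."""
    if not config or "/" in config or ":" in config:
        raise ValueError(f"not a bare configuration name: {config!r}")
    return f"{BLIMP_PREFIX}/{config}"


def load_blimp_configs(configs, split="train"):
    """Load several configurations; returns {config: dataset}.

    tensorflow_datasets is imported lazily so that the name helper above
    stays usable without it. Each listing on this page shows a single
    'train' split, which is why that is the default here.
    """
    import tensorflow_datasets as tfds

    return {c: tfds.load(blimp_tfds_name(c), split=split) for c in configs}


print(blimp_tfds_name("anaphor_number_agreement"))
```

Calling `load_blimp_configs(["adjunct_island", "causative"])` would then download and prepare both configurations on first use.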
animate_subject_passive
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/animate_subject_passive')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
animate_subject_trans
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/animate_subject_trans')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
causative
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/causative')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
complex_NP_island
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/complex_NP_island')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
coordinate_structure_constraint_complex_left_branch
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/coordinate_structure_constraint_complex_left_branch')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
coordinate_structure_constraint_object_extraction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/coordinate_structure_constraint_object_extraction')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_irregular_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_irregular_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_irregular_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_irregular_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_with_adj_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_with_adj_irregular_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_irregular_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_with_adj_irregular_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_irregular_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
determiner_noun_agreement_with_adjective_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adjective_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
distractor_agreement_relational_noun
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/distractor_agreement_relational_noun')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
distractor_agreement_relative_clause
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/distractor_agreement_relative_clause')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
drop_argument
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/drop_argument')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
ellipsis_n_bar_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/ellipsis_n_bar_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
ellipsis_n_bar_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/ellipsis_n_bar_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
existential_there_object_raising
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/existential_there_object_raising')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
existential_there_quantifiers_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/existential_there_quantifiers_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
existential_there_quantifiers_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/existential_there_quantifiers_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
existential_there_subject_raising
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/existential_there_subject_raising')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
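Every configuration on this page lists the same ten features. As a rough sketch, a record can be checked against that schema once it has been decoded into a plain Python dict; the mapping from `string`/`bool`/`int32` to Python types, the helper name `schema_problems`, and the example record values are assumptions made for illustration only.

```python
# Python types assumed to correspond to the dtypes in the feature list.
EXPECTED_TYPES = {
    "sentence_good": str,
    "sentence_bad": str,
    "field": str,
    "linguistics_term": str,
    "UID": str,
    "simple_LM_method": bool,
    "one_prefix_method": bool,
    "two_prefix_method": bool,
    "lexically_identical": bool,
    "pair_id": int,
}


def schema_problems(record: dict) -> list:
    """Return human-readable problems; an empty list means the record fits."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in record:
            problems.append(f"missing feature: {name}")
        # Exact type check so that a bool is not accepted where int is expected.
        elif type(record[name]) is not expected:
            problems.append(f"{name}: expected {expected.__name__}")
    return problems


# Hypothetical record (values invented, not taken from the dataset).
example = {
    "sentence_good": "There seemed to be a problem.",
    "sentence_bad": "There tried to be a problem.",
    "field": "syntax",
    "linguistics_term": "example_term",
    "UID": "example_uid",
    "simple_LM_method": True,
    "one_prefix_method": False,
    "two_prefix_method": False,
    "lexically_identical": False,
    "pair_id": 0,
}

problems = schema_problems(example)
```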
expletive_it_object_raising
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/expletive_it_object_raising')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
inchoative
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/inchoative')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
intransitive
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/intransitive')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
irregular_past_participle_adjectives
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/irregular_past_participle_adjectives')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
irregular_past_participle_verbs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/irregular_past_participle_verbs')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
irregular_plural_subject_verb_agreement_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/irregular_plural_subject_verb_agreement_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
irregular_plural_subject_verb_agreement_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/irregular_plural_subject_verb_agreement_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
left_branch_island_echo_question
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/left_branch_island_echo_question')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
left_branch_island_simple_question
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/left_branch_island_simple_question')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
matrix_question_npi_licensor_present
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/matrix_question_npi_licensor_present')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
npi_present_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/npi_present_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
npi_present_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/npi_present_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
only_npi_licensor_present
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/only_npi_licensor_present')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
only_npi_scope
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/only_npi_scope')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
passive_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/passive_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
passive_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/passive_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_c_command
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_c_command')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_case_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_case_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_case_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_case_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_domain_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_domain_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_domain_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_domain_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_domain_3
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_domain_3')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
principle_A_reconstruction
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/principle_A_reconstruction')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
regular_plural_subject_verb_agreement_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/regular_plural_subject_verb_agreement_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
regular_plural_subject_verb_agreement_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/regular_plural_subject_verb_agreement_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
sentential_negation_npi_licensor_present
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/sentential_negation_npi_licensor_present')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
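The load commands on this page differ only in the subset name, so the pattern can be sketched as a small helper. This is a minimal sketch, not part of the catalog: `blimp_tfds_name` and `load_blimp_subset` are hypothetical names, and the loader assumes `tensorflow_datasets` is installed and the data can be downloaded.

```python
def blimp_tfds_name(subset: str) -> str:
    """Return the TFDS name this page uses for a BLiMP subset."""
    return f"huggingface:blimp/{subset}"


def load_blimp_subset(subset: str):
    """Load the 'train' split of one subset (hypothetical helper).

    The import is done lazily so that building the name does not require
    tensorflow_datasets to be installed.
    """
    import tensorflow_datasets as tfds

    # With split="train", tfds.load returns that split directly; without
    # a split argument it returns a dict keyed by split name.
    return tfds.load(blimp_tfds_name(subset), split="train")
```

For example, `load_blimp_subset("sentential_negation_npi_licensor_present")` would request the single 'train' split listed in the table above.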
sentential_negation_npi_scope
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/sentential_negation_npi_scope')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
sentential_subject_island
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/sentential_subject_island')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
superlative_quantifiers_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/superlative_quantifiers_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
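The feature specification above is the same for every subset on this page, so it can be mirrored as a typed record. The following is a minimal sketch under assumptions: `BlimpPair` and `is_valid_pair` are hypothetical names, the `int32` feature is mapped to a plain Python `int`, and records are assumed to have already been converted to native Python values.

```python
from typing import TypedDict


class BlimpPair(TypedDict):
    """One record, with the field names taken from the feature spec."""
    sentence_good: str
    sentence_bad: str
    field: str
    linguistics_term: str
    UID: str
    simple_LM_method: bool
    one_prefix_method: bool
    two_prefix_method: bool
    lexically_identical: bool
    pair_id: int  # declared as int32 in the feature spec


def is_valid_pair(record: dict) -> bool:
    """Check that a record has exactly the listed keys with matching types."""
    hints = BlimpPair.__annotations__
    if set(record) != set(hints):
        return False
    # Exact type comparison, so a bool is not accepted where an int is expected.
    return all(type(record[key]) is expected for key, expected in hints.items())
```

A check like this is only a convenience for catching schema drift after conversion; it does not inspect the content of the sentences.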
superlative_quantifiers_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/superlative_quantifiers_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
tough_vs_raising_1
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/tough_vs_raising_1')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
tough_vs_raising_2
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/tough_vs_raising_2')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
transitive
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/transitive')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_island
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_island')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_questions_object_gap
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_questions_object_gap')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
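Each record pairs a `sentence_good` with a `sentence_bad`, which suggests a simple way to use the data: count how often a model prefers the acceptable sentence. The sketch below assumes that reading; this page does not describe an evaluation procedure. `minimal_pair_accuracy` is a hypothetical helper, and `score` stands in for whatever sentence-scoring function a model provides (for example, a total log-probability).

```python
from typing import Callable, Iterable, Mapping


def minimal_pair_accuracy(
    pairs: Iterable[Mapping[str, object]],
    score: Callable[[str], float],
) -> float:
    """Fraction of pairs where `score` rates sentence_good above sentence_bad."""
    total = 0
    correct = 0
    for pair in pairs:
        total += 1
        good = score(str(pair["sentence_good"]))
        bad = score(str(pair["sentence_bad"]))
        # Ties count as incorrect: the acceptable sentence must score strictly higher.
        if good > bad:
            correct += 1
    if total == 0:
        raise ValueError("no pairs given")
    return correct / total
```

Counting ties as errors is a design choice made here for simplicity; a different tie-handling rule would change the reported number for models that often assign equal scores.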
wh_questions_subject_gap
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_questions_subject_gap')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_questions_subject_gap_long_distance
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_questions_subject_gap_long_distance')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_vs_that_no_gap
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_vs_that_no_gap')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_vs_that_no_gap_long_distance
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_vs_that_no_gap_long_distance')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_vs_that_with_gap
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_vs_that_with_gap')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}
wh_vs_that_with_gap_long_distance
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:blimp/wh_vs_that_with_gap_long_distance')
- Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'train' | 1000 |
- Features:
{
"sentence_good": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence_bad": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"field": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"linguistics_term": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"UID": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simple_LM_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"one_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"two_prefix_method": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"lexically_identical": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"pair_id": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
}