blimp

References:

adjunct_island

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/adjunct_island')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
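
For orientation, the feature schema above can be mirrored as a typed record. The following is a minimal sketch, not part of TFDS or the Hugging Face loader: `BlimpPair`, `EXPECTED_TYPES`, and `check_record` are hypothetical names, and the sample values are invented for illustration rather than taken from the dataset.

```python
from typing import List, TypedDict


class BlimpPair(TypedDict):
    """One record, mirroring the feature schema listed above."""
    sentence_good: str
    sentence_bad: str
    field: str
    linguistics_term: str
    UID: str
    simple_LM_method: bool
    one_prefix_method: bool
    two_prefix_method: bool
    lexically_identical: bool
    pair_id: int  # listed as int32 in the schema


# Python type expected per feature (string -> str, bool -> bool, int32 -> int).
EXPECTED_TYPES = {
    "sentence_good": str,
    "sentence_bad": str,
    "field": str,
    "linguistics_term": str,
    "UID": str,
    "simple_LM_method": bool,
    "one_prefix_method": bool,
    "two_prefix_method": bool,
    "lexically_identical": bool,
    "pair_id": int,
}


def check_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        # Exact type check, so a bool is not accepted where an int is expected.
        elif type(record[name]) is not expected:
            problems.append(f"{name}: expected {expected.__name__}")
    return problems


# Invented sample values (assumption: illustrative only, not real dataset rows).
sample: BlimpPair = {
    "sentence_good": "The dogs bark.",
    "sentence_bad": "The dogs barks.",
    "field": "syntax",
    "linguistics_term": "island_effects",
    "UID": "adjunct_island",
    "simple_LM_method": True,
    "one_prefix_method": False,
    "two_prefix_method": False,
    "lexically_identical": True,
    "pair_id": 0,
}
```

Calling `check_record(sample)` returns an empty list; dropping a key or storing `pair_id` as a string produces one message per problem.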

anaphor_gender_agreement

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/anaphor_gender_agreement')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
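
Because each record pairs an acceptable sentence with a minimally different unacceptable one, a common way to use such data is to check how often a model scores `sentence_good` above `sentence_bad`. The sketch below assumes the caller supplies a scoring function. `pair_accuracy` and `toy_score` are hypothetical names, `toy_score` is a stand-in and not a language model, and the two records are invented.

```python
from typing import Any, Callable, Iterable, Mapping


def pair_accuracy(pairs: Iterable[Mapping[str, Any]],
                  score: Callable[[str], float]) -> float:
    """Fraction of pairs where sentence_good scores strictly above sentence_bad."""
    total = 0
    correct = 0
    for pair in pairs:
        total += 1
        if score(pair["sentence_good"]) > score(pair["sentence_bad"]):
            correct += 1
    if total == 0:
        raise ValueError("no pairs given")
    return correct / total


def toy_score(sentence: str) -> float:
    """Stand-in scorer (NOT a language model): shorter sentences score higher."""
    return -float(len(sentence))


# Invented pairs (assumption: illustrative only, not real dataset rows).
toy_pairs = [
    {"sentence_good": "The dogs bark.", "sentence_bad": "The dogs barks."},
    {"sentence_good": "The dog barks.", "sentence_bad": "The dog bark."},
]

accuracy = pair_accuracy(toy_pairs, toy_score)
```

With this length-based stand-in the first pair is ranked correctly and the second is not, which shows why a real evaluation would replace `toy_score` with a model-derived score such as a sentence log-probability.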

anaphor_number_agreement

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/anaphor_number_agreement')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
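
Every configuration on this page is loaded with the same name pattern, `huggingface:blimp/<config>`. As a small sketch, the helper below builds that string and defers the actual `tfds.load` call to a function, so nothing is downloaded on import. `tfds_name` and `load_config` are hypothetical helper names, and `load_config` assumes `tensorflow_datasets` is installed and able to reach the Hugging Face dataset.

```python
from typing import List


def tfds_name(config: str) -> str:
    """Build the TFDS name for one BLiMP configuration."""
    return f"huggingface:blimp/{config}"


def load_config(config: str):
    """Load one configuration; the import is lazy so this module has no hard dependency."""
    import tensorflow_datasets as tfds  # assumption: installed in the environment
    return tfds.load(tfds_name(config))


# A few configuration names taken from the tfds.load commands on this page.
CONFIGS: List[str] = [
    "adjunct_island",
    "anaphor_gender_agreement",
    "anaphor_number_agreement",
]

NAMES = [tfds_name(config) for config in CONFIGS]
```

`NAMES` then holds the same strings that appear in the `tfds.load(...)` commands of the corresponding sections.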

animate_subject_passive

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/animate_subject_passive')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

animate_subject_trans

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/animate_subject_trans')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

causative

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/causative')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

complex_NP_island

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/complex_NP_island')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

coordinate_structure_constraint_complex_left_branch

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/coordinate_structure_constraint_complex_left_branch')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

coordinate_structure_constraint_object_extraction

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/coordinate_structure_constraint_object_extraction')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_1')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_2')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_irregular_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_irregular_1')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_irregular_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_irregular_2')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_with_adj_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_2')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_with_adj_irregular_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_irregular_1')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_with_adj_irregular_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adj_irregular_2')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

determiner_noun_agreement_with_adjective_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/determiner_noun_agreement_with_adjective_1')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

distractor_agreement_relational_noun

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/distractor_agreement_relational_noun')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

distractor_agreement_relative_clause

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/distractor_agreement_relative_clause')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License: No known license
  • Version: 0.1.0
  • Splits:
Split    Examples
'train'  1000
  • Features:
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

drop_argument

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/drop_argument')
  • Description:
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
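
Each example pairs an acceptable sentence (`sentence_good`) with a minimally different unacceptable one (`sentence_bad`). One common way to use such pairs is to check how often a model scores the good sentence higher than the bad one. The sketch below illustrates that idea under stated assumptions: `minimal_pair_accuracy` and `toy_score` are hypothetical names, the toy scorer merely stands in for a real language-model score, and the two pairs are made up rather than taken from the dataset.

```python
def minimal_pair_accuracy(pairs, score):
    """Fraction of pairs where score(sentence_good) > score(sentence_bad).

    `pairs` is an iterable of dicts with "sentence_good" and "sentence_bad";
    `score` maps a sentence to a number (higher means more acceptable).
    """
    total = 0
    correct = 0
    for pair in pairs:
        total += 1
        if score(pair["sentence_good"]) > score(pair["sentence_bad"]):
            correct += 1
    if total == 0:
        raise ValueError("no pairs given")
    return correct / total


def toy_score(sentence):
    """Hypothetical stand-in for a model score: shorter sentences score higher."""
    return -len(sentence.split())


# Made-up pairs, used only to show the call shape.
TOY_PAIRS = [
    {"sentence_good": "a b", "sentence_bad": "a b c"},
    {"sentence_good": "a b c d", "sentence_bad": "a b"},
]

print(minimal_pair_accuracy(TOY_PAIRS, toy_score))
```

In practice `score` would be replaced by a function returning the probability or log-probability that the language model under evaluation assigns to the sentence.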

ellipsis_n_bar_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/ellipsis_n_bar_1')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
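
All of the load commands on this page follow the same `huggingface:blimp/<config>` naming pattern, so several sub-datasets can be loaded in a loop. The sketch below assumes that pattern; `tfds_name` and `load_all` are hypothetical helpers, and the loader is passed in as an argument so the example runs without TFDS installed.

```python
BUILDER_PREFIX = "huggingface:blimp"

# A few of the config names that appear in the load commands on this page.
CONFIGS = ["drop_argument", "ellipsis_n_bar_1", "ellipsis_n_bar_2"]


def tfds_name(config):
    """Build the dataset name string used in the load commands above."""
    return f"{BUILDER_PREFIX}/{config}"


def load_all(configs, load_fn):
    """Call load_fn once per config and return a dict keyed by config name.

    load_fn is expected to accept a dataset name string; with TFDS installed
    this could be tfds.load.
    """
    return {config: load_fn(tfds_name(config)) for config in configs}


# Dry run with a stand-in loader that simply echoes the name it receives.
for config, name in load_all(CONFIGS, lambda n: n).items():
    print(config, "->", name)
```

With TensorFlow Datasets available, `load_all(CONFIGS, tfds.load)` would issue the same calls as the individual commands listed on this page.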

ellipsis_n_bar_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/ellipsis_n_bar_2')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

existential_there_object_raising

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/existential_there_object_raising')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

existential_there_quantifiers_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/existential_there_quantifiers_1')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

existential_there_quantifiers_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/existential_there_quantifiers_2')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

existential_there_subject_raising

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/existential_there_subject_raising')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

expletive_it_object_raising

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/expletive_it_object_raising')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

inchoative

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/inchoative')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

intransitive

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/intransitive')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

irregular_past_participle_adjectives

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/irregular_past_participle_adjectives')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

irregular_past_participle_verbs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/irregular_past_participle_verbs')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

irregular_plural_subject_verb_agreement_1

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/irregular_plural_subject_verb_agreement_1')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

irregular_plural_subject_verb_agreement_2

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/irregular_plural_subject_verb_agreement_2')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

left_branch_island_echo_question

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/left_branch_island_echo_question')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

left_branch_island_simple_question

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:blimp/left_branch_island_simple_question')
  • Description :
BLiMP is a challenge set for evaluating what language models (LMs) know about
major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each
containing 1000 minimal pairs isolating specific contrasts in syntax,
morphology, or semantics. The data is automatically generated according to
expert-crafted grammars.
  • License : No known license
  • Version : 0.1.0
  • Splits :
Split Examples
'train' 1000
  • Features :
{
    "sentence_good": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence_bad": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "field": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "linguistics_term": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "UID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simple_LM_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "one_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "two_prefix_method": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "lexically_identical": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pair_id": {
        "dtype": "int32",
        "id": null,
        "_type":