TFDS now supports the Croissant 🥐 format! Read the documentation to know more.

adversarial_qa

参考：

adversarialQA

使用以下命令在 TFDS 中加载此数据集：

ds = tfds.load('huggingface:adversarial_qa/adversarialQA')

说明：

AdversarialQA is a Reading Comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles using an adversarial model-in-the-loop.
We use three different models; BiDAF (Seo et al., 2016), BERT-Large (Devlin et al., 2018), and RoBERTa-Large (Liu et al., 2019) in the annotation loop and construct three datasets; D(BiDAF), D(BERT), and D(RoBERTa), each with 10,000 training examples, 1,000 validation, and 1,000 test examples.
The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least the ones used as adversaries in the annotation loop) find challenging.

许可证：无已知许可证
版本：1.0.0
拆分：

拆分	样本
`'test'`	3000
`'train'`	30000
`'validation'`	3000

特征：

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "metadata": {
        "split": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "model_in_the_loop": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        }
    }
}

dbidaf

使用以下命令在 TFDS 中加载此数据集：

ds = tfds.load('huggingface:adversarial_qa/dbidaf')

说明：

AdversarialQA is a Reading Comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles using an adversarial model-in-the-loop.
We use three different models; BiDAF (Seo et al., 2016), BERT-Large (Devlin et al., 2018), and RoBERTa-Large (Liu et al., 2019) in the annotation loop and construct three datasets; D(BiDAF), D(BERT), and D(RoBERTa), each with 10,000 training examples, 1,000 validation, and 1,000 test examples.
The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least the ones used as adversaries in the annotation loop) find challenging.

许可证：无已知许可证
版本：1.0.0
拆分：

拆分	样本
`'test'`	1000
`'train'`	10000
`'validation'`	1000

特征：

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "metadata": {
        "split": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "model_in_the_loop": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        }
    }
}

dbert

使用以下命令在 TFDS 中加载此数据集：

ds = tfds.load('huggingface:adversarial_qa/dbert')

说明：

AdversarialQA is a Reading Comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles using an adversarial model-in-the-loop.
We use three different models; BiDAF (Seo et al., 2016), BERT-Large (Devlin et al., 2018), and RoBERTa-Large (Liu et al., 2019) in the annotation loop and construct three datasets; D(BiDAF), D(BERT), and D(RoBERTa), each with 10,000 training examples, 1,000 validation, and 1,000 test examples.
The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least the ones used as adversaries in the annotation loop) find challenging.

许可证：无已知许可证
版本：1.0.0
拆分：

拆分	样本
`'test'`	1000
`'train'`	10000
`'validation'`	1000

特征：

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "metadata": {
        "split": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "model_in_the_loop": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        }
    }
}

droberta

使用以下命令在 TFDS 中加载此数据集：

ds = tfds.load('huggingface:adversarial_qa/droberta')

说明：

AdversarialQA is a Reading Comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles using an adversarial model-in-the-loop.
We use three different models; BiDAF (Seo et al., 2016), BERT-Large (Devlin et al., 2018), and RoBERTa-Large (Liu et al., 2019) in the annotation loop and construct three datasets; D(BiDAF), D(BERT), and D(RoBERTa), each with 10,000 training examples, 1,000 validation, and 1,000 test examples.
The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least the ones used as adversaries in the annotation loop) find challenging.

许可证：无已知许可证
版本：1.0.0
拆分：

拆分	样本
`'test'`	1000
`'train'`	10000
`'validation'`	1000

特征：

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "metadata": {
        "split": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "model_in_the_loop": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        }
    }
}

adversarial_qa 使用集合让一切井井有条 根据您的偏好保存内容并对其进行分类。

adversarialQA

dbidaf

dbert

droberta

adversarial_qa