asset

References:

simplification

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:asset/simplification')
  • Description:
ASSET is a dataset for evaluating Sentence Simplification systems with multiple rewriting transformations,
as described in "ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations".
The corpus is composed of 2000 validation and 359 test original sentences that were each simplified 10 times by different annotators.
The corpus also contains human judgments of meaning preservation, fluency and simplicity for the outputs of several automatic text simplification systems.
  • License: Creative Commons Attribution-NonCommercial 4.0 International
  • Version: 1.0.0
  • Splits:
Split          Examples
'test'         359
'validation'   2000
  • Features:
{
    "original": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simplifications": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
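
To illustrate how the nested features fit together, the following minimal sketch flattens records shaped like the feature spec above into (original, simplification) pairs. The sample records and the helper name `iter_pairs` are hypothetical and only stand in for real examples; records obtained through TFDS would likely carry byte strings that need decoding first.

```python
from typing import Dict, Iterable, Iterator, List, Tuple, Union

Record = Dict[str, Union[str, List[str]]]

# Hypothetical records (not taken from ASSET) that follow the feature spec:
# "original" is a string, "simplifications" is a variable-length list of strings.
SAMPLE_RECORDS: List[Record] = [
    {
        "original": "The committee postponed the vote owing to a lack of quorum.",
        "simplifications": [
            "The committee delayed the vote.",
            "The vote was delayed because too few members were present.",
        ],
    },
    {"original": "He departed.", "simplifications": []},
]


def iter_pairs(records: Iterable[Record]) -> Iterator[Tuple[str, str]]:
    """Yield one (original, simplification) pair per reference simplification."""
    for record in records:
        original = record["original"]
        for simplification in record["simplifications"]:
            yield original, simplification


pairs = list(iter_pairs(SAMPLE_RECORDS))
for original, simplification in pairs:
    print(f"{original!r} -> {simplification!r}")
```

Flattening this way is one convenient layout for per-pair processing; metrics that compare a system output against all references at once would keep the list grouped per original instead.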

ratings

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:asset/ratings')
  • Description:
ASSET is a dataset for evaluating Sentence Simplification systems with multiple rewriting transformations,
as described in "ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations".
The corpus is composed of 2000 validation and 359 test original sentences that were each simplified 10 times by different annotators.
The corpus also contains human judgments of meaning preservation, fluency and simplicity for the outputs of several automatic text simplification systems.
  • License: Creative Commons Attribution-NonCommercial 4.0 International
  • Version: 1.0.0
  • Splits:
Split          Examples
'full'         4500
  • Features:
{
    "original": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "simplification": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "original_sentence_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "aspect": {
        "num_classes": 3,
        "names": [
            "meaning",
            "fluency",
            "simplicity"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "worker_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "rating": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
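
As a rough illustration of how the `aspect` and `rating` fields relate, the sketch below averages ratings per aspect. It assumes `aspect` arrives as an integer index into the `names` list of the ClassLabel feature above; the sample records, their rating values, and the helper name `mean_rating_by_aspect` are made up for the example and are not drawn from the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# Order copied from the "names" list of the "aspect" ClassLabel feature.
ASPECT_NAMES: List[str] = ["meaning", "fluency", "simplicity"]

# Hypothetical records: only the fields used here are filled in,
# and the rating values are invented for illustration.
SAMPLE_RATINGS = [
    {"original_sentence_id": 0, "aspect": 0, "worker_id": 1, "rating": 80},
    {"original_sentence_id": 0, "aspect": 0, "worker_id": 2, "rating": 60},
    {"original_sentence_id": 0, "aspect": 2, "worker_id": 1, "rating": 50},
]


def mean_rating_by_aspect(records: Iterable[Mapping[str, int]]) -> Dict[str, float]:
    """Return the mean rating for each aspect name that appears in records."""
    sums: Dict[str, int] = defaultdict(int)
    counts: Dict[str, int] = defaultdict(int)
    for record in records:
        # Assumption: "aspect" is an integer class index, not a string label.
        name = ASPECT_NAMES[int(record["aspect"])]
        sums[name] += int(record["rating"])
        counts[name] += 1
    return {name: sums[name] / counts[name] for name in sums}


means = mean_rating_by_aspect(SAMPLE_RATINGS)
print(means)
```

Aspects with no ratings are simply absent from the result rather than reported as zero, which avoids confusing "no data" with a low score.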