msr_text_compression

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds

ds = tfds.load('huggingface:msr_text_compression')

Description:

This dataset contains sentences and short paragraphs with corresponding shorter (compressed) versions. There are up to five compressions for each input text, together with quality judgements of their meaning preservation and grammaticality. The dataset is derived using source texts from the Open American National Corpus (www.anc.org) and crowd-sourcing.

License: Microsoft Research Data License Agreement
Version: 1.1.0
Splits:

Split	Examples
`'test'`	785
`'train'`	4936
`'validation'`	447
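
As a quick sanity check on the table above, the sketch below computes each split's share of the total. The counts are copied from the table; the names `SPLIT_SIZES` and `split_shares` are illustrative and not part of TFDS. The split works out to roughly 80% train, 7% validation and 13% test.

```python
# Example counts copied from the splits table above.
SPLIT_SIZES = {"train": 4936, "validation": 447, "test": 785}


def split_shares(sizes):
    """Return each split's fraction of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


shares = split_shares(SPLIT_SIZES)
for name, share in shares.items():
    # Print the share as a percentage with one decimal place.
    print(f"{name}: {share:.1%}")
```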

Features:

{
    "source_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "source_text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "targets": {
        "feature": {
            "compressed_text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "judge_id": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "num_ratings": {
                "dtype": "int64",
                "id": null,
                "_type": "Value"
            },
            "ratings": {
                "feature": {
                    "dtype": "int64",
                    "id": null,
                    "_type": "Value"
                },
                "length": -1,
                "id": null,
                "_type": "Sequence"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
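
To make the nested `targets` feature more concrete, here is a minimal sketch that walks one record and summarizes each compression. It assumes the sequence of target dicts is exposed as a dict of parallel lists (one list per field) and that values have already been converted to plain Python types; check the actual layout after loading. The record below is invented for illustration and is not taken from the dataset, and `summarize_targets` is a hypothetical helper. Averaging all entries in `ratings` is a simplification, since the schema does not say how meaning and grammaticality judgements are encoded.

```python
from statistics import mean

# Hypothetical record shaped like the feature schema above.
# All values are made up for illustration only.
example = {
    "source_id": "doc-0",
    "domain": "newswire",
    "source_text": "The committee met on Tuesday to discuss the proposed budget in detail.",
    "targets": {
        "compressed_text": [
            "The committee met to discuss the budget.",
            "The committee discussed the proposed budget.",
        ],
        "judge_id": ["j1", "j2"],
        "num_ratings": [2, 2],
        "ratings": [[3, 2], [3, 3]],
    },
}


def summarize_targets(record):
    """Return one dict per compression with its mean rating and length ratio."""
    targets = record["targets"]
    source_len = len(record["source_text"])
    rows = []
    # The per-field lists are parallel: index i describes the i-th compression.
    for text, ratings in zip(targets["compressed_text"], targets["ratings"]):
        rows.append({
            "compressed_text": text,
            # Guard against compressions that have no ratings at all.
            "mean_rating": mean(ratings) if ratings else None,
            # Character-level compression ratio (compressed / source).
            "char_ratio": len(text) / source_len,
        })
    return rows


rows = summarize_targets(example)
rated = [row for row in rows if row["mean_rating"] is not None]
best = max(rated, key=lambda row: row["mean_rating"], default=None)
```

Here `best` holds the compression with the highest mean rating, or `None` if no compression in the record has any ratings.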