newsroom

References:

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsroom')

Description:

NEWSROOM is a large dataset for training and evaluating summarization systems.
It contains 1.3 million articles and summaries written by authors and
editors in the newsrooms of 38 major publications.

Dataset features include:
  - text: Input news text.
  - summary: Summary for the news.
And additional features:
  - title: news title.
  - url: url of the news.
  - date: date of the article.
  - density: extractive density.
  - coverage: extractive coverage.
  - compression: compression ratio.
  - density_bin: low, medium, high.
  - coverage_bin: extractive, abstractive.
  - compression_bin: low, medium, high.
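
This page does not define the three numeric features. As a rough sketch, assuming the definitions given in the NEWSROOM paper (coverage is the fraction of summary tokens that fall inside text fragments copied from the article, density is the average squared fragment length per summary token, and compression is the article-to-summary length ratio), they could be computed as below. The function names are hypothetical, and whitespace tokenization is a simplification rather than the tokenizer used to build the dataset.

```python
from typing import List, Tuple


def extractive_fragments(article: List[str], summary: List[str]) -> List[List[str]]:
    """Greedily collect the longest article token runs that reappear in the summary."""
    fragments = []
    i = 0
    while i < len(summary):
        best = 0
        # Find the longest run starting at summary[i] that also occurs in the article.
        for j in range(len(article)):
            k = 0
            while (i + k < len(summary) and j + k < len(article)
                   and summary[i + k] == article[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary[i:i + best])
            i += best
        else:
            i += 1  # Token not found in the article; skip it.
    return fragments


def extractive_stats(article_text: str, summary_text: str) -> Tuple[float, float, float]:
    """Return (coverage, density, compression) under the assumed definitions."""
    article = article_text.lower().split()  # Assumption: whitespace tokens.
    summary = summary_text.lower().split()
    if not summary:
        raise ValueError("summary must contain at least one token")
    fragments = extractive_fragments(article, summary)
    n = len(summary)
    coverage = sum(len(f) for f in fragments) / n
    density = sum(len(f) ** 2 for f in fragments) / n
    compression = len(article) / n
    return coverage, density, compression
```

Under these assumptions, a summary copied verbatim from the article has coverage 1.0 and a density equal to its length, while a fully reworded summary scores near zero on both.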

This dataset can be downloaded upon request. Unzip all the contents
"train.jsonl, dev.jsonl, test.jsonl" to the tfds folder.
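
The `.jsonl` extension suggests JSON Lines files, one JSON object per line. A minimal sketch for inspecting such a file before handing it to TFDS might look like the following. The helper name `read_jsonl` is hypothetical, and the two sample records are made-up placeholders that only borrow field names from the feature list.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def read_jsonl(handle: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# Placeholder data standing in for an opened train.jsonl file.
sample = io.StringIO(
    '{"title": "Placeholder A", "text": "Body A", "summary": "Sum A"}\n'
    "\n"
    '{"title": "Placeholder B", "text": "Body B", "summary": "Sum B"}\n'
)
records = list(read_jsonl(sample))
titles = [record["title"] for record in records]
```

For a real file, `open(path, encoding="utf-8")` would replace the `io.StringIO` object. Reading lazily through a generator avoids loading the whole training file into memory.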

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	108862
`'train'`	995041
`'validation'`	108837
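
As a quick arithmetic check on the table above, the three splits can be summed and expressed as shares of the whole. The variable names are illustrative only.

```python
# Example counts copied from the splits table.
splits = {"test": 108862, "train": 995041, "validation": 108837}

total = sum(splits.values())  # 1212740 examples across all splits
shares = {name: count / total for name, count in splits.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The total comes to 1,212,740 examples, with roughly 82% in the training split and about 9% each in the test and validation splits.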

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "summary": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "date": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "density_bin": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "coverage_bin": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "compression_bin": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "density": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    },
    "coverage": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    },
    "compression": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    }
}
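
The schema above lists eight string fields and three float32 fields. One way to sanity-check a decoded record against it is sketched below. `check_record` is a hypothetical helper, not part of TFDS or the Hugging Face `datasets` library, and it only checks Python types, not float precision.

```python
from typing import Any, Dict, List

# Field names taken from the feature schema above.
STRING_FIELDS = ("text", "summary", "title", "url", "date",
                 "density_bin", "coverage_bin", "compression_bin")
FLOAT_FIELDS = ("density", "coverage", "compression")


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            problems.append(f"{name}: expected string")
    for name in FLOAT_FIELDS:
        value = record.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected float")
    return problems
```

A record with a missing `url` and a non-numeric `density` would yield two messages, one per offending field, in schema order.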