
moroco

References:

moroco

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds

ds = tfds.load('huggingface:moroco/moroco')

Description:

The MOROCO (Moldavian and Romanian Dialectal Corpus) dataset contains 33564 samples of text collected from the news domain.
Each sample belongs to one of the following six topics:
    - culture
    - finance
    - politics
    - science
    - sports
    - tech

License: CC BY-SA 4.0
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5924
`'train'`	21719
`'validation'`	5921
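
As a quick sanity check on the split table, the sketch below sums the three split sizes and prints each split's share of the corpus. The counts are copied from the table; the names `SPLIT_SIZES`, `total`, and `fractions` are illustrative and not part of TFDS.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 21719, "validation": 5921, "test": 5924}

# The total should match the 33564 samples quoted in the description.
total = sum(SPLIT_SIZES.values())

# Share of the corpus held by each split.
fractions = {name: count / total for name, count in SPLIT_SIZES.items()}

print(f"total examples: {total}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The three splits add up to exactly the sample count given in the description, with roughly two thirds of the data in `'train'` and the remainder divided almost evenly between `'validation'` and `'test'`.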

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "num_classes": 6,
        "names": [
            "culture",
            "finance",
            "politics",
            "science",
            "sports",
            "tech"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    },
    "sample": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
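
The `category` feature is a `ClassLabel`, so a loaded example is expected to carry an integer rather than the topic string. Below is a minimal sketch of converting between the two, assuming the integer is the zero-based index into the `names` list shown above. The helpers `category_name` and `category_id` are hypothetical and not part of TFDS or Hugging Face `datasets`.

```python
# Topic names in the order listed under "names" in the feature spec.
CATEGORY_NAMES = ["culture", "finance", "politics", "science", "sports", "tech"]


def category_name(label: int) -> str:
    """Map an integer label to its topic name (assumes zero-based indexing)."""
    if not 0 <= label < len(CATEGORY_NAMES):
        raise ValueError(f"label must be in [0, {len(CATEGORY_NAMES) - 1}], got {label}")
    return CATEGORY_NAMES[label]


def category_id(name: str) -> int:
    """Map a topic name back to its integer label."""
    try:
        return CATEGORY_NAMES.index(name)
    except ValueError:
        raise ValueError(f"unknown category: {name!r}") from None


# Hypothetical example dict with the three fields from the feature spec;
# the values are placeholders, not real corpus content.
example = {"id": "example-0", "category": 2, "sample": "..."}
print(category_name(example["category"]))
```

Validating the range explicitly avoids Python's negative indexing silently mapping an invalid label such as `-1` to `"tech"`.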


Last updated 2022-06-28 UTC.
