europarl_bilingual

bg-cs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-cs')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 402657
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "cs"
        ],
        "id": null,
        "_type": "Translation"
    }
}
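
The config names on this page appear to list each language pair in alphabetical order (bg-cs, not cs-bg), and each example carries a single "translation" feature keyed by language code. The sketch below shows one way to build the load name for a pair and to pull the two sentences out of an example. The helper names config_name and unpack_pair are hypothetical, the alphabetical ordering is an assumption drawn from the names listed here, and the shape of the example dict (language code mapped to a str or UTF-8 bytes value) is assumed rather than confirmed by this page.

```python
def config_name(lang_a, lang_b):
    """Build the TFDS name for a language pair.

    Assumption: configs are named with the two language codes in
    alphabetical order, as in the configs listed on this page.
    """
    first, second = sorted([lang_a, lang_b])
    return f"huggingface:europarl_bilingual/{first}-{second}"


def unpack_pair(example, src, tgt):
    """Return (source_text, target_text) from one example.

    Assumption: example["translation"] maps each language code to a
    str or to UTF-8 encoded bytes.
    """
    translation = example["translation"]

    def to_text(value):
        # Decode bytes; pass plain strings through unchanged.
        return value.decode("utf-8") if isinstance(value, bytes) else value

    return to_text(translation[src]), to_text(translation[tgt])


# Usage with a hand-written example (not real corpus data):
name = config_name("cs", "bg")
sample = {"translation": {"bg": "Здравей".encode("utf-8"), "cs": "Ahoj"}}
source, target = unpack_pair(sample, "bg", "cs")
```

The resulting name can be passed to tfds.load as shown in the load command for each config.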

bg-da

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-da')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 393449
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "da"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-de')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 393298
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "de"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-el')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 377341
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "el"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-en')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 408290
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 388226
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 400712
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 396624
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 393644
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 382773
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 377822
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 392554
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 398355
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 388273
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 395269
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 388972
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 389381
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-sk')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 393815
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-sl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 380231
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/bg-sv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 398236
  • Features:
{
    "translation": {
        "languages": [
            "bg",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-da

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-da')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 618055
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "da"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-de')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 568589
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "de"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-el')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 599489
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "el"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-en')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 647095
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 619774
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 636512
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 619320
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 628200
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 616160
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 607017
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 624292
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 627873
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 618414
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 621387
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 609729
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 392085
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-sk')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 636128
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-sl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 611624
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/cs-sv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 631544
  • Features:
{
    "translation": {
        "languages": [
            "cs",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-de')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1928414
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "de"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-el')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1280579
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "el"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-en')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1991647
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1943931
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 635018
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1917260
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1992590
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 617519
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1876703
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 614923
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 627809
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1987498
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 642544
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1930454
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 388156
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-sk')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 621907
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-sl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 595944
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/da-sv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1871171
  • Features:
{
    "translation": {
        "languages": [
            "da",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-el')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1223026
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "el"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-en')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1961119
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
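
The Features schema lists a single `translation` field covering two languages. A minimal sketch of pulling the source and target sentences out of one example follows. It assumes each example is a mapping from language code to sentence (the usual shape of a `Translation` feature) and that values may arrive as UTF-8 bytes once converted to NumPy. The helper `split_translation` and the sample sentences are hypothetical and are not taken from the corpus.

```python
from typing import Mapping, Tuple, Union

Text = Union[str, bytes]


def _to_str(value: Text) -> str:
    """Decode UTF-8 bytes to str; pass str through unchanged."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def split_translation(example: Mapping, src: str, tgt: str) -> Tuple[str, str]:
    """Return (source_sentence, target_sentence) from one example.

    Assumption: example["translation"] maps language codes to sentences.
    """
    pair = example["translation"]
    return _to_str(pair[src]), _to_str(pair[tgt])


# Hand-written stand-in for one de-en example (not real corpus data).
sample = {"translation": {"de": b"Guten Morgen.", "en": b"Good morning."}}
print(split_translation(sample, "de", "en"))
```

Swapping the `src` and `tgt` arguments gives the reverse translation direction from the same config.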

de-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1887879
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 578248
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1871185
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1942666
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 563571
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1832989
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 565892
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 573226
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1934111
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 579166
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1884176
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 385663
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-sk')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 569381
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-sl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 546212
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/de-sv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1842026
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-en')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1292180
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
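
Each config in this listing ships only a `'train'` split, so any validation data has to be carved out of it. The sketch below builds percentage-based split strings in the TFDS slicing syntax. The helper names `holdout_splits` and `load_with_holdout` are hypothetical, and it is an assumption that split slicing works unchanged for configs loaded through the `huggingface:` prefix.

```python
from typing import Tuple


def holdout_splits(percent: int = 1, base: str = "train") -> Tuple[str, str]:
    """Return (training, held-out) split strings using TFDS percent slicing."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    cut = 100 - percent
    return f"{base}[:{cut}%]", f"{base}[{cut}%:]"


def load_with_holdout(config: str, percent: int = 1):
    """Load a config as [train_ds, valid_ds]; requires tensorflow_datasets."""
    import tensorflow_datasets as tfds  # lazy import keeps holdout_splits dependency-free

    train_split, valid_split = holdout_splits(percent)
    return tfds.load(config, split=[train_split, valid_split])


train_split, valid_split = holdout_splits(1)
print(train_split, valid_split)
```

For example, `load_with_holdout('huggingface:europarl_bilingual/el-en')` would request the first 99% of the train split for training and the final 1% as a held-out set.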

el-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1272383
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 599915
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1227612
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1290796
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 586250
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1231222
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 590850
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 596929
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1277297
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 591069
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1261188
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 372839
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-sk')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 600684
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-sl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 579109
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/el-sv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1273743
  • Features:
{
    "translation": {
        "languages": [
            "el",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-es')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 2009073
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}
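
The feature structure above can be read as a sketch like the following. The helper `decode_pair` is hypothetical, and the sketch assumes that each example's `translation` entry maps a language code to its sentence, arriving either as `bytes` (for instance when iterating NumPy values) or as `str`. The sample record is hand-written for illustration and is not taken from the corpus.

```python
def decode_pair(example, src="en", tgt="es"):
    """Return (source_text, target_text) from one example.

    Assumption: example["translation"] maps each language code to a
    sentence stored as bytes or str.
    """
    translation = example["translation"]

    def to_text(value):
        # Decode raw bytes as UTF-8; pass other values through str().
        return value.decode("utf-8") if isinstance(value, bytes) else str(value)

    return to_text(translation[src]), to_text(translation[tgt])


# Hand-written stand-in for one record; not real corpus data.
sample = {"translation": {"en": b"Thank you.", "es": "Gracias."}}
print(decode_pair(sample))
```

With a loaded dataset, the same helper could be applied to each element of `tfds.as_numpy(ds)`, provided the examples follow the assumed layout.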

en-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-et')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 651236
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-fi')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1969624
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-fr')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 2051014
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}
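
Each config on this page ships only a 'train' split, so a held-out set has to be carved out by the user. The sketch below builds TFDS split-slicing strings for that purpose. The helper name `holdout_splits` is hypothetical, and whether percent slicing behaves identically for `huggingface:` community datasets is an assumption worth checking against your TFDS version.

```python
def holdout_splits(holdout_percent: int = 1):
    """Return (train_spec, holdout_spec) split strings for tfds.load.

    The last `holdout_percent` percent of 'train' is reserved as the
    held-out portion; the rest stays as training data.
    """
    if not 0 < holdout_percent < 100:
        raise ValueError("holdout_percent must be between 1 and 99")
    cut = 100 - holdout_percent
    return f"train[:{cut}%]", f"train[{cut}%:]"


train_spec, holdout_spec = holdout_splits(1)
print(train_spec, holdout_spec)
```

Passing both strings as a list, for example `tfds.load('huggingface:europarl_bilingual/en-fr', split=[train_spec, holdout_spec])`, would return two datasets, with the second holding about 1% of the 2051014 en-fr examples.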

en-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-hu')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 625178
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-it')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 1946253
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-lt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 634284
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-lv')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 639318
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-nl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 2027447
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-pl')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 631160
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-pt')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 2002943
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-ro')
  • Description:
A parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
  • License: The data set comes with the same license as the original sources. Please check the source information given at http://opus.nlpl.eu/Europarl-v8.php

  • Version: 8.0.0

  • Splits:

Split Examples
'train' 400356
  • Features:
{
    "translation": {
        "languages": [
            "en",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:europarl_bilingual/en-sk')
  • Description