ted_talks_iwslt

References:

eu_ca_2014

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/eu_ca_2014')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 44
  • Features:
{
    "translation": {
        "languages": [
            "eu",
            "ca"
        ],
        "id": null,
        "_type": "Translation"
    }
}
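
Every configuration on this page follows the same naming pattern: source language code, target language code, and year, joined by underscores. The sketch below builds the TFDS name from those three parts. The helper names `tfds_name` and `load_pair` are hypothetical and not part of TFDS; `load_pair` assumes `tensorflow_datasets` is installed and that the dataset can be downloaded.

```python
def tfds_name(src: str, tgt: str, year: int) -> str:
    """Build the TFDS name for one ted_talks_iwslt configuration."""
    return f"huggingface:ted_talks_iwslt/{src}_{tgt}_{year}"


def load_pair(src: str, tgt: str, year: int):
    """Load the 'train' split of one configuration (hypothetical helper).

    Requires tensorflow_datasets and network access, so the import is
    done lazily and tfds_name stays usable without TFDS installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(src, tgt, year), split="train")


print(tfds_name("eu", "ca", 2014))  # huggingface:ted_talks_iwslt/eu_ca_2014
```

Calling `load_pair("eu", "ca", 2014)` would then be equivalent to the `tfds.load` command shown above, restricted to the only split listed for this configuration.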

eu_ca_2015

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/eu_ca_2015')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 52
  • Features:
{
    "translation": {
        "languages": [
            "eu",
            "ca"
        ],
        "id": null,
        "_type": "Translation"
    }
}

eu_ca_2016

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/eu_ca_2016')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 54
  • Features:
{
    "translation": {
        "languages": [
            "eu",
            "ca"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl_en_2014

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_en_2014')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 2966
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
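
The feature schema above describes a single `translation` field covering two languages. A minimal sketch of pulling a source/target pair out of one example follows. It assumes that each example is a mapping whose `translation` entry is keyed by language code, and that values may arrive as `str`, `bytes`, or tensor-like objects exposing `.numpy()`; neither assumption is confirmed by this page. `unpack_pair` is a hypothetical helper, and the example strings are placeholders, not dataset content.

```python
from typing import Any, Mapping, Tuple


def _as_text(value: Any) -> str:
    """Convert a str, bytes, or tensor-like value to a Python string."""
    if hasattr(value, "numpy"):  # e.g. an eager tensor; assumed, not confirmed
        value = value.numpy()
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def unpack_pair(example: Mapping[str, Any], src: str, tgt: str) -> Tuple[str, str]:
    """Return (source_text, target_text) from one example."""
    translation = example["translation"]
    return _as_text(translation[src]), _as_text(translation[tgt])


# Placeholder example shaped like the assumed runtime structure.
example = {"translation": {"nl": b"source text", "en": "target text"}}
print(unpack_pair(example, "nl", "en"))  # ('source text', 'target text')
```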

nl_en_2015

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_en_2015')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 3550
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl_en_2016

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_en_2016')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 3852
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl_hi_2014

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_hi_2014')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 367
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl_hi_2015

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_hi_2015')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 421
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl_hi_2016

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/nl_hi_2016')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 496
  • Features:
{
    "translation": {
        "languages": [
            "nl",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de_ja_2014

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/de_ja_2014')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 2536
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "ja"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de_ja_2015

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/de_ja_2015')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 3247
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "ja"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de_ja_2016

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/de_ja_2016')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 3590
  • Features:
{
    "translation": {
        "languages": [
            "de",
            "ja"
        ],
        "id": null,
        "_type": "Translation"
    }
}

fr-ca_hi_2014

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/fr-ca_hi_2014')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 127
  • Features:
{
    "translation": {
        "languages": [
            "fr-ca",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}
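
Language codes such as `fr-ca` contain a hyphen, while the parts of a configuration name are separated by underscores, so a name can be split back into its components without ambiguity. The sketch below shows one way to do that; `Config` and `parse_config` are hypothetical names introduced for illustration, and the code assumes that no language code contains an underscore, which holds for every configuration listed on this page.

```python
from typing import NamedTuple


class Config(NamedTuple):
    source: str
    target: str
    year: int


def parse_config(name: str) -> Config:
    """Split a name like 'fr-ca_hi_2014' into source, target, and year."""
    # Split from the right so the year and target are taken first.
    source, target, year = name.rsplit("_", 2)
    return Config(source, target, int(year))


print(parse_config("fr-ca_hi_2014"))
# Config(source='fr-ca', target='hi', year=2014)
```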

fr-ca_hi_2015

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/fr-ca_hi_2015')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 141
  • Features:
{
    "translation": {
        "languages": [
            "fr-ca",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

fr-ca_hi_2016

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:ted_talks_iwslt/fr-ca_hi_2016')
  • Description:
The core of WIT3 is the TED Talks corpus, that basically redistributes the original content published by the TED Conference website (http://www.ted.com). Since 2007,
the TED Conference, based in California, has been posting all video recordings of its talks together with subtitles in English
and their translations in more than 80 languages. Aside from its cultural and social relevance, this content, which is published under the Creative Commons BY-NC-ND license, also represents a precious
language resource for the machine translation research community, thanks to its size, variety of topics, and covered languages.
This effort repurposes the original content in a way which is more convenient for machine translation researchers.
  • License: CC-BY-NC-4.0
  • Version: 1.1.0
  • Splits:
Split Examples
'train' 156
  • Features:
{
    "translation": {
        "languages": [
            "fr-ca",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}
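
The 'train' example counts listed for each configuration can be gathered into one table for quick comparison. The sketch below only restates the numbers from this page; the names `TRAIN_EXAMPLES` and `by_size` are hypothetical. Counts are deliberately not summed across years, because this page does not say whether a later year's release overlaps an earlier one.

```python
# 'train' example counts copied from the split tables on this page.
TRAIN_EXAMPLES = {
    "eu_ca_2014": 44, "eu_ca_2015": 52, "eu_ca_2016": 54,
    "nl_en_2014": 2966, "nl_en_2015": 3550, "nl_en_2016": 3852,
    "nl_hi_2014": 367, "nl_hi_2015": 421, "nl_hi_2016": 496,
    "de_ja_2014": 2536, "de_ja_2015": 3247, "de_ja_2016": 3590,
    "fr-ca_hi_2014": 127, "fr-ca_hi_2015": 141, "fr-ca_hi_2016": 156,
}


def by_size(sizes: dict) -> list:
    """Return (config, count) pairs ordered from largest to smallest."""
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


ranked = by_size(TRAIN_EXAMPLES)
print(ranked[0])   # ('nl_en_2016', 3852)
print(ranked[-1])  # ('eu_ca_2014', 44)
```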