مراجع:
كاليفورنيا دي
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-de')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4445 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"de"
],
"id": null,
"_type": "Translation"
}
}
كاليفورنيا أون
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4605 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"en"
],
"id": null,
"_type": "Translation"
}
}
دي أون
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 51467 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"en"
],
"id": null,
"_type": "Translation"
}
}
إل إن
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1285 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"en"
],
"id": null,
"_type": "Translation"
}
}
دي إيو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-eo')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1363 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"eo"
],
"id": null,
"_type": "Translation"
}
}
en-eo
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-eo')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1562 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"eo"
],
"id": null,
"_type": "Translation"
}
}
دي وفاق
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 27526 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"es"
],
"id": null,
"_type": "Translation"
}
}
إل-إس
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1096 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"es"
],
"id": null,
"_type": "Translation"
}
}
ar-es
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 93470 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"es"
],
"id": null,
"_type": "Translation"
}
}
eo-es
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1677 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"es"
],
"id": null,
"_type": "Translation"
}
}
en-fi
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-fi')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3645 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"fi"
],
"id": null,
"_type": "Translation"
}
}
es-fi
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-fi')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3344 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"fi"
],
"id": null,
"_type": "Translation"
}
}
دي الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 34916 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"fr"
],
"id": null,
"_type": "Translation"
}
}
الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1237 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"fr"
],
"id": null,
"_type": "Translation"
}
}
ar-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 127085 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"fr"
],
"id": null,
"_type": "Translation"
}
}
EO-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1588 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"fr"
],
"id": null,
"_type": "Translation"
}
}
es-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 56319 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"fr"
],
"id": null,
"_type": "Translation"
}
}
فاي الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3537 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"fr"
],
"id": null,
"_type": "Translation"
}
}
كا هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4463 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"hu"
],
"id": null,
"_type": "Translation"
}
}
دي هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 51780 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"hu"
],
"id": null,
"_type": "Translation"
}
}
إل هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1090 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"hu"
],
"id": null,
"_type": "Translation"
}
}
أون هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 137151 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"hu"
],
"id": null,
"_type": "Translation"
}
}
يو هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1636 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"hu"
],
"id": null,
"_type": "Translation"
}
}
الأب هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 89337 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"hu"
],
"id": null,
"_type": "Translation"
}
}
تخلص منه
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 27381 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"it"
],
"id": null,
"_type": "Translation"
}
}
أون-إت
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 32332 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"it"
],
"id": null,
"_type": "Translation"
}
}
eo-it
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1453 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"it"
],
"id": null,
"_type": "Translation"
}
}
es-it
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 28868 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"it"
],
"id": null,
"_type": "Translation"
}
}
الاب ذلك
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 14692 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"it"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 30949 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"it"
],
"id": null,
"_type": "Translation"
}
}
كاليفورنيا nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4329 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"nl"
],
"id": null,
"_type": "Translation"
}
}
دي nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 15622 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"nl"
],
"id": null,
"_type": "Translation"
}
}
ar-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 38652 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"nl"
],
"id": null,
"_type": "Translation"
}
}
es-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 32247 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"nl"
],
"id": null,
"_type": "Translation"
}
}
الاب-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 40017 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"nl"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 43428 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"nl"
],
"id": null,
"_type": "Translation"
}
}
it-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2359 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"nl"
],
"id": null,
"_type": "Translation"
}
}
أون-لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3499 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"no"
],
"id": null,
"_type": "Translation"
}
}
وفاق لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3585 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"no"
],
"id": null,
"_type": "Translation"
}
}
فاي لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3414 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"no"
],
"id": null,
"_type": "Translation"
}
}
الاب-لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3449 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"no"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3410 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"no"
],
"id": null,
"_type": "Translation"
}
}
ar-pl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2831 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"pl"
],
"id": null,
"_type": "Translation"
}
}
فاي-رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2814 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"pl"
],
"id": null,
"_type": "Translation"
}
}
الاب رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2825 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"pl"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2859 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"pl"
],
"id": null,
"_type": "Translation"
}
}
دي بي تي
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1102 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"pt"
],
"id": null,
"_type": "Translation"
}
}
أون-حزب العمال
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1404 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"pt"
],
"id": null,
"_type": "Translation"
}
}
eo-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1259 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"pt"
],
"id": null,
"_type": "Translation"
}
}
es-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1327 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"pt"
],
"id": null,
"_type": "Translation"
}
}
الاب-حزب العمال
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1263 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"pt"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1184 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"pt"
],
"id": null,
"_type": "Translation"
}
}
it-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1163 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"pt"
],
"id": null,
"_type": "Translation"
}
}
دي رو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17373 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"ru"
],
"id": null,
"_type": "Translation"
}
}
ar-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17496 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"ru"
],
"id": null,
"_type": "Translation"
}
}
es-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 16793 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"ru"
],
"id": null,
"_type": "Translation"
}
}
fr-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 8197 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"ru"
],
"id": null,
"_type": "Translation"
}
}
هو رو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 26127 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"ru"
],
"id": null,
"_type": "Translation"
}
}
it-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17906 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"ru"
],
"id": null,
"_type": "Translation"
}
}
ar-sv
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3095 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"sv"
],
"id": null,
"_type": "Translation"
}
}
الاب-سانت
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3002 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"sv"
],
"id": null,
"_type": "Translation"
}
}
it-sv
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2998 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"sv"
],
"id": null,
"_type": "Translation"
}
}
,مراجع:
كاليفورنيا دي
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-de')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4445 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"de"
],
"id": null,
"_type": "Translation"
}
}
كاليفورنيا أون
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4605 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"en"
],
"id": null,
"_type": "Translation"
}
}
دي أون
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 51467 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"en"
],
"id": null,
"_type": "Translation"
}
}
إل إن
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-en')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1285 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"en"
],
"id": null,
"_type": "Translation"
}
}
دي إيو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-eo')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1363 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"eo"
],
"id": null,
"_type": "Translation"
}
}
en-eo
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-eo')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1562 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"eo"
],
"id": null,
"_type": "Translation"
}
}
دي وفاق
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 27526 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"es"
],
"id": null,
"_type": "Translation"
}
}
إل-إس
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1096 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"es"
],
"id": null,
"_type": "Translation"
}
}
ar-es
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 93470 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"es"
],
"id": null,
"_type": "Translation"
}
}
eo-es
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-es')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1677 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"es"
],
"id": null,
"_type": "Translation"
}
}
en-fi
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-fi')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3645 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"fi"
],
"id": null,
"_type": "Translation"
}
}
es-fi
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-fi')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3344 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"fi"
],
"id": null,
"_type": "Translation"
}
}
دي الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 34916 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"fr"
],
"id": null,
"_type": "Translation"
}
}
الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1237 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"fr"
],
"id": null,
"_type": "Translation"
}
}
ar-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 127085 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"fr"
],
"id": null,
"_type": "Translation"
}
}
EO-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1588 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"fr"
],
"id": null,
"_type": "Translation"
}
}
es-fr
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 56319 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"fr"
],
"id": null,
"_type": "Translation"
}
}
فاي الاب
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-fr')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3537 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"fr"
],
"id": null,
"_type": "Translation"
}
}
كا هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4463 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"hu"
],
"id": null,
"_type": "Translation"
}
}
دي هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 51780 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"hu"
],
"id": null,
"_type": "Translation"
}
}
إل هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/el-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1090 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"el",
"hu"
],
"id": null,
"_type": "Translation"
}
}
أون هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 137151 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"hu"
],
"id": null,
"_type": "Translation"
}
}
يو هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1636 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"hu"
],
"id": null,
"_type": "Translation"
}
}
الأب هو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-hu')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 89337 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"hu"
],
"id": null,
"_type": "Translation"
}
}
تخلص منه
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 27381 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"it"
],
"id": null,
"_type": "Translation"
}
}
أون-إت
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 32332 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"it"
],
"id": null,
"_type": "Translation"
}
}
eo-it
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1453 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"it"
],
"id": null,
"_type": "Translation"
}
}
es-it
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 28868 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"it"
],
"id": null,
"_type": "Translation"
}
}
الاب ذلك
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 14692 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"it"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-it')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 30949 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"it"
],
"id": null,
"_type": "Translation"
}
}
كاليفورنيا nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/ca-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 4329 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"ca",
"nl"
],
"id": null,
"_type": "Translation"
}
}
دي nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 15622 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"nl"
],
"id": null,
"_type": "Translation"
}
}
ar-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 38652 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"nl"
],
"id": null,
"_type": "Translation"
}
}
es-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 32247 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"nl"
],
"id": null,
"_type": "Translation"
}
}
الاب-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 40017 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"nl"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 43428 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"nl"
],
"id": null,
"_type": "Translation"
}
}
it-nl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-nl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2359 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"nl"
],
"id": null,
"_type": "Translation"
}
}
أون-لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3499 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"no"
],
"id": null,
"_type": "Translation"
}
}
وفاق لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3585 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"no"
],
"id": null,
"_type": "Translation"
}
}
فاي لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3414 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"no"
],
"id": null,
"_type": "Translation"
}
}
الاب-لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3449 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"no"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو لا
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-no')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3410 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"no"
],
"id": null,
"_type": "Translation"
}
}
ar-pl
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2831 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"pl"
],
"id": null,
"_type": "Translation"
}
}
فاي-رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fi-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2814 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fi",
"pl"
],
"id": null,
"_type": "Translation"
}
}
الاب رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2825 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"pl"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو رر
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-pl')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2859 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"pl"
],
"id": null,
"_type": "Translation"
}
}
دي بي تي
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1102 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"pt"
],
"id": null,
"_type": "Translation"
}
}
أون-حزب العمال
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1404 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"pt"
],
"id": null,
"_type": "Translation"
}
}
eo-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/eo-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1259 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"eo",
"pt"
],
"id": null,
"_type": "Translation"
}
}
es-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1327 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"pt"
],
"id": null,
"_type": "Translation"
}
}
الاب-حزب العمال
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1263 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"pt"
],
"id": null,
"_type": "Translation"
}
}
هو جين تاو-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1184 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"pt"
],
"id": null,
"_type": "Translation"
}
}
it-pt
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-pt')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 1163 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"pt"
],
"id": null,
"_type": "Translation"
}
}
دي رو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/de-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17373 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"de",
"ru"
],
"id": null,
"_type": "Translation"
}
}
ar-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17496 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"ru"
],
"id": null,
"_type": "Translation"
}
}
es-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/es-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 16793 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"es",
"ru"
],
"id": null,
"_type": "Translation"
}
}
fr-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 8197 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"ru"
],
"id": null,
"_type": "Translation"
}
}
هو رو
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/hu-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 26127 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"hu",
"ru"
],
"id": null,
"_type": "Translation"
}
}
it-ru
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-ru')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 17906 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"ru"
],
"id": null,
"_type": "Translation"
}
}
ar-sv
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/en-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3095 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"en",
"sv"
],
"id": null,
"_type": "Translation"
}
}
الاب-سانت
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/fr-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 3002 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"fr",
"sv"
],
"id": null,
"_type": "Translation"
}
}
it-sv
استخدم الأمر التالي لتحميل مجموعة البيانات هذه في TFDS:
ds = tfds.load('huggingface:opus_books/it-sv')
- وصف :
This is a collection of copyright free books aligned by Andras Farkas, which are available from http://www.farkastranslations.com/bilingual_books.php
Note that the texts are rather dated due to copyright issues and that some of them are manually reviewed (check the meta-data at the top of the corpus files in XML). The source is multilingually aligned, which is available from http://www.farkastranslations.com/bilingual_books.php. In OPUS, the alignment is formally bilingual but the multilingual alignment can be recovered from the XCES sentence alignment files. Note also that the alignment units from the original source may include multi-sentence paragraphs, which are split and sentence-aligned in OPUS.
All texts are freely available for personal, educational and research use. Commercial use (e.g. reselling as parallel books) and mass redistribution without explicit permission are not granted. Please acknowledge the source when using the data!
16 languages, 64 bitexts
total number of files: 158
total number of tokens: 19.50M
total number of sentence fragments: 0.91M
- الترخيص : لا يوجد ترخيص معروف
- الإصدار : 1.0.0
- الإنشقاقات :
ينقسم | أمثلة |
---|---|
'train' | 2998 |
- سمات :
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"languages": [
"it",
"sv"
],
"id": null,
"_type": "Translation"
}
}