- Description:
This dataset contains machine translations of the English PAWS training data. The translations are provided by the XTREME benchmark and cover the following languages:
- French
- Spanish
- German
- Chinese
- Japanese
- Korean
For further details on PAWS, see the papers "PAWS: Paraphrase Adversaries from Word Scrambling" (https://arxiv.org/abs/1904.01130) and "PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification" (https://arxiv.org/abs/1908.11828).
For details on XTREME, see "XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization" (https://arxiv.org/abs/2003.11080).
Source code:
tfds.text.xtreme_pawsx.XtremePawsx
Versions:
1.0.0 (default): No release notes.
Auto-cached (documentation): Yes
Feature structure:
FeaturesDict({
'label': ClassLabel(shape=(), dtype=int64, num_classes=2),
'sentence1': Text(shape=(), dtype=string),
'sentence2': Text(shape=(), dtype=string),
})
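The feature structure above suggests that each example is a dictionary with a `label` and two sentence strings. The following is a minimal sketch of loading one config with `tfds.load`, assuming `tensorflow_datasets` is installed and the data can be downloaded. The helper names `config_name` and `load_train` are hypothetical and not part of the dataset's API.

```python
# Language codes listed as configs on this page.
XTREME_PAWSX_LANGS = ("de", "es", "fr", "ja", "ko", "zh")


def config_name(lang: str) -> str:
    """Build the TFDS name 'xtreme_pawsx/<lang>' (hypothetical helper)."""
    if lang not in XTREME_PAWSX_LANGS:
        raise ValueError(f"unknown language code: {lang!r}")
    return f"xtreme_pawsx/{lang}"


def load_train(lang: str = "de"):
    """Load the 'train' split of one config (hypothetical helper).

    Assumes tensorflow_datasets is installed and the data is reachable.
    """
    # Imported lazily so config_name() works without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(config_name(lang), split="train")


if __name__ == "__main__":
    ds = load_train("de")
    # Each element is a dict with the keys from the FeaturesDict above.
    for example in ds.take(1):
        print(example["sentence1"].numpy().decode("utf-8"))
        print(example["sentence2"].numpy().decode("utf-8"))
        print(int(example["label"]))
```

Only a `train` split is listed for each config on this page, so the sketch requests that split explicitly.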
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
label | ClassLabel | | int64 | |
sentence1 | Text | | string | |
sentence2 | Text | | string | |
Supervised keys (See as_supervised doc): None
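Because no supervised keys are defined, `as_supervised=True` is not expected to work for this dataset. One possible workaround, sketched below, is to map each feature dictionary to an `(inputs, label)` pair yourself. The function name `to_supervised` is hypothetical, and pairing the two sentences as a tuple is an assumption about what a downstream model wants.

```python
def to_supervised(example):
    """Turn a feature dict into ((sentence1, sentence2), label).

    Hypothetical helper: the tuple layout is an illustrative choice,
    not something the dataset prescribes.
    """
    return (example["sentence1"], example["sentence2"]), example["label"]


if __name__ == "__main__":
    # Assumes tensorflow_datasets is installed and the data is reachable.
    import tensorflow_datasets as tfds

    ds = tfds.load("xtreme_pawsx/de", split="train")
    # tf.data passes each feature dict to the mapped function as one argument.
    ds = ds.map(to_supervised)
    for (sentence1, sentence2), label in ds.take(1):
        print(sentence1.numpy().decode("utf-8"), int(label))
```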
Figure (tfds.show_examples): Not supported.
Citation:
@article{hu2020xtreme,
author = {Junjie Hu and Sebastian Ruder and Aditya Siddhant and Graham Neubig and Orhan Firat and Melvin Johnson},
title = {XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization},
journal = {CoRR},
volume = {abs/2003.11080},
year = {2020},
archivePrefix = {arXiv},
eprint = {2003.11080}
}
xtreme_pawsx/de (default config)
Config description: Translated to de
Download size: 22.34 MiB
Dataset size: 14.19 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,340 |
xtreme_pawsx/es
Config description: Translated to es
Download size: 22.27 MiB
Dataset size: 14.09 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,244 |
xtreme_pawsx/fr
Config description: Translated to fr
Download size: 22.70 MiB
Dataset size: 14.53 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,208 |
xtreme_pawsx/ja
Config description: Translated to ja
Download size: 25.12 MiB
Dataset size: 16.98 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,086 |
xtreme_pawsx/ko
Config description: Translated to ko
Download size: 22.99 MiB
Dataset size: 14.86 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,298 |
xtreme_pawsx/zh
Config description: Translated to zh
Download size: 21.45 MiB
Dataset size: 13.21 MiB
Splits:
Split | Examples |
---|---|
'train' | 49,149 |
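The per-config train split sizes listed above can be collected in one place to sanity-check a local copy. The sketch below hard-codes the counts from this page and, optionally, compares them with what `tfds.builder(...).info` reports. The names `TRAIN_EXAMPLES`, `total_train_examples`, and `matches_tfds` are hypothetical, and the comparison assumes the dataset metadata is available locally or can be fetched.

```python
# Train split sizes as listed on this page, keyed by config language code.
TRAIN_EXAMPLES = {
    "de": 49340,
    "es": 49244,
    "fr": 49208,
    "ja": 49086,
    "ko": 49298,
    "zh": 49149,
}


def total_train_examples(counts):
    """Sum the train examples across all configs."""
    return sum(counts.values())


def matches_tfds(lang):
    """Compare the listed count with TFDS metadata (hypothetical helper).

    Assumes tensorflow_datasets is installed and the split metadata for
    the config is available; otherwise the lookup may fail.
    """
    import tensorflow_datasets as tfds

    builder = tfds.builder(f"xtreme_pawsx/{lang}")
    return builder.info.splits["train"].num_examples == TRAIN_EXAMPLES[lang]


if __name__ == "__main__":
    print(total_train_examples(TRAIN_EXAMPLES))  # 295325
```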