- Description:
There are two sub-datasets:
(1) RottenTomatoes: Movie critics' reviews and consensus crawled from http://rottentomatoes.com/. It has the fields "_movie_name", "_movie_id", "_critics", and "_critic_consensus".
(2) IDebate: Arguments crawled from http://idebate.org/. It has the fields "_debate_name", "_debate_id", "_claim", "_claim_id", and "_argument_sentences".
See also https://web.eecs.umich.edu/~wangluxy/datasets/opinion_README.txt
Source code: tfds.datasets.opinion_abstracts.Builder
Versions:
1.0.0 (default): No release notes.
Download size: 20.08 MiB
Auto-cached (documentation): Yes
Figure (tfds.show_examples): Not supported.
Citation:
@inproceedings{wang-ling-2016-neural,
title = "Neural Network-Based Abstract Generation for Opinions and Arguments",
author = "Wang, Lu and
Ling, Wang",
booktitle = "Proceedings of the 2016 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies",
month = jun,
year = "2016",
address = "San Diego, California",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/N16-1007",
doi = "10.18653/v1/N16-1007",
pages = "47--57",
}
opinion_abstracts/rotten_tomatoes (default config)
Config description: Professional critics and consensus of 3,731 movies.
Dataset size: 50.10 MiB
Splits:
| Split | Examples |
|---|---|
| 'train' | 3,731 |
- Feature structure:
FeaturesDict({
'_critic_consensus': string,
'_critics': Sequence({
'key': string,
'value': string,
}),
'_movie_id': string,
'_movie_name': string,
})
- Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| _critic_consensus | Tensor | | string | |
| _critics | Sequence | | | |
| _critics/key | Tensor | | string | |
| _critics/value | Tensor | | string | |
| _movie_id | Tensor | | string | |
| _movie_name | Tensor | | string | |
Supervised keys (See as_supervised doc): ('_critics', '_critic_consensus')
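The `_critics` feature is a Sequence of `key`/`value` strings, which TFDS represents as a dict of parallel lists. The sketch below shows one way to pair those lists up and to form an (input, target) pair that follows the supervised keys above. It assumes the record has already been converted to plain Python `str` values; the helper names `pair_sequence` and `to_supervised`, the placeholder record, and the choice to join reviews with a space are illustrative assumptions, not part of the dataset or of the original paper's preprocessing.

```python
from typing import Dict, List, Tuple

# Placeholder record shaped like the feature structure above.
# The strings are invented for illustration, not taken from the dataset.
EXAMPLE = {
    "_movie_id": "movie-0",
    "_movie_name": "Placeholder Movie",
    "_critics": {
        "key": ["critic-a", "critic-b"],
        "value": ["First review text.", "Second review text."],
    },
    "_critic_consensus": "Placeholder consensus.",
}


def pair_sequence(seq: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Zip the parallel 'key' and 'value' lists of a Sequence feature."""
    keys, values = seq["key"], seq["value"]
    if len(keys) != len(values):
        raise ValueError("'key' and 'value' lists differ in length")
    return list(zip(keys, values))


def to_supervised(
    example: dict,
    input_key: str = "_critics",
    target_key: str = "_critic_consensus",
) -> Tuple[str, str]:
    """Return (source_text, target_text) following the supervised keys."""
    pairs = pair_sequence(example[input_key])
    # Joining the values with a space is an assumption made for this sketch.
    source = " ".join(text for _, text in pairs)
    return source, example[target_key]
```

The same helpers apply to the idebate config by passing `input_key="_argument_sentences"` and `target_key="_claim"`.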
opinion_abstracts/idebate
Config description: 2,259 claims for 676 debates.
Dataset size: 3.15 MiB
Splits:
| Split | Examples |
|---|---|
| 'train' | 2,259 |
- Feature structure:
FeaturesDict({
'_argument_sentences': Sequence({
'key': string,
'value': string,
}),
'_claim': string,
'_claim_id': string,
'_debate_name': string,
})
- Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| _argument_sentences | Sequence | | | |
| _argument_sentences/key | Tensor | | string | |
| _argument_sentences/value | Tensor | | string | |
| _claim | Tensor | | string | |
| _claim_id | Tensor | | string | |
| _debate_name | Tensor | | string | |
Supervised keys (See as_supervised doc): ('_argument_sentences', '_claim')
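Either config can be loaded by passing a `dataset/config` name to `tfds.load`. The following is a minimal sketch, assuming `tensorflow_datasets` is installed; the helper names `dataset_name` and `load_train` are hypothetical and exist only for this example.

```python
from typing import Any

# Config names as listed on this page.
CONFIGS = ("rotten_tomatoes", "idebate")


def dataset_name(config: str = "rotten_tomatoes") -> str:
    """Build the 'dataset/config' name string passed to tfds.load."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    return f"opinion_abstracts/{config}"


def load_train(config: str = "rotten_tomatoes") -> Any:
    """Load the only split ('train') of the chosen config."""
    # Imported lazily so dataset_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(dataset_name(config), split="train")
```

Calling `load_train("idebate")` should return a `tf.data.Dataset` whose elements are dicts keyed by the feature names listed above. The first call downloads the source data (20.08 MiB according to this page).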