eraser_multi_rc

Description:

Eraser Multi RC is a dataset for queries over multi-line passages, along with answers and a rationale. Each example in this dataset has the following five parts:

1. A multi-line passage
2. A query about the passage
3. An answer to the query
4. A classification as to whether the answer is right or wrong
5. An explanation justifying the classification

Additional Documentation: Explore on Papers With Code
Homepage: https://cogcomp.seas.upenn.edu/multirc/
Source code: tfds.text.EraserMultiRc
Versions:
- 0.1.1 (default): No release notes.
Download size: 1.59 MiB
Dataset size: 62.59 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	4,848
`'train'`	24,029
`'validation'`	3,214
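
The relative size of each split follows directly from the counts in the table. As a small worked example (plain Python, no TFDS required; the variable names are illustrative only):

```python
# Split sizes copied from the splits table.
SPLIT_SIZES = {"test": 4848, "train": 24029, "validation": 3214}

total = sum(SPLIT_SIZES.values())

# Fraction of all examples that falls in each split.
split_fractions = {name: count / total for name, count in SPLIT_SIZES.items()}

for name, fraction in split_fractions.items():
    print(f"{name:<10} {SPLIT_SIZES[name]:>6,}  {fraction:.1%}")
print(f"{'total':<10} {total:>6,}")
```

The three splits add up to 32,091 examples, which works out to roughly a 75/15/10 train/test/validation division.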

Feature structure:

FeaturesDict({
    'evidences': Sequence(Text(shape=(), dtype=string)),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'passage': Text(shape=(), dtype=string),
    'query_and_answer': Text(shape=(), dtype=string),
})
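
To make the structure concrete, the sketch below builds one record by hand and checks it against the four features listed above. It is only an illustration: `check_example` and `toy_example` are hypothetical names, the record text is invented rather than taken from the dataset, and plain Python strings and ints stand in for the tensors that TFDS actually yields.

```python
from typing import Any, Dict

NUM_CLASSES = 2  # from ClassLabel(num_classes=2)
EXPECTED_KEYS = {"evidences", "label", "passage", "query_and_answer"}


def check_example(example: Dict[str, Any]) -> None:
    """Raise ValueError if `example` does not match the feature structure."""
    if set(example) != EXPECTED_KEYS:
        raise ValueError(f"unexpected keys: {sorted(set(example) ^ EXPECTED_KEYS)}")
    for key in ("passage", "query_and_answer"):
        if not isinstance(example[key], str):
            raise ValueError(f"{key!r} must be a string")
    label = example["label"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        raise ValueError("'label' must be an integer class id")
    if not 0 <= label < NUM_CLASSES:
        raise ValueError(f"'label' must be in [0, {NUM_CLASSES})")
    evidences = example["evidences"]
    if not isinstance(evidences, list) or not all(isinstance(e, str) for e in evidences):
        raise ValueError("'evidences' must be a list of strings")


# Invented record for illustration only. How the query and the answer are
# joined inside 'query_and_answer', and which class id means "right", are
# not specified in this section, so both choices here are assumptions.
toy_example = {
    "passage": "Sam planted a tree in spring. By autumn it was taller than the fence.",
    "query_and_answer": "How tall was the tree by autumn? Taller than the fence",
    "label": 1,
    "evidences": ["By autumn it was taller than the fence."],
}

check_example(toy_example)  # passes silently
```

Note that `evidences` is a variable-length sequence, so an example may carry any number of supporting sentences, including none.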

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
evidences	Sequence(Text)	(None,)	string
label	ClassLabel		int64
passage	Text		string
query_and_answer	Text		string

Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
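
The preview table is not reproduced here. A minimal sketch of generating one locally is shown below, assuming `tensorflow_datasets` is installed and the dataset can be downloaded; `preview_examples` is a hypothetical helper name, not part of TFDS.

```python
DATASET_NAME = "eraser_multi_rc"


def preview_examples(split: str = "validation", num_examples: int = 4):
    """Return a DataFrame holding the first few examples of a split."""
    # Imported lazily so that defining this helper does not itself require
    # TensorFlow Datasets to be installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(DATASET_NAME, split=split, with_info=True)
    # Supervised keys are None for this dataset, so each element is a dict
    # keyed by feature name rather than an (input, label) tuple.
    return tfds.as_dataframe(ds.take(num_examples), info)
```

Calling `preview_examples("train")` triggers the download and preparation step on first use. Because no supervised keys are defined, the `as_supervised=True` option of `tfds.load` is not expected to work for this dataset.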

Citation:

@unpublished{eraser2019,
    title = {ERASER: A Benchmark to Evaluate Rationalized NLP Models},
    author = {Jay DeYoung and Sarthak Jain and Nazneen Fatema Rajani and Eric Lehman and Caiming Xiong and Richard Socher and Byron C. Wallace}
}
@inproceedings{MultiRC2018,
    author = {Daniel Khashabi and Snigdha Chaturvedi and Michael Roth and Shyam Upadhyay and Dan Roth},
    title = {Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences},
    booktitle = {NAACL},
    year = {2018}
}