- Description:
This is a dataset for classifying citation intents in academic papers. Each JSON object gives its main citation intent under the label key and its citation context under the string key. Example:
{
'string': 'In chacma baboons, male-infant relationships can be linked to both formation of friendships and paternity success [30,31].',
'sectionName': 'Introduction',
'label': 'background',
'citingPaperId': '7a6b2d4b405439',
'citedPaperId': '9d1abadc55b5e0',
...
}
You may obtain full information about each paper by looking up the provided paper IDs with the Semantic Scholar API (https://api.semanticscholar.org/).
The labels are: Method, Background, Result
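As a rough illustration of the record layout described above, the sketch below parses one JSON object and maps its label to an integer index. The order in `LABEL_NAMES` is an assumption taken from the order in which the labels are listed here, so the builder's actual `ClassLabel` encoding may differ. The record is the abridged example from this page rewritten as valid JSON, and `label_to_index` is a hypothetical helper.

```python
import json

# Assumption: class order follows the list above; check the builder's
# ClassLabel names before relying on these indices.
LABEL_NAMES = ["method", "background", "result"]

# The abridged example record from this page, rewritten as valid JSON.
raw = """
{
  "string": "In chacma baboons, male-infant relationships can be linked to both formation of friendships and paternity success [30,31].",
  "sectionName": "Introduction",
  "label": "background",
  "citingPaperId": "7a6b2d4b405439",
  "citedPaperId": "9d1abadc55b5e0"
}
"""


def label_to_index(label: str) -> int:
    """Map a label string to its position in LABEL_NAMES (case-insensitive)."""
    return LABEL_NAMES.index(label.lower())


record = json.loads(raw)
context = record["string"]          # the citation context
label_index = label_to_index(record["label"])
print(record["label"], label_index)
```

An unknown label raises `ValueError` here, which is probably preferable to silently assigning a default class.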
Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/allenai/scicite
Source code:
tfds.datasets.scicite.Builder
Versions:
1.0.0 (default): No release notes.
Download size:
22.12 MiB
Dataset size:
7.26 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'test' | 1,859 |
'train' | 8,194 |
'validation' | 916 |
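To put the split sizes in proportion, the following sketch sums the counts from the table and computes each split's share of the total. It uses only the numbers listed above; the variable names are illustrative.

```python
# Example counts copied from the Splits table.
split_sizes = {"train": 8194, "validation": 916, "test": 1859}

total = sum(split_sizes.values())

# Fraction of all examples that falls into each split.
fractions = {name: count / total for name, count in split_sizes.items()}

for name, frac in fractions.items():
    print(f"{name}: {split_sizes[name]} examples ({frac:.1%})")
print(f"total: {total}")
```

The split works out to roughly 75% train, 8% validation, and 17% test.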
- Feature structure:
FeaturesDict({
'citeEnd': int64,
'citeStart': int64,
'citedPaperId': Text(shape=(), dtype=string),
'citingPaperId': Text(shape=(), dtype=string),
'excerpt_index': int32,
'id': Text(shape=(), dtype=string),
'isKeyCitation': bool,
'label': ClassLabel(shape=(), dtype=int64, num_classes=3),
'label2': ClassLabel(shape=(), dtype=int64, num_classes=4),
'label2_confidence': float32,
'label_confidence': float32,
'sectionName': Text(shape=(), dtype=string),
'source': ClassLabel(shape=(), dtype=int64, num_classes=7),
'string': Text(shape=(), dtype=string),
})
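The `citeStart` and `citeEnd` fields can plausibly be used to pull the citation marker out of `string`. The sketch below assumes they are character offsets with an exclusive end, which this page does not state, so verify against real records first. The offsets in the example are computed for illustration and are not taken from the dataset; `citation_span` is a hypothetical helper.

```python
def citation_span(example: dict) -> str:
    """Return the text between citeStart and citeEnd.

    Assumption: both fields are character offsets into example["string"],
    with citeEnd exclusive.
    """
    start, end = example["citeStart"], example["citeEnd"]
    return example["string"][start:end]


text = (
    "In chacma baboons, male-infant relationships can be linked to both "
    "formation of friendships and paternity success [30,31]."
)
marker = "[30,31]"

# Illustrative offsets derived from the text itself, not real dataset values.
start = text.index(marker)
example = {"string": text, "citeStart": start, "citeEnd": start + len(marker)}

print(citation_span(example))
```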
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
 | FeaturesDict | | | |
citeEnd | Tensor | | int64 | |
citeStart | Tensor | | int64 | |
citedPaperId | Text | | string | |
citingPaperId | Text | | string | |
excerpt_index | Tensor | | int32 | |
id | Text | | string | |
isKeyCitation | Tensor | | bool | |
label | ClassLabel | | int64 | |
label2 | ClassLabel | | int64 | |
label2_confidence | Tensor | | float32 | |
label_confidence | Tensor | | float32 | |
sectionName | Text | | string | |
source | ClassLabel | | int64 | |
string | Text | | string | |
Supervised keys (See as_supervised doc): ('string', 'label')
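The supervised keys mean that a supervised view of the dataset pairs the citation context with its intent label. The sketch below mimics that projection in plain Python on a hand-written example; `to_supervised` is a hypothetical helper, not part of TFDS, and the integer label value is illustrative only.

```python
SUPERVISED_KEYS = ("string", "label")


def to_supervised(example: dict, keys=SUPERVISED_KEYS) -> tuple:
    """Project a full feature dict onto an (input, target) pair."""
    input_key, target_key = keys
    return example[input_key], example[target_key]


# Hand-written stand-in for one decoded example (values are illustrative).
example = {
    "string": "In chacma baboons, male-infant relationships can be linked "
              "to both formation of friendships and paternity success [30,31].",
    "label": 1,
    "sectionName": "Introduction",
}

features, target = to_supervised(example)
print(target, features[:40])

# With TensorFlow Datasets installed, the equivalent pairs would be expected
# from: tfds.load("scicite", split="train", as_supervised=True)
```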
Figure (tfds.show_examples): Not supported.
- Citation:
@InProceedings{Cohan2019Structural,
author={Arman Cohan and Waleed Ammar and Madeleine Van Zuylen and Field Cady},
title={Structural Scaffolds for Citation Intent Classification in Scientific Publications},
booktitle="NAACL",
year="2019"
}