all
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/all')
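As a minimal sketch of the command above, the config name after the slash can be built from a CPC section code before loading. `CPC_SECTIONS`, `tfds_name`, and `load_config` are hypothetical helpers introduced here for illustration and are not part of TFDS; only `tfds.load` and its `split` argument come from the library, which is assumed to be installed.

```python
# CPC section codes and names, as listed in the dataset description.
CPC_SECTIONS = {
    "a": "Human Necessities",
    "b": "Performing Operations; Transporting",
    "c": "Chemistry; Metallurgy",
    "d": "Textiles; Paper",
    "e": "Fixed Constructions",
    "f": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "g": "Physics",
    "h": "Electricity",
    "y": "General tagging of new or cross-sectional technology",
}


def tfds_name(config: str = "all") -> str:
    """Build the TFDS name for a BIGPATENT config (hypothetical helper)."""
    key = config.lower()
    if key != "all" and key not in CPC_SECTIONS:
        raise ValueError(f"unknown BIGPATENT config: {config!r}")
    return f"huggingface:big_patent/{key}"


def load_config(config: str = "all", split: str = "train"):
    """Load one split of one config (hypothetical helper)."""
    # Imported lazily: the package is third-party and the download is large.
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(config), split=split)


print(tfds_name("all"))  # huggingface:big_patent/all
```

Validating the code first means a mistyped section letter fails fast instead of triggering a lookup for a config that does not exist.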
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 67072 |
'train' | 1207222 |
'validation' | 67068 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
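The feature spec above declares two string-valued fields. A minimal sketch of checking a record against it follows; `FEATURE_SPEC` is copied from the listing, while `matches_spec` and the sample record are hypothetical and made up for illustration. The check accepts both `str` and `bytes`, on the assumption that string features may arrive as bytes when read through TensorFlow.

```python
import json

# Feature spec copied from the listing above.
FEATURE_SPEC = json.loads("""
{
  "description": {"dtype": "string", "id": null, "_type": "Value"},
  "abstract": {"dtype": "string", "id": null, "_type": "Value"}
}
""")


def matches_spec(record: dict, spec: dict = FEATURE_SPEC) -> bool:
    """Return True if record has exactly the spec's keys with string values."""
    if set(record) != set(spec):
        return False
    return all(
        spec[name]["dtype"] == "string"
        and isinstance(record[name], (str, bytes))
        for name in spec
    )


# Made-up record for illustration; not taken from the dataset.
example = {
    "description": "A detailed description of an invention.",
    "abstract": "A short summary.",
}
print(matches_spec(example))  # True
```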
a
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/a')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 9675 |
'train' | 174134 |
'validation' | 9674 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
b
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/b')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 8974 |
'train' | 161520 |
'validation' | 8973 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
c
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/c')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 5614 |
'train' | 101042 |
'validation' | 5613 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
d
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/d')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 565 |
'train' | 10164 |
'validation' | 565 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
e
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/e')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1914 |
'train' | 34443 |
'validation' | 1914 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
f
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/f')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 4754 |
'train' | 85568 |
'validation' | 4754 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
g
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/g')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 14386 |
'train' | 258935 |
'validation' | 14385 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
h
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/h')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 14279 |
'train' | 257019 |
'validation' | 14279 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
y
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:big_patent/y')
- Description:
BIGPATENT consists of 1.3 million records of U.S. patent documents
along with human-written abstractive summaries.
Each US patent application is filed under a Cooperative Patent Classification
(CPC) code. There are nine such classification categories:
A (Human Necessities), B (Performing Operations; Transporting),
C (Chemistry; Metallurgy), D (Textiles; Paper), E (Fixed Constructions),
F (Mechanical Engineering; Lighting; Heating; Weapons; Blasting),
G (Physics), H (Electricity), and
Y (General tagging of new or cross-sectional technology).
There are two features:
- description: detailed description of the patent.
- abstract: patent abstract.
- License: Creative Commons Attribution 4.0 International
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 6911 |
'train' | 124397 |
'validation' | 6911 |
- Features:
{
"description": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"abstract": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
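The split tables above can be cross-checked with a little arithmetic: the nine per-section configs should add up to the `all` config. The sketch below copies the counts from the tables; `SPLITS` and `section_totals` are hypothetical names used only for this illustration.

```python
# (test, train, validation) example counts copied from the split tables above.
SPLITS = {
    "all": (67072, 1207222, 67068),
    "a": (9675, 174134, 9674),
    "b": (8974, 161520, 8973),
    "c": (5614, 101042, 5613),
    "d": (565, 10164, 565),
    "e": (1914, 34443, 1914),
    "f": (4754, 85568, 4754),
    "g": (14386, 258935, 14385),
    "h": (14279, 257019, 14279),
    "y": (6911, 124397, 6911),
}


def section_totals(splits=SPLITS):
    """Sum each split column over the nine per-section configs."""
    per_section = [counts for name, counts in splits.items() if name != "all"]
    return tuple(sum(column) for column in zip(*per_section))


totals = section_totals()
print(totals == SPLITS["all"])  # True
print(sum(SPLITS["all"]))       # 1341362
```

The grand total of 1,341,362 examples matches the "1.3 million records" in the description, and the train split holds roughly 90% of them.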