newsgroup

References:

18828_alt.atheism

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_alt.atheism')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 799
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
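
The load commands on this page all follow one naming pattern, so a small helper can build them. The following is a minimal sketch, not part of the catalog itself: `config_name` and `load_newsgroup` are hypothetical helper names, and the sketch assumes that `tensorflow_datasets` is installed and that the data can be downloaded when `load_newsgroup` is actually called.

```python
def config_name(variant: str, group: str) -> str:
    """Build the TFDS name for one config, e.g. variant='18828', group='alt.atheism'."""
    return f"huggingface:newsgroup/{variant}_{group}"


def load_newsgroup(variant: str, group: str, split: str = "train"):
    """Load one config (hypothetical helper; needs tensorflow_datasets and network access)."""
    # Imported lazily so that config_name() can be used without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(config_name(variant, group), split=split)


print(config_name("18828", "alt.atheism"))
```

Calling `load_newsgroup("18828", "alt.atheism")` would then issue the same `tfds.load` call shown above, restricted to the 'train' split.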

18828_comp.graphics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_comp.graphics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 973
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_comp.os.ms-windows.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_comp.os.ms-windows.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 985
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_comp.sys.ibm.pc.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_comp.sys.ibm.pc.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 982
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_comp.sys.mac.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_comp.sys.mac.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 961
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_comp.windows.x

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_comp.windows.x')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 980
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_misc.forsale

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_misc.forsale')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 972
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_rec.autos

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_rec.autos')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 990
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_rec.motorcycles

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_rec.motorcycles')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 994
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_rec.sport.baseball

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_rec.sport.baseball')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 994
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_rec.sport.hockey

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_rec.sport.hockey')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 999
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_sci.crypt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_sci.crypt')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 991
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_sci.electronics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_sci.electronics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 981
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_sci.med

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_sci.med')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 990
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_sci.space

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_sci.space')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 987
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_soc.religion.christian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_soc.religion.christian')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 997
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_talk.politics.guns

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_talk.politics.guns')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 910
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_talk.politics.mideast

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_talk.politics.mideast')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 940
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_talk.politics.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_talk.politics.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 775
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

18828_talk.religion.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/18828_talk.religion.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version does not include cross-posts and includes only the "From" and "Subject" headers.
  • License: No known license
  • Version: 3.0.0
  • Splits:
Split Examples
'train' 628
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
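
As a quick consistency check, the 'train' counts listed for the twenty 18828_* configs above can be added up. The sketch below only restates the numbers from this page; the reading that the "18828" prefix refers to the total document count is an inference from the arithmetic, not something the page states.

```python
# 'train' example counts copied from the 18828_* configs listed above.
counts_18828 = {
    "alt.atheism": 799,
    "comp.graphics": 973,
    "comp.os.ms-windows.misc": 985,
    "comp.sys.ibm.pc.hardware": 982,
    "comp.sys.mac.hardware": 961,
    "comp.windows.x": 980,
    "misc.forsale": 972,
    "rec.autos": 990,
    "rec.motorcycles": 994,
    "rec.sport.baseball": 994,
    "rec.sport.hockey": 999,
    "sci.crypt": 991,
    "sci.electronics": 981,
    "sci.med": 990,
    "sci.space": 987,
    "soc.religion.christian": 997,
    "talk.politics.guns": 910,
    "talk.politics.mideast": 940,
    "talk.politics.misc": 775,
    "talk.religion.misc": 628,
}

total = sum(counts_18828.values())
print(len(counts_18828), total)  # prints: 20 18828
```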

19997_alt.atheism

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_alt.atheism')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_comp.graphics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_comp.graphics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_comp.os.ms-windows.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_comp.os.ms-windows.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_comp.sys.ibm.pc.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_comp.sys.ibm.pc.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_comp.sys.mac.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_comp.sys.mac.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_comp.windows.x

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_comp.windows.x')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_misc.forsale

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_misc.forsale')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_rec.autos

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_rec.autos')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_rec.motorcycles

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_rec.motorcycles')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_rec.sport.baseball

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_rec.sport.baseball')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_rec.sport.hockey

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_rec.sport.hockey')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_sci.crypt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_sci.crypt')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_sci.electronics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_sci.electronics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_sci.med

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_sci.med')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_sci.space

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_sci.space')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_soc.religion.christian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_soc.religion.christian')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 997
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_talk.politics.guns

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_talk.politics.guns')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_talk.politics.mideast

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_talk.politics.mideast')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_talk.politics.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_talk.politics.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

19997_talk.religion.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/19997_talk.religion.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This is the original, unmodified version.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'train' 1000
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_alt.atheism

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_alt.atheism')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

This version is sorted by date into training (60%) and test (40%) sets, does not include cross-posts (duplicates), and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date).
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 319
'train' 480
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
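
The 60%/40% split stated in the description can be checked against the counts listed for bydate_alt.atheism. This is a small arithmetic sketch using only numbers from this page; `train_fraction` is a hypothetical helper name.

```python
def train_fraction(train: int, test: int) -> float:
    """Return the share of examples that fall in the training split."""
    return train / (train + test)


# Counts listed above for bydate_alt.atheism.
train_count, test_count = 480, 319

frac = train_fraction(train_count, test_count)
print(f"{train_count + test_count} examples, {frac:.1%} train")
```

The two splits together hold 799 examples, the same number listed for the 'train' split of 18828_alt.atheism, and the training share comes out at roughly 60%.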

bydate_comp.graphics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_comp.graphics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 389
'train' 584
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_comp.os.ms-windows.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_comp.os.ms-windows.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 394
'train' 591
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_comp.sys.ibm.pc.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_comp.sys.ibm.pc.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 392
'train' 590
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_comp.sys.mac.hardware

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_comp.sys.mac.hardware')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 385
'train' 578
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_comp.windows.x

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_comp.windows.x')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 395
'train' 593
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_misc.forsale

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_misc.forsale')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 390
'train' 585
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_rec.autos

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_rec.autos')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 396
'train' 594
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_rec.motorcycles

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_rec.motorcycles')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 398
'train' 598
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_rec.sport.baseball

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_rec.sport.baseball')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 397
'train' 597
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_rec.sport.hockey

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_rec.sport.hockey')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 399
'train' 600
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_sci.crypt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_sci.crypt')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 396
'train' 595
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_sci.electronics

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_sci.electronics')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 393
'train' 591
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_sci.med

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_sci.med')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 396
'train' 594
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_sci.space

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_sci.space')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 394
'train' 593
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_soc.religion.christian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_soc.religion.christian')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 398
'train' 599
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_talk.politics.guns

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_talk.politics.guns')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 364
'train' 546
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_talk.politics.mideast

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_talk.politics.mideast')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 376
'train' 564
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_talk.politics.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_talk.politics.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 310
'train' 465
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

bydate_talk.religion.misc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:newsgroup/bydate_talk.religion.misc')
  • Description:
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across
20 different newsgroups. The 20 newsgroups collection has become a popular data set for experiments in text applications of
machine learning techniques, such as text classification and text clustering.

sorted by date into training(60%) and test(40%) sets, does not include cross-posts (duplicates) and does not include newsgroup-identifying headers (Xref, Newsgroups, Path, Followup-To, Date)
  • License: No known license
  • Version: 2.0.0
  • Splits:
Split Examples
'test' 251
'train' 377
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
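
The "training(60%) and test(40%)" wording in the bydate descriptions can be checked against the split tables. The sketch below is illustrative only: the counts are copied from the bydate_* entries above, while `BYDATE_COUNTS`, `train_fraction`, and `totals` are hypothetical names introduced here.

```python
# (test, train) example counts copied from the bydate_* split tables above.
BYDATE_COUNTS = {
    "alt.atheism": (319, 480),
    "comp.graphics": (389, 584),
    "comp.os.ms-windows.misc": (394, 591),
    "comp.sys.ibm.pc.hardware": (392, 590),
    "comp.sys.mac.hardware": (385, 578),
    "comp.windows.x": (395, 593),
    "misc.forsale": (390, 585),
    "rec.autos": (396, 594),
    "rec.motorcycles": (398, 598),
    "rec.sport.baseball": (397, 597),
    "rec.sport.hockey": (399, 600),
    "sci.crypt": (396, 595),
    "sci.electronics": (393, 591),
    "sci.med": (396, 594),
    "sci.space": (394, 593),
    "soc.religion.christian": (398, 599),
    "talk.politics.guns": (364, 546),
    "talk.politics.mideast": (376, 564),
    "talk.politics.misc": (310, 465),
    "talk.religion.misc": (251, 377),
}


def train_fraction(test: int, train: int) -> float:
    """Share of a newsgroup's examples that fall in the train split."""
    return train / (test + train)


def totals(counts: dict) -> tuple:
    """Return (total_test, total_train) summed over all newsgroups."""
    total_test = sum(test for test, _ in counts.values())
    total_train = sum(train for _, train in counts.values())
    return total_test, total_train


if __name__ == "__main__":
    for name, (test, train) in BYDATE_COUNTS.items():
        print(f"{name:28s} train share = {train_fraction(test, train):.4f}")
    total_test, total_train = totals(BYDATE_COUNTS)
    print(f"test={total_test} train={total_train} all={total_test + total_train}")
```

Every newsgroup comes out at a train share of roughly 0.60, and the counts sum to 7,532 test and 11,314 train examples (18,846 in total).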