so_stacksample

Answers

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:so_stacksample/Answers')
  • Description:
Dataset with the text of 10% of questions and answers from the Stack Overflow programming Q&A website.

This is organized as three tables:

Questions contains the title, body, creation date, closed date (if applicable), score, and owner ID for all non-deleted Stack Overflow questions whose Id is a multiple of 10.
Answers contains the body, creation date, score, and owner ID for each of the answers to these questions. The ParentId column links back to the Questions table.
Tags contains the tags on each of these questions.
  • License: All Stack Overflow user contributions are licensed under CC-BY-SA 3.0 with attribution required.
  • Version: 1.1.0
  • Splits:
Split      Examples
'Answers'  2014516
  • Features:
{
    "Id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "OwnerUserId": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "CreationDate": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ParentId": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "Score": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "Body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
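
The feature spec above maps naturally onto a typed record. The following is a minimal sketch, not part of the dataset's own tooling: the `AnswerRow` class and the `to_answer_row` helper are hypothetical names, the sample values are invented for illustration, and the handling of `bytes` assumes that string features may arrive as byte strings when examples are iterated as NumPy values.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AnswerRow:
    """One record of the Answers table, mirroring the feature spec above."""
    id: int
    owner_user_id: int
    creation_date: str
    parent_id: int
    score: int
    body: str


def _as_text(value: Any) -> str:
    # Assumption: string features may arrive as bytes rather than str.
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def to_answer_row(example: Mapping[str, Any]) -> AnswerRow:
    """Convert one raw example (a mapping keyed by feature name) to AnswerRow."""
    return AnswerRow(
        id=int(example["Id"]),
        owner_user_id=int(example["OwnerUserId"]),
        creation_date=_as_text(example["CreationDate"]),
        parent_id=int(example["ParentId"]),
        score=int(example["Score"]),
        body=_as_text(example["Body"]),
    )


# Hypothetical example, invented for illustration (not taken from the dataset).
sample = {
    "Id": 92,
    "OwnerUserId": 61,
    "CreationDate": b"2008-08-01T14:45:37Z",
    "ParentId": 90,
    "Score": 13,
    "Body": b"<p>Example answer body</p>",
}
row = to_answer_row(sample)
```

Keeping `CreationDate` as a string matches the declared `string` dtype; parsing it into a datetime is left out because the timestamp format is not stated in the spec.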

Questions

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:so_stacksample/Questions')
  • License: All Stack Overflow user contributions are licensed under CC-BY-SA 3.0 with attribution required.
  • Version: 1.1.0
  • Splits:
Split        Examples
'Questions'  1264216
  • Features:
{
    "Id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "OwnerUserId": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "CreationDate": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ClosedDate": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "Score": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "Title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "Body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
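
Because `ClosedDate` is populated only where applicable, a common first step is to separate open questions from closed ones. The sketch below works on plain dictionaries keyed by feature name. How a missing close date is encoded is not stated in the spec, so the `NOT_CLOSED_MARKERS` set (empty string and `"NA"`) is an assumption that should be checked against the actual data; the helper names and sample rows are hypothetical.

```python
from typing import Any, FrozenSet, Iterable, Iterator, Mapping

# Assumption: these placeholder values mark a question that was never closed.
NOT_CLOSED_MARKERS: FrozenSet[str] = frozenset({"", "NA"})


def is_closed(
    question: Mapping[str, Any],
    markers: FrozenSet[str] = NOT_CLOSED_MARKERS,
) -> bool:
    """Return True if the question carries a real ClosedDate value."""
    value = question.get("ClosedDate")
    if value is None:
        return False
    if isinstance(value, bytes):
        # String features may arrive as bytes; normalise to str first.
        value = value.decode("utf-8")
    return value.strip() not in markers


def open_questions(questions: Iterable[Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Lazily yield only the questions that are not closed."""
    return (q for q in questions if not is_closed(q))


# Hypothetical rows, invented for illustration (not taken from the dataset).
sample = [
    {"Id": 80, "Title": "Example open question", "ClosedDate": "NA"},
    {"Id": 90, "Title": "Example closed question", "ClosedDate": "2012-12-26T03:45:49Z"},
]
open_ids = [q["Id"] for q in open_questions(sample)]
```

Passing a different `markers` set adapts the check if the data turns out to use another placeholder.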

Tags

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:so_stacksample/Tags')
  • License: All Stack Overflow user contributions are licensed under CC-BY-SA 3.0 with attribution required.
  • Version: 1.1.0
  • Splits:
Split   Examples
'Tags'  3750994
  • Features:
{
    "Id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "Tag": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
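
Since the three tables are linked by Id, they can be reassembled into one record per question. The following is a rough sketch on plain dictionaries rather than on the loaded datasets: `group_by` and `assemble_threads` are hypothetical helpers, the sample rows are invented, and it assumes that the `Id` column of Tags holds the Id of the tagged question (the description says only that Tags contains the tags on each question). Answers are attached through `ParentId`, as the description states.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping

Row = Mapping[str, Any]


def group_by(rows: Iterable[Row], key: str) -> Dict[int, List[Row]]:
    """Bucket rows by the integer value stored under `key`."""
    grouped: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[int(row[key])].append(row)
    return dict(grouped)


def assemble_threads(
    questions: Iterable[Row],
    answers: Iterable[Row],
    tags: Iterable[Row],
) -> List[Dict[str, Any]]:
    """Attach answers (highest score first) and tags to each question."""
    answers_by_question = group_by(answers, "ParentId")
    # Assumption: Tags.Id is the Id of the question the tag belongs to.
    tags_by_question = group_by(tags, "Id")
    threads = []
    for question in questions:
        qid = int(question["Id"])
        threads.append({
            "question": question,
            "answers": sorted(
                answers_by_question.get(qid, []),
                key=lambda a: a["Score"],
                reverse=True,
            ),
            "tags": [t["Tag"] for t in tags_by_question.get(qid, [])],
        })
    return threads


# Hypothetical rows, invented for illustration (not taken from the dataset).
questions = [{"Id": 80, "Title": "Example question"},
             {"Id": 90, "Title": "Another example"}]
answers = [{"Id": 124, "ParentId": 80, "Score": 12},
           {"Id": 92, "ParentId": 90, "Score": 13},
           {"Id": 202, "ParentId": 90, "Score": 25}]
tags = [{"Id": 80, "Tag": "flex"},
        {"Id": 90, "Tag": "svn"},
        {"Id": 90, "Tag": "branch"}]
threads = assemble_threads(questions, answers, tags)
```

Grouping into in-memory dictionaries is the simplest approach for a sample; with roughly two million answers and 3.75 million tag rows, a dataframe or database join may be a better fit for the full tables.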