mrqa

Description:

The MRQA 2019 Shared Task focuses on generalization in question answering. An effective question answering system should do more than merely interpolate from the training set to answer test examples drawn from the same distribution: it should also be able to extrapolate to out-of-distribution examples, which is a significantly harder challenge.

MRQA adapts and unifies multiple distinct question answering datasets (carefully selected subsets of existing datasets) into the same format (SQuAD format). Among them, six datasets were made available for training, and six datasets were made available for testing. Small portions of the training datasets were held out as in-domain data that may be used for development. The testing datasets contain only out-of-domain data. This benchmark is released as part of the MRQA 2019 Shared Task.

More information can be found at: <a href="https://mrqa.github.io/2019/shared.html">https://mrqa.github.io/2019/shared.html</a>
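
The twelve configs listed further down this page fall into the two groups described above. The following is a minimal sketch of how that grouping might be encoded, assuming the standard `tfds.load` entry point and an installed `tensorflow_datasets` package. The helper names (`tfds_name`, `available_splits`, `load_subset`) are hypothetical and are not part of TFDS.

```python
"""Sketch: map MRQA subsets to TFDS config names and the splits they expose."""

# Subsets that provide 'train' and 'validation' splits (in-domain data).
IN_DOMAIN = (
    "squad", "news_qa", "trivia_qa",
    "search_qa", "hotpot_qa", "natural_questions",
)
# Subsets that provide only a 'test' split (out-of-domain data).
OUT_OF_DOMAIN = (
    "bio_asq", "drop", "duo_rc",
    "race", "relation_extraction", "textbook_qa",
)


def tfds_name(subset: str) -> str:
    """Return the TFDS config name, e.g. 'mrqa/squad' (hypothetical helper)."""
    if subset not in IN_DOMAIN + OUT_OF_DOMAIN:
        raise ValueError(f"unknown MRQA subset: {subset!r}")
    return f"mrqa/{subset}"


def available_splits(subset: str) -> tuple:
    """Return the split names this page lists for the given subset."""
    tfds_name(subset)  # raises ValueError for unknown subsets
    return ("train", "validation") if subset in IN_DOMAIN else ("test",)


def load_subset(subset: str, split: str):
    """Load one split through tfds.load; data is downloaded on first use."""
    if split not in available_splits(subset):
        raise ValueError(f"{subset!r} has no {split!r} split")
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds
    return tfds.load(tfds_name(subset), split=split)
```

With these helpers, `load_subset("squad", "validation")` would amount to `tfds.load("mrqa/squad", split="validation")`, while `load_subset("drop", "train")` would raise because the out-of-domain configs expose only a test split.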

Additional documentation: Explore on Papers With Code
Homepage: https://mrqa.github.io/2019/shared.html
Source code: tfds.text.mrqa.MRQA
Versions:
- 1.0.0 (default): Initial release.
Feature structure:

FeaturesDict({
    'answers': Sequence(string),
    'context': string,
    'context_tokens': Sequence({
        'offsets': int32,
        'tokens': string,
    }),
    'detected_answers': Sequence({
        'char_spans': Sequence({
            'end': int32,
            'start': int32,
        }),
        'text': string,
        'token_spans': Sequence({
            'end': int32,
            'start': int32,
        }),
    }),
    'qid': string,
    'question': string,
    'question_tokens': Sequence({
        'offsets': int32,
        'tokens': string,
    }),
    'subset': string,
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
answers	Sequence(Tensor)	(None,)	string
context	Tensor		string
context_tokens	Sequence
context_tokens/offsets	Tensor		int32
context_tokens/tokens	Tensor		string
detected_answers	Sequence
detected_answers/char_spans	Sequence
detected_answers/char_spans/end	Tensor		int32
detected_answers/char_spans/start	Tensor		int32
detected_answers/text	Tensor		string
detected_answers/token_spans	Sequence
detected_answers/token_spans/end	Tensor		int32
detected_answers/token_spans/start	Tensor		int32
qid	Tensor		string
question	Tensor		string
question_tokens	Sequence
question_tokens/offsets	Tensor		int32
question_tokens/tokens	Tensor		string
subset	Tensor		string
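
To illustrate how the span fields relate to `context`, here is a minimal sketch over a hand-written record. The record is hypothetical and covers only a few of the features. It keeps the nested list-of-dicts layout of the feature structure, whereas TFDS returns tensors and exposes a Sequence of dicts as a dict of sequences, so indexing on a loaded example would differ. The sketch also assumes that `end` offsets are inclusive, as the MRQA shared task describes its data format; this is worth verifying against real data before relying on it.

```python
from typing import Dict, List

# Hypothetical record for illustration only; not taken from the dataset.
record: Dict = {
    "qid": "example-0",
    "question": "What jumps over the lazy dog?",
    "context": "The quick brown fox jumps over the lazy dog.",
    "answers": ["brown fox"],
    "detected_answers": [
        {
            "text": "brown fox",
            # Character offsets into `context`; `end` assumed inclusive.
            "char_spans": [{"start": 10, "end": 18}],
            # Token indices into the tokenized context; `end` assumed inclusive.
            "token_spans": [{"start": 2, "end": 3}],
        }
    ],
}


def char_span_texts(example: Dict) -> List[str]:
    """Slice the context with every character span of every detected answer."""
    context = example["context"]
    texts = []
    for detected in example["detected_answers"]:
        for span in detected["char_spans"]:
            # Add 1 because Python slices exclude the end index.
            texts.append(context[span["start"]: span["end"] + 1])
    return texts


# Each recovered string should match the `text` of its detected answer.
recovered = char_span_texts(record)
```

A check like `recovered == [d["text"] for d in record["detected_answers"]]` is one way to confirm the offset convention on a handful of real examples.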

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.

mrqa/squad (default config)

Config description: The SQuAD (Stanford Question Answering Dataset) dataset is used as the basis for the shared task format. Crowdworkers are shown paragraphs from Wikipedia and are asked to write questions with extractive answers.
Download size: 29.66 MiB
Dataset size: 271.43 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	86,588
`'validation'`	10,507

Examples (tfds.as_dataframe):

Citation:

@inproceedings{rajpurkar-etal-2016-squad,
    title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
    author = "Rajpurkar, Pranav  and
      Zhang, Jian  and
      Lopyrev, Konstantin  and
      Liang, Percy",
    booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2016",
    address = "Austin, Texas",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D16-1264",
    doi = "10.18653/v1/D16-1264",
    pages = "2383--2392",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/news_qa

Config description: Two sets of crowdworkers ask and answer questions based on CNN news articles. The "questioners" see only the article's headline and summary, while the "answerers" see the full article. Questions that have no answer, or that are flagged in the dataset as lacking annotator agreement, are discarded.
Download size: 56.83 MiB
Dataset size: 654.25 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	74,160
`'validation'`	4,212

Examples (tfds.as_dataframe):

Citation:

@inproceedings{trischler-etal-2017-newsqa,
    title = "{N}ews{QA}: A Machine Comprehension Dataset",
    author = "Trischler, Adam  and
      Wang, Tong  and
      Yuan, Xingdi  and
      Harris, Justin  and
      Sordoni, Alessandro  and
      Bachman, Philip  and
      Suleman, Kaheer",
    booktitle = "Proceedings of the 2nd Workshop on Representation Learning for {NLP}",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-2623",
    doi = "10.18653/v1/W17-2623",
    pages = "191--200",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/trivia_qa

Config description: Question-answer pairs are sourced from trivia and quiz-league websites. The web version of TriviaQA is used, in which the contexts are retrieved from the results of a Bing search query.
Download size: 383.14 MiB
Dataset size: 772.75 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	61,688
`'validation'`	7,785

Examples (tfds.as_dataframe):

Citation:

@inproceedings{joshi-etal-2017-triviaqa,
    title = "{T}rivia{QA}: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension",
    author = "Joshi, Mandar  and
      Choi, Eunsol  and
      Weld, Daniel  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P17-1147",
    doi = "10.18653/v1/P17-1147",
    pages = "1601--1611",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/search_qa

Config description: Question-answer pairs are sourced from the Jeopardy! TV show. The contexts are composed of snippets retrieved from a Google search query.
Download size: 699.86 MiB
Dataset size: 1.38 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	117,384
`'validation'`	16,980

Examples (tfds.as_dataframe):

Citation:

@article{dunn2017searchqa,
    title={Searchqa: A new q\&a dataset augmented with context from a search engine},
    author={Dunn, Matthew and Sagun, Levent and Higgins, Mike and Guney, V Ugur and Cirik, Volkan and Cho, Kyunghyun},
    journal={arXiv preprint arXiv:1704.05179},
    year={2017}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/hotpot_qa

Config description: Crowdworkers are shown two entity-linked paragraphs from Wikipedia and are asked to write and answer questions that require multi-hop reasoning to solve. In the original setting, these paragraphs are mixed with additional distractor paragraphs to make inference harder. Here, the distractor paragraphs are not included.
Download size: 111.98 MiB
Dataset size: 272.87 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	72,928
`'validation'`	5,901

Examples (tfds.as_dataframe):

Citation:

@inproceedings{yang-etal-2018-hotpotqa,
    title = "{H}otpot{QA}: A Dataset for Diverse, Explainable Multi-hop Question Answering",
    author = "Yang, Zhilin  and
      Qi, Peng  and
      Zhang, Saizheng  and
      Bengio, Yoshua  and
      Cohen, William  and
      Salakhutdinov, Ruslan  and
      Manning, Christopher D.",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1259",
    doi = "10.18653/v1/D18-1259",
    pages = "2369--2380",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/natural_questions

Config description: Questions are collected from information-seeking queries to the Google search engine by real users under natural conditions. Answers to the questions are annotated in a retrieved Wikipedia page by crowdworkers. Two types of annotations are collected: 1) the HTML bounding box containing enough information to completely infer the answer to the question (long answer), and 2) the subspan or sub-spans within the bounding box that comprise the actual answer (short answer). Only the examples that have short answers are used, and the long answer is used as the context.
Download size: 121.15 MiB
Dataset size: 339.03 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	104,071
`'validation'`	12,836

Examples (tfds.as_dataframe):

Citation:

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/bio_asq

Config description: BioASQ, a challenge on large-scale biomedical semantic indexing and question answering, contains question-answer pairs created by domain experts. These pairs are then manually linked to multiple related scientific (PubMed) articles. The full abstract of each linked article is downloaded and used as an individual context (e.g., a single question can be linked to multiple independent articles to create multiple QA-context pairs). Abstracts that do not contain the exact answer are discarded.
Download size: 2.54 MiB
Dataset size: 6.70 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	1,504

Examples (tfds.as_dataframe):

Citation:

@article{tsatsaronis2015overview,
    title={An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition},
    author={Tsatsaronis, George and Balikas, Georgios and Malakasiotis, Prodromos and Partalas, Ioannis and Zschunke, Matthias and Alvers, Michael R and Weissenborn, Dirk and Krithara, Anastasia and Petridis, Sergios and Polychronopoulos, Dimitris and others},
    journal={BMC bioinformatics},
    volume={16},
    number={1},
    pages={1--28},
    year={2015},
    publisher={Springer}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/drop

Config description: DROP (Discrete Reasoning Over the content of Paragraphs) examples were collected similarly to SQuAD, where crowdworkers are asked to create question-answer pairs from Wikipedia paragraphs. The questions focus on quantitative reasoning, and the original dataset contains non-extractive numeric answers as well as extractive text answers. Only the set of extractive questions is used.
Download size: 578.25 KiB
Dataset size: 5.41 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	1,503

Examples (tfds.as_dataframe):

Citation:

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/duo_rc

Config description: The ParaphraseRC split of the DuoRC dataset is used. In this setting, two different plot summaries of the same movie are collected, one from Wikipedia and the other from IMDb. Two different sets of crowdworkers ask and answer questions about the movie plot, where the "questioners" are shown only the Wikipedia page and the "answerers" are shown only the IMDb page. Questions that are marked as unanswerable are discarded.
Download size: 1.14 MiB
Dataset size: 15.04 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	1,501

Examples (tfds.as_dataframe):

Citation:

@inproceedings{saha-etal-2018-duorc,
    title = "{D}uo{RC}: Towards Complex Language Understanding with Paraphrased Reading Comprehension",
    author = "Saha, Amrita  and
      Aralikatte, Rahul  and
      Khapra, Mitesh M.  and
      Sankaranarayanan, Karthik",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P18-1156",
    doi = "10.18653/v1/P18-1156",
    pages = "1683--1693",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/race

Config description: The ReAding Comprehension Dataset From Examinations (RACE) is collected from English reading comprehension exams for Chinese middle and high school students. The high school split (which is more challenging) is used, and the implicit "fill in the blank" style questions (which are unnatural for this task) are filtered out.
Download size: 1.49 MiB
Dataset size: 3.53 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	674

Examples (tfds.as_dataframe):

Citation:

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/relation_extraction

Config description: Given a slot-filling dataset, relations among entities are systematically transformed into question-answer pairs using templates. For example, the educated_at(x, y) relationship between two entities x and y appearing in a sentence can be expressed as "Where was x educated at?" with answer y. Multiple templates are collected for each type of relation. The dataset's zeroshot benchmark split (generalization to unseen relations) is used, and only the positive examples are kept.
Download size: 830.88 KiB
Dataset size: 3.71 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	2,948

Examples (tfds.as_dataframe):

Citation:

@inproceedings{levy-etal-2017-zero,
    title = "Zero-Shot Relation Extraction via Reading Comprehension",
    author = "Levy, Omer  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 21st Conference on Computational Natural Language Learning ({C}o{NLL} 2017)",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/K17-1034",
    doi = "10.18653/v1/K17-1034",
    pages = "333--342",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/textbook_qa

Config description: TextbookQA is collected from lessons in middle school Life Science, Earth Science, and Physical Science textbooks. Questions that are accompanied by a diagram, or that are "True or False" questions, are not included.
Download size: 1.79 MiB
Dataset size: 14.04 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	1,503

Examples (tfds.as_dataframe):

Citation:

@inproceedings{kembhavi2017you,
    title={Are you smarter than a sixth grader? textbook question answering for multimodal machine comprehension},
    author={Kembhavi, Aniruddha and Seo, Minjoon and Schwenk, Dustin and Choi, Jonghyun and Farhadi, Ali and Hajishirzi, Hannaneh},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern recognition},
    pages={4999--5007},
    year={2017}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.
