q_re_cc

  • Description:

A dataset containing 14K conversations with 81K question-answer pairs. QReCC is built on questions from TREC CAsT, QuAC, and Google Natural Questions.

Split      Examples
'test'     16,451
'train'    63,501
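
The split sizes above allow a quick sanity check before loading anything. The sketch below assumes the `tensorflow_datasets` package is installed and that the dataset is registered under the name `q_re_cc` (taken from the title of this page). `SPLIT_SIZES`, `split_fraction`, and `load_split` are hypothetical helper names introduced for illustration, not part of any library.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"test": 16_451, "train": 63_501}


def split_fraction(name: str) -> float:
    """Return the share of all examples that fall into the given split."""
    return SPLIT_SIZES[name] / sum(SPLIT_SIZES.values())


def load_split(name: str):
    """Load one split as a tf.data.Dataset (hypothetical helper).

    The import is kept local so the arithmetic above works without TensorFlow.
    """
    if name not in SPLIT_SIZES:
        raise ValueError(f"unknown split: {name!r}")
    import tensorflow_datasets as tfds  # assumption: package is installed

    return tfds.load("q_re_cc", split=name)


if __name__ == "__main__":
    print("total examples:", sum(SPLIT_SIZES.values()))
    print("test share:", round(split_fraction("test"), 3))
```

The two splits together hold 79,952 examples, so the test split is roughly a fifth of the data.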
  • Feature structure:
FeaturesDict({
    'answer': Text(shape=(), dtype=string),
    'answer_url': Text(shape=(), dtype=string),
    'context': Sequence(Text(shape=(), dtype=string)),
    'conversation_id': Scalar(shape=(), dtype=int32, description=The id of the conversation.),
    'question': Text(shape=(), dtype=string),
    'question_rewrite': Text(shape=(), dtype=string),
    'source': Text(shape=(), dtype=string),
    'turn_id': Scalar(shape=(), dtype=int32, description=The id of the conversation turn, within a conversation.),
})
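
As an illustration of how the fields in this structure fit together, the sketch below turns one record into a single input string for a question-rewriting or question-answering model. It assumes that `context` holds the earlier turns of the conversation in order. The function name `build_model_input`, the separator string, and every value in the sample record are made up for this example; real records come from the dataset itself, where string fields may arrive as bytes rather than `str`.

```python
from typing import Any, Dict


def build_model_input(example: Dict[str, Any], sep: str = " [SEP] ") -> str:
    """Join the conversation history and the current question into one string."""
    history = list(example["context"])  # earlier turns (assumed to be in order)
    return sep.join(history + [example["question"]])


# Made-up record shaped like the FeaturesDict above; none of these values
# are taken from the actual dataset.
example = {
    "answer": "A placeholder answer to the current question.",
    "answer_url": "https://example.com/placeholder",
    "context": ["Who directed the film?", "A placeholder answer."],
    "conversation_id": 1,
    "question": "When was it released?",
    "question_rewrite": "When was the film released?",
    "source": "placeholder-source",
    "turn_id": 2,
}

if __name__ == "__main__":
    print(build_model_input(example))
    # The self-contained rewrite can serve as a training target.
    print(example["question_rewrite"])
```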
  • Feature documentation:

Feature           Class           Shape    Dtype   Description
                  FeaturesDict
answer            Text                     string
answer_url        Text                     string
context           Sequence(Text)  (None,)  string
conversation_id   Scalar                   int32   The id of the conversation.
question          Text                     string
question_rewrite  Text                     string
source            Text                     string  The original source of the data: QuAC, CAsT, or Natural Questions.
turn_id           Scalar                   int32   The id of the conversation turn, within a conversation.
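
Because each record is a single turn identified by `conversation_id` and `turn_id`, whole conversations can be rebuilt by grouping on the first and sorting on the second. The following is a minimal sketch over plain Python dictionaries; `group_by_conversation` is a hypothetical helper, and the sample records are made up and contain only the fields the helper needs.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_conversation(
    examples: Iterable[Dict[str, Any]],
) -> Dict[int, List[Dict[str, Any]]]:
    """Group turn-level records by conversation_id, ordered by turn_id."""
    conversations: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for ex in examples:
        conversations[ex["conversation_id"]].append(ex)
    for turns in conversations.values():
        turns.sort(key=lambda ex: ex["turn_id"])
    return dict(conversations)


# Made-up records, deliberately out of order.
records = [
    {"conversation_id": 7, "turn_id": 2, "question": "second turn"},
    {"conversation_id": 3, "turn_id": 1, "question": "only turn"},
    {"conversation_id": 7, "turn_id": 1, "question": "first turn"},
]

if __name__ == "__main__":
    for conv_id, turns in sorted(group_by_conversation(records).items()):
        print(conv_id, [t["question"] for t in turns])
```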
  • Citation:
@article{qrecc,
  title={Open-Domain Question Answering Goes Conversational via Question Rewriting},
  author={Anantha, Raviteja and Vakulenko, Svitlana and Tu, Zhucheng and Longpre, Shayne and Pulman, Stephen and Chappidi, Srinivas},
  journal={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year={2021}
}