q_re_cc

  • Description:

A dataset containing 14K conversations with 81K question-answer pairs. QReCC is built on questions from TREC CAsT, QuAC, and Google Natural Questions.

Split    Examples
'test'   16,451
'train'  63,501
  • Feature structure:
FeaturesDict({
    'answer': Text(shape=(), dtype=string),
    'answer_url': Text(shape=(), dtype=string),
    'context': Sequence(Text(shape=(), dtype=string)),
    'conversation_id': Scalar(shape=(), dtype=int32),
    'question': Text(shape=(), dtype=string),
    'question_rewrite': Text(shape=(), dtype=string),
    'source': Text(shape=(), dtype=string),
    'turn_id': Scalar(shape=(), dtype=int32),
})
  • Feature documentation:
Feature           Class           Shape    Dtype   Description
                  FeaturesDict
answer            Text                     string
answer_url        Text                     string
context           Sequence(Text)  (None,)  string
conversation_id   Scalar                   int32   The id of the conversation.
question          Text                     string
question_rewrite  Text                     string
source            Text                     string  The original source of the data: either QuAC, CAsT, or Natural Questions.
turn_id           Scalar                   int32   The id of the conversation turn, within a conversation.
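
Because each example is a single turn identified by a conversation id and a turn id, a common first step is to regroup turns into whole conversations. The following is a minimal sketch of that, not part of the dataset's own tooling: the helper names `group_by_conversation` and `load_qrecc` are hypothetical, the toy records are hand-written placeholders rather than real QReCC rows, and loading assumes the `tensorflow_datasets` package is installed and can fetch `q_re_cc`.

```python
from collections import defaultdict


def group_by_conversation(examples):
    """Group per-turn examples by conversation_id, sorted by turn_id."""
    conversations = defaultdict(list)
    for ex in examples:
        conversations[int(ex["conversation_id"])].append(ex)
    for turns in conversations.values():
        turns.sort(key=lambda ex: int(ex["turn_id"]))
    return dict(conversations)


def load_qrecc(split="train"):
    """Hypothetical loader; yields dicts keyed by the feature names above."""
    # Imported lazily so the grouping helper works without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.as_numpy(tfds.load("q_re_cc", split=split))


# Hand-written placeholder records (NOT real dataset rows), using a subset
# of the documented features to show the expected shape of each example.
toy_examples = [
    {"conversation_id": 1, "turn_id": 2, "question": "placeholder question B"},
    {"conversation_id": 1, "turn_id": 1, "question": "placeholder question A"},
    {"conversation_id": 2, "turn_id": 1, "question": "placeholder question C"},
]

grouped = group_by_conversation(toy_examples)
for conv_id, turns in grouped.items():
    print(conv_id, [t["turn_id"] for t in turns])
```

With the real data, `group_by_conversation(load_qrecc("test"))` would apply the same grouping; text features then arrive as byte strings and may need decoding.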
  • Citation:
@article{qrecc,
  title={Open-Domain Question Answering Goes Conversational via Question Rewriting},
  author={Anantha, Raviteja and Vakulenko, Svitlana and Tu, Zhucheng and Longpre, Shayne and Pulman, Stephen and Chappidi, Srinivas},
  journal={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year={2021}
}