• Description:

BoolQ is a question answering dataset for yes/no questions containing 15,942 examples. The questions are naturally occurring: they are generated in unprompted and unconstrained settings.

Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context. The text-pair classification setup is similar to existing natural language inference tasks.
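
As a rough illustration of the text-pair setup described above, the sketch below turns one (question, passage, answer) triplet into two input strings plus a binary label, roughly the way an NLI-style classifier might consume it. The `BoolQExample` class, the `to_text_pair` helper, and the sample record are all hypothetical and were made up for this example; none of them come from the dataset or from any library.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoolQExample:
    """One BoolQ-style record; field names mirror the dataset's features."""
    question: str
    passage: str
    answer: bool
    title: str = ""  # optional additional context


def to_text_pair(example: BoolQExample, use_title: bool = False) -> Tuple[str, str, int]:
    """Return (text_a, text_b, label) for a text-pair classifier.

    text_a is the question, text_b is the passage (optionally prefixed
    with the page title), and label is 1 for yes and 0 for no.
    """
    text_b = example.passage
    if use_title and example.title:
        text_b = f"{example.title}. {text_b}"
    return example.question, text_b, int(example.answer)


# Invented record for illustration only; it is NOT taken from BoolQ.
sample = BoolQExample(
    question="is the sky blue",
    passage="On a clear day the sky appears blue because of Rayleigh scattering.",
    answer=True,
    title="Sky",
)
text_a, text_b, label = to_text_pair(sample, use_title=True)
```

Prefixing the title is only one possible way to use the optional context; leaving `use_title` at its default keeps the passage unchanged.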

  • Splits:
Split         Examples
'train'       9,427
'validation'  3,270
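
The two listed splits add up to 12,697 examples, fewer than the 15,942 mentioned in the description; this page does not say where the remaining examples live, and the sketch below does not try to account for them. It simply records the listed sizes, computes each split's share, and shows one way a split might be loaded. The dataset name `"bool_q"` passed to `tfds.load` is an assumption (it does not appear in the text above), and `split_fractions` and `load_boolq` are hypothetical helper names.

```python
# Split sizes as listed in the split table above.
SPLIT_SIZES = {"train": 9427, "validation": 3270}


def split_fractions(sizes):
    """Return each split's share of the listed examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_boolq(split="train"):
    """Load one split through TensorFlow Datasets.

    Assumption: the dataset is registered under the name "bool_q".
    The import is deferred so the rest of this snippet works even
    when TensorFlow Datasets is not installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load("bool_q", split=split)


listed_total = sum(SPLIT_SIZES.values())  # 9427 + 3270
fractions = split_fractions(SPLIT_SIZES)
```

With TensorFlow Datasets installed, `load_boolq("validation")` would be expected to return the validation split as a `tf.data.Dataset`; nothing is downloaded until the function is called.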
  • Feature structure:
FeaturesDict({
    'answer': tf.bool,
    'passage': Text(shape=(), dtype=tf.string),
    'question': Text(shape=(), dtype=tf.string),
    'title': Text(shape=(), dtype=tf.string),
})
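
Because the three text features are `tf.string` scalars, they typically surface as `bytes` once a record is converted to NumPy, while `answer` surfaces as a boolean. The sketch below shows one way such a record could be turned into plain Python types. `decode_example` is a hypothetical helper, the UTF-8 encoding is an assumption, and the sample record is invented for illustration rather than taken from the dataset.

```python
from typing import Any, Dict, Mapping

TEXT_FEATURES = ("passage", "question", "title")


def decode_example(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Convert one NumPy-style record into plain Python types.

    String features are assumed to arrive as UTF-8 encoded bytes;
    the answer is coerced to a built-in bool.
    """
    decoded: Dict[str, Any] = {}
    for key in TEXT_FEATURES:
        value = raw[key]
        decoded[key] = value.decode("utf-8") if isinstance(value, bytes) else str(value)
    decoded["answer"] = bool(raw["answer"])
    return decoded


# Invented record for illustration only; it is NOT taken from BoolQ.
raw_record = {
    "answer": False,
    "passage": b"Penguins are flightless birds found mostly in the Southern Hemisphere.",
    "question": b"can penguins fly",
    "title": b"Penguin",
}
record = decode_example(raw_record)
```

When iterating a loaded split, records produced by `tfds.as_numpy(dataset)` could be passed through `decode_example` one at a time, assuming they carry the four feature keys shown above.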
  • Feature documentation:
Feature   Class   Shape  Dtype      Description
answer    Tensor         tf.bool
passage   Text           tf.string
question  Text           tf.string
title     Text           tf.string
  • Citation:
  title =     {BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions},
  author =    {Clark, Christopher and Lee, Kenton and Chang, Ming-Wei and Kwiatkowski, Tom and Collins, Michael and Toutanova, Kristina},
  booktitle = {NAACL},
  year =      {2019},