common_voice

  • Description:

Mozilla Common Voice Dataset

common_voice/en (default config)

  • Config description: Language Code: en

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=17),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64
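
Each config on this page is addressed as `common_voice/<language code>`, with `en` as the default. The following is a minimal sketch of loading one config, assuming the `tensorflow_datasets` package is installed and the dataset can be prepared locally. The helper names `config_name` and `load_common_voice` are hypothetical, and the `"train"` split name is an assumption, since no split names are listed here; `tfds.load` is the standard TFDS entry point.

```python
def config_name(language_code="en"):
    """Build the TFDS name for a Common Voice config, e.g. 'common_voice/de'."""
    return f"common_voice/{language_code}"


def load_common_voice(language_code="en", split="train"):
    """Load one language config (hypothetical helper, not part of TFDS).

    The import is deferred so that config_name() works without TensorFlow.
    The 'train' default is an assumption; split names are not listed above.
    """
    import tensorflow_datasets as tfds

    return tfds.load(config_name(language_code), split=split)
```

For example, `load_common_voice("de")` would request the `common_voice/de` config described in the next section.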

common_voice/de

  • Config description: Language Code: de

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64
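
The `voice` feature has shape `(None,)`: a one-dimensional sequence of integer samples whose length varies from clip to clip. Clips therefore need to be padded to a common length before they can be stacked into a batch. The sketch below illustrates the idea in plain Python; `pad_clips` and the sample values are made up for illustration. In a `tf.data` pipeline, `padded_batch` serves the same purpose.

```python
def pad_clips(clips, pad_value=0):
    """Right-pad variable-length sample sequences to the longest one.

    Returns the padded sequences and the original lengths, so that the
    padding can be masked out later.
    """
    max_len = max((len(c) for c in clips), default=0)
    padded = [list(c) + [pad_value] * (max_len - len(c)) for c in clips]
    lengths = [len(c) for c in clips]
    return padded, lengths


# Made-up integer samples standing in for three decoded clips.
clips = [[3, -2, 7], [5], [1, 1, -4, 9]]
padded, lengths = pad_clips(clips)
```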

common_voice/fr

  • Config description: Language Code: fr

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=19),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64
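
The `upvotes` and `downvotes` features are per-clip integer counts, so they could be used to keep only clips that reviewers mostly approved. The sketch below shows one possible filter on plain dictionaries. The rule itself (a minimum number of upvotes, and more upvotes than downvotes) is an illustrative assumption, not a rule stated by the dataset, and the records are made up.

```python
def is_well_rated(example, min_upvotes=2):
    """Illustrative rule: enough upvotes, and more upvotes than downvotes."""
    return (example["upvotes"] >= min_upvotes
            and example["upvotes"] > example["downvotes"])


# Made-up records that mimic the 'sentence', 'upvotes', 'downvotes' features.
examples = [
    {"sentence": "first clip", "upvotes": 2, "downvotes": 0},
    {"sentence": "second clip", "upvotes": 1, "downvotes": 1},
    {"sentence": "third clip", "upvotes": 3, "downvotes": 4},
]
kept = [ex["sentence"] for ex in examples if is_well_rated(ex)]
```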

common_voice/cy

  • Config description: Language Code: cy

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/br

  • Config description: Language Code: br

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/cv

  • Config description: Language Code: cv

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/tr

  • Config description: Language Code: tr

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/tt

  • Config description: Language Code: tt

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=0),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/ky

  • Config description: Language Code: ky

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/ga-IE

  • Config description: Language Code: ga-IE

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/kab

  • Config description: Language Code: kab

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/ca

  • Config description: Language Code: ca

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=6),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/zh-TW

  • Config description: Language Code: zh-TW

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/sl

  • Config description: Language Code: sl

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/it

  • Config description: Language Code: it

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/nl

  • Config description: Language Code: nl

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/cnh

  • Config description: Language Code: cnh

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=1),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64

common_voice/eo

  • Config description: Language Code: eo

  • Feature structure:

FeaturesDict({
    'accent': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'age': Text(shape=(), dtype=tf.string),
    'client_id': Text(shape=(), dtype=tf.string),
    'downvotes': tf.int32,
    'gender': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    'sentence': Text(shape=(), dtype=tf.string),
    'upvotes': tf.int32,
    'voice': Audio(shape=(None,), dtype=tf.int64),
})
  • Feature documentation:
Feature    Class         Shape    Dtype      Description
           FeaturesDict
accent     ClassLabel             tf.int64
age        Text                   tf.string
client_id  Text                   tf.string
downvotes  Tensor                 tf.int32
gender     ClassLabel             tf.int64
sentence   Text                   tf.string
upvotes    Tensor                 tf.int32
voice      Audio         (None,)  tf.int64
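
The configs listed above share one feature structure and differ only in the `num_classes` value of the `accent` label. The sketch below collects those values, copied from the feature structures above, into a lookup table so they can be compared at a glance. The names `ACCENT_NUM_CLASSES` and `configs_without_accent_labels` are hypothetical and not part of TFDS.

```python
# Language code -> num_classes of the 'accent' ClassLabel, as listed above.
ACCENT_NUM_CLASSES = {
    "en": 17, "de": 10, "fr": 19, "cy": 2, "br": 1, "cv": 0,
    "tr": 1, "tt": 0, "ky": 1, "ga-IE": 3, "kab": 1, "ca": 6,
    "zh-TW": 1, "sl": 1, "it": 1, "nl": 3, "cnh": 1, "eo": 2,
}


def configs_without_accent_labels():
    """Return the language codes whose 'accent' label defines zero classes."""
    return sorted(code for code, n in ACCENT_NUM_CLASSES.items() if n == 0)


# Config with the largest number of accent classes.
most_accents = max(ACCENT_NUM_CLASSES, key=ACCENT_NUM_CLASSES.get)
```

With these values, `cv` and `tt` define no accent classes at all, so the `accent` feature is unlikely to be useful as a training target for those configs.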