• Description:

VoxForge is a language classification dataset. It consists of audio clips that users submitted to the VoxForge website. This release collects data from six languages: English, Spanish, French, German, Russian, and Italian. Because the website is continually updated, this release contains only recordings submitted before 2020-01-01, for the sake of reproducibility. The samples are split among train, validation, and test sets so that all samples from a given speaker belong to exactly one split.
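The speaker-disjoint property described above can be illustrated with a small sketch. This is not the procedure used to build this release, which the description does not specify; it is one common way to get the same guarantee, by deriving the split from a hash of the speaker ID. The function names `assign_split` and `split_by_speaker`, the 10% validation and test fractions, and the `clip` field are all hypothetical.

```python
import hashlib


def assign_split(speaker_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Deterministically map a speaker to one split (illustrative fractions)."""
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a roughly uniform value in [0, 1].
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < test_frac:
        return "test"
    if bucket < test_frac + val_frac:
        return "validation"
    return "train"


def split_by_speaker(samples):
    """Group samples so that every speaker lands in exactly one split."""
    splits = {"train": [], "validation": [], "test": []}
    for sample in samples:
        # The split depends only on the speaker, never on the individual clip.
        splits[assign_split(sample["speaker_id"])].append(sample)
    return splits


if __name__ == "__main__":
    demo = [{"speaker_id": f"speaker{i % 5}", "clip": i} for i in range(20)]
    for name, items in split_by_speaker(demo).items():
        print(name, sorted({s["speaker_id"] for s in items}))
```

Because the assignment is a pure function of the speaker ID, it stays stable as new clips arrive, which is one reason hashing is often preferred over random shuffling for this kind of split.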

Split Examples
  • Feature structure:
    'audio': Audio(shape=(None,), dtype=int64),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=6),
    'speaker_id': string,
  • Feature documentation:
Feature     Class       Shape    Dtype   Description
audio       Audio       (None,)  int64
label       ClassLabel           int64
speaker_id  Tensor               string
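
As a rough illustration of the feature structure above, the following sketch checks that a single example has the three documented fields with the expected kinds of values. It uses plain Python lists and integers as stand-ins; in practice the audio would typically arrive as an array or tensor of int64 samples. The helper name `is_valid_example` and the sample values are hypothetical.

```python
NUM_CLASSES = 6  # taken from ClassLabel(num_classes=6) in the feature structure


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_example(example: dict) -> bool:
    """Return True if the example matches the documented feature layout."""
    audio = example.get("audio")
    label = example.get("label")
    speaker_id = example.get("speaker_id")

    # 'audio' is a variable-length 1-D sequence of integer samples: shape (None,).
    audio_ok = isinstance(audio, (list, tuple)) and all(_is_int(x) for x in audio)
    # 'label' is a scalar class index in [0, NUM_CLASSES).
    label_ok = _is_int(label) and 0 <= label < NUM_CLASSES
    # 'speaker_id' is a scalar string.
    speaker_ok = isinstance(speaker_id, str)
    return audio_ok and label_ok and speaker_ok


if __name__ == "__main__":
    example = {"audio": [0, 12, -7, 3], "label": 2, "speaker_id": "speaker-001"}
    print(is_valid_example(example))
```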
