# voxceleb
**Warning:** Manual download required. See instructions below.

- **Description**:

A large-scale dataset for speaker identification, collected from 1,251
speakers, with over 150k samples in total. This release contains the audio
part of the VoxCeleb1.1 dataset.

- **Additional Documentation**:
  [Explore on Papers With Code](https://paperswithcode.com/dataset/voxceleb1)

- **Homepage**:
  [http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html](http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html)

- **Source code**:
  [`tfds.audio.Voxceleb`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/audio/voxceleb.py)

- **Versions**:

  - **`1.2.1`** (default): Add youtube_id field.

- **Download size**: `4.68 MiB`

- **Dataset size**: `107.98 GiB`

- **Manual download instructions**: This dataset requires you to download the
  source data manually into `download_config.manual_dir` (defaults to
  `~/tensorflow_datasets/downloads/manual/`): `manual_dir` should contain the
  file `vox_dev_wav.zip`. Instructions for downloading this file are found at
  [http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html](http://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html).
  This dataset requires registration. A loading sketch follows this list.

- **Auto-cached**
  ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)):
  No
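Below is a minimal loading sketch (not part of the official TFDS docs). It
assumes `vox_dev_wav.zip` has already been downloaded after registration and
placed in the manual directory; the path in the comment is the TFDS default.

    import tensorflow_datasets as tfds

    # vox_dev_wav.zip is expected under ~/tensorflow_datasets/downloads/manual/
    # (or whatever manual_dir you configure via tfds.download.DownloadConfig).
    ds, info = tfds.load('voxceleb', split='train', with_info=True)

    print(info.features)                      # 'audio', 'label', 'youtube_id'
    print(info.splits['train'].num_examples)  # 134,000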
- **Splits**:

| Split          | Examples |
|----------------|----------|
| `'test'`       | 7,972    |
| `'train'`      | 134,000  |
| `'validation'` | 6,670    |
- **Feature structure**:

    FeaturesDict({
        'audio': Audio(shape=(None,), dtype=int64),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=1252),
        'youtube_id': Text(shape=(), dtype=string),
    })
- **Feature documentation**:

| Feature    | Class        | Shape   | Dtype  | Description |
|------------|--------------|---------|--------|-------------|
|            | FeaturesDict |         |        |             |
| audio      | Audio        | (None,) | int64  |             |
| label      | ClassLabel   |         | int64  |             |
| youtube_id | Text         |         | string |             |

- **Supervised keys** (See
  [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):
  `('audio', 'label')`

- **Figure**
  ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)):
  Not supported.
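As an illustration of the feature structure and supervised keys above, the
sketch below iterates a few examples with `as_supervised=True`. It assumes the
dataset has already been prepared as described in the manual download
instructions.

    import tensorflow_datasets as tfds

    # With as_supervised=True, each example is an (audio, label) tuple,
    # matching the supervised keys ('audio', 'label').
    ds = tfds.load('voxceleb', split='validation', as_supervised=True)

    for audio, label in ds.take(2):
        # 'audio' is a variable-length int64 waveform; 'label' indexes one of
        # the 1,252 speaker classes.
        print(audio.shape, audio.dtype, int(label))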
- **Citation**:

    @InProceedings{Nagrani17,
      author = "Nagrani, A. and Chung, J.~S. and Zisserman, A.",
      title = "VoxCeleb: a large-scale speaker identification dataset",
      booktitle = "INTERSPEECH",
      year = "2017",
    }