ljspeech
This is a public domain speech dataset consisting of 13,100 short audio clips of
a single speaker reading passages from 7 non-fiction books. A transcription is
provided for each clip. Clips vary in length from 1 to 10 seconds and have a
total length of approximately 24 hours.
The texts were published between 1884 and 1964, and are in the public domain.
The audio was recorded in 2016-17 by the LibriVox project and is also in the
public domain.
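
As a quick sanity check on the figures above, the mean clip length implied by 13,100 clips and roughly 24 hours of audio can be worked out directly. The 24-hour total is approximate, so the result is only a rough average, not a value reported by the dataset.

```python
# Rough average clip length implied by the dataset description.
NUM_CLIPS = 13_100   # clip count from the description
TOTAL_HOURS = 24     # approximate total length from the description

total_seconds = TOTAL_HOURS * 3600
mean_clip_seconds = total_seconds / NUM_CLIPS

print(f"{mean_clip_seconds:.1f} s per clip on average")
```

The result falls inside the stated 1 to 10 second range, so the three figures are mutually consistent.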
- **Splits**:

| Split     | Examples |
|-----------|----------|
| `'train'` | 13,100   |
- **Feature structure**:

    FeaturesDict({
        'id': string,
        'speech': Audio(shape=(None,), dtype=int16),
        'text': Text(shape=(), dtype=string),
        'text_normalized': Text(shape=(), dtype=string),
    })
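
The feature structure above can be read with the standard `tfds.load` and `tfds.as_numpy` calls. The following is a minimal sketch, not the builder's own code: `iter_examples`, `pcm16_to_float`, and `clip_duration_seconds` are hypothetical helper names introduced here, and the 22,050 Hz sample rate is an assumption that is not stated on this page (check the dataset homepage before relying on it).

```python
from typing import Iterator, List, Sequence

# Assumption: sample rate is not given on this page; verify before use.
ASSUMED_SAMPLE_RATE_HZ = 22_050


def pcm16_to_float(samples: Sequence[int]) -> List[float]:
    """Scale int16 PCM samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def clip_duration_seconds(
    num_samples: int, sample_rate_hz: int = ASSUMED_SAMPLE_RATE_HZ
) -> float:
    """Duration of a clip given its sample count and a sample rate."""
    return num_samples / sample_rate_hz


def iter_examples(split: str = "train") -> Iterator[dict]:
    """Yield decoded examples; calling this triggers the dataset download."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    import tensorflow_datasets as tfds

    ds = tfds.load("ljspeech", split=split)
    for example in tfds.as_numpy(ds):
        yield {
            # String features arrive as bytes and are decoded here.
            "id": example["id"].decode("utf-8"),
            "text": example["text"].decode("utf-8"),
            "text_normalized": example["text_normalized"].decode("utf-8"),
            # 1-D int16 array of variable length, per the feature structure.
            "speech": example["speech"],
        }


# Example use (downloads and prepares the dataset on first call):
#   first = next(iter_examples())
#   print(first["id"], clip_duration_seconds(len(first["speech"])))
```

Keeping the `tensorflow_datasets` import inside `iter_examples` is a design choice for this sketch only, so the small conversion helpers can be reused or tested on their own.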
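- **Feature documentation**:

| Feature         | Class        | Shape   | Dtype  | Description |
|-----------------|--------------|---------|--------|-------------|
|                 | FeaturesDict |         |        |             |
| id              | Tensor       |         | string |             |
| speech          | Audio        | (None,) | int16  |             |
| text            | Text         |         | string |             |
| text_normalized | Text         |         | string |             |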
- **Citation**:

    @misc{ljspeech17,
      author       = {Keith Ito},
      title        = {The LJ Speech Dataset},
      howpublished = {\url{https://keithito.com/LJ-Speech-Dataset/}},
      year         = 2017
    }
Last updated 2022-12-13 UTC.
- **Additional Documentation**: [Explore on Papers With Code](https://paperswithcode.com/dataset/ljspeech)

- **Homepage**: <https://keithito.com/LJ-Speech-Dataset/>

- **Source code**: [`tfds.datasets.ljspeech.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/ljspeech/ljspeech_dataset_builder.py)

- **Versions**:

  - **`1.1.1`** (default): Fix speech data type with dtype=tf.int16.

- **Download size**: `2.56 GiB`

- **Dataset size**: `10.73 GiB`

- **Auto-cached** ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)): No

- **Supervised keys** (See [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)): `('text_normalized', 'speech')`

- **Figure** ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)): Not supported.
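
The supervised keys `('text_normalized', 'speech')` mean that loading with `as_supervised=True` yields `(text_normalized, speech)` pairs instead of full feature dictionaries. The sketch below illustrates that mapping on a plain dictionary; `to_supervised` and `load_supervised` are hypothetical helper names, and `to_supervised` only mimics the selection conceptually rather than reproducing TFDS internals.

```python
from typing import Any, Mapping, Tuple

# Supervised keys as listed on this page: (input, target).
SUPERVISED_KEYS = ("text_normalized", "speech")


def to_supervised(
    example: Mapping[str, Any], keys: Tuple[str, str] = SUPERVISED_KEYS
) -> Tuple[Any, Any]:
    """Pick the (input, target) pair out of a full feature dictionary."""
    input_key, target_key = keys
    return example[input_key], example[target_key]


def load_supervised(split: str = "train"):
    """Load the dataset as (text_normalized, speech) tuples via TFDS."""
    # Imported lazily; calling this triggers the dataset download.
    import tensorflow_datasets as tfds

    return tfds.load("ljspeech", split=split, as_supervised=True)


# Illustration with a made-up example (values are not real dataset content):
fake_example = {
    "id": "clip-0",
    "text": "Dr. Smith",
    "text_normalized": "Doctor Smith",
    "speech": [0, 12, -7],
}
text_input, audio_target = to_supervised(fake_example)
```

With this ordering the normalized text is the model input and the waveform is the target, which matches a text-to-speech setup; for speech recognition the pair would need to be swapped by the caller.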