- Description:
Language modeling resources to be used in conjunction with the LibriSpeech ASR corpus.
Homepage: http://www.openslr.org/11
Source code: tfds.datasets.librispeech_lm.Builder
Versions:
0.1.0 (default): No release notes.
Download size: 1.40 GiB
Dataset size: 4.62 GiB
Auto-cached (documentation): No
Splits:
| Split | Examples |
|---|---|
| 'train' | 40,418,260 |
- Feature structure:
FeaturesDict({
'text': Text(shape=(), dtype=string),
})
- Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| text | Text | | string | |
Supervised keys (See as_supervised doc): ('text', 'text')
Figure (tfds.show_examples): Not supported.
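A minimal sketch of reading the 'text' feature from the 'train' split, assuming the `tensorflow_datasets` package is installed and the builder is registered under the name `librispeech_lm`. The helper names `decode_text` and `iter_train_text` are hypothetical and are not part of the library.

```python
def decode_text(example):
    """Return the 'text' feature of one example as a Python str.

    Hypothetical helper: assumes `example` is a dict with a 'text' key,
    as described by the FeaturesDict above.
    """
    raw = example["text"]
    # tfds.as_numpy yields scalar string features as bytes; decode to str.
    return raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)


def iter_train_text(limit=5):
    """Yield up to `limit` decoded lines from the 'train' split."""
    # Imported lazily so decode_text works without TensorFlow installed.
    import tensorflow_datasets as tfds

    # The first call downloads and prepares the dataset (1.40 GiB download).
    ds = tfds.load("librispeech_lm", split="train")
    for example in tfds.as_numpy(ds.take(limit)):
        yield decode_text(example)


if __name__ == "__main__":
    for line in iter_train_text(limit=3):
        print(line)
```

Passing `as_supervised=True` to `tfds.load` would instead yield tuples built from the supervised keys listed above, so each element would be a pair in which both entries are the same text.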
- Citation:
@inproceedings{panayotov2015librispeech,
title={Librispeech: an ASR corpus based on public domain audio books},
author={Panayotov, Vassil and Chen, Guoguo and Povey, Daniel and Khudanpur, Sanjeev},
booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on},
pages={5206--5210},
year={2015},
organization={IEEE}
}