librispeech

Description:

LibriSpeech is a corpus of approximately 1,000 hours of read English speech with a sampling rate of 16 kHz, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from read audiobooks from the LibriVox project and has been carefully segmented and aligned.

Lazy audio decoding is recommended for faster reads and a smaller dataset size:
- Install the tensorflow_io library: pip install tensorflow-io
- Enable lazy decoding: tfds.load('librispeech', builder_kwargs={'config': 'lazy_decode'})
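
The two steps above can be wrapped in a small helper. This is a minimal sketch, not part of TFDS: the names `load_kwargs` and `load_librispeech` and the `lazy` flag are hypothetical. The `tensorflow_datasets` import is deferred so that defining the helper does not itself require the library, and the first real call downloads and prepares the data.

```python
def load_kwargs(lazy=True):
    """Build keyword arguments for tfds.load (hypothetical helper)."""
    # 'lazy_decode' and 'default' are the two configs listed on this page.
    config = 'lazy_decode' if lazy else 'default'
    return {'name': 'librispeech', 'builder_kwargs': {'config': config}}


def load_librispeech(split, lazy=True):
    """Load one split; downloads and prepares the dataset on first use."""
    # Deferred import: only needed when data is actually loaded.
    import tensorflow_datasets as tfds
    return tfds.load(split=split, **load_kwargs(lazy))
```

For example, `load_librispeech('dev_clean')` would request the smallest development split with lazy decoding enabled.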

Additional documentation: Explore on Papers With Code
Homepage: http://www.openslr.org/12
Source code: tfds.datasets.librispeech.Builder
Download size: 57.14 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'dev_clean'`	2,703
`'dev_other'`	2,864
`'test_clean'`	2,620
`'test_other'`	2,939
`'train_clean100'`	28,539
`'train_clean360'`	104,014
`'train_other500'`	148,688
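
To give a feel for how the splits relate, the sketch below totals the example counts from the table and builds a combined training split. Joining names with `+` follows the TFDS split syntax for concatenating splits; the names `SPLIT_EXAMPLES`, `train_splits`, and `combined_split` are illustrative.

```python
# Example counts copied from the splits table above.
SPLIT_EXAMPLES = {
    'dev_clean': 2703,
    'dev_other': 2864,
    'test_clean': 2620,
    'test_other': 2939,
    'train_clean100': 28539,
    'train_clean360': 104014,
    'train_other500': 148688,
}


def train_splits():
    """Return the training split names in sorted order."""
    return sorted(n for n in SPLIT_EXAMPLES if n.startswith('train_'))


def combined_split(names):
    """Join split names with '+', which TFDS reads as concatenation."""
    return '+'.join(names)


total = sum(SPLIT_EXAMPLES.values())
train_total = sum(SPLIT_EXAMPLES[n] for n in train_splits())
print(total, train_total, combined_split(train_splits()))
```

The resulting string can be passed as the `split` argument of `tfds.load` to read all three training splits as one dataset.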

Feature structure:

FeaturesDict({
    'chapter_id': int64,
    'id': string,
    'speaker_id': int64,
    'speech': Audio(shape=(None,), dtype=int16),
    'text': Text(shape=(), dtype=string),
})
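
Since `speech` is a one-dimensional int16 waveform sampled at 16 kHz, two common preprocessing steps are computing a clip's duration and rescaling samples to floats. The sketch below uses plain Python lists so that it runs without TensorFlow; dividing by 32768 is a widely used convention assumed here, not something the dataset prescribes, and the function names are illustrative.

```python
SAMPLE_RATE_HZ = 16_000  # sampling rate stated in the description
INT16_SCALE = 32768.0    # 2**15, magnitude of the most negative int16 value


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE_HZ):
    """Length of a clip in seconds, given its number of samples."""
    return num_samples / sample_rate


def int16_to_float(samples):
    """Rescale int16 samples into the range [-1.0, 1.0)."""
    return [s / INT16_SCALE for s in samples]
```

On real tensors the same scaling would typically be done by casting to a float dtype and dividing, rather than with a list comprehension.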

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
chapter_id	Tensor		int64
id	Tensor		string
speaker_id	Tensor		int64
speech	Audio	(None,)	int16
text	Text		string

Supervised keys (see as_supervised doc): ('speech', 'text')
Figure (tfds.show_examples): Not supported.
Citation:

@inproceedings{panayotov2015librispeech,
  title={Librispeech: an ASR corpus based on public domain audio books},
  author={Panayotov, Vassil and Chen, Guoguo and Povey, Daniel and Khudanpur, Sanjeev},
  booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on},
  pages={5206--5210},
  year={2015},
  organization={IEEE}
}

librispeech/default (default config)

Config description: Default dataset.
Versions:
- 2.1.1 (default): Fix speech data type with dtype=tf.int16.
- 2.1.2: Add the 'lazy_decode' config.
Dataset size: 304.47 GiB
Examples (tfds.as_dataframe):

librispeech/lazy_decode

Config description: Raw audio dataset.
Versions:
- 2.1.1: Fix speech data type with dtype=tf.int16.
- 2.1.2 (default): Add the 'lazy_decode' config.
Dataset size: 59.37 GiB
Examples (tfds.as_dataframe): Missing.
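
The dataset sizes listed for the two configs can be compared directly. The sketch below also builds a fully qualified dataset name in the `name/config:version` form accepted by `tfds.load`; the helper `qualified_name` and the constant `DATASET_SIZE_GIB` are illustrative.

```python
# Prepared dataset sizes in GiB, copied from the two config sections above.
DATASET_SIZE_GIB = {'default': 304.47, 'lazy_decode': 59.37}


def qualified_name(config, version=None):
    """Build 'librispeech/<config>' with an optional ':<version>' suffix."""
    name = f'librispeech/{config}'
    return f'{name}:{version}' if version else name


saved = DATASET_SIZE_GIB['default'] - DATASET_SIZE_GIB['lazy_decode']
ratio = DATASET_SIZE_GIB['default'] / DATASET_SIZE_GIB['lazy_decode']
print(f'{saved:.2f} GiB saved, {ratio:.2f}x smaller')
print(qualified_name('lazy_decode', '2.1.2'))
```

Pinning the version this way keeps a pipeline on a known release even if the default version changes later.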
