gtzan
The dataset consists of 1,000 audio tracks, each 30 seconds long. It contains 10
genres, each represented by 100 tracks. All tracks are 22,050 Hz, mono, 16-bit
audio files in .wav format.
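The figures in the description imply a nominal sample count per track and a raw PCM footprint for the whole collection. The sketch below works through that arithmetic. The constant names are illustrative, and the values are nominal: the `audio` feature is declared with a variable-length shape, so individual tracks may not be exactly 30 seconds long.

```python
# Nominal figures taken from the dataset description.
SAMPLE_RATE_HZ = 22_050
TRACK_SECONDS = 30
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM
NUM_TRACKS = 1_000

samples_per_track = SAMPLE_RATE_HZ * TRACK_SECONDS
bytes_per_track = samples_per_track * BYTES_PER_SAMPLE
total_gib = bytes_per_track * NUM_TRACKS / 2**30

print(samples_per_track)    # 661500
print(bytes_per_track)      # 1323000
print(round(total_gib, 2))  # 1.23
```

This estimate covers uncompressed 16-bit samples only; it ignores .wav headers and any storage overhead added when the dataset is prepared.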
The genres are:

- blues
- classical
- country
- disco
- hiphop
- jazz
- metal
- pop
- reggae
- rock
Splits:

| Split   | Examples |
|---------|----------|
| 'train' | 1,000    |
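Because the dataset ships with a single 'train' split, evaluation data has to be carved out of it. One option is the percent-slicing syntax accepted by `tfds.load`. The sketch below is a minimal example under that assumption; the helper names `percent_splits` and `load_gtzan_splits` and the 80/10/10 proportions are hypothetical choices, not part of the dataset.

```python
def percent_splits(train_pct=80, val_pct=10):
    """Build TFDS slicing strings that carve 'train' into three parts."""
    if train_pct <= 0 or val_pct <= 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must be positive and leave room for a test part")
    val_end = train_pct + val_pct
    return [
        f"train[:{train_pct}%]",
        f"train[{train_pct}%:{val_end}%]",
        f"train[{val_end}%:]",
    ]


def load_gtzan_splits(train_pct=80, val_pct=10):
    """Return [train, validation, test] datasets sliced from 'train'."""
    # Imported lazily: the first call downloads and prepares the dataset.
    import tensorflow_datasets as tfds

    return tfds.load(
        "gtzan",
        split=percent_splits(train_pct, val_pct),
        as_supervised=True,
    )


print(percent_splits())  # ['train[:80%]', 'train[80%:90%]', 'train[90%:]']
```

Percent slices are taken by position, so they are not guaranteed to hold the same number of tracks per genre. A stratified split would need to be built separately.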
Feature structure:

FeaturesDict({
'audio': Audio(shape=(None,), dtype=int64),
'audio/filename': Text(shape=(), dtype=string),
'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})
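The feature structure stores audio as int64 samples, while most models expect floating-point waveforms. The sketch below loads a few examples as `(audio, label)` pairs and rescales them. It assumes the stored values span the signed 16-bit range described for the source files, and the helper names are hypothetical.

```python
PCM16_FULL_SCALE = 32768.0  # 2**15; assumes samples span the signed 16-bit range


def pcm16_to_unit_range(samples):
    """Scale 16-bit PCM values (a Python int or a NumPy array) into [-1.0, 1.0)."""
    return samples / PCM16_FULL_SCALE


def iter_normalized_examples(limit=3):
    """Yield (waveform, label) pairs for the first `limit` examples."""
    # Imported lazily: the first call downloads and prepares the dataset.
    import tensorflow_datasets as tfds

    ds = tfds.load("gtzan", split="train", as_supervised=True)
    for audio, label in tfds.as_numpy(ds.take(limit)):
        # audio arrives as an int64 NumPy array of variable length.
        yield pcm16_to_unit_range(audio.astype("float32")), int(label)
```

Calling `iter_normalized_examples()` triggers the dataset download on first use, so the scaling helper is kept separate and can be checked on its own, for example `pcm16_to_unit_range(16384)` gives `0.5`.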
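Feature documentation:

| Feature        | Class        | Shape   | Dtype  | Description |
|----------------|--------------|---------|--------|-------------|
|                | FeaturesDict |         |        |             |
| audio          | Audio        | (None,) | int64  |             |
| audio/filename | Text         |         | string |             |
| label          | ClassLabel   |         | int64  |             |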
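The `label` feature is an integer class index, so reading results usually means mapping it back to a genre name. The sketch below hard-codes the ten genre names and assumes that class indices follow their alphabetical order; that ordering is an assumption, and the `GENRES` list and helper names are illustrative. The `tfds_genre_names` helper reads the names from the dataset metadata instead, which is the safer source when TensorFlow Datasets is installed.

```python
# Assumed index order: alphabetical, one entry per genre.
GENRES = [
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]


def genre_name(label):
    """Map an integer class index to its genre name."""
    return GENRES[int(label)]


def genre_index(name):
    """Map a genre name to its integer class index."""
    return GENRES.index(name)


def tfds_genre_names():
    """Read the class names from the dataset metadata."""
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.builder("gtzan").info.features["label"].names


print(genre_name(0), genre_name(9))  # blues rock
```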
Citation:

@misc{tzanetakis_essl_cook_2001,
author = "Tzanetakis, George and Essl, Georg and Cook, Perry",
title = "Automatic Musical Genre Classification Of Audio Signals",
url = "http://ismir2001.ismir.net/pdf/tzanetakis.pdf",
publisher = "The International Society for Music Information Retrieval",
year = "2001"
}
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License.
Last updated 2022-12-06 UTC.
- Additional Documentation: Explore on Papers With Code (https://paperswithcode.com/dataset/gtzan)
- Homepage: http://marsyas.info/index.html
- Source code: tfds.audio.gtzan.GTZAN (https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/audio/gtzan/gtzan.py)
- Versions: 1.0.0 (default): No release notes.
- Download size: 1.14 GiB
- Dataset size: 3.71 GiB
- Auto-cached (documentation: https://www.tensorflow.org/datasets/performances#auto-caching): No
- Supervised keys (see the as_supervised doc: https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args): ('audio', 'label')
- Figure (tfds.show_examples: https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples): Not supported.