Train an audio classifier

Train your own audio classifier on a custom dataset. A model pretrained on the 527 AudioSet classes is also included.

Model | Trainable | Inference | Pre-trained

Published by DEEP-Hybrid-DataCloud Consortium

Model Description


This is a plug-and-play tool for audio classification with deep learning. It lets users classify their own audio samples as well as train their own classifier for a custom problem.

The classifier is currently pretrained on the 527 high-level classes from the AudioSet dataset.

The PREDICT method expects an audio file as input (or the URL of an audio file) and returns a JSON with the top 5 predictions. Most audio file formats are supported (see FFMPEG compatible formats).
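To illustrate what "top 5 predictions" means, here is a minimal sketch of how the five highest-scoring classes could be picked from per-class scores and serialised as JSON. The label names, the scores, and the `label`/`score` JSON layout are hypothetical and chosen only for illustration; they are not the module's actual response schema.

```python
import json


def top_k(labels, scores, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical class names and scores (the real model covers 527 AudioSet classes).
labels = ["Speech", "Music", "Dog", "Rain", "Siren", "Engine", "Applause"]
scores = [0.62, 0.91, 0.05, 0.33, 0.12, 0.48, 0.27]

# Hypothetical JSON layout: one object per prediction, best first.
predictions = [{"label": label, "score": score} for label, score in top_k(labels, scores)]
print(json.dumps(predictions, indent=2))
```

Sorting all classes is sufficient for a few hundred labels; for much larger label sets, `heapq.nlargest` would avoid the full sort.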


Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter, 'Audio Set: An ontology and human-labeled dataset for audio events', IEEE ICASSP, 2017.

Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley, 'Audio Set classification with attention model: A probabilistic perspective', arXiv preprint arXiv:1711.00927, 2017.

Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang, 'Multi-level Attention Model for Weakly Supervised Audio Classification', arXiv preprint arXiv:1803.02353, 2018.

S. Hershey, S. Chaudhuri, D. P. W. Ellis, J. F. Gemmeke, A. Jansen, R. C. Moore, M. Plakal, D. Platt, R. A. Saurous, B. Seybold et al., 'CNN architectures for large-scale audio classification', arXiv preprint arXiv:1609.09430, 2016.

Test this module

You can test and execute this module in various ways.

Execute locally on your computer using Docker

You can run this module directly on your computer, assuming that you have Docker installed, by following these steps:

$ docker pull deephdc/deep-oc-audio-classification-tf
$ docker run -ti -p 5000:5000 deephdc/deep-oc-audio-classification-tf

Execute on your computer using udocker

If you do not have Docker available or you do not want to install it, you can use udocker within a Python virtualenv:

$ virtualenv udocker
$ source udocker/bin/activate
(udocker) $ pip install udocker
(udocker) $ udocker pull deephdc/deep-oc-audio-classification-tf
(udocker) $ udocker create deephdc/deep-oc-audio-classification-tf
(udocker) $ udocker run -p 5000:5000  deephdc/deep-oc-audio-classification-tf

In either case, once the module is running, point your browser to port 5000 of the machine running the container (the port published by the -p 5000:5000 option above) and you will see the API documentation, where you can test the module functionality and perform other actions such as training.
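Before opening the browser, it can help to confirm that the container is actually accepting connections. The following is a minimal sketch using only the Python standard library; the host `127.0.0.1` is an assumption (a container running on the local machine), and 5000 is the port published by the commands above. The helper name `port_is_open` is made up for this example.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the container runs locally and publishes port 5000.
    print(port_is_open("127.0.0.1", 5000))
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the API itself has finished starting up.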

For more information, refer to the user documentation.

Train this module

You can train this model using the DEEP framework. To run this module on our pilot e-Infrastructure, you need to be registered in the DEEP IAM.

Once you are registered, you can go to our training dashboard to configure and train it.

For more information, refer to the user documentation.


Get the code

GitHub | Docker Hub

Get the data

Dataset | Training files

Citing this module