Train an audio classifier

By DEEP-Hybrid-DataCloud Consortium

tensorflow, docker, deep learning, trainable, inference, pre-trained, api-v1

License: MIT

This is a plug-and-play tool for audio classification with Deep Learning. It lets users classify their own audio samples and train their own classifier for a custom problem.

The classifier is currently pretrained on the 527 high-level classes from the AudioSet dataset.

The PREDICT method expects an audio file as input (or the URL of an audio file) and returns a JSON response with the top 5 predictions. Most audio file formats are supported (see the FFmpeg-compatible formats).
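As a rough illustration of what "top 5 predictions" means, the sketch below picks the five highest-scoring labels from a set of per-class scores. The dictionary layout, the label names, and the score values are assumptions made for this example only; they are not the module's documented response format.

```python
import heapq


def top_predictions(scores, k=5):
    """Return the k (label, score) pairs with the highest score, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical per-class scores; the real model scores 527 AudioSet classes.
example_scores = {
    "Speech": 0.81,
    "Music": 0.07,
    "Dog": 0.02,
    "Vehicle": 0.05,
    "Silence": 0.01,
    "Wind": 0.03,
    "Rain": 0.005,
}

for label, score in top_predictions(example_scores):
    print(f"{label}: {score:.3f}")
```

`heapq.nlargest` avoids sorting the full score list, which is a small convenience when only a handful of classes out of several hundred are of interest.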

References

Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter, 'Audio Set: An ontology and human-labeled dataset for audio events', IEEE ICASSP, 2017.

Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley, 'Audio Set classification with attention model: A probabilistic perspective', arXiv preprint arXiv:1711.00927, 2017.

Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang, 'Multi-level Attention Model for Weakly Supervised Audio Classification', arXiv preprint arXiv:1803.02353, 2018.

S. Hershey, S. Chaudhuri, D. P. W. Ellis, J. F. Gemmeke, A. Jansen, R. C. Moore, M. Plakal, D. Platt, R. A. Saurous, B. Seybold et al., 'CNN architectures for large-scale audio classification', arXiv preprint arXiv:1609.09430, 2016.

Run locally on your computer

Using Docker

You can run this module directly on your computer, assuming that you have Docker installed, by following these steps:

$ docker pull deephdc/deep-oc-audio-classification-tf
$ docker run -ti -p 5000:5000 deephdc/deep-oc-audio-classification-tf

Using udocker

If you do not have Docker available or you do not want to install it, you can use udocker within a Python virtualenv:

$ virtualenv udocker
$ source udocker/bin/activate
$ git clone https://github.com/indigo-dc/udocker
$ cd udocker
$ pip install .
$ udocker pull deephdc/deep-oc-audio-classification-tf
$ udocker create deephdc/deep-oc-audio-classification-tf
$ udocker run -p 5000:5000 deephdc/deep-oc-audio-classification-tf

Once the container is running, open http://127.0.0.1:5000/ in your browser to see the API documentation, where you can test the module's functionality and perform other actions (such as training).
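The container can take a while to start serving requests, so a script that talks to the API may want to wait until the address above responds. The following is a minimal sketch of such a check using only the Python standard library; the function name `wait_for_api` and its default timeout and polling interval are choices made for this example, not part of the module.

```python
import http.client
import time
import urllib.error
import urllib.request

# Opener with an empty proxy map, so requests to localhost bypass any configured proxy.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_api(url="http://127.0.0.1:5000/", timeout=60.0, interval=2.0):
    """Poll `url` until it returns any HTTP response or `timeout` seconds elapse.

    Returns True if the server answered, False if it never did.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _OPENER.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is reachable.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timed out, or malformed reply: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example use (commented out so importing this file does not block):
# if wait_for_api():
#     print("API is reachable")
```

Any HTTP response counts as "up" here, including error statuses, because the goal is only to detect that the server is listening; it does not verify that the model itself has finished loading.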

For more information, refer to the user documentation.