Audio by the Meter


Figure 1: Pinot Gallizio, 'Industrial Painting'

Introduction

After meeting with Will and Philip, I thought it would be a good idea to start running some machine learning on the RPPtv web servers, at least as a proof of concept. As a starting point, I thought it would be nice if people could go to a web page, enter how much audio they want to generate, and then be served a WAV file. So I made the simplest page possible, just to get the pipeline flowing. It generates lakeside sounds from a trained WaveNet model. The user can only specify the length of the recording. In my tests it takes about a minute and a half to generate 1 second of audio.
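To put that figure in perspective, here is a rough sketch of the wait a user should expect for a given length. It assumes generation time scales linearly with the length of the clip, which is an assumption and not something measured here, and the function names are made up for illustration.

```python
# Assumption: generation time scales linearly with the requested length,
# at the roughly 90 seconds of compute per second of audio quoted above.
SECONDS_OF_COMPUTE_PER_SECOND_OF_AUDIO = 90.0


def estimated_wait_seconds(audio_seconds):
    """Return the estimated generation time, in seconds, for a clip."""
    if audio_seconds < 0:
        raise ValueError("audio length must be non-negative")
    return audio_seconds * SECONDS_OF_COMPUTE_PER_SECOND_OF_AUDIO


def format_wait(audio_seconds):
    """Format the estimated wait as hours, minutes and seconds."""
    total = int(round(estimated_wait_seconds(audio_seconds)))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%dh %02dm %02ds" % (hours, minutes, seconds)


if __name__ == "__main__":
    for length in (1, 10, 60):
        print("%3d s of audio -> about %s" % (length, format_wait(length)))
```

Under that assumption, ten seconds of audio would take about fifteen minutes and a full minute about an hour and a half, which is why the length field probably needs a cap.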

Installation

The code for this page lives in the private ambisynth repository. The main page is plain HTML. On submit, the audio is generated by a CGI script written in Python. The script expects a virtual environment with TensorFlow and librosa installed. The repository contains an install script that creates this virtual environment. After running it, the website can be served with
python -m CGIHTTPServer 8000
So far, I have only run it locally. I don't have the resources to run this live on the web.
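For readers without access to the repository, the CGI step can be sketched roughly as follows. This is not the actual script: the `seconds` field name, the sample rate, the length cap and all function names are assumptions, and `generate_audio` is a placeholder tone standing in for the model. It is also written for Python 3 and uses only the standard library, whereas the serving command above is the Python 2 form.

```python
import io
import math
import os
import struct
import sys
import wave
from urllib.parse import parse_qs

SAMPLE_RATE = 16000   # assumed sample rate, not taken from the real model
MAX_SECONDS = 10.0    # assumed cap so one request cannot run for hours


def parse_length(query_string, default=1.0):
    """Read the hypothetical 'seconds' field and clamp it to a safe range."""
    values = parse_qs(query_string).get("seconds", [])
    try:
        seconds = float(values[0]) if values else default
    except ValueError:
        seconds = default
    if not math.isfinite(seconds):
        seconds = default
    return max(0.0, min(seconds, MAX_SECONDS))


def generate_audio(seconds):
    """Placeholder for the model: a quiet 220 Hz tone as 16-bit samples."""
    count = int(seconds * SAMPLE_RATE)
    return [int(3000 * math.sin(2 * math.pi * 220 * n / SAMPLE_RATE))
            for n in range(count)]


def encode_wav(samples):
    """Pack 16-bit mono samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


def main(environ, out):
    """Write a CGI response (headers, blank line, WAV body) to a binary stream."""
    seconds = parse_length(environ.get("QUERY_STRING", ""))
    body = encode_wav(generate_audio(seconds))
    out.write(b"Content-Type: audio/wav\r\n")
    out.write(("Content-Length: %d\r\n\r\n" % len(body)).encode("ascii"))
    out.write(body)
    out.flush()

# When run by the server as a CGI script, the file would finish with:
#     main(os.environ, sys.stdout.buffer)
```

Clamping the requested length on the server side matters here because generation is so slow; the HTML form alone cannot be trusted to enforce a limit.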

Screenshots



Figure 2: Screenshots of the main page and of the audio file page after the audio has been generated




Figure 3: Server log while serving the index.html, generating the audio file, and serving it

Future Work

First, this needs to be set up on the RPPtv servers. After that we can start streamlining the user experience. For example, there should be a progress meter, and it should save checkpoints so that if a user kills it halfway through they can still get half of their audio. One person using it should not create a DoS for other users. We also need to add more audio classes for the user to select, and eventually other features we have discussed, such as multichannel and realtime audio generation.
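As a starting point for the progress meter and checkpoint ideas, here is a minimal sketch of generating in small chunks and saving a playable file after each one. Nothing like this exists in the repository yet: the chunk size, the sample rate, the file names and `generate_chunk` (which just returns silence in place of the model) are all assumptions.

```python
import os
import tempfile
import wave

SAMPLE_RATE = 16000    # assumed sample rate
CHUNK_SECONDS = 0.25   # assumed length of each generation step


def generate_chunk(num_samples):
    """Placeholder for one model step: silent 16-bit mono frames."""
    return b"\x00\x00" * num_samples


def save_wav(path, frames):
    """Write the audio so far to a temporary file, then swap it into place."""
    tmp_path = path + ".part"
    with wave.open(tmp_path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(frames)
    os.replace(tmp_path, path)  # readers never see a half-written file


def generate_with_checkpoints(total_seconds, wav_path, progress_path):
    """Generate chunk by chunk, checkpointing the WAV and a progress file."""
    total = int(total_seconds * SAMPLE_RATE)
    step = int(CHUNK_SECONDS * SAMPLE_RATE)
    frames = b""
    done = 0
    while done < total:
        count = min(step, total - done)
        frames += generate_chunk(count)
        done += count
        save_wav(wav_path, frames)
        with open(progress_path, "w") as progress_file:
            progress_file.write("%d/%d" % (done, total))
    return done


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    wav_path = os.path.join(folder, "lakeside.wav")
    progress_path = os.path.join(folder, "progress.txt")
    written = generate_with_checkpoints(1.0, wav_path, progress_path)
    print("wrote %d samples to %s" % (written, wav_path))
```

If the job is killed partway through, the last saved file still holds everything generated so far, and the page could poll the progress file to draw a meter. Rewriting the whole file on every chunk is wasteful for long clips, but it keeps the sketch simple.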

Comments

  1. Excellent Michael.
    We will get Sam and Julian to have a look and assess how we can move this forwards.
    Thank you, this is all very interesting.
    Russ
