Posts

Showing posts from September, 2018

WaveRNN

Figure 1: CVSSP's Pet Wyvern.

Introduction

I've been thinking about how we can make audio synthesis faster. In particular, I would be interested in realtime soundscape synthesis, partly because I think it would be good for the project, and partly because it would align well with my own research goals. I found two options:

Parallel WaveNet

For a while, I was looking at Parallel WaveNet. It can supposedly generate samples in realtime, and is now used in the Google Assistant. However, I have been unable to find a good, vanilla implementation of it, and the paper is sparse on details. There are projects on GitHub where people started implementing it more than six months ago and have not finished, so given the time constraints of this project, implementing it myself doesn't seem feasible. Moreover, the training process is very intricate and involves training a couple of networks separately and then together -- which makes it really har...

Audio by the Meter

Figure 1: Pinot Gallizio, 'Industrial Painting'

Introduction

After meeting with Will and Philip, I thought it would be a good idea to start trying to run some machine learning on RPPtv web servers, at least as a proof of concept. As a starting point, I thought it would be nice if people could go to a web page, enter how much audio they want to generate, and then be served a wav file. So I made the simplest page possible, just to get the pipeline flowing. It generates lakeside sounds from a model trained with WaveNet. The user can only specify the length of the recording. In my tests it takes about a minute and a half to generate one second of audio.

Installation

The code for this page lives in the ambisynth private repository. The main page is just simple HTML. On submit, the audio is generated by a CGI script written in Python. The Python script expects a virtual environment with TensorFlow and librosa installed. The repository contains an instal...
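The excerpt doesn't show the script itself, but a minimal sketch of what such a CGI handler could look like follows. The generate_audio() helper here is a hypothetical stand-in for sampling from the trained WaveNet model, and the WAV writing uses the standard-library wave module; the actual script in the ambisynth repository may be organized differently.

#!/usr/bin/env python3
# Sketch of a CGI handler that serves a generated WAV file.
# generate_audio() is a hypothetical placeholder for the WaveNet sampler.
import cgi
import io
import sys
import wave

import numpy as np

def generate_audio(seconds, sr=16000):
    # Placeholder: the real script would sample from the trained model here.
    return np.zeros(int(seconds * sr), dtype=np.float32)

def main():
    form = cgi.FieldStorage()
    seconds = float(form.getfirst("length", "1"))  # length requested by the user
    samples = generate_audio(seconds)

    # Encode the float samples as a 16-bit mono WAV in memory.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes((samples * 32767).astype(np.int16).tobytes())
    data = buf.getvalue()

    # Emit the CGI response headers, then the raw WAV bytes.
    sys.stdout.write("Content-Type: audio/wav\r\n")
    sys.stdout.write("Content-Length: %d\r\n\r\n" % len(data))
    sys.stdout.flush()
    sys.stdout.buffer.write(data)

if __name__ == "__main__":
    main()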

This Web Page Will Send You on the Adventure of a Lifetime - Find Out How

Figure 1: Interface for the synthesizer.

Introduction

The granular synthesizer I made for a previous post generated some interest, so I made a web-based version so it could be used more easily. I am not going to post a link to it here, because we might try to monetize it later, but I am creating this post as a way of documenting my work. The code for this demo is in a private repository.

Example 1: Audio generated by the synthesizer with the default settings.

Theory of Operation

The granular synthesizer loads in a corpus of recordings. It then randomly chooses small pieces (grains) out of the recordings, fades the pieces in and out, and pastes the pieces randomly into the output audio stream, potentially with the pieces overlapping one another. There is a good basic description of granular synthesis here. The pieces are inherently mono; if a piece comes from a multichannel recording, then the piece is taken from one randomly selected channel. If the output strea...
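The demo's code is private, but the grain-pasting loop described above could be sketched roughly as below. It assumes the corpus has already been loaded as a list of NumPy arrays shaped (samples, channels); the function and parameter names are illustrative, not the synthesizer's actual settings.

# Rough sketch of the grain-pasting loop -- not the code in the private repo.
import numpy as np

def granulate(corpus, out_seconds, sr=44100, grain_ms=80, n_grains=2000):
    out = np.zeros(int(out_seconds * sr), dtype=np.float32)
    grain_len = int(grain_ms * sr / 1000)
    window = np.hanning(grain_len).astype(np.float32)  # fades each grain in and out

    for _ in range(n_grains):
        rec = corpus[np.random.randint(len(corpus))]       # pick a recording
        chan = np.random.randint(rec.shape[1])             # grains are mono: one random channel
        start = np.random.randint(rec.shape[0] - grain_len)
        grain = rec[start:start + grain_len, chan] * window

        pos = np.random.randint(len(out) - grain_len)      # random paste position
        out[pos:pos + grain_len] += grain                   # grains may overlap

    return out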