You Won't BELIEVE How I Profiled My Noise
Introduction

After my previous post about data augmentation, Philip suggested that noise profiling might be an effective way of augmenting data. The idea is that if we can extract the general background noise from a dataset, then we can synthesize more of it into all of the recordings. This would effectively scramble the last few bits of the data with plausible sounds while retaining the louder events. So I wrote a script for profiling the noise in a dataset, and it occurred to me that this, in its own right, might be a useful part of a suite of tools for soundscape synthesis, so it is getting its own post.

Algorithm

The script is part of the private ambisynth synth utilities. This is an outline of how it works.

Analysis

The script first analyzes the dataset to find a good noise sample. It does this by looking for the quietest moment in the corpus. I slide a 1-second window over the entire corpus, and for each window I calculate the perceptual loudness, as defined in Sect