This is the recent work of Satvik Venkatesh, the YOHO paper. In this recently published paper, we present a neural network approach for audio detection. Transition points, or sonic objects, are identified directly through the neural network design, rather than through the traditional approach of processing audio in blocks and classifying each block. The traditional approach quantizes the classification of the signal and relies on accurate classification of every time step, which can be problematic in noisy environments. Here, the model instead predicts the transition points directly as a regression, which makes it much less likely to oscillate, and its predictions are generally considered more robust. A rigorous review of this approach in noisy environments was presented in a paper at NeurIPS. The full paper is available here.
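To make the contrast with per-block classification concrete, here is a minimal sketch of how regressed transition points might be turned back into timed events. It assumes each coarse time bin yields a presence score plus start and end positions expressed as fractions of the bin; the function name `decode_events`, the threshold, and the merging rule are illustrative assumptions rather than the paper's exact post-processing.

```python
def decode_events(bins, bin_seconds, threshold=0.5, merge_gap=0.0):
    """Turn per-bin regression outputs for one class into (onset, offset) times.

    bins: list of (presence, start_frac, end_frac) tuples, one per coarse bin.
          start_frac and end_frac are assumed to lie in [0, 1] within the bin.
    """
    events = []
    for index, (presence, start_frac, end_frac) in enumerate(bins):
        if presence < threshold:
            continue  # no event predicted in this bin
        onset = (index + start_frac) * bin_seconds
        offset = (index + end_frac) * bin_seconds
        if events and onset - events[-1][1] <= merge_gap:
            # Event continues from the previous bin: extend it instead of
            # starting a new one.
            events[-1] = (events[-1][0], max(events[-1][1], offset))
        else:
            events.append((onset, offset))
    return events


# Hypothetical model output for four 1-second bins.
example_bins = [(0.9, 0.25, 1.0), (0.8, 0.0, 0.5), (0.1, 0.0, 0.0), (0.95, 0.5, 0.75)]
decoded = decode_events(example_bins, bin_seconds=1.0)
print(decoded)
```

Because the boundaries are continuous values rather than per-frame labels, a single misclassified frame cannot split an event in two, which is one way to read the robustness argument above.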
This is a recent paper submitted to the Journal of the Audio Engineering Society. In this paper, we take word embeddings and map them directly onto EQ parameters using a fully connected neural network. We show that a neural network can learn equaliser settings for completely unknown words, producing EQ results that are both intuitive and perceptually plausible. Further subjective evaluations are required to validate these results, but in principle, semantic word descriptors could be mapped directly onto the parameters of any audio effect. This approach could be developed further and rolled out to a number of different semantic approaches, creating a suite of semantically driven audio effects.
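As a rough illustration of the mapping described here, the sketch below runs a word embedding through a small fully connected network and scales the outputs into EQ parameters. Everything specific is an assumption made for the example: the layer sizes, the three-band gain and frequency layout, the parameter ranges, and the random, untrained weights. It shows the shape of the idea, not the architecture used in the paper.

```python
import math
import random

EMBED_DIM = 8    # hypothetical; real word embeddings are usually much larger
HIDDEN_DIM = 16  # hypothetical hidden-layer width
N_BANDS = 3      # hypothetical EQ layout: a gain and a centre frequency per band


def make_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def embedding_to_eq(embedding, hidden_layer, output_layer):
    """Map an embedding to a list of (gain_db, freq_hz) pairs, one per band."""
    hidden = dense(embedding, hidden_layer, math.tanh)
    raw = dense(hidden, output_layer, sigmoid)  # every value lies in (0, 1)
    bands = []
    for band in range(N_BANDS):
        gain_db = (raw[2 * band] - 0.5) * 24.0         # assumed range: -12 to +12 dB
        freq_hz = 20.0 * (1000.0 ** raw[2 * band + 1])  # assumed range: 20 Hz to 20 kHz, log scale
        bands.append((gain_db, freq_hz))
    return bands


rng = random.Random(0)
hidden_layer = make_layer(EMBED_DIM, HIDDEN_DIM, rng)
output_layer = make_layer(HIDDEN_DIM, 2 * N_BANDS, rng)
# Stand-in vector for a descriptor such as "warm"; a real system would look
# this up in a pretrained embedding table.
embedding = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
eq_settings = embedding_to_eq(embedding, hidden_layer, output_layer)
```

Because the network only ever sees the embedding vector, a word absent from the training set can still produce settings, provided its embedding sits near those of words the network has learned.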