Tracing Sound Objects in Audio Textures

Introduction

This webpage provides supplementary audio examples, visualizations, and source code for research on tracing and establishing sound objects within texture sounds. The results are presented in the paper Tracing Sound Objects in Audio Textures (M. Dörfler and E. Matusiak) in the proceedings of SAMPTA 2013. The work was carried out within the AudioMiner project (supported by the WWTF, project number MA09-024) at the Numerical Harmonic Analysis Group (University of Vienna) and the Austrian Research Institute for Artificial Intelligence.


Object Tracing by Gabor multiplier evaluation 

The variations of the Gabor coefficients between different slices of the signal are tracked by investigating the corresponding Gabor multipliers: we compute Gabor multipliers that transform one slice of the signal into another. Because any two sufficiently long slices of a texture sound are correlated, the corresponding slices of their Gabor transforms are also expected to be correlated. If the Gabor multiplier transforming one part of the signal into another can be chosen close to a constant function, we conclude that no sound object is present.
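The idea can be illustrated with a small NumPy sketch. It is not the algorithm from the paper: it assumes a Hann-window short-time Fourier transform as the Gabor transform and estimates the multiplier symbol pointwise with a simple quadratic penalty that pulls it toward the constant 1. The function names, the window length, the hop size, the regularization weight, and the synthetic noise texture with an inserted burst are all hypothetical choices made for illustration.

```python
import numpy as np

def stft(x, win_len=256, hop=128):
    """Gabor (short-time Fourier) coefficients, shape (frames, bins)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def multiplier_mask(slice_a, slice_b, reg=10.0):
    """Pointwise estimate of a symbol m with m * G(a) close to G(b).

    Minimizes |G_b - m G_a|^2 + lam |m - 1|^2 separately for every
    time-frequency point (assumed simplification); lam is scaled by
    the mean coefficient energy of slice_a.
    """
    ga, gb = stft(slice_a), stft(slice_b)
    lam = reg * np.mean(np.abs(ga) ** 2)
    return (np.conj(ga) * gb + lam) / (np.abs(ga) ** 2 + lam)

def deviation_per_frame(mask):
    """Average distance of the symbol from the constant 1 in each frame."""
    return np.mean(np.abs(mask - 1.0), axis=1)

# Synthetic stand-in for a texture: white noise, split into two slices.
rng = np.random.default_rng(0)
texture = rng.standard_normal(2 * 4096)
slice_a = texture[:4096]
slice_b = texture[4096:].copy()

# Hypothetical sound object: a loud broadband burst in the second slice.
slice_b[2048:2560] += 5.0 * rng.standard_normal(512)

dev = deviation_per_frame(multiplier_mask(slice_a, slice_b))
peak_frame = int(np.argmax(dev))
print("frame with largest deviation:", peak_frame)
print("peak / median deviation:", dev[peak_frame] / np.median(dev))
```

In this toy setting the symbol stays close to 1 wherever both slices contain only texture and departs from it in the frames that overlap the burst, which is the behaviour the detection relies on.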


Original signals: Rain, Washing Machine

Signals with sound objects: Rain with three sound objects, Washing Machine with one sound object

Time signal:





Spectrogram and detection by Gabor multipliers: Rain and washing machine

Object Tracing by sparse Dictionary Representation 

For a given texture sound, we learn a dictionary such that each piece of the signal admits a sparse approximate representation in that dictionary.

During the observation (evaluation) phase, we scan the signal piece by piece and check the reconstruction error of each piece with respect to the learned dictionary, which characterizes the present texture, in order to decide in which time intervals an object may occur. If, for a given sparsity level, some piece of the observed signal cannot be sparsely represented with a prescribed accuracy, we conclude that a sound object is present in that part of the signal.
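The learning and evaluation phases described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the results on this page: the dictionary is learned with a few iterations of the method of optimal directions (MOD), sparse coding uses a basic orthogonal matching pursuit (OMP), and the texture is a synthetic two-tone hum with weak noise. The piece length, number of atoms, sparsity level, error threshold, and all function names are hypothetical.

```python
import numpy as np

PIECE = 32  # samples per signal piece (assumed value)

def make_texture(n, start, rng):
    """Toy texture: two-tone hum plus weak noise."""
    t = np.arange(start, start + n)
    hum = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.13 * t + 1.0)
    return hum + 0.01 * rng.standard_normal(n)

def to_pieces(x):
    """Cut a signal into non-overlapping pieces, one piece per column."""
    n = len(x) // PIECE
    return x[: n * PIECE].reshape(n, PIECE).T

def omp(D, x, sparsity):
    """Greedy sparse code of x in dictionary D with at most `sparsity` atoms."""
    residual, support = x.copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(sparsity):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef

def learn_dictionary(X, n_atoms, sparsity, n_iter=3, seed=0):
    """MOD: alternate sparse coding and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    # Initialize the atoms with randomly chosen, normalized training pieces.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].copy()
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        C = np.column_stack([omp(D, x, sparsity) for x in X.T])
        D = X @ np.linalg.pinv(C)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D

def relative_errors(D, X, sparsity):
    """Relative reconstruction error of every piece (column) of X."""
    return np.array(
        [np.linalg.norm(x - D @ omp(D, x, sparsity)) / np.linalg.norm(x) for x in X.T]
    )

rng = np.random.default_rng(0)

# Learning phase: object-free texture only.
train = to_pieces(make_texture(4096, 0, rng))
D = learn_dictionary(train, n_atoms=16, sparsity=4)

# Evaluation phase: the same texture with a hypothetical object,
# a short tone at a frequency that does not occur in the texture.
observed = make_texture(2048, 4096, rng)
observed[1024:1120] += np.sin(2 * np.pi * 0.31 * np.arange(96))

errors = relative_errors(D, to_pieces(observed), sparsity=4)
detected = [int(i) for i in np.flatnonzero(errors > 0.3)]
print("pieces flagged as containing an object:", detected)
```

Pieces are flagged when their relative reconstruction error exceeds the threshold. In practice the threshold and the sparsity level have to be tuned to the texture at hand, since a real texture is far less compressible than this toy hum.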


Learning and Evaluation Process with Dictionary representation of texture sounds:

Detection by dictionary sparse prior for the rain signal and the washing machine signal:




Questions? Please contact us if you have any comments or questions, or if you wish to collaborate.