Title :
HRTF phase synthesis via sparse representation of anthropometric features
Author_Institution :
Microsoft Res., Redmond, WA, USA
Abstract :
We propose a method for synthesizing the phases of Head-Related Transfer Functions (HRTFs) using a sparse representation of anthropometric features. Our approach treats the HRTF synthesis problem as finding a sparse representation of the subject's anthropometric features with respect to the anthropometric features in the training set. The fundamental assumption is that the group delay of a given HRTF set can be described by the same sparse combination as the anthropometric data. Thus, we learn a sparse vector that represents the subject's anthropometric features as a linear superposition of the anthropometric features of a small subset of subjects from the training data. We then apply the same sparse vector directly to the HRTF group delay data. For evaluation, we use a new dataset containing both anthropometric features and HRTFs. We compare the proposed sparse-representation-based approach with ridge regression and with the data of a manikin (which was designed based on average anthropometric data), and we simulate the best and the worst possible classifiers for selecting one of the HRTFs from the dataset. For objective evaluation, we use the mean square error of the group delay scaling factor. Experiments show that our sparse representation outperforms all other evaluated techniques, and that the synthesized HRTFs are almost as good as those selected by the best possible HRTF classifier.
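The two steps described in the abstract (learn a sparse weight vector on the anthropometric features, then reuse it on the group delay data) can be illustrated with a minimal sketch. The abstract does not name the solver, the regularization weight, or any constraints on the weights, so everything below is an assumption made for illustration: the L1-penalized least-squares objective, the ISTA iteration used to solve it, the function names `soft_threshold`, `sparse_weights`, and `apply_weights`, the value of `lam`, and all the numbers in the toy data.

```python
def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_weights(train_feats, target_feats, lam=0.01, n_iter=2000):
    """Approximately minimize
        0.5 * ||target - sum_i w_i * train_i||^2 + lam * ||w||_1
    with ISTA (assumed solver; the abstract does not specify one).

    train_feats: one feature vector per training subject.
    target_feats: feature vector of the new subject.
    """
    n, dim = len(train_feats), len(target_feats)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz
    # constant, so 1 / lipschitz is a safe (conservative) step size.
    lipschitz = sum(v * v for row in train_feats for v in row)
    step = 1.0 / lipschitz
    w = [0.0] * n
    for _ in range(n_iter):
        # Residual of the current reconstruction of the target features.
        resid = [
            sum(w[i] * train_feats[i][k] for i in range(n)) - target_feats[k]
            for k in range(dim)
        ]
        # Gradient of the least-squares term with respect to each weight.
        grad = [
            sum(train_feats[i][k] * resid[k] for k in range(dim))
            for i in range(n)
        ]
        # Gradient step followed by soft-thresholding (promotes sparsity).
        w = [soft_threshold(w[i] - step * grad[i], step * lam) for i in range(n)]
    return w


def apply_weights(weights, train_values):
    """Reuse the learned weights on per-subject scalar data,
    e.g. a group delay scaling factor for each training subject."""
    return sum(w * v for w, v in zip(weights, train_values))


# Toy usage with made-up numbers (not from the paper's dataset).
train_feats = [[1.0, 0.2, 0.1], [0.3, 1.0, 0.2], [0.1, 0.3, 1.0]]
new_subject = [0.7, 0.6, 0.15]
train_scaling = [0.95, 1.05, 1.10]  # hypothetical group delay scaling factors

weights = sparse_weights(train_feats, new_subject)
estimate = apply_weights(weights, train_scaling)
```

A larger `lam` drives more weights to exactly zero, so fewer training subjects contribute to the estimate. The right value, and whether the weights should also be constrained (for example, to be nonnegative), would depend on details the abstract does not give.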
Keywords :
anthropometry; audio signal processing; compressed sensing; delays; feature extraction; mean square error methods; transfer functions; HRTF classifier; HRTF group delay data; HRTF phase synthesis; anthropometric data; anthropometric features; fundamental assumption; group delay scaling factor; head-related transfer functions; linear superposition; manikin; mean square error; ridge regression; sparse combination; sparse representation; sparse vector; training data; training set; Acoustics; Delay effects; Delays; Ear; Training; Transfer functions; Vectors; Anthropometric Features; HRTF Personalization; HRTF Synthesis; Head-related Transfer Function; Sparse Representation;
Conference_Title :
Information Theory and Applications Workshop (ITA), 2014
Conference_Location :
San Diego, CA
DOI :
10.1109/ITA.2014.6804239