DocumentCode
427058
Title
Multimodal music retrieval for large databases
Author
Schuller, Björn ; Rigoll, Gerhard ; Lang, Manfred
Author_Institution
Inst. for Human-Machine Commun., Technische Univ. München, Germany
Volume
2
fYear
2004
fDate
27-30 June 2004
Firstpage
755
Abstract
We present a novel multimodal access to large MP3 music databases. Retrieval can be performed either in a content-based manner or by keywords. As input modalities, speech by natural language utterances or singing, and manual interaction by handwriting, typing or hardkeys are used. In order to achieve especially robust retrieval results and automatically suggest music to the user, contextual knowledge of the time, date, season, user emotion, and listening habits is integrated in the retrieval process. The system communicates with the user by speech or visual reactions. The concepts shown are especially designed for home and mobile access on tablet-PCs, PDAs, and similar PC solutions. The paper discusses the concept and a working prototype called Shangrila. An evaluation by a user study leads to an impression of the capabilities of the suggested approach to multimodal music retrieval.
Keywords
audio databases; content-based retrieval; graphical user interfaces; music; natural language interfaces; GUI; MP3 music databases; PDA; content-based retrieval; contextual knowledge; graphical user interface; handwriting; hardkeys; keywords; large music databases; listening habits; multimodal music retrieval; natural language utterances; singing; tablet-PC; typing; user emotion; Audio databases; Content based retrieval; Context; Digital audio players; Music information retrieval; Natural languages; Personal digital assistants; Prototypes; Robustness; Speech;
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
Print_ISBN
0-7803-8603-5
Type
conf
DOI
10.1109/ICME.2004.1394310
Filename
1394310