Title :
Multilingual and multimedia information retrieval from Web documents
Author :
Gatius, Marta ; Bertran, Manuel ; Rodríguez, Horacio
Author_Institution :
TALP Res. Center, Tech. Univ. of Catalunya, Barcelona, Spain
Date :
30 Aug.-3 Sept. 2004
Abstract :
Web documents present new challenges to conventional information retrieval (IR) technologies. This paper describes how FameIR, a multilingual multimedia IR shell, addresses these challenges. In this shell, cross-language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best developed and most widely used lexical resource covering several languages. Wrapper generation (WG) techniques, which extract information from Web documents, are used to access information at a finer granularity than the whole Web page. By combining IR and WG techniques with EWN, FameIR provides a powerful facility for performing CLIR over multimedia Web documents.
Keywords :
Internet; document handling; information retrieval; language translation; linguistics; multimedia databases; natural languages; EuroWordNet; FameIR multilingual multimedia IR shell; Web documents; cross-language IR; multimedia information retrieval; wrapper generation; Data mining; Databases; Face; Frequency; HTML; Natural language processing; Web pages
Conference_Title :
Proceedings of the 15th International Workshop on Database and Expert Systems Applications, 2004
Print_ISBN :
0-7695-2195-9
DOI :
10.1109/DEXA.2004.1333443