DocumentCode :
3422048
Title :
Multilingual and multimedia information retrieval from Web documents
Author :
Gatius, Marta ; Bertran, Manuel ; Rodríguez, Horacio
Author_Institution :
TALP Res. Center, Tech. Univ. of Catalunya, Barcelona, Spain
fYear :
2004
fDate :
30 Aug.-3 Sept. 2004
Firstpage :
20
Lastpage :
24
Abstract :
Web documents present new challenges to conventional information retrieval (IR) technologies. This paper describes how these challenges are faced in FameIR, a multilingual multimedia IR shell. In this shell, cross-language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best developed and most widely used lexical resource for several languages. Wrapper generation (WG) techniques, which extract information from Web documents, are used to access information at a finer granularity than the whole Web page. By combining IR and WG techniques with the use of EWN, FameIR provides a powerful facility for performing CLIR on multimedia Web documents.
Keywords :
Internet; document handling; information retrieval; language translation; linguistics; multimedia databases; natural languages; EuroWordNet; FameIR multilingual multimedia IR shell; Web documents; cross-language IR; multimedia information retrieval; wrapper generation; Data mining; Databases; Face; Frequency; HTML; Information retrieval; Internet; Natural language processing; Natural languages; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on
ISSN :
1529-4188
Print_ISBN :
0-7695-2195-9
Type :
conf
DOI :
10.1109/DEXA.2004.1333443
Filename :
1333443