  • DocumentCode
    3422048
  • Title
    Multilingual and multimedia information retrieval from Web documents
  • Author
    Gatius, Marta; Bertran, Manuel; Rodríguez, Horacio
  • Author_Institution
    TALP Res. Center, Tech. Univ. of Catalunya, Barcelona, Spain
  • fYear
    2004
  • fDate
    30 Aug.-3 Sept. 2004
  • Firstpage
    20
  • Lastpage
    24
  • Abstract
    Web documents present new challenges to conventional information retrieval (IR) technologies. This paper describes how these challenges are addressed in FameIR, a multilingual multimedia IR shell. In this shell, cross-language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best-developed and most widely used lexical resource for several languages. Wrapper generation (WG) techniques, which extract information from Web documents, are used to access information at a finer granularity than the whole Web page. By combining IR and WG techniques with the use of EWN, FameIR provides a powerful facility to perform CLIR from multimedia Web documents.
  • Keywords
    Internet; document handling; information retrieval; language translation; linguistics; multimedia databases; natural languages; EuroWordNet; FameIR multilingual multimedia IR shell; Web documents; cross-language IR; multimedia information retrieval; wrapper generation; Data mining; Databases; Face; Frequency; HTML; Information retrieval; Internet; Natural language processing; Natural languages; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on
  • ISSN
    1529-4188
  • Print_ISBN
    0-7695-2195-9
  • Type
    conf
  • DOI
    10.1109/DEXA.2004.1333443
  • Filename
    1333443