  • DocumentCode
    3633646
  • Title
    Employing named entities for semantic retrieval of news videos in Turkish
  • Author
    Dilek Kucuk;Adnan Yazici
  • Author_Institution
    Power Electron. Group, TUBITAK Uzay Inst., Ankara, Turkey
  • fYear
    2009
  • Firstpage
    153
  • Lastpage
    158
  • Abstract
    Named entities are known to be an important means of semantic annotation of news texts. Considerable work has been carried out on semantic indexing of both textual news and news videos, especially in English, by employing named entities extracted from textual news or from transcriptions of the news videos. In this paper, we present our semantic retrieval architecture for news videos in Turkish, based on prior semantic annotation of the videos with the corresponding named entities in the news transcription texts. We employ a rule-based named entity recognizer for Turkish which makes use of handcrafted sets of lexical resources and pattern bases. We compiled a small corpus of Turkish news videos, and the named entity recognizer in its current form achieves a success rate of about 75% on this corpus. A retrieval interface is implemented to access the video corpus through Boolean queries formed with the extracted named entities. The interface currently does not involve any ranking procedure: it displays all videos whose transcription texts satisfy the Boolean query posed through the interface, sorted by broadcast date. The presented study is significant as the first to perform automatic semantic video annotation on a genuine news video corpus in Turkish and to demonstrate the use of the annotations through a retrieval interface.
  • Keywords
    "Videos","Data mining","Indexing","Automatic speech recognition","Ontologies","Employment","Multimedia communication","Natural languages","TV","Power electronics"
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Sciences, 2009. ISCIS 2009. 24th International Symposium on
  • Print_ISBN
    978-1-4244-5021-3
  • Type
    conf
  • DOI
    10.1109/ISCIS.2009.5291836
  • Filename
    5291836