DocumentCode :
531815
Title :
Wikipedia based semantic metadata annotation of audio transcripts
Author :
Paci, Giulio ; Pedrazzi, Giorgio ; Turra, Roberta
Author_Institution :
CINECA - Consorzio Interuniversitario, Casalecchio di Reno, Italy
fYear :
2010
fDate :
12-14 April 2010
Firstpage :
1
Lastpage :
4
Abstract :
A method to automatically annotate video items with semantic metadata is presented. The method has been developed in the context of the Papyrus project to annotate documentary-like broadcast videos with a set of relevant keywords, using automatic speech recognition (ASR) transcripts as a primary complementary resource. The task is complicated by the high word error rate (WER) of the ASR for this kind of video. For this reason, a novel relevance criterion based on domain information is proposed. Wikipedia is used both as a source of metadata and as a linguistic resource for disambiguating keywords and for eliminating off-topic or out-of-domain keywords. Documents are annotated with relevant links to Wikipedia pages, concept definitions, synonyms, translations, and concept categories.
Keywords :
Internet; audio signal processing; speech recognition; video signal processing; ASR transcript; Papyrus project; WER; Wikipedia; audio transcript; automatic speech recognition; broadcast video; semantic metadata annotation; video item annotation; word error rate; Context; Electronic publishing; Encyclopedias; Semantics
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Image Analysis for Multimedia Interactive Services (WIAMIS), 2010 11th International Workshop on
Conference_Location :
Desenzano del Garda
Print_ISBN :
978-1-4244-7848-4
Electronic_ISBN :
978-88-905328-0-1
Type :
conf
Filename :
5617667