• DocumentCode
    2016948
  • Title
    High performance Chinese Spoken Term Detection based on term expansion
  • Author
    Li, Wei ; Wu, Ji ; Lv, Ping
  • Author_Institution
    Dept. Electron. Eng., Tsinghua Univ., Beijing, China
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    430
  • Lastpage
    434
  • Abstract
    This paper focuses on improving the performance of Chinese Spoken Term Detection (STD) systems that use words as search units. These systems are designed to find instances of particular phrases (called terms) in speech. Terms are usually segmented into word sequences and searched against the speech recognition results. Mismatches between the recognition results and the word segmentation can degrade performance. To solve this problem, two algorithms are designed to expand the search space. The expansion algorithms improve system performance but reduce efficiency as a side effect. To speed up retrieval, a Finite State Automation (FSA) is used, and a token-passing algorithm is then developed for fast search. Experiments show that the proposed term expansion method effectively improves the STD system's performance, and that searching the FSA with the token-passing algorithm effectively improves search efficiency.
  • Keywords
    finite state machines; natural language processing; search problems; speech recognition; word processing; expanded algorithm; finite state automation; high performance Chinese spoken term detection; searching efficiency; term expansion; token passing algorithm; voice recognition; word segmentation; word sequences; Algorithm design and analysis; Conferences; Minimization; Redundancy; Speech; Speech recognition; Text recognition; Chinese Spoken Term Detection; Finite State Automation; term expansion; token passing; word segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2010.5684852
  • Filename
    5684852