• DocumentCode
    722427
  • Title
    Key frame extraction for text based video retrieval using Maximally Stable Extremal Regions
  • Author
    Wattanarachothai, Werachard ; Patanukhom, Karn

  • Author_Institution
    Dept. of Comput. Eng., Chiang Mai Univ., Chiang Mai, Thailand
  • fYear
    2015
  • fDate
    2-4 March 2015
  • Firstpage
    29
  • Lastpage
    37
  • Abstract
    This paper presents a new approach for a text-based video content retrieval system. The proposed scheme consists of three main processes: key frame extraction, text localization, and keyword matching. For key frame extraction, we propose a Maximally Stable Extremal Region (MSER) based feature designed to segment the video into shots with different text content. In the text localization process, the MSERs in each key frame are clustered into text lines based on their similarity in position, size, color, and stroke width. The Tesseract OCR engine is then used to recognize the text regions. To improve the recognition results, we feed the Tesseract engine four images obtained from different pre-processing methods. Finally, the query keyword is matched against the OCR results using an approximate string search scheme. The experiments show that, with the MSER feature, videos can be segmented into an efficient number of shots, with better precision and recall than methods based on the sum of absolute differences and on edges.
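The abstract does not say which approximate string search scheme is used for keyword matching. As an illustration only, the sketch below assumes a Levenshtein edit distance between the query keyword and each whitespace-separated OCR token; the function names and the `max_errors` threshold are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def keyword_matches(keyword: str, ocr_text: str, max_errors: int = 1) -> bool:
    """True if any OCR token is within max_errors edits of the keyword.

    max_errors is an assumed tolerance for OCR mistakes, not a value
    reported by the authors.
    """
    key = keyword.lower()
    return any(edit_distance(key, token) <= max_errors
               for token in ocr_text.lower().split())
```

With a tolerance of one edit, a query such as "retrieval" would still match an OCR token like "retrieva1", where the final letter was misread as a digit.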
  • Keywords
    content-based retrieval; image matching; image segmentation; optical character recognition; text analysis; video retrieval; MSER; Tesseract OCR engine; key frame extraction; keyword matching; maximally stable extremal region; string search scheme; text based video retrieval; text localization process; text region recognition; text-based video content retrieval system; Databases; Integrated optics; Optical character recognition software; Optical devices; Optical imaging; Target recognition; Text recognition; CBVR; shot boundary; text-based video retrieval
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Networks and Intelligent Systems (INISCom), 2015 1st International Conference on
  • Conference_Location
    Tokyo
  • Type
    conf
  • DOI
    10.4108/icst.iniscom.2015.258410
  • Filename
    7157819