• DocumentCode
    2603865
  • Title
    Design of Paper Duplicate Detection System Based on Lucene
  • Author
    Ding, YueHua ; Yi, Kui ; Xiang, RiHua
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., WuHan Polytech. Univ., Wuhan, China
  • fYear
    2010
  • fDate
    17-18 April 2010
  • Firstpage
    36
  • Lastpage
    39
  • Abstract
    Full-text retrieval is a widely used technology in information search. Lucene is an open-source full-text search engine toolkit with a well-designed architecture and broad application prospects. Drawing on our work on a paper duplicate detection system, we introduce the principles of Lucene and analyze its two pivotal modules: index creation and index search. We describe the design and implementation of the paper duplicate detection system, and discuss two key techniques: result highlighting and the combination of B/S (Browser/Server) and C/S (Client/Server) modes. The design offers a reference solution for developing similar systems.
  • Keywords
    indexing; information retrieval; public domain software; search engines; text analysis; Lucene theory; Lucene toolkit; duplicate detection system research; full text retrieval; index creation module; index search module; information search; open source full text search engine toolkit; paper duplicate detection system; Computer science; Design engineering; Information retrieval; Java; Packaging; Paper technology; Search engines; Spatial databases; User interfaces; Wearable computers; Browser/Server; Client/Server; Lucene; full-text search; highlight
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Wearable Computing Systems (APWCS), 2010 Asia-Pacific Conference on
  • Conference_Location
    Shenzhen
  • Print_ISBN
    978-1-4244-6467-8
  • Electronic_ISBN
    978-1-4244-6468-5
  • Type
    conf
  • DOI
    10.1109/APWCS.2010.16
  • Filename
    5481111