• DocumentCode
    1773944
  • Title
    The impact of sections headings on the document retrieval
  • Author
    Abdelli, Belkacem ; Pinon, Jean-Marie ; Kazar, Okba
  • Author_Institution
    Univ. of Biskra, Biskra, Algeria
  • fYear
    2014
  • fDate
    Sept. 29 2014-Oct. 1 2014
  • Firstpage
    128
  • Lastpage
    134
  • Abstract
    With online publications, the Web has become the largest source of digital documents, often stored as HTML, XML, PDF or DOC files. Among the features of these documents, their logical structure is especially notable: it represents components such as chapters, sections and paragraphs, together with the document title, chapter titles and section headings. Section headings are meaningful; they are a good indicator of the content of the paragraphs they introduce. For this reason, we pay particular attention to these headings during both indexing and search. Our objective is to provide relevant access to digital documents by processing all section titles and mining them to exploit their importance in the retrieval process. Experiments on a large corpus, INEX 2009, show the effectiveness of our proposal: an improvement in the precision of the results in information retrieval (IR).
  • Keywords
    XML; electronic publishing; indexing; information retrieval; text analysis; DOC; HTML; INEX 2009; PDF; chapter titles; corpus; digital documents; document mining; document retrieval; document titles; indexing process; logical structure; online publications; section headings; Abstracts; Prototypes; Sections; metadata; mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management (ICDIM), 2014 Ninth International Conference on
  • Conference_Location
    Phitsanulok
  • Type
    conf
  • DOI
    10.1109/ICDIM.2014.6991398
  • Filename
    6991398