Title of article :
Content-based hierarchical document organization using multi-layer hybrid network and tree-structured features
Author/Authors :
Rahman, M.K.M. and Chow, Tommy W.S.
Issue Information :
Serial issue, year 2010
Abstract :
Automatically organizing documents in a hierarchical tree is in demand in many real applications. In this work, we focus on the problem of content-based document organization through a hierarchical tree, which can be viewed as a classification problem. We propose a new document representation to improve classification accuracy, and we develop a new hybrid neural network model to handle this representation. In our representation, a document is represented by a tree structure that has a superior capability for encoding document characteristics. Compared with the traditional feature representation, which encodes only the global characteristics of a document, the proposed approach encodes both global and local characteristics through a hierarchical tree. Unlike the traditional representation, the tree representation reflects the spatial organization of words across the pages and paragraphs of a document, which helps to encode the document's semantics better. Processing a hierarchical tree is another challenging task in terms of computational complexity. For this task, we use a hybrid neural network model composed of a self-organizing map (SOM) and a multi-layer perceptron (MLP). Experimental results corroborate that our approach is efficient and effective in registering documents into an organized tree compared with other approaches.
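The tree-structured representation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Node` class, the `build_tree` and `word_counts` helpers, and the use of raw word counts as features are hypothetical and are not taken from the paper, whose actual features and its SOM/MLP processing are not specified in this record. The sketch only shows how one document can carry global features at the root and local features at the page and paragraph levels.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the document tree: a document, a page, or a paragraph."""
    level: str
    features: Counter
    children: list = field(default_factory=list)


def word_counts(text):
    # Lowercased whitespace tokens; a stand-in for the paper's real features.
    return Counter(text.lower().split())


def build_tree(pages):
    """Build a document -> page -> paragraph tree.

    `pages` is a list of pages; each page is a list of paragraph strings.
    """
    page_nodes = []
    for paragraphs in pages:
        para_nodes = [Node("paragraph", word_counts(p)) for p in paragraphs]
        # A page's features aggregate the features of its paragraphs.
        page_features = sum((n.features for n in para_nodes), Counter())
        page_nodes.append(Node("page", page_features, para_nodes))
    # The root holds the global (whole-document) features.
    doc_features = sum((n.features for n in page_nodes), Counter())
    return Node("document", doc_features, page_nodes)


doc = build_tree([
    ["neural network model", "self organizing map"],
    ["document tree", "tree features"],
])
```

In this sketch, the root node holds counts for the whole document, while each child keeps the counts for its own page or paragraph, so where a word occurs is preserved alongside how often it occurs overall.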
Keywords :
Tree-structured features , Self-organizing map , Multi-layer hybrid network , Document classification , Hierarchical organization
Journal title :
Expert Systems with Applications