Regional Information Center for Science and Technology (RICEST) - Automatic titling of electronic documents with noun phrase extraction

DocumentCode :

2039375

Title :

Automatic titling of electronic documents with noun phrase extraction

Author :

Lopez, Cedric ; Prince, Violaine ; Roche, Mathieu

Author_Institution :

LIRMM, Univ. Montpellier 2, Montpellier, France

fYear :

2010

fDate :

7-10 Dec. 2010

Firstpage :

168

Lastpage :

171

Abstract :

Automatic titling (i.e. providing titles) is one of the key domains of Web site accessibility. This paper presents an approach for the automatic titling of texts (e.g. emails, fora), relying on a morphosyntactic study of human-written titles in a corpus of various texts. The method is developed in four stages: corpus acquisition, determination of candidate sentences for titling, noun phrase extraction from the candidate sentences, and finally, selection of a particular noun phrase to serve as the text title (ChTitres approach). The method was evaluated by ten users, and the satisfaction survey shows that the titles selected through this process are relevant.
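
The last three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the ChTitres implementation: the abstract does not say how candidate sentences are chosen, how noun phrases are chunked, or how the final phrase is selected, so every heuristic here is an assumption. The toy lexicon `TOY_POS`, the "first two sentences" rule, the `D?A*N+` chunk pattern, and the frequency-based score are all hypothetical stand-ins, and corpus acquisition (stage one) is not modeled.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a real POS tagger (assumption).
# D = determiner, A = adjective, N = noun; unknown words are tagged O (other).
TOY_POS = {
    "the": "D", "a": "D", "an": "D",
    "new": "A", "weekly": "A",
    "meeting": "N", "schedule": "N", "team": "N", "room": "N",
}

def candidate_sentences(text, max_sentences=2):
    """Candidate selection (assumed heuristic): keep the first few sentences."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [s.strip() for s in parts if s.strip()][:max_sentences]

def noun_phrases(sentence):
    """Noun phrase extraction: maximal runs matching D? A* N+ over the toy tags."""
    tokens = re.findall(r"[A-Za-z]+", sentence.lower())
    tags = "".join(TOY_POS.get(t, "O") for t in tokens)
    # One tag character per token, so match offsets index directly into tokens.
    return [" ".join(tokens[m.start():m.end()])
            for m in re.finditer(r"D?A*N+", tags)]

def select_title(text):
    """Title selection (assumed score): favor phrases whose nouns recur in the
    text, breaking ties in favor of the longer phrase."""
    counts = Counter(re.findall(r"[A-Za-z]+", text.lower()))
    candidates = [np for s in candidate_sentences(text) for np in noun_phrases(s)]
    if not candidates:
        return None

    def score(np):
        toks = np.split()
        noun_weight = sum(counts[t] for t in toks if TOY_POS.get(t) == "N")
        return (noun_weight, len(toks))

    return max(candidates, key=score)

if __name__ == "__main__":
    email = ("The weekly meeting schedule changed. "
             "The team moved the meeting to a new room. "
             "Please check the schedule.")
    print(select_title(email))  # prints: the weekly meeting schedule
```

A real system would replace the toy lexicon with a trained morphosyntactic tagger and derive the selection criteria from the studied corpus of human-written titles, as the abstract indicates.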

Keywords :

Web sites; information retrieval; text analysis; Web site accessibility; automatic titling; candidate sentences determination; corpus acquisition; electronic documents; noun phrase extraction; text title; Data mining; Electronic mail; Encyclopedias; Humans; Indexes; Internet; Semantics; application; morphosyntactic characteristics; noun phrases; titling

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Soft Computing and Pattern Recognition (SoCPaR), 2010 International Conference of

Conference_Location :

Paris

Print_ISBN :

978-1-4244-7897-2

Type :

conf

DOI :

10.1109/SOCPAR.2010.5686088

Filename :

5686088

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2039375