DocumentCode
3779460
Title
TALAA-ASC: A sentence compression corpus for Arabic
Author
Riadh Belkebir;Ahmed Guessoum
Author_Institution
Natural Language Processing and Machine Learning Research Group, Laboratory of Research in Artificial Intelligence, Computer Science Department, Université des Sciences et de la Technologie Houari Boumediene (USTHB), Algiers, Algeria
fYear
2015
Firstpage
1
Lastpage
8
Abstract
Sentence compression has been studied extensively for many languages, but little effort has been devoted to Arabic. One reason is the absence of Arabic sentence compression corpora: building and evaluating sentence compression systems requires parallel corpora of source sentences paired with their corresponding compressions. In this paper, we present TALAA-ASC, the first Arabic sentence compression corpus. We describe the methodology we followed to construct the corpus and report the statistics and analyses we performed on it.
Keywords
"XML","Buildings","Guidelines","Natural language processing","Supervised learning","Integer linear programming","Noise measurement"
Publisher
ieee
Conference_Titel
Computer Systems and Applications (AICCSA), 2015 IEEE/ACS 12th International Conference of
Electronic_ISBN
2161-5330
Type
conf
DOI
10.1109/AICCSA.2015.7507228
Filename
7507228
Link To Document