Title :
Ranking vs. Classification: A Case Study in Mining Organization Name Translation from Snippets
Author :
Yang, Muyun ; Shi, Zhenyong ; Li, Sheng ; Zhao, Tiejun ; Qi, Haoliang
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
Abstract :
Both classification and ranking strategies have been reported to work well for mining named entity (NE) translations from the snippets returned by Web search engines. Taking the most challenging case, organization names and their translations, as an example, this paper conducts a contrastive study of the two strategies under the SVM framework. We show empirically that translation ranking achieves the best performance across various data settings, with top-1 precision of up to 65.75%. We conclude that, compared with the classification strategy, the ranking strategy is better suited to snippet-based translation mining, in which the unbalanced data issue prevails.
Keywords :
Internet; data mining; language translation; natural language processing; search engines; SVM framework; Web search engine; classification strategy; data mining; organization name translation mining; snippet mining; support vector network; translation ranking method; Computer science; Data mining; Internet; Machine learning; Modular construction; Search engines; Support vector machine classification; Support vector machines; Web pages; Web search; SVM; classification; organization name translation; ranking; snippet mining;
Conference_Title :
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-0-7695-3904-1
DOI :
10.1109/IALP.2009.73