DocumentCode :
1841749
Title :
On international business intelligence Out-Of-Vocabulary processing based on sentence-aligned web corpus
Author :
He, Yanxiang ; Tian, Ye ; Lin, Lu ; Du, Yanyan ; Deng, Jie
Author_Institution :
Comput. Sch., Wuhan Univ., Wuhan, China
Volume :
1
fYear :
2011
fDate :
13-15 May 2011
Firstpage :
451
Lastpage :
455
Abstract :
International business intelligence processing is an important cross-disciplinary research problem in artificial intelligence. Recognizing Out-Of-Vocabulary (OOV) words in international commercial activities, and the OOV phrases derived from them, poses a challenge to widely used machine translation technology. An electronic dictionary with a fixed lexicon cannot keep up with the rapid growth of international commercial OOV phrases. In this paper, we present a recognition and translation technique for OOV phrases in international business intelligence based on a sentence-aligned web corpus. We first obtain the latest and most relevant textual resources from the Internet and build a sentence-aligned corpus. We then calculate the relevancy of adjacent word strings with a Markov model to obtain a maximum-likelihood segmentation, and identify the OOV words and OOV phrases in the business context. Next, we remove redundancy and calculate the probabilities and weights of co-occurring word pairs. This yields the OOV word pairs and the translations of OOV phrases in business intelligence. Experiments show good results in the international business domain, with timely updates.
Keywords :
Internet; Markov processes; competitive intelligence; dictionaries; language translation; natural language processing; Markov model; artificial intelligence; electronic dictionary; international business intelligence; international commercial activities; machine translation technology; out-of-vocabulary processing; sentence aligned Web corpus; textual resource; Accuracy; Business; Probability; Semantics; Vocabulary; Out-Of-Vocabulary; sentence-aligned corpus
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Business Management and Electronic Information (BMEI), 2011 International Conference on
Conference_Location :
Guangzhou
Print_ISBN :
978-1-61284-108-3
Type :
conf
DOI :
10.1109/ICBMEI.2011.5916970
Filename :
5916970