Regional Information Center for Science and Technology (RICeST) - Comparison of Hash Table Verses Lexical Transducer Based Implementations of Urdu Lexicon

DocumentCode :

2850943

Title :

Comparison of Hash Table Verses Lexical Transducer Based Implementations of Urdu Lexicon

Author :

Rizvi, S. M Jafar ; Hussain, Mutawarra ; Qaiser, Naeem

Author_Institution :

Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences (PIEAS), Islamabad, Pakistan. JafarRizvi@Gmail.com

fYear :

2004

fDate :

30-31 Dec. 2004

Firstpage :

Lastpage :

Abstract :

A lexicon is the foundation of many natural language processing applications. This paper describes and compares approaches to implementing an Urdu lexicon. A raw lexicon stored as a simple word list is expensive in both search time and space. A hash table with appropriate hash functions achieves fast search times, close to those of perfect hashing. Hashing yields a simpler, acceptable lexicon design at the cost of some extra space. Storing the lexicon in a trie reduces both search time and size. Further improvement is achieved by converting the trie into a directed acyclic word graph, which is then used to separate word stems from prefixes and suffixes automatically. Only high-frequency prefixes and suffixes that carry productive morphological information are retained in the final lexical transducer. The comparison shows that the lexical transducer implementation is more complex than hashing, because it requires morphological analysis, but it is efficient in both search time and storage space.
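The storage options named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word list uses English placeholder words rather than Urdu data, and every name in it (`TrieNode`, `build_trie`, `trie_contains`, `count_trie_nodes`, `count_dawg_states`) is hypothetical. It shows a hash-based lookup, a trie lookup, and how merging identical subtrees (the idea behind a directed acyclic word graph) reduces the number of states by sharing common suffixes.

```python
# Illustrative sketch only: placeholder words and hypothetical names,
# not taken from the paper.

class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next node
        self.is_word = False  # True if a word ends at this node


def build_trie(words):
    """Insert every word character by character, sharing common prefixes."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def trie_contains(root, word):
    """Follow one edge per character; cost depends on word length only."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.is_word


def count_trie_nodes(root):
    """Number of nodes in the trie, including the root."""
    return 1 + sum(count_trie_nodes(c) for c in root.children.values())


def count_dawg_states(root):
    """Number of states left after merging identical subtrees.

    Two nodes are merged when they agree on is_word and on their
    outgoing edges, which is how a DAWG shares common suffixes.
    """
    registry = {}

    def signature(node):
        edges = tuple(sorted((ch, signature(child))
                             for ch, child in node.children.items()))
        key = (node.is_word, edges)
        return registry.setdefault(key, len(registry))

    signature(root)
    return len(registry)


# Placeholder lexicon: two stems that share the same suffixes.
words = ["walk", "walks", "walked", "talk", "talks", "talked"]

hash_lexicon = set(words)   # hash-based lookup, average O(1) per query
root = build_trie(words)    # prefix-sharing lookup structure

print("walked" in hash_lexicon, trie_contains(root, "walked"))
print(count_trie_nodes(root), count_dawg_states(root))
```

In this toy lexicon the two stems share the same endings, so suffix merging collapses them into a single chain of states. That shared structure is the kind of regularity the abstract describes exploiting when separating stems from affixes.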

Keywords :

Finite State Automata; Hash Table; Lexical Transducer; Urdu Lexicon; Application software; Automata; Costs; Frequency; Information retrieval; Morphology; Natural language processing; Speech synthesis; Synthesizers; Transducers

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Engineering, Sciences and Technology, Student Conference On

Print_ISBN :

0-7803-8871-2

Type :

conf

DOI :

10.1109/SCONES.2004.1564764

Filename :

1564764

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2850943