Title :
Transliteration Based Bengali Text Compression using Huffman principle
Author :
Hossain, M. Mofazzal ; Habib, Ahsan ; Rahman, Md Saifur
Author_Institution :
Comput. Sci. & Eng., Shahjalal Univ. of Sci. & Technol., Sylhet, Bangladesh
Abstract :
In this paper, we propose a new technique to compress a more symbolic language such as Bengali through a less symbolic language such as English using the Huffman principle. First, we transliterate the text from the more symbolic language into the less symbolic language, and then we apply the Huffman principle to the transliterated text. We also show that the proposed transliteration-based method outperforms the existing basic Huffman technique for every piece of Bengali text and that a significant compression ratio can be achieved.
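The two-step pipeline described in the abstract (transliterate, then Huffman-code the result) can be illustrated with a minimal sketch. This is not the authors' implementation: `TOY_MAP` is a hypothetical seven-entry table invented for illustration and is not the Avro scheme named in the keywords, and the function names `transliterate`, `huffman_code`, `encode`, and `decode` are assumptions. The Huffman construction is the standard textbook one.

```python
import heapq
from collections import Counter

# Hypothetical toy table for illustration only; NOT the Avro mapping.
TOY_MAP = {"ব": "b", "া": "a", "ং": "ng", "ল": "l", "আ": "a", "ম": "m", "র": "r"}


def transliterate(text, table=TOY_MAP):
    """Replace each mapped Bengali character; pass other characters through."""
    return "".join(table.get(ch, ch) for ch in text)


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codewords.
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, code):
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    inverse = {b: s for s, b in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


latin = transliterate("আমার বাংলা")
code = huffman_code(latin)
bits = encode(latin, code)
assert decode(bits, code) == latin
```

Note that this toy table is lossy (two Bengali characters map to "a"), so the sketch only round-trips the transliterated text; recovering the original Bengali would need a reversible transliteration scheme, which is outside what this sketch shows.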
Keywords :
data compression; natural language processing; text analysis; Huffman principle; symbolic language; transliteration based Bengali text compression; Computer science; Conferences; Encoding; Floors; Informatics; Vegetation; ASCII code; Avro; Bengali text; Transliteration; UNICODE
Conference_Title :
Informatics, Electronics & Vision (ICIEV), 2014 International Conference on
Conference_Location :
Dhaka
Print_ISBN :
978-1-4799-5179-6
DOI :
10.1109/ICIEV.2014.6850745