• DocumentCode
    2914115
  • Title

    Optimizing alphabet using genetic algorithms

  • Author

    Platos, Jan ; Kromer, Pavel

  • Author_Institution
    Dept. of Comput. Sci., VSB-Tech. Univ. of Ostrava, Ostrava, Czech Republic
  • fYear
    2011
  • fDate
    22-24 Nov. 2011
  • Firstpage
    498
  • Lastpage
    503
  • Abstract
    Data compression algorithms were usually designed for data processing symbol by symbol. The input symbols of these algorithms are usually taken from the ASCII table, i.e. the size of the input alphabet is 256 symbols which are representable by 8-bit numbers. Several other techniques were developed: syllable-based compression, which uses the syllable as a basic compression symbol, and word-based compression, which uses words as basic symbols. These three approaches are strictly bounded and no overlap is allowed. This may be a problem because it may be helpful to have an overlap between them and use a character-based approach with a few symbols as a sequence of characters. This paper describes an algorithm that looks for the optimal alphabet for different text files. The alphabet may contain characters and 2-grams.
  • Keywords
    data compression; genetic algorithms; 2-gram; ASCII table; basic compression symbol; character-based approach; data compression algorithm; data processing symbol; genetic algorithm; syllable-based compression; word-based compression; Algorithm design and analysis; Biological cells; Compression algorithms; Data compression; Dictionaries; Genetic algorithms; Genetics; alphabet optimization; data compression; genetic algorithm; LZW
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
  • Conference_Location
    Cordoba
  • ISSN
    2164-7143
  • Print_ISBN
    978-1-4577-1676-8
  • Type

    conf

  • DOI
    10.1109/ISDA.2011.6121705
  • Filename
    6121705
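
The idea in the abstract, an alphabet of single characters extended with a few selected 2-grams and chosen by a genetic algorithm, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`tokenize`, `estimated_bits`, `optimize_alphabet`), the greedy left-to-right tokenization, the bit-mask chromosome, the selection and mutation scheme, and above all the fitness function, which here is the zeroth-order entropy of the token stream rather than the output size of a real compressor such as the LZW mentioned in the keywords. The cost of storing the alphabet itself is ignored.

```python
import math
import random
from collections import Counter


def tokenize(text, bigrams):
    """Greedy left-to-right split into single characters and selected 2-grams."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in bigrams:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def estimated_bits(text, bigrams):
    """Zeroth-order entropy of the token stream in bits (assumed stand-in fitness)."""
    counts = Counter(tokenize(text, bigrams))
    total = sum(counts.values())
    return -sum(c * math.log2(c / total) for c in counts.values())


def optimize_alphabet(text, candidates, generations=30, pop_size=20, seed=0):
    """Tiny GA over bit masks: bit k says whether candidates[k] joins the alphabet."""
    rng = random.Random(seed)
    n = len(candidates)

    def cost(mask):
        chosen = {c for c, bit in zip(candidates, mask) if bit}
        return estimated_bits(text, chosen)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [0] * n  # plain character alphabet kept as a baseline
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]  # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(0, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] ^= 1  # single bit-flip mutation
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return {c for c, bit in zip(candidates, best) if bit}


if __name__ == "__main__":
    sample = "the theory of the thing is that the other theme thrives " * 20
    # Candidate 2-grams: the most frequent adjacent character pairs in the text.
    pairs = Counter(sample[i:i + 2] for i in range(len(sample) - 1))
    candidates = [p for p, _ in pairs.most_common(8)]
    chosen = optimize_alphabet(sample, candidates)
    print("chosen 2-grams:", sorted(chosen))
    print("characters only:", round(estimated_bits(sample, set())), "bits")
    print("with 2-grams:  ", round(estimated_bits(sample, chosen)), "bits")
```

Because the best individuals survive each generation and the character-only alphabet is seeded into the initial population, the returned alphabet never scores worse than characters alone under this fitness. Replacing `estimated_bits` with the measured output size of an actual compressor would bring the sketch closer to a practical setting.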