• DocumentCode
    3511026
  • Title

    Research on Text Feature Extraction Based on Hybrid Parallel Genetic Algorithm

  • Author

    Dai, Wenhua ; Jiao, Cuizhen ; He, Tingting

  • Author_Institution
    Dept. of Comput., Xianning Coll., Xianning
  • fYear
    2007
  • fDate
    21-25 Sept. 2007
  • Firstpage
    5581
  • Lastpage
    5584
  • Abstract
    Synonymy and strongly related semantic information inflate the feature dimension of text vectors, which reduces the efficiency and precision of text classification. To reduce the feature dimension, this paper proposes a text feature extraction method based on a hybrid parallel genetic clustering algorithm. First, the K-means algorithm performs coarse-grained clustering of the feature words; next, a hybrid parallel genetic algorithm performs fine-grained clustering of the feature words; finally, the feature words in each cluster are analyzed and compressed to form a feature word set that reflects the features of the text classes and their semantic information. Experiments show that the method is effective for text feature extraction.
  • Keywords
    classification; data compression; feature extraction; genetic algorithms; parallel algorithms; text analysis; K-means algorithm; feature word compression; parallel genetic clustering algorithm; relational semantic information; text classification; text feature extraction; text synonymous word; Clustering algorithms; Computer science; Concurrent computing; Educational institutions; Educational technology; Feature extraction; Genetic algorithms; Helium; Large scale integration; Text categorization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Wireless Communications, Networking and Mobile Computing, 2007. WiCom 2007. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-1311-9
  • Type

    conf

  • DOI
    10.1109/WICOM.2007.1367
  • Filename
    4341142
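
  • Illustrative_Sketch
    The three stages named in the abstract (coarse K-means clustering, finer genetic clustering, then compression of each cluster into representative feature words) can be sketched roughly as below. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the hybrid parallel genetic algorithm, so a plain serial genetic algorithm stands in for it; the fitness function (within-cluster squared distance), the "closest to centroid" compression rule, every function name, every parameter value, and the toy word vectors are all hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(vectors, k, rng, iters=20):
    """Stage 1 (coarse): plain K-means; returns one cluster label per vector."""
    centers = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(v, centers[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = centroid(members)
    return labels

def wcss(vectors, labels):
    """Within-cluster sum of squared distances (assumed fitness; lower is better)."""
    total = 0.0
    for c in set(labels):
        members = [v for v, l in zip(vectors, labels) if l == c]
        ctr = centroid(members)
        total += sum(sq_dist(v, ctr) for v in members)
    return total

def ga_refine(vectors, k, rng, pop_size=20, generations=40, mutation=0.1):
    """Stage 2 (fine): serial GA stand-in; a chromosome is a list of labels."""
    n = len(vectors)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ch: wcss(vectors, ch))
        elite = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]         # one-point crossover
            child = [rng.randrange(k) if rng.random() < mutation else g
                     for g in child]          # per-gene mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda ch: wcss(vectors, ch))

def extract_features(words, vectors, coarse_k=2, fine_k=2, seed=0):
    """Stage 3: keep, per fine cluster, the word nearest the cluster centroid."""
    rng = random.Random(seed)
    coarse = kmeans(vectors, coarse_k, rng)
    selected = []
    for c in sorted(set(coarse)):
        idx = [i for i, l in enumerate(coarse) if l == c]
        sub = [vectors[i] for i in idx]
        fine = ga_refine(sub, min(fine_k, len(sub)), rng)
        for f in sorted(set(fine)):
            members = [i for i, l in zip(idx, fine) if l == f]
            ctr = centroid([vectors[i] for i in members])
            rep = min(members, key=lambda i: sq_dist(vectors[i], ctr))
            selected.append(words[rep])
    return selected

# Made-up toy data: six words with invented 2-D vectors.
toy_words = ["car", "auto", "vehicle", "apple", "pear", "fruit"]
toy_vectors = [(0.9, 0.1), (0.85, 0.15), (0.8, 0.2),
               (0.1, 0.9), (0.15, 0.85), (0.2, 0.8)]
print(extract_features(toy_words, toy_vectors))
```

    The output is a reduced word list with at most coarse_k × fine_k entries. A parallel variant would typically evolve several populations at once and exchange individuals between them, but the abstract does not say how the paper does this, so it is left out here.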