• DocumentCode
    2000117
  • Title

    Information retrieval: A new multilingual stemmer based on a statistical approach

  • Author

    Gadri, Said ; Moussaoui, Abdelouahab

  • Author_Institution
    Dept. of ICST, Univ. of M'sila, M'sila, Algeria
  • fYear
    2015
  • fDate
    25-27 May 2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Stemming is a technique that reduces inflected and derived words to their base form (stem or root). It is an important pre-processing step in text mining and is widely used in research areas such as natural language processing (NLP), text categorization (TC), text summarization (TS), and information retrieval (IR). Stemming is useful in text categorization for reducing the size of the term vocabulary, and in information retrieval for improving search effectiveness and returning more relevant results. In this paper, we propose a new multilingual stemmer based on word root extraction using n-grams. We validated our stemmer on three languages: Arabic, French, and English.
  • Keywords
    data mining; information retrieval; natural language processing; statistical analysis; text analysis; vocabulary; Arabic language; English language; French language; NLP; multilingual stemmer; statistical approach; terms vocabulary; text categorization; text mining; text summarizing; word root extraction; Error analysis; Integrated circuits; Bigrams technique; Machine learning; Root extraction; Stemming
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control, Engineering & Information Technology (CEIT), 2015 3rd International Conference on
  • Conference_Location
    Tlemcen
  • Type

    conf

  • DOI
    10.1109/CEIT.2015.7233113
  • Filename
    7233113