• DocumentCode
    1124061
  • Title
    Zipping out relevant information
  • Author
    Benedetto, Dario; Caglioti, Emanuele; Loreto, Vittorio
  • Author_Institution
    Math. Dept., La Sapienza Univ., Rome, Italy
  • Volume
    5
  • Issue
    1
  • fYear
    2003
  • Firstpage
    80
  • Lastpage
    85
  • Abstract
    Although the abundance and accessibility of information represent an important cultural advance, they also introduce a new challenge: retrieving relevant information. However, the growing body of available data provides an ideal test bed for theoretical constructions and models. This opportunity has stimulated considerable interest from researchers in many different communities: physicists, mathematicians, economists, and statisticians, to name a few. In this spirit, we seek to discover the most suitable tools for examining large masses of data and extracting useful information from them. The information-theoretic method described in this article applies to any corpus of character strings, independent of the type of coding behind it. The method has great potential for fields where human intuition might fail: DNA and protein sequences, geological time series, stock market data, medical monitoring, and so on.
  • Keywords
    information needs; information retrieval; information theory; scientific information systems; DNA sequences; geological time series; information extraction; information theoretic method; medical monitoring; protein sequences; relevant information retrieval; stock market data; Cultural differences; DNA; Data mining; Geology; Global communication; Humans; Information retrieval; Proteins; Sequences; Testing
  • fLanguage
    English
  • Journal_Title
    Computing in Science & Engineering
  • Publisher
    IEEE
  • ISSN
    1521-9615
  • Type
    jour
  • DOI
    10.1109/MCISE.2003.1166556
  • Filename
    1166556