  • DocumentCode
    3656568
  • Title
    Efficient storage and retrieval of very large document databases
  • Author
    Fumihiro Matsuo;Shouichi Futamura;Takeshi Shinohara
  • Author_Institution
    Computer Center, Kyushu University 91, Hakozaki, Fukuoka 812, Japan
  • fYear
    1986
  • Firstpage
    456
  • Lastpage
    463
  • Abstract
    The authors have developed an information retrieval system named AIR (Augmented Information Retrieval system), which may be one of the most efficient systems for very large document databases. AIR stores document data compactly and retrieves it quickly. Three techniques that give AIR its efficiency are discussed: data compression, a quick keyword index, and automatic keyword selection. These techniques, which are based on the statistical properties of word occurrence, are fairly simple, so information retrieval systems employing them can be implemented with ease. The data compression technique reduces English text by a factor of 4. The quick keyword index lowers the average number of disk accesses needed to retrieve a keyword to about 0.3. The automatic keyword selection technique roughly halves both the number of distinct keywords and the size of the inverted file, with only a 2% loss of retrieval power.
  • Keywords
    "Indexes","Information retrieval","Data compression","Vegetation","Natural languages","Decoding"
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 1986 IEEE Second International Conference on
  • Print_ISBN
    978-0-8186-0655-7
  • Type
    conf
  • DOI
    10.1109/ICDE.1986.7266252
  • Filename
    7266252