• DocumentCode
    610070
  • Title
    Quadratic Similarity Queries on Compressed Data
  • Author
    Ingber, Amir; Courtade, Thomas; Weissman, Tsachy
  • Author_Institution
    Dept. of Electr. Eng., Stanford Univ., Stanford, CA, USA
  • fYear
    2013
  • fDate
    20-22 March 2013
  • Firstpage
    441
  • Lastpage
    450
  • Abstract
    The problem of performing similarity queries on compressed data is considered. We study the fundamental tradeoff between compression rate, sequence length, and reliability of queries performed on compressed data. For a Gaussian source and quadratic similarity criterion, we show that queries can be answered reliably if and only if the compression rate exceeds a given threshold, the identification rate, which we explicitly characterize. When compression is performed at a rate greater than the identification rate, responses to queries on the compressed data can be made exponentially reliable. We give a complete characterization of this exponent, which is analogous to the error and excess-distortion exponents in channel and source coding, respectively. For a general source, we prove that the identification rate is at most that of a Gaussian source with the same variance. Therefore, as with classical compression, the Gaussian source requires the largest compression rate. Moreover, a scheme is described that attains this maximal rate for any source distribution.
  • Keywords
    channel coding; data compression; query processing; source coding; Gaussian source; compressed data; compression rate; excess-distortion exponents; identification rate; quadratic similarity criterion; quadratic similarity queries; query reliability; sequence length; source distribution; Compression; Fundamental limits; Hash; Search; similarity query
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference (DCC), 2013
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    978-1-4673-6037-1
  • Type
    conf
  • DOI
    10.1109/DCC.2013.52
  • Filename
    6543080