DocumentCode :
579972
Title :
Finding Correlations in Subquadratic Time, with Applications to Learning Parities and Juntas
Author :
Valiant, Gregory
Author_Institution :
UC Berkeley, Berkeley, CA, USA
fYear :
2012
fDate :
20-23 Oct. 2012
Firstpage :
11
Lastpage :
20
Abstract :
Given a set of n d-dimensional Boolean vectors with the promise that the vectors are chosen uniformly at random with the exception of two vectors that have Pearson correlation ρ (Hamming distance d·(1−ρ)/2), how quickly can one find the two correlated vectors? We present an algorithm which, for any constants ε, ρ > 0 and d ≫ (log n)/ρ², finds the correlated pair with high probability, and runs in time O(n^(3ω/4 + ε)) < O(n^1.8), where ω < 2.38 is the exponent of matrix multiplication. Provided that d is sufficiently large, this runtime can be further reduced. These are the first subquadratic-time algorithms for this problem in which ρ does not appear in the exponent of n, improving upon the O(n^(2−O(ρ))) runtimes given by Paturi et al. [15], Locality Sensitive Hashing (LSH) [11], and the Bucketing Codes approach [6]. Applications and extensions of this basic algorithm yield improved algorithms for several other problems:
Approximate Closest Pair: For any sufficiently small constant ε > 0, given n vectors in R^d, our algorithm returns a pair of vectors whose Euclidean distance differs from that of the closest pair by a factor of at most 1+ε, and runs in time O(n^(2−Θ(√ε))). The best previous algorithms (including LSH) have runtime O(n^(2−O(ε))).
Learning Sparse Parity with Noise: Given samples from an instance of the learning parity with noise problem where each example has length n, the true parity set has size at most k ≪ n, and the noise rate is η, our algorithm identifies the set of k indices in time n^((ω+ε)k/3) · poly(1/(1−2η)) < n^(0.8k) · poly(1/(1−2η)). Aside from the trivial brute-force algorithm, this is the first algorithm with no dependence on η in the exponent of n.
Learning k-Juntas with Noise: Given uniformly random length-n Boolean vectors, together with a label which is some function of just k ≪ n of the bits, perturbed by noise rate η, return the set of relevant indices. Leveraging the reduction of Feldman et al. [7], our result for learning k-parities implies an algorithm for this problem with runtime n^((ω+ε)k/3) · poly(1/(1−2η)) < n^(0.8k) · poly(1/(1−2η)), which improves on the previous best of n^(k(1−2/2^k)) · poly(1/(1−2η)), from [8].
Learning k-Juntas without Noise: Our results for learning sparse parities with noise imply an algorithm for learning juntas without noise with runtime n^((ω+ε)k/4) · poly(n) < n^(0.6k) · poly(n), which improves on the runtime n^(ωk/(ω+1)) · poly(n) ≈ n^(0.7k) · poly(n) of Mossel et al. [13].
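To make the planted-correlation problem concrete, the following is a minimal sketch of the problem setup from the abstract, solved by the quadratic baseline that the paper's subquadratic algorithm improves upon: compute all pairwise inner products with one matrix product and take the largest off-diagonal entry. The parameter values (n, d, ρ) are illustrative choices, not from the paper, and this is not the paper's algorithm, only the naive detector it accelerates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, rho = 200, 4000, 0.4  # illustrative sizes; the abstract needs d >> (log n)/rho^2

# n uniformly random +/-1 vectors (the +/-1 encoding of Boolean vectors).
X = rng.choice([-1.0, 1.0], size=(n, d))

# Plant a correlated pair: vector 1 copies vector 0, then each coordinate is
# flipped independently with probability (1 - rho)/2, giving expected Pearson
# correlation rho (Hamming distance about d*(1-rho)/2 in the 0/1 encoding).
flips = rng.random(d) < (1 - rho) / 2
X[1] = np.where(flips, -X[0], X[0])

# All pairwise inner products via a single matrix product. Done naively this
# is O(n^2 d); the paper's aggregation plus fast matrix multiplication is what
# pushes the overall search below n^2.
G = X @ X.T
np.fill_diagonal(G, -np.inf)  # ignore each vector's correlation with itself

# The planted pair has expected inner product rho*d ~ 1600, while a random
# pair's inner product has standard deviation sqrt(d) ~ 63, so the maximum
# off-diagonal entry identifies the pair with high probability.
i, j = np.unravel_index(np.argmax(G), G.shape)
print(sorted((int(i), int(j))))  # the planted pair, [0, 1], with high probability
```

The same maximum-inner-product search is what the learning-parity and junta applications reduce to, after mapping examples to appropriately aggregated vectors.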
Keywords :
Boolean algebra; computational complexity; computational geometry; cryptography; learning (artificial intelligence); matrix multiplication; vectors; Euclidean distance; Hamming distance; LSH; Pearson-correlation; approximate closest pair; bucketing codes; correlated vectors; correlation finding; d-dimensional Boolean vectors; learning k-juntas; learning k-parities; learning sparse parity; locality sensitive hashing; matrix multiplication; noise problem; noise rate; subquadratic-time algorithms; Approximation algorithms; Chebyshev approximation; Correlation; Noise; Noise measurement; Runtime; Vectors; Correlation; closest pair; learning juntas; learning parity with noise; locality sensitive hashing; metric embedding; nearest neighbor;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on
Conference_Location :
New Brunswick, NJ
ISSN :
0272-5428
Print_ISBN :
978-1-4673-4383-1
Type :
conf
DOI :
10.1109/FOCS.2012.27
Filename :
6375277