FOCS 1999
An Approximate L 1 -Difference Algorithm for Massive Data Streams
Abstract
We give a space-efficient, one-pass algorithm for approximating the L/sup 1/ difference /spl Sigma//sub i/|a/sub i/-b/sub i/| between two functions, when the function values a/sub i/ and b/sub i/ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families {V/sub j/} of limited independence random variables that are range summable by which we mean that /spl Sigma//sub j=0//sup c-1/ V/sub j/(s) is computable in time polylog(c), for all seeds s. These random variable families may be of interest outside our current application domain, i. e. , massive data streams generated by communication networks. Our L/sup 1/-difference algorithm can be viewed as a "sketching" algorithm, in the sense of (A. Broder et al. , 1998), and our algorithm performs better than that of Broder et al. , when used to approximate the symmetric difference of two sets with small symmetric difference.
Authors
Keywords
Context
- Venue
- IEEE Symposium on Foundations of Computer Science
- Archive span
- 1975-2025
- Indexed papers
- 3809
- Paper id
- 808401617865808394