You are currently browsing the monthly archive for July 2009.
We’re presented with a numerical stream and we want to compute some function where is the frequency of and is an arbitrary function with . One of the many neat ideas in [Alon, Matias, Szegedy] is to use the following basic estimator for this type of problem:
- Pick uniformly at random from
Note that can be computed using space as the data arrives. It’s not hard to show that because,
Hence, computing multiple basic estimators in parallel and averaging appropriately will give a good estimate of . Quite how many basic estimators are required depends on the variance of and this determines the total space used by the algorithm.
Using this approach, it was shown that the -th order frequency moment, , can be approximated up to a factor with probability using only bits of space. This was subsequently improved to an (essentially) optimal result in [Indyk, Woodruff].