You are currently browsing the monthly archive for October 2010.

Today, we consider estimating the median of a set of n values given a single pass over a stream formed by randomly permuting the set. Seems like a pretty simple problem, right?

It’s known that we can find an O(\sqrt{n})-approximate median, i.e., an element with rank n/2 \pm O(\sqrt{n}), in polylog space with probability 9/10, where the probability is taken over both the coin flips of the algorithm and the random permutation. However, the best known lower bound states that finding an n^\gamma-approximate median requires \Omega(n^{1/2-3\gamma/2}) space [Guha, McGregor]. If this bound is optimal, it would be possible to find an n^{1/3}-approximate median in polylogarithmic space. Does there exist such an algorithm, or can the lower bound be improved?
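To make the definition concrete, here is a small Python sketch of what "t-approximate median" means, together with a naive baseline for random-order streams. The names `rank`, `is_approx_median` and `sample_median_baseline` are made up for illustration. The baseline is not the polylog-space algorithm referred to above: it simply stores the first m items (a uniform sample, since the order is random) and returns their median, which typically gives a rank error of roughly n/\sqrt{m}, far worse than \sqrt{n} when m is polylogarithmic.

```python
import random


def rank(values, x):
    """Number of elements of `values` that are <= x."""
    return sum(1 for v in values if v <= x)


def is_approx_median(values, x, t):
    """True if x is an element of `values` whose rank is within t of n/2."""
    n = len(values)
    return x in values and abs(rank(values, x) - n / 2) <= t


def sample_median_baseline(stream, m):
    """Naive baseline (an assumption for illustration, not the algorithm
    from the post): keep the first m items of the stream and return
    their median. Uses m words of space and a single pass."""
    prefix = []
    for item in stream:
        if len(prefix) < m:
            prefix.append(item)
        # Later items are read but ignored by this baseline.
    prefix.sort()
    return prefix[len(prefix) // 2]


if __name__ == "__main__":
    rng = random.Random(0)
    values = list(range(1, 10001))
    stream = values[:]
    rng.shuffle(stream)  # random-order stream over the same set
    guess = sample_median_baseline(stream, m=100)
    print("rank error:", abs(rank(values, guess) - len(values) / 2))
```

Running the script prints the rank error of the baseline on one random permutation; comparing it against \sqrt{n} = 100 for different values of m gives a feel for how much better the polylog-space result is.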

Background. The exact median can be found in O(\sqrt{n}) space [Munro, Paterson]. It seems reasonable to conjecture that \Theta(n^{1/2-\gamma}) space is necessary and sufficient to find an O(n^\gamma)-approximate median.
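As a quick sanity check on the exponents (ignoring the suppressed polylog factors), one can ask where each space bound drops to n^0, i.e., polylog space:

```latex
\begin{align*}
\text{lower bound [Guha, McGregor]:} &\quad \tfrac{1}{2} - \tfrac{3\gamma}{2} = 0 \iff \gamma = \tfrac{1}{3} \\
\text{conjectured bound:} &\quad \tfrac{1}{2} - \gamma = 0 \iff \gamma = \tfrac{1}{2} \\
\text{at } \gamma = 0: &\quad n^{1/2 - 3\gamma/2} = n^{1/2 - \gamma} = \sqrt{n}
\end{align*}
```

So the two bounds agree for the exact median, and the conjecture matches the known O(\sqrt{n})-approximation in polylog space at \gamma = 1/2. The open range is 0 < \gamma < 1/2, where the exponents differ by \gamma/2 as long as the lower bound is non-trivial, and where the lower bound says nothing beyond polylog space once \gamma \ge 1/3.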

(Disclaimer: I’ve suppressed some polylog and poly-constant terms above. And, actually, you can shave off one of the O(\log n) terms I didn’t mention in the lower bound if you allow public random bits in the argument.)


A research blog about data streams and related topics.
