The last four BSS posts considered streams of numerical values. While there are still many more bite-sized results to cover for such streams, I thought we’d mix things up a bit and switch to graph streams for the next four posts. We’ll consider a stream of unweighted, undirected edges and try to estimate properties of the graph
defined by this stream subject to the usual constraints of stream computation (sequential access to the stream and limited memory). For this post, we focus on estimating distances.
The results we’ll discuss revolve around constructing a -spanner, a sub-graph
of
such that for any
where is the length of the shortest path between
in graph
. Obviously, given such a graph we can approximate any distance up to a factor of
So how can we construct a -spanner in the data-stream model? Here’s a very simple algorithm: Intialize
to
and on receiving edge
, add the edge to
if and only if
.
It’s easy to show that is indeed a
-spanner because, for each edge
on a shortest path in
,
either includes
or a path between
and
of length at most
. The space taken by the algorithm is just the space required to store
. How large can this be? Well, note that
has girth at least
, i.e., it contains no cycles of length
or shorter. It can be shown that such a graph has at most
edges if
is odd (see e.g., [Bollobás] or [Indyk's lecture slides]) and hence we have a stream algorithm using
bits of memory that finds a
approximation for any shortest path distance.
Unfortunately, the time to process each edge in the above algorithm is rather large and this spurred some nice follow-up work including [Feigenbaum et al.], [Baswana], and [Elkin]. It’s now known how to construct a -spanner in
space and constant worst-case time per edge.
It’s not hard to extend these algorithms to weighted graphs (consider geometrically grouping the edge weights and constructing a spanner for each distinct edge weight) but there’s bad news for directed graphs: even determining if there exists a directed path between two nodes requires bits of memory.
BONUS! Some British English… Back in the UK, we call this

a spanner. Apparently that’s not the usual term in US English and I found this out after giving a talk at SODA back in 2005. The talk went reasonably well but, during the coffee break, numerous people asked why I’d kept on displaying pictures of wrenches throughout my talk. Whoops.
In technical writing, I normally endeavor [sic] to use US English (although I did once engage in some “colouring of paths” in a paper with a Canadian co-author). But I still think “math and stat” sounds funny :)

6 comments
Comments feed for this article
October 16, 2009 at 1:40 am
kunal
A spanner sounds so much better than a wrench.
So are you suggesting that in the US, using the (fine) expression “a spanner in the works” would lead to questions during the coffee break?
October 16, 2009 at 8:10 pm
Andrew
Well…
a) “a wrench in the works” has 3,090,000 hits on Google
b) “a spanner in the works” has 804,000 hits on Google
which surprised me since I wasn’t conscious of ever having heard the first expression. Perhaps my automatic translation is working too well.
October 19, 2009 at 6:27 pm
Kamalika
Wow! I didn’t know that they called them wrenches here — I had also been happily using the term spanner so far…although never in a talk….
February 22, 2010 at 6:04 am
Bite-sized streams: Graph Matchings (Part 1) « the polylogblog
[...] series of “bite-sized results” in data stream algorithms. Back in October, we discussed graph spanners and now it’s time for some graph [...]
April 6, 2010 at 2:52 am
Mert
You mention that finding u,v connectivity requires quadratic bits of memory. Can you elaborate a bit about this? In what model is this?
Thank you
April 6, 2010 at 11:51 am
anon
The quadratic lower bound is for the one-pass model when the edges are directed. The lower bound is a reduction from indexing: consider a bipartite graph on L\cup R where the edges are directed from left to right. Add nodes s, t and arcs (s,l), (r,t) for some l \in L, r \in R. Then there is a directed path from s to t iff (l,r) is in the original bipartite graph and this graph could have had any subset of O(n^2) edges.