The last four BSS posts considered streams of numerical values. While there are still many more bite-sized results to cover for such streams, I thought we’d mix things up a bit and switch to graph streams for the next four posts. We’ll consider a stream of unweighted, undirected edges and try to estimate properties of the graph defined by this stream subject to the usual constraints of stream computation (sequential access to the stream and limited memory). For this post, we focus on estimating distances.

The results we’ll discuss revolve around constructing a *-spanner*, a sub-graph of such that for any

where is the length of the shortest path between in graph . Obviously, given such a graph we can approximate any distance up to a factor of

So how can we construct a -spanner in the data-stream model? Here’s a very simple algorithm: Intialize to and on receiving edge , add the edge to if and only if .

It’s easy to show that is indeed a -spanner because, for each edge on a shortest path in , either includes or a path between and of length at most . The space taken by the algorithm is just the space required to store . How large can this be? Well, note that has girth at least , i.e., it contains no cycles of length or shorter. It can be shown that such a graph has at most edges if is odd (see e.g., [Bollobás] or [Indyk's lecture slides]) and hence we have a stream algorithm using bits of memory that finds a approximation for any shortest path distance.

Unfortunately, the time to process each edge in the above algorithm is rather large and this spurred some nice follow-up work including [Feigenbaum et al.], [Baswana], and [Elkin]. It’s now known how to construct a -spanner in space and constant worst-case time per edge.

It’s not hard to extend these algorithms to weighted graphs (consider geometrically grouping the edge weights and constructing a spanner for each distinct edge weight) but there’s bad news for directed graphs: even determining if there exists a directed path between two nodes requires bits of memory.

**BONUS! Some British English…** Back in the UK, we call this

a spanner. Apparently that’s not the usual term in US English and I found this out after giving a talk at SODA back in 2005. The talk went reasonably well but, during the coffee break, numerous people asked why I’d kept on displaying pictures of wrenches throughout my talk. Whoops.

In technical writing, I normally endeavor [sic] to use US English (although I did once engage in some “colouring of paths” in a paper with a Canadian co-author). But I still think “math and stat” sounds funny :)

## 6 comments

Comments feed for this article

October 16, 2009 at 1:40 am

kunalA spanner sounds so much better than a wrench.

So are you suggesting that in the US, using the (fine) expression “a spanner in the works” would lead to questions during the coffee break?

October 16, 2009 at 8:10 pm

AndrewWell…

a) “a wrench in the works” has 3,090,000 hits on Google

b) “a spanner in the works” has 804,000 hits on Google

which surprised me since I wasn’t conscious of ever having heard the first expression. Perhaps my automatic translation is working too well.

October 19, 2009 at 6:27 pm

KamalikaWow! I didn’t know that they called them wrenches here — I had also been happily using the term spanner so far…although never in a talk….

February 22, 2010 at 6:04 am

Bite-sized streams: Graph Matchings (Part 1) « the polylogblog[...] series of “bite-sized results” in data stream algorithms. Back in October, we discussed graph spanners and now it’s time for some graph [...]

April 6, 2010 at 2:52 am

MertYou mention that finding u,v connectivity requires quadratic bits of memory. Can you elaborate a bit about this? In what model is this?

Thank you

April 6, 2010 at 11:51 am

anonThe quadratic lower bound is for the one-pass model when the edges are directed. The lower bound is a reduction from indexing: consider a bipartite graph on L\cup R where the edges are directed from left to right. Add nodes s, t and arcs (s,l), (r,t) for some l \in L, r \in R. Then there is a directed path from s to t iff (l,r) is in the original bipartite graph and this graph could have had any subset of O(n^2) edges.