You are currently browsing the monthly archive for October 2009.

(Amit Chakrabarti reports from Atlanta on FOCS 2009)

The FOCS 2009 programme proper begins tomorrow. Today, at Georgia Tech, we listened to four hour-long talks by distinguished researchers, celebrating the 50th anniversary of FOCS and the 20th of the university’s ACO program.

  1. We heard Dick Karp talk about what makes an algorithm “great”. The talk surveyed many of the significant milestones in our ongoing journey of discovery. In passing, he mentioned a number of (mostly recently-developed) subareas he didn’t get to dwell upon, one of which was “streaming algorithms”.
  2. Mihalis Yannakakis spoke about the computational aspects of finding equilibria, and the related issues of local search algorithms and convergence. The SQUARE-ROOT-SUM problem made an appearance, as did a joke about the throwing-overboard of Hippasus.
  3. Noga Alon talked about isoperimetric inequalities and their applications in algorithms (focussing on the DISJOINT-PATHS problem) and combinatorics (focussing on finding economical toric spines). Noga recalled publishing his first FOCS paper exactly 25 years ago, and thanked 21 of the 25 PCs since then for treating well his many submissions. He consulted his notes to list the four exceptions. This was easily the most technical of today’s talks. Noga even apologized for this mid-talk, to my surprise (doesn’t this community enjoy the technical details?).
  4. Rounding out the day was a talk by Manuel Blum who hardly talked TCS (as we know it) at all, focussing instead of the deeply philosophical matter of the definition of consciousness. He did not provide an attempt at a crisp answer, but instead gave five vignettes, one of which introduced us to Global Workspace Theory. This talk was unlike any we are used to hearing from a computer scientist. Though it did not suggest any research directions to me, it had the stimulating effect of a good Borges short.

Over dinner, I couldn’t help but comment to my companions on what I wished I’d heard today: What were the controversies in the field 20, 30, 40 years ago? I’m sure the more senior among us could regale us with tales that make the Conceptual Wars of today seem tame!

The last four BSS posts considered streams of numerical values. While there are still many more bite-sized results to cover for such streams, I thought we’d mix things up a bit and switch to graph streams for the next four posts. We’ll consider a stream of unweighted, undirected edges \langle e_1, \ldots, e_m\rangle and try to estimate properties of the graph G=(V,E) defined by this stream subject to the usual constraints of stream computation (sequential access to the stream and limited memory). For this post, we focus on estimating distances.

The results we’ll discuss revolve around constructing a t-spanner, a sub-graph H=(V,E') of G such that for any u,v\in V

d_G(u,v)\leq d_H(u,v) \leq t d_G(u,v)

where d_X(u,v) is the length of the shortest path between u, v\in V in graph X. Obviously, given such a graph we can approximate any distance up to a factor of t.

So how can we construct a t-spanner in the data-stream model? Here’s a very simple algorithm: Intialize H to \emptyset and on receiving edge e=(u,v), add the edge to H if and only if d_H(u,v)>t.

It’s easy to show that H is indeed a t-spanner because, for each edge e=(u,v) on a shortest path in G, H either includes e or a path between u and v of length at most t. The space taken by the algorithm is just the space required to store H. How large can this be? Well, note that H has girth at least t+2, i.e., it contains no cycles of length t+1 or shorter. It can be shown that such a graph has at most O(n^{1+2/(t+1)}) edges if t is odd (see e.g., [Bollobás] or [Indyk’s lecture slides]) and hence we have a stream algorithm using \tilde{O}(n^{1+2/(t+1)}) bits of memory that finds a t approximation for any shortest path distance.

Unfortunately, the time to process each edge in the above algorithm is rather large and this spurred some nice follow-up work including [Feigenbaum et al.], [Baswana], and [Elkin]. It’s now known how to construct a t-spanner in \tilde{O}(n^{1+2/(t+1)}) space and constant worst-case time per edge.

It’s not hard to extend these algorithms to weighted graphs (consider geometrically grouping the edge weights and constructing a spanner for each distinct edge weight) but there’s bad news for directed graphs: even determining if there exists a directed path between two nodes requires \Omega(n^2) bits of memory.

BONUS! Some British English… Back in the UK, we call this


a spanner. Apparently that’s not the usual term in US English and I found this out after giving a talk at SODA back in 2005. The talk went reasonably well but, during the coffee break, numerous people asked why I’d kept on displaying pictures of wrenches throughout my talk. Whoops.

In technical writing, I normally endeavor [sic] to use US English (although I did once engage in some “colouring of paths” in a paper with a Canadian co-author). But I still think “math and stat” sounds funny :)


A research blog about data streams and related topics.

Recently Tweeted