Readers with a particularly long memory may recall that we’re in the midst of a series of “bite-sized results” in data stream algorithms. Back in October, we discussed graph spanners and now it’s time for some graph matchings.

So consider a stream of undirected weighted edges $\langle (e_1,w_{e_i}), \ldots, (e_m,w_{e_m})\rangle$ where $w_{e}$ is the weight of edge $e$. Can we find a matching, i.e., a subset of the edges that don’t share any endpoints, such that the sum of the associated weights is maximized?

Here’s a really simple algorithm that maintains a matching as the edges are processed:

• The initial matching is the empty matching
• When processing edge $e$, the weight of $e$ is compared with the total weight $W$ of the conflicting edges, i.e., the edges in the currently stored matching that share an endpoint with $e$. If $w_e>(1+\gamma) W$ then $e$ is added to the matching and the conflicting edges are removed.

It makes a nice exercise to prove that the above algorithm achieves a $3+1/\gamma+2\gamma$ approximation ratio (see pages 149 and 150 of this thing if you get stuck) and choosing the optimal $\gamma$ gives you a 5.828 approximation. Incidentally, this algorithm later found use in a paper on auctioning ad slots [Constantin, Feldman, Muthukrishnan, Pal].

Subsequently, a 5.585 approximation algorithm was presented in [Zelke]. The main idea was to remember some of the edges that were ejected from the matching in the second step of the above algorithm. These edges are potentially added back into the matching at some later point. While a natural idea, it requires substantially more analysis to show the improved approximation ratio.