## Faster Algorithms for Semi-Matching Problems

Authors: Jittat Fakcharoenphol, Bundit Laekhanukit, Danupon Nanongkai

Conference: ICALP 2010

Abstract:

We consider the problem of finding semi-matchings in bipartite graphs, a problem also extensively studied under various names in the scheduling literature. We give faster algorithms for both the weighted and unweighted cases.

For the weighted case, we give an $O(nm\log n)$-time algorithm, where $n$ is the number of vertices and $m$ is the number of edges, by exploiting the geometric structure of the problem. This improves the classical $O(n^3)$ algorithms by Horn [Operations Research 1973] and Bruno, Coffman and Sethi [Communications of the ACM 1974].

For the unweighted case, the bound can be improved even further. We give a simple divide-and-conquer algorithm that runs in time $O(\sqrt{n}m\log n)$, improving two previous $O(nm)$-time algorithms by Abraham [MSc thesis, University of Glasgow 2003] and Harvey, Ladner, Lovász and Tamir [WADS 2003 and Journal of Algorithms 2006]. We also extend this algorithm to solve the Balanced Edge Cover problem in time $O(\sqrt{n}m\log n)$, improving the previous $O(nm)$-time algorithm by Harada, Ono, Sadakane and Yamashita [ISAAC 2008].
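To make the unweighted problem concrete, here is a minimal brute-force sketch, not the divide-and-conquer algorithm from the paper. It assumes the usual scheduling view: every task (left vertex) is assigned to exactly one adjacent machine (right vertex), and a machine with load $k$ contributes $1 + 2 + \dots + k$ to the cost. The abstract does not state this objective, so treat it as an assumption; the names `semi_matching_cost` and `brute_force_semi_matching` are hypothetical.

```python
from itertools import product


def semi_matching_cost(assignment):
    """Cost of an assignment {task: machine}.

    Assumed objective: a machine with load k contributes 1 + 2 + ... + k.
    """
    loads = {}
    for machine in assignment.values():
        loads[machine] = loads.get(machine, 0) + 1
    return sum(k * (k + 1) // 2 for k in loads.values())


def brute_force_semi_matching(neighbors):
    """Try every way to assign each task to one adjacent machine.

    Exponential time; only meant to illustrate the problem on tiny inputs.
    """
    tasks = sorted(neighbors)
    best = None
    for choice in product(*(neighbors[t] for t in tasks)):
        assignment = dict(zip(tasks, choice))
        cost = semi_matching_cost(assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Three tasks, two machines; t1 can only run on m1.
neighbors = {"t1": ["m1"], "t2": ["m1", "m2"], "t3": ["m1", "m2"]}
cost, assignment = brute_force_semi_matching(neighbors)
print(cost, assignment)
```

On this instance, placing all three tasks on `m1` costs 6, while any split that puts two tasks on one machine and one on the other costs 4.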

## Randomized Multi-pass Streaming Skyline Algorithms (VLDB’09)

Authors (ordered alphabetically): Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Jun Xu

Journal: Soon

Conference: VLDB 2009: 35th International Conference on Very Large Data Bases

Abstract:

We consider external algorithms for skyline computation without pre-processing. Our goal is to develop an algorithm with a good worst-case guarantee that also performs well on average. Due to the nature of disks, it is desirable that such algorithms access the input as a stream (even if in multiple passes). Using randomization, which has proved useful in many applications, we present an efficient multi-pass streaming algorithm, RAND, for skyline computation. As far as we are aware, RAND is the first randomized skyline algorithm in the literature.

RAND is near-optimal for the streaming model, which we prove via a simple lower bound. Additionally, our algorithm is distributable and can handle partially ordered domains on each attribute. Finally, we demonstrate the robustness of RAND via extensive experiments on both real and synthetic datasets. RAND is comparable to the existing algorithms in the average case and is additionally tolerant to simple modifications of the data, while other algorithms degrade considerably with such variation.
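For readers unfamiliar with skylines, the sketch below shows what is being computed. It is a simple single-pass, in-memory window baseline and not RAND itself. It assumes totally ordered numeric attributes where smaller is better; the function names are hypothetical.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every attribute and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point.

    Reads the input once as a stream while keeping the current candidate
    skyline in memory.
    """
    window = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated, so it cannot be a skyline point
        # p survives; drop the candidates that p dominates
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


points = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
print(skyline(points))
```

Here `(3, 3)` and `(4, 4)` are dominated by `(2, 2)`, so the skyline is the remaining three points. The window can grow as large as the skyline, which is one reason a memory-bounded streaming setting may need multiple passes.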

Update History

[v1] August 20, 2009 (Conference version)

## Fast Distributed Random Walks (PODC’09)

Authors: Atish Das Sarma, Danupon Nanongkai, Gopal Pandurangan

Journal: Journal of the ACM 2013

Conference: PODC 2009: 28th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing

Abstract:

Performing random walks in networks is a fundamental primitive that has found applications in many areas of computer science, including distributed computing. In this paper, we focus on the problem of performing random walks efficiently in a distributed network. Given bandwidth constraints, the goal is to minimize the number of rounds required to obtain a random walk sample.

All previous algorithms that compute a random walk sample of length $\ell$ as a subroutine do so naively, i.e., in $O(\ell)$ rounds. The main contribution of this paper is a fast distributed algorithm for performing random walks. We show that a random walk sample of length $\ell$ can be computed in $\tilde{O}(\ell^{2/3}D^{1/3})$ rounds on an undirected unweighted network, where $D$ is the diameter of the network. ($\tilde{O}$ hides $\frac{\log n}{\delta}$ factors, where $n$ is the number of nodes in the network and $\delta$ is the minimum degree.) For small-diameter graphs, this is a significant improvement over the naive $O(\ell)$ bound. We also show that our algorithm can be applied to speed up the more general Metropolis-Hastings sampling.
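The naive baseline mentioned above can be sketched as a sequential simulation, assuming that a token is forwarded to a uniformly random neighbor once per communication round. This illustrates only the $O(\ell)$-round baseline, not the paper's faster algorithm; `naive_random_walk` and the adjacency-list format are hypothetical.

```python
import random


def naive_random_walk(adj, source, length, rng):
    """Simulate token forwarding: one hop per round, `length` rounds total.

    adj maps each node to the list of its neighbors (undirected, unweighted).
    Returns the node holding the token at the end and the rounds used.
    """
    node = source
    rounds = 0
    for _ in range(length):
        node = rng.choice(adj[node])  # forward token to a uniform random neighbor
        rounds += 1
    return node, rounds


# A 6-node cycle: node i is adjacent to i-1 and i+1 (mod 6).
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
end, rounds = naive_random_walk(cycle, source=0, length=10, rng=random.Random(0))
print(end, rounds)
```

The round count always equals the walk length. Ignoring the factors hidden by $\tilde{O}$, the bound $\ell^{2/3}D^{1/3}$ is below $\ell$ exactly when $D < \ell$, which is why the gain is largest on small-diameter graphs.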

We extend our algorithms to perform a large number, $k$, of random walks efficiently. We show how $k$ destinations can be sampled in $\tilde{O}((k\ell)^{2/3}D^{1/3})$ rounds if $k\leq \ell^2$ and $\tilde{O}((k\ell)^{1/2})$ rounds otherwise. We also present faster algorithms for performing random walks of length larger than (or equal to) the mixing time of the underlying graph. Our techniques can be useful in speeding up distributed algorithms for a variety of applications that use random walks as a subroutine.

Keywords: Random walks, Random sampling, Distributed algorithm, Metropolis-Hastings sampling.

Update History

[v1] May 31, 2009 (Conference version)
[v2] Feb 18, 2013 (Journal version posted)