## Independent Set, Induced Matching, and Pricing: Connections and Tight (Subexponential Time) Approximation Hardnesses

We present a series of almost settled inapproximability results for three fundamental problems. The first in our series is the {\em subexponential-time} inapproximability of the {\em maximum independent set} problem, a question studied in the area of {\em parameterized complexity}. The second is the hardness of approximating the {\em maximum induced matching} on bounded-degree bipartite graphs. The last in our series is the tight hardness of approximating the {\em $k$-hypergraph pricing} problem, a fundamental problem arising from the area of {\em algorithmic game theory}. In particular, assuming the Exponential Time Hypothesis, our two main results are:

• For any $r$ larger than some constant, any $r$-approximation algorithm for the maximum independent set problem must run in time at least $2^{n^{1-\epsilon}/r^{1+\epsilon}}$. This nearly matches the upper bound of $2^{n/r}$ [Cygan et al:  Manuscript 2008]. It also improves some hardness results in the domain of parameterized complexity (e.g. [Escoffier et al: Manuscript 2012] and [Chitnis et al: Manuscript 2013]).
• For any $k$ larger than some constant, there is no polynomial time $\min \{k^{1-\epsilon}, n^{1/2-\epsilon}\}$-approximation algorithm for the $k$-hypergraph pricing problem, where $n$ is the number of vertices in an input graph. This almost matches the upper bound of $\min \{O(k), \tilde O(\sqrt{n})\}$ (by Balcan and Blum in [Balcan and Blum: Theory of Computing 2007] and an algorithm in this paper).

We note an interesting fact that, in contrast to the $n^{1/2-\epsilon}$ hardness, the $k$-hypergraph pricing problem admits an $n^{\delta}$ approximation for any $\delta >0$ in quasi-polynomial time. This puts the pricing problem in a rare approximability class in which approximability thresholds can be improved significantly by allowing algorithms to run in quasi-polynomial time.
The proofs of our hardness results rely on several unexpectedly tight connections between the three problems. First, we establish a connection between the first and second problems by proving a new graph-theoretic property related to an {\em induced matching number} of dispersers. Then, we show that the $n^{1/2-\epsilon}$ hardness of the last problem follows from nearly tight {\em subexponential time} inapproximability of the first problem, illustrating a rare application of the second type of inapproximability result to the first one. Finally, to prove the subexponential-time inapproximability of the first problem, we construct a new PCP with several properties; it is sparse and has nearly-linear size, large degree, and small free-bit complexity. Our PCP requires no ground-breaking ideas but rather a very careful assembly of the existing ingredients in the PCP literature.

## A Tight Lower Bound on Distributed Random Walk Computation

Author: Danupon Nanongkai, Atish Das Sarma, Gopal Pandurangan
(Author names are NOT in alphabetical order.)

Conference: PODC 2012

Journal:

Abstract:

We consider the problem of performing a random walk in a distributed network. Given bandwidth constraints, the goal of the problem is to minimize the number of rounds required to obtain a random walk sample. Das Sarma et al. [PODC’10] show that a random walk of length $\ell$ on a network of diameter $D$ can be performed in $\tilde O(\sqrt{\ell D}+D)$ time. A major question left open is whether there exists a faster algorithm, especially whether the multiplication of $\sqrt{\ell}$ and $\sqrt{D}$ is necessary.
In this paper, we show a tight unconditional lower bound on the time complexity of distributed random walk computation. Specifically, we show that for any $n$, $D$, and $D\leq \ell \leq (n/(D^3\log n))^{1/4}$, performing a random walk of length $\Theta(\ell)$ on an $n$-node network of diameter $D$ requires $\Omega(\sqrt{\ell D}+D)$ time. This bound is {\em unconditional}, i.e., it holds for any (possibly randomized) algorithm. To the best of our knowledge, this is the first lower bound in which the diameter plays a role as a multiplicative factor. Our bound shows that the algorithm of Das Sarma et al. is time optimal.
Our proof technique introduces a new connection between {\em bounded-round} communication complexity and distributed algorithm lower bounds with $D$ as a trade-off parameter, strengthening the previous study by Das Sarma et al. [STOC’11]. In particular, we make use of the bounded-round communication complexity of the pointer chasing problem. Our technique can be of independent interest and may be useful in showing non-trivial lower bounds on the complexity of other fundamental distributed computing problems.

Update History

6.10.2010: New link to the pdf posted. PPTX posted.

## Representative Skylines using Threshold-based Preference Distributions

Author: Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Richard J. Lipton, Jun Xu

Conference: ICDE 2011: the IEEE International Conference on Data Engineering [link]

Abstract:

The study of skylines and their variants has received considerable attention in recent years. Skylines are essentially sets of most interesting (undominated) tuples in a database. However, since the skyline is often very large, much research effort has been devoted to identifying a smaller subset of (say $k$) “representative skyline” points. Several different definitions of representative skylines have been considered. Most of these formulations are intuitive in that they try to achieve some kind of clustering “spread” over the entire skyline, with $k$ points. In this work, we take a more principled approach in defining the representative skyline objective. One of our main contributions is to formulate the problem of displaying $k$ representative skyline points such that the probability that a random user would click on one of them is maximized.

Two major research questions arise naturally from this formulation. First, how does one mathematically model the likelihood with which a user is interested in and will “click” on a certain tuple? Second, how does one negotiate the absence of the knowledge of an explicit set of target users; in particular, what do we mean by “a random user”? To answer the first question, we model users based on a novel formulation of threshold preferences which we will motivate further in the paper. To answer the second question, we assume a probability distribution of users instead of a fixed set of users. While this makes the problem harder, it lends more mathematical structure that can be exploited as well, as one can now work with probabilities of thresholds and handle cumulative density functions.

On the theoretical front, our objective is NP-hard. For the case of a finite set of users with known thresholds, we present a simple greedy algorithm that attains an approximation ratio of $(1-1/e)$ of the optimal. For the case of user distributions, we show that a careful yet similar greedy algorithm achieves the same approximation ratio. Unfortunately, it turns out that this algorithm is rather involved and computationally expensive. So we present a threshold-sampling-based algorithm that is more computationally affordable and, for any fixed $\epsilon > 0$, has an approximation ratio of $(1-1/e-\epsilon)$. We perform experiments on both real and synthetic data to show that our algorithm significantly outperforms previously proposed approaches.
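The $(1-1/e)$ guarantee is the classic bound for greedy maximization of a monotone submodular set function, and maximizing the probability that at least one displayed point is clicked has exactly that structure. The sketch below illustrates the pattern for the finite-user case only; the objective `covers(user, point)` is a hypothetical predicate standing in for the paper's threshold preferences, not the authors' actual formulation.

```python
def greedy_representatives(skyline, users, k, covers):
    """Greedily pick k skyline points to maximize the number of users who
    would click on at least one chosen point. Since this coverage objective
    is monotone submodular, greedy gives a (1 - 1/e)-approximation.

    covers(user, point) is a hypothetical predicate: True if the point
    satisfies the user's threshold preferences.
    """
    chosen = []
    for _ in range(min(k, len(skyline))):
        best, best_gain = None, -1
        for p in skyline:
            if p in chosen:
                continue
            # marginal gain: users newly covered by adding p
            gain = sum(
                1 for u in users
                if covers(u, p) and not any(covers(u, q) for q in chosen)
            )
            if gain > best_gain:
                best, best_gain = p, gain
        chosen.append(best)
    return chosen
```

For instance, with `covers(u, p)` meaning that point `p` meets user `u`'s minimum threshold on every attribute, the first pick is the point covering the most users, and each later pick maximizes the number of newly covered users.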

Update History

[v1] November 14, 2010 (Conference version)

## Improved Hardness of Approximation for Stackelberg Shortest-Path Pricing

Author: Bundit Laekhanukit, Danupon Nanongkai
(Alphabetical order)

Conference: WINE 2010: 6th Workshop on Internet & Network Economics [wiki].  Published in LNCS Vol. 6484.

Abstract:

We consider the Stackelberg shortest-path pricing problem, which is defined as follows. Given a graph $G$ with fixed-cost and pricable edges and two distinct vertices $s$ and $t$, we may assign prices to the pricable edges. Based on the predefined fixed costs and our prices, a customer purchases a cheapest $s$-$t$-path in $G$ and we receive payment equal to the sum of prices of pricable edges belonging to the path. Our goal is to find prices maximizing the payment received from the customer. While Stackelberg shortest-path pricing was known to be APX-hard before, we provide the first explicit approximation threshold and prove hardness of approximation within $2 - o(1)$. We also prove that for the nicely structured type of instance resulting from our reduction, the gap between the revenue of an optimal pricing and the only known general upper bound can still be logarithmically large.

Update History

[v1] October 1, 2010 (Conference version)

## Fast Distributed Random Walks (PODC’09)

Authors: Atish Das Sarma, Danupon Nanongkai, Gopal Pandurangan

Journal: Journal of the ACM 2013

Conference: PODC 2009: 28th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing [wiki]

Abstract:

Performing random walks in networks is a fundamental primitive that has found applications in many areas of computer science, including distributed computing. In this paper, we focus on the problem of performing random walks efficiently in a distributed network. Given bandwidth constraints, the goal is to minimize the number of rounds required to obtain a random walk sample.

All previous algorithms that compute a random walk sample of length $\ell$ as a subroutine always do so naively, i.e., in $O(\ell)$ rounds. The main contribution of this paper is a fast distributed
algorithm for performing random walks. We show that a random walk sample of length $\ell$ can be computed in $\tilde{O}(\ell^{2/3}D^{1/3})$ rounds on an undirected unweighted network, where $D$ is the diameter of the network. ($\tilde{O}$ hides $\frac{\log n}{\delta}$ factors, where $n$ is the number of nodes in the network and $\delta$ is the minimum degree.) For small diameter graphs, this is a significant improvement over the naive $O(\ell)$ bound. We also show that our algorithm can be applied to speed up the more general Metropolis-Hastings sampling.

We extend our algorithms to perform a large number, $k$, of random walks efficiently. We show how $k$ destinations can be sampled in $\tilde{O}((k\ell)^{2/3}D^{1/3})$ rounds if $k\leq \ell^2$ and $\tilde{O}((k\ell)^{1/2})$ rounds otherwise. We  also present faster algorithms for performing random walks of length larger than (or equal to) the mixing time of the underlying graph. Our techniques can be useful in speeding up distributed algorithms for a variety of applications that use random walks as a subroutine.
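One way to build intuition for why a length-$\ell$ walk need not cost $\ell$ rounds is to view it as a concatenation of short segments that can, in a distributed setting, be prepared ahead of time and then stitched end to end. The toy sequential simulation below only illustrates that stitching length-$\lambda$ segments reproduces a plain walk; it is not the paper's distributed algorithm, and the graph/parameter names are for illustration.

```python
import random

def short_walk(graph, start, lam, rng):
    """One random walk of length lam on an undirected graph
    given as a dict of adjacency lists."""
    v = start
    for _ in range(lam):
        v = rng.choice(graph[v])
    return v

def stitched_walk(graph, start, length, lam, rng):
    """Simulate a walk of `length` steps by stitching length-lam segments.
    Run sequentially this is just an ordinary walk; the point of the
    distributed algorithm is that many short segments can be prepared
    in parallel before being stitched."""
    v, steps = start, 0
    while steps + lam <= length:
        v = short_walk(graph, v, lam, rng)  # consume one precomputed segment
        steps += lam
    # finish the leftover < lam steps
    return short_walk(graph, v, length - steps, rng)
```

On a two-node graph each step is forced, so the endpoint depends only on the parity of the total walk length, which makes it easy to check that stitching preserves the step count exactly.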

Keywords: Random walks, Random sampling, Distributed algorithm, Metropolis-Hastings sampling.

Update History

[v1] May 31, 2009 (Conference version)
[v2] Feb 18, 2013 (Journal version posted)

## Best-Order Streaming Model (TAMC’09)

Authors: Atish Das Sarma, Richard J. Lipton, Danupon Nanongkai

Journal: Invited to Special Issue of Theoretical Computer Science [wiki]

Conference: TAMC 2009: 6th Annual Conference on Theory and Applications of Models of Computation [link]

Abstract:

We study a new model of computation called stream checking on graph problems where a space-limited verifier has to verify a proof sequentially (i.e., it reads the proof as a stream). Moreover, the proof itself is nothing but a reordering of the input data. This model has a close relationship to many models of computation in other areas such as data streams, communication complexity, and proof checking and could be used in applications such as cloud computing.

In this paper we focus on graph problems where the input is a sequence of edges. We show that checking if a graph has a perfect matching is impossible to do deterministically using small space. To contrast this, we show that randomized verifiers are powerful enough to check whether a graph has a perfect matching or is connected.

Update History

[v1] Apr 12, 2009 (Updated conference version in PDF)
[v2] Oct 1, 2009 (Submitted journal version in PDF)
[v3] Nov 22, 2010 (Link to the journal version)

## An Approximate Restatement of the Four Color Theorem (Manuscript)

Authors: Atish Das Sarma, Amita Gajewar, Richard J. Lipton, Danupon Nanongkai

Journal:

Conference:

Abstract:

The celebrated Four Color Theorem was first conjectured in the 1850’s. More than a century later Appel and Haken [1] were able to find the first proof. Previously, there had been many partial results, and many false proofs. Appel and Haken’s famous proof has one “drawback”: it makes extensive use of computer computation. More recently Robertson, Sanders, Seymour and Thomas [18] created another proof of the Four Color Theorem. However, their proof, while simplifying some technical parts of the Appel-Haken proof, still relies on computer computations.

There is some debate in the mathematical community about whether mathematicians should be satisfied with mathematical proofs that rely on extensive computation and there is interest in finding a new proof of the Four Color Theorem that relies on no computer computation. Such a proof would perhaps yield additional insights into why the Four Color Theorem is really true, and might yield new insights into the structure of planar graphs. In any case, there continues to be a search for such a proof.

The contribution of this paper is that we initiate a new approach towards proving the Four Color Theorem. Our approach is based on insights from computer science theory and modern combinatorics in the style of “Erdős”. Tait proved in 1880 [20] that the Four Color Theorem is equivalent to showing that certain plane graphs have edge 3-colorings. Our main result is that this can be weakened to show that if every one of these plane graphs has even an approximate edge 3-coloring, then the Four Color Theorem is true. While it is not clear that our approach will necessarily open new attacks on the Four Color Theorem, the connections made in this paper seem to be of independent interest.
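Tait's equivalence concerns proper edge 3-colorings: assigning one of three colors to every edge so that edges sharing a vertex get distinct colors. As a small standalone illustration (not part of the paper), a brute-force search verifies that $K_4$, the smallest bridgeless cubic planar graph, admits such a coloring:

```python
from itertools import product

def proper_edge_3_coloring(edges):
    """Return a proper edge 3-coloring (a dict edge -> color in {0,1,2}
    with edges sharing a vertex colored differently), or None if none exists.
    Brute force over all 3^|E| assignments, so only for tiny graphs."""
    m = len(edges)
    for colors in product(range(3), repeat=m):
        ok = True
        for i in range(m):
            for j in range(i + 1, m):
                if colors[i] == colors[j] and set(edges[i]) & set(edges[j]):
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return dict(zip(edges, colors))
    return None

# K4: the complete graph on 4 vertices (cubic, planar, bridgeless).
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
coloring = proper_edge_3_coloring(k4)
```

Here each color class of the returned coloring is a perfect matching of $K_4$, which is exactly the structure Tait's theorem exploits.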

Update History
[v1]
November 2007
