## Independent Set, Induced Matching, and Pricing: Connections and Tight (Subexponential Time) Approximation Hardnesses

We present a series of almost settled inapproximability results for three fundamental problems. The first in our series is the {\em subexponential-time} inapproximability of the {\em maximum independent set} problem, a question studied in the area of {\em parameterized complexity}. The second is the hardness of approximating the {\em maximum induced matching} on bounded-degree bipartite graphs. The last in our series is the tight hardness of approximating the {\em $k$-hypergraph pricing} problem, a fundamental problem arising from the area of {\em algorithmic game theory}. In particular, assuming the Exponential Time Hypothesis, our two main results are:

• For any $r$ larger than some constant, any $r$-approximation algorithm for the maximum independent set problem must run in time at least $2^{n^{1-\epsilon}/r^{1+\epsilon}}$. This nearly matches the upper bound of $2^{n/r}$ [Cygan et al.: Manuscript 2008]; a sketch of the folklore scheme behind bounds of this form appears after this list. It also improves several hardness results in parameterized complexity (e.g., [Escoffier et al.: Manuscript 2012] and [Chitnis et al.: Manuscript 2013]).
• For any $k$ larger than some constant, there is no polynomial-time $\min \{k^{1-\epsilon}, n^{1/2-\epsilon}\}$-approximation algorithm for the $k$-hypergraph pricing problem, where $n$ is the number of vertices in an input graph. This almost matches the upper bound of $\min \{O(k), \tilde O(\sqrt{n})\}$ (by Balcan and Blum [Theory of Computing 2007] and an algorithm in this paper).
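
To make the shape of the $2^{n/r}$ upper bound concrete, here is a minimal Python sketch of the folklore partition scheme: split the vertex set into $r$ blocks, solve each block exactly by brute force, and return the largest independent set found. Some block must contain at least a $1/r$ fraction of any optimum solution, which gives the approximation ratio. This illustrates only the standard argument behind bounds of this form, not necessarily the algorithm of Cygan et al.

```python
from itertools import combinations

def exact_mis(vertices, adj):
    # Brute-force maximum independent set on the induced subgraph,
    # in time roughly 2^{|vertices|}.
    vs = list(vertices)
    for size in range(len(vs), 0, -1):
        for cand in combinations(vs, size):
            if all(v not in adj[u] for u, v in combinations(cand, 2)):
                return list(cand)
    return []

def partition_approx_mis(n, adj, r):
    # Split vertices 0..n-1 into r blocks of size ~n/r, solve each block
    # exactly, and keep the best answer.  Any optimum independent set has
    # at least a 1/r fraction of its vertices in some block, so the
    # result is an r-approximation, in total time about r * 2^{n/r}.
    block = -(-n // r)  # ceil(n / r)
    best = []
    for i in range(0, n, block):
        cand = exact_mis(range(i, min(i + block, n)), adj)
        if len(cand) > len(best):
            best = cand
    return best

# Toy usage: the 5-cycle with r = 2 returns an independent set of size 2.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
print(partition_approx_mis(5, adj, 2))
```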

We note the interesting fact that, in contrast to the $n^{1/2-\epsilon}$ hardness, the $k$-hypergraph pricing problem admits an $n^{\delta}$-approximation for any $\delta > 0$ in quasi-polynomial time. This puts the pricing problem in a rare approximability class in which the approximability threshold can be improved significantly by allowing algorithms to run in quasi-polynomial time.
The proofs of our hardness results rely on several unexpectedly tight connections between the three problems. First, we establish a connection between the first and second problems by proving a new graph-theoretic property related to the {\em induced matching number} of dispersers. Then, we show that the $n^{1/2-\epsilon}$ hardness of the last problem follows from nearly tight {\em subexponential time} inapproximability of the first problem, illustrating a rare application of the second type of inapproximability result to the first one. Finally, to prove the subexponential-time inapproximability of the first problem, we construct a new PCP with several properties; it is sparse and has nearly-linear size, large degree, and small free-bit complexity. Our PCP requires no ground-breaking ideas but rather a very careful assembly of the existing ingredients in the PCP literature.

## Dynamic Approximate All-Pairs Shortest Paths: Breaking the O(mn) Barrier and Derandomization

Abstract:

We study dynamic $(1+\epsilon)$-approximation algorithms for the all-pairs shortest paths problem in unweighted undirected $n$-node $m$-edge graphs under edge deletions. The fastest algorithm for this problem is a randomized algorithm with a total update time of $\tilde O(mn)$ and constant query time by Roditty and Zwick (FOCS 2004). The fastest deterministic algorithm is from a 1981 paper by Even and Shiloach (JACM 1981); it has a total update time of $O(mn^2)$ and constant query time. We improve these results as follows:

(1) We present an algorithm with a total update time of $\tilde O(n^{5/2})$ and constant query time that has an additive error of two in addition to the $(1+\epsilon)$ multiplicative error. This beats the previous $\tilde O(mn)$ time when $m=\Omega(n^{3/2})$. Note that the additive error is unavoidable since, even in the static case, an $O(n^{3-\delta})$-time (so-called truly subcubic) combinatorial algorithm with $(1+\epsilon)$ multiplicative error cannot have an additive error less than $2-\epsilon$, unless we make a major breakthrough for Boolean matrix multiplication (Dor, Halperin and Zwick FOCS 1996) and many other long-standing problems (Vassilevska Williams and Williams FOCS 2010). The algorithm can also be turned into a $(2+\epsilon)$-approximation algorithm (without an additive error) with the same time guarantees, improving the recent $(3+\epsilon)$-approximation algorithm with $\tilde O(n^{5/2+O(1/\sqrt{\log n})})$ running time of Bernstein and Roditty (SODA 2011) in terms of both approximation and time guarantees.

(2) We present a deterministic algorithm with a total update time of $\tilde O(mn)$ and a query time of $O(\log\log n)$. The algorithm has a multiplicative error of $(1+\epsilon)$ and is the first improvement over the deterministic algorithm of Even and Shiloach since 1981. It also answers an open question raised by Bernstein in his STOC 2013 paper.

In order to achieve our results, we introduce two new techniques: (1) a monotone Even-Shiloach tree algorithm, which maintains a bounded-distance shortest-paths tree on a certain type of emulator called a locally persevering emulator; and (2) a derandomization technique based on moving Even-Shiloach trees, as a way to derandomize the standard random set argument. These techniques might be of independent interest.
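
As background for the first technique, the sketch below shows the classic Even-Shiloach tree that both results build on: a breadth-first tree maintained under edge deletions up to a depth bound $d$, with total update time $O(md)$. This is only the standard building block, not the monotone or emulator-based variant introduced in the paper.

```python
from collections import deque

INF = float("inf")

def es_init(adj, root, d):
    # BFS levels (distances) from the root, truncated at depth d.
    level = {v: INF for v in adj}
    level[root] = 0
    q = deque([root])
    while q:
        u = q.popleft()
        if level[u] < d:
            for v in adj[u]:
                if level[v] == INF:
                    level[v] = level[u] + 1
                    q.append(v)
    return level

def es_delete(adj, level, root, d, u, v):
    # Delete edge (u, v) and restore the invariant that every vertex at a
    # finite level has a neighbor one level closer to the root.  A vertex
    # is rescanned only when its level rises, and levels only rise (at
    # most d + 1 times each), giving O(m * d) time over all deletions.
    adj[u].discard(v)
    adj[v].discard(u)
    q = deque([u, v])
    while q:
        w = q.popleft()
        if w == root or level[w] == INF:
            continue
        if any(level[x] == level[w] - 1 for x in adj[w]):
            continue  # w still has a parent candidate; nothing to fix
        level[w] = level[w] + 1 if level[w] < d else INF
        q.append(w)        # re-check w at its new level
        q.extend(adj[w])   # neighbors may have lost their support
```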

## Multi-Attribute Profit-Maximizing Pricing

Author: Parinya Chalermsook, Khaled Elbassioni, Danupon Nanongkai, He Sun

Conference: Submitted

Journal:

Abstract:

In the unlimited-supply profit-maximizing pricing problem, we are given the consumers’ consideration sets and know their purchase strategy (e.g., buy the cheapest items). The goal is to price the items to maximize the revenue. Previous studies suggest that this problem is too general to admit even a sublinear approximation ratio (in terms of the number of items), even when the consumers are restricted to very simple purchase strategies.
In this paper we initiate the study of the multi-attribute pricing problem as a direction for breaking this barrier. Specifically, we consider the case where each item has a constant number of attributes, and each consumer would like to buy the items that satisfy her criteria in all attributes. This notion intuitively captures typical real-world settings and has been widely studied in marketing research, healthcare economics, etc. It also helps separate previously studied special cases, such as the highway pricing problem and the graph vertex pricing problem on planar and bipartite graphs, from the general case.

We show that this notion of attributes leads to improved approximation ratios for a large class of problems. This is obtained by utilizing the fact that the consideration sets have low VC-dimension and applying Dilworth’s theorem to a certain partial order defined on the set of items. As a consequence, we present sublinear-approximation algorithms, thus breaking the previous barrier, for two well-known variants of the problem: the unit-demand uniform-budget min-buying and single-minded pricing problems. Moreover, we generalize these techniques to the unit-demand utility-maximizing pricing problem and the (non-uniform) unit-demand min-buying pricing problem when valuations or budgets depend on attributes, as well as to the pricing problem with symmetric valuations and subadditive revenues. These results suggest that considering attributes is a promising research direction for obtaining improved approximation algorithms for such pricing problems.
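
As a point of reference for the single-minded variant mentioned above, the sketch below implements the classical single-uniform-price baseline from the pricing literature: some single price already extracts a logarithmic fraction of the optimal revenue. It is deliberately not the attribute-based technique of this paper, and the (bundle, budget) data format is our own illustration.

```python
def revenue_at(consumers, p):
    # A single-minded consumer with bundle S and budget B buys all of S
    # iff the total price p * |S| is within her budget.
    return sum(p * len(S) for (S, B) in consumers if p * len(S) <= B)

def best_single_price(consumers):
    # Among uniform prices, only prices of the form B / |S| can be
    # optimal, so it suffices to try each of them and keep the best.
    candidates = {B / len(S) for (S, B) in consumers if S}
    return max(candidates, key=lambda p: revenue_at(consumers, p))

# Toy usage: three consumers; the best uniform price here is 3.0,
# at which all of them buy, for a revenue of 15.0.
consumers = [({"a", "b"}, 10.0), ({"b"}, 4.0), ({"a", "c"}, 6.0)]
p = best_single_price(consumers)
print(p, revenue_at(consumers, p))
```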

## Interactive Regret Minimization

Author: Danupon Nanongkai, Atish Das Sarma, Ashwin Lall, Kazuhisa Makino
(Author names are NOT in alphabetical order.)

Conference: SIGMOD 2012

Journal:

Abstract:

We study the notion of regret ratio proposed by Nanongkai et al. [VLDB’10] to deal with multi-criteria decision making in database systems. The regret minimization query proposed by Nanongkai et al. was shown to have features of both skyline and top-$k$: it does not need information from the user but still controls the output size. While this approach is suitable for obtaining a reasonably small regret ratio, it is still open whether one can make the regret ratio arbitrarily small. Moreover, it remains open whether reasonable questions can be asked to the users in order to improve the efficiency of the process.

In this paper, we study the problem of minimizing the regret ratio when the system is enhanced with interaction. We assume that when presented with a set of tuples, the user can tell which tuple is most preferred. Under this assumption, we formulate the problem of interactive regret minimization, where we fix the number of questions and the number of tuples per question that we can display, and aim at minimizing the regret ratio. We try to answer two questions in this paper: (1) How much does interaction help? That is, how much can we improve the regret ratio when there are interactions? (2) How efficient can interaction be? In particular, we measure how many questions we have to ask the user in order to make her regret ratio small enough.
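
The following toy simulation illustrates the interaction model just described (a minimal setup of our own, not the paper's algorithm): the user's hidden utility is assumed linear with nonnegative tuple attributes, each question displays a few tuples chosen here uniformly at random, and the final recommendation is the best tuple she ever picked.

```python
import random

def dot(u, t):
    return sum(a * b for a, b in zip(u, t))

def simulate(data, questions, per_question, hidden_u):
    # Each round shows `per_question` random tuples; the simulated user
    # points at her favorite under the hidden linear utility.  We finally
    # recommend the best tuple she ever picked and report her regret
    # ratio.  Assumes nonnegative attribute values; the paper's
    # algorithms choose the displayed tuples adaptively, not at random.
    picks = []
    for _ in range(questions):
        shown = random.sample(data, per_question)
        picks.append(max(shown, key=lambda t: dot(hidden_u, t)))
    best_pick = max(picks, key=lambda t: dot(hidden_u, t))
    regret = 1.0 - dot(hidden_u, best_pick) / max(dot(hidden_u, t) for t in data)
    return best_pick, regret

# Toy usage: 100 random 3-d tuples, 5 questions of 4 tuples each.
data = [tuple(random.random() for _ in range(3)) for _ in range(100)]
print(simulate(data, questions=5, per_question=4, hidden_u=(0.5, 0.3, 0.2)))
```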

We answer both questions from both theoretical and practical standpoints. For the first question, we show that interaction can reduce the regret ratio almost exponentially. To do this, we prove a lower bound for the previous approach (thereby resolving an open problem from Nanongkai et al.), and develop an almost-optimal upper bound that makes the regret ratio exponentially smaller. Our experiments also confirm that, in practice, interactions help in improving the regret ratio by many orders of magnitude. For the second question, we prove that when our algorithm shows a reasonable number of points per question, it only needs a few questions to make the regret ratio small. Thus, interactive regret minimization seems to be a necessary and sufficient way to deal with multi-criteria decision making in database systems.

## Representative Skylines using Threshold-based Preference Distributions

Author: Atish Das Sarma, Ashwin Lall, Danupon Nanongkai, Richard J. Lipton, Jun Xu

Conference: ICDE 2011: the IEEE International Conference on Data Engineering

Abstract:

The study of skylines and their variants has received considerable attention in recent years. Skylines are essentially sets of the most interesting (undominated) tuples in a database. However, since the skyline is often very large, much research effort has been devoted to identifying a smaller subset of (say k) “representative skyline” points. Several different definitions of representative skylines have been considered. Most of these formulations are intuitive in that they try to achieve some kind of clustering “spread” over the entire skyline with k points. In this work, we take a more principled approach in defining the representative skyline objective. One of our main contributions is to formulate the problem of displaying k representative skyline points such that the probability that a random user would click on one of them is maximized.

Two major research questions arise naturally from this formulation. First, how does one mathematically model the likelihood with which a user is interested in and will “click” on a certain tuple? Second, how does one negotiate the absence of the knowledge of an explicit set of target users; in particular, what do we mean by “a random user”? To answer the first question, we model users based on a novel formulation of threshold preferences, which we motivate further in the paper. To answer the second question, we assume a probability distribution of users instead of a fixed set of users. While this makes the problem harder, it also lends more mathematical structure that can be exploited, as one can now work with probabilities of thresholds and handle cumulative distribution functions.

On the theoretical front, our objective is NP-hard. For the case of a finite set of users with known thresholds, we present a simple greedy algorithm that attains an approximation ratio of $(1-1/e)$ of the optimal. For the case of user distributions, we show that a careful yet similar greedy algorithm achieves the same approximation ratio. Unfortunately, it turns out that this algorithm is rather involved and computationally expensive. So we present a threshold-sampling-based algorithm that is more computationally affordable and, for any fixed $\epsilon > 0$, has an approximation ratio of $(1-1/e-\epsilon)$. We perform experiments on both real and synthetic data to show that our algorithm significantly outperforms previously proposed approaches.
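
Here is a minimal sketch of the greedy algorithm for the finite-user case, under one natural reading of the threshold model (assumed here for illustration): a user clicks iff some displayed point meets her threshold in every attribute. The objective is then a max-coverage instance, i.e., monotone submodular, which is what yields the $(1-1/e)$ guarantee.

```python
def satisfies(point, threshold):
    # Assumed threshold model: the user clicks on a point iff it meets
    # or exceeds her threshold in every attribute.
    return all(p >= t for p, t in zip(point, threshold))

def greedy_representatives(points, thresholds, k):
    # Repeatedly add the point that newly satisfies the most remaining
    # users.  Coverage (the number of users with at least one satisfying
    # displayed point) is monotone submodular, so this greedy attains a
    # (1 - 1/e) approximation.
    chosen, uncovered = [], set(range(len(thresholds)))
    for _ in range(min(k, len(points))):
        best = max(points, key=lambda p: sum(
            1 for j in uncovered if satisfies(p, thresholds[j])))
        chosen.append(best)
        uncovered -= {j for j in uncovered if satisfies(best, thresholds[j])}
    return chosen

# Toy usage: pick 2 of 4 points for 3 users with known thresholds.
pts = [(5, 1), (3, 3), (1, 5), (2, 2)]
thr = [(4, 1), (2, 2), (1, 4)]
print(greedy_representatives(pts, thr, 2))
```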

Update History

[v1] November 14, 2010 (Conference version)

## Distributed Verification and Hardness of Distributed Approximation

Author: Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, Roger Wattenhofer

Conference: STOC 2011

Journal:

Abstract:

We study the verification problem in distributed networks, stated as follows. Let $H$ be a subgraph of a network $G$ where each vertex of $G$ knows which edges incident on it are in $H$. We would like to verify whether $H$ has some properties, e.g., if it is a tree or if it is connected. We would like to perform this verification in a decentralized fashion via a distributed algorithm. The time complexity of verification is measured as the number of rounds of distributed communication.
In this paper we initiate a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and $s$-$t$ cut verification. We then show applications of these results in deriving strong unconditional time lower bounds on the hardness of distributed approximation for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation, and they resolve previous open questions. Moreover, our unconditional lower bound for approximating the minimum spanning tree (MST) subsumes and improves upon the previous hardness-of-approximation bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed approximation algorithm for MST that is significantly faster than the current exact algorithm, for any approximation factor.
Our lower bound proofs show an interesting connection between communication complexity and distributed computing which turns out to be useful in establishing the time complexity of exact and approximate distributed computation of many problems.
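
To make the verification model above concrete, here is a toy synchronous simulation of spanning-connectivity verification that floods minimum node IDs over the edges of $H$. This is only a standard upper-bound-style illustration of the round-based complexity measure; the paper's contribution is lower bounds showing that such verification inherently needs roughly $\sqrt{n}$ rounds even when $G$ itself has small diameter.

```python
def verify_spanning_connected(n, h_edges, max_rounds):
    # Nodes are 0..n-1; each node starts knowing only its own ID and, in
    # every synchronous round, adopts the smallest ID heard over its
    # incident H-edges.  If H is connected and spanning, all labels
    # converge to 0 within diam(H) rounds; otherwise some component
    # never learns ID 0.
    nbr = {v: set() for v in range(n)}
    for u, v in h_edges:
        nbr[u].add(v)
        nbr[v].add(u)
    label = list(range(n))
    for _ in range(max_rounds):
        new = [min([label[v]] + [label[u] for u in nbr[v]]) for v in range(n)]
        if new == label:
            break
        label = new
    return all(x == 0 for x in label)

# Toy usage: a path on 4 nodes is spanning and connected...
print(verify_spanning_connected(4, [(0, 1), (1, 2), (2, 3)], 4))  # True
# ...but dropping the middle edge disconnects it.
print(verify_spanning_connected(4, [(0, 1), (2, 3)], 4))          # False
```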

## Regret-Minimizing Representative Databases

Author: Danupon Nanongkai, Atish Das Sarma, Ashwin Lall, Richard J. Lipton, Jun Xu

Conference: VLDB 2010: 36th International Conference on Very Large Databases

Abstract:

We propose the k-representative regret minimization query (k-regret) as an operation to support multi-criteria decision making. Like top-k, the k-regret query assumes that users have some utility or scoring functions; however, it never asks the users to provide such functions. Like skyline, it filters out a set of interesting points from a potentially large database based on the users’ criteria; however, it never overwhelms the users by outputting too many tuples.

In particular, for any number k and any class of utility functions, the k-regret query outputs k tuples from the database and tries to minimize the {\em maximum regret ratio}. This captures how disappointed a user could be had she seen only the k representative tuples instead of the whole database. We focus on the class of linear utility functions, which is widely applicable.
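
For intuition, the sketch below estimates the maximum regret ratio of a candidate subset by sampling random linear utilities with nonnegative weights (an assumption made here for illustration). The paper's bounds hold in the worst case over all linear utilities, so sampling only yields a lower estimate.

```python
import random

def max_regret_ratio(subset, data, dim, samples=10000):
    # Monte-Carlo lower estimate of the maximum regret ratio: for each
    # sampled utility vector, compare the best tuple in the subset with
    # the best tuple in the whole database.  Assumes nonnegative
    # attribute values so the denominator is positive.
    worst = 0.0
    for _ in range(samples):
        u = [random.random() for _ in range(dim)]
        score = lambda t: sum(a * b for a, b in zip(u, t))
        worst = max(worst, 1.0 - max(map(score, subset)) / max(map(score, data)))
    return worst

# Toy usage: how much do we regret keeping only two of three tuples?
data = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
print(max_regret_ratio(data[:2], data, dim=2))  # roughly 0.17
```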

The first challenge for this approach is that it is not clear whether the maximum regret ratio can be small, or even bounded. We answer this question affirmatively. Theoretically, we prove that the maximum regret ratio can be bounded, and that this bound is independent of the database size. Moreover, our extensive experiments on real and synthetic datasets suggest that in practice the maximum regret ratio is reasonably small. Additionally, the algorithms developed in this paper are practical: they run in time linear in the size of the database, and the experiments show that their running time is small when they are run on top of the skyline operation, which means that these algorithms could be integrated into current database systems.

Update History

[v1] June 28, 2010 (Conference version)