- For any $r$ larger than some constant, any $r$-approximation algorithm for the maximum independent set problem must run in time at least $2^{n^{1-\epsilon}/r^{1+\epsilon}}$. This nearly matches the upper bound of $2^{n/r}$ [Cygan et al.: Manuscript 2008]. It also improves some hardness results in the domain of parameterized complexity (e.g., [Escoffier et al.: Manuscript 2012] and [Chitnis et al.: Manuscript 2013]).
- For any $k$ larger than some constant, there is no polynomial time $\min(k^{1-\epsilon}, n^{1/2-\epsilon})$-approximation algorithm for the $k$-hypergraph pricing problem, where $n$ is the number of vertices in an input graph. This almost matches the upper bound of $\min(k, \tilde O(\sqrt{n}))$ (by Balcan and Blum in [Balcan and Blum: Theory of Computing 2007] and an algorithm in this paper).
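The kind of subexponential-time approximation trade-off in the first bullet can be illustrated with a folklore-style sketch (our own toy code, not the algorithm of Cygan et al.): split the vertices into $r$ blocks, brute-force a maximum independent set inside each block, and return the largest one found. Since some block contains at least a $1/r$ fraction of any optimal solution, this is an $r$-approximation running in roughly $r \cdot 2^{n/r}$ time.

```python
from itertools import combinations

def is_independent(vertices, adj):
    """True iff no two of the given vertices are adjacent."""
    return all(v not in adj[u] for u, v in combinations(vertices, 2))

def blockwise_mis(n, edges, r):
    """r-approximate maximum independent set in ~r * 2^(n/r) time:
    brute-force each of r vertex blocks separately and keep the best.
    Correctness of the ratio: an optimal solution puts at least
    OPT/r of its vertices into some block."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    block_size = -(-n // r)  # ceil(n / r)
    blocks = [list(range(i, min(i + block_size, n)))
              for i in range(0, n, block_size)]
    best = []
    for block in blocks:
        # try subset sizes larger than the current best, biggest first
        for size in range(len(block), len(best), -1):
            found = next((set(c) for c in combinations(block, size)
                          if is_independent(c, adj)), None)
            if found:
                best = sorted(found)
                break
    return best
```

On a 5-cycle with $r=2$, the sketch returns an independent set of size 2, which here happens to be optimal.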

We note an interesting fact that, in contrast to this hardness, the $k$-hypergraph pricing problem admits an $n^{\delta}$-approximation for any $\delta > 0$ in quasi-polynomial time. This puts the pricing problem in a rare approximability class in which the approximability threshold can be improved significantly by allowing algorithms to run in quasi-polynomial time.

The proofs of our hardness results rely on several unexpectedly tight connections between the three problems. First, we establish a connection between the first and second problems by proving a new graph-theoretic property related to an {\em induced matching number} of dispersers. Then, we show that the hardness of the last problem follows from nearly tight {\em subexponential time} inapproximability of the first problem, illustrating a rare application of the second type of inapproximability result to the first one. Finally, to prove the subexponential-time inapproximability of the first problem, we construct a new PCP with several properties; it is sparse and has nearly-linear size, large degree, and small free-bit complexity. Our PCP requires no ground-breaking ideas but rather a very careful assembly of the existing ingredients in the PCP literature.

We study dynamic $(1+\epsilon)$-approximation algorithms for the all-pairs shortest paths problem in unweighted undirected $n$-node $m$-edge graphs under edge deletions. The fastest algorithm for this problem is a randomized algorithm with a total update time of $\tilde O(mn/\epsilon)$ and constant query time by Roditty and Zwick (FOCS 2004). The fastest deterministic algorithm is from a 1981 paper by Even and Shiloach (JACM 1981); it has a total update time of $O(mn^2)$ and constant query time. We improve these results as follows:

(1) We present an algorithm with a total update time of $\tilde O(n^{5/2}/\epsilon)$ and constant query time that has an additive error of two in addition to the $(1+\epsilon)$ multiplicative error. This beats the previous $\tilde O(mn/\epsilon)$ time when $m = \Omega(n^{3/2})$. Note that the additive error is unavoidable since, even in the static case, an $O(n^{3-\delta})$-time (a so-called truly subcubic) combinatorial algorithm with a $(1+\epsilon)$ multiplicative error cannot have an additive error less than $2-\epsilon'$, unless we make a major breakthrough for Boolean matrix multiplication (Dor, Halperin and Zwick, FOCS 1996) and many other long-standing problems (Vassilevska Williams and Williams, FOCS 2010). The algorithm can also be turned into a $(2+\epsilon)$-approximation algorithm (without an additive error) with the same time guarantees, improving the recent $(3+\epsilon)$-approximation algorithm with $\tilde O(n^{5/2+O(1/\sqrt{\log n})})$ running time of Bernstein and Roditty (SODA 2011) in terms of both approximation and time guarantees.

(2) We present a deterministic algorithm with a total update time of $\tilde O(mn/\epsilon)$ and a query time of $O(\log\log n)$. The algorithm has a multiplicative error of $(1+\epsilon)$ and gives the first improved deterministic algorithm since 1981. It also answers an open question raised by Bernstein in his STOC 2013 paper.

In order to achieve our results, we introduce two new techniques: (1) A *monotone Even-Shiloach tree* algorithm which maintains a bounded-distance shortest-paths tree on a certain type of emulator called *locally persevering emulator*. (2) A derandomization technique based on *moving Even-Shiloach trees* as a way to derandomize the standard random set argument. These techniques might be of independent interest.
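For background, the classical Even–Shiloach tree that both techniques build on maintains single-source distances under edge deletions by only ever raising distance labels. A minimal sketch (our own simplified version; the paper's *monotone* variant runs on an emulator rather than on the input graph itself):

```python
from collections import deque

def es_tree_init(adj, source, n):
    """Plain BFS levels from source; level n stands for 'unreachable'."""
    level = {v: n for v in adj}
    level[source] = 0
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if level[v] > level[u] + 1:
                level[v] = level[u] + 1
                q.append(v)
    return level

def es_tree_delete(adj, level, u, v, n):
    """Delete edge (u, v) and repair levels, Even-Shiloach style:
    a vertex with no neighbor exactly one level above it 'sinks' by
    one, and its neighbors are rechecked. Levels only increase, which
    bounds the total work over all deletions by O(m * depth)."""
    adj[u].discard(v)
    adj[v].discard(u)
    q = deque([u, v])
    while q:
        w = q.popleft()
        if level[w] == 0 or level[w] >= n:
            continue
        # w needs a parent, i.e., a neighbor at level[w] - 1
        if not any(level[x] == level[w] - 1 for x in adj[w]):
            level[w] += 1
            q.append(w)            # w may have to sink further
            for x in adj[w]:       # neighbors may have relied on w
                q.append(x)
    return level
```

For example, deleting one edge of a 4-cycle forces the far endpoint to re-route and its level to rise accordingly, while the rest of the tree is untouched.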

**Download:** arXiv (Extended Abstract), pdf (full version)

**Conference: **Submitted

**Journal: **–

**Abstract:**

In the *unlimited-supply profit-maximizing pricing* problem, we are given the consumers’ consideration sets and know their purchase strategy (e.g., buy the cheapest items). The goal is to price the items to maximize the revenue. Previous studies suggest that this problem is too general to admit even a sublinear approximation ratio (in terms of the number of items), even when the consumers are restricted to very simple purchase strategies.

In this paper we initiate the study of the *multi-attribute pricing* problem as a direction to break this barrier. Specifically, we consider the case where each item has a constant number of *attributes*, and each consumer would like to buy the items that satisfy her *criteria* in all attributes. This notion intuitively captures typical real-world settings and has been widely studied in marketing research, healthcare economics, etc. It also helps distinguish previously studied special cases, such as the *highway pricing* problem and the *graph vertex pricing* problem on planar and bipartite graphs, from the general case.

We show that this notion of attributes leads to improved approximation ratios on a large class of problems. This is obtained by utilizing the fact that the consideration sets have low VC-dimension and applying Dilworth’s theorem on a certain partial order defined on the set of items. As a consequence, we present sublinear-approximation algorithms, thus breaking the previous barrier, for two well-known variants of the problem: *unit-demand uniform-budget min-buying* and *single-minded* pricing problems. Moreover, we generalize these techniques to the *unit-demand utility-maximizing* pricing problem and (non-uniform) unit-demand min-buying pricing problem when valuations or budgets depend on attributes, as well as the pricing problem with *symmetric valuations* and *subadditive revenues*. These results suggest that considering attributes is a promising research direction in obtaining improved approximation algorithms for such pricing problems.

(Author names are NOT in alphabetical order.)

**Download:** Soon

**Conference: **SIGMOD 2012

**Journal: **–

**Abstract:**

We study the notion of *regret ratio* proposed by Nanongkai et al. [VLDB’10] to deal with multi-criteria decision making in database systems. The regret minimization query proposed by Nanongkai et al. was shown to have features of both skyline and top-$k$: it does not need information from the user but still controls the output size. While this approach is suitable for obtaining a reasonably small regret ratio, it remains open whether one can make the regret ratio arbitrarily small. Moreover, it remains open whether reasonable questions can be asked to the users in order to improve the efficiency of the process.

In this paper, we study the problem of minimizing the regret ratio when the system is enhanced with *interaction*. We assume that when presented with a set of tuples, the user can tell which tuple she prefers most. Under this assumption, we formulate the problem of *interactive regret minimization*, where we fix the number of questions, and the number of tuples per question, that we can display, and aim at minimizing the regret ratio. We try to answer two questions in this paper: (1) How much does interaction help? That is, how much can we improve the regret ratio when there are interactions? (2) How efficient can interaction be? In particular, we measure how many questions we have to ask the user in order to make her regret ratio small enough.

We answer both questions from both theoretical and practical standpoints. For the first question, we show that interaction can reduce the regret ratio almost *exponentially*. To do this, we prove a lower bound for the previous approach (thereby resolving an open problem from Nanongkai et al.), and develop an almost-optimal upper bound that makes the regret ratio exponentially smaller. Our experiments also confirm that, in practice, interactions help in improving the regret ratio by many orders of magnitude. For the second question, we prove that when our algorithm shows a reasonable number of points per question, it only needs a few questions to make the regret ratio small. Thus, interactive regret minimization seems to be a necessary and sufficient way to deal with multi-criteria decision making in database systems.
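The flavor of the first question can be seen in a toy two-dimensional simulation (entirely our own illustration, not the paper's algorithm): each question shows the user the database-best points for a few candidate utility weights, her answer lets us zoom into a narrower range of weights, and the range of plausible utilities shrinks geometrically with the number of questions.

```python
def interactive_regret(db, true_w1, rounds, per_question=4):
    """Toy interactive strategy for 2D points with linear utility
    u(p) = w1*p[0] + (1-w1)*p[1], where w1 is unknown to the system.
    Each round displays the db-best point for a few weights in the
    current interval; the user's favorite tells us where to zoom in."""
    def util(p, w1):
        return w1 * p[0] + (1 - w1) * p[1]
    def best_for(w1):
        return max(db, key=lambda p: util(p, w1))
    lo, hi = 0.0, 1.0
    shown = set()
    for _ in range(rounds):
        ws = [lo + (hi - lo) * i / (per_question - 1)
              for i in range(per_question)]
        options = [best_for(w) for w in ws]
        shown.update(options)
        # simulated user picks the displayed point she likes most
        j = max(range(per_question),
                key=lambda i: util(options[i], true_w1))
        # zoom into the weights adjacent to the winning option
        lo = ws[max(j - 1, 0)]
        hi = ws[min(j + 1, per_question - 1)]
    best_shown = max(util(p, true_w1) for p in shown)
    best_all = max(util(p, true_w1) for p in db)
    return 1 - best_shown / best_all   # final regret ratio
```

On a small database of points along a trade-off curve, a couple of questions already suffice to show the user her true favorite, i.e., regret ratio zero.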

**Author:** Danupon Nanongkai, Atish Das Sarma, Gopal Pandurangan

(Author names are NOT in alphabetical order.)

**Download:** arXiv

**Conference: **PODC 2012

**Journal: **–

**Abstract:**

We consider the problem of performing a random walk in a distributed network. Given bandwidth constraints, the goal of the problem is to minimize the number of rounds required to obtain a random walk sample. Das Sarma et al. [PODC’10] show that a random walk of length $\ell$ on a network of diameter $D$ can be performed in $\tilde O(\sqrt{\ell D}+D)$ time. A major question left open is whether there exists a faster algorithm, especially whether the multiplication of $\sqrt{\ell}$ and $\sqrt{D}$ is necessary.

In this paper, we show a tight unconditional lower bound on the time complexity of distributed random walk computation. Specifically, we show that for any $n$, $D$, and $D\leq \ell \leq (n/(D^3\log n))^{1/4}$, performing a random walk of length $\Theta(\ell)$ on an $n$-node network of diameter $D$ requires $\Omega(\sqrt{\ell D}+D)$ time. This bound is {\em unconditional}, i.e., it holds for any (possibly randomized) algorithm. To the best of our knowledge, this is the first lower bound in which the diameter plays the role of a multiplicative factor. Our bound shows that the algorithm of Das Sarma et al. is time optimal.

Our proof technique introduces a new connection between {\em bounded-round} communication complexity and distributed algorithm lower bounds with $D$ as a trade-off parameter, strengthening the previous study by Das Sarma et al. [STOC’11]. In particular, we make use of the bounded-round communication complexity of the pointer chasing problem. Our technique can be of independent interest and may be useful in showing non-trivial lower bounds on the complexity of other fundamental distributed computing problems.
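For readers unfamiliar with it, the pointer chasing function itself is simple to state (this snippet is our own illustration; the hardness lies in the two-party communication setting, where Alice holds one pointer array and Bob holds the other, and computing $k$ hops in fewer than $k$ rounds is known to require large communication):

```python
def pointer_chase(a, b, k, start=0):
    """Follow k alternating pointer hops starting from `start`:
    hops alternate between Alice's array a and Bob's array b,
    beginning with a. Returns the node reached after k hops."""
    cur = start
    for step in range(k):
        cur = a[cur] if step % 2 == 0 else b[cur]
    return cur
```

For instance, with `a = [1,2,3,0]` and `b = [3,0,1,2]`, three hops from node 0 visit 1, 0, and finally 1 again.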

**Update History**

6.10.2010: New link to the pdf posted. PPTX posted.

**Download:** PDF

**Conference: **ICDE 2011: the IEEE International Conference on Data Engineering [link]

**Abstract:**

The study of skylines and their variants has received considerable attention in recent years. Skylines are essentially sets of most interesting (undominated) tuples in a database. However, since the skyline is often very large, much research effort has been devoted to identifying a smaller subset of (say k) “representative skyline” points. Several different definitions of representative skylines have been considered. Most of these formulations are intuitive in that they try to achieve some kind of clustering “spread” over the entire skyline, with k points. In this work, we take a more principled approach in defining the representative skyline objective. One of our main contributions is to formulate the problem of displaying k representative skyline points such that the probability that a random user would click on one of them is maximized.

Two major research questions arise naturally from this formulation. First, how does one mathematically model the likelihood with which a user is interested in and will “click” on a certain tuple? Second, how does one negotiate the absence of the knowledge of an explicit set of target users; in particular, what do we mean by “a random user”? To answer the first question, we model users based on a novel formulation of threshold preferences which we will motivate further in the paper. To answer the second question, we assume a probability distribution of users instead of a fixed set of users. While this makes the problem harder, it lends more mathematical structure that can be exploited as well, as one can now work with probabilities of thresholds and handle cumulative density functions.

On the theoretical front, our objective is NP-hard. For the case of a finite set of users with known thresholds, we present a simple greedy algorithm that attains an approximation ratio of $(1-1/e)$ of the optimal. For the case of user distributions, we show that a careful yet similar greedy algorithm achieves the same approximation ratio. Unfortunately, it turns out that this algorithm is rather involved and computationally expensive. So we present a threshold sampling based algorithm that is more computationally affordable and, for any fixed $\epsilon > 0$, has an approximation ratio of $(1-1/e-\epsilon)$. We perform experiments on both real and synthetic data to show that our algorithm significantly outperforms previously proposed approaches.
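The greedy algorithm for the finite-user case follows the standard maximum-coverage pattern, which is where the classical $(1-1/e)$ factor for greedy coverage comes from. A minimal sketch with a simplified model of threshold preferences (our own toy model: a user clicks if some displayed point meets her threshold in every coordinate):

```python
def greedy_representatives(points, users, k):
    """Pick k points greedily, each time adding the point that covers
    the most not-yet-covered users. A user (a threshold vector) is
    'covered' if some chosen point dominates it coordinate-wise.
    Greedy on this coverage objective is the classic (1 - 1/e)
    approximation for maximum coverage."""
    covered = set()
    chosen = []
    for _ in range(k):
        best = max(
            (p for p in points if p not in chosen),
            key=lambda p: sum(1 for i, t in enumerate(users)
                              if i not in covered and
                              all(pc >= tc for pc, tc in zip(p, t))),
            default=None)
        if best is None:
            break
        chosen.append(best)
        covered |= {i for i, t in enumerate(users)
                    if all(pc >= tc for pc, tc in zip(best, t))}
    return chosen
```

Ties are broken by input order here; any tie-breaking rule preserves the guarantee.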

**Update History**

**[v1] **November 14, 2010 (Conference version)

**Author:** Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, Roger Wattenhofer

**Download:** arXiv

**Conference: **STOC 2011

**Journal: **–

**Abstract:**

We study the *verification* problem in distributed networks, stated as follows. Let $H$ be a subgraph of a network $G$ where each vertex of $G$ knows which edges incident on it are in $H$. We would like to verify whether $H$ has some properties, e.g., if it is a tree or if it is connected. We would like to perform this verification in a decentralized fashion via a distributed algorithm. The time complexity of verification is measured as the number of rounds of distributed communication.
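As a concrete, centralized stand-in for the object being verified, suppose each vertex reports which of its incident edges belong to the subgraph (call it H); connectivity verification then amounts to a union-find pass over those reports (our own toy code; the point of the paper is to do such checks with few rounds of distributed communication instead):

```python
def h_is_connected(n, incident_in_h):
    """incident_in_h[v] = set of neighbors u with edge (v, u) in H.
    Returns True iff H connects all n vertices."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for v in range(n):
        for u in incident_in_h[v]:
            parent[find(u)] = find(v)  # union the two components
    return len({find(v) for v in range(n)}) == 1
```

Verifying spanning connected subgraph or cut properties can be phrased with similar component bookkeeping.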

In this paper we initiate a systematic study of distributed verification, and give almost tight lower bounds on the running time of distributed verification algorithms for many fundamental problems such as connectivity, spanning connected subgraph, and cut verification. We then show applications of these results in deriving strong unconditional time lower bounds on the *hardness of distributed approximation* for many classical optimization problems including minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first non-trivial lower bounds for both exact and approximate distributed computation, and they resolve previous open questions. Moreover, our unconditional lower bound on approximating minimum spanning tree (MST) subsumes and improves upon the previous hardness-of-approximation bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed approximation algorithm for MST that is significantly faster than the current exact algorithm, for *any* approximation factor.

Our lower bound proofs show an interesting connection between communication complexity and distributed computing which turns out to be useful in establishing the time complexity of exact and approximate distributed computation of many problems.

**Update History**

**–**

(Alphabetical order)

**Download:** PDF

**Conference: **WINE 2010: 6th Workshop on Internet & Network Economics [wiki]. Published in LNCS Vol. 6484.


**Abstract:**

We consider the Stackelberg shortest-path pricing problem, which is defined as follows. Given a graph G with fixed-cost and pricable edges and two distinct vertices s and t, we may assign prices to the pricable edges. Based on the predefined fixed costs and our prices, a customer purchases a cheapest s-t-path in G and we receive payment equal to the sum of prices of pricable edges belonging to the path. Our goal is to find prices maximizing the payment received from the customer. While Stackelberg shortest-path pricing was known to be APX-hard before, we provide the first explicit approximation threshold and prove hardness of approximation within 2 − o(1). We also prove that for the nicely structured type of instance resulting from our reduction, the gap between the revenue of an optimal pricing and the only known general upper bound can still be logarithmically large.
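A small harness makes the revenue objective concrete (our own toy code, not from the paper): given prices on the pricable edges, the customer's cheapest s-t path is found with Dijkstra, and our payment is the total price of the pricable edges on that path. Tie-breaking between equally cheap paths is left to Dijkstra here, although the Stackelberg model usually assumes ties are broken in the seller's favor.

```python
import heapq

def revenue(n, fixed_edges, pricable_edges, prices, s, t):
    """fixed_edges: (u, v, cost) triples; pricable_edges: (u, v) pairs;
    prices[i] is our price on pricable edge i. The customer buys a
    cheapest s-t path under cost + price; we earn the prices of the
    pricable edges on that path."""
    adj = [[] for _ in range(n)]
    for u, v, c in fixed_edges:
        adj[u].append((v, c, 0))
        adj[v].append((u, c, 0))
    for i, (u, v) in enumerate(pricable_edges):
        adj[u].append((v, prices[i], prices[i]))
        adj[v].append((u, prices[i], prices[i]))
    # Dijkstra tracking (total path cost, revenue collected) per vertex
    dist = {s: (0, 0)}
    heap = [(0, 0, s)]
    while heap:
        d, r, u = heapq.heappop(heap)
        if (d, r) != dist.get(u, (None, None)):
            continue  # stale heap entry
        for v, w, price in adj[u]:
            nd = (d + w, r + price)
            if v not in dist or nd[0] < dist[v][0]:
                dist[v] = nd
                heapq.heappush(heap, (nd[0], nd[1], v))
    return dist[t][1]
```

Pricing the two pricable edges too high simply diverts the customer to the fixed-cost path and yields zero revenue, which is exactly the tension the problem optimizes over.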

**Update History**

**[v1] **October 1, 2010 (Conference version)

**Download:** PDF

**Conference: **VLDB 2010: 36th International Conference on Very Large Databases [wiki]

**Abstract:**

We propose the k-representative regret minimization query (k-regret) as an operation to support multi-criteria decision making. Like top-k, the k-regret query assumes that users have some utility or scoring functions; however, it never asks the users to provide such functions. Like skyline, it filters out a set of interesting points from a potentially large database based on the users’ criteria; however, it never overwhelms the users by outputting too many tuples.

In particular, for any number k and any class of utility functions, the k-regret query outputs k tuples from the database and tries to minimize the {\em maximum regret ratio}. This captures how disappointed a user could be had she seen only the k representative tuples instead of the whole database. We focus on the class of linear utility functions, which is widely applicable.

The first challenge of this approach is that it is not clear if the maximum regret ratio can be small, or even bounded. We answer this question affirmatively. Theoretically, we prove that the maximum regret ratio can be bounded and that this bound is independent of the database size. Moreover, our extensive experiments on real and synthetic datasets suggest that in practice the maximum regret ratio is reasonably small. Additionally, the algorithms developed in this paper are practical, as they run in time linear in the size of the database; the experiments show that their running time is small when they run on top of the skyline operation, which means that these algorithms could be integrated into current database systems.
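The maximum regret ratio for linear utilities can be written down directly; the sketch below (our own illustration) estimates it by sampling random nonnegative weight vectors rather than computing the exact maximum:

```python
import random

def regret_ratio(db, subset, utility):
    """1 - (best utility in subset) / (best utility in the full db)."""
    best_all = max(utility(p) for p in db)
    best_sub = max(utility(p) for p in subset)
    return 1 - best_sub / best_all if best_all > 0 else 0.0

def max_regret_ratio(db, subset, dim, samples=1000, seed=0):
    """Estimate the maximum regret ratio over random nonnegative
    linear utility functions (a sampled stand-in for the sup over
    the whole class of linear utilities)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        w = [rng.random() for _ in range(dim)]
        u = lambda p: sum(wi * pi for wi, pi in zip(w, p))
        worst = max(worst, regret_ratio(db, subset, u))
    return worst
```

A subset containing a point that dominates the rest of the database has regret ratio zero for every linear utility, while dropping a direction of the trade-off leaves some user with positive regret.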

**Update History**

**[v1] **June 28, 2010 (Conference version)

**Download: **arXiv of the journal version, arXiv of the conference version

**Conference:** PODC 2010

**Journal:** Journal of the ACM 2013

**Abstract:**

We focus on the problem of performing random walks efficiently in a distributed network. Given bandwidth constraints, the goal is to minimize the number of rounds required to obtain a random walk sample. We first present a fast sublinear time distributed algorithm for performing random walks whose time complexity is sublinear in the length of the walk. Our algorithm performs a random walk of length $\ell$ in $\tilde O(\sqrt{\ell D} + D)$ rounds (with high probability) on an undirected network, where $D$ is the diameter of the network. This improves over the previous best algorithm that ran in $\tilde O(\ell^{2/3}D^{1/3})$ rounds (Das Sarma et al., PODC 2009). We further extend our algorithms to efficiently perform $k$ independent random walks in $\tilde O(\sqrt{k\ell D} + k)$ rounds. We then show that there is a fundamental difficulty in improving the dependence on $\ell$ any further by proving a lower bound of $\Omega(\sqrt{\ell})$ under a general model of distributed random walk algorithms. Our random walk algorithms are useful in speeding up distributed algorithms for a variety of applications that use random walks as a subroutine. We present two main applications. First, we give a fast distributed algorithm for computing a random spanning tree (RST) in an arbitrary (undirected) network which runs in $\tilde O(\sqrt{mD})$ rounds (with high probability; here $m$ is the number of edges). Our second application is a fast decentralized algorithm for estimating mixing time and related parameters of the underlying network. Our algorithm is fully decentralized and can serve as a building block in the design of topologically-aware networks.
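The high-level speed-up idea can be mimicked in a toy single-machine simulation (our own illustration of the "prepare short walks, then stitch them" idea, not the distributed algorithm itself): every node precomputes a few short walks of some length, and a long walk is then assembled by repeatedly jumping to the endpoint of an unused precomputed walk from the current node, so the number of sequential phases drops from the walk length to roughly the length divided by the short-walk length.

```python
import random

def stitched_walk(adj, start, length, lam, walks_per_node=4, seed=0):
    """Toy version of the stitching idea: precompute endpoints of
    short random walks of length lam at every node, then chain them.
    Each precomputed endpoint is consumed at most once (pop), which
    keeps the assembled walk's distribution correct; if a node's
    stock runs out we just take fresh steps."""
    rng = random.Random(seed)
    def short_walk(v, steps):
        for _ in range(steps):
            v = rng.choice(adj[v])
        return v
    stock = {v: [short_walk(v, lam) for _ in range(walks_per_node)]
             for v in adj}
    cur, done = start, 0
    while done + lam <= length:
        cur = stock[cur].pop() if stock[cur] else short_walk(cur, lam)
        done += lam
    return short_walk(cur, length - done)   # finish the remainder
```

In the distributed setting, the precomputation happens at all nodes in parallel, which is where the sublinear round complexity comes from.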

**Update History**

Mar 03, 2009 (New version posted on ArXiv)

Nov 06, 2009 (Link to arXiv posted)

Feb 18, 2010 (New version posted)

Feb 18, 2013 (Journal version posted)