Mechanism Design for Crowdsourcing: An Optimal 1-1/e Competitive Budget-Feasible Mechanism for Large Markets
Abstract
In this paper we consider a mechanism design problem in the context of large-scale crowdsourcing markets such as Amazon’s Mechanical Turk (MTrk), ClickWorker (ClkWrkr), and CrowdFlower (CrdFlwr). In these markets, there is a requester who wants to hire workers to accomplish some tasks. Each worker is assumed to give some utility to the requester on getting hired. Moreover, each worker has a minimum cost that he wants to get paid for getting hired. This minimum cost is assumed to be private information of the workers. The question then is: if the requester has a limited budget, how should one design a direct revelation mechanism that picks the right set of workers to hire in order to maximize the requester’s utility?
We note that although previous work (Singer (2010); Chen et al. (2011)) has studied this problem, a crucial point at which we deviate from earlier work is the notion of large-scale markets that we introduce in our model. The notion of a large-scale market that we consider is a natural one: it states that the (private) cost of each worker is small compared to the budget of the requester. Without the large market assumption, it is known that no mechanism can achieve a competitive ratio better than and for deterministic and randomized mechanisms respectively (while the best known deterministic and randomized mechanisms achieve an approximation ratio of and respectively). In this paper, we design a budget-feasible mechanism for large markets that achieves a competitive ratio of $1-1/e$. Our mechanism can be seen as a generalization of an alternate way to look at the proportional share mechanism, which is used in all previous work on this problem. Interestingly, we can also show that our mechanism is optimal by showing that no truthful mechanism can achieve a factor better than $1-1/e$, thus fully resolving this setting. Finally, we consider the more general case of submodular utility functions and give new and improved mechanisms for the case when the market is large.
Contents
 1 Introduction
 2 A Simple Approximate Truthful Mechanism
 3 Our Approach
 4 A Parameterized Class of Truthful Mechanisms
 5 A (1-1/e)-Approximate Optimal Truthful Mechanism
 6 Impossibility Result: On why 1-1/e is the best approximation possible
 7 Conclusion
 A Analyzing our Optimal Truthful Mechanism for the general case
 B Missing Proofs from Section 5.2
 C Mechanisms for Indivisible Items
 D The Optimal Standard Allocation Rule
 E Submodular Utility Functions
 F Hoeffding Bounds
1 Introduction
Crowdsourcing is a recent phenomenon that describes the procurement of a large number of workers to do certain tasks. These tasks vary in nature and include, to give a few examples, image annotation, data labeling for machine learning systems, consumer surveys, rating search engine results, spam detection, product reviews, etc. There are several platforms (such as Amazon’s Mechanical Turk (MTrk)) that facilitate and automate various steps involved in setting up and executing crowdsourcing tasks.
A key challenge in these online labor markets is to price the tasks properly. Since the requester (the one who wants to procure workers) is usually budget constrained, pricing the tasks too high can result in lower output for the requester. On the other hand, pricing the tasks too low can discourage workers from working on the tasks. This makes pricing a non-trivial step for the requester when setting up a crowdsourcing task. One idea, which makes pricing more automated and prevents economic loss from poor pricing, is to design a direct revelation mechanism that solicits bids from workers to report their cost of participation, and based on these bids decides which workers to hire and how much to pay them.
A simple model that captures the above problem is as follows: There is a set of workers. Worker has a private cost and provides utility to the requester on getting hired. We want to design a truthful mechanism that decides which workers to recruit and how much to pay them. The goal is to maximize the requester’s utility without violating her budget constraint.
For the above model, Singer (2010) gave an incentive-compatible mechanism that achieves an approximation ratio of compared to the offline optimum that knows the costs of the workers. Later on Chen et al. (2011) improved the approximation ratio to (and to for randomized mechanisms). Chen et al. also showed that no deterministic mechanism can achieve an approximation ratio better than , and no randomized mechanism can achieve an approximation ratio better than .
Our work is motivated by the following observation: most crowdsourcing tasks are large-scale in terms of the number of workers involved. On the other hand, the impossibility results of Chen et al. (2011) involve only a small number of workers (specifically, only 3 workers). This leads to a natural open question: do these lower bounds extend to the case of large markets, or can one design better mechanisms for this important case?
In this paper, we seek to answer the above question. We show that one can significantly improve the approximation ratio for the case of large markets. We give a mechanism that achieves an approximation ratio of $1-1/e$ for large markets. In addition, we show that our mechanism is the best possible by showing that no truthful budget-feasible mechanism can achieve a factor better than $1-1/e$. Finally, we look at the more general case of submodular utility functions.
1.1 The Model
We define the model abstractly: Consider a reverse auction scenario with one buyer and sellers, where the set of sellers is denoted by . Each seller owns a single item (denoted by item ) and has a private cost for it. The buyer derives a utility of from item . The buyer has a limited budget , and her goal is to buy a subset of items that maximizes her utility without exceeding her budget.
Note that if the sellers are not strategic and the costs are known to the buyer, then this is the well-known knapsack optimization problem.
However, the cost is assumed to be private information of seller . Thus we are interested in designing direct-revelation mechanisms where the buyer solicits bids from the sellers, and then computes which sellers to buy from and how much to pay them. More formally, a mechanism consists of two functions and . The allocation function takes as input the costs of sellers and reports the set of winners. The payment function takes as input the costs of sellers and reports how much each seller should be paid. Sometimes we will use functions (and similarly ) for each to refer to the restriction of functions and for seller .
The mechanism should satisfy the following properties:

Budget Feasibility: The sum of the payments made to the sellers should not exceed , i.e., for all .

Individual rationality: A winner is paid at least .

Truthfulness/Incentive-Compatibility: Reporting the true cost should be a dominant strategy of the sellers, i.e., for all non-truthful reports from seller , it holds that
Among all mechanisms that satisfy the above properties, we are interested in the ones that give high utility to the buyer. Note that no mechanism can achieve utility larger than , where (or simply for brevity) is the utility of the knapsack optimization problem assuming costs of the sellers are known to the buyer. We say a mechanism is an approximation (for ) if it gives utility at least for any and .
Indivisible vs Divisible Items. Note that the above description is given for indivisible items, however, we can define the above problem for divisible items as well. For instance, if the item being sold by a seller is his own time, then it can be modeled as a divisible item. For fraction of a divisible item, the cost of seller is and the utility obtained by the buyer is . The allocation function for divisible items is defined as .
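As a point of reference, the benchmark for divisible items (the optimum when all costs are known) can be computed greedily by buying in increasing order of cost per unit of utility. The following is a minimal sketch only; the names `fractional_optimum`, `costs`, `utils`, and `budget` are assumptions introduced here for illustration, and every utility is assumed to be strictly positive.

```python
def fractional_optimum(costs, utils, budget):
    """Greedy optimum of the divisible (fractional) knapsack benchmark.

    Returns (total_utility, fractions), where fractions[i] in [0, 1] is the
    fraction bought from seller i. Assumes every utility is > 0.
    """
    # Cheapest cost per unit of utility first.
    order = sorted(range(len(costs)), key=lambda i: costs[i] / utils[i])
    fractions = [0.0] * len(costs)
    remaining = budget
    total = 0.0
    for i in order:
        if remaining <= 0:
            break
        if costs[i] <= remaining:
            frac = 1.0                      # the whole item fits in the budget
        else:
            frac = remaining / costs[i]     # buy only the affordable fraction
        fractions[i] = frac
        remaining -= frac * costs[i]
        total += frac * utils[i]
    return total, fractions
```

For example, with costs (1, 2, 3), unit utilities, and budget 4, the sketch buys the first two items fully and one third of the last item.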
More general utility functions. An interesting generalization of the above model is when the utility function over the set of items is a submodular function rather than an additive one. We denote this function by (for additive functions, , for ). We assume that the utility function is known to the buyer.
1.1.1 The Large Market Assumption
Crowdsourcing systems are excellent examples of large markets. Informally speaking, a market is said to be large if the number of participants is large enough that no single participant can affect the market outcome significantly. Our results take advantage of this property of crowdsourcing markets to give better mechanisms.
We define the large market assumption as follows: We assume that in our model, the cost of a single item is very small compared to the buyer’s budget . More formally, let . Then, the large market assumption is defined as below.
The Large Market Assumption: .
In other words, we define the largeness ratio of the market to be and analyze our mechanisms for .
This assumption, also known as the small bid-to-budget ratio assumption, is used in other large-market problems as well (for instance, see Mehta et al. (2007) for a similar definition with application in online advertising). All the mechanisms that we present in the main body of the paper (mechanisms for additive utility functions) will be analyzed under this assumption. The mechanisms that we design for submodular utility functions work under a different large market assumption, which is explained below.
An Alternative Assumption
We also suggest another definition for large markets, the discussion of which will be deferred to the appendix. Our mechanisms for submodular utility functions work under this assumption; moreover, we can slightly modify our mechanisms for additive utility functions so that they work under this assumption as well, while preserving their approximation ratio. We define this assumption below.
Let and be the total utility of the optimum solution (i.e. the maximum utility that the buyer can achieve when the costs are known to her). This large market assumption states that:
The Alternative Large Market Assumption: .
In other words, we define the largeness ratio of the market to be and analyze our mechanisms for when .
We note that our impossibility result for additive utilities (Section 6) holds for either of the two definitions.
1.2 Our Results
In this paper, we design optimal budget-feasible mechanisms for large markets. To the best of our knowledge, we are the first to study the case of large markets. We list our results below:

If the items are divisible, we design a deterministic mechanism which satisfies all the required properties and has an approximation ratio of $1-1/e$ (Section 5). Note that previously, no mechanism was known for the case of divisible items. In fact, one can show that no bounded approximation ratio is possible for divisible items if the large market assumption is dropped.

If the items are indivisible, we can modify our mechanism and give a randomized truthful mechanism for this case which achieves an approximation ratio of $1-1/e$ (Section C).

In Section 6, we show that the above results are optimal by proving that no truthful (and possibly randomized) mechanism can achieve an approximation ratio better than $1-1/e$. Our hardness result holds for both divisible and indivisible items.

For the case of submodular utility functions, we design deterministic mechanisms that achieve approximation ratios of and with exponential and polynomial running times respectively. Note that we only consider the case of indivisible items for submodular utility functions. (Section E)
As we saw in Section 1.1.1, one could define a notion of large market, i.e. a market with largeness ratio . To gain a better understanding of the problem, we focus on large markets (i.e. when ) and state our main theorems for this setting. However, our mechanisms do not need “very large” markets to perform well; for instance, in the knapsack problem with additive utilities, the approximation ratio is when all the items have equal utilities (Section 5.2). (We did not try to optimize the dependence on in our analysis, as we focus on the main ideas for the sake of better understanding.) Thus, say for and (which are reasonable assumptions in many settings) we get approximation factors and respectively.
Also we point out that the above results have applications beyond crowdsourcing; for instance, see Singer (2011) for application in marketing over social networks, and Horel et al. (2013) for application in experiment design. Singer (2011) provides a truthful mechanism with approximation ratio and Horel et al. (2013) provides an approximately truthful mechanism with approximation ratio . For both these settings, the large market assumption is a very reasonable one to make; thus, our results apply to these applications as well. In particular, our results give fully truthful mechanisms for these applications with approximation ratios (for exponential and polynomial running time respectively) in large markets.
1.3 Related Work
The most relevant related work is that of Singer (2010) and Chen et al. (2011). Singer (2010) first introduced this model (without the large market assumption). For the case of additive utilities and indivisible items, he gave a deterministic mechanism with an approximation ratio of . Chen et al. (2011) later improved it to , and also gave a randomized mechanism with an approximation ratio of . They gave a lower bound of and for deterministic and randomized mechanisms respectively. For the case of submodular utilities, Singer (2010) gave a randomized mechanism with an approximation ratio of which was improved to by Chen et al. (2011). Chen et al. (2011) also gave an exponential time deterministic mechanism for submodular utility functions with an approximation ratio of .
Dobzinski et al. (2011) looked at the more general subadditive utility functions and gave a and approximation ratio for randomized and deterministic mechanisms respectively. Singla and Krause (2013a) design budget feasible mechanisms for adaptive submodular functions with applications in community sensing.
In another work, Bei et al. (2012) study this problem in the Bayesian setting. Singer (2011) looks at the application of this model in marketing over social networks. Horel et al. (2013) study the application of this model in experiment design.
Another related model that has been inspired from crowdsourcing applications is when the workers arrive online. A sequence of papers model this as an online learning problem. See Singla and Krause (2013b); Badanidiyuru et al. (2012); Singer and Mittal (2013) for more details.
Finally, we note that our assumption for large markets is similar to the assumption made in other application areas; notably in the Adwords problem as studied by Mehta et al. (2007). See Goel and Mehta (2008); Devanur and Hayes (2009); Feldman et al. (2010, 2009) for other models motivated by online advertising where they make similar assumptions.
1.4 Roadmap
The readers are encouraged to read this section before proceeding further. In Section 2, we introduce a simple proportional share mechanism, which forms the basis of the mechanisms of Singer (2010) and Chen et al. (2011). The mechanism picks a single cut-off for the utility to cost ratio in such a way that the whole budget is consumed. We then develop some intuition in Section 3. In Section 3.4, we generalize the simple proportional share mechanism to a class of mechanisms parameterized by a single-variable allocation function. In later sections, we show that this generalization improves the approximation ratio. (Although truthfulness is sacrificed, later we augment the mechanism so that it becomes truthful without compromising the approximation ratio in large markets.) We develop some intuition by considering a simple instance of our generalized mechanisms: instead of a hard cut-off that is used in the proportional share mechanism, i.e. a two-level allocation rule, we consider a special class of three-level allocation rules and show that they can improve the approximation ratio.
The generalized mechanisms introduced are not in general truthful. In Section 4, we introduce a simple method to make them truthful, while maintaining their individual rationality and budget feasibility. Later when we introduce the optimal mechanism, we show that in large markets its approximation ratio is not compromised by this method.
In Section 5, we find one of the generalized mechanisms which provides an approximation ratio of $1-1/e$ in large markets. In Section 6, we complement this result by showing that no truthful mechanism can achieve an approximation ratio better than $1-1/e$. In Section C, we adapt our mechanism to the case of indivisible items.
In Section E, we present two mechanisms for submodular utility functions which have exponential and polynomial running times and approximation ratios and , respectively.
2 A Simple Approximate Truthful Mechanism
In this section, we briefly explain the previous mechanism designed for this problem for the additive utility functions that gives an approximation factor of in large markets.
Definition 1
The cost-per-utility rate of a seller is equal to .
A natural approach to this problem tries to find a single payment-per-utility rate (denoted by rate ) at which all the winning sellers get paid. In other words, this approach picks a single number and makes a payment of to seller if she wins and pays her otherwise. For brevity, we sometimes call the payment-per-utility rate simply the rate when there is no risk of confusion.
Individual rationality implies that a seller is willing to sell her item at rate iff . Initially the buyer declares a very large rate , and then sees which sellers are willing to sell at this rate. If the total cost to buy from all these sellers at rate is higher than the budget , then the buyer decreases the rate . More formally, a natural descending price auction for this problem works as follows:

Let denote the set of active sellers, and initially set .

Start with a very high rate .

Verify if all the active sellers can be paid with rate , i.e. whether or not.

If the payment is feasible, then allocate the subset , make the payment and stop.

If the payment is not feasible then decrease slightly; update accordingly by removing the sellers for whom ; go to Step 3.
The above auction captures the main idea behind the proportional share mechanisms designed in Singer (2010); Chen et al. (2011), although they describe it in a forward auction format. (It is worth pointing out that for submodular utilities, they need to use an additional trick: constructing a (sorted) list of sellers in a greedy manner before running the auction.) It is not hard to see that the above mechanism is truthful, budget-feasible, and in large markets achieves an approximation ratio of (with small modifications, this can be converted to a randomized approximation for arbitrary markets as well; see Chen et al. (2011)).
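The descending price auction can be sketched in a few lines. This is an illustrative sketch only, not the formulation of the cited papers: it computes the stopping rate directly from the sorted cost-per-utility rates instead of decreasing the rate in small steps, it breaks ties by sort order, and the names `proportional_share`, `costs`, `utils`, and `budget` are assumptions introduced here. Utilities are assumed to be strictly positive.

```python
def proportional_share(costs, utils, budget):
    """Single-rate descending auction: returns (rate, payments by seller index).

    A seller stays active while her cost-per-utility rate is at most the
    current rate; the auction stops at the largest rate at which paying
    rate * utility to every active seller fits in the budget.
    """
    order = sorted(range(len(costs)), key=lambda i: costs[i] / utils[i])
    rates = [costs[i] / utils[i] for i in order]
    prefix = []                      # prefix[k-1] = utility of the k cheapest
    running = 0.0
    for i in order:
        running += utils[i]
        prefix.append(running)
    # Try the largest active sets first: they correspond to the highest rates.
    for k in range(len(order), 0, -1):
        cap = budget / prefix[k - 1]                       # budget-limited rate
        nxt = rates[k] if k < len(order) else float("inf")  # next seller drops out here
        r = min(cap, nxt)
        if r >= rates[k - 1]:        # all k sellers accept rate r
            return r, {i: r * utils[i] for i in order[:k]}
    return 0.0, {}                   # not even the cheapest seller is affordable
```

With costs (1, 2, 3), unit utilities, and budget 4, the sketch stops at rate 2 and pays 2 to each of the two cheapest sellers, using the whole budget.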
3 Our Approach
In this section, we give a high-level overview of our approach. Sections 3.1 and 3.2 are preliminary sections and must be read before proceeding further. The notions defined in these sections are explained more intuitively using an example in Section 3.3. Also, for the rest of the paper we will assume that the sellers’ items are divisible, unless we explicitly talk about indivisible items.
3.1 A Notion of An Allocation Rule
To build our new ideas, we first introduce and formalize the notion of an allocation rule.
An allocation rule is a function which determines how much to buy from a given seller. The domain of allocation rules is the cost per utility rate; meaning, given a pair of seller , the allocation rule says that we should buy fraction of seller ’s item. We do not enforce using the same allocation rule for all sellers.
We say an allocation rule is a Standard Allocation Rule if is a decreasing function such that and . (The choice of is just for simplifying the future calculations; it can be replaced with any other constant.)
For any standard allocation rule , we can define an associated family of allocation rules
where denotes an allocation rule which is same as except that it is stretched along the horizontal axis with ratio , i.e. for all .
As we will see later, any single standard allocation rule and its corresponding family of allocation rules will uniquely specify our mechanism. At a high level, our mechanism will work as follows: we will pick the largest positive such that is budget-feasible, meaning the sum of the payments with allocation rule does not exceed . However, note that we have not yet defined a payment rule given an allocation rule; we define it next.
3.2 Payment Rule
Recall that given a function and pair for a seller , the value of only tells us what fraction of seller ’s item to buy. But how much should we pay seller in order to give her an incentive to report her cost truthfully to the mechanism? We compute these payments based on the well-known characterization of truthful mechanisms due to Myerson (1981).
Let the payment rule for seller be denoted by for an allocation rule . Here maps the reported cost of seller into its payment.
To define , consider the following thought experiment: Let’s divide seller ’s item into different pieces. Note that now the seller’s cost for each piece is . Thus function can now be seen as mapping the cost of a single piece into the fraction of that piece that we will buy. Let denote the function that maps the cost for a single piece into a payment for that piece. Now, Myerson’s characterization (Myerson (1981)) says that the payment for each piece is given by the following formula:
Intuitively, represents the area under the curve as seen in Figure 1. Going forward, we will call the function a unit-payment rule. Note that and are related by the following formula:
Thus, to summarize, for an allocation rule , we buy units of her item, and pay her amount of money.
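The payment computation can be illustrated numerically. The sketch below assumes the standard Myerson form for a reverse auction, in which the unit payment at a reported rate x is x times the allocated fraction plus the area under the allocation curve to the right of x, and it assumes that a seller's payment is her utility times the unit payment at her cost-per-utility rate. The names `unit_payment`, `payment`, and `linear_rule`, and the linear rule itself, are hypothetical and introduced only for illustration.

```python
def linear_rule(x):
    """Hypothetical decreasing allocation rule: 1 at rate 0, 0 from rate 1 on."""
    return max(0.0, 1.0 - x)


def unit_payment(f, x, upper, steps=10_000):
    """Assumed Myerson unit payment: x * f(x) + integral of f from x to upper.

    `upper` is the rate beyond which f is zero. The integral is approximated
    with the midpoint rule.
    """
    if x >= upper:
        return 0.0
    h = (upper - x) / steps
    area = h * sum(f(x + (k + 0.5) * h) for k in range(steps))
    return x * f(x) + area


def payment(f, cost, utility, upper):
    """Total payment to a seller: utility times the unit payment at her rate."""
    return utility * unit_payment(f, cost / utility, upper)
```

Under these assumptions a seller is never paid less than her cost for the purchased fraction, since the first term alone already covers that cost and the area term is non-negative.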
Remark: The above payment rule is truthful only if the allocation rule that is offered to seller does not depend on the private information (cost in this case) of the seller . If the allocation rule does depend on the private information of the seller, then the mechanism may or may not be truthful. In Section 3.4 we give a mechanism in which the allocation rule for a seller depends on its reported cost. Later, in Section 4, we give a mechanism where the allocation rule for a seller does not depend on its reported cost; thus our payment rule will ensure that the resulting mechanism is truthful.
3.3 Example
Suppose the buyer has budget , and sellers each own an item with cost and , respectively. Also, suppose both of the items have utility .
Let the mechanism use the family of curves for where the domain of is . The mechanism should find the largest for which is budget feasible. To this end, we set to be a very large number and decrease until is budget feasible. For instance, suppose we start from (see Figure 2). Observe that the payment of the mechanism in this case would be and to and , respectively. Since the sum of payments exceeds , is not budget feasible. Consequently, we reduce further until the mechanism becomes budget feasible at : At this point, the payment of the mechanism to and would respectively be and , which sum up to exactly (see Figure 3).
3.4 First Attempt: A Parameterized Class of Envy-Free Mechanisms
In this section we describe a mechanism (denoted by Mechanism 1()) that is not always truthful, but it will form the basis of our truthful mechanism. Moreover, some structural results about this mechanism will be useful while analyzing our truthful mechanism, so we will refer to this mechanism throughout the paper. This mechanism is parameterized by the choice of a standard allocation rule . The mechanism described in this section offers a single allocation rule to all the sellers, thus it is envy-free (although it may not be truthful).
Definition 2
We say that an allocation rule is a budget-feasible allocation rule if , i.e. the payments defined with respect to sum up to .
Now given any standard allocation rule , the mechanism starts with a very large scaling ratio so that we are guaranteed to have .
Then, the mechanism decreases until the rule becomes a budget-feasible rule (say at ). The mechanism stops at this point and uses and to determine the allocations and payments. The ratio is also called the stopping rate of the mechanism. We define this process formally in Mechanism 1().
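The search for the stopping rate can be sketched as follows. This is a minimal sketch under stated assumptions, not the formal Mechanism 1: bisection replaces the continuous decrease of the scaling ratio, the stretched rule is taken to be f(x / r), the unit payment is assumed to have the standard Myerson form, and `linear_rule`, `unit_payment`, `total_payment`, `stopping_rate`, and the `support` parameter are hypothetical names introduced here. Utilities and the budget are assumed to be strictly positive.

```python
def linear_rule(x):
    """Hypothetical standard rule: decreasing, 1 at rate 0 and 0 from rate 1 on."""
    return max(0.0, 1.0 - x)


def unit_payment(f, x, r, support=1.0, steps=2000):
    """Unit payment under the stretched rule f_r(y) = f(y / r).

    Assumed Myerson form: x * f_r(x) plus the area under f_r to the right of x,
    integrated with the midpoint rule up to support * r, where f_r reaches 0.
    """
    upper = support * r
    if x >= upper:
        return 0.0
    h = (upper - x) / steps
    area = h * sum(f((x + (k + 0.5) * h) / r) for k in range(steps))
    return x * f(x / r) + area


def total_payment(f, costs, utils, r):
    """Sum of payments to all sellers when every seller is offered f_r."""
    return sum(u * unit_payment(f, c / u, r) for c, u in zip(costs, utils))


def stopping_rate(f, costs, utils, budget, iters=60):
    """Largest r (up to bisection accuracy) whose total payment fits the budget."""
    lo, hi = 0.0, 1.0
    while total_payment(f, costs, utils, hi) <= budget:
        hi *= 2.0                      # grow until the budget is exceeded
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_payment(f, costs, utils, mid) <= budget:
            lo = mid                   # still affordable: try a larger rate
        else:
            hi = mid
    return lo
```

For two sellers with cost 1 and utility 1 each and budget 1.5, the hypothetical linear rule stops at rate 2: each seller sells half of her item and is paid 0.75.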
One can easily see that the above mechanism is budget-feasible, individually rational, and envy-free; however, it may not be truthful. Also, the efficiency of the above mechanism depends on the choice of function . Thus, an important question is: What is the optimal choice of function ? Let’s first understand the performance of the above mechanism for a simple choice of function .
Definition 3
A standard allocation rule is called a uniform standard allocation rule if for , and otherwise. Figure 4 depicts this curve.
One can show that the above envy-free mechanism, when run using a uniform standard allocation rule, mimics the simple factor mechanism presented earlier. Thus, it turns out to be truthful as well for this choice of standard allocation rule. However, for more general allocation rules, the above envy-free mechanism might not be truthful. Thus, before we answer the harder question about the optimal choice of function , we next describe the truthful version of the above envy-free mechanism.
4 A Parameterized Class of Truthful Mechanisms
We use a simple trick to convert Mechanism 1() to a truthful mechanism. The idea is to define, for each seller , an allocation rule which does not depend on . In particular, we define the allocation rule for seller to be , where will be chosen independently of . For finding , we run Mechanism 1() on the instance which is obtained by setting to be while keeping cost of the other sellers intact; would be the stopping rate of the mechanism 1(). The formal definition of the truthful mechanism appears in Mechanism 2.
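The conversion can be sketched for a concrete rule. The sketch below is illustrative only and rests on several assumptions: the rate offered to a seller is obtained by replacing her own cost with 0 (the reading suggested by the budget-feasibility argument that follows), the standard rule is a hypothetical linear one whose stretched unit payment has the closed form (r^2 - x^2) / (2r) under the standard Myerson payment, and bisection replaces the descending search. All function names are introduced here for illustration; utilities and the budget are assumed to be strictly positive.

```python
def alloc(x, r):
    """Stretched hypothetical linear rule f_r(x) = max(0, 1 - x / r)."""
    return max(0.0, 1.0 - x / r)


def unit_pay(x, r):
    """Closed form of x * f_r(x) + area under f_r right of x, for the linear rule."""
    return (r * r - x * x) / (2.0 * r) if x < r else 0.0


def stopping_rate(costs, utils, budget, iters=100):
    """Bisection for the largest rate whose total payment fits the budget."""
    def total(r):
        return sum(u * unit_pay(c / u, r) for c, u in zip(costs, utils))

    lo, hi = 0.0, 1.0
    while total(hi) <= budget:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def truthful_mechanism(costs, utils, budget):
    """Per seller: (rate offered, fraction bought, payment).

    The rate offered to seller i is computed with her own cost replaced by 0
    (an assumption of this sketch), so it does not depend on her report.
    """
    result = []
    for i, (c, u) in enumerate(zip(costs, utils)):
        others = list(costs)
        others[i] = 0.0
        r_i = stopping_rate(others, utils, budget)
        x = c / u
        result.append((r_i, alloc(x, r_i), u * unit_pay(x, r_i)))
    return result
```

On the instance with two sellers of cost 1 and utility 1 and budget 1.5, each seller is offered a rate of about 1.78, below the common stopping rate 2 of the envy-free version, so the total payment stays within the budget.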
In Lemma 2, we prove that Mechanism 2 is individually rational, truthful, and budget-feasible for any given standard allocation rule . First, we state the following useful lemma.
Lemma 1
For any seller we have .
Proof. The proof is based on the fact that is an increasing function of (for a fixed ) and is a decreasing function of (for a fixed ). The proof is by contradiction: suppose . Let for all and let . Observe that
where the first inequality is due to the fact that is a decreasing function of and the second inequality is due to the fact that . However, note that the above inequalities imply that , which contradicts the budget feasibility of Mechanism 1(): observe that represents the payment of 1() when the costs are , and so it cannot be larger than .
Lemma 2
Mechanism 2 is individually rational, truthful, and budget-feasible.
Proof. Note that the allocation and payment rules for seller , i.e. , do not depend on the cost reported by her. This fact, along with the fact that is a monotone rule (decreasing function) implies individual rationality and truthfulness. The proof is almost identical to the proof of Myerson’s Lemma and we do not repeat it here.
The proof for budget feasibility needs a bit more work. Let denote the payments to seller respectively in Mechanism 2 and Mechanism 1(), i.e. and . The lemma is proved if we show that , since we have .
To see , note that is an increasing function of (for a fixed ). So, since we have due to Lemma 1, it must be the case that .
5 A (1-1/e)-Approximate Optimal Truthful Mechanism
So far, we have introduced a parameterized class of individually rational, truthful, and budget-feasible mechanisms for the problem: Passing any standard allocation rule to Mechanism 2 fixes the mechanism which we denote by . Our goal in this section is to find the most efficient mechanism in this class. Formally, given a standard allocation rule , we denote the approximation ratio of by and define it as:
where the infimum is taken over all instances of the problem. (If we are focused on large markets, we take only instances for which the largeness ratio is smaller than some threshold, and take the limiting approximation factor as the threshold goes to .) Here denotes the utility obtained by in instance , and denotes the optimum utility in instance .
The most efficient allocation rule , is the one which maximizes . Our goal, in this section and Section 6, is to find the most efficient allocation rule and its corresponding approximation ratio. Formally, we prove the following theorem.
Theorem 1
The most efficient standard allocation rule for Mechanism 2 is , for which we get , i.e., it has an approximation ratio of $1-1/e$.
We prove this theorem in two parts: In the first part we show that for ; this is proved in the current section. In the second part, we show that for any standard allocation rule . This fact can be seen as a consequence of our hardness result in Section 6, which states that no truthful mechanism can achieve an approximation ratio better than $1-1/e$. We also provide a more direct (alternative) proof in Section D that shows our choice of is optimal among all possible choices of the standard allocation rules.
5.1 Finding an optimal for the (non-truthful) Mechanism 1()
In this section, we prove that Mechanism 1() has approximation ratio $1-1/e$ for . Note that Mechanism 1() is not truthful; however, its analysis will be helpful when analyzing our truthful mechanism in Section 5.2 and in Section A. Here, we analyze Mechanism 1() assuming that the true costs are known; later, in Section 5.2, we use this result to prove that Mechanism 2 has approximation ratio $1-1/e$ for the same choice of .
5.1.1 Preliminaries
We use to denote the inverse of an allocation rule , i.e. . Given an allocation rule , we also write an alternative definition of its corresponding unit-payment rule . This definition, rather than being in terms of , would be in terms of . This alternative definition is denoted by , and is defined such that . For instance, if a seller owns an item with utility , then we pay her when a fraction of her item is allocated. To be more precise, for we define
We also denote and respectively by and .
Proposition 1
Given the standard allocation rule , it is straightforward to verify that and . Also, .
From now on in this section, we assume that . Next, we prove a useful inequality in the following lemma which will be used in the analysis of 1().
Lemma 3
For any such that we have
Proof.
5.1.2 Approximation Ratio of Mechanism 1()
In the following lemma, we prove the efficiency of Mechanism 1() when all sellers report true costs.
Lemma 4
If sellers report true costs, then Mechanism 1() has approximation ratio $1-1/e$.
Proof. Observe that w.l.o.g. we can assume : If , then we can construct a new instance which is similar to the original instance and has stopping rate . More precisely, there exists some such that if we multiply the budget and the reported costs by , the stopping rate becomes equal to . Note that this operation will not change the optimal solution or the solution of 1() and can be performed w.l.o.g.
Now, suppose that a fraction of item is allocated by 1(). Since , we can use Lemma 3 to write the following set of inequalities:
where is the fraction that is allocated from seller in the optimal solution (recall that we are comparing 1() with the optimum fractional solution). The above inequalities can be multiplied by on both sides and written as:
By adding up these inequalities, we get:
(1) 
Now, we show that if
(2) 
then the lemma is proved using (1) and (2). First we show why (1) and (2) prove the lemma, and then in the end, we prove (2) itself.
5.2 Special case: Analyzing our truthful mechanism for unit utilities
In this section, we prove that Mechanism 2 has approximation ratio $1-1/e$ in large markets when it uses the standard allocation rule for the special case when all the utilities are equal to 1. In other words, we will show that the approximation ratio approaches $1-1/e$ as the market's largeness ratio approaches 0 for the case of unit utilities. The proof for the case of general utilities is intricate and appears in Section A.
Note that the assumption of unit utilities implies for any seller . Next, we state two lemmas before proving the approximation ratio. For simplicity in the analysis, w.l.o.g., assume that .
Lemma 5
.
Lemma 6
Let denote the maximum utility that the buyer can achieve with budget (when the items are divisible). Then, is a concave function.
Proofs for both of these lemmas are straightforward and are deferred to the appendix, Section B.
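The concavity stated in Lemma 6 can be checked numerically on a small example. The sketch below is an illustration only, not the proof from the appendix: it assumes items are given as hypothetical (utility, cost) pairs with positive costs, computes the optimal fractional solution greedily by utility per unit cost, and tests that second differences on an equally spaced budget grid are non-positive.

```python
from typing import List, Tuple


def max_fractional_utility(items: List[Tuple[float, float]], budget: float) -> float:
    """Maximum utility obtainable from divisible (utility, cost) items within budget."""
    total = 0.0
    remaining = budget
    # Buying in decreasing order of utility per unit cost is optimal for divisible items.
    for utility, cost in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / cost)
        total += fraction * utility
        remaining -= fraction * cost
    return total


def is_concave_on_grid(items: List[Tuple[float, float]],
                       budgets: List[float], tol: float = 1e-9) -> bool:
    """Check non-positive second differences on an equally spaced budget grid."""
    values = [max_fractional_utility(items, b) for b in budgets]
    return all(values[i - 1] + values[i + 1] - 2 * values[i] <= tol
               for i in range(1, len(values) - 1))


# Hypothetical instance: (utility, cost) pairs chosen only for illustration.
items = [(3.0, 1.0), (2.0, 2.0), (4.0, 1.5), (1.0, 3.0)]
budgets = [0.25 * k for k in range(33)]
print(is_concave_on_grid(items, budgets))
```

The optimal value is piecewise linear in the budget with decreasing slopes (the sorted utility-to-cost ratios), which is why the check succeeds on this instance.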
6 Impossibility Result: On why $1-1/e$ is the best approximation possible
In this section we show that no truthful (and possibly randomized) mechanism achieves approximation ratio better than $1-1/e$. We prove a stronger claim by requiring budget feasibility only in expectation, i.e. we prove that no truthful mechanism that is budget feasible in expectation can achieve ratio better than $1-1/e$. From now on in this section, we assume that all the mechanisms that we refer to are truthful, and are also budget feasible in expectation. First, we prove the claim assuming that the items are indivisible; then we will see that the same proof easily extends to divisible items as well.
Proof Outline.
We construct a Bayesian instance of the problem and prove that no budget feasible truthful mechanism for this instance can achieve approximation ratio better than $1-1/e$; this also implies that no mechanism for the prior-free setting can achieve ratio better than $1-1/e$. (Footnote 9: This is so because an approximate mechanism in the prior-free setting is also approximate in the Bayesian setting.)
The proof is done in two steps. First, we show that for any truthful mechanism for this instance, there exists a simple posted price mechanism that achieves at least the same utility. The posted price mechanism simply offers the same price to every seller, pays that price to any seller who accepts the offer, and pays nothing to the others. In the second step of the proof, we show that for any choice of the price, such mechanisms cannot achieve a ratio better than $1-1/e$.
The proof that we present w.l.o.g. analyzes the market in expectation: budget feasibility is satisfied in expectation; also, the utility of the mechanisms is computed in expectation.
We now give the full proof by first giving our hardness instance.
The Hardness Instance.
We construct a Bayesian instance of the problem in which all the sellers have unit utility and their costs are drawn i.i.d. from a distribution with CDF , defined as follows:
In other words, denotes the probability that the cost of a seller is at most . Let be the distribution defined by and let denote the expected cost of a seller sampled from , i.e. . We define the budget to be where is an arbitrary integer denoting the number of sellers.
Definition 4
A posted price mechanism is a mechanism that offers a price to each seller, pays her that price if she accepts the offer, and pays her nothing otherwise.
Definition 5
A uniform posted price mechanism is a posted price mechanism that offers the same price to all sellers.
Definition 6
A cutoff allocation rule is an allocation rule which allocates the whole unit of an item if its cost is less than a certain cutoff and allocates zero units otherwise. Let denote a cutoff allocation rule with the cutoff price .
It is clear that posted price mechanisms use cutoff allocation rules to allocate items from sellers.
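Definitions 4 to 6 can be made concrete with a short sketch. The function names below are hypothetical, and the code assumes that a seller accepts whenever her cost is at most the offered price (the tie-breaking at equality is an assumption, not something the definitions fix).

```python
from typing import List, Tuple


def posted_price(costs: List[float], prices: List[float]) -> Tuple[List[int], List[float]]:
    """Offer prices[i] to seller i; she accepts iff her cost is at most the offer."""
    # Each seller faces a cutoff allocation rule whose cutoff is her offered price.
    allocation = [1 if cost <= price else 0 for cost, price in zip(costs, prices)]
    # Accepting sellers are paid the offered price, all others are paid nothing.
    payments = [price if taken else 0.0 for price, taken in zip(prices, allocation)]
    return allocation, payments


def uniform_posted_price(costs: List[float], price: float) -> Tuple[List[int], List[float]]:
    """Posted price mechanism that offers the same price to every seller."""
    return posted_price(costs, [price] * len(costs))


# Hypothetical costs, for illustration only.
allocation, payments = uniform_posted_price([0.2, 0.5, 0.9], price=0.5)
print(allocation, payments)
```

Since a seller's payment depends only on the offered price and not on her reported cost, her report influences nothing beyond whether she is hired.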
Lemma 7
If the sellers' costs are drawn i.i.d. from the distribution , then for any mechanism in this Bayesian setting there exists a posted price mechanism with the same approximation ratio.
Proof. Due to Myerson's Lemma (Myerson (1981)), any truthful mechanism in the Bayesian setting can be seen as a set of allocation and payment rules corresponding to each seller, where the allocation rule is a decreasing function (of the cost) and the payment rule is defined with respect to the allocation rule as we saw in Figure 1. Given such an allocation rule for an arbitrary seller , namely , one can think of a simpler way to implement (in expectation) by finding a distribution over cutoff allocation rules.
More precisely, we find the distribution with PDF such that the distribution over all cutoff allocation rules, which assigns probability density to the cutoff allocation rule , implements the allocation rule in expectation. We prove the existence of in the following claim.
Claim 1
Define the distribution by its CDF such that for any cost . Then would implement in expectation.
Proof. All we need to show is that a seller with price would be allocated with probability in . To see this, note that the probability that the seller is allocated is exactly equal to the probability of observing a cutoff price at least when a cutoff price is sampled from . This probability is equal to by the definition of ; this proves the claim.
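The argument of Claim 1 can be illustrated by simulation. The sketch below rests on two labeled assumptions: the allocation rule `f` is a hypothetical decreasing function on [0, 1], and the cutoff distribution is taken to have CDF 1 - f, which is what the proof requires (a cutoff is at least a given cost with probability equal to the allocation at that cost). Cutoffs are drawn by inverse transform sampling.

```python
import math
import random
from typing import List


def f(cost: float) -> float:
    """Hypothetical decreasing allocation rule on [0, 1], used only for illustration."""
    return (1.0 - cost) ** 2


def sample_cutoff(rng: random.Random) -> float:
    """Draw a cutoff with CDF G(t) = 1 - f(t) = 1 - (1 - t)^2 via inverse transform."""
    u = rng.random()
    return 1.0 - math.sqrt(1.0 - u)


def empirical_allocation(cost: float, cutoffs: List[float]) -> float:
    """Fraction of sampled cutoff rules that allocate a seller with the given cost."""
    return sum(1 for t in cutoffs if t >= cost) / len(cutoffs)


rng = random.Random(0)
cutoffs = [sample_cutoff(rng) for _ in range(200_000)]
for cost in (0.1, 0.5, 0.8):
    print(cost, f(cost), round(empirical_allocation(cost, cutoffs), 3))
```

Up to sampling noise, the empirical allocation frequency should match `f(cost)` at each tested cost, mirroring the statement that the mixture of cutoff rules implements the allocation rule in expectation.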
Now, we claim that the cutoff allocation rule with the cutoff price
(4) 
achieves the same utility and spends the same budget (in expectation) as the allocation rule paired with its corresponding Myerson payment rule.
Claim 2
For any seller , achieves the same utility and spends the same budget (in expectation) as the allocation rule paired with its corresponding Myerson payment rule.
Proof. The main idea of the proof is that the set of points forms a straight line in the two-dimensional plane; see Figure 5 for a proof by picture.
Note that denotes the expected allocation when a price is offered to a seller and is the corresponding expected payment.
Now, see that the expected utility achieved by the allocation rule is , which is exactly equal to the expected utility achieved by due to (4). To prove the claim, it remains to verify that the Myerson payment rules corresponding to and spend the same budget (in expectation). To this end, just observe that is a straight line; consequently, since the allocation rules allocate equal units of items (in expectation), then they also spend the same amount of the budget (in expectation).
Due to Claim 2, the posted price mechanism that offers price to seller is budget feasible (in expectation) and also achieves an expected utility equal to the utility of the originally given mechanism. This proves the lemma.
Lemma 8
If the sellers' costs are drawn i.i.d. from the distribution , then for any (budget feasible) posted price mechanism there exists a (budget feasible) uniform posted price mechanism with the same approximation ratio.
Proof. Suppose that denotes the offered prices in a posted price mechanism and let
First, observe that the uniform posted price mechanism with price achieves a utility equal to the utility of the original posted price mechanism; this can be verified simply due to linearity of expectation. It remains to verify that the uniform posted price mechanism is budget feasible. To this end, just observe that the set of points (depicted in Figure 5) is a straight line; consequently, since the posted price mechanism and the uniform posted price mechanism allocate equal units of items (in expectation), then they also spend the same amount of the budget (in expectation).
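The straight-line argument in the proof above can be checked on a toy distribution. The sketch below does not use the paper's hardness distribution: it assumes a hypothetical CDF F(p) = 1/(2 - p) on [0, 1] (with an atom of mass 1/2 at cost 0), chosen only because its expected spend p·F(p) = 2F(p) - 1 is affine in the expected allocation F(p). It also assumes that the uniform price is the one whose acceptance probability equals the average acceptance probability of the original prices.

```python
from typing import List


def F(price: float) -> float:
    """Hypothetical cost CDF on [0, 1]; NOT the distribution of the hardness instance."""
    return 1.0 / (2.0 - price)


def F_inverse(q: float) -> float:
    """Inverse of F for q in [1/2, 1]."""
    return 2.0 - 1.0 / q


def expected_spend(price: float) -> float:
    """A seller accepts with probability F(price) and is then paid price."""
    return price * F(price)


def equivalent_uniform_price(prices: List[float]) -> float:
    """Uniform price matching the average expected allocation of the given prices."""
    average_allocation = sum(F(p) for p in prices) / len(prices)
    return F_inverse(average_allocation)


prices = [0.1, 0.4, 0.8, 1.0]  # hypothetical non-uniform offers
p = equivalent_uniform_price(prices)
n = len(prices)
print(n * F(p), sum(F(q) for q in prices))                            # expected utility
print(n * expected_spend(p), sum(expected_spend(q) for q in prices))  # expected spend
```

Because the points (F(p), p·F(p)) lie on a line for this CDF, matching the expected allocation automatically matches the expected spend, so both printed pairs agree up to floating-point error.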
Theorem 2
For the case of indivisible items, no truthful budget feasible mechanism can achieve approximation ratio better than $1-1/e$.
Proof. We use Lemma 8 and show that no uniform posted price mechanism can achieve ratio better than $1-1/e$. Equivalently, we show that the uniform posted price mechanism which spends all the budget in expectation has approximation ratio no better than $1-1/e$.
To define the uniform posted price mechanism that spends all the budget in expectation, we need to find such that . Given the definitions of and , we can solve this equation to get . Now, we are ready to compute the approximation ratio. First, note that the (expected) utility of the uniform posted price mechanism is . If we had , then we would have (the optimum solution could buy all items), and so we could write the approximation ratio as
which would prove the claim. However, although , the sum is not always bounded by , which means does not always hold. We fix this issue using Hoeffding bounds (see Section F for their formal statements). We show that although is not always bounded by , it is concentrated around its mean, , with high probability. We will see that this is enough to prove the theorem.
As a consequence of Hoeffding bounds (stated in Section F), for any we have:
(5) 
Recall that we defined and that in our hardness instance . Using (5), we will provide an upper bound on the approximation ratio which, for any constant , approaches $1-1/e$ as the number of sellers approaches infinity. This proves that the approximation ratio cannot be a constant larger than $1-1/e$.
To this end, first note that if , then we have ; this holds due to Lemma 6. We can use this fact along with (5) to write the following upper bound on the (expected) approximation ratio:
The above ratio clearly approaches as . Noting that finishes the proof.
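The concentration step can be illustrated with the standard two-sided Hoeffding inequality for a sum S of n independent random variables taking values in [a, b]: P(|S - E[S]| >= t) <= 2 exp(-2 t^2 / (n (b - a)^2)). The sketch below is not a reconstruction of inequality (5); the function name and the choice of costs in [0, 1] are assumptions made for illustration. It shows that a deviation proportional to n becomes exponentially unlikely as n grows.

```python
import math


def hoeffding_two_sided_bound(n: int, t: float, a: float = 0.0, b: float = 1.0) -> float:
    """Upper bound on P(|S - E[S]| >= t) for a sum S of n independent variables in [a, b]."""
    return 2.0 * math.exp(-2.0 * t * t / (n * (b - a) ** 2))


# Deviation of a delta fraction of n: the bound equals 2 * exp(-2 * delta^2 * n) on [0, 1].
delta = 0.05  # hypothetical constant, for illustration only
for n in (100, 1_000, 10_000):
    print(n, hoeffding_two_sided_bound(n, delta * n))
```

For small n the bound can exceed 1 and is then vacuous; it becomes informative once n is large relative to 1/delta^2, which matches the role of taking the number of sellers to infinity in the proof.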
Now we use Theorem 2 to prove its counterpart for divisible items.
Corollary 1
For the case of divisible items, no truthful budget feasible mechanism can achieve approximation ratio better than $1-1/e$.
Proof. Proof by contradiction. Suppose there exists a mechanism with approximation ratio for some constant . Then, we show that we can convert this mechanism to an approximation mechanism for indivisible items which is truthful and budget feasible in expectation. This would contradict Theorem 2.
To do this conversion, we repeat the exact same argument that we used to prove Theorem 2. As a result, we can convert the given approximation mechanism to a uniform posted price mechanism with approximation ratio . Note that all posted price mechanisms allocate items without dividing them. Consequently, we have an approximation mechanism for indivisible items. Contradiction.
7 Conclusion
Our main contribution is designing optimal budget feasible mechanisms for the knapsack model in large markets. First, we assume that the items are divisible, and study a natural class of deterministic mechanisms: each mechanism in this class is characterized by a decreasing allocation function. All the mechanisms in this class are individually rational, truthful and budget feasible, but they have different approximation ratios based on the choice of the allocation function. We find a mechanism in this class which has an approximation ratio of $1-1/e$, and prove that no truthful mechanism can achieve a better approximation ratio.
We also provide a mechanism with approximation ratio for the case of indivisible items: the idea is to first run the mechanism for divisible items, and then round the obtained fractional solution (allocation). We design a rounding process that takes the fractional allocation as its input and outputs an integral allocation with its associated payments. Due to the properties of our rounding process, the resulting mechanism is individually rational, truthful-in-expectation, and budget feasible; also, it has approximation ratio in large markets.
Finally, we study the problem for submodular utility functions with indivisible items. For this case, we first design a deterministic mechanism which has approximation ratio in large markets; this mechanism can have an exponential running time in general. Inspired by this mechanism, we also design a polynomial-time deterministic mechanism with approximation ratio . We do not provide any results for when the items are divisible in the submodular model: one has to model the utility function over divisible items; the multilinear extension (Vondrak (2008)) or the Lovász extension of submodular functions is a potential choice for this purpose. We leave this case open for future study.
Acknowledgments
We acknowledge the comments and ideas of an anonymous reviewer that helped us generalize our impossibility result to the Bayesian setting.
References
 (1) Amazon’s mechanical turk platform. https://www.mturk.com/.
 (2) Clickworker, virtual workforce. http://www.clickworker.com.
 (3) Crowdflower. http://www.crowdflower.com/.
 (4) Hoeffding bounds. http://en.wikipedia.org/wiki/Hoeffding’s_inequality.
 Badanidiyuru et al. (2012) Badanidiyuru, A., Kleinberg, R., and Singer, Y. 2012. Learning on a budget: Posted price mechanisms for online procurement. In EC.
 Bei et al. (2012) Bei, X., Chen, N., Gravin, N., and Lu, P. 2012. Budget feasible mechanism design: from prior-free to Bayesian. In STOC.
 Chen et al. (2011) Chen, N., Gravin, N., and Lu, P. 2011. On the approximability of budget feasible mechanisms. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’11. SIAM, 685–699.
 Devanur and Hayes (2009) Devanur, N. R. and Hayes, T. P. 2009. The adwords problem: online keyword matching with budgeted bidders under random permutations. In ACM Conference on Electronic Commerce.
 Dobzinski et al. (2011) Dobzinski, S., Papadimitriou, C. H., and Singer, Y. 2011. Mechanisms for complement-free procurement. In ACM Conference on Electronic Commerce. 273–282.
 Feldman et al. (2010) Feldman, J., Henzinger, M., Korula, N., Mirrokni, V. S., and Stein, C. 2010. Online stochastic packing applied to display ad allocation. In ESA (1). 182–194.
 Feldman et al. (2009) Feldman, J., Korula, N., Mirrokni, V. S., Muthukrishnan, S., and Pál, M. 2009. Online ad assignment with free disposal. In WINE. 374–385.
 Goel and Mehta (2008) Goel, G. and Mehta, A. 2008. Online budgeted matching in random input models with applications to adwords. In SODA.
 Horel et al. (2013) Horel, T., Ioannidis, S., and Muthukrishnan, S. 2013. Budget feasible mechanisms for experimental design. CoRR abs/1302.5724.
 Mehta et al. (2007) Mehta, A., Saberi, A., Vazirani, U. V., and Vazirani, V. V. 2007. Adwords and generalized online matching. J. ACM 54, 5.
 Myerson (1981) Myerson, R. B. 1981. Optimal Auction Design. Mathematics of Operations Research 6, 58–73.
 Singer (2010) Singer, Y. 2010. Budget feasible mechanisms. In FOCS. 765–774.
 Singer (2011) Singer, Y. 2011. How to win friends and influence people, truthfully: Influence maximization mechanisms for social networks. In WSDM.
 Singer and Mittal (2013) Singer, Y. and Mittal, M. 2013. Pricing mechanisms for crowdsourcing markets. In WWW. 1157–1166.
 Singla and Krause (2013a) Singla, A. and Krause, A. 2013a. Incentives for privacy tradeoff in community sensing. In AAAI Conference on Human Computation and Crowdsourcing (HCOMP).
 Singla and Krause (2013b) Singla, A. and Krause, A. 2013b. Truthful incentives in crowdsourcing tasks using regret minimization mechanisms. In WWW. WWW ’13. 1167–1178.
 Sviridenko (2004) Sviridenko, M. 2004. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters 32, 1, 41 – 43.
 Vondrak (2008) Vondrak, J. 2008. Optimal approximation for the submodular welfare problem in the value oracle model. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing. STOC ’08. ACM, 67–74.
Appendix A Analyzing our Optimal Truthful Mechanism for the general case
In this section we will prove that the approximation ratio of Mechanism 2 approaches $1-1/e$ as the market's largeness ratio approaches 0. We emphasize that here we drop the extra assumption that was made in Section 5.2: there, we assumed all items have utility 1; here we give a proof for the general case when item provides utility .
Lemma 9
For each , .
Proof. We just need to prove that is not a fit rule (i.e. does not consume all of the budget) when we set the cost of item to . First of all, note that
Here we used the fact that is a decreasing function. This implies that . This expression is the budget consumed by the rule without setting the cost of item to . When we set to , the amount of budget consumed can be bounded in the following manner
(6) 
Note that is defined to be the area of the shaded region as seen in Figure 1. Therefore one can crudely upper bound the difference by for any . Now letting , and substituting in inequality (6) we get
This completes the proof.
Lemma 10
Mechanism 2 has an approximation ratio approaching $1-1/e$ as the market's largeness ratio approaches 0.
Proof. W.l.o.g. assume that (since we can scale the budget and costs by an appropriate scaling factor). Now let us pick a constant threshold and partition the indices into two sets and : let be the set of indices where and let be the complement.
Let be the minimum where . If happens to be empty, let . Let be the budget consumed by the allocation rule , i.e. let . We will prove that is close to . If , this is obviously true because . So assume that for some .
Because of the way is chosen, we have
(7) 
Here we used the fact that (since we assumed ). Note that . Combining this with inequality (7) we get
Using Lemma 6, one can see that . But we also know from Lemma 4 that the utility achieved by is at least . Therefore we have
(8) 
For an item , we have (we used Lemma 9). Therefore
One can easily verify that is a concave function. Therefore for is minimized at . This means that
If we let , then for every we have