Dashboard Mechanisms for Online Marketplaces
This paper gives a theoretical model for design and analysis of mechanisms for online marketplaces where a bidding dashboard enables the bid-optimization of long-lived agents. We assume that a good allocation algorithm exists when given the true values of the agents and we develop online winner-pays-bid and all-pay mechanisms that implement the same outcome of the algorithm with the aid of a bidding dashboard. The bidding dashboards that we develop work in conjunction with the mechanism to guarantee that bidding according to the dashboard is strategically equivalent (with vanishing utility difference) to bidding truthfully in the truthful implementation of the allocation algorithm. Our dashboard mechanism makes only a single call to the allocation algorithm in each stage.
This paper formalizes a theoretical study of bidding dashboards for the design of online markets. In these markets short-lived users arrive frequently and are matched with long-lived agents. Rarely in practice do market mechanisms, which allow the agents to bid to optimize their matching with the users, have truthtelling as an equilibrium strategy. (Such a matching is often the combination of a ranking of the agents by the market mechanism and the users’ choice behavior.) Examples of the agents in such mechanisms include sellers posting a buy-it-now price on eBay, hoteliers choosing a commission percentage on Booking.com, and advertisers bidding in Google’s ad auction. The first two examples are “winner pays bid” for the agents; none of these examples have truthtelling as an equilibrium. Though the Google ad auction is not winner-pays-bid, its bidding dashboard displays clicks versus cost-per-click, which is equivalent to a winner-pays-bid mechanism. The non-truthfulness of mechanisms for these markets is frequently the result of practical constraints; this paper is part of a small but growing literature on the design of these non-truthful mechanisms. By formally modeling bidding dashboards, which are common in these markets, this paper gives a simple and practical approach to non-truthful mechanism design.
Two key challenges for the design of non-truthful mechanisms are (i) assisting the agents to find good strategies in these mechanisms and (ii) identifying mechanisms that have good outcomes when agents find good strategies. Bidding dashboards, which provide agents with market information relevant for bid optimization and are common in online marketplaces, can provide a solution to both issues. (For example, Google provides a dashboard that forecasts click volume and cost per click as a function of advertisers’ bids, and Booking.com provides a visibility booster that forecasts percentage of increased clicks relative to base commission as a function of the commission percentage offered by the hotel for bookings.) This paper gives a formal study of dashboard mechanisms and identifies dashboards and the accompanying mechanism that lead agents to find good strategies and in which good strategies result in good outcomes.
A novel technical feature of the mechanisms proposed by this paper is the use of econometric inference to infer the preferences of the agents in a way that permits the mechanism to then allocate nearly optimally for the inferred preferences. To illustrate this idea, consider the Nash equilibrium of the winner-pays-bid mechanism that allocates a single item to one of two agents with probability proportional to their bids, i.e., an agent wins with probability equal to her bid divided by the sum of bids, cf. the proportional share mechanism (e.g., Johari and Tsitsiklis, 2004). Fixing the bid of agent 2, and assuming agent 1’s bid is in best response, the standard characterization of equilibria in auctions allows the inversion of agent 1’s bid-optimization problem and identification of agent 1’s value. (The bid-allocation rule of agent 1 is $\tilde{x}_1(b_1) = b_1/(b_1+b_2)$. Agent 1 with value $v_1$ chooses $b_1$ to maximize $(v_1 - b_1)\,\tilde{x}_1(b_1)$; this bid satisfies $v_1 = b_1 + \tilde{x}_1(b_1)/\tilde{x}_1'(b_1)$. For example, from observed bids $b_1$ and $b_2$, we can infer that agent 1’s value is $v_1 = 2b_1 + b_1^2/b_2$.) The same holds for agent 2. Thus, in hindsight the principal knows both agents’ values. The difficulty, which this paper addresses, of using this idea in a mechanism is the potential circular reasoning that results from attempting to allocate efficiently given these inferred values.
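The inversion in the proportional-share example can be checked numerically. The following is a minimal sketch, not part of the paper's mechanism: the function names `infer_value` and `best_response` and the bid values used with them are hypothetical, chosen only for illustration. It inverts agent 1's first-order condition and confirms, by brute-force search, that the observed bid is indeed a best response for the inferred value.

```python
def infer_value(b1: float, b2: float) -> float:
    """Invert agent 1's first-order condition v1 = b1 + x(b1) / x'(b1)
    for the proportional rule x(b1) = b1 / (b1 + b2)."""
    x = b1 / (b1 + b2)             # probability that agent 1 wins
    dx = b2 / (b1 + b2) ** 2       # derivative of x with respect to b1
    return b1 + x / dx


def best_response(v1: float, b2: float, grid: int = 20_000) -> float:
    """Grid search for the bid maximizing the winner-pays-bid utility
    (v1 - b1) * b1 / (b1 + b2) over b1 in (0, v1]."""
    best_bid, best_utility = 0.0, 0.0
    for k in range(1, grid + 1):
        b1 = v1 * k / grid
        utility = (v1 - b1) * b1 / (b1 + b2)
        if utility > best_utility:
            best_bid, best_utility = b1, utility
    return best_bid
```

With the illustrative bids `b1 = 1.0` and `b2 = 1.0`, `infer_value` returns the value implied by the closed form $2b_1 + b_1^2/b_2$, and `best_response` applied to that value and `b2` recovers a bid close to `b1`, up to the grid resolution.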
The paper considers general environments for single-dimensional agents, i.e., with preferences given by a value for service and utility given by value times service probability minus payment. The main goal of the paper is to convert an allocation algorithm, which maps values of the agents to which agents are served, into a mechanism that implements the same outcome but has winner-pays-bid or all-pay format. In winner-pays-bid mechanisms the agents report bids, the mechanism selects a set of agents to win, and all winners pay their bids. In all-pay mechanisms the agents report bids, the mechanism selects a set of agents to win, and all agents pay their bids. For practical reasons, many markets are winner-pays-bid or all-pay. (All-pay mechanisms correspond to markets where agents pay monthly for service and the service level depends on total payment, but the total payment is fixed in advance.)
From a theoretical point of view, the focus on winner-pays-bid and all-pay mechanisms is interesting because the prior literature has not identified good winner-pays-bid mechanisms for general environments (with the exception of Hartline and Taggart, 2016, to be discussed with related work). For example, natural winner-pays-bid mechanisms for the canonical environment of single-minded combinatorial auctions do not exhibit equilibria that are close to optimal (even for special cases where winner determination is computationally tractable; see Lucier and Borodin, 2010, and Dütting and Kesselheim, 2015).
The motivating marketplaces for this work are ones where short-lived users perform a search or collection of searches in order to identify one or more long-lived agents with whom to transact. The principal facilitates the user search by, for example, ranking the agents by a combination of perceived match quality for the user and the agents’ willingness to pay for the match, henceforth, the agents’ valuations. Strategically, the agents are in competition with each other for matching with the user. Viewing the user as a random draw from a population of users with varying tastes results in a stochastic allocation algorithm which maps the valuation profile of agents to the probabilities that a user selects each agent. This stochastic allocation algorithm is typically monotone in each agent’s value and continuous (e.g., Athey and Nekipelov, 2010).
A dashboard gives the agent information about market conditions that enables bid optimization. Without loss in our setting, the dashboard is the agent’s predicted bid allocation rule. Our dashboard mechanism assumes that the bids are best responses to the published dashboard, i.e., they “follow the dashboard”, and uses econometric inference to invert the profile of bids to get a profile of values. The dashboard mechanism for a given dashboard and allocation algorithm is as follows:
Publish the dashboard bid allocation rule to each agent and solicit each agent’s bid.
Invert the bid of each agent, assuming it is in best response to the published dashboard, to obtain an inferred value for each agent.
Execute the allocation algorithm on the inferred values to determine the outcome; charge each winner her bid (winner-pays-bid) or all agents their bids (all-pay).
Our main result is to identify dashboards where following the dashboard is not an assumption but a near-optimal strategy. For online marketplaces, we define winner-pays-bid and all-pay dashboards for any sequence of allocation algorithms where following the dashboard for a sequence of values is approximately strategically equivalent to bidding those values in the sequential truthful mechanism corresponding to the sequence of allocation algorithms. Thus, there is an approximate correspondence between all equilibria. The notion of approximate strategic equivalence is that the allocations are identical and the differences in payments vanish with the number of stages.
Our dashboards can be implemented in the (blackbox) single-call model for mechanism design. In this model, an algorithm has been developed that is monotonic in each agent’s value and obtains a good outcome. The mechanism’s only access to the allocation algorithm is by live execution, i.e., where the outcome of the algorithm is implemented. Babaioff et al. (2010) introduced this model and showed that truthful mechanisms can be single-call implemented. We show that winner-pays-bid and all-pay dashboard mechanisms can be single-call implemented in online marketplaces.
The above approach to the design of online markets is simple and practical. Though we have specified the mechanism and framework for winner-pays-bid and all-pay mechanisms and single-dimensional agents, the approach naturally generalizes to multi-dimensional environments and other kinds of mechanisms. It requires only that, from bids that are optimized for the dashboard, the preferences of the agents can be inferred.
This paper is on non-revelation mechanism design as it applies to online markets. The goals of this paper – in providing foundations for non-revelation mechanism design – are closest to those of Hartline and Taggart (2016). Both papers study iterated environments and the design of mechanisms with welfare that is arbitrarily close to the optimal welfare under general constraints. (Hartline and Taggart additionally consider the objective of maximizing revenue.) The equilibrium concept of Hartline and Taggart is Bayes-Nash equilibrium and it is assumed that agents are short-lived. These assumptions are not appropriate for the design of online markets considered in this paper, where agents are long-lived and possibly have persistent values.
There have been extensive studies of the price of anarchy of ad auctions, e.g., on the Google search engine (Leme and Tardos, 2010; Caragiannis et al., 2011; Syrgkanis and Tardos, 2013). A conclusion of these studies is that in standard models of equilibrium the industry-standard generalized second price auction can have welfare that is a constant factor worse than the optimal welfare. In these auctions, dashboards can also be used to infer agent values and the mapping from bids to values can be used in place of the “quality score” already in place in these mechanisms. This approach could result in more efficient outcomes.
The rest of this paper is organized as follows. Section 2 gives notation for discussing allocation algorithms, mechanisms, and agents. Section 3 describes the dashboard mechanism for any strictly monotone and continuous allocation algorithm and dashboard. Section 4 gives a dashboard and proves that, in static settings, following the dashboard converges to Nash equilibrium and this equilibrium implements the algorithm’s desired allocation. Section 5 defines incentive consistency, shows that incentive consistency of follow-the-dashboard and constant-bid strategies implies that following the dashboard is a no-regret strategy, and proves that a simple all-pay dashboard is incentive consistent for these two kinds of strategies. Section 6 gives sophisticated winner-pays-bid and all-pay dashboards based on the idea of rebalancing the residual of actual and appropriate payments. These dashboards are incentive consistent for all strategies and thus the resulting dashboard mechanism is behaviorally equivalent to the truthful mechanism for the allocation algorithm. Section 7 generalizes the dashboards to environments where the mechanism only has access to the allocation algorithm by single blackbox call.
This paper considers general environments for single-dimensional linear agents. An agent has value $v$; for allocation probability $x$ and expected payment $p$, the agent’s utility is $v x - p$. A profile of $n$ agent values is denoted $\mathbf{v} = (v_1, \ldots, v_n)$; the profile with agent $i$’s value replaced with $z$ is $(z, \mathbf{v}_{-i})$.
A non-revelation mechanism maps a profile of bids $\mathbf{b}$ to profiles of allocation probabilities $\tilde{\mathbf{x}}(\mathbf{b})$ and payments $\tilde{\mathbf{p}}(\mathbf{b})$. A winner-pays-bid mechanism is specified by a bid allocation rule $\tilde{\mathbf{x}}$ and with bid payment rule $\tilde{\mathbf{p}}$ defined as $\tilde{p}_i(\mathbf{b}) = b_i\,\tilde{x}_i(\mathbf{b})$, i.e., agent $i$ pays her bid when she wins. An all-pay mechanism is specified by a bid allocation rule $\tilde{\mathbf{x}}$ and with bid payment rule $\tilde{\mathbf{p}}$ defined as $\tilde{p}_i(\mathbf{b}) = b_i$. A stochastic allocation algorithm $\mathbf{x}$ maps a valuation profile $\mathbf{v}$ to a profile of allocation probabilities $\mathbf{x}(\mathbf{v})$. Our goal is to define winner-pays-bid and all-pay mechanisms that implement the allocation algorithm $\mathbf{x}$.
We will consider mechanisms in an online setting where the mechanism is to be run over stages $s \in \{1, \ldots, t\}$. In the dynamic setting an agent’s stage value $v_i^{(s)}$ and the allocation algorithm $\mathbf{x}^{(s)}$ can change in every stage. In the static setting both are assumed to be constant. When analyzing a single stage the superscript $(s)$ will be dropped.
An important special case is where there is $n = 1$ agent, or equivalently, when the bids of all agents but one are fixed. For single-agent problems, there is a simple mapping between bid-allocation rules and value-allocation rules, optimal bids can be easily calculated from values, and values can be easily calculated from optimal bids. The paragraph below makes these observations precise.
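With value $v$ the agent’s optimal bid $b$ in single-agent mechanism $(\tilde{x}, \tilde{p})$ satisfies $v\,\tilde{x}'(b) = \tilde{p}'(b)$ from first-order conditions. For winner-pays-bid mechanisms where $\tilde{p}(b) = b\,\tilde{x}(b)$ and all-pay mechanisms where $\tilde{p}(b) = b$, the agent’s value can be calculated from her optimal bid as:
$$v = b + \frac{\tilde{x}(b)}{\tilde{x}'(b)} \qquad\text{and}\qquad v = \frac{1}{\tilde{x}'(b)}. \tag{1}$$
(Equations (1), (2), and (3) give winner-pays-bid and all-pay formulas, respectively.) Given a non-decreasing single-agent allocation rule $x$, the payment identity of Myerson (1981) gives formulas for the optimal bid $b(v)$ for value $v$ in both a winner-pays-bid mechanism and an all-pay mechanism, each implementing $x(v)$ for all $v$:
$$b(v) = v - \frac{\int_0^v x(z)\,dz}{x(v)} \qquad\text{and}\qquad b(v) = v\,x(v) - \int_0^v x(z)\,dz. \tag{2}$$
When $x$ is strictly increasing and continuously differentiable, $b$ is as well and is, thus, invertible. From this inversion $b^{-1}$, the bid allocation rule $\tilde{x}$ that implements the allocation rule $x$ is:
$$\tilde{x}(b) = x\bigl(b^{-1}(b)\bigr). \tag{3}$$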
Note in equation (3) that distinctness of strategies implies distinctness of bid allocation rules corresponding to the same allocation rule for winner-pays-bid and all-pay formats.
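The single-agent relationships above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names (`integral`, `optimal_bid`, `inferred_value`), the midpoint-rule integration, the finite-difference derivative, and the example allocation rule $x(v) = v$ are all hypothetical choices, not part of the paper. It computes the optimal bid from the payment identity, with the payment at value zero normalized to zero, and inverts a bid back to a value via the first-order condition.

```python
from typing import Callable

Rule = Callable[[float], float]


def integral(f: Rule, a: float, b: float, n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def optimal_bid(x: Rule, v: float, fmt: str) -> float:
    """Optimal bid for value v that implements the allocation rule x."""
    payment = v * x(v) - integral(x, 0.0, v)   # payment identity, p(0) = 0
    if fmt == "winner-pays-bid":
        return payment / x(v)                  # bid is paid only when winning
    return payment                             # all-pay: bid is always paid


def inferred_value(x_tilde: Rule, b: float, fmt: str, eps: float = 1e-6) -> float:
    """Value for which bid b is a best response to bid allocation rule x_tilde."""
    slope = (x_tilde(b + eps) - x_tilde(b - eps)) / (2 * eps)
    if fmt == "winner-pays-bid":
        return b + x_tilde(b) / slope
    return 1.0 / slope
```

For the illustrative rule $x(v) = v$, the winner-pays-bid strategy is $b(v) = v/2$ with bid allocation rule $\tilde{x}(b) = 2b$, while the all-pay strategy is $b(v) = v^2/2$ with $\tilde{x}(b) = \sqrt{2b}$. The two formats thus give different bid allocation rules for the same allocation rule, and in both cases `inferred_value` recovers the original value from the optimal bid.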
Our dashboard mechanisms in Section 3 effectively convert the mechanism design problem of implementing a multi-agent allocation algorithm into a collection of problems where these single-agent derivations apply. When analyzing a single agent in the multi-agent problem the subscript $i$ will be dropped and outcomes for the agent will be analyzed for implicit values or bids of other agents.
3 Dashboard Mechanisms
This section defines a family of dashboard mechanisms that give a practical approach to bidding and optimization in non-truthful Internet markets. The principal publishes a bidding dashboard which informs the agents of the market conditions; the agents use this dashboard to optimize their bids. We consider dashboards that give each agent estimates of the outcome of the mechanism for any possible bid of the agent. These estimates can be constructed, for example, from historical bid behavior of the agents in the mechanism.
Definition 1. A dashboard for an $n$-agent winner-pays-bid mechanism is a profile of bid-allocation rules $\tilde{\mathbf{y}} = (\tilde{y}_1, \ldots, \tilde{y}_n)$ where $\tilde{y}_i(b)$ is the forecast probability that agent $i$ wins with bid $b$.
We consider the implementation of an allocation algorithm with a strictly monotone and continuously differentiable allocation rule by winner-pays-bid and all-pay dashboard mechanisms. The winner-pays-bid and all-pay format dashboards are identical except with respect to equations (1)-(3) which respectively define the bid-to-value inversion, the value-to-bid optimal bidding function, and the translation from (value) allocation rules to bid allocation rules.
We consider two design questions: (a) what dashboard should the principal publish and (b) what mechanism should the principal run. The goal is to pick a dashboard for which following the dashboard is a good strategy and for which, if agents follow the dashboard, the allocation algorithm is implemented. In fact, if agents follow the dashboard, then the mechanism to implement the allocation algorithm is straightforward.
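Definition 2. The dashboard mechanism for dashboard $\tilde{\mathbf{y}}$ and allocation algorithm $\mathbf{x}$ is:
Solicit bids $\mathbf{b}$ for dashboard $\tilde{\mathbf{y}}$.
Infer values $\hat{\mathbf{v}}$ from bids $\mathbf{b}$ and dashboard $\tilde{\mathbf{y}}$ via equation (1).
Output allocation $\mathbf{x}(\hat{\mathbf{v}})$ (with prices according to the payment format).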
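The three steps of the dashboard mechanism can be sketched for a single stage as follows. This is a minimal illustration, not the paper's implementation: the names `infer_value`, `dashboard_mechanism`, and `proportional`, the finite-difference derivative used for the inversion, and the proportional-to-value allocation algorithm in the usage note are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

BidRule = Callable[[float], float]
Algorithm = Callable[[Sequence[float]], List[float]]


def infer_value(y: BidRule, bid: float, fmt: str, eps: float = 1e-6) -> float:
    """Invert a bid assumed to be a best response to the dashboard rule y."""
    slope = (y(bid + eps) - y(bid - eps)) / (2 * eps)
    if fmt == "winner-pays-bid":
        return bid + y(bid) / slope
    return 1.0 / slope


def dashboard_mechanism(
    dashboard: Sequence[BidRule],
    bids: Sequence[float],
    alloc: Algorithm,
    fmt: str = "winner-pays-bid",
) -> Tuple[List[float], List[float], List[float]]:
    """Run one stage: infer values, allocate on them, and compute payments."""
    values = [infer_value(y, b, fmt) for y, b in zip(dashboard, bids)]
    probs = alloc(values)                      # single call to the algorithm
    if fmt == "winner-pays-bid":
        payments = [b * q for b, q in zip(bids, probs)]   # expected payments
    else:
        payments = list(bids)                  # all-pay: every agent pays her bid
    return values, probs, payments


def proportional(values: Sequence[float]) -> List[float]:
    """Hypothetical allocation algorithm: serve in proportion to value."""
    total = sum(values)
    return [v / total for v in values]
```

For instance, with the hypothetical dashboard rule $b \mapsto 2b$ published to both of two agents and bids `[0.3, 0.2]`, the inferred values are approximately `[0.6, 0.4]`, and `proportional` is then run once on those inferred values. Note that the allocation comes from the algorithm evaluated on the whole inferred profile, while each dashboard rule is a single-agent forecast.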
Notice that in the above definition the bid allocation rule is a mapping from bid profiles to allocation profiles while the dashboard is a profile of single-agent bid allocation rules each of which maps a bid to an allocation probability. In other words, outcomes according to the dashboard are calculated independently across the agents while outcomes according to the mechanism depend on the reports of all agents together. The following proposition follows simply from the correctness of equation (1).
Proposition 1. In the dashboard mechanism (Definition 2) for any given strictly monotone and continuously differentiable dashboard and allocation algorithm, if the agents follow the dashboard then the allocation algorithm is implemented.
4 Inferred Values Dashboards
Our analysis of dashboards in this section is restricted to static settings where both the agents’ values and the allocation algorithm do not change from stage to stage. The question we address next is how the principal should construct the dashboard to ensure the game, if all agents follow the dashboard, converges so that following the dashboard is a Nash equilibrium. For the static setting, we will say that following the dashboard converges to Nash equilibrium if, assuming other agents follow the dashboard, an agent’s best response converges to following the dashboard.
Our approach is motivated by the solution concept of fictitious play. In fictitious play agents best respond to the empirical distribution of the actions in previous rounds. Fictitious play assumes that the agents know the actions in past rounds. The principal could publish these actions; however, a better approach is to just publish as a dashboard the aggregate bid allocation rules that result. Our dashboard follows this approach except with respect to estimated values rather than actions. Best responding to such a dashboard is, in a sense, an improvement on fictitious play.
Definition 3. For stage $s$, the inferred values dashboard is the profile of single-agent bid allocation rules $\tilde{\mathbf{y}}^{(s)}$ defined as follows:
The inferred valuation profile for stage $r < s$ is $\hat{\mathbf{v}}^{(r)}$.
The profile of single-agent allocation rules for stage $r$ is $\mathbf{x}^{(r)}$ with $x_i^{(r)}(z) = x_i(z, \hat{\mathbf{v}}_{-i}^{(r)})$ for each agent $i$.
The empirical profile of single-agent allocation rules at stage $s$ is $\mathbf{y}^{(s)}$ with $y_i^{(s)}(z) = \frac{1}{s-1} \sum_{r < s} x_i^{(r)}(z)$ for each agent $i$.
The inferred values dashboard $\tilde{\mathbf{y}}^{(s)}$ is the profile of single-agent bid allocation rules that correspond to profile $\mathbf{y}^{(s)}$ via equation (3).
Definition 4. The $k$-lookback inferred values dashboard is the variant of the inferred values dashboard that averages over the last $k$ stages. The last-stage inferred values dashboard chooses $k = 1$.
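The value-space part of the lookback construction can be sketched as below. This is an illustration under assumptions: the function names `empirical_rule` and `proportional`, the representation of the history as a list of inferred valuation profiles, and the proportional allocation algorithm are hypothetical. The sketch builds one agent's empirical allocation rule by averaging, over the most recent stages, the allocation she would receive for value $z$ against the other agents' inferred values. Converting this rule into a bid allocation rule would additionally require the strategy inversion described in Section 2.

```python
from typing import Callable, List, Sequence

Algorithm = Callable[[Sequence[float]], List[float]]


def empirical_rule(
    alloc: Algorithm,
    inferred_history: Sequence[Sequence[float]],
    agent: int,
    lookback: int,
) -> Callable[[float], float]:
    """Average agent's single-agent allocation rule over the last `lookback` stages."""
    recent = [list(profile) for profile in inferred_history[-lookback:]]

    def rule(z: float) -> float:
        total = 0.0
        for profile in recent:
            trial = list(profile)
            trial[agent] = z                   # replace the agent's own value by z
            total += alloc(trial)[agent]
        return total / len(recent)

    return rule


def proportional(values: Sequence[float]) -> List[float]:
    """Hypothetical allocation algorithm: serve in proportion to value."""
    total = sum(values)
    return [v / total for v in values]
```

Setting `lookback=1` gives the last-stage variant, which uses only the most recent inferred profile.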
The inferred values dashboard does not specify a dashboard for stage 1. For stage 1, any strictly increasing dashboard will suffice. When the agents’ values and the allocation rule are static, following the dashboard converges to Nash equilibrium.
Theorem 1. For the $k$-lookback inferred values dashboard mechanism, any fixed valuation profile $\mathbf{v}$, and any fixed continuous and strictly monotone allocation algorithm $\mathbf{x}$, if agents follow the dashboard in stages $s-k$ through $s-1$ then following the dashboard in stage $s$ is a Nash equilibrium.
The dashboards up to and including round $s-1$ are the average of the bid allocation rules for a continuous and strictly increasing allocation algorithm; thus they are continuous and strictly increasing. By equation (1) the inferred values of agents that follow the dashboard are the true values. These values are the same in each stage; thus the profile of single-agent allocation rules in each stage is the one that corresponds to the allocation algorithm on the true valuation profile. The average of these profiles of allocation rules is the profile itself (they are all the same). The dashboard is the corresponding profile of single-agent bid allocation rules.
Consider agent $i$ and assume that other agents are following the dashboard in stage $s$. Thus, the estimated profile of other agent values is $\hat{\mathbf{v}}_{-i} = \mathbf{v}_{-i}$. The allocation rule in value space faced by agent $i$ is $x_i(\cdot, \mathbf{v}_{-i})$, which is equal to the allocation rule in all the previous stages. As the dashboard suggests bidding optimally according to the allocation rule of the previous stages, bidding according to the dashboard is agent $i$’s best response. Thus, following the dashboard is a Nash equilibrium in stage $s$. ∎
This dashboard mechanism is simple and practical and can implement any strictly monotone and continuously differentiable allocation algorithm. For example, welfare maximization with a convex regularizer gives such an allocation algorithm. For the paradigmatic problem of single-minded combinatorial auction environments, our dashboard mechanism with the regularized welfare-maximization allocation algorithm gives outcomes that can be arbitrarily close to optimal (for the appropriate regularizer).
A critical issue with the guarantee of Theorem 1 is that it is delicate to the Nash assumption. If other agents do not follow the dashboard, then following the dashboard is not necessarily a good strategy for the agent. The remainder of the paper will resolve this issue by giving stronger analyses and stronger dashboards. In the next section we will show that the regret, in the learning sense, of an agent that follows the dashboard in the last-stage all-pay dashboard mechanism vanishes with the number of stages $t$. Specifically, for any actions of the other agents, an agent with value $v$ that follows the dashboard has average stage utility that is at least the average stage utility of any fixed bid in hindsight (up to an additive loss of $v/t$).
5 Incentive Consistency, No-Regret, and the All-pay Last-stage Dashboard
In this section we define a notion of incentive inconsistency that will quantify as an additive loss how different the average utilities obtained from strategies in dashboard mechanisms are from those obtained from strategies in truthful mechanisms. Outcomes from strategies with vanishing incentive inconsistency will be analyzable via the usual geometric understanding of incentive compatible mechanisms in single-dimensional environments from Myerson (1981). The strategies in the definition can be any, perhaps randomized, mapping from an agent’s value and observable histories to bids, as usual in an extensive form game.
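Definition 5. A dashboard mechanism is $\epsilon$ incentive inconsistent for an agent strategy if, for any induced allocation rules $x^{(1)}, \ldots, x^{(t)}$ of the mechanism, the agent strategy generates a sequence of bids that imply inferred values $\hat{v}^{(1)}, \ldots, \hat{v}^{(t)}$ for which the average per-stage outcomes satisfy
$$\Bigl|\frac{1}{t} \sum_{s=1}^{t} \bigl[\tilde{p}^{(s)} - p^{(s)}(\hat{v}^{(s)})\bigr]\Bigr| \le \epsilon, \tag{4}$$
where $\tilde{p}^{(s)}$ is the agent’s expected payment in stage $s$ and $p^{(s)}$ is the incentive compatible payment rule for $x^{(s)}$.
If this condition holds for all strategies, the dashboard is $\epsilon$ incentive inconsistent. A dashboard is incentive consistent if its incentive inconsistency vanishes with the number of stages $t$.
A consequence of incentive inconsistency is that the (value) allocation rules induce agent utility as they usually do in Nash equilibrium. Specifically, denote by $u^{(s)}(v, \hat{v})$ the utility of an agent with value $v$ following the dashboard optimally for an agent with value $\hat{v}$.
Lemma 1. If the dashboard mechanism is $\epsilon$ incentive inconsistent with the agent strategy then the agent with values $v^{(1)}, \ldots, v^{(t)}$ in each stage and inferred values according to the strategy of $\hat{v}^{(1)}, \ldots, \hat{v}^{(t)}$ in each stage has average per-stage utility within $\epsilon$ of $\frac{1}{t} \sum_{s=1}^{t} u^{(s)}(v^{(s)}, \hat{v}^{(s)})$.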
From the definitions of utility and incentive inconsistency. ∎
In this section we will focus on two particular kinds of strategies. In the follow-the-dashboard strategy an agent with value $v$ bids optimally for her value and the given dashboard $\tilde{y}$. In the constant-bid strategy an agent bids a fixed bid $b$ in every stage. The following theorem shows that following the dashboard is a no-regret strategy.
Theorem 2. If a dashboard is $\epsilon_1$ and $\epsilon_2$ incentive inconsistent for the follow-the-dashboard strategy and all constant-bid strategies, respectively, then the follow-the-dashboard strategy has regret at most $\epsilon_1 + \epsilon_2$.
Consider the follow-the-dashboard strategy and fix the resulting sequence of stage allocation rules. We compare the utility of following the dashboard to any fixed bid in hindsight (where “in hindsight” means with respect to the stage allocation rules resulting from the follow-the-dashboard strategy).
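Write the utility of the agent in stage $s$ with value $v$ and estimated value $\hat{v}$ in the incentive compatible mechanism with allocation rule $x^{(s)}$ as:
$$u^{(s)}(v, \hat{v}) = v\,x^{(s)}(\hat{v}) - p^{(s)}(\hat{v}). \tag{5}$$
Incentive compatibility requires for all $z$ that the true value gives higher utility than any other report:
$$u^{(s)}(v, v) \ge u^{(s)}(v, z). \tag{6}$$
The inferred values for the follow-the-dashboard strategy are identically $v$. Denote the inferred values for the constant-bid strategy as $\hat{v}^{(1)}, \ldots, \hat{v}^{(t)}$. Denote the agent’s average stage utility from following the dashboard by $U_{\mathrm{FTD}}$ and the utility from the constant bid by $U_b$. The definition of incentive inconsistency, definition of utility (5), and truth-telling utility inequality (6) give:
$$U_{\mathrm{FTD}} \;\ge\; \frac{1}{t} \sum_{s=1}^{t} u^{(s)}(v, v) - \epsilon_1 \;\ge\; \frac{1}{t} \sum_{s=1}^{t} u^{(s)}(v, \hat{v}^{(s)}) - \epsilon_1 \;\ge\; U_b - \epsilon_1 - \epsilon_2.$$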
Below we show that the all-pay last-stage dashboard (Definition 4) is incentive consistent for follow-the-dashboard and constant-bid strategies. A key observation that makes the incentive consistency straightforward in the all-pay last-stage dashboard mechanism is that the payment in stage $s+1$ is the correct incentive compatible payment for the allocation of stage $s$. Thus, pairing as “outcome $s$” the allocation from stage $s$ with the payment from stage $s+1$, we may as well view the outcomes as those from a sequence of incentive compatible mechanisms when averaging across stages (up to a vanishing loss of at most $v/t$ for the one stage that cannot be paired like this).
The lemmas below show that the all-pay last-stage dashboard is $\epsilon$ incentive inconsistent, with $\epsilon$ vanishing in $t$, for both the strategy of following the dashboard for any fixed value $v$ and the strategy of bidding any fixed bid $b$. For the former $\epsilon = v/t$; for the latter $\epsilon = 0$.
Lemma 2. For an agent with constant per-stage value $v$, the last-stage all-pay dashboard is $v/t$ incentive inconsistent for the follow-the-dashboard strategy.
We will consider grouping the allocation of stage $s$ with the payment in stage $s+1$ (with the allocation of round $t$ paired with the payment of round $1$) and we will refer to this grouping as outcome $s$. As we have discussed above, if the agent follows the dashboard, outcomes corresponding to $s \in \{1, \ldots, t-1\}$ are incentive compatible outcomes for the allocation rule of stage $s$ and value $v$. The imbalance of payments from outcome $t$ (allocation from stage $t$ and payment from stage 1; for formal argument see Lemma 4 in Section 6) is at most $v$. ∎
Lemma 3. The last-stage all-pay dashboard is $0$ incentive inconsistent for the constant-bid strategy.
For fixed stage allocation rules, consider the hindsight deviation of an agent to bid $b$ in every stage. Similarly grouping allocations and payments into outcomes as in the proof of Lemma 2, the outcome in stage $s$ is an incentive compatible outcome for the allocation rule of stage $s$ and the value $\hat{v}^{(s)}$ that would bid $b$. Thus, the incentive inconsistency is zero. ∎
Comparing the utility of the agent for following the dashboard and bidding $b$ for outcome $s$ we see that the former is the utility maximizing outcome for value $v$ while the latter is the utility maximizing outcome for value $\hat{v}^{(s)}$. Since the agent’s value is $v$ the utility of the former is at least the utility of the latter. For outcome $t$ (allocation from stage $t$ and payment from stage 1), the utility from following the dashboard can be worse than the incentive compatible utility by at most $v$ (as the difference between the correct payment for round $t$ and the actual payment in round 1 is at most $v$), and the incentive compatible utility is at least the utility from bid $b$. Thus, the total regret is at most $v$ and the per-stage regret is at most $v/t$.
Corollary 1. For the all-pay last-stage dashboard mechanism, an agent with value $v$ that follows the dashboard for $t$ stages has per-stage regret at most $v/t$.
The above analysis of incentive consistency of the follow-the-dashboard strategy guarantees that following the dashboard is a good strategy compared to single-bid deviations when the agent’s value is constant in each round. This analysis does not rule out the possibility that some other strategy is better than following the dashboard. In the next section we develop dashboard mechanisms that are incentive consistent for all strategies. Importantly, this property enables dashboards that work even when the agents’ values and the allocation algorithm change from stage to stage.
6 Payment Rebalancing Dashboards
Below we develop a dashboard that will satisfy the strong guarantee that its equilibria are approximately the same as the equilibria from running the truthful mechanism that implements the allocation algorithm in each stage. By the following theorem, it will be sufficient to develop a dashboard that is incentive consistent for all strategies.
Theorem 3. A dashboard mechanism that is $\epsilon$-incentive inconsistent for all strategies is strategically equivalent up to $\epsilon$ utility to a sequential truthful mechanism with the same sequence of allocation algorithms $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(t)}$.
Strategy profiles in the dashboard mechanism are in one-to-one correspondence with sequences of inferred valuation profiles $\hat{\mathbf{v}}^{(1)}, \ldots, \hat{\mathbf{v}}^{(t)}$. If each agent reports their sequence of inferred values in the truthful mechanism, incentive consistency implies that the utility that each agent receives is within $\epsilon$ of their utility in the dashboard mechanism. Thus, all approximate equilibria in the sequential truthful mechanism are approximate equilibria of the dashboard mechanism and vice versa. ∎
We now develop a dashboard mechanism that is incentive consistent for all strategies. The high-level approach of this dashboard is simple: When the actual payment and the incentive compatible payment are different, add the residual payment to a balance and adjust the dashboard to either collect additional payment or to discount the required payment so that the residual payment is resolved over a few subsequent stages. The difference in utility of an agent in such a dashboard mechanism and the truthful mechanism with the same allocation algorithm is bounded by the per-stage residual payment and the number of stages it takes to resolve it. When these quantities are both constants, the average per-stage difference between the dashboard mechanism’s outcome and the truthful mechanism’s outcome vanishes with the number of stages.
The existence of this dashboard shows that in a repeated scenario there is essentially no difference between winner-pays-bid, all-pay, and truthful payment formats. In some sense, linking payments between stages and adjusting the mapping from bids to values allows a mechanism designer to choose a payment format that is appropriate for the application.
Three final comments before presenting the mechanism: First, balancing payments in all-pay mechanisms is much easier than balancing payments in winner-pays-bid mechanisms. The reason is simple: in all-pay mechanisms the payment is deterministic and, thus, any additional payment requested is paid exactly. On the other hand, payments collected in winner-pays-bid mechanisms depend on the probability of allocation. If this probability of allocation is fluctuating then collected payments can over- or under-shoot a target. We give an approach below that resolves this issue. Second, in the static setting where the agent values and the allocation algorithm are unchanging, in the follow-the-dashboard equilibrium of the dashboard mechanism the only non-trivial payment residual is in the first stage; once this balance is resolved, the accrual of subsequent balance is off the equilibrium path. Thus, in steady state the rebalancing required by the dashboard is trivial. Third, dashboards with payment rebalancing allow dynamically changing environments, e.g., agent values and the allocation algorithm. For agents that follow the dashboard, the rebalancing mechanism only kicks in when the environment changes.
The main idea in this rebalancing approach is that adding a constant to the agent’s expected payment, i.e., setting $p(0) \neq 0$ in the payment identity, does not affect incentives. While this approach can occasionally violate per-stage individual rationality, an agent’s long term utility is optimized by continuing to participate even in these stages.
Definition 6. The rebalancing dashboard for dashboard allocation rule $y$, corresponding payment rule $q$, rebalancing rate $\eta$, and outstanding balance $L$ is $\tilde{y}^{\dagger}$ that corresponds to payment rule $q^{\dagger}$ defined as $q^{\dagger}(v) = q(v) + \eta L$, i.e., with $q^{\dagger}(0) = \eta L$.
From Definition 6, the bidding strategy can be calculated for winner-pays-bid and all-pay dashboards, respectively, as
$$b^{\dagger}(v) = \frac{q(v)}{y(v)} + \frac{\eta L}{y(v)} \qquad\text{and}\qquad b^{\dagger}(v) = q(v) + \eta L. \tag{7}$$
The bid-allocation rule of the dashboard is then defined via the allocation rule and the inverse of the strategy via equation (3). The bid of an agent should be viewed as shown in equation (7) as two terms. The first term is for the dashboard and the second term is for resolving the outstanding balance. After each stage the balance is adjusted to account for how much of the outstanding balance was resolved and by any new payment residual resulting from misestimation of the dashboard and the realized allocation rule . While our analysis will keep track of when payment residuals are generated and how long it takes to resolve them, the balance tracking need only consider the difference between what was paid and what should have been paid for the realized allocation rule.
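The two-term view of the bid can be illustrated with a small numerical sketch. It assumes the standard payment identity (expected payment equals value times allocation, minus the integral of the allocation rule, plus a constant) and treats the constant `offset` as the portion of the outstanding balance being collected in the stage. The function names, the trapezoid integration, and the example allocation rule are illustrative assumptions, not notation from the paper.

```python
def expected_payment(alloc, v, offset=0.0, steps=1000):
    """Payment identity: v * alloc(v) - integral_0^v alloc(z) dz + offset.

    `offset` is the constant added to the expected payment (an assumption
    standing in for the share of the balance collected this stage).
    """
    h = v / steps
    # Trapezoid rule for the integral of the allocation rule on [0, v].
    integral = sum(
        0.5 * (alloc(i * h) + alloc((i + 1) * h)) * h for i in range(steps)
    )
    return v * alloc(v) - integral + offset


def winner_pays_bid(alloc, v, offset=0.0):
    """Winner-pays-bid: the bid is the expected payment divided by the
    allocation probability, so it splits into a dashboard term and a
    balance term offset / alloc(v)."""
    return expected_payment(alloc, v, offset) / alloc(v)


def all_pay_bid(alloc, v, offset=0.0):
    """All-pay: the bid is paid with certainty, so it equals the expected
    payment and the balance term is just the offset."""
    return expected_payment(alloc, v, offset)


# Hypothetical dashboard allocation rule y(v) = v on [0, 1].
y = lambda z: z
print(winner_pays_bid(y, 0.8))               # dashboard term only
print(winner_pays_bid(y, 0.8, offset=0.08))  # dashboard term plus balance term
print(all_pay_bid(y, 0.8, offset=0.08))
```

Because the offset does not depend on the value, it shifts the payment of every type equally, which is why it leaves incentives unchanged.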
For allocation rule , corresponding payment rule , dashboard allocation rule , and inferred value , the payment residual under the payment rebalancing dashboard is
in winner-pays-bid and all-pay formats, respectively. The balance resolved is
respectively. The total change to the balance is .
Note that in the above definition for winner-pays-bid mechanisms, if then and . For all-pay mechanisms with , then and . The perspective to have is that in steady-state it should be that and there should be no payment residual but, when agents arrive, depart, or have value changes, then there can be non-trivial payment residual which will be rebalanced by the dashboard. More generally the following lemma bounds the magnitude of the payment residual.
In a stage in which the agent’s inferred value is , the magnitude of payment residual is at most .
From equation (8), for all-pay mechanisms the payment residual is the difference in bids in an individually rational mechanism. As bids for an agent with value are between and , the magnitude of their difference is at most . For winner-pays-bid is again a difference of bids and multiplying this difference by preserves the bound of on its magnitude. ∎
The analysis in the sections below makes no assumption about whether the mechanism is in steady-state. Both the mechanism and the agents’ values may change from stage to stage. The main idea of the analysis is to consider the payment residual in stage and calculate how much of it can remain after stage when a guaranteed fraction of it is rebalanced as part of for subsequent stages . We give a worst-case analysis that pessimistically assumes that the payment residual in each stage has the same sign and is equal to its maximum value, i.e., the value of the agent. These bounds apply to any dashboard, even one that has no relation to the realized allocation rule.
6.1 All-pay Rebalancing Dashboards
From stage with outstanding balance , agent bid , inferred value , realized allocation rule , and corresponding payment rule ; the stage outstanding balance in the all-pay rebalancing dashboard is:
The analysis of the rebalancing mechanism for all-pay mechanisms is trivial. To parallel the subsequent presentation for winner-pays-bid mechanisms below, we write the formal lemma and theorem. Lemma 5 combines with Lemma 4, bounding the payment residual, to give the theorem.
In the all-pay dashboard with rebalancing rate , in any stage the balance resolved is ; for the full outstanding balance is resolved.
In dynamic environments, the all-pay payment rebalancing dashboard with and per-stage estimated value at most has outstanding balance at stage at most ; consequently, over stages the dashboard mechanism is incentive inconsistent for all strategies.
As we saw in the previous section, the all-pay auction generally balances itself when grouping the allocation in stage with the payment in stage . It is straightforward to adjust the rebalancing mechanism to only resolve balance that is residual after this natural balancing of the all-pay dashboard. One approach is to allow the magnitude of the outstanding balance to be a small positive multiple of and only resolve the outstanding balance when this allowed magnitude is exceeded. We omit further details.
Finally, note that Theorem 4 can be improved in static environments where both inferred stage values and the single-agent allocation rules induced from the values of the other agents are static, i.e., and . In these cases the inferred values dashboard is in all stages but the first and only this first stage accrues payment residual. Thus, the total balance at stage with is .
6.2 Winner-pays-bid Rebalancing Dashboards
Defining rebalancing dashboards is more challenging for winner-pays-bid dashboards as the balance resolved depends on the dashboard at the estimated value which we do not require to be constant across rounds. We therefore choose a conservative rebalancing rate to avoid overshooting the target of zero balance. For dashboards that have a lower bound on the minimum allocation probability , it is possible to avoid overshooting by setting the rebalancing rate to (specifically is a good choice). Our perspective on the assumption that the allocation probability is lower bounded by is analogous to the assumption in multi-armed bandit learning that each arm is played with a minimum probability, i.e., exploring. Any dashboard or mechanism that does not satisfy this property can be made to satisfy it, e.g., by randomly allocating to the agent with some small probability.
Since we are considering winner-pays-bid mechanisms we will only adjust an agent’s balance when the agent is allocated. To make explicit the actual values versus their expectations we will adopt notation and for the actual balance and actual allocation with and . We write, per the discussion above, the payment residual and amount rebalanced for inferred value as:
where and . Note that is the winner-pays-bid bid strategy corresponding to while is the same for dashboard .
From stage with outstanding balance , agent bid , inferred value , realized allocation rule , corresponding payment rule , and realized allocation ; the stage outstanding balance in the winner-pays-bid rebalancing dashboard is:
We consider the stages where the agent is allocated, i.e., . These are the stages where and are non-zero. We consider the amount of the payment residual from stage that remains at final stage when at each subsequent stage in a fraction of that payment residual is resolved.
At any rebalancing stage and under any agent strategy, the winner-pays-bid rebalancing dashboard with for dashboard allocation rule with allocation probability supported on resolves balance between and .
This result follows from the definition of and the fact that for all by the assumption that . ∎
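The per-stage bound can be sanity-checked with a short sketch. It rests on an assumption about the mechanics that is not spelled out in the surviving text: the dashboard adds the rebalancing rate times the balance to the expected payment, so in a winner-pays-bid format the bid of an allocated agent rises by that amount divided by the dashboard allocation probability. The names `eta`, `alpha`, and `balance_resolved_when_allocated` are hypothetical.

```python
def balance_resolved_when_allocated(balance, eta, alloc_prob):
    """Extra amount paid toward the balance in a stage where the agent wins.

    Assumption: the expected payment is raised by eta * balance, so the
    winner-pays-bid bid is raised by eta * balance / alloc_prob.
    """
    if not 0.0 < alloc_prob <= 1.0:
        raise ValueError("alloc_prob must lie in (0, 1]")
    return eta * balance / alloc_prob


alpha = 0.2      # assumed lower bound on the dashboard allocation probability
balance = 10.0   # outstanding balance entering the stage

for y in (0.2, 0.5, 1.0):
    resolved = balance_resolved_when_allocated(balance, eta=alpha, alloc_prob=y)
    # With eta = alpha and y in [alpha, 1], the amount resolved stays between
    # alpha * balance and balance, so the balance never overshoots zero.
    print(y, resolved)
```

Choosing a rate above the lower bound on the allocation probability would allow the amount resolved to exceed the balance when the allocation probability is small, which is the overshooting the conservative rate is meant to avoid.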
The following theorem about the winner-pays-bid dashboard rebalancing mechanism will upper bound the outstanding balance at any time. With these quantities taken as constants relative to the number of stages , the imbalance per stage is vanishing with . It is useful to contrast the assumptions and bounds of the analogous result for all-pay dashboards (Theorem 4) with Theorem 5 (and its corollary, below).
For any monotonic stage allocation rules and any monotonic dashboard rules , each with allocation probabilities supported on , the winner-pays-bid payment rebalancing dashboard with and per-stage payment residual at most has outstanding balance at stage of at most .
If we start a stage with outstanding balance , Lemma 6 implies that at least a (and at most 1) fraction of it is resolved. The balance remaining is at most (and at least zero).
Consider the stages indexed in decreasing order. The payment residual from stage that remains in the final stage is at most . (This bound is worst case, specifically, we do not track the possibility that some of the balance might cancel with new payment residual of the opposite sign.) We can bound the outstanding balance at stage by:
Combining Theorem 5 with Lemma 4, the bound on the residual payment of gives the subsequent corollary. The winner-pays-bid mechanism with the rebalancing procedure described above is incentive inconsistent with in rounds. With held as constants the incentive inconsistency vanishes with .
In dynamic environments, the winner-pays-bid rebalancing dashboard with allocation probabilities supported on and rebalancing rate with per-stage estimated value at most , the outstanding balance at stage is at most ; consequently, over stages the dashboard mechanism is incentive inconsistent for all strategies.
Finally, note that Theorem 5 can be improved in static environments where both inferred stage values and the single-agent allocation rules induced from the values of the other agents are static, i.e., and . In these cases the inferred values dashboard is in all stages but the first and only this stage accrues non-trivial payment residual. In this case, the total balance at stage is , i.e., exponentially small.
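The worst-case accounting behind these bounds can be sketched numerically. The sketch assumes that every rebalancing stage adds a payment residual of the maximum magnitude and the same sign, and that at least a fixed fraction of the outstanding balance is resolved per stage; here `stages` counts only stages in which the agent is allocated. The function names and parameter values are illustrative.

```python
def worst_case_balance(residual_bound, min_fraction, stages):
    """Balance after `stages` rebalancing stages in a dynamic environment
    when each stage keeps at most (1 - min_fraction) of the old balance
    and then accrues a new residual of the maximum size."""
    balance = 0.0
    for _ in range(stages):
        balance = (1.0 - min_fraction) * balance + residual_bound
    return balance


def static_balance(first_residual, min_fraction, stages):
    """Static environment: only the first stage accrues residual, which
    then decays geometrically over the remaining stages."""
    return first_residual * (1.0 - min_fraction) ** (stages - 1)


v_bar, frac = 1.0, 0.25   # hypothetical residual bound and minimum fraction
for n in (10, 100, 1000):
    total = worst_case_balance(v_bar, frac, n)
    # The total stays below v_bar / frac, so the per-stage average vanishes.
    print(n, total, total / n, static_balance(v_bar, frac, n))
```

The dynamic bound is a geometric series that never exceeds the residual bound divided by the minimum fraction, while the static bound shrinks exponentially in the number of stages.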
7 Single-call Dashboards
In this section we generalize the construction of dashboards and their analyses to the practically realistic case that the principal has only single-call access to the allocation algorithm. For example, given a profile of values, the principal can draw a single sample from the distribution of the algorithm’s outcomes. Such a dashboard is necessary for online marketplaces where the allocations of the algorithm are endogenous to the behavior of one side of the market. For example, in ad auctions the mechanism is designed for the advertisers, and the users show up and make decisions that realize the stochasticity of the allocation. The single-call perspective of this section allows viewing the allocation algorithm as a map from the values of the agents to how agents are prioritized in the marketplace where stochastic outcomes are obtained.
The single-call model of mechanism design is one where the designer has an algorithm and aims to design a mechanism that implements the allocation of the algorithm in equilibrium. The designer, however, can only call the algorithm once. The single-call model is significant in that the standard methods for calculating payments in mechanisms, e.g., in the Vickrey-Clarke-Groves or unbiased payment mechanism (Archer and Tardos, 2001), require multiple blackbox calls to the algorithm. Babaioff et al. (2010) showed that single-call mechanisms exist. These mechanisms slightly perturb the allocation algorithm and use these perturbations to compute the correct payments for the perturbed allocation algorithm.
There are two challenges to single-call implementation of dashboards. First, for the valuation profile input into the algorithm, only the realized allocation in is observed. The allocation probabilities are not observed. Second, the principal does not have counterfactual access to the allocation algorithm that we have previously used to determine reasonable dashboards. Recall, a dashboard is a prediction of the bid allocation rule faced by an agent. This prediction requires knowledge, e.g., for agent of for all .
Our approach is to solve both of these challenges at once. Specifically, we instrument the allocation algorithm with uniform exploration, i.e., with probability independently for each agent we enter a uniform random value rather than the agent’s value. This instrumentation degrades the quality of the allocation algorithm by . First, this uniform instrumentation can be viewed as a randomized controlled experiment for estimating the counterfactual allocation rule as is necessary for giving the agent a dashboard that estimates her bid-allocation rule. Second, the uniform exploration enables implicitly calculating unbiased incentive compatible payments for the realized mechanism (cf. Babaioff et al., 2010). The difference between these desired payments and the actual payments, i.e., the agent’s bid if she wins, can then be added to the balance in the payment rebalancing dashboard of Section 6.
The instrumented allocation algorithm for allocation algorithm , instrumentation rate , valuation range , and input valuation profile is:
For each agent , sample
Define and sample , i.e., run on .
For each agent , set instrumentation variables
The incentive compatible payment functions for are denoted by (via the payment identity).
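The sampling steps of the instrumented allocation algorithm can be sketched as follows. This is a minimal sketch under stated assumptions: `rho` stands for the instrumentation rate, `v_max` for the top of the valuation range, and `highest_value_wins` is a hypothetical stand-in for the allocation algorithm. The instrumentation payment variables of the definition are deliberately left out, since their formula is not reproduced here.

```python
import random


def instrument_profile(values, rho, v_max, rng):
    """Independently for each agent, with probability rho replace the value
    by a uniform draw from [0, v_max]; otherwise keep the agent's value."""
    perturbed, explored = [], []
    for v in values:
        explore = rng.random() < rho
        perturbed.append(rng.uniform(0.0, v_max) if explore else v)
        explored.append(explore)
    return perturbed, explored


def highest_value_wins(values):
    """Hypothetical allocation algorithm: allocate to the highest value."""
    winner = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == winner else 0 for i in range(len(values))]


def single_call_stage(alloc_algorithm, values, rho, v_max, rng):
    """Run one stage with exactly one call to the allocation algorithm."""
    perturbed, explored = instrument_profile(values, rho, v_max, rng)
    allocation = alloc_algorithm(perturbed)  # the single call
    return perturbed, explored, allocation


rng = random.Random(0)
print(single_call_stage(highest_value_wins, [0.3, 0.7, 0.5], 0.1, 1.0, rng))
```

Recording which agents were explored, together with their sampled values and realized allocations, is what later supplies the data for estimating each agent's allocation rule.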
The instrumentation payment variables are unbiased estimators for the incentive compatible payments for the instrumented allocation algorithm , i.e., .
We break the payment for agent into two parts, the part where and the part where . In the latter part the payment identity requires zero payment. In the former, which happens with probability , the payment conditioned on is . The expected payment conditioned on (but not conditioning on ) is .
Now evaluate as defined in Definition 10 as follows:
Since the expected payments conditioned on are equal, so are the unconditional expected payments. ∎
Instrumentation allows the mechanism to construct a consistent estimator for the realized allocation rule that is otherwise unknown.
For stage , the instrumented dashboard is the profile of single-agent bid allocation rules defined by for agent as follows:
Let denote the agent’s inferred value in stage .
Define allocation data set corresponding to the uniform instrumentation .
Define the empirical average allocation as the average allocation of this data set.
Define the empirical allocation rule as a continuous isotonic regression of the allocation data set.
Define the instrumented allocation rule as .
The instrumented dashboard for agent is the bid-allocation rule that corresponds to via equation (3).
Standard methods for isotonic regression can be used to estimate the empirical allocation rule in Definition 11. Errors in the regression will be resolved by the rebalancing approach of Section 6. A key required property, however, is that dashboard is continuous. Thus, isotonic regressions that result in continuous functions should be preferred. Approaches to isotonic regression in statistics, including smoothing and penalization (see Ramsay, 1988, Mammen, 1991, Kakade et al., 2011) guarantee continuity of the resulting estimator for the regression function. The ironing procedure common in Bayesian mechanism design is an isotonic regression; however, it results in discontinuous functions and is, thus, inappropriate for constructing dashboards.
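As a rough illustration of fitting a continuous monotone rule to the allocation data set, the sketch below uses the pool-adjacent-violators algorithm followed by linear interpolation between the fitted points. This is a simple stand-in chosen for brevity, not one of the smoothing or penalization methods cited above, and it assumes the instrumented values are distinct (which holds with probability one for uniform draws). All names are hypothetical.

```python
from bisect import bisect_right


def pava(ys):
    """Pool-adjacent-violators: least-squares non-decreasing fit to ys."""
    blocks = []  # each block is [mean, weight, count]
    for y in ys:
        blocks.append([float(y), 1.0, 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_dashboard(data):
    """data: (instrumented value, realized allocation in {0, 1}) pairs.
    Returns a continuous, non-decreasing estimate of the allocation rule."""
    pts = sorted(data)
    xs = [v for v, _ in pts]
    fitted = pava([a for _, a in pts])

    def rule(v):
        if v <= xs[0]:
            return fitted[0]
        if v >= xs[-1]:
            return fitted[-1]
        i = bisect_right(xs, v)  # xs[i - 1] <= v < xs[i]
        t = (v - xs[i - 1]) / (xs[i] - xs[i - 1])
        return fitted[i - 1] + t * (fitted[i] - fitted[i - 1])

    return rule


rule = fit_dashboard([(0.1, 0), (0.2, 1), (0.3, 0), (0.4, 1), (0.9, 1)])
print([round(rule(v), 3) for v in (0.0, 0.15, 0.25, 0.35, 1.0)])
```

Interpolating between the pooled values removes the jumps of the step-function fit, which addresses the continuity requirement; any remaining estimation error would be left to the rebalancing approach.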
The analysis below focuses on the winner-pays-bid rebalancing instrumented dashboard for the instrumented allocation algorithm. (Recall that all-pay mechanisms have deterministic payments and, thus, the budget imbalance generated at a given round can be resolved deterministically. We omit the simple analysis.) With winner-pays-bid mechanisms the payments are only made when the agent is allocated. As a result, the payment residual and balance resolved are stochastic and necessitate a more sophisticated analysis.
The rebalancing approach in Section 6 used the functional form of the allocation algorithm to calculate the difference between actual payments given by the payment format and incentive compatible payments given by the payment identity. This payment residual is added to a balance and each stage a portion of the balance is added to the payment of a type with zero value when determining the dashboard. Our approach here is to instead use the implicit payments which, by Theorem 6, are unbiased estimators of the incentive compatible payments. The three terms in the balance update formula below are the previous balance, the implicit payment of the current stage, and the actual payment of the current stage.
From stage with outstanding balance , agent bid , realized instrumented allocation , and realized implicit payment ; the stage outstanding balance in the winner-pays-bid instrumented rebalancing dashboard is:
We first confirm that in expectation the analysis of Section 6 holds. Recall from equation (7) that the bid strategy for the payment rebalancing dashboard for allocation rule is . Thus the actual payment in the balance update formula can be split for inferred value satisfying giving a balance update as
Define the payment residual (cf. Definition 7) as the first two terms and the resolved balance as the last term:
Thus, the total change to the cumulative balance is . Importantly, the expected payment residual matches Definition 7. Taking expectations and applying Theorem 6 we have,
Here is the payment rule for and payment satisfies the payment identity for . Thus, if the dashboard is correct, i.e., , then the expected payment residual is zero. When we have incorrect estimates, the extent to which the difference of the first terms is not zero gives a payment residual that must be rebalanced in the future.
Our analysis starts with three observations:
When the payment residual and amount rebalanced are zero, i.e., ; otherwise:
the size of the range of is (equal to the size of the range of which equals for inferred value ); and
the amount rebalanced is for inferred value .
Our analysis proceeds like that of Section 6. We consider the stages where the agent is allocated, i.e., . These are the stages where and are non-zero. We consider the amount of the payment residual from stage that remains at final stage when, at each subsequent stage in , a fraction of that payment residual is resolved. Lemma 6 shows that the balance resolved in each stage is between and . The magnitude of the payment residual is upper bounded by in the lemma below.
In a stage in which the agent’s inferred value is , the single-call winner-pays-bid rebalancing dashboard for instrumented allocation algorithm with instrumentation probability has payment residual with magnitude at most .
When the payment residual is zero. The payment residual when is with . The magnitude . Since the winner-pays-bid strategy (without the addition for rebalancing) is individually rational and, thus, . ∎
We are now ready to apply Theorem 5 to the rebalancing dashboard for the instrumented allocation algorithm. A helpful property of the instrumented allocation algorithm (Definition 10) is that the instrumentation implies that the allocation probabilities for all agents are bounded away from zero, i.e., the minimum allocation probability for an agent in the dashboard is times the average allocation probability for with value in the induced allocation algorithm. We need to set to lower bound this quantity.
For the single-call winner-pays-bid rebalancing dashboard for instrumented allocation algorithm with rebalancing rate (set appropriately), instrumentation parameter , and values in ; the outstanding balance at stage is at most .
Note that this bound is on the realized total balance and is not a high-probability or in-expectation result; it holds always. The payment residual at any stage is a random variable with expected magnitude at most and range at most . The expected payment residual, however, has magnitude at most (Lemma 4). Theorem 5 implies that the expected outstanding balance at time is at most . Incentive consistency is defined in expectation over randomization in the mechanism and strategies. We have the following corollary. As before, holding constant, the incentive inconsistency vanishes with .
For the single-call winner-pays-bid rebalancing dashboard for instrumented allocation algorithm with rebalancing rate (set appropriately), instrumentation parameter , and values in ; over stages the dashboard mechanism is incentive inconsistent for all strategies.
The bound of Corollary 3 can be improved via the Chernoff-Hoeffding inequality. We show that the outstanding balance at time is the weighted average of these payment residuals with weights that are geometrically decreasing. This results in the following high-probability bound.
For the single-call winner-pays-bid rebalancing dashboard for instrumented allocation algorithm with rebalancing rate (set appropriately), instrumentation parameter , and values in ; the outstanding balance at stage is at most with probability at least .
where the range of each term in the sum is bounded by
Then we can apply the Chernoff-Hoeffding inequality to evaluate
which follows from since and
Then with probability the imbalance of empirical dashboard with instrumentation can deviate from the expectation no further than ∎
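The effect of geometrically decreasing weights on the concentration bound can be illustrated with the standard two-sided Hoeffding inequality. The sketch assumes the weighted residual terms are independent (or martingale differences, for which the Azuma-Hoeffding inequality has the same form) and that the range of the term from k stages back shrinks by a constant factor per stage. The specific constants are illustrative and are not the ones in the theorem.

```python
import math


def hoeffding_deviation(ranges, delta):
    """Return t such that P(|S - E[S]| >= t) <= delta for a sum S of
    independent terms whose ranges are given (two-sided Hoeffding)."""
    sum_sq = sum(c * c for c in ranges)
    return math.sqrt(0.5 * sum_sq * math.log(2.0 / delta))


def geometric_ranges(base_range, decay, stages):
    """Range of the term contributed k stages ago: base_range * decay**k."""
    return [base_range * decay ** k for k in range(stages)]


for n in (10, 100, 1000):
    t = hoeffding_deviation(geometric_ranges(1.0, 0.5, n), delta=0.05)
    # The sum of squared ranges converges, so t does not grow with n.
    print(n, t)
```

Because the squared ranges form a convergent geometric series, the deviation is bounded independently of the number of stages, in contrast to the square-root growth that equal weights would give.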
- Archer and Tardos (2001) Archer, A. and Tardos, É. (2001). Truthful mechanisms for one-parameter agents. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, pages 482–491. IEEE.
- Athey and Nekipelov (2010) Athey, S. and Nekipelov, D. (2010). A structural model of sponsored search advertising auctions. In Sixth ad auctions workshop.
- Babaioff et al. (2010) Babaioff, M., Kleinberg, R. D., and Slivkins, A. (2010). Truthful mechanisms with implicit payment computation. In Proceedings of the 11th ACM conference on Electronic commerce, pages 43–52. ACM.
- Caragiannis et al. (2011) Caragiannis, I., Kaklamanis, C., Kanellopoulos, P., and Kyropoulou, M. (2011). On the efficiency of equilibria in generalized second price auctions. In Proceedings of the 12th ACM conference on Electronic commerce, pages 81–90. ACM.
- Dütting and Kesselheim (2015) Dütting, P. and Kesselheim, T. (2015). Algorithms against anarchy: Understanding non-truthful mechanisms. In Proceedings of the Sixteenth ACM Conference on Economics and Computation, pages 239–255. ACM.
- Hartline and Taggart (2016) Hartline, J. and Taggart, S. (2016). Non-revelation mechanism design. arXiv preprint arXiv:1608.01875.
- Johari and Tsitsiklis (2004) Johari, R. and Tsitsiklis, J. N. (2004). Efficiency loss in a network resource allocation game. Mathematics of Operations Research, 29(3):407–435.
- Kakade et al. (2011) Kakade, S. M., Kanade, V., Shamir, O., and Kalai, A. (2011). Efficient learning of generalized linear and single index models with isotonic regression. In Advances in Neural Information Processing Systems, pages 927–935.
- Leme and Tardos (2010) Leme, R. P. and Tardos, E. (2010). Pure and Bayes-Nash price of anarchy for generalized second price auction. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 735–744. IEEE.
- Lucier and Borodin (2010) Lucier, B. and Borodin, A. (2010). Price of anarchy for greedy auctions. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 537–553. Society for Industrial and Applied Mathematics.
- Mammen (1991) Mammen, E. (1991). Estimating a smooth monotone regression function. The Annals of Statistics, pages 724–740.
- Myerson (1981) Myerson, R. B. (1981). Optimal auction design. Mathematics of Operations Research, 6(1):58–73.
- Ramsay (1988) Ramsay, J. (1988). Monotone regression splines in action. Statistical science, 3(4):425–441.
- Syrgkanis and Tardos (2013) Syrgkanis, V. and Tardos, E. (2013). Composable and efficient mechanisms. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 211–220. ACM.