# Inference on Auctions with Weak Assumptions on Information

We thank participants at the Econometric Society China Meeting in Wuhan (2017) and at the R. Porter Festschrift at Northwestern University in May 2017 for comments. Code for the Monte Carlo simulations and the OCS data analysis using our methods can be found at https://github.com/vsyrgkanis/information_robust_econometrics_auctions

Vasilis Syrgkanis (Microsoft Research, New England; vasy@microsoft.com)
Elie Tamer (Department of Economics, Harvard University; elietamer@fas.harvard.edu)
Juba Ziani (Department of Computing and Mathematical Sciences, California Institute of Technology; jziani@caltech.edu)
September 15, 2019
###### Abstract

Given a sample of bids from independent auctions, this paper examines the question of inference on auction fundamentals (e.g. valuation distributions, welfare measures) under weak assumptions on information structure. The question is important as it allows us to learn about the valuation distribution in a robust way, i.e., without assuming that a particular information structure holds across observations. We leverage recent contributions in the robust mechanism design literature that exploit the link between Bayesian Correlated Equilibria and Bayesian Nash Equilibria in incomplete information games to construct an econometrics framework for learning about auction fundamentals using observed data on bids. We showcase our construction of identified sets in private value and common value auctions. Our approach for constructing these sets inherits the computational simplicity of solving for correlated equilibria: checking whether a particular valuation distribution belongs to the identified set is as simple as determining whether a linear program is feasible. A similar linear program can be used to construct the identified set on various welfare measures and counterfactual objects. For inference and to summarize statistical uncertainty, we propose novel finite sample methods using tail inequalities that are used to construct confidence regions on sets. We also highlight methods based on Bayesian bootstrap and subsampling. A set of Monte Carlo experiments shows adequate finite sample properties of our inference procedures. We also illustrate our methods using data from OCS auctions.

## 1 Introduction

A recent literature in robust mechanism design studies the following question: given a game (e.g. an auction), what are the possible outcomes, such as welfare or revenue, that arise under different information structures? This literature is motivated by robustness, i.e., characterizing outcomes that can occur in a given game under weak assumptions on information (see Bergemann and Morris (2013)). For example, in auction models, in addition to specifying the details of the game in terms of bidder utility functions and bidding rules, one needs to specify the information structure (what players know about the state of the world and the information possessed by other players) to derive the Nash equilibrium.

For instance, in the Independent Private Values (IPV) setting, players know their value for the item, which is assumed independent from other players' values, and receive no further information about their opponents' values. The latter typically yields a unique equilibrium outcome. However, different assumptions on what signals players have about opponent values prior to bidding in the auction lead to different equilibrium outcomes. Auction data rarely contain information on what bidders knew and what their information sets include, and given that this information leads to different outcomes, it would be interesting to analyze what can happen when we relax the independence assumption in such auctions by allowing bidders to know some information about their opponents' valuations. Bergemann, Brooks, and Morris (2017) (BBM) examine exactly this question in an auction game and provide achievable bounds on various outcomes, such as the revenue of the auction as a function of auction fundamentals like the distribution of the common value.

In this paper, we address the following econometrics question, which is the reverse of the one posed by BBM above: given an i.i.d. sample of auction data (independent copies of bids from a set of auctions), what can we learn about auction fundamentals, such as the distribution of values, when we make weak assumptions on the information structure? We maintain throughout that players play a Bayesian Nash equilibrium (BNE) but allow these bidders to have different information structures in different auctions, i.e., to receive different types of signals prior to bidding. In particular, we use these observations (bids and other observables) from a set of independent auctions to construct sets of valuation distributions that are consistent with both the observed distribution of bids and the known auction rules, maintaining that players are Bayesian. We exploit the robust predictions in a given auction à la BBM to conduct econometrically robust predictions of auction fundamentals given a set of data; i.e., robust economic prediction leads to robust inference.

Key to our approach is the characterization of sharp sets of valuation distributions (and other functionals of interest) via computationally attractive procedures. This is a result of the equivalence between a particular class of Bayes Correlated Equilibria (or BCE) and Bayes Nash Equilibria (or BNE) for a similar game with an arbitrary information structure. It is well known that BCEs can be computed efficiently since they are solutions to linear programs (as opposed to BNEs, which are hard to compute). Moreover, exploiting a result of Bergemann and Morris (2016) (see also Aumann (1987)), we show that there is an equivalence between the set of fundamentals that obey the BCE restrictions and the fundamentals that obey the BNE constraints under some information structure. This equivalence is the key to our econometrics approach. The formal statistics program that ensues is one where the sharp set satisfies a set of linear equality and inequality constraints. If we knew the true distribution of bids, then this would become a simple computational problem of solving a linear program. We do not observe the true bid distribution, but this distribution can be estimated consistently from the observed data. So, we use the estimated bid distribution to solve for an estimate of the identified set. We are also able to characterize sampling uncertainty to obtain various notions of confidence regions covering the identified set with a prespecified probability.

In addition to learning about auction primitives, we show how our approach using data on auctions can be used to construct identified sets for auction welfare measures, seller surplus and other objects. Information on auction primitives along with these measures obtained using our procedures that combine data with the theory can be used to guide future market designers to better study particular auction setups (or use our results in other markets).

Importantly, we address the problem of counter-factual estimation: what would the revenue or surplus have been had we changed the auction rules? We formulate notions of informationally robust counter-factuals and we show that such counter-factual questions can also be phrased as solutions to a single linear program, simultaneously capturing equilibrium constraints in the current auction as well as the new target auction. We show that even without recovering the information structure from the data, an analyst can perform robust counterfactuals: under an arbitrary information structure in the current auction which produced the data at hand, what is the best and worst value of the given quality measure (e.g. welfare, revenue) in the new target auction, also under an arbitrary information structure? Thus we can get estimates of the upper and lower bounds of a given quantity in the new auction design, in a way that is robust to the information structure and without the need to recover it. So, this approach to inference requires minimal assumptions on information in the DGP that generated the data, but also minimal assumptions in the counterfactual auction that a market designer is contemplating running.

The closest work to our paper is Bergemann, Brooks, and Morris (2017), where the authors provide worst-case bounds on the revenue of a common value first price auction as a function of the distribution of values. Their approach does not use the bid distribution as input, unlike our approach, which obtains an estimate of the bid distribution from the data. The main argument in BBM is that the revenue cannot be too small since, at a BCE, no player wants to deviate to any other action, and so in particular players do not want to deviate in the following specific way: conditional on your bid, deviate uniformly at random above your bid (upwards deviation). So, the bound on the mean they provide uses a subset of the set of best response deviations that are allowed, so as to bound the bid of a player as a function of his value. Hence, in drawing a connection between the equilibrium bid and the value, this bound is by definition loose (and can be very loose, with a bound twice as large as the identified set, as we show in an example in the appendix). Given data, we are able to learn the bid distribution and hence we are not constrained to look at only these bid-distribution-oblivious upwards deviations. We can instead compute an optimal best-response bid for this given bid distribution and use the constraint that the player does not want to deviate to this tailored best-response. This allows us to bound the unobserved value of the player as a function of the observed bid, leading to a sharp characterization of auction fundamentals using data.

Another paper that uses a similar insight of studying the econometrics of games with weak information is the recent work of Magnolfi and Roncoroni (2016) on inference in entry game models. The approach used there, though similar in motivation, does not transport easily to studying general auction mechanisms.

The paper is organized as follows. Section 2 introduces the problem and provides formal definitions of the objects of interest. We then state our identification results given an iid set of data on bids. This identification is constructed via a linear program where we show how various constraints (such as symmetry, parametric restrictions, etc.) can be used. We also show how computing sharp sets for the expected value of moments of the fundamentals amounts to solving two linear programs and how robust counter-factuals to changes in the auction, with respect to some metric function, can also be easily handled. We then provide two example applications of the general setup: one for common value auctions (Section 3) and another for private value auctions (Section 4). Section 6 provides our estimation approach for constructing confidence intervals on the estimated quantities from sampled datasets using sub-sampling methods and finite sample concentration inequality approaches. Section 7 examines the finite sample performance of the large linear program using a set of Monte Carlo simulations. These show adequate performance in IPV and CV setups. Section 8 illustrates our inference approach using auction data from OCS wildcat oil auctions and shows how the statistical algorithm can be used to derive bounds on valuation distributions. Finally, the Appendix contains results on the sharpness of the BBM bounds, and bounds on the mean of the valuation in common value auctions with different smoothness assumptions.

## 2 Bayes-Correlated Equilibria and Information Structure Uncertainty

We consider a game of incomplete information among players. There is an unknown payoff-relevant state of the world $\theta \in \Theta$. This state of the world enters directly in each player's utility. Each player $i$ can pick an action $a_i$ from among a set of actions and receives utility $u_i(a;\theta)$, which is a function of the payoff-relevant state of the world $\theta$ and the action profile $a$ of the players. This along with a prior on $\theta$ (defined below) will represent the game structure that we denote by , as separate from the information structure which we will define next.

Conditional on the state of the world, each player $i$ receives some minimal signal $t_i$. The state of the world $\theta$ and the vector of signals $t$ are drawn from some joint measure $\pi$ (footnote 5: the setup is general in that the set is unrestricted). The signals $t_i$ can be arbitrarily correlated with the state of the world. We denote such signal structure with . This defines the game

We consider a setting where prior to picking an action each player receives some additional information in the form of an extra signal . The signal vector can be arbitrarily correlated with the true state of the world and with the original signal vector . We denote such augmenting signal structure with and the set of all possible such augmenting signal structures with . This will define a game . Subsequent to observing the signals and , the player picks an action . A Bayes-Nash equilibrium or BNE in this game is a mapping from the pair of signals to a distribution over actions, for each player , such that each player maximizes his expected utility conditional on the signals s/he received.

A fundamental result in the literature on robust predictions (see Bergemann and Morris (2013, 2016)) is that the set of joint distributions of outcomes $a$, unknown states $\theta$ and signals $t$ that can arise as a BNE of incomplete information under an arbitrary additional information structure in is equivalent to the set of Bayes-Correlated Equilibria, or BCE, in . So, every BCE in is a BNE in for some augmenting information structure . We give the formal definition of BCE next.

###### Definition 1 (Bayes-Correlated Equilibrium).

A joint distribution $\psi \in \Delta(\Theta \times T \times A)$ is a Bayes-correlated equilibrium of if for each player $i$, signal $t_i$, action $a_i$ and deviating action $a_i'$:

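$$\mathbb{E}_{(\theta,t,a)\sim\psi}\left[u_i(a;\theta)\mid a_i,t_i\right]\;\ge\;\mathbb{E}_{(\theta,t,a)\sim\psi}\left[u_i(a_i',a_{-i};\theta)\mid a_i,t_i\right] \tag{1}$$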

and such that the marginals with respect to signals and payoff states are preserved, i.e.:

$$\forall \theta\in\Theta,\ t\in T:\quad \sum_{a\in A}\psi(\theta,t,a)=\pi(\theta,t) \tag{2}$$

An equivalent and simpler way of phrasing the Bayes-correlated equilibrium conditions is that:

$$\forall t_i,a_i,a_i':\quad \sum_{\theta,t_{-i},a_{-i}}\psi(\theta,t,a)\cdot\left(u_i(a;\theta)-u_i(a_i',a_{-i};\theta)\right)\ge 0 \tag{3}$$
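As a concrete reading of condition (3), the following minimal Python sketch checks the obedience inequalities for a two-player game with a degenerate minimal signal, so that the sum over $t_{-i}$ drops out. The function name `is_bce`, the array layout `psi[theta, a1, a2]`, and the numerical payoffs (a textbook game of Chicken with a single state) are illustrative assumptions and are not taken from the paper or its code repository.

```python
import numpy as np

def is_bce(psi, utils, tol=1e-9):
    """Check inequality (3) for two players and a degenerate minimal signal.

    psi[theta, a1, a2]      : joint probability of state and action profile.
    utils[i][theta, a1, a2] : utility of player i (i = 0 or 1).
    """
    for i, u in enumerate(utils):
        # Put player i's own action on axis 1 so both players share one code path.
        psi_i = psi if i == 0 else psi.transpose(0, 2, 1)
        u_i = u if i == 0 else u.transpose(0, 2, 1)
        n_actions = psi_i.shape[1]
        for a in range(n_actions):          # recommended action a_i
            for a_dev in range(n_actions):  # deviating action a_i'
                weight = psi_i[:, a, :]     # psi(theta, a_i = a, a_-i)
                gain = np.sum(weight * (u_i[:, a, :] - u_i[:, a_dev, :]))
                if gain < -tol:
                    return False
    return True

# Hypothetical example: Chicken with one state; action 0 = Dare, 1 = Chicken.
u1 = np.array([[[0.0, 7.0], [2.0, 6.0]]])
u2 = u1.transpose(0, 2, 1)
# Equal weight on (Dare, Chicken), (Chicken, Dare) and (Chicken, Chicken).
psi_good = np.array([[[0.0, 1 / 3], [1 / 3, 1 / 3]]])
# All weight on (Chicken, Chicken): each player would rather deviate to Dare.
psi_bad = np.zeros((1, 2, 2))
psi_bad[0, 1, 1] = 1.0

print(is_bce(psi_good, [u1, u2]))  # True
print(is_bce(psi_bad, [u1, u2]))   # False
```

The marginal condition (2) is not checked here; it would amount to comparing `psi.sum(axis=(1, 2))` with the prior over states.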

We state the main result in Bergemann and Morris (2016) next.

###### Theorem 1 (Bergemann and Morris (2016)).

A distribution $\psi \in \Delta(\Theta \times T \times A)$ can arise as the outcome of a Bayes-Nash equilibrium under some augmenting information structure , if and only if it is a Bayes-correlated equilibrium in .

The robustness property of this result is as follows. The set of BNEs for (think of an auction with unknown information) is the same as the set of BCEs for where is an augmented information structure derived from . So, we will not need to know what is in , but rather we could compute the set of BCEs for and the Theorem shows that for each BCE, there exists a corresponding information structure in and a BNE of the game that implements the same outcome.

### 2.1 The Econometrics Inference Question

We consider the question of inference on auction fundamentals using data under weak assumptions on information. In particular, assume we are given a sample of observations of action profiles from an incomplete information game . Assume that we do not know the exact augmenting signal structure that occurred in each of these samples, where it is implicitly maintained that can be different from for , i.e., the signal structure in the population is drawn from some unknown mixture. Also, maintaining that players play Nash, or that was the outcome of some Bayes-Nash equilibrium or BNE under signal structure , we study the question of inference on the distribution of the fundamentals of the game. Under the maintained assumption, we characterize the sharp set of possible distributions of fundamentals that could have generated the data. This allows for policy analysis within the model without making strong restrictions on information.

A similar question was recently analyzed in the context of entry games by Magnolfi and Roncoroni (2016), where the goal was the identification of the single parameter of interaction when both players choose to enter a market. In this work we ask this question in an auction setting and attempt to identify the distribution of the unknown valuations non-parametrically.

A key feature of our approach is that it allows observations on different auctions to use different information structures (unobserved by the econometrician) and, given the information structures, allows different markets or auctions to play a different BNE. Given the equivalence of the set of Bayes-Nash equilibria under some information structure and the set of Bayes-correlated equilibria, and given that the set of BCEs is convex, allowing for this kind of heterogeneity is possible. Heuristically, given a distribution $\phi$ over action profiles (which is constructed using the data), there exists a mixture of information structures and equilibria under which $\phi$ was the outcome. The process by which we arrived at $\phi$ is by first picking an information structure from this mixture and then selecting one of the Bayes-Nash equilibria for this information structure. This is possible if and only if there exists a distribution $\psi$ that is a Bayes-correlated equilibrium and such that $\sum_{\theta,t}\psi(\theta,t,a)=\phi(a)$ for all $a\in A$. Again, we start with the elementary information structure and maintain that observations in the data are expansions of this information structure. So, for a given market , any BNE using information structure is a BCE under . The data distribution of actions is a mixture of such BNEs over various information structures and hence it would map into a mixture of BCEs under the same . Since the set of BCEs under is convex, any mixture of elements in the set is also a BCE. So then intuitively, the set of primitives that are consistent with the model and the data is the set of BCEs $\psi$ such that $\sum_{\theta,t}\psi(\theta,t,a)=\phi(a)$ for all $a\in A$.
To conclude, the convexity of the set of BCEs allows us to relate a distribution of bids from an iid sample to a mixture of BCEs. This is possible since the distribution of bids uses a mixture of signal structures, which essentially coincides with a mixture of BCEs, itself another BCE by convexity. We summarize this discussion with a formal result.

###### Lemma 2.

Consider a model which conditional on the unobservables, yields a convex set of possible predictions on the observables. Then the sharp identified set of the unobservables under the assumption that exactly one of these predictions is selected in our dataset, is identical to the sharp identified set under the assumption that our dataset is a mixture of selections from these predictions.

###### Proof.

Suppose that what we observe in the data is a convex combination of feasible predictions of our model. Then by convexity of the prediction set, this convex combination is yet another feasible prediction of our model. Hence, this is equivalent to the assumption that this single prediction is selected in our dataset.
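The convexity of the set of BCEs invoked in this argument can be checked directly from Definition 1. As a short sketch (the superscripts and the weight $\lambda$ are notation introduced here for illustration only), take two distributions $\psi^1,\psi^2$ that satisfy (2) and (3) for the same prior $\pi$, and let $\psi^\lambda=\lambda\psi^1+(1-\lambda)\psi^2$ with $\lambda\in[0,1]$. Both conditions are linear in $\psi$, so

$$\sum_{a\in A}\psi^\lambda(\theta,t,a)=\lambda\,\pi(\theta,t)+(1-\lambda)\,\pi(\theta,t)=\pi(\theta,t),$$

$$\sum_{\theta,t_{-i},a_{-i}}\psi^\lambda(\theta,t,a)\cdot\Delta u_i=\lambda\sum_{\theta,t_{-i},a_{-i}}\psi^1(\theta,t,a)\cdot\Delta u_i+(1-\lambda)\sum_{\theta,t_{-i},a_{-i}}\psi^2(\theta,t,a)\cdot\Delta u_i\;\ge\;0,$$

where $\Delta u_i=u_i(a;\theta)-u_i(a_i',a_{-i};\theta)$. Hence $\psi^\lambda$ is again a BCE, and its marginal over action profiles is the corresponding mixture of the two action marginals.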

Finally, we provide next the main engine that allows for the construction of the observationally equivalent set of primitives that obey the model assumptions and result in a distribution on the observables that matches that of the data. We state this as a Result.

###### Result 3.

Let there be a distribution $\phi$ defined on the space of action profiles. Given the setup and results above, the set of feasible joint distributions $\pi$ of signals and types that are consistent with $\phi$ is the set of distributions for which the following linear program is feasible:

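$$\begin{aligned}
LP(\phi,\pi):\quad & \forall t_i,a_i,a_i': && \sum_{\theta,t_{-i},a_{-i}}\phi(a)\cdot x(\theta,t\mid a)\cdot\left(u_i(a;\theta)-u_i(a_i',a_{-i};\theta)\right)\ge 0 && (4)\\
& \forall(\theta,t): && \pi(\theta,t)=\sum_{a\in A}\phi(a)\cdot x(\theta,t\mid a) && (5)\\
& \forall a\in A: && x(\cdot,\cdot\mid a)\in\Delta(\Theta\times T) && (6)
\end{aligned}$$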

where for all , and .

Equivalently, the sharp set for the distribution $\pi$ is:

$$\Pi_I(\phi)=\left\{\pi\in\Pi:\ LP(\phi,\pi)\text{ is feasible}\right\}$$

The above result is generic, in that it handles general games with generic states of the world. In particular, it nests both standard private and common value auction models and provides a mapping between the distribution of bids and the set of feasible distributions over signals and states. An iid assumption on bids along with a large sample assumption allows us to learn the function $\phi$ (asymptotically). So, given the data, we can consistently estimate $\phi$. This is a maintained assumption that we require throughout. Given $\phi$, the above result tells us how to map the estimate of $\phi$ to the set of BNEs that are consistent with the data and are robust to any information structure that is an expansion of a minimal information structure . Suppose we assume that both $\theta$ and $t$ take finitely many values (an assumption we maintain throughout); then a joint distribution $\pi$ on $(\theta,t)$ is consistent with the model and the data if and only if the above linear program is feasible for it. The LP formulation is general, but in particular applications it is possible to use parametric distributions for $\pi$. In addition, it is possible for the above LP to allow for observed heterogeneity by using covariate information, whereby this LP can be solved accordingly (see an example of such an adaptation in the common value Section 3). Though the above LP holds in general (and covers both common and private values, for example), we specialize it in the next Sections to standard cases studied in the auction literature, namely common value and private value models.
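One way to see where the program in Result 3 comes from is to rewrite Definition 1 in terms of the observed action distribution. The following is a sketch under the assumption that $x(\theta,t\mid a)$ is read as the conditional distribution of $(\theta,t)$ given the action profile $a$, so that a BCE factorizes as

$$\psi(\theta,t,a)=\phi(a)\cdot x(\theta,t\mid a).$$

Substituting this factorization into (3) gives (4), and substituting it into (2) gives (5). Constraint (6) then guarantees that the action marginal of $\psi$ matches the data:

$$\sum_{\theta,t}\psi(\theta,t,a)=\phi(a)\sum_{\theta,t}x(\theta,t\mid a)=\phi(a).$$

With $\phi$ fixed by the data, every constraint is linear in the unknowns $x$ and $\pi$, which is why feasibility can be checked with a linear program.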


### 2.2 Identified Sets of Moments of Fundamentals

In the case of non-parametric inference where we put no constraint on the distribution of fundamentals, i.e. $\Pi=\Delta(\Theta\times T)$, observe that the sharp set is defined by constraints that are linear in the density function $\pi$ of the fundamentals. Therefore, maximizing or minimizing any linear function of this density can be performed by solving a single linear program. This implies that we can evaluate the upper and lower bounds of the expected value of any function of these fundamentals, in expectation over the true underlying distribution. The latter holds since the expectation of any function with respect to the underlying distribution is a linear function of the density. We state this as a corollary next.

###### Corollary 4.

Let Result 3 hold. Also, let $f$ be any function of the state of the world and the profile of minimal signals (eg, or , …). The sharp identified set for the expected value of $f$ w.r.t. the true distribution of fundamentals, i.e. $\mathbb{E}_{(\theta,t)\sim\pi}[f(\theta,t)]$, is an interval $[L,U]$ such that:

$$L=\min_{\pi\in\Pi_I(\phi)}\ \sum_{\theta\in\Theta,\,t\in T}f(\theta,t)\cdot\pi(\theta,t) \tag{7}$$

$$U=\max_{\pi\in\Pi_I(\phi)}\ \sum_{\theta\in\Theta,\,t\in T}f(\theta,t)\cdot\pi(\theta,t) \tag{8}$$

Without further assumptions on $\pi$, i.e., when $\Pi=\Delta(\Theta\times T)$, these are two linear programming problems.

### 2.3 Robust Counter-factual Analysis

Suppose that we wanted to understand the performance of some other auction when deployed in the same market, with respect to some objective or metric $F$ that is a function of the unknown fundamentals and the action vector. Examples of such metrics in single-item auctions could be social welfare: or revenue, i.e. , where is the value of player $i$ for the item at sale, is the probability of allocating to player $i$ under action profile $a$ and is the expected payment of player $i$ under action profile $a$.

We are interested in computing an upper and lower bound on this metric under this new auction, which has different utilities, and under any Bayes-correlated equilibrium, which would map into a BNE with some augmenting information structure. This is important since it allows us to obtain bounds on welfare or other metrics in a new environment. The welfare bounds computed in this manner will inherit the robustness property in that they will be valid under all information structures.

This is straightforward in our setup since computing a sharp identified set for any such counter-factual can be done in a computationally efficient manner, in both the common value setting and in the correlated private value setting. The upper bound of the counter-factual can be obtained using the following linear program, which takes as input the observed distribution of bids in our current auction, the metric $F$, and the primitive utility form under the alternative auction. We state this result in the next Theorem.

###### Theorem 5.

Given a metric function $F$, a distribution $\phi$ over action profiles and a vector of alternative auction utilities $\tilde u$, the sharp upper and lower bounds on the expected metric under the new auction can be computed using the following LP:

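$$\begin{aligned}
LP(\phi,F,\tilde u):\quad \min/\max_{\tilde\psi}\quad & \sum_{\theta,t,a}F(\theta,t,a)\cdot\tilde\psi(\theta,t,a)\\
\text{s.t.}\quad & \forall t_i,a_i,a_i':\ \sum_{\theta,t_{-i},a_{-i}}\tilde\psi(\theta,t,a)\cdot\left(\tilde u_i(a;\theta)-\tilde u_i(a_i',a_{-i};\theta)\right)\ge 0\\
& \forall(\theta,t)\in\Theta\times T:\ \pi(\theta,t)=\sum_{a\in A}\tilde\psi(\theta,t,a)\\
& \tilde\psi\in\Delta(\Theta\times T\times A)\ \text{ and }\ \pi\in\Pi_I(\phi)
\end{aligned}$$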

In the non-parametric case, where $\Pi=\Delta(\Theta\times T)$, the latter is a linear program.


Note that getting sharp bounds on welfare measures for example using the above Theorem does not require one to infer in a prior step the distribution over the primitives. Rather, the above procedure provides sharp bounds on this welfare measure using a linear program.
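The single-program structure of Theorem 5 can be sketched in Python with `scipy.optimize.linprog`. The sketch stacks two blocks of variables: $\psi$ for the current auction (whose bid marginal must equal $\phi$) and $\tilde\psi$ for the new auction, linked by the requirement that both share the same distribution of fundamentals. Everything specific here is an assumption made for illustration: a common value with a degenerate minimal signal, a first-price current auction, a second-price target auction, revenue as the metric, and the helper names `fpa_utility`, `spa_utility`, `ic_rows` and `counterfactual_bounds`.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def fpa_utility(i, bids, v):
    """First-price payoff of bidder i with common value v; ties split evenly."""
    top = max(bids)
    return (v - bids[i]) / sum(b == top for b in bids) if bids[i] == top else 0.0

def spa_utility(i, bids, v):
    """Second-price payoff of bidder i with common value v; ties split evenly."""
    top, price = max(bids), sorted(bids)[-2]
    return (v - price) / sum(b == top for b in bids) if bids[i] == top else 0.0

def ic_rows(profiles, values, grid, utility):
    """Rows r with r @ psi <= 0 meaning 'no profitable deviation'.

    psi is indexed as k * len(values) + j for profile k and value j.
    """
    nv, rows = len(values), []
    for i in range(len(profiles[0])):
        for a_star, a_dev in itertools.permutations(grid, 2):
            row = np.zeros(len(profiles) * nv)
            for k, a in enumerate(profiles):
                if a[i] == a_star:
                    dev = a[:i] + (a_dev,) + a[i + 1:]
                    for j, v in enumerate(values):
                        row[k * nv + j] = utility(i, dev, v) - utility(i, a, v)
            rows.append(row)
    return np.array(rows)

def counterfactual_bounds(phi, values, grid, new_grid, new_utility, metric):
    """Lower and upper bound on the expected metric in the new auction."""
    obs = [b for b, p in phi.items() if p > 0]
    n, nv = len(obs[0]), len(values)
    new = list(itertools.product(new_grid, repeat=n))
    n_old, n_new = len(obs) * nv, len(new) * nv
    ic_old = ic_rows(obs, values, grid, fpa_utility)
    ic_new = ic_rows(new, values, new_grid, new_utility)
    A_ub = np.block([[ic_old, np.zeros((len(ic_old), n_new))],
                     [np.zeros((len(ic_new), n_old)), ic_new]])
    # Bid marginal of psi equals phi; value marginals of psi and psi_tilde agree.
    match_phi = np.hstack([np.kron(np.eye(len(obs)), np.ones(nv)),
                           np.zeros((len(obs), n_new))])
    same_pi = np.hstack([np.tile(np.eye(nv), len(obs)),
                         -np.tile(np.eye(nv), len(new))])
    A_eq = np.vstack([match_phi, same_pi])
    b_eq = np.concatenate([[phi[b] for b in obs], np.zeros(nv)])
    c = np.concatenate([np.zeros(n_old),
                        [metric(v, a) for a in new for v in values]])
    out = []
    for sign in (1, -1):  # minimize, then maximize
        res = linprog(sign * c, A_ub=A_ub, b_ub=np.zeros(len(A_ub)),
                      A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
        out.append(sign * res.fun)
    return tuple(out)

# Hypothetical data: two bidders who always bid 0 in a first-price auction.
values = [0, 1, 2, 3, 4]
phi = {(0, 0): 1.0}
rev_lo, rev_hi = counterfactual_bounds(
    phi, values, grid=[0, 1], new_grid=[0, 1, 2],
    new_utility=spa_utility, metric=lambda v, a: sorted(a)[-2])
print(rev_lo, rev_hi)
```

In this toy example the observed bids only imply that the mean common value is at most 2, and the resulting revenue bounds for the second-price auction are 0 and 2. No distribution of values is estimated in a first step, which mirrors the remark above.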

## 3 Common Value Auctions

We specialize the above results to important classes of auctions. We begin with the common value model, in which the game of incomplete information is a single item common value auction. In this case the unknown state of the world is the unknown common value of the object $v$, which we assume to take values in some finite set $V$. Moreover, we initially assume that the minimal information structure is degenerate, i.e., players receive no minimal signal about this unknown common value (footnote 6: other constraints on the initial signals are allowed; here we take the degenerate signal for simplicity). The signal set becomes a singleton and is irrelevant. Thus we will denote with $\pi$ the distribution of the unknown common value, which is the parameter that we wish to identify. This is a particularly simple model to illustrate the structure of the LP approach and showcase the flexibility of our methods. So, in this particular model, we want to learn the distribution of the state of the world, which is the common valuation distribution.

Prior to bidding in the auction, each player receives some signal which is drawn from some distribution; this signal can be correlated with the unknown common value and with the signals of the player's opponents. We wish to be agnostic about which information structure was realized in each auction sample and want to characterize the sharp identified set for $\pi$. Moreover, we will assume that the players' bids take values in some discrete set $B$ and that players play a BNE. The characterization of the identified set for $\pi$ in this model is stated in the Theorem below.

###### Theorem 6.

Let the common value model above hold with bidders. Given a distribution $\phi$ of bid profiles supported on a set $S$, the set of distributions $\pi$ of the common value that are consistent with $\phi$ are the ones for which the following program is feasible:

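$$\begin{aligned}
LP(\phi,\pi):\quad & \forall b_i^*,b_i'\in B: && \sum_{v\in V,\ b\in S:\,b_i=b_i^*}\phi(b)\cdot x(v\mid b)\cdot\left(u_i(b;v)-u_i(b_i',b_{-i};v)\right)\ge 0\\
& \forall v\in V: && \pi(v)=\sum_{b\in S}\phi(b)\cdot x(v\mid b)\\
& \forall b\in S: && x(\cdot\mid b)\in\Delta(V)
\end{aligned}$$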

Observe that in this setting, the latter linear program is also linear in $\pi$. Thus we get that the sharp identified set is a convex set and is defined as the set of solutions to the above linear program, where $\pi$ is also a variable.
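The feasibility check in Theorem 6 can be sketched in Python with `scipy.optimize.linprog`. The sketch works with the product $\psi(v,b)=\phi(b)\,x(v\mid b)$ as the LP variable, so that all constraints are linear. The first-price payoff with evenly split ties, the helper names, and the toy data are assumptions made for illustration; they are not taken from the paper's repository.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def fpa_utility(i, bids, v):
    """First-price payoff of bidder i with common value v; ties split evenly."""
    top = max(bids)
    if bids[i] < top:
        return 0.0
    return (v - bids[i]) / sum(b == top for b in bids)

def in_identified_set(phi, pi, values, bid_grid):
    """Return True if LP(phi, pi) is feasible.

    phi: dict mapping bid profiles (tuples) to probabilities.
    pi:  candidate probabilities over `values`.
    The LP variable psi(v, b) = phi(b) * x(v | b) sits at index k * nv + j.
    """
    profiles = [b for b, p in phi.items() if p > 0]
    n, nv = len(profiles[0]), len(values)
    ic = []
    for i in range(n):
        for b_star, b_dev in itertools.permutations(bid_grid, 2):
            row = np.zeros(len(profiles) * nv)
            for k, b in enumerate(profiles):
                if b[i] != b_star:
                    continue
                dev = b[:i] + (b_dev,) + b[i + 1:]
                for j, v in enumerate(values):
                    # linprog uses A_ub @ psi <= 0: store deviation minus obedience.
                    row[k * nv + j] = fpa_utility(i, dev, v) - fpa_utility(i, b, v)
            ic.append(row)
    # sum_v psi(v, b) = phi(b), which encodes x(. | b) in Delta(V).
    match_phi = np.kron(np.eye(len(profiles)), np.ones(nv))
    # sum_b psi(v, b) = pi(v).
    match_pi = np.tile(np.eye(nv), len(profiles))
    res = linprog(
        c=np.zeros(len(profiles) * nv),
        A_ub=np.array(ic), b_ub=np.zeros(len(ic)),
        A_eq=np.vstack([match_phi, match_pi]),
        b_eq=np.concatenate([[phi[b] for b in profiles], pi]),
        bounds=(0, None), method="highs",
    )
    return res.status == 0  # status 0: an optimal (hence feasible) point was found

# Hypothetical data: two bidders who always bid 0, bid grid {0, 1}.
values, bid_grid = [0, 1, 2, 3, 4], [0, 1]
phi = {(0, 0): 1.0}
low = in_identified_set(phi, [0, 1, 0, 0, 0], values, bid_grid)   # value is 1 for sure
high = in_identified_set(phi, [0, 0, 0, 0, 1], values, bid_grid)  # value is 4 for sure
print(low, high)
```

In this toy example a bidder who deviates from 0 to 1 wins for sure instead of with probability one half, so obedience requires a mean value of at most 2; the first candidate passes and the second fails.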

This observation also allows us to easily infer upper and lower bounds on any linear function of the unknown distribution . This is stated next as a Corollary to the above Theorem.

###### Corollary 7.

Let Theorem 6 hold. Also, let $f$ be any function of valuations (such as or , …). Then, using the LP above, we can get upper and lower bounds on such moments as follows:

$$\max_{\pi\in\Pi_I(\phi)}\ \sum_{v\in V}f(v)\cdot\pi(v) \tag{9}$$

$$\min_{\pi\in\Pi_I(\phi)}\ \sum_{v\in V}f(v)\cdot\pi(v) \tag{10}$$

Observe that the latter linear expressions are simply $\mathbb{E}_{v\sim\pi}[f(v)]$, and so the above shows that we can compute in polynomial time upper and lower bounds of any moment of the unknown distribution of the common value.

Also, note that the Corollary above shows that to do set inference with respect to any moment of the unknown distribution, we do not need to discretize the space of probability distributions and enumerate over all probability vectors, checking whether they are inside the sharp identified set. Rather we can just solve the above LP.
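The two programs (9) and (10) can be sketched as follows, again with $\psi(v,b)=\phi(b)\,x(v\mid b)$ as the LP variable and $\pi(v)=\sum_b\psi(v,b)$ substituted into the objective. The first-price payoff with evenly split ties, the function names and the toy data are illustrative assumptions, and the sketch does not guard against an infeasible program.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def fpa_utility(i, bids, v):
    """First-price payoff of bidder i with common value v; ties split evenly."""
    top = max(bids)
    if bids[i] < top:
        return 0.0
    return (v - bids[i]) / sum(b == top for b in bids)

def moment_bounds(phi, values, bid_grid, f):
    """Bounds on E[f(v)] over the identified set, one LP per bound."""
    profiles = [b for b, p in phi.items() if p > 0]
    n, nv = len(profiles[0]), len(values)
    ic = []
    for i in range(n):
        for b_star, b_dev in itertools.permutations(bid_grid, 2):
            row = np.zeros(len(profiles) * nv)
            for k, b in enumerate(profiles):
                if b[i] == b_star:
                    dev = b[:i] + (b_dev,) + b[i + 1:]
                    for j, v in enumerate(values):
                        # Deviation payoff minus obedience payoff must average <= 0.
                        row[k * nv + j] = fpa_utility(i, dev, v) - fpa_utility(i, b, v)
            ic.append(row)
    A_eq = np.kron(np.eye(len(profiles)), np.ones(nv))  # sum_v psi(v, b) = phi(b)
    b_eq = [phi[b] for b in profiles]
    # Objective sum_{b, v} f(v) * psi(v, b), i.e. E[f(v)] under pi.
    c = np.tile([f(v) for v in values], len(profiles))
    out = []
    for sign in (1, -1):  # minimize, then maximize
        res = linprog(sign * c, A_ub=np.array(ic), b_ub=np.zeros(len(ic)),
                      A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
        out.append(sign * res.fun)
    return tuple(out)

# Hypothetical data: two bidders who always bid 0, bid grid {0, 1}.
values, bid_grid = [0, 1, 2, 3, 4], [0, 1]
phi = {(0, 0): 1.0}
mean_lo, mean_hi = moment_bounds(phi, values, bid_grid, lambda v: v)
print(mean_lo, mean_hi)
```

For these toy data the bounds on the mean common value are 0 and 2. Passing a different `f`, for example `lambda v: v ** 2`, bounds another moment with the same constraints, which is the support-function reading given in the text.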

Essentially this observation says that we can easily compute the support function of the identified set at any direction , by simply solving a linear program.

In Section A.2, we provide an upper bound on the mean of the valuation distribution when the latter is continuous. This upper bound is derived in terms of the observed bid distribution.

###### Remark 1 (Winning bid).

The above procedure can be easily modified if, as may be the case, only winning bids are observed. In particular, given that we observe the CDF $F$ of the winning bid, we know that an equilibrium $\psi$ is consistent with said CDF if and only if for every possible bid $x$ in $B$,

$$F(x)=\sum_{v\in V}\ \sum_{b:\ b_i\le x\ \forall i}\psi(v,b)$$

This has up to linear constraints of variables, but if we assume the bid vector distribution has small support of size , we only need variables.
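The winning-bid restriction in Remark 1 is a set of linear equalities in $\psi$, one per bid level. A minimal sketch of how the corresponding constraint rows could be assembled is given below; the function name `winning_bid_rows`, the flat indexing `k * n_values + j`, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def winning_bid_rows(profiles, n_values, grid):
    """One row per bid level x: ones on every (b, v) cell with max(b) <= x.

    Multiplying the result by a flattened psi (index k * n_values + j for
    profile k and value j) returns the implied CDF of the winning bid.
    """
    rows = np.zeros((len(grid), len(profiles) * n_values))
    for r, x in enumerate(grid):
        for k, b in enumerate(profiles):
            if max(b) <= x:
                rows[r, k * n_values:(k + 1) * n_values] = 1.0
    return rows

# Hypothetical example with two bidders, two values and bid grid {0, 1}.
profiles = [(0, 0), (0, 1), (1, 1)]
grid = [0, 1]
psi = np.array([0.25, 0.25, 0.15, 0.15, 0.10, 0.10])  # psi(v, b), flattened
implied_cdf = winning_bid_rows(profiles, 2, grid) @ psi
print(implied_cdf)  # [0.5 1. ]
```

In an LP these rows would enter as equality constraints with the observed winning-bid CDF on the right-hand side, in place of the constraint that ties $\psi$ to the full bid distribution.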

###### Remark 2 (Covariate Heterogeneity).

Suppose we have covariates and want to allow for observed heterogeneity, where we maintain the assumption that the vector of covariates takes finitely many values. Then, one nonparametric approach is to solve the above LP separately for every value $x_0\in X$ that the covariate vector takes. In addition, in cases where the conditional mean of the common value is linear in the covariates, i.e. $\mathbb{E}[v\mid x_0]=x_0'\beta$, we can solve directly for the identified set for $\beta$ by solving the following LP:

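$$\begin{aligned}
LP(\phi,\beta):\quad & \forall b_i^*,b_i'\in B,\ x_0\in X: && \sum_{v\in V,\ b\in S:\,b_i=b_i^*}\phi(b\mid x_0)\cdot x(v\mid b,x_0)\cdot\left(u_i(b;v)-u_i(b_i',b_{-i};v)\right)\ge 0\\
& \forall x_0\in X: && x_0'\beta=\sum_{b\in S}\sum_{v\in V}v\cdot\phi(b\mid x_0)\cdot x(v\mid b,x_0)\\
& \forall b\in S,\ x_0\in X: && x(\cdot\mid b,x_0)\in\Delta(V)
\end{aligned}$$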

Here, the vector allows for different players to have different 's (and the 's in this case would be auction specific heterogeneity where the different 's would allow the mean valuation of different players to depend differently on auction characteristics).


The latter is attractive as it allows us to couple the identified set of the mean of the unknown common value across multiple covariate realizations, in a computationally tractable manner. Otherwise, we would have to generate the identified sets of the mean for each covariate realization and then solve a second-stage problem that tries to find the set of joint solutions in these identified sets (via some form of joint grid search) that are consistent with a model of how the conditional mean varies as a function of the covariates. However, the cost of a joint grid search grows exponentially with the number of realizations of the covariates. The latter remark shows that, when the model of the conditional mean is linear, we can avoid this exponential blow-up in the computation while leveraging the statistical power of coupling data from separate covariate realizations.

## 4 Private Value Auctions

We now consider the case of a private value single item auction. In this case the (unknown) state of the world is a vector of private values . We assume that these private values come from some unknown joint distribution . Moreover, we initially assume that players know at least their own private value. Thus the (minimal) signal set is equal to and moreover, we have that conditional on a value vector , , deterministically. Since the signal is a deterministic function of the unknown state of the world, we will again denote with the distribution of the unknown valuation vector, which is the parameter that we wish to identify. Here, each player first draws a valuation (as an element of the state of the world), and then each player’s own valuation is revealed to the player through a signal. After that, a signal is further revealed before players play a BNE given this signal.

In this setting the sharp identified set is again slightly simplified. The result is stated in the next Theorem.

###### Theorem 8.

Let the above private values auction model hold. Given a distribution of bid profiles supported on a set , the set of distributions of bids that are consistent with are the ones where the following linear program is feasible:

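$$
\begin{aligned}
&LP(\phi,\pi)\\
&\forall v_i^*\in V,\ b_i^*, b_i'\in B: && \sum_{v:\ v_i=v_i^*,\ b\in S:\ b_i=b_i^*}\phi(b)\cdot x(v|b)\cdot\big(u_i(b;v_i)-u_i(b_i',b_{-i};v_i)\big)\ge 0\\
&\forall v\in V^n: && \pi(v)=\sum_{b\in S}\phi(b)\cdot x(v|b)\\
&\forall b\in S: && x(\cdot|b)\in\Delta(V^n)
\end{aligned}
$$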

Observe that in this setting, the latter linear program is also linear in $\pi$. Thus the sharp identified set is a convex set, defined as the set of solutions to the above linear program, where $\pi$ is also a variable. Note also that no assumption is made on the correlation between players' valuations. This result allows for the recovery of the valuation distribution under arbitrary correlation (and general signaling structures). As a special case of the above, we study next the IPV model of auctions.

##### Independent Private Values.

The situation becomes more complex if we also want to impose the extra assumption that the private values are independent. In that case, we have the extra condition that $\pi$ must be a product distribution, which is a non-convex constraint. For instance, if we want to assume that the value of each player is independently drawn from the same distribution $\rho$, and hence $\rho$ is what we wish to identify, then we also have the extra constraint that:

$$\forall v\in V^n:\quad \pi(v)=\rho(v_1)\cdot\ldots\cdot\rho(v_n) \tag{11}$$

Adding this constraint to the above LP makes the problem non-convex with respect to the variables $\rho$ (even though checking whether a given $\rho$ is in the identified set is still an LP). Thus, in this case, we cannot compute in polynomial time upper and lower bounds on the moments of the distribution using the above LP.
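To illustrate why the product constraint of Equation (11) is non-convex, the following sketch mixes two product distributions and checks that the mixture is no longer a product. The helper names and the dictionary representation of distributions are hypothetical choices made for this example only.

```python
from itertools import product


def product_distribution(rho, n):
    """Joint distribution pi(v) = rho(v_1) * ... * rho(v_n), as in Equation (11)."""
    pi = {}
    for v in product(sorted(rho), repeat=n):
        prob = 1.0
        for v_i in v:
            prob *= rho[v_i]
        pi[v] = prob
    return pi


def is_symmetric_product(pi, n, tol=1e-9):
    """Check whether pi equals the n-fold product of its first marginal."""
    marginal = {}
    for v, prob in pi.items():
        marginal[v[0]] = marginal.get(v[0], 0.0) + prob
    candidate = product_distribution(marginal, n)
    support = set(pi) | set(candidate)
    return all(abs(pi.get(v, 0.0) - candidate.get(v, 0.0)) <= tol for v in support)


# Two product distributions over value profiles of n = 2 players.
pi_low = product_distribution({0: 1.0, 1: 0.0}, 2)   # both values equal 0
pi_high = product_distribution({0: 0.0, 1: 1.0}, 2)  # both values equal 1

# Their equal-weight mixture puts mass 1/2 on (0, 0) and 1/2 on (1, 1).
mixture = {v: 0.5 * pi_low[v] + 0.5 * pi_high[v] for v in pi_low}

low_is_product = is_symmetric_product(pi_low, 2)
high_is_product = is_symmetric_product(pi_high, 2)
mixture_is_product = is_symmetric_product(mixture, 2)
```

Both endpoints satisfy the constraint while their convex combination does not, so the set of $\pi$ satisfying Equation (11) is not convex.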

However, we make the following observation which simplifies the constraints of the LP: we note that conditional on a player’s valuation and on a bid profile , the effect of a deviation is independent of the values of opponents, in a private value setting. Thus we can re-write the best response constraint as:

$$\sum_{b\in S:\ b_i=b_i^*}\phi(b)\cdot x_i(v_i^*|b)\cdot\big(u_i(b;v_i^*)-u_i(b_i',b_{-i};v_i^*)\big)\ge 0 \tag{12}$$

where , where is the random variable representing player ’s value and is the random variable representing the bid profile at a BCE.

Then we can formulate the consistency constraints, by simply imposing a constraint per player, i.e. if is the distribution of player ’s value, then it must be that:

$$\rho_i(v_i)=\sum_{b\in S}\phi(b)\cdot x_i(v_i|b) \tag{13}$$

These are constraints that are still linear in . We state the LP as a corollary next.

###### Corollary 9.

Assume that the above IPV model holds. The following LP characterizes the sharp identified set under the independent private values model:

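$$
\begin{aligned}
&LP(\phi,\rho)\\
&\forall i\in[n],\ v_i\in V,\ b_i^*, b_i'\in B: && \sum_{b\in S:\ b_i=b_i^*}\phi(b)\cdot x_i(v_i|b)\cdot\big(u_i(b;v_i)-u_i(b_i',b_{-i};v_i)\big)\ge 0\\
&\forall i\in[n],\ v_i\in V: && \rho_i(v_i)=\sum_{b\in S}\phi(b)\cdot x_i(v_i|b)\\
&\forall b\in S: && x_i(\cdot|b)\in\Delta(V)
\end{aligned}
$$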

In particular, if we assume that players are symmetric, i.e. $\rho_i=\rho$ for all $i$, then we can compute upper and lower bounds on any moment of the common (shared) marginal value distribution:

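$$
\begin{aligned}
&\max_{x_i(\cdot|b)}\ \sum_{v\in V}f(v)\cdot\rho(v)\\
&\forall i\in[n],\ v_i\in V,\ b_i^*, b_i'\in B: && \sum_{b\in S:\ b_i=b_i^*}\phi(b)\cdot x_i(v_i|b)\cdot\big(u_i(b;v_i)-u_i(b_i',b_{-i};v_i)\big)\ge 0\\
&\forall i\in[n],\ v\in V: && \rho(v)=\sum_{b\in S}\phi(b)\cdot x_i(v|b)\\
&\forall i\in[n],\ b\in S: && x_i(\cdot|b)\in\Delta(V)
\end{aligned}
$$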

A by-product of this analysis is that the linear program allows us to test for symmetry in the independent private values model. In particular, the LP need not be feasible once we impose that all marginals are the same. Thus, by checking feasibility of the LP, we can refute the assumption of symmetric independent private values.
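The constraints of Corollary 9 can be checked for a given candidate $x_i(\cdot|b)$ without an LP solver, which may help build intuition for what the LP searches over. The sketch below is only a verifier for a proposed solution, not the LP itself; the first-price utility with uniform tie-breaking, the data layout, and all function names are assumptions made for illustration, and the simplex constraint on $x_i(\cdot|b)$ is taken as given rather than checked.

```python
def first_price_utility(i, b, v_i):
    """Expected utility of player i in a first-price auction with uniform tie-breaking."""
    top = max(b)
    if b[i] < top:
        return 0.0
    winners = sum(1 for b_j in b if b_j == top)
    return (v_i - b[i]) / winners


def best_response_slack(phi, x, i, v_i, b_star, b_dev, utility):
    """Left-hand side of the best-response constraint (12) for one (i, v_i, b*_i, b'_i)."""
    total = 0.0
    for b, prob in phi.items():
        if b[i] != b_star:
            continue
        deviated = b[:i] + (b_dev,) + b[i + 1:]
        gain = utility(i, b, v_i) - utility(i, deviated, v_i)
        total += prob * x[i][b].get(v_i, 0.0) * gain
    return total


def satisfies_best_response(phi, x, values, bids, utility, tol=1e-9):
    """True if the candidate x satisfies every best-response constraint."""
    n = len(next(iter(phi)))
    return all(
        best_response_slack(phi, x, i, v_i, b_star, b_dev, utility) >= -tol
        for i in range(n) for v_i in values for b_star in bids for b_dev in bids
    )


def implied_marginal(phi, x, i, values):
    """Marginal value distribution rho_i implied by the consistency constraint (13)."""
    return {v: sum(p * x[i][b].get(v, 0.0) for b, p in phi.items()) for v in values}


# Hypothetical data: both players always bid 1; bids in {0, 1}, values in {0, 2}.
phi = {(1, 1): 1.0}
bids, values = [0, 1], [0, 2]
x_good = [{(1, 1): {2: 1.0}}, {(1, 1): {2: 1.0}}]  # a bid of 1 comes from value 2
x_bad = [{(1, 1): {0: 1.0}}, {(1, 1): {2: 1.0}}]   # player 0 bids 1 with value 0

good_ok = satisfies_best_response(phi, x_good, values, bids, first_price_utility)
bad_ok = satisfies_best_response(phi, x_bad, values, bids, first_price_utility)
rho_0 = implied_marginal(phi, x_good, 0, values)
```

In the second candidate, a player with value 0 would rather deviate to a bid of 0 than tie at a bid of 1, so the best-response constraint fails. Under symmetry, one would additionally require the implied marginals to coincide across players.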


Note here that, given that we allow the augmenting signals to be arbitrarily correlated, we are not able to infer the correlation in values conditional on a bid profile: it can be arbitrary, since there are constraints only on the marginal distribution of each value conditional on a bid profile. Some information about correlation in values can still be inferred from the correlation in bids.

##### Non-identifiability of Correlation in Values.

The above discussion shows a stronger point: the BCE constraints for the case of private values, under the assumption that players observe their own valuation, yield no constraints on the correlation of player valuations, but only constrain the marginal distributions of each player’s value. Thus, assuming that the observed distribution is the outcome of a BCE does not allow for identification of the correlation among players' valuations. The reasoning behind this result is that in a Bayes-correlated equilibrium it is impossible to distinguish between the case where valuations are genuinely correlated and the case where bids are correlated through signaling. Since a BCE allows for arbitrary such signaling, no inference can be made on the underlying correlation of values. We can learn the marginal distributions, but the model under general information structures contains no information on the copula without further restrictions on the model.

### 4.1 Point Identification in First Price Auctions with Continuous Bids


Generally in the independent private values model, if we allow the bids and valuations to be continuous, we show here that the bidder specific valuation distributions are all identified even allowing for general information structures. This is important since the information here is allowed to be correlated but yet the valuation distributions of each player (under independence) are point identified.

Denote with $G_{-i}(\cdot|b_i)$ the CDF of the maximum other bid at a BCE, conditional on a bid $b_i$ of player $i$, and let $g_{-i}(\cdot|b_i)$ be the density. These are observable in the data. Now the utility of player $i$ from submitting a bid $b_i'$, conditional on being recommended $b_i$ by the BCE and observing a value of $v_i$, is:

$$U_i(b_i';v_i,b_i)=(v_i-b_i')\cdot G_{-i}(b_i'|b_i) \tag{14}$$

Since $b_i$ maximizes the above quantity, by the best-response constraints, the derivative of the utility with respect to $b_i'$ has to be equal to $0$ at $b_i'=b_i$. The latter implies:

$$(v_i-b_i)\cdot g_{-i}(b_i|b_i)-G_{-i}(b_i|b_i)=0 \tag{15}$$

We summarize our result in the following Theorem.

###### Theorem 10.

Let the IPV model above hold and further assume the bid distribution is continuous and admits a density. Thus we can write the value of a player as a function of his bid and of the conditional maximum other bid distribution:

$$v_i=b_i+\frac{G_{-i}(b_i|b_i)}{g_{-i}(b_i|b_i)} \tag{16}$$

Hence, the valuation distribution for each player is point identified.

Hence, if we know the population bid distribution and if we assume that it is continuous and admits a density, then given the bid of a player we can invert and uniquely identify his value. Thus we can write the CDF of the value distribution of each player as a function of the observables.
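A minimal sketch of the inversion in Equation (16), assuming the conditional CDF and density of the maximum opposing bid are available as Python callables `G(x, b_i)` and `g(x, b_i)` (hypothetical signatures chosen for this example). The check uses the textbook case of two bidders with independent Uniform[0, 1] values and no additional signals, where the equilibrium bid is $b = v/2$, so the opponent's bid is Uniform[0, 1/2].

```python
def invert_bid(b_i, G, g):
    """Value implied by Equation (16): v_i = b_i + G(b_i | b_i) / g(b_i | b_i)."""
    density = g(b_i, b_i)
    if density <= 0:
        raise ValueError("conditional density must be positive at the observed bid")
    return b_i + G(b_i, b_i) / density


# Opponent's bid is Uniform[0, 1/2], independent of the player's own bid.
def G_uniform(x, b_i):
    return min(max(2.0 * x, 0.0), 1.0)


def g_uniform(x, b_i):
    return 2.0 if 0.0 <= x <= 0.5 else 0.0


recovered_value = invert_bid(0.3, G_uniform, g_uniform)
```

In this example a bid of 0.3 maps back to a value of 0.6, consistent with $b = v/2$. In applications, `G` and `g` would be replaced by estimates from the observed bid data.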


A similar result was shown for the first price IPV model (without signals) in Guerre, Perrigne, and Vuong (2000) and later generalized to the affiliated private values setting by Li, Perrigne, and Vuong (2002). Interestingly, the inversion formula that we arrived at in the independent private values setting, but under robustness to the information structure, is the same as the inversion formula of Li, Perrigne, and Vuong (2002) under the correlated private value setting. It is interesting to note that, under the assumption that the bid of a player is strictly increasing in his value at any Bayes-correlated equilibrium, we can re-do the analysis in this section in the more general correlated private values setting and show that we can invert the value of a player from the observed correlated bid distribution. The inversion formula is identical to Equation (16) and identical to the inversion derived in Li, Perrigne, and Vuong (2002). This essentially shows that, in terms of inference, robustness to information structure is in some sense equivalent to robustness to correlation in values.

Finally, as in the common values case, it is possible to adjust the LP to account for observing only the winning bids in the Private Values case.


## 5 Counter-factual Analysis

Suppose that we wanted to understand the performance of some other auction when deployed in the same market, with respect to some metric: that is a function of the unknown value vector and the bid vector in the private value setting or a function in the common value setting. Examples of such metrics could be social welfare: or revenue, i.e. , where is the probability of allocating to player under bid profile and is the expected payment of player under bid profile .

We are interested in computing an upper and lower bound on this metric under this new auction which has different utilities and under any Bayes-correlated equilibrium. This is important since it allows us to obtain bounds on welfare or other metrics in a new environment under general information structures. The welfare bounds computed in this manner will inherit the robustness property in that they will be valid under all information structures.

This is straightforward in our setup since computing a sharp identified set for any such counter-factual can be done in a computationally efficient manner, in both the common value setting and in the correlated private value setting. The upper bound of the counter-factual boils down to the following linear program (we give here the linear program for the case of common values, but the analogous program can be written for correlated private values), that takes as input the observed distribution of bids in our current auction, the metric and the primitive utility form under the alternative auction. We state this in the next Theorem.

###### Theorem 11.

Let the common value model above hold. Given a social welfare function , and vector of utilities , the sharp upper and lower bounds on expected welfare can be computed using the following LP:

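$$
\begin{aligned}
&LP(\phi,W,\tilde{u})\\
&\min/\max_{\tilde{\psi}}\ \sum_{v,b}W(v,b)\cdot\tilde{\psi}(v,b)\\
&\forall b_i, b_i': && \sum_{v\in V,\ b_{-i}\in B^{n-1}}\tilde{\psi}(v,b)\cdot\big(\tilde{u}_i(b;v)-\tilde{u}_i(b_i',b_{-i};v)\big)\ge 0\\
&\forall v\in V: && \pi(v)=\sum_{b\in B^n}\tilde{\psi}(v,b)\\
& && \tilde{\psi}\in\Delta(V\times B^n)\ \text{ and }\ \pi\in\Pi_I(\phi)
\end{aligned}
$$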

Similarly if we want to compute a lower bound on the counter-factual we can compute the minimum rather than the maximum of the above objective subject to the same constraints.

## 6 Statistical Inference on Sharp Sets in CV Auctions

In general, we do not have access to the distribution . Instead, we have access to i.i.d. observations from said distribution ; let . The sampled distribution is then given by

$$\phi_N(b)=\frac{1}{N}\sum_{t=1}^{N}\omega_t(b) \tag{17}$$
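As a minimal sketch of Equation (17), the empirical bid distribution can be computed by counting observed bid profiles. This assumes that $\omega_t(b)$ is the indicator that the $t$-th observed bid profile equals $b$, which is our reading of the definition and is labeled here as an assumption; the function name is hypothetical.

```python
from collections import Counter


def empirical_bid_distribution(bid_profiles):
    """Empirical distribution phi_N over bid profiles, as in Equation (17).

    Assumes omega_t(b) = 1 if the t-th observed profile equals b and 0 otherwise.
    """
    n_obs = len(bid_profiles)
    if n_obs == 0:
        raise ValueError("need at least one observed bid profile")
    counts = Counter(tuple(b) for b in bid_profiles)
    return {b: count / n_obs for b, count in counts.items()}


# Hypothetical sample of N = 4 auctions with two bidders each.
sample = [(1, 2), (1, 2), (2, 2), (1, 1)]
phi_N = empirical_bid_distribution(sample)
```

The resulting dictionary can be plugged in wherever the population distribution of bids appears in the linear programs above.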

In this section, we develop techniques for estimating the sharp identified set using samples from . We showcase our techniques for the CV setup for simplicity. Also, we focus on inference on the identified set for which is the object of inference here. Oftentimes, we parametrize the distribution of the common values, and so inference will be on the identified set for the vector of parameters (footnote 7: it is possible to construct the CI for the unknown parameters, rather than for the identified set, by inverting test statistics). We start with finite sample approaches to constructing confidence regions for sets using concentration inequalities. This seems to be the first application of such inequalities to settings with partial identification (footnote 8: finite sample inference results are particularly attractive in models with partial identification, since standard asymptotic approximations are typically not uniformly valid, especially in models that are close to being, or are, point identified). We also show how existing set inference methods can be used to construct confidence regions.

### 6.1 Finite Sample Inference Using Concentration Inequalities

We explore first the question of set inference using finite sample concentration inequalities. This allows us to obtain a confidence set for the identified set where the coverage property holds for every sample size. In addition, we highlight tools from the concentration of measure literature applied to inference on sets in partially identified models.

As a reminder, given a probability distribution over bids , the sharp set of compatible equilibria is the set of joint probability distributions satisfying:

$$\forall b_i^*, b_i'\in B:\quad \sum_{v\in V,\ b\in S:\ b_i=b_i^*}\phi(b)\cdot x(v|b)\cdot\big(u_i(b;v)-u_i(b_i',b_{-i};v)\big)\ge 0 \tag{18}$$

for the proper . See the statement of Theorem (6) above. Let be the vector characterizing the distribution of the common value. Let , denote the negative of the best-response and density constraints, associated with the BCE LP in Theorem (6). Then in the population is feasible iff:

$$\min_x\max_{j\in M}F_j(x;\pi,\phi)\le 0 \tag{19}$$

Then we have that the identified set is defined as:

$$\Pi_I=\Big\{\pi:\ \min_x\max_{j\in M}F_j(x;\pi,\phi)\le 0\Big\} \tag{20}$$

Now we consider a finite sample analogue. Observe that , where expectation is over the random vector and the function takes the form:

$$f_j(x;\pi,\omega)=\sum_{v\in V,\ b\in S:\ b_i=b_i^*}\omega(b)\cdot x(v|b)\cdot\big(u_i(b_i',b_{-i};v)-u_i(b;v)\big) \tag{21}$$

for some triplet for the case of best-response constraints and similarly for density consistency constraints. Then we consider the sample analogue:

$$F_j^N(x;\pi)=\frac{1}{N}\sum_{t=1}^{N}f_j(x;\pi,\omega_t) \tag{22}$$

Observe that, due to the linearity of $f_j$ with respect to $\omega$, we can re-write: $F_j^N(x;\pi)=F_j(x;\pi,\phi_N)$. One can then define the estimated identified set by analogy, i.e., replacing $\phi$ with $\phi_N$:

$$\Pi_I^N=\Big\{\pi:\ \min_x\max_{j\in M}F_j(x;\pi,\phi_N)\le\sigma_N\Big\} \tag{23}$$

for some decaying tolerance constant $\sigma_N$ (which can be set to zero).


Because the pdf of a non-parametric distribution is a very high-dimensional object, inference on it will require many samples. Hence, we will instead focus on two more structured inference problems. In the first one, we are interested in inferring the identified set for a moment of the distribution, and in the second we assume that the said pdf is known up to a finite dimensional parameter and infer the identified set of these lower dimensional parameters. One can in principle recover non-parametric inference by simply letting the parameters be the values of the pdf at the discrete support points, albeit at a cost in sample complexity.


#### 6.1.1 Inference on Identified Set for Moments.

We begin with the non-parametric setting and show how to conduct inference, valid in finite samples, on the identified set of any moment function of the common value distribution. Even though the identified set on the whole distribution is very high dimensional, we can still answer probabilistic questions for any moment of the distribution via the Bayesian Bootstrap. Observe that the identified set for any moment is an interval defined by:

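$$
\begin{aligned}
L&=\min_{x:\ \max_{j\in M}F_j(x;\phi)\le 0}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi(b)\cdot x(v|b)\\
U&=\max_{x:\ \max_{j\in M}F_j(x;\phi)\le 0}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi(b)\cdot x(v|b)
\end{aligned}
\tag{24}
$$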

where in both optimizations is ranging over the convex set of conditional distributions, i.e. , which we omit for simplicity of notation. We can then define their finite sample analogues as:

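$$
\begin{aligned}
L_N(\sigma_N)&=\min_{x:\ \max_{j\in M}F_j^N(x)\le\sigma_N}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi_N(b)\cdot x(v|b)\\
U_N(\sigma_N)&=\max_{x:\ \max_{j\in M}F_j^N(x)\le\sigma_N}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi_N(b)\cdot x(v|b)
\end{aligned}
\tag{25}
$$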

where $F_j^N$ and $\phi_N$ are given in Equations (22) and (17), respectively.


We remind the reader that in this setting the identified set is an interval given by Equations (24) and its finite sample equivalent is given by Equations (25). The following result gives finite sample high probability bounds on the coverage of the interval . As a reminder, we use to designate the number of bidders, is sample size, and is the number of support points for bids.

###### Theorem 12.

Suppose that for all , and and let and . Then:

$$\Pr\Big[[L,U]\subseteq\big[L_N(\sigma_N)-\epsilon_N,\ U_N(\sigma_N)+\epsilon_N\big]\Big]\ge 1-\delta \tag{26}$$

where $L$, $U$ are defined by Equation (24) and $L_N(\sigma_N)$, $U_N(\sigma_N)$ are defined by Equation (25).

###### Proof.

We will show that with probability at least : . The theorem follows by also showing that with probability at least and then using a union bound of the bad events. The second inequality follows along identical lines by simply replacing with . So it suffices to show the first inequality.

We remind the reader the definitions of the two quantities:

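$$L=\min_{x:\ \max_{j\in M}F_j(x;\phi)\le 0}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi(b)\cdot x(v|b) \tag{27}$$

$$L_N(\sigma_N)=\min_{x:\ \max_{j\in M}F_j^N(x)\le\sigma_N}\ \sum_{v\in V}m(v)\cdot\sum_{b\in S}\phi_N(b)\cdot x(v|b) \tag{28}$$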

Let $x^*$ be the optimal solution to the population LP of Equation (27). We will argue that w.h.p., for the choice of $\sigma_N$, it remains a feasible solution of the finite sample LP of Equation (28), and that the value of the finite sample LP under $x^*$ is at most the value of the population LP plus $\epsilon_N$.

Observe that is the sum of i.i.d. random variables bounded in with mean . By Hoeffding’s inequality with probability :

$$F_j^N(x^*)\le F_j(x^*;\phi)+2H\sqrt{\frac{\log(1/\kappa)}{N}}\le 2H\sqrt{\frac{\log(1/\kappa)}{N}} \tag{29}$$

where the second inequality follows by feasibility of for the LP of Equation (27). By a union bound over all and since we get that with probability at least :

$$\max_{j\in M}F_j^N(x^*)\le 2H\sqrt{\frac{\log(1/\kappa)}{N}} \tag{30}$$

For and we get that with probability : and thereby is feasible for the finite sample LP.

Finally, the value of the finite sample solution can be written as with:

$$q(x;\omega_t)=\sum_{v\in V}m(v)\sum_{b\in S}\omega_t(b)\cdot x(v|b) \tag{31}$$

Thus it is also the sum of i.i.d. random variables bounded in (footnote 9: since and ) and with mean . Hence, by Hoeffding’s inequality, with probability at least :

$$Q_j^N(x^*)\le Q_j(x^*;\phi)+2H\sqrt{\frac{\log(4/\delta)}{N}}=L+\epsilon_N \tag{32}$$

By a union bound we get that with probability at least , is feasible for the finite sample LP and achieves value . In that case, by the definition of , we get: and the theorem follows. (Footnote 10: We could have used an empirical Bernstein inequality instead of Hoeffding’s inequality and replaced the in the quantities by , where is the empirical variance, at the expense of adding a lower order term of (see e.g. Maurer and Pontil (2009); Peel et al. (2010)). Similarly for . However, the latter requires knowledge of , and taking a supremum over in the latter seems to be as conservative as a Hoeffding bound.)
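The Hoeffding terms appearing in Equations (29) and (32) are simple to evaluate numerically. The sketch below computes them for given $H$, $N$ and $\delta$. The value $\epsilon_N = 2H\sqrt{\log(4/\delta)/N}$ is read off Equation (32); the choice $\kappa = \delta/(4|M|)$ for the constraint tolerance is an assumption made here to spread the failure probability over the $|M|$ constraints, and the function names are hypothetical.

```python
import math


def hoeffding_radius(H, N, kappa):
    """Deviation term 2 * H * sqrt(log(1 / kappa) / N) from Equation (29)."""
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    return 2.0 * H * math.sqrt(math.log(1.0 / kappa) / N)


def finite_sample_tolerances(H, N, delta, num_constraints):
    """Return (sigma_N, epsilon_N) for a target failure probability delta.

    Assumption: kappa = delta / (4 * num_constraints) for the constraint tolerance.
    """
    sigma_N = hoeffding_radius(H, N, delta / (4.0 * num_constraints))
    epsilon_N = hoeffding_radius(H, N, delta / 4.0)  # matches Equation (32)
    return sigma_N, epsilon_N


# Hypothetical numbers: payoffs bounded by H = 1, N = 10000 auctions, 50 constraints.
sigma_N, epsilon_N = finite_sample_tolerances(H=1.0, N=10_000, delta=0.05, num_constraints=50)
```

Both terms shrink at rate $1/\sqrt{N}$, and the constraint tolerance grows only logarithmically in the number of constraints.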


#### 6.1.2 Inference on Identified Set for Parametric Distributions.

For the parametric case, we assume that the distribution is parametric of the form for some finite parameter set . In the parametric setting we need to augment the constraint set , apart from containing the best-response constraints, to contain the parametric form consistency constraints:

$$\forall v\in V:\quad \pi(v,\theta)=\sum_{b\in S}\phi(b)\cdot x(v|b) \tag{33}$$

These can also be written in the form for some function (footnote 11: simply add one constraint of the form and one of the form ). We overload notation and let be this augmented set of constraints. Then the parameter of interest is and the identified set for takes the form:

$$\Theta_I=\Big\{\theta\in\Theta:\ \min_x\max_{j\in M}F_j(x;\theta,\phi)\le 0\Big\} \tag{34}$$

and its sample equivalent

$$\Theta_I^N(\sigma_N)=\Big\{\theta\in\Theta:\ \min_x\max_{j\in M}F_j(x;\theta,\phi_N)\le\sigma_N\Big\} \tag{35}$$

for some decaying tolerance constant $\sigma_N$ (which can be set to zero).

The next Theorem provides finite sample high probability coverage bounds for .

###### Theorem 13.

Consider and as defined in Equations (34) and (48). Suppose that for all , and . If , then:

$$\Pr\big[\Theta_I\subseteq\Theta_I^N(\sigma_N)\big]\ge 1-\delta \tag{36}$$
###### Proof.

To show the statement we need to show that with probability , for all it must be that