Optimized Partial Identification Bounds for Regression Discontinuity Designs with Manipulation
Abstract
The regression discontinuity (RD) design is one of the most popular quasi-experimental methods for applied causal inference. In practice, the method is quite sensitive to the assumption that individuals cannot precisely control their value of a “running variable” that determines treatment status. If individuals are able to precisely manipulate their scores, then point identification is lost. We propose a procedure for obtaining partial identification bounds in the case of a discrete running variable where manipulation is present. Our method relies on two stages: first, we derive the distribution of nonmanipulators under several assumptions about the data. Second, we obtain bounds on the causal effect via a sequential convex programming approach. We also propose methods for tightening the partial identification bounds using an auxiliary covariate, and derive confidence intervals via the bootstrap. We demonstrate the utility of our method on a simulated dataset.
1 Introduction
The regression discontinuity (RD) design is a vital analytic tool for social scientists. First introduced by Thistlethwaite and Campbell (1960), the RD design gained popularity due to its applicability to a wide variety of nonexperimental settings. RD designs have been used to infer the effect of health insurance on neonatal hospital stays (Almond and Doyle, 2011), the effect of college quality on students’ postsecondary enrollment choices (Cohodes and Goodman, 2014), and the effect of incumbency on U.S. House election outcomes (Lee, 2008), to name just a few.
The method exploits scenarios in which each unit has an associated score (the “running variable”), and treatments are assigned based on whether the score falls above or below a threshold. Under the assumption that units are unable to precisely manipulate their scores near the threshold, treatment assignment is as good as random in a narrow window around this cutoff. This quasi-randomization can be exploited to infer local average treatment effects without the necessity of running a randomized experiment.
The RD design relies on this no-precise-manipulation assumption, which can be problematic in practice. Gerard et al. (2015) document numerous scenarios in which plausible regression discontinuities show evidence of manipulation. These include teachers manipulating student test scores to meet performance standards in New York City (Dee et al., 2016), and students manipulating credits in order to be eligible for a college scholarship in West Virginia (Scott-Clayton, 2011).
If perfect manipulation is present, we can no longer assume that there is no systematic difference between units just above and just below the threshold. As a result, point identification of a local average treatment effect is lost. But if the population is composed of some manipulators and some nonmanipulators – and we estimate the prevalence of each – we can still hope to do valid causal inference on the subpopulation of nonmanipulators.
In this paper, we propose a partial identification approach for RD designs in which manipulation is present. Our approach is designed for the case in which the running variable is discrete. We propose to first generate an estimate of the “unmanipulated” density, using a technique proposed by Diamond and Persson (2016). Under stated assumptions, we can subsequently derive the relative densities of manipulators and nonmanipulators. We then pose the problem as an optimization, in which we derive treatment effect bounds as the best- and worst-case treatment effect estimates consistent with these densities. Our method draws on other partial identification approaches proposed in the literature.
The remainder of this paper proceeds as follows. In Section 2, we review relevant literature on regression discontinuity designs and partial identification of causal effects. Our proposed procedure is described in detail in Section 3. In Section 4, we propose methods for tightening the partial identification bounds by making use of auxiliary covariates. We demonstrate the utility of these methods via simulations in Section 5. Section 6 concludes.
2 Literature Review
Since its introduction in the mid-twentieth century, the RD design has yielded a broad literature covering both applications and methodological developments. Perhaps the most fundamental question is how to estimate conditional means just below and just above the cutoff. Local polynomial regression approaches have received substantial attention in recent decades (Hahn et al., 2001; Imbens and Lemieux, 2008), with particular focus on the selection of tuning parameters such as the choice of kernel and the smoothing bandwidth. More recently, Imbens and Wager (2018) proposed a minimax linear estimator, which obviates the need for some of these choices. Yet, regression-based methods remain appealing due to their simplicity and interpretability.
Violations of the RD design assumptions have received comparatively less attention. McCrary (2008) introduced an intuitive test for identifying the presence of manipulation, based on examination of the running variable density. Suppose the treatment is assigned to those with scores above the cutoff, and the treatment is desirable. In the absence of manipulation, we would expect to see a continuous running variable density at the cutoff. However, if manipulation were occurring, we would expect to see a discontinuity, resulting from individuals just below the cutoff manipulating their scores to just above the cutoff in order to secure the treatment. McCrary suggests an empirical hypothesis test for this density discontinuity, making use of a density estimator originally proposed by Cheng et al. (1997). Alternative density estimators have subsequently been proposed by both Otsu et al. (2013) and Cattaneo et al. (2015). Frandsen (2017) extended this approach by developing an alternative test for manipulation for the case of a discrete running variable. These methods have been widely adopted in the applied literature as a falsification test for checking the assumption of no manipulation.
There is comparatively little empirical work describing how to proceed when these tests indicate the presence of manipulation. Diamond and Persson (2016) consider Swedish math test data in which there is evidence of teachers inflating students’ grades. They develop an estimator to determine the causal effect of the score manipulation on future educational attainment and earnings. While their focus is on a different causal effect than the one we consider, the paper develops several useful methods that will be incorporated here.
We pursue the “partial identification” approach, popularized by Manski and later by Tamer (see e.g., Manski et al., 1989; Manski and Tamer, 2002; Haile and Tamer, 2003). The core idea is that, in scenarios in which a treatment effect cannot be point identified (even with an infinite sample size), it can still sometimes be bounded. These bounds might be very informative in practice – for example, allowing us to rule out negative or positive treatment effects.
Gerard et al. (2015) also take a partial identification approach to analyzing RD designs in the presence of manipulation. Their method posits the existence of subpopulations of manipulators and nonmanipulators and defines the causal effect on the nonmanipulators as the inferential target. We adopt the same framework but differ in our estimation technique. In particular, Gerard and coauthors extend McCrary’s result to estimate the proportion of manipulators at the cutoff. They then propose a “polynomial truncation” approach to estimation, which implicitly assumes all manipulated units lie at the top or bottom of the distribution of outcomes within a bandwidth above the threshold. We avoid making such an assumption by estimating the manipulator counts at all values of the running variable and explicitly assuming that it is discrete.
3 Proposed Procedure
Our contribution is a novel optimization procedure for estimating partial identification bounds on causal effects in RD designs with a discrete running variable. We require two preliminary steps: testing for the presence of manipulation, and estimating the unmanipulated density to derive the counts of manipulators and nonmanipulators. These steps can be accomplished using existing methods.
3.1 Notation and Initial Assumptions
Our data consist of $n$ units. We associate with each unit $i$ a pair of unseen potential outcomes $(Y_i(0), Y_i(1))$, corresponding to the value of the outcome if unit $i$ does not or does receive the treatment, respectively. We also associate with each unit an observed value of the running variable $X_i$ and a true, unobserved value of the running variable $\tilde{X}_i$.
We have a treatment assignment $Z_i$ and a running variable cutoff value $c$ such that $X_i \geq c$ implies $Z_i = 1$. In other words, this is a sharp, rather than fuzzy, RD design, but treatment assignments are based on the observed running variable rather than the true running variable. We observe the outcome $Y_i = Z_i Y_i(1) + (1 - Z_i) Y_i(0)$ for each unit. Our estimand of interest is
$$\tau = \mathbb{E}\left[Y_i(1) - Y_i(0) \mid \tilde{X}_i = c\right],$$
where the expectation is with respect to the superpopulation from which our data is sampled.
We define an indicator variable $H_i$ such that $H_i = 1$ if $X_i = \tilde{X}_i$ and $H_i = 0$ otherwise ($H_i$ defines whether this is an “honest” subject as opposed to a manipulator). Both $X_i$ and $\tilde{X}_i$ lie in the set $\mathcal{X} = \{x_1, \ldots, x_K\}$, a discrete set of running variable values, where $x_1 < x_2 < \cdots < x_K$. We also denote the counting functions
$$m(x) = \#\{i : X_i = x, H_i = 0\} \quad \text{and} \quad h(x) = \#\{i : X_i = x, H_i = 1\},$$
where $x \in \mathcal{X}$ and $N(x) = m(x) + h(x)$ is the observed count at $x$. Here, $m$ represents the manipulator distribution and $h$ the “honest” (nonmanipulator) distribution. $h$ is assumed to evolve smoothly with $x$, while there is no such assumption on $m$.
We also define the quantities
$$n_- = \#\{i : X_i < c\} \quad \text{and} \quad n_+ = \#\{i : X_i \geq c\},$$
observing $n_- + n_+ = n$. For convenience, we assume the indices are assigned such that $X_i < c$ for $i \in \{1, \ldots, n_-\}$ and $X_i \geq c$ for $i \in \{n_- + 1, \ldots, n\}$.
We make a somewhat restrictive assumption about the shape of the density of the true running variable.
Assumption 1
The unmanipulated density is log-concave, and the densities of the missing and excess mass (to the left and right of the threshold) are log-concave and monotonic.
This assumption appears in Diamond and Persson (2016), who argue that it is necessary for an approximate recovery of the unmanipulated density. Without it, any observed bunching in the density could simply be attributable to a bumpy unmanipulated density rather than the effect of the manipulation. Fortunately, the assumption is still quite general and allows for the density to follow many commonly used probability distributions, such as the normal, Gumbel, gamma, beta, and logistic (Diamond and Persson, 2016).
3.2 Testing for Manipulation
Many methods exist for validating the RD design, including density tests and tests for covariate balance just above and below the threshold (Lee and Lemieux, 2010). Rejections of the null in these tests can be used to falsify the RD design by identifying behavior that would be implausible in the case of treatment randomization near the threshold value of the running variable.
Our methods are only valid for the setting in which the unmanipulated density is recoverable from the observed density. Unsurprisingly, then, we rely on settings where evidence of the manipulation is manifest in the shape of the observed density. In cases where manipulation is plausible, we thus suggest the use of a density test as the first step in our procedure. The McCrary test (McCrary, 2008) is the standard approach, though the test advocated by Frandsen (2017) may be preferable for a discrete running variable that takes only a moderate number of distinct values.
If the null hypothesis is rejected by the density test, we suggest conducting further confirmatory falsification tests. A standard approach is to use baseline covariates as placebo outcomes and determine whether a causal effect would have been estimated at the cutoff for these outcomes (see e.g., Sekhon and Titiunik, 2016). If an abnormally high number of covariates show statistically significant discrepancies across the threshold, this further indicates the plausibility of manipulation.
3.3 Estimation of the True Density and the Non-Manipulator Counts
Once manipulation is established, we seek to estimate the density of the true running variable $\tilde{X}$ from the observed running variable $X$. Multiple methods could be deployed for this task.
So-called “bunching strategies” for analyzing manipulated distributions have ample precedent in the economics literature. For example, various models have been proposed to assess underlying income distributions when reported incomes are manipulated (Kleven and Waseem, 2012; Chetty et al., 2013). The statistics literature also provides methods for recovering the underlying distributions. Lindsey’s Method (Efron, 2012), in which a Poisson regression is fit to histogram heights in the unmanipulated section of the distribution, is a simple and intuitive technique.
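As an illustration, Lindsey’s method can be sketched in a few lines: fit a Poisson regression of histogram heights on a polynomial basis using only the bins outside the suspected manipulation region, then read off fitted counts everywhere. The following is a minimal sketch under our own conventions (the function name, the IRLS fitting routine, and the choice of a plain polynomial basis are ours, not prescribed by the references):

```python
import numpy as np

def lindsey_fit(counts, x, exclude, degree=4, n_iter=50):
    """Lindsey's method: Poisson regression of histogram heights on a
    polynomial basis, fit only on bins outside the suspected manipulation
    region; returns predicted 'unmanipulated' counts at every bin."""
    x = np.asarray(x, dtype=float)
    counts = np.asarray(counts, dtype=float)
    xs = (x - x.mean()) / x.std()                    # standardize for stability
    B = np.vander(xs, degree + 1, increasing=True)   # polynomial basis
    keep = ~np.isin(x, exclude)                      # bins used for fitting
    beta = np.zeros(degree + 1)
    for _ in range(n_iter):                          # IRLS for a Poisson GLM (log link)
        eta = B[keep] @ beta
        mu = np.exp(eta)
        z = eta + (counts[keep] - mu) / mu           # working response
        W = mu                                       # IRLS weights
        beta = np.linalg.solve(B[keep].T @ (W[:, None] * B[keep]),
                               B[keep].T @ (W * z))
    return np.exp(B @ beta)                          # fitted counts at all bins
```

The fitted counts at the excluded bins then serve as the estimate of the unmanipulated density inside the manipulation region.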
We are partial to the technique used by Diamond and Persson (2016), which has the advantage of simultaneously estimating the width of the “manipulation region” (the radius around the cutoff in which manipulation takes place) and the unmanipulated density itself. Their method performs a grid search over the possible widths of the manipulation region. For each potential value, nonlinear least squares is used to estimate the unmanipulated distribution as a linear combination of exponentiated Bernstein polynomials of the running variable, with a linear inequality constraint on the coefficients to enforce log-concavity of the unmanipulated density. A cross-validation procedure is used to identify the optimal width and optimal polynomial degree based on out-of-sample predictions.
Whether by the Diamond and Persson technique or another method, the output should be $\hat{N}$, an estimate of the count of units at each value of the true running variable $\tilde{X}$. To identify the count of nonmanipulators at each value, we make a further assumption.
Assumption 2
For every $x \in \mathcal{X}$, either $\#\{i : \tilde{X}_i = x, H_i = 0\} = 0$ or $\#\{i : X_i = x, H_i = 0\} = 0$.
This assumption is somewhat restrictive. It implies that if there is some $x$ at which there are manipulators for whom $\tilde{X}_i = x$, there cannot be any manipulators for whom the manipulated $X_i = x$. Similarly, if there are some manipulators for whom $X_i = x$, then there are no manipulators for whom $\tilde{X}_i = x$. In practical settings, this will look like a monotonicity constraint, i.e., any manipulator for whom $\tilde{X}_i < c$ will successfully manipulate such that $X_i \geq c$, and no manipulator will manipulate in the opposite direction (for a desirable treatment); or vice versa (for an undesirable treatment).
Assumption 2 is very similar to Assumption 3 in Gerard et al. (2015), who note that this kind of “one-sided” manipulation is plausible as long as the treatment is unambiguously desirable or undesirable. We show an example of manipulation satisfying our assumptions in Figure 1.
Under Assumption 2, observe that, at each value $x$, either no manipulators report $x$ (so the observed count equals the honest count) or no manipulators depart from $x$ (so the unmanipulated count equals the honest count). In either case,
$$h(x) = \min\left\{N(x), \tilde{N}(x)\right\},$$
where $N(x)$ is the observed count and $\tilde{N}(x) = \#\{i : \tilde{X}_i = x\}$ is the unmanipulated count. Thus, we can estimate the nonmanipulator count at each value by computing
$$\hat{h}(x) = \min\left\{N(x), \hat{N}(x)\right\}.$$
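This estimate is simple to compute. Under Assumption 2, each running-variable value either sends manipulators (so the observed count equals the honest count) or receives them (so the unmanipulated count equals the honest count); either way, the honest count is the pointwise minimum of the two. A sketch (the function name is ours):

```python
import numpy as np

def honest_counts(observed, unmanipulated):
    """Estimated nonmanipulator count at each running-variable value.
    Under one-sided manipulation, a value either sends manipulators
    (observed = honest <= unmanipulated) or receives them
    (unmanipulated = honest <= observed), so the honest count is the
    pointwise minimum of the observed and unmanipulated counts."""
    return np.minimum(np.asarray(observed), np.asarray(unmanipulated))
```

For instance, a value just above the cutoff that receives manipulators has an observed count exceeding its unmanipulated estimate, and the latter is returned; a sending value below the cutoff has a depleted observed count, and that count is returned.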
3.4 Optimization Problem
Many RDD causal effect estimators can be written as
$$\hat{\tau} = \phi(c)^\top \left(X_+^\top W_+ X_+\right)^{-1} X_+^\top W_+ \mathbf{y}_+ \;-\; \phi(c)^\top \left(X_-^\top W_- X_-\right)^{-1} X_-^\top W_- \mathbf{y}_-,$$
where $X_-$ and $X_+$ are concatenated basis expansions of the running variables for units to the left and right of the cutoff, respectively; $\phi(c)$ is an analogous basis expansion of the cutoff $c$; and $W_-$ and $W_+$ are diagonal matrices representing unit-level weights. The popular local polynomial regression approach (Hahn et al., 2001) can be expressed in this form, as can spline formulations (see e.g. Lemieux and Milligan, 2008) and simpler unweighted regressions. Suppose we are using any such method for inference.
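For concreteness, here is a minimal numpy sketch of one estimator in this class: a polynomial fit on each side of the cutoff, evaluated at the cutoff, with optional unit-level weights. The function is illustrative (names and defaults are ours), not the only member of the class:

```python
import numpy as np

def rd_estimate(x, y, c, degree=1, weights=None):
    """Sharp-RD estimate in the generic weighted-least-squares form: fit a
    basis expansion separately on each side of the cutoff, then take the
    difference of the two fits evaluated at the cutoff."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if weights is None:
        weights = np.ones_like(x)

    def side_fit(mask):
        # basis expansion in (x - c), so the intercept is the fit at the cutoff
        X = np.vander(x[mask] - c, degree + 1, increasing=True)
        W = np.diag(weights[mask])
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y[mask])
        return beta[0]

    return side_fit(x >= c) - side_fit(x < c)
```

Kernel weights, spline bases, or higher polynomial degrees slot into the same template by changing `weights` and the basis construction.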
Our goal is to put partial identification bounds on $\tau$. Observe that, were we to know the values of the indicators $H_i$, we could recover an estimate of the causal effect free from bias due to the manipulation. For bookkeeping, we collect the $H_i$ values into vectors $\mathbf{h}_-$ and $\mathbf{h}_+$, where the $i$th entry of $\mathbf{h}_-$ is the value of $H_i$ for the $i$th unit to the left of the cutoff, and analogously for $\mathbf{h}_+$.
In practice, $\mathbf{h}_-$ and $\mathbf{h}_+$ are unknown to the researcher, but we can impose certain constraints that they must satisfy. If we additionally knew the true nonmanipulator counts $h(x)$, then for any choice of $x < c$ we would know
$$\sum_{i : X_i = x} H_i = h(x),$$
and for any choice of $x \geq c$,
$$\sum_{i : X_i = x} H_i = h(x).$$
We could collect these equalities into matrix equalities
$$A_- \mathbf{h}_- = \mathbf{b}_-, \qquad A_+ \mathbf{h}_+ = \mathbf{b}_+,$$
where, e.g., $A_-$ has one row per value $x < c$ with entries
$$(A_-)_{x,i} = \mathbf{1}\{X_i = x\},$$
and
$$(\mathbf{b}_-)_x = h(x),$$
with analogous definitions for $A_+$ and $\mathbf{b}_+$. In practice, these equalities have to be approximated using the output of Section 3.3:
$$A_- \mathbf{h}_- = \hat{\mathbf{b}}_-, \qquad A_+ \mathbf{h}_+ = \hat{\mathbf{b}}_+,$$
where $\hat{\mathbf{b}}_-$ and $\hat{\mathbf{b}}_+$ are the approximated analogues of $\mathbf{b}_-$ and $\mathbf{b}_+$, with entries $\hat{h}(x)$.
We can now write down the first iteration of our optimization problem. We solve for the upper partial identification bound via:
Optimization Problem 1
Initial Formulation
maximize  $\hat{\tau}(\mathbf{h}_-, \mathbf{h}_+) = \phi(c)^\top \left(X_+^\top W_+ \operatorname{diag}(\mathbf{h}_+) X_+\right)^{-1} X_+^\top W_+ \operatorname{diag}(\mathbf{h}_+) \mathbf{y}_+ - \phi(c)^\top \left(X_-^\top W_- \operatorname{diag}(\mathbf{h}_-) X_-\right)^{-1} X_-^\top W_- \operatorname{diag}(\mathbf{h}_-) \mathbf{y}_-$
subject to  $A_- \mathbf{h}_- = \hat{\mathbf{b}}_-, \quad A_+ \mathbf{h}_+ = \hat{\mathbf{b}}_+, \quad \mathbf{h}_- \in \{0,1\}^{n_-}, \quad \mathbf{h}_+ \in \{0,1\}^{n_+}$
Here, observe that $W_+ \operatorname{diag}(\mathbf{h}_+)$ is a diagonal weight matrix resulting from the product of the weight matrix $W_+$, representing the weights used by our regression estimator (e.g. kernel weights), and $\operatorname{diag}(\mathbf{h}_+)$, built from our optimization variables. Since the $H_i$ are boolean, any value $H_i = 0$ has the effect of “turning off” observation $i$ under the assumption that it is a manipulator. Analogous definitions apply for the left-hand quantities.
3.5 Solving the Optimization Problem
Optimization Problem 1 is a boolean optimization problem with a nonconvex objective. To have any hope of solving this problem, we must relax the final two constraints to make them convex, via:
Optimization Problem 2
Relaxed Formulation
maximize  $\hat{\tau}(\mathbf{h}_-, \mathbf{h}_+)$
subject to  $A_- \mathbf{h}_- = \hat{\mathbf{b}}_-, \quad A_+ \mathbf{h}_+ = \hat{\mathbf{b}}_+, \quad \mathbf{h}_- \in [0,1]^{n_-}, \quad \mathbf{h}_+ \in [0,1]^{n_+}$
where the coordinates of $\mathbf{h}_-$ and $\mathbf{h}_+$ are now confined to the unit interval $[0,1]$ rather than $\{0,1\}$. Note that any solution of the relaxed problem for which all entries of $\mathbf{h}_-$ and $\mathbf{h}_+$ lie in $\{0,1\}$ will also be a solution to the original problem.
Define $\mathbf{h} = (\mathbf{h}_-, \mathbf{h}_+)$ as our optimization variable, the concatenation of $\mathbf{h}_-$ and $\mathbf{h}_+$. Denote the objective as $f(\mathbf{h})$. The nonconvexity of $f$ poses a substantial challenge. We propose a computationally efficient method based on sequential convex programming (Fleury, 1989). The algorithm can be described in a few simple steps.

1. Find a feasible point $\mathbf{h}^{(0)}$ satisfying all the constraints, and set $k = 0$.
2. Repeat until convergence:
   (a) Compute the linearization of $f$ at $\mathbf{h}^{(k)}$: $\hat{f}(\mathbf{h}) = f(\mathbf{h}^{(k)}) + \nabla f(\mathbf{h}^{(k)})^\top (\mathbf{h} - \mathbf{h}^{(k)})$.
   (b) Solve the linearized convex optimization problem: maximize $\hat{f}(\mathbf{h})$ subject to the constraints of the relaxed formulation, and denote its solution as $\mathbf{h}^{(k+1)}$.
   (c) Set $k \leftarrow k + 1$.

The linearization in the above procedure can be computed efficiently by observing that
$$\nabla_{\mathbf{h}_+} f = \left(W_+ \mathbf{1}\right) \circ \left(\mathbf{y}_+ - X_+ \hat{\beta}_+\right) \circ \left(X_+ \left(X_+^\top W_+ \operatorname{diag}(\mathbf{h}_+) X_+\right)^{-1} \phi(c)\right),$$
where $\circ$ denotes the Hadamard product, $\mathbf{1}$ is the length-$n_+$ vector of ones, and $\hat{\beta}_+$ is the weighted regression coefficient vector at the current iterate; the analogous expression, negated, gives $\nabla_{\mathbf{h}_-} f$. Note also that because the optimization variable appears only in convex inequalities and affine equalities, we need not maintain a trust region (Duchi et al., 2018).
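The two ingredients of each iteration can be sketched compactly. The gradient of a weighted least-squares fit at the cutoff with respect to the per-unit inclusion weights has a closed form (derived via $\partial \beta / \partial h_i = M^{-1} w_i x_i (y_i - x_i^\top \beta)$), and, because the equality constraints fix the retained count within each running-variable value while the variables are box-constrained, the linearized subproblem separates across values and admits a greedy, fractional-knapsack solution. This is our own sketch of those two pieces, not code from the paper:

```python
import numpy as np

def fit_and_grad(h, X, y, w, phi_c):
    """Value of the weighted fit at the cutoff, phi_c' beta(h), and its
    gradient in the inclusion weights h."""
    Wd = w * h                                   # combined diagonal weights
    M = X.T @ (Wd[:, None] * X)
    beta = np.linalg.solve(M, X.T @ (Wd * y))
    # d(fit)/d(h_i) = w_i * residual_i * (phi_c' M^{-1} x_i)
    grad = w * (y - X @ beta) * (X @ np.linalg.solve(M, phi_c))
    return phi_c @ beta, grad

def solve_linearized(grad, groups, budgets):
    """Maximize grad @ h subject to sum(h[g]) = b_g for each group g (one
    group per running-variable value) and 0 <= h <= 1, solved greedily."""
    h = np.zeros_like(grad)
    for g, b in zip(groups, budgets):
        order = np.argsort(-grad[g])             # most beneficial units first
        remaining = float(b)
        for j in order:
            take = min(1.0, remaining)
            h[g[j]] = take
            remaining -= take
            if remaining <= 0:
                break
    return h
```

Iterating these two steps on each side of the cutoff (with the left-side gradient entering negated) until the objective stabilizes reproduces the loop above.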
Sequential convex programming is a heuristic and typically yields feasible points with good, but not necessarily optimal, objective values. Hence, the bounds generated by our procedure may be conservative. Such intervals are still informative if, for example, they cross zero. Additionally, in simulations, we frequently find that the solution is insensitive to the starting point, providing some evidence that it may be a global, rather than local, optimum. Finding methods for solving Optimization Problem 2 with stronger theoretical guarantees remains a direction for future research.
3.6 Confidence Sets
The intervals generated by our procedure account for the worst-case estimation bias resulting from the manipulation, but not the variance in the data. We would like to provide a data-driven interval $\mathcal{I}$ such that the true causal effect $\tau$ lies asymptotically within $\mathcal{I}$ with probability $1 - \alpha$ for some small value $\alpha$.
The development of valid confidence sets for regression discontinuity estimation remains an active area of research (Calonico et al., 2014; Bartalotti et al., 2016). In our setting, we have multiple sources of uncertainty: the estimation of the unmanipulated density as well as the estimation of the best and worstcase treatment effects. The most straightforward way to account for these sources is to use the percentile bootstrap (Efron, 1992). A discussion of the statistical issues involved in using the bootstrap in concert with an optimization procedure to yield valid intervals can be found in Zhao et al. (2019), though we do not attempt a robust treatment here.
For $b = 1, \ldots, B$ replicates, we draw $n$ samples with replacement from the original $n$ units. We then repeat the estimation of the unmanipulated density and the optimization procedure, yielding lower- and upper-bound estimates $\hat{\tau}_L^{(b)}$ and $\hat{\tau}_U^{(b)}$. We construct a confidence interval as
$$\mathcal{I} = \left[\, q_{\alpha/2}\left(\hat{\tau}_L^{(1)}, \ldots, \hat{\tau}_L^{(B)}\right),\; q_{1-\alpha/2}\left(\hat{\tau}_U^{(1)}, \ldots, \hat{\tau}_U^{(B)}\right) \,\right],$$
where $q_{\gamma}$ represents the $\gamma$ quantile of the empirical distribution.
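The resampling loop itself is routine; a sketch follows, in which the bound functions stand in for the full density-estimation-plus-optimization pipeline (the function name and signature are ours):

```python
import numpy as np

def bootstrap_interval(data, lower_bound_fn, upper_bound_fn,
                       B=200, alpha=0.05, seed=0):
    """Percentile bootstrap for partial-identification bounds: resample the
    n units with replacement, re-run the full pipeline on each replicate,
    and take outer quantiles of the lower- and upper-bound estimates."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    lowers, uppers = [], []
    for _ in range(B):
        sample = data[rng.integers(0, n, size=n)]   # resampled units
        lowers.append(lower_bound_fn(sample))
        uppers.append(upper_bound_fn(sample))
    return (np.quantile(lowers, alpha / 2),         # outer lower quantile
            np.quantile(uppers, 1 - alpha / 2))     # outer upper quantile
```

In our application, each replicate's bound functions internally re-estimate the unmanipulated density and re-solve the optimization, so the interval reflects both sources of uncertainty.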
4 Bound Tightening via Covariates
This framework can be easily extended to incorporate covariates, which may help to tighten the partial identification bounds. Suppose we have access to some nonmanipulable covariates whose expected values we assume evolve smoothly with the running variable. Under our assumptions, the subset of honest subjects should be randomized locally about the threshold $c$, while the manipulators are definitionally not randomized. If we solve Optimization Problem 2 and find that the solution implies unreasonably large changes in the average covariate value at $c$, or at any other point throughout the manipulation region, this indicates our worst-case or best-case samples contain manipulators.
In practice, we can use this insight to tighten the bounds by imposing further constraints in the optimization procedure. Suppose we have access to a nonmanipulable covariate $V$ (e.g. age), and that we have used the Diamond and Persson procedure to identify a symmetric manipulation region $[c - \Delta, c + \Delta]$ around the cutoff. We consider the typical case in which all the manipulators originate on one side of the threshold (to the left of $c$, let’s say). We assume also that the expected value of $V$ evolves smoothly as a function of the running variable.
We can estimate the conditional mean of $V$ as a function of the running variable using data outside the manipulation region, i.e., below $c - \Delta$ and above $c + \Delta$. This can be done using any of the methods we are using for the causal effect estimation, e.g. local polynomial regression or spline regression. Using the model, we then construct simultaneous confidence intervals for the observed mean of $V$ at each point within the manipulation region. An example is provided in Figure 2, in which a quadratic polynomial fit is used to estimate the dependence of the covariate on the running variable, with Bonferroni-corrected intervals for the mean observed value of $V$ at each value of the running variable in the manipulation region.
These bounds can then be easily integrated into the optimization procedure as additional affine constraints on $\mathbf{h}$. A constraint for a single value $x$ within the manipulation region looks like
$$\ell_x \sum_{i : X_i = x} H_i \;\leq\; \sum_{i : X_i = x} H_i V_i \;\leq\; u_x \sum_{i : X_i = x} H_i$$
for derived bounds $\ell_x$ and $u_x$; multiplying the bounds on the mean through by the retained count keeps both inequalities affine in $\mathbf{h}$. These constraints can simply be appended to the existing optimization problem. Using the one-directional manipulation example above, our problem would look like
Optimization Problem 3
Covariate-Tightened Formulation
maximize  $\hat{\tau}(\mathbf{h}_-, \mathbf{h}_+)$
subject to  $A_- \mathbf{h}_- = \hat{\mathbf{b}}_-, \quad A_+ \mathbf{h}_+ = \hat{\mathbf{b}}_+, \quad \mathbf{h}_- \in [0,1]^{n_-}, \quad \mathbf{h}_+ \in [0,1]^{n_+},$ and, for each value $x$ in the manipulation region above the cutoff, $\ell_x \sum_{i : X_i = x} H_i \leq \sum_{i : X_i = x} H_i V_i \leq u_x \sum_{i : X_i = x} H_i$
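Building the covariate constraint rows is mechanical once the bounds $\ell_x$ and $u_x$ are in hand. A sketch (the function name and the `A_ub @ h <= 0` convention are ours):

```python
import numpy as np

def covariate_constraint_rows(x_obs, v, x_val, lo, hi):
    """Two affine constraint rows encoding
        lo <= (sum_i 1{x_i = x_val} h_i v_i) / (sum_i 1{x_i = x_val} h_i) <= hi.
    Multiplying through by the retained count gives rows satisfying
    A_ub @ h <= 0:
        sum_i 1{x_i = x_val} (v_i - hi) h_i <= 0   (upper bound)
        sum_i 1{x_i = x_val} (lo - v_i) h_i <= 0   (lower bound)"""
    at_x = (np.asarray(x_obs) == x_val).astype(float)
    v = np.asarray(v, dtype=float)
    return np.vstack([at_x * (v - hi), at_x * (lo - v)])
```

One pair of rows is generated per value in the manipulation region and stacked beneath the existing equality constraints.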
A note of caution: these constraints take a probabilistically unlikely event and use it to define a constraint that must be satisfied. The unlikely event is also defined using a model that may itself be biased if we posit an erroneous functional form. Lastly, if constraints are imposed based on several different covariates, then it may be that any one constraint violation is highly unlikely but that collectively at least one violation is plausible.
Based on these considerations, the analyst should be cautious when imposing these constraints. Levels of $\alpha$ should be chosen to be low, and Bonferroni corrections made when appropriate. Covariates should also be chosen that have reasonably clear functional forms, to avoid bias from underfitting.
5 Simulated Example
We step through an illustrative example to demonstrate the utility of our method. We sample a true running variable from a Poisson distribution with mean 20. The cutoff is chosen to be 25, and we suppose that 10%, 20%, and 30% of individuals at three successive values of the true running variable just below the cutoff manipulate their scores. Of these manipulators, 50% manipulate exactly to the threshold of 25, 30% to 26, and 20% to 27. Outcomes are sampled as a function of the true running variable plus independent Gaussian errors. Note that the outcomes are independent of the treatment assignment, indicating no treatment effect.
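A data-generating process in this spirit can be sketched as follows. The specific sending values (22 through 24), sample size, seed, and outcome function here are our own illustrative assumptions, not necessarily the exact values used in the simulation:

```python
import numpy as np

def simulate_rd_with_manipulation(n=10000, cutoff=25, seed=0):
    """Poisson(20) true scores; a fraction of units just below the cutoff
    manipulate up to {25, 26, 27} with probabilities (0.5, 0.3, 0.2);
    outcomes depend only on the true score, so the true effect is zero."""
    rng = np.random.default_rng(seed)
    x_true = rng.poisson(20, size=n)
    send_rates = {22: 0.10, 23: 0.20, 24: 0.30}      # assumed sending values
    targets = np.array([cutoff, cutoff + 1, cutoff + 2])
    target_p = np.array([0.5, 0.3, 0.2])
    x_obs = x_true.copy()
    for val, rate in send_rates.items():
        idx = np.where((x_true == val) & (rng.random(n) < rate))[0]
        x_obs[idx] = rng.choice(targets, size=len(idx), p=target_p)
    y = 0.1 * x_true + rng.normal(0.0, 1.0, size=n)  # assumed outcome model
    return x_obs, x_true, y
```

Because outcomes depend only on the true running variable, any apparent discontinuity estimated from the observed scores is attributable entirely to the manipulation.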
We begin by using a modified version of Diamond and Persson’s method to recover an estimate of the unmanipulated distribution. We fit a Poisson model with coefficients corresponding to the unmanipulated, missing, and excess mass. A Bernstein basis expansion and linear constraint are used to ensure the unmanipulated distribution is log-concave, and the design matrix is constructed such that the missing and excess densities only contribute within the manipulation region below and above the cutoff, respectively. We grid search over the order of the polynomials (two through five) and the width of the manipulation region (one through six units). Cross-validated MSE is used to choose the optimal model, where the folds are defined by individual points on the histogram and we try to predict the height at each point.
The optimal model chosen by the algorithm uses a fourth-order Bernstein polynomial. The optimal radius for the manipulation region is correctly chosen to be four units. The estimated counts, shown by the red line in Figure 3, recover the true counts above the cutoff reasonably well. The estimated and true nonmanipulator counts at each value of the running variable are provided in Table 1.
Running Variable  Observed Count  Estimated Nonmanipulator Count  True Nonmanipulator Count
25  370  240  241
26  328  189  201
27  189  143  117
28  136  104  100
Suppose we are using a third-order polynomial to estimate the conditional means to the left and right of the threshold. The manipulation is such that if we naively fit the model without accounting for the manipulation, we will obtain a causal estimate of $-0.98$. We might mistakenly think this is a real effect: a bootstrap confidence interval with 500 replicates lies entirely below zero, lending credence to the view that the treatment has a negative causal effect.
We run our optimization procedure to obtain partial identification bounds on the causal effect with this manipulation table. The results are summarized in Figure 4. Each plot contains a scatterplot of the outcomes vs. the observed running variable at values near the threshold. We show the model fit to the data below the cutoff in purple and the model fit to the data above the cutoff in blue; a vertical gray line denotes the cutoff.
In the first panel, we can see how a standard analysis would yield a negative causal estimate. In the “Lower Bound” panel, we highlight in red the points identified as worst-case manipulators by our algorithm, and show the resulting model, fit to the data excluding these points, as a dashed blue line. We see the analogous results in the “Upper Bound” panel. The “Together” panel shows all three model fits on one scatterplot.
Several details are immediately noticeable. The bounds now extend between $-2.70$ and $0.70$ – and, crucially, the upper bound lies above zero, casting doubt on the presumed negative causal effect. We see that the manipulators identified in the second and third panels are always among the highest or lowest outcome values at each value of the running variable, but they swap between the running variable values of 26 and 27 so as to optimize the curvature of the polynomial fit.
If we have access to additional covariates, the identified best- and worst-case manipulators may not strictly lie among the highest and lowest outcome values at each value of the running variable. We simulate the case where we have access to an additional covariate, generating different versions such that each has the same variance as the true running variable but a correlation with it ranging from 0.0 to 0.8 (and a linear relationship with the true running variable when the correlation is nonzero). We then follow the procedure laid out in Section 4 with a small alpha level and a (correctly specified) model of the covariate as a function of the running variable.
In Table 2, we show the average of the bounds over five runs of the bound-tightening procedure. We can immediately see that the correlation must be quite high to see any meaningful tightening. At a correlation of 0.8, we do see a modest decline in the interval width. It is plausible that the presence of multiple variables highly correlated with the running variable would yield tighter partial identification bounds.
Correlation  Lower Bound  Upper Bound
0.0  -2.70  0.70
0.2  -2.70  0.70
0.4  -2.69  0.70
0.6  -2.66  0.68
0.8  -2.47  0.60
We can also see in Figure 5 the set of individuals identified as manipulators for the tightened lower bound, imposed using a covariate highly correlated with the running variable. Notably, the set of identified manipulators no longer adheres strictly to the highest and lowest individuals at each value of the running variable.
Lastly, we bootstrap the procedure to estimate full confidence bounds (without covariate tightening). We draw 200 bootstrap replicates from the data and compute upper and lower confidence bounds within each replicate. Taking the 2.5% quantile of the lower bounds and the 97.5% quantile of the upper bounds yields a final interval about one unit wider than our bounds on the point estimates. Again, it bears emphasizing that these intervals cover the true effect of zero, while bootstrap intervals not accounting for the manipulation are bounded away from zero.
6 Conclusions
We have considered a common problem in the analysis of the regression discontinuity design: the presence of manipulators, or individuals who are able to exert precise control over their value of the running variable. The presence of such individuals undermines a key assumption of the RD design: that of local randomization about the threshold. Point identification of the causal effect is thus lost. In its absence, we propose a two-stage method for estimating partial identification bounds when the running variable is discrete.
In the first stage, we estimate the unmanipulated distribution of the running variable, making use of a log-concavity assumption and a method proposed by Diamond and Persson (2016). Combined with an assumption that manipulation may occur to or from a value – but not both – this allows us to estimate the distribution of nonmanipulators in our data. We use this distribution to define an optimization problem to bound the treatment effect, and propose to solve the (nonconvex) problem via sequential convex programming. We also propose methods to tighten the bounds using auxiliary covariates, and we suggest a bootstrap procedure for obtaining confidence bounds.
Our method is quite general and can be used with several common models leveraged in the analysis of the RD design, including local polynomial and spline regressions. We think this method will be of use to applied researchers when analyzing RD designs with clear manipulation. If the bootstrapped bounds share the same sign, this provides particularly strong evidence of a directional causal effect even under the most adversarial manipulation.
There are several obvious next steps for this work. While we believe the current heuristic method provides optimal or near-optimal solutions, we hope to derive certificates of optimality when solving Optimization Problem 3. This would allow us to drop the “conservative” modifier from our partial identification bounds. Relaxing the log-concavity assumption on the unmanipulated density would also help to generalize these methods to new settings.
The extension of our methods to continuous running variables is reasonably straightforward: we need only bin the running variable values. The “trick” lies in choosing appropriate widths for the bins. We hope to develop data-driven approaches for binning the data and thus easily extend these techniques to the common case of a continuous running variable.
Lastly, in this manuscript we only briefly discuss the possibility of using covariates to better identify plausible manipulators. This approach shows some promise in simulations. We hope to extend these methods to incorporate richer covariate data in the future.
7 Acknowledgments
Evan Rosenman was supported by the Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program, and by Google.
References
 Almond and Doyle (2011) Almond, D. and Doyle, J. J. (2011). After midnight: A regression discontinuity design in length of postpartum hospital stays. American Economic Journal: Economic Policy, 3(3):1–34.
 Bartalotti et al. (2016) Bartalotti, O. C., Calhoun, G., and He, Y. (2016). Bootstrap confidence intervals for sharp regression discontinuity designs with the uniform kernel.
 Calonico et al. (2014) Calonico, S., Cattaneo, M. D., and Titiunik, R. (2014). Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica, 82(6):2295–2326.
 Cattaneo et al. (2015) Cattaneo, M. D., Jansson, M., and Ma, X. (2015). A simple local polynomial density estimator with an application to manipulation testing. Technical report, Working Paper.
 Cheng et al. (1997) Cheng, M.-Y., Fan, J., Marron, J. S., et al. (1997). On automatic boundary corrections. The Annals of Statistics, 25(4):1691–1708.
 Chetty et al. (2013) Chetty, R., Friedman, J. N., and Saez, E. (2013). Using differences in knowledge across neighborhoods to uncover the impacts of the EITC on earnings. American Economic Review, 103(7):2683–2721.
 Cohodes and Goodman (2014) Cohodes, S. R. and Goodman, J. S. (2014). Merit aid, college quality, and college completion: Massachusetts’ Adams Scholarship as an in-kind subsidy. American Economic Journal: Applied Economics, 6(4):251–85.
 Dee et al. (2016) Dee, T. S., Dobbie, W., Jacob, B. A., and Rockoff, J. (2016). The causes and consequences of test score manipulation: Evidence from the New York Regents Examinations. Technical report, National Bureau of Economic Research.
 Diamond and Persson (2016) Diamond, R. and Persson, P. (2016). The longterm consequences of teacher discretion in grading of highstakes tests. Technical report, National Bureau of Economic Research.
 Duchi et al. (2018) Duchi, J., Boyd, S., and Mattingley, J. (2018). Sequential convex programming. Dept. Elect. Eng., Stanford Univ., Stanford, CA, USA, Tech. Rep. EE364b.
 Efron (1992) Efron, B. (1992). Bootstrap methods: another look at the jackknife. In Breakthroughs in statistics, pages 569–593. Springer.
 Efron (2012) Efron, B. (2012). Large-scale inference: empirical Bayes methods for estimation, testing, and prediction, volume 1. Cambridge University Press.
 Fleury (1989) Fleury, C. (1989). Conlin: An efficient dual optimizer based on convex approximation concepts. Structural optimization, 1(2):81–89.
 Frandsen (2017) Frandsen, B. R. (2017). Party bias in union representation elections: Testing for manipulation in the regression discontinuity design when the running variable is discrete. In Regression Discontinuity Designs: Theory and Applications, pages 281–315. Emerald Publishing Limited.
 Gerard et al. (2015) Gerard, F., Rokkanen, M., and Rothe, C. (2015). Partial identification in regression discontinuity designs with manipulated running variables. Unpublished manuscript, Columbia University.
 Hahn et al. (2001) Hahn, J., Todd, P., and Van der Klaauw, W. (2001). Identification and estimation of treatment effects with a regressiondiscontinuity design. Econometrica, 69(1):201–209.
 Haile and Tamer (2003) Haile, P. A. and Tamer, E. (2003). Inference with an incomplete model of English auctions. Journal of Political Economy, 111(1):1–51.
 Imbens and Wager (2018) Imbens, G. and Wager, S. (2018). Optimized regression discontinuity designs. Review of Economics and Statistics.
 Imbens and Lemieux (2008) Imbens, G. W. and Lemieux, T. (2008). Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142(2):615–635.
 Kleven and Waseem (2012) Kleven, H. J. and Waseem, M. (2012). Behavioral responses to notches: Evidence from Pakistani tax records. London School of Economics, mimeo.
 Lee (2008) Lee, D. S. (2008). Randomized experiments from non-random selection in U.S. House elections. Journal of Econometrics, 142(2):675–697.
 Lee and Lemieux (2010) Lee, D. S. and Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of Economic Literature, 48(2):281–355.
 Lemieux and Milligan (2008) Lemieux, T. and Milligan, K. (2008). Incentive effects of social assistance: A regression discontinuity approach. Journal of Econometrics, 142(2):807–828.
 Manski et al. (1989) Manski, C. F. et al. (1989). Nonparametric bounds on treatment effects. Social Systems Research Institute, University of Wisconsin.
 Manski and Tamer (2002) Manski, C. F. and Tamer, E. (2002). Inference on regressions with interval data on a regressor or outcome. Econometrica, 70(2):519–546.
 McCrary (2008) McCrary, J. (2008). Manipulation of the running variable in the regression discontinuity design: A density test. Journal of Econometrics, 142(2):698–714.
 Otsu et al. (2013) Otsu, T., Xu, K.-L., and Matsushita, Y. (2013). Estimation and inference of discontinuity in density. Journal of Business & Economic Statistics, 31(4):507–524.
 Scott-Clayton (2011) Scott-Clayton, J. (2011). On money and motivation: A quasi-experimental analysis of financial incentives for college achievement. Journal of Human Resources, 46(3):614–646.
 Sekhon and Titiunik (2016) Sekhon, J. S. and Titiunik, R. (2016). Understanding regression discontinuity designs as observational studies. Observational Studies, 2:173–181.
 Thistlethwaite and Campbell (1960) Thistlethwaite, D. L. and Campbell, D. T. (1960). Regression-discontinuity analysis: An alternative to the ex post facto experiment. Journal of Educational Psychology, 51(6):309.
 Zhao et al. (2019) Zhao, Q., Small, D. S., and Bhattacharya, B. B. (2019). Sensitivity analysis for inverse probability weighting estimators via the percentile bootstrap. Journal of the Royal Statistical Society: Series B (Statistical Methodology).