Numerical optimal control for HIV prevention with dynamic budget allocation^{1}
Abstract
This paper is about numerical control of HIV propagation. The contribution of the paper is threefold: first, a novel model of HIV propagation is proposed; second, the methods from numerical optimal control are successfully applied to the developed model to compute optimal control profiles; finally, the computed results are applied to the real problem yielding important and practically relevant results.
keywords:
HIV, AIDS, Highly active antiretroviral therapy, Treatment as prevention, Preexposure prophylaxis, Optimal control, Resource allocation1 Introduction
Since the outbreak of the global HIV/AIDS pandemic in the early 1980s about 35 million people died from AIDSrelated illnesses [1]. Although the annual number of new cases of HIV (incidence) has been decreasing globally, among men who have sex with men (MSM) in developed countries incidence is increasing [8, 20]. The current public health challenge in these populations is effective triage of limited prevention resources. Two recent advancements in HIV prevention science are relevant to this issue. First, new forms of highly active antiretroviral therapy (HAART) are so effective that viral load in infected patients become undetectable and that patients on effective treatment are essentially noninfectious [13]; we will refer to this as the Treatment as Prevention (TaP) modality. Second, new treatments targeted at uninfected persons have been shown to reduce the rate of infection to nearly zero when treatment is adhered to [4], which is referred to as PreExposure Prophylaxis (PrEP). We have therapies that are effective at both blocking transmissions from infected persons and protecting uninfected persons from becoming infected; however, there is little consensus on how these technologies should be deployed [12, 14]. The main obstacle is that while the individuallevel efficacy of these interventions can be ascertained with controlled trials, their effectiveness as public health interventions cannot. This is due to the fact that populations are different from one another both in terms of the fundamental transmission dynamics of HIV, but also in the availability of prevention resources. Identifying the optimal strategy of resource allocation must be based on a model of the underlying medical, biological, and social processes that captures the relevant features of the population.
Theoretical studies of intervention effectiveness are generally based on either statespace models represented by a system of ordinary differential equations (ODE) (often referred to as ‘compartmental‘ models in the epidemiology literature, [21, 19]) and agentbased models where largepopulations of individuals with complex behavioral patters are directly simulated. The tradeoff between these approaches exchanges verisimilitude in the agentbased formation for efficient computation in the compartmental formulation. Due to the computational challenges in calculating optimal allocation strategies, compartmental models are generally preferred [22, 30, 25, 17]. In this paper, we present a method for calculating the optimal allocation strategy for a hypothetical health agency in a major US city, which receives funding to set up a longterm HIV prevention program for MSM in that city. Our analysis is based on a compartmental model of HIV transmission that includes natural history of infection, dynamic risk behavior, and partner preference components. For the sake of simplicity we restrict the interventions the health agency can allocate resources to TaP and PrEP. This in turn allows us to address the important question to which extent the investment into PrEP for MSM should be scaled up in the future, which is an open question in HIV prevention policy. While this type of model is in widespread use, our approach to modeling the enrollment of infecteds and highrisk susceptible into ART and PrEP, respectively, and our ability to find dynamic optimal resource allocation patterns over long time horizons is unique.
The described problem of determining the optimal allocation strategy can be formulated as an optimal control problem with a specific set of constraints. We developed an efficient numerical scheme to deal with this problem. The described approach is very general and can be applied to a wide class of optimization problems. To facilitate the reuse of the proposed scheme we provide a detailed description of all steps and indicate possible extensions and ramifications of the method. When solving optimal control problems, there are two classes of methods: indirect and direct ones. Indirect methods employ the firstorder optimality condition to formulate a twopoint boundaryvalue problem for the system of Hamiltonian equation, [10], which is solved, either analytically or numerically (see, e.g., [26, 34]). Analytical solutions to these problems are available only for systems of low order and will not be treated here. The use of numerical schemes is however also restricted. Numerical implementations of indirect methods are often numerically unstable and require a good initial guess which is in most cases not available, [3]. Furthermore, many nonstandard constraints cannot be addressed within the framework of the classical optimal control theory. This is in particular true for the problem considered in the paper. Thus one has to resort to direct methods.
Direct methods attempt to directly minimize the cost function using constrained nonlinear programming. Within the class of direct methods, there are different possible approaches, most notable are shooting methods and simultaneous methods (see [27, 7]). Shooting methods parametrize the controls and use numerical simulation to obtain the solution of the system’s equations on the whole interval (single shooting) or on a number of subintervals (multiple shooting), [33]. Nonlinear programming is then used to optimize the control parameters while obeying the constraints. The multiple shooting method has proven to be very efficient for solving many practical problems. However, the necessity of numerical integration of system’s equations makes this method very time consuming. Simultaneous methods parametrize both the controls and the system’s trajectory and determine the missing values of parameters by solving a (typically) large system of nonlinear algebraic equations. All unknown parameters are determined simultaneously thus giving the name to this class of methods. Along with system’s equations, one describes the constraints using the introduced parametrization. This is referred to as the transcription procedure. The optimal control problem is thus formulated as a large scale nonlinear optimization problem, [15].
A modification of the latter method was employed in the paper. It was used it to compute a set of optimal allocation strategies for a particular scenario of HIV propagation for different parameters of the treatment.
The paper is organized as follows: in Section 2, a model of HIV propagation is derived and analyzed, Section 3 describes the optimal control problem of resource allocation for HIV treatment and prevention while Section 4 describes the numerical scheme used to solve the optimal control problem. Finally, Section 5 presents and analyzes the numerical results.
2 Epidemiological model
2.1 Derivation of the model
We base our approach on a population balance model. This means that we divide the whole population of MSM in the respective city into a number of groups. All individuals within a given group are assumed to be identical in their evolution. The state variables of the model correspond to the number of people within each group. These are described in Table 1, together with other important notation used throughout the paper. The dynamics of each state variable can be described by the following differential equation:
where and describe the in and outflows. That is the number of people that enter or leave the respective group within the unit time interval. The disaggregation is based on whether an individual is infected and in which stage, what is his risk behavior, and whether he receives treatment and which one. In our model, the time unit is set equal to [month].
Size variables:  

Dimension of the state vector  
Number of solution segments  
Number of collocation points  
Dimension of the control vector  
Points:  
Knot points  
Grid points  
State vectors:  
State evolution of the system,  
Polynomial interpolation of through the grid points ,  
The solution of the uncontrolled system on the th interval,  
Matrix of the state values at grid points  
State variables:  
untreated susceptible individuals  
untreated infected individuals  
infected individuals on TaP  
susceptible individuals on PrEP  
individuals deceased due to AIDS  
all individuals  
all individuals displaying a given risk behavior  
Indices:  
Infectious stage (acute/chronic)  
Risk status (high/low) 
The structure of the flows within the system is shown in Fig. 1, the model parameters are summarized in Table 2. Transitions between states happen due to individuals becoming infected, progressing from acute to chronic stage, and dying of AIDS (). Moreover, individuals change their risk behavior (, ), are put on PrEP or TaP treatment or cancel it (, ). Finally there is flow into the system due to individuals reaching an age of sexual activity and outflow from the system because of nonHIV related death or individuals becoming sexual inactive or settling in a monogamous, lifelong relationship.
Infection parameters:  

,  transmission rate for highrisk resp. lowrisk susceptibles 
rate that acutely infected individuals become chronically infected  
rate that chronically infected individuals die due to AIDS  
,  contact rate of highrisk resp. low risk individuals 
\hdashline,  infection probability per act for acutely resp. chronically infecteds 
probability that a sexual contact takes place at the respective preferred sites  
Treatment parameters:  
x  rate at which PrEP fails or is canceled 
baseline enrollment rate into TaP  
y  rate at which TaP fails or is canceled 
Other parameters:  
,  recruitment rate of new highrisk resp. lowrisk susceptibles 
rate that adults die of nonHIV related causes, reach an age of  
sexual inactivity, or settle in a monogamous, lifelong relationship  
rate that highrisk persons become lowrisk  
rate that lowrisk persons become highrisk 
Parameters used in the model were selected from studies of general MSM populations from the United States. Behavioral parameters including the proportion of the population that is highrisk, the contact rate ratio of high and lowrisk individuals, and the rate at which highrisk individuals become low risk and vice versa were taken from an analysis of longitudinal sexual contact rate data of unaffected gay men in 3 large cities in the United States [28].
Finally, we selected values of and such that the endemic equilibrium prevalence is about 20% and about 25% of infected individuals are on effective treatment, which is consistent with an average large MSM population in the United States [32, 29]. However, there are several important aspects of the model which deserve particular attention. These are described below.
Mixing
To model infection events (which take place at rate resp. in our model), we assume that transmissions occur at three distinct sites:

locations frequented exclusively by highrisk individuals

locations frequented exclusively by lowrisk individuals

locations jointly used by both low and highrisk individuals
The total rate of contact of high resp. lowrisk individuals is denoted by resp. . Hereby, both risk groups make a proportion of their contacts at the site which is exclusively frequented by their own risk group and the remainder of contacts at the mixing site.
To derive and , we omit the time dependency in the state variables for now. Then, the total number of contacts made at a given time is given by
with
The probability that, given a random individual has a contact, this contact is with a high resp. low risk individual is
Then, denoting the probability of transmission per contact by for acutestage infecteds and for chronicstage infecteds, the probability of transmission per contact at the mixing site is
Consequently, the percapita instant rate of a highrisk susceptible becoming infected at the common site is
Moreover, the percapita instant rate of a highrisk susceptible becoming infected at the highrisk preferred site is
Therefore, the total percapita transmission rate for highrisk susceptibles is given by
Analogously, the percapita instant rate of a lowrisk susceptible becoming infected at the common site is
and the percapita instant rate of a lowrisk susceptible becoming infected at the lowrisk preferred site is
The total percapita transmission rate for lowrisk susceptibles is given by
Enrollment
For the enrollment on PrEP and TaP, we assume that both is done by randomly sampling individuals at locations where highrisk individuals resp. chronically infecteds are prevalent. In case a sampled individual turns out to be a highrisk susceptible resp. chronically infected, he is urged to enroll in PrEP resp. TaP.
That is, for PrEP we assume that there is a highrisk environment (HRE) where high risk individuals are overrepresented, like bars or sex clubs. The recruitment of the patience then would be carried out during the period in which the location is strongly frequented, i.e., typically in the evening on weekends. Then, denoting the probability that a random high resp. low risk individual is in a HRE at a random moment during the recruitment period by
The probability of a random individual encountered in a HRE at a random moment to be a highrisk susceptible is
where . That is, is the odds of a highrisk person to go to a HRE compared to a lowrisk person, which we assume to be in our work. Consequently, if is the relative rate at which individuals are sampled at HREs and then put on PrEP in case they are high risk susceptibles, the absolute rate for transition from to is
Using the same modeling approach for TaP, we obtain the absolute rate for transition from to beyond the baseline to be with
Analogously, the absolute rate for transition from to beyond the baseline is with
Dynamical system
Based on the considerations presented in the previous paragraphs, we obtain the following system of ODEs describing the dynamics of our system:
(1)  
where is the vector of state variables; is the sum of all states, , is the column of ones; , , , , and are the nonlinear (rational) functions of which take on nonnegative values for any ; and are the control inputs which correspond to the fraction of total population being involved either in PrEP () or in TaP (), and all the remaining terms are nonnegative constants.
Below, we analyze an important property of the system (1) that will be used later on.
2.2 Nonnegativity
Consider the control system
(2) 
where , and .
The system (2) is said to be nonnegative if any solution starting at from belongs to for all , i.e., , , (see [18, 19] for more details).
Definition 2.0.
Let . Then is essentially nonnegative if , for all , and such that , where denotes the th element of .
We have the following result:
Theorem 2.2.
The system (2) is nonnegative for any nonnegative control if and are essentially nonnegative.
Proof.
This result can be proved geometrically by examining the direction of the vector field on the bounding hyperplanes . The essential nonnegativeness condition guarantees that on each the system’s vector field points towards the positive orthant . ∎
In our case, we can readily observe that and as well as , , and are positive for all . Also, we have that the latter three functions turn to zero when, respectively, , or is equal to zero. Thus the right hand side of (1) is essentially nonnegative and the system’s dynamics is nonnegative either.
3 Optimal control problem
The optimal control problem is to minimize the total cost while respecting certain structural and budgetary restrictions. The instantaneous cost is defined as and the total cost to be minimized is
(3) 
where is the time horizon which is chosen to correspond to the duration of the intervention. In the following we set the initial time to 0, but the problem can be easily modified to account for an arbitrary initial time. Note that a possible problem statement could include a discounting factor , which is typically used to describe our priorities: one may attach greater importance to decreasing the incidence in the near future while paying less attention to what will happen in the farther future. This factor could be used to discount some of the future benefits for shorter term benefits to account to for uncertainty in the validity of the model assumptions in the far future.
Remark 3.0.
Note that the optimization problem with the cost functional (3) is wellposed if is nonnegative for all admissible values of and . Since the system (1) is nonnegative, an instantaneous cost function defined as a linear combination of the state variables (as it is typical in epidemiological applications) would satisfy this requirement.
The restrictions imposed on the system are twofold:
Restriction on the admissible control policies
The first class of restrictions is due to the structural limitations of the decision unit. Since the intervention is performed by a medical organization the control profile must be sufficiently regular. We assume that the set of admissible controls consists of piecewise constant functions with a fixed interval between two consecutive switches. This restriction can be easily relaxed in different ways: we can assume that the switches occur at some nonregular times, that the control is not piecewise constant, but rather piecewise linear (or even piecewise continuous) and so forth.
Let , be the time instants at which control switches occur along with the initial and final time. We assume that for any , the duration of the respective interval is constant: . In practice, is chosen to be a multiple of 1 year. The set of admissible controls is thus:
(4) 
The controls are assumed to be rightcontinuous at . In this way, a continuous control is described by a set of its discrete values . The goal of the optimization is to determine these values in order to minimize the cost function (3) while respecting the constraints.
Note that the controls are bounded by zero from below, but there are no upper bounds. The upper bounds are imposed implicitly as will be described below.
Dynamic budget allocation
The second class of constraints is due to the budgetary limitations. In practice, an intervention incurs large expenses which are compensated by the government only to some extent. We assume that at the beginning of each control interval the government sets a baseline budget by estimating the expected expenses and allocates the money to be spent for the intervention starting from this baseline. This works as follows:
The total expenditures related to treating people with TaP or PrEP and the enrollment costs for TaP or PrEP are captured by the following cost function:
where are certain positive defined functions. In our case, we formulate the optimal control problem to minimize the incidence of HIV infection. The instantaneous cost is thus defined as the incidence rate of HIV, i.e.,
Furthermore, when computing the budgetary restrictions we assumed
for and . Here, and are the monthly cost for treatment with TaP and PrEP, respectively, per patient. The coefficients and represent the costs for approaching and, if necessary, enrolling one patient into TaP and PrEP, respectively.
We compute this cost for the uncontrolled case to determine the baseline expenses, i.e., the expenses that the government would defray if there is no intervention. The assigned budget is allocated atop the baseline budget. Let , be the uncontrolled () solution of (1) with initial condition . The dynamic budget constraint (DBC) is thus formulated as follows:
(5) 
where and
The constraints (5) set an implicit limit to the set of admissible controls . In terms of optimal control theory such constraints can be classified as mixed integral inequality path constraints. There is in general no way to handle such constraints analytically, but they can be treated numerically as will be shown below.
Finally we formulate the resulting optimization problem as follows. Determine the discrete values of control , s.t.
(6) 
4 Numerical solution of the optimal control problem
Consider a dynamic system (2). From now on we will follow the established convention and will assume that the state is a row vector. Both and are thus rowvalued vector functions.
A collocation method interpolates the state and the control functions at a number of time points (called grid points), and requires the solution to satisfy the respective differential equation at the collocation points which may not necessarily coincide with the grid points. Some grid points can be used to ensure additional conditions on the solution, e.g., continuity.
4.1 Lagrange interpolation
The optimal state trajectory is a piecewise smooth function whose first derivative is discontinuous at the points . Therefore it is natural to break it into intervals coinciding with and interpolate on each interval separately using the basis of Lagrange polynomials, [6]
where is the number of collocation points within the th interval
Figure 2 shows a family of Lagrangian polynomials defined on a nonuniform grid. Notice that for any grid point there is only one polynomial that takes on a nonzero value (which is equal to 1) at this point.
Let be the state trajectory for . We define to be the matrix of the values of the state at times , i.e., , . The interpolating polynomial for the th component of the state over the th interval is thus
It is well known that for a regular (i.e., equispaced) grid the interpolating polynomial may fluctuate heavily between the interpolation points, especially close to the endpoints of the interval (this is referred to as the Runge phenomenon, see, e.g., [16]). To overcome this drawback one uses unevenly spaced grid points whose distribution density increases as we approach the endpoints of the interval. There are two standard choices for the grid points: Legendre and Chebyshev points which are zeros of Legendre or Chebyshev polynomials. These polynomials belong to the class of orthogonal polynomials thus giving the name to the method (orthogonal collocations), [16, 11].
Figure 3 illustrates this thesis. It shows a sample trajectory along with its Lagrange interpolation obtained using 3 and 5 nonuniformly distributed points per interval. It is seen that the interpolation becomes nearly exact already with 5 grid points. In contrast to this, when using regular grid points the interpolating function may deviate from the approximated trajectory by several orders of magnitude.
There has been an extensive discussion regarding the merits and drawbacks of either of two choices (see, e.g., [35] and references therein). However, it seems that neither of the two is clearly superior to another one. In this study, we have chosen to use the Legendre points.
4.2 Collocation at LegendreGaussRadau points
In the previous step the state was described using Lagrange interpolation polynomials which go through values of the state at . Determining the values corresponds to determining an approximation of the solution of (2). To do so the derivatives of the interpolating polynomials are computed at points , , i.e., and the computed derivatives are required to collocate with the righthand sides of (2) computed for . Additionally, the continuity of the trajectory can be ensured by requiring the interpolating polynomials to be attached to each other at the knot points , : .
To get the required distribution of points we use a particular class of Legendre points, referred to as the LegendreGaussRadau (LGR) points, which are the roots of , where is the th degree normalized Legendre polynomial, . The LGR points have the property that one of these points coincides with the left endpoint of the interval, i.e., the respective polynomial has a root at
Remark 4.0.
Note that alternatively one can use the LegendreGauss (LG) or the LegendreGaussLobato (LGL) points which are determined as the roots of or as the roots of together with . The LG points lie completely within the interval while the LGL points include both endpoints. That is, the LG are defined on and the LGL points on .
Any can be associated with by the affine transformation
(7) 
Consider the LGR points , with . One can map the LGR points to the respective grid points using (7), i.e., . Following the procedure described above we use Lagrange polynomials to approximate the state trajectory at points . For the th component of the state this results in a polynomial of degree at most equal to :
(8) 
Differentiating (8) and evaluating at the collocation point we get
(9) 
where is an differentiation matrix whose th element is the derivative of the Lagrange polynomial at the collocation point , . Note that we do not collocate at the knot points . The differentiation matrix extends the idea of finite difference approximations of derivatives which are computed based on the trajectory evaluation at a finite number of points. In our case, the approximation of the derivative is a function of the state at all the grid points.
Remark 4.0.
In [15], it was shown that the differential matrix turns out to be singular when the number of collocation points is equal to the number of grid points (here denoted by ). The reason for this is that while the interpolating polynomial is degree its derivative is degree and so, requires only conditions to be uniquely determined. Thus the number of collocation points has to be one less than the number of grid points.
To overcome this difficulty, it is proposed in [15] to use the set of LGR or LG points along with a boundary point or . The interpolating polynomial is computed for the extended set of points while the collocation is carried out only at the LGR, resp. LG points.
The flaw of this approach is that it results in a set of grid points which is not longer produced by an orthogonal polynomial. This leads to a decrease in the accuracy of the resulting polynomial interpolation. The obvious remedy is to compute the interpolating polynomial for the whole set of orthogonal points while collocating at all but one point. The remaining point can be used to enforce the continuity condition as it is done in this paper.
The Lagrange basis polynomials defined for the LGR points can be written in barycentric form [6] as
whence (see [31] for the derivation)
Remark 4.0.
Note that the Lagrange basis polynomials are invariant with respect to the shift or the dilatation of the abscissa axis. Thus the approximate differentiation matrix does not change for different intervals provided the number and the type of grid points do not change. However, the use of the affine transformation (7) implies that the system’s differential equations should be modified accordingly to take into account the transformed time variable. In practice, this means that the righthand sides of (2) are to be multiplied with the correction factor .
Denoting by the approximate values of the derivatives of at collocation points we write compactly . We also compute the derivatives of the state trajectory by evaluating the righthand side of the ODE (2). We write and the resulting set of constraints is hence
(10) 
4.3 Defect constraints
While (10) determine values of samples of the state trajectory, there are still free ’s which can be used to ensure continuity of the trajectory. These are the values of at the knot points , . The initial value determines the first points: . The remaining values should be determined from the defect constraints which can be written as follows:
The second term in the lefthand side is and the first term is
(11) 
The integral in (11) can be computed numerically using Gaussian quadrature:
(12) 
where are the weights computed for the given distribution of grid points, [11, 16].
Now we have constraints to determine the same number of discrete values of the state function. Solving these equations is equivalent to getting a numerical solution of the respective differential equations. In other words, we reduced the procedure of integrating the system of ODEs to solving a system of nonlinear algebraic equations.
Now we have the solution for given values of the control , . Before proceeding to the optimization we have to formulate the constraints imposed on the controls.
4.4 Numerical approximation of dynamic budget constraints
To compute the dynamic budget constraints one has to solve the system’s ODEs with zero controls for intervals with initial conditions determined by the solution of the controlled system. This problem can be formulated within the considered framework in the following way. Let be state values at the grid points within the th interval. We set and the remaining discrete values , are determined from
(13) 
The resulting set of inequality constraints is thus
4.5 Constrained nonlinear programming problem
The cost function (3) is computed numerically using Gaussian quadrature and hence the optimal control problem (6) turns into the following constrained optimization problem:
The above nonlinear optimization problem was implemented in Matlab with the use of fmincon function. The optimization was performed using Sequential quadratic programming (SQP) algorithm, see Sec. 5.1 for details.
5 Results
We calculated the optimal allocation strategy for an outbreak scenario in a large US city with 100,000 atrisk MSM individuals. That is, the introduction of HIV into a subpopulation with a low prevalence (e.g. people aged 1525) is considered. We assume the infection probability to be = 0.015 and = 0.001, i.e., acutely infecteds are 15 times more contagious than chronically infecteds [5]. Furthermore, we consider a situation in which risk is static, that is, individuals do not change their risk behavior over time, . Hereby, 90 % of the MSM population exhibits a low risk behavior, the remaining 10 % display highrisk behavior. Hence, we set to be nine times so that the probability that a newly entering individual is highrisk is 10%.
We set , leading to the individuals staying 30 years in the system on average if there was no HIVrelated removal. Here removal represents either natural death, becoming sexually inactive for medical or social reasons, or settling in a monogamous relationship of two uninfected individuals. There is little data available on the mixing patterns among high and lowrisk individuals that we could use to fix . However, we assumed that which is a common assumption when mixing dynamics are unknown. Furthermore, we assume that individuals from the high risk group have ten times more sexual contacts than the ones from the low risk group based on analysis of longitudinal sexual behavioral data [28]. Finally, we optimized over values of and such that at equilibrium the prevalence was 20% and the proportion of infected individuals on treatment was 25%, which is consistent with measured values [32, 29], leading to = 40.9, = 4.09, and = 0.00148. We also assume that treatment never fails and is never stopped, i.e., .
We consider four scenarios varying in the value of : 0, , , . That is, in one scenario PrEP never fails and is never canceled and in three scenarios this happens on average after 5, 2, and 1 year, respectively. These scenarios are meant to correspond to different distribution policies. More precisely, we assume that PrEP never fails and that a high risk individual, once identified, is prescribed PrEP either indefinitely or is only provided with it for 5, 2, and 1 year, respectively. The parameters of the cost function are given in table 3. The cost of enrollment was estimated by considering the total cost per enrollee of a similar intervention program implemented by the New York City Department of Health [9] plus the cost of the labs involved in determining infectious status. The cost of TaP and PrEP is taken from [2].
1299  776  266  213 
The trajectories of the controls are shown in Fig. 4, the ones of the number of individuals in the various states in Fig. 57. We can observe that the lower the value of is, the lower is the total number of newly infecteds and the higher is both the overall values of relative to and the number of individuals on PrEP. These two dependencies are to be expected as we try to optimize the allocation strategy of the agency in charge of fighting HIV in the considered city and this agency does not cover the longterm cost of prescribing PrEP, but only the enrollment costs. Therefore, a lower should render the overall intervention more effective and the agency should become more prone to employ PrEP more intensively.
For , elimination is nearly achieved, whereas for the other values of this is not the case. That is, only the value allows the agency to push the epidemic over the tipping point where the intervention leads to the infection cycle to break down. Moreover, the difference in the number of infecteds over time is much more pronounced between the scenario with and the three scenarios with than between the three scenarios with . Most extreme, the difference in the total number of infecteds is only 0.2 % between and . This is probably due to buffering effects also observed for other exogenous variables [24] leading to a low influenceability of the system for these values of .
Since only the consequences unfolding over the 50 year period considered are taken into account by the cost function, the choice of how to allocate the resources to enrollment into TaP and PrEP, respectively, becomes myopic to the end of the considered period. That is, for and resources are jammed into PrEP towards the end although letting the number of infecteds on treatment drop would ultimately backfire, i.e., leading to number of infecteds higher than necessary after a while. The same effect with the roles of TaP and PrEP switched can be observed for . Moreover, since the number of infecteds or highrisk susceptibles declines over time when TaP or PrEP is favored, successful enrollment into TaP or PrEP, respectively, becomes more and more expensive, eventually favoring the other treatment. This can be observed for , , and . For , TaP is favored over PrEP the whole period except for the switch at the very end.
It is therefore advantageous to modify the optimization problem to avoid such unrealistic results. However, as this reformulation is not always feasible, a possible remedy would be to to discard a period long enough at the end of the time interval for which the controls were optimized.
There are two heuristic approaches to accomplish this: (a) Calculating the costs for an onlyPrEP and an onlyTaP strategy and determining when the last switch between these strategies in terms of which strategy incurs less costs (normally there is only one) takes place. The length of the period discarded is then chosen to be equal to the duration until this switch. (b) Running the optimization procedure on intervals of different length and checking whether the same type of strategy switch taking place at the same distance to the end of the respective calculation period occurs. If yes, this switch is an numerical artifact and needs to be discarded.
5.1 Numerical implementation
The described optimization problem was solved both using the multiple shooting (not described in the paper) and the orthogonal collocation methods. In general, the orthogonal collocations approach is about 35 times faster than the multiple shooting one
Practice shows that it is in general advantageous to choose an initial guess that provides a low value of the cost function while violating constraints. SQP algorithm recovers from the constraints violation in a couple of steps while keeping the cost function relatively small. If on the contrary, one chooses the initial guess in order to satisfy the constraints, the deviation from the optimal value of the cost function may turn out to be rather large. As the convergence of the algorithm is rather slow, it may take unnecessary many steps to achieve the optimal value.
The slow convergence of the algorithm is explained by the particular nonlocal structure of the budget constraints. Since these constraints are computed along the tobeoptimized trajectory, any change of the control variables leads to the variation of the trajectory and thus to the recomputation of the constraints. As an example, a little change of the control during the first interval influences the budget constraints all over the whole optimization interval.
To speed up the computation the gradient of the cost function was supplied to the optimization algorithm. The Hessians were computed numerically using centered finite differences due to the complexity of the respective analytical expressions.
6 Conclusions
This paper presents a novel model describing an HIV propagation dynamics for a geographically concentrated MSM population, along with two control actions. The two controls represent allocation of funding to the two major drugbased interventions available for HIV, HAART / TaP and PrEP, by an health agency. Addressing this setting is of particular interest as it is heavily debated to which extent PrEP should enter the intervention portfolio of health care systems [25, 4, 23, 36]. A suitably modified orthogonal collocations method is applied to compute optimal control profiles for a realistic outbreak scenario in a large US city. Hereby, the influence of the effective cost of PrEP on the optimal allocation strategy and the dynamics of the epidemic is studied, by varying a parameter governing said cost. The obtained results show that the allocation pattern heavily depends on the cost of PrEP, rendering PrEP the dominant intervention or not employed at all depending on said cost. Moreover, whether elimination of HIV in the considered population is achievable also is dependent on the effective cost of PrEP.
Currently, we work on improving the developed algorithm in order to introduce it into the epidemiological community. A manuscript aimed at a public health health audience illustrating use of these methods is currently prepared.
References
Footnotes
 The work of O. S. Serea was partially supported by The French National Research Agency, Project ANR10BLAN 0112
 journal: Applied Mathematics and Computation
 In the following, the upper index will refer to the number of the respective interval , . We will drop this superscript when the reference to a specific interval is not relevant. Furthermore, the lower index will refer to the element of the state vector, i.e., .
 One can also define LGR points which include the right endpoint instead the left one.
 For the optimization time interval equal to 50 years the computational time for orthogonal collocations was typically within the range of seconds while for multiple shooting this time was about seconds. Note that the computational time is strongly influenced by the length of the optimization interval and by the choice of the initial guess.
References
 WHO, Global Health Observatory (GHO) data, HIV/AIDS. http://www.who.int/gho/hiv/en/.
 Sabina S. Alistar, Douglas K. Owens, and Margaret L. Brandeau. Effectiveness and cost effectiveness of oral preexposure prophylaxis in a portfolio of prevention programs for injection drug users in mixed HIV epidemics. PLoS One, 9(1):e86584, 2014.
 Uri M Ascher, Robert MM Mattheij, and Robert D Russell. Numerical solution of boundary value problems for ordinary differential equations. Classics in Applied Mathematics. SIAM, 1994.
 Jared M. Baeten, Deborah Donnell, Patrick Ndase, Nelly R. Mugo, James D. Campbell, Jonathan Wangisi, Jordan W. Tappero, Elizabeth A. Bukusi, Craig R. Cohen, and Elly et. al. Katabira. Antiretroviral Prophylaxis for HIV Prevention in Heterosexual Men and Women. New England Journal of Medicine, 367(5):399–410, August 2012.
 Steve E. Bellan, Jonathan Dushoff, Alison P. Galvani, and Lauren Ancel Meyers. Reassessment of HIV1 acute phase infectivity: Accounting for heterogeneity and study design with simulated cohorts. PLoS Med, 12(3):e1001801, 2015.
 JeanPaul Berrut and Lloyd N Trefethen. Barycentric Lagrange interpolation. SIAM Review, 46(3):501–517, 2004.
 John T Betts. Practical methods for optimal control and estimation using nonlinear programming. SIAM, 2nd edition, 2010.
 Chris Beyrer, Stefan D Baral, Chris Collins, Eugene T Richardson, Patrick S Sullivan, Jorge Sanchez, Gift Trapence, Elly Katabira, Michel Kazatchkine, Owen Ryan, Andrea L Wirtz, and Kenneth H Mayer. The global response to HIV in men who have sex with men. The Lancet, 388(10040):198 – 206, 2016.
 Susan Blank, Kathleen Gallagher, Kate Washburn, and Meighan Rogers. Reaching out to boys at bars: utilizing community partnerships to employ a wellness strategy for syphilis control among men who have sex with men in New York City. Sexually transmitted diseases, 32:S65–S72, 2005.
 V.G. Boltyanskiy, R.V. Gamkrelidze, Y.F. Mischenko, and L.S. Pontryagin. Mathematical theory of optimal processes. John Wiley & Sons, 1962.
 Claudio Canuto, M Yousuff Hussaini, Alfio Maria Quarteroni, and Thomas A Zang. Spectral methods. Fundamentals in Single Domains. Springer, 2006.
 Myron S. Cohen, Christopher Dye, Christophe Fraser, William C. Miller, Kimberly A. Powers, and Brian G. Williams. HIV treatment as prevention: Debate and commentary. will early infection compromise treatmentasprevention strategies? PLoS Med, 9(7):e1001232, 2012.
 Deborah Donnell, Jared M Baeten, James Kiarie, Katherine K Thomas, Wendy Stevens, Craig R Cohen, James McIntyre, Jairam R Lingappa, Connie Celum, Partners in Prevention HSV/HIV Transmission Study Team, et al. Heterosexual HIV1 transmission after initiation of antiretroviral therapy: a prospective cohort analysis. The Lancet, 375(9731):2092–2098, 2010.
 Jeffrey W Eaton, Leigh F Johnson, Joshua A Salomon, Till Bärnighausen, Eran Bendavid, Anna Bershteyn, David E Bloom, Valentina Cambiano, Christophe Fraser, Jan AC Hontelez, et al. HIV treatment as prevention: systematic comparison of mathematical models of the potential impact of antiretroviral therapy on HIV incidence in south africa. PLoS Med, 9(7):e1001245, 2012.
 D. Garg, M. Patterson, W.W. Hager, A.V. Rao, D.A. Benson, and G.T. Huntington. A unified framework for the numerical solution of optimal control problems using pseudospectral methods. Automatica, 46(11):1843–1851, 2010.
 Walter Gautschi. Numerical analysis. Springer, 2011.
 J Griffiths, T England, and J Williams. Analytic solutions to compartmental models of the HIV/AIDS epidemic. IMA Jjournal of Mathematics Applied in Medicine and Biology, 17(4):295–310, 2000.
 Wassim M Haddad and VijaySekhar Chellaboina. Stability and dissipativity theory for nonnegative dynamical systems: a unified analysis framework for biological and physiological systems. Nonlinear Analysis: Real World Applications, 6(1):35–65, 2005.
 Wassim M Haddad, VijaySekhar Chellaboina, and Qing Hui. Nonnegative and compartmental dynamical systems. Princeton University Press, 2010.
 H. Irene Hall, Ruiguang Song, Philip Rhodes, Joseph Prejean, Qian An, Lisa M. Lee, John Karon, Ron Brookmeyer, Edward H. Kaplan, Matthew T. McKenna, and Robert S. Janssen. Etimation of HIV incidence in the United States. JAMA, 300(5):520–529, 2008.
 John A Jacquez and Carl P Simon. Qualitative theory of compartmental systems. Siam Review, 35(1):43–79, 1993.
 C. C. Kerr, R. M. Stuart, R. T. Gray, A. J. Shattock, N. FraserHurt, C. Benedikt, M. Haacker, M. Berdnikov, A. M. Mahmood, S. A. Jaber, M. Gorgens, and D. P. Wilson. Optima: A model for HIV epidemic analysis, program prioritization, and resource optimization. J. Acquir. Immune Defic. Syndr., 69(3):365–376, Jul 2015.
 Kenneth K Mugwanya, Deborah Donnell, Connie Celum, Katherine K Thomas, Patrick Ndase, Nelly Mugo, Elly Katabira, Kenneth Ngure, and Jared M Baeten. Sexual behaviour of heterosexual men and women receiving antiretroviral preexposure prophylaxis for HIV prevention: a longitudinal analysis. The Lancet Infectious Diseases, 13(12):1021–1028, December 2013.
 Kimberly A. Powers, Mirjam E. Kretzschmar, William C. Miller, and Myron S. Cohen. Impact of earlystage hiv transmission on treatment as prevention. Proceedings of the National Academy of Sciences, 111(45):15867–15868, 2014.
 N. Punyacharoensin, W. J. Edmunds, D. De Angelis, V. Delpech, G. Hart, J. Elford, A. Brown, O. N. Gill, and R. G. White. Effect of preexposure prophylaxis and combination HIV prevention for men who have sex with men in the UK: a mathematical modelling study. Lancet HIV, 3(2):e94–e104, Feb 2016.
 Radoslaw Pytlak. Numerical methods for optimal control problems with state constraints. Springer, 1999.
 Anil V Rao. A survey of numerical methods for optimal control. Advances in the Astronautical Sciences, 135(1):497–528, 2009.
 E. O. RomeroSeverson, E. Volz, J. S. Koopman, T. Leitner, and E. L. Ionides. Dynamic variation in sexual contact rates in a cohort of HIVnegative gay men. American Journal of Epidemiology, 2015.
 Eli S. Rosenberg, Gregorio A. Millett, Patrick S. Sullivan, Carlos Del Rio, and James W. Curran. Modeling disparities in HIV infection between black and white men who have sex with men in the United States using the HIV care continuum. The Lancet HIV, 1(3):e112–e118, 2014.
 A. J. Shattock, C. C. Kerr, R. M. Stuart, E. Masaki, N. Fraser, C. Benedikt, M. Gorgens, D. P. Wilson, and R. T. Gray. In the interests of time: improving HIV allocative efficiency modelling via optimal timevarying allocations. J Int AIDS Soc, 19(1):20627, 2016.
 Jie Shen, Tao Tang, and LiLian Wang. Spectral methods: algorithms, analysis and applications, volume 41 of Springer Series in Computational Mathematics. Springer, 2011.
 A. Smith, I. Miles, B. Le, T. Finlayson, A. Oster, and E. DiNenno. Prevalence and awareness of HIV infection among men who have sex with men – 21 Cities, US 2008. Morbidity and Mortality Weekly Report, 59(37):1201–1227, 2010.
 Josef Stoer and Roland Bulirsch. Introduction to numerical analysis, volume 12 of Texts in Applied Mathematics. Springer, 2nd edition, 1993.
 Subchan Subchan and Rafal Żbikowski. Computational optimal control: tools and practice. John Wiley & Sons, 2009.
 Lloyd N Trefethen. Is Gauss quadrature better than ClenshawCurtis? SIAM Review, 50(1):67–87, 2008.
 David A.M.C. van de Vijver, Brooke E. Nichols, Ume L. Abbas, Charles A.B. Boucher, Valentina Cambiano, Jeffrey W. Eaton, Robert Glaubius, Katrina Lythgoe, John Mellors, Andrew Phillips, Kim C. Sigaloff, and Timothy B. Hallett. Preexposure prophylaxis will have a limited impact on HIV1 drug resistance in subSaharan Africa: a comparison of mathematical models. AIDS, 27(18):2943–2951, November 2013.