Projected Reflected Gradient Methods for Monotone Variational Inequalities

Yu. Malitsky
Department of Cybernetics, Taras Shevchenko National University of Kyiv, 64/13, Volodymyrska Str., Kyiv, 01601, Ukraine

This paper is concerned with some new projection methods for solving variational inequality problems with a monotone and Lipschitz-continuous mapping in a Hilbert space. First, we propose the projected reflected gradient algorithm with a constant stepsize. It is similar to the projected gradient method in that it requires only one projection onto the feasible set and only one value of the mapping per iteration. This distinguishes our method from most other projection-type methods for variational inequalities with a monotone mapping. We also prove that it has an R-linear rate of convergence under the strong monotonicity assumption. The usual drawback of algorithms with a constant stepsize is the requirement to know the Lipschitz constant of the mapping. To avoid this, we modify our first algorithm so that it needs at most two projections per iteration. In fact, our computational experience shows that cases with two projections are very rare. This scheme, at least theoretically, seems to be very effective. All methods are shown to be globally convergent to a solution of the variational inequality. Preliminary results from numerical experiments are quite promising.

Key words. variational inequality, projection method, monotone mapping, extragradient method

AMS subject classifications. 47J20, 90C25, 90C30, 90C52

1 Introduction

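We consider the classical variational inequality problem (VIP), which is to find a point $x^* \in C$ such that

$$\langle F(x^*), x - x^* \rangle \ge 0 \quad \forall x \in C, \tag{vip}$$

where $C$ is a closed convex set in a Hilbert space $H$, $\langle \cdot, \cdot \rangle$ denotes the inner product in $H$, and $F \colon H \to H$ is some mapping. We assume that the following conditions hold: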

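  • (C1) The solution set of (LABEL:vip), denoted by $S$, is nonempty.

  • (C2) The mapping $F$ is monotone, i.e., $\langle F(x) - F(y), x - y \rangle \ge 0$ for all $x, y \in H$.

  • (C3) The mapping $F$ is Lipschitz-continuous with constant $L$, i.e., there exists $L > 0$ such that $\|F(x) - F(y)\| \le L \|x - y\|$ for all $x, y \in H$.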
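As a concrete illustration of the monotonicity and Lipschitz conditions, the following sketch checks both numerically for the linear mapping $F(x) = (x_2, -x_1)$ on $\mathbb{R}^2$. The mapping, the sampling range and the helper name `check_conditions` are assumptions chosen for illustration; they do not come from the paper. This mapping is monotone and Lipschitz-continuous with $L = 1$, but it is not strongly monotone, since $\langle F(x) - F(y), x - y \rangle = 0$ for all $x, y$.

```python
import math
import random


def F(x):
    """Illustrative mapping F(x) = A x with the skew-symmetric A = [[0, 1], [-1, 0]]."""
    return (x[1], -x[0])


def check_conditions(num_samples=1000, seed=0, tol=1e-12):
    """Sample random pairs and test monotonicity and the Lipschitz bound with L = 1."""
    rng = random.Random(seed)
    L = 1.0  # operator norm of A for this particular example
    for _ in range(num_samples):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        dx = (x[0] - y[0], x[1] - y[1])
        dF = (F(x)[0] - F(y)[0], F(x)[1] - F(y)[1])
        # Monotonicity: <F(x) - F(y), x - y> >= 0.
        if dF[0] * dx[0] + dF[1] * dx[1] < -tol:
            return False
        # Lipschitz continuity: ||F(x) - F(y)|| <= L * ||x - y||.
        if math.hypot(*dF) > L * math.hypot(*dx) + tol:
            return False
    return True


print(check_conditions())
```

A sampling check of this kind can only refute the conditions, not prove them; for this example they are easy to verify by hand.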

The variational inequality problem is one of the central problems in nonlinear analysis (see [2, 5, 21]). Monotone operators have also turned out to be an important tool in the study of various problems arising in optimization, nonlinear analysis, differential equations and other related fields (see [3, 22]). Therefore, numerical methods for VIPs with a monotone operator have been extensively studied in the literature; see [5, 10] and references therein. In this section we briefly review the development of projection methods for monotone variational inequalities that provide weak convergence to a solution of (LABEL:vip).

The simplest iterative procedure is the well-known projected gradient method

$$x_{n+1} = P_C(x_n - \lambda F(x_n)),$$

where $P_C$ denotes the metric projection onto the set $C$ and $\lambda$ is some positive number. In order to converge, however, this method requires the restrictive assumption that $F$ be strongly (or inverse strongly) monotone. The extragradient method proposed by Korpelevich and Antipin [11, 1] overcomes this difficulty by means of the following scheme:

$$\begin{cases} y_n = P_C(x_n - \lambda F(x_n)), \\ x_{n+1} = P_C(x_n - \lambda F(y_n)), \end{cases}$$

where $\lambda \in (0, 1/L)$. The extragradient method has received a great deal of attention from many authors, who improved it in various ways; see, e.g., [9, 7, 17, 4] and references therein. We restrict our attention to only one extension of the extragradient method. It was proposed in [4] by Censor, Gibali and Reich:

$$\begin{cases} y_n = P_C(x_n - \lambda F(x_n)), \\ T_n = \{ w \in H : \langle x_n - \lambda F(x_n) - y_n, w - y_n \rangle \le 0 \}, \\ x_{n+1} = P_{T_n}(x_n - \lambda F(y_n)), \end{cases} \tag{censor}$$

where $\lambda \in (0, 1/L)$. Since the second projection in (LABEL:censor) can be found in a closed form, this method is more applicable when a projection onto the closed convex set $C$ is a nontrivial problem.

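An alternative to the extragradient method and its modifications is the following remarkable scheme proposed by Tseng in [20]:

$$\begin{cases} y_n = P_C(x_n - \lambda F(x_n)), \\ x_{n+1} = y_n + \lambda (F(x_n) - F(y_n)), \end{cases} \tag{tseng}$$

where $\lambda \in (0, 1/L)$. Both algorithms (LABEL:censor) and (LABEL:tseng) have the same complexity per iteration: we need to compute one projection onto the set $C$ and two values of $F$.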
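The difference between these classical schemes can be seen on a small example. The sketch below runs the projected gradient method, the extragradient method and Tseng's scheme on the box $C = [-1, 1]^2$ with the monotone mapping $F(x) = (x_2, -x_1)$, whose only solution is $x^* = 0$. The test problem, the stepsize `lam = 0.4`, the starting point and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def F(x):
    """Monotone, 1-Lipschitz test mapping F(x) = (x2, -x1) (illustrative choice)."""
    return (x[1], -x[0])


def proj_box(x, lo=-1.0, hi=1.0):
    """Metric projection onto the box C = [lo, hi]^2."""
    return tuple(min(max(t, lo), hi) for t in x)


def step(x, lam, g):
    """Return x - lam * g componentwise."""
    return tuple(xi - lam * gi for xi, gi in zip(x, g))


def projected_gradient(x, lam, iters):
    for _ in range(iters):
        x = proj_box(step(x, lam, F(x)))
    return x


def extragradient(x, lam, iters):
    for _ in range(iters):
        y = proj_box(step(x, lam, F(x)))   # first projection, value F(x_n)
        x = proj_box(step(x, lam, F(y)))   # second projection, value F(y_n)
    return x


def tseng(x, lam, iters):
    for _ in range(iters):
        Fx = F(x)
        y = proj_box(step(x, lam, Fx))     # the only projection onto C
        Fy = F(y)
        # x_{n+1} = y_n + lam * (F(x_n) - F(y_n)), with no second projection
        x = tuple(yi + lam * (a - b) for yi, a, b in zip(y, Fx, Fy))
    return x


def norm(x):
    return math.hypot(*x)


x0, lam = (1.0, 1.0), 0.4   # lam < 1/L with L = 1
for name, method in [("projected gradient", projected_gradient),
                     ("extragradient", extragradient),
                     ("Tseng", tseng)]:
    print(name, norm(method(x0, lam, 1000)))
```

On this problem the projected gradient iterates stay at distance at least 1 from the solution, because each unprojected step enlarges the norm and each active projection leaves a coordinate at $\pm 1$. The other two methods approach $x^* = 0$.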

Popov in his work [16] proposed an ingenious method, which is similar to the extragradient method but uses only one value of the mapping $F$ on every iteration. Using the idea from [4, 12], Malitsky and Semenov improved Popov's algorithm. They presented in [13] the following algorithm


where . It is easy to see that this method needs only one projection onto the set $C$ (as in (LABEL:censor) or (LABEL:tseng)) and only one value of $F$ per iteration. The latter makes algorithm (LABEL:mal_sem) very attractive for cases when the computation of the operator $F$ is expensive. This often happens, for example, in a huge-scale VIP or in a VIP that arises from optimal control.

In this work we propose the following scheme:

$$x_{n+1} = P_C(x_n - \lambda F(2x_n - x_{n-1})), \tag{our}$$

where $\lambda \in (0, (\sqrt{2} - 1)/L)$. Again we see that both algorithms (LABEL:mal_sem) and (LABEL:our) have the same computational complexity per iteration, but the latter has a simpler and more elegant structure. Algorithm (LABEL:our) resembles the projected gradient method; however, the value of the gradient is taken at the point $2x_n - x_{n-1}$, which is the reflection of $x_{n-1}$ in $x_n$. Preliminary results from numerical experiments, comparing algorithm (LABEL:our) to others, are promising. Also note that the simple structure of (LABEL:our) allows us to obtain a nonstationary variant of (LABEL:our) with variable stepsize in a very convenient way.

The paper is organized as follows. Section LABEL:prel presents some basic useful facts which we use throughout the paper. In Section LABEL:algorithm we prove the convergence of the method (LABEL:our). Under a more restrictive assumption we also establish its rate of convergence. In Section LABEL:modif_algorithm we present the nonstationary Algorithm LABEL:algModif1, which is more flexible, since it does not use the Lipschitz constant of the mapping $F$. This makes it much more convenient in practical applications than Algorithm LABEL:our. Section LABEL:num_results contains the results of numerical experiments.

2 Preliminaries

The next three statements are classical. For the proof we refer the reader to [2, 3].

Lemma 2.1

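Let $C$ be a nonempty closed convex set in $H$ and $x \in H$. Then

  • $\langle P_C x - x, y - P_C x \rangle \ge 0$ for all $y \in C$;

  • $\|P_C x - y\|^2 \le \|x - y\|^2 - \|x - P_C x\|^2$ for all $y \in C$.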
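The two classical projection inequalities, $\langle P_C x - x, y - P_C x \rangle \ge 0$ and $\|P_C x - y\|^2 \le \|x - y\|^2 - \|x - P_C x\|^2$ for $y \in C$, can be checked numerically for a set with an explicit projection. The sketch below does this for the box $C = [-1, 1]^3$; the choice of set, the sampling ranges and the function names are illustrative assumptions.

```python
import random


def proj_box(x, lo=-1.0, hi=1.0):
    """Metric projection onto the box C = [lo, hi]^d (componentwise clipping)."""
    return [min(max(t, lo), hi) for t in x]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def check_projection_inequalities(dim=3, trials=1000, seed=1, tol=1e-9):
    """Test both inequalities on random x in the whole space and random y in C."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]   # arbitrary point
        y = [rng.uniform(-1, 1) for _ in range(dim)]   # point of C
        p = proj_box(x)
        # (i) <P_C x - x, y - P_C x> >= 0
        if dot(sub(p, x), sub(y, p)) < -tol:
            return False
        # (ii) ||P_C x - y||^2 <= ||x - y||^2 - ||x - P_C x||^2
        lhs = dot(sub(p, y), sub(p, y))
        rhs = dot(sub(x, y), sub(x, y)) - dot(sub(x, p), sub(x, p))
        if lhs > rhs + tol:
            return False
    return True


print(check_projection_inequalities())
```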

Lemma 2.2 (Minty)

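Assume that $F$ is a continuous and monotone mapping. Then $x^*$ is a solution of (LABEL:vip) iff $x^*$ is a solution of the following problem:

$$\text{find } x \in C \text{ such that } \langle F(y), y - x \rangle \ge 0 \quad \forall y \in C.$$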

Remark 2.1

The solution set of variational inequality (LABEL:vip) is closed and convex.

As usual, the symbol $\rightharpoonup$ denotes weak convergence in $H$.

Lemma 2.3 (Opial)

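Let $(x_n)$ be a sequence in $H$ such that $x_n \rightharpoonup x$. Then for all $y \ne x$

$$\liminf_{n \to \infty} \|x_n - x\| < \liminf_{n \to \infty} \|x_n - y\|.$$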

The proofs of the next four statements are omitted because of their simplicity.

Lemma 2.4

Let be nonempty closed convex set in , and . Then for all

Lemma 2.5

Let , be in H. Then

Lemma 2.6

Let and be two real sequences. Then

Lemma 2.7

Let , be two nonnegative real sequences such that

Then is bounded and .

In order to establish the rate of convergence, we need the following

Lemma 2.8

Let , be two nonnegative real sequences and such that for all the following holds


Then there exist and such that for any .

Proof. First, note that a simple calculus ensures that is decreasing on . Since is continuous and , we can choose such that . Our next task is to find some and such that (LABEL:rate_1) with defined as above will be equivalent to the inequality

It is easy to check that such numbers are

Then by we can conclude that

Iterating the last inequality simply leads to the desired result

where .     

3 Algorithm and its convergence

We first note that solutions of (LABEL:vip) coincide with zeros of the following projected residual function:

$$r(x, y) := \|y - P_C(x - \lambda F(y))\| + \|x - y\|,$$

where $\lambda$ is some positive number. Now we formally state our algorithm.

Algorithm 3.1
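  1. Choose $x_0 = y_0 \in H$ and $\lambda \in (0, (\sqrt{2} - 1)/L)$.

  2. Given the current iterates $x_n$ and $y_n$, compute

     $$x_{n+1} = P_C(x_n - \lambda F(y_n)).$$

  3. If $r(x_n, y_n) = 0$ then stop: $x_n = y_n = x_{n+1}$ is a solution. Otherwise compute

     $$y_{n+1} = 2x_{n+1} - x_n,$$

    set $n := n + 1$ and return to step 2.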
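The steps of the algorithm can be sketched in a few lines. The code below is a minimal illustration, assuming the iteration $x_{n+1} = P_C(x_n - \lambda F(y_n))$, $y_{n+1} = 2x_{n+1} - x_n$ with $x_0 = y_0$, and replacing the exact stopping test $r(x_n, y_n) = 0$ by the tolerance check $r(x_n, y_n) \le$ `tol`. The test problem (the box $[-1, 1]^2$ with $F(x) = (x_2, -x_1)$, so that $L = 1$ and $x^* = 0$), the stepsize `lam = 0.4` and all names are assumptions introduced here.

```python
import math


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def projected_reflected_gradient(F, proj, x0, lam, tol=1e-10, max_iter=10_000):
    """Return the last iterate and the number of evaluations of F."""
    x = tuple(x0)
    y = tuple(x0)                       # x_0 = y_0
    evals = 0
    for _ in range(max_iter):
        Fy = F(y)                       # the only value of F in this iteration
        evals += 1
        # the only projection in this iteration
        x_next = proj(tuple(xi - lam * gi for xi, gi in zip(x, Fy)))
        # residual r(x_n, y_n) = ||y_n - x_{n+1}|| + ||x_n - y_n||
        if dist(y, x_next) + dist(x, y) <= tol:
            return x_next, evals
        y = tuple(2 * a - b for a, b in zip(x_next, x))   # reflect x_n in x_{n+1}
        x = x_next
    return x, evals


def F(x):
    """Illustrative monotone mapping with Lipschitz constant L = 1."""
    return (x[1], -x[0])


def proj_box(x, lo=-1.0, hi=1.0):
    return tuple(min(max(t, lo), hi) for t in x)


solution, evaluations = projected_reflected_gradient(F, proj_box, (1.0, 1.0), lam=0.4)
print(solution, evaluations)
```

Each pass through the loop uses one projection and one evaluation of `F`, and the value `lam = 0.4` lies below $(\sqrt{2} - 1)/L \approx 0.414$ for this example.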

The next lemma is central to our proof of the convergence theorem.

Lemma 3.1

Let and be two sequences generated by Algorithm LABEL:alg and let . Then


Proof. By Lemma LABEL:proj we have


Since is monotone, . Thus, adding this item to the right side of (LABEL:fst), we get


As , we have by Lemma LABEL:proj

Adding these two inequalities yields

from which we conclude


since .

We next turn to estimating . It follows that


Using (LABEL:est1) and (LABEL:est2), we deduce in (LABEL:snd) that

which completes the proof.     

Now we can state and prove our main convergence result.

Theorem 3.2

Assume that (C1)–(C3) hold. Then any sequence generated by Algorithm LABEL:alg weakly converges to a solution of (LABEL:vip).

Proof. Let us show that the sequence is bounded. Fix any . Since and

we can deduce from inequality (LABEL:ineq_lemma) that


For let

Then we can rewrite inequality (LABEL:rewrite_ineq) as . By Lemma LABEL:lim_seq, we conclude that is bounded and . Therefore, the sequences and hence are also bounded. From the inequality

we also have that .

As is bounded, there exists a subsequence of such that converges weakly to some . It is clear that is also convergent to that . We show . From Lemma LABEL:proj it follows that

From this we conclude that, for all ,


In the last inequality we used condition (C2). Taking the limit as in (LABEL:weak) and using that , we obtain

which implies by Lemma LABEL:minty that .

Let us show . From (LABEL:rewrite_ineq) it follows that the sequence is monotone for any . Taking into account its boundedness we deduce that it is convergent. At last, the sequence is also convergent, therefore, is convergent.

We want to prove that is weakly convergent. On the contrary, assume that the sequence has at least two weak cluster points and such that . Let be a sequence such that as . Then by Lemma LABEL:liminf and LABEL:opial we have

We can now proceed analogously to the proof that

which is impossible. Hence we can conclude that weakly converges to some .     

It is well known (see [19]) that under some suitable conditions the extragradient method has an R-linear rate of convergence. In the following theorem we show that our method has the same rate of convergence under a strong monotonicity assumption on the mapping $F$, i.e.,

$$\langle F(x) - F(y), x - y \rangle \ge m \|x - y\|^2 \quad \forall x, y \in H, \tag{C2*}$$

for some $m > 0$.

Theorem 3.3

Assume that (C1), (C2*), (C3) hold. Then any sequence generated by Algorithm LABEL:alg converges to the solution of (LABEL:vip) at least R-linearly.
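The linear rate can be observed on a small strongly monotone example. The sketch below runs the iteration $x_{n+1} = P_C(x_n - \lambda F(2x_n - x_{n-1}))$ on the box $[-1, 1]^2$ with $F(x) = (x_1 + x_2, -x_1 + x_2)$, for which $\langle F(x) - F(y), x - y \rangle = \|x - y\|^2$, $L = \sqrt{2}$ and the unique solution is $x^* = 0$. The test problem, the stepsize `lam = 0.25` and the function names are illustrative assumptions, and the printed errors are an empirical observation rather than a proof of the rate.

```python
import math


def F(x):
    """Illustrative strongly monotone mapping F(x) = M x with M = [[1, 1], [-1, 1]]."""
    return (x[0] + x[1], -x[0] + x[1])


def proj_box(x, lo=-1.0, hi=1.0):
    return tuple(min(max(t, lo), hi) for t in x)


def error_history(x0, lam, iters):
    """Record ||x_n - x*|| with x* = 0 along the reflected iteration."""
    x_prev = x = tuple(x0)              # starting with x_{-1} = x_0 gives y_0 = x_0
    errors = [math.hypot(*x)]
    for _ in range(iters):
        y = tuple(2 * a - b for a, b in zip(x, x_prev))   # reflected point
        g = F(y)
        x_prev, x = x, proj_box(tuple(a - lam * b for a, b in zip(x, g)))
        errors.append(math.hypot(*x))
    return errors


lam = 0.25   # below (sqrt(2) - 1) / L with L = sqrt(2)
errors = error_history((1.0, 1.0), lam, 200)
for n in (0, 50, 100, 150, 200):
    print(n, errors[n])
```

If the errors are bounded by $c\,q^n$ for some $q \in (0, 1)$, their logarithms decrease at least linearly in $n$, which is what the printed values should show.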

Proof. Since is strongly monotone, (LABEL:vip) has a unique solution, which we denote by . Note that by Lemma LABEL:u_v

From this and from (C2*) we conclude that, for all ,


From now on, let be any number in . Then adding the left part of (LABEL:str_mon) to the right side of (LABEL:fst), we get

For items and we use estimations (LABEL:est1) and (LABEL:est2) from Lemma LABEL:main_lemma. Therefore, we obtain

From the last inequality it follows that


In the first inequality we used that , and in the second we used that


As is arbitrary in , we can rewrite (LABEL:rate_ineq1) in the new notation as

Since , we can conclude by Lemma LABEL:rate_1 that for some and . This means that converges to at least R-linearly.     

4 Modified Algorithm

The main shortcoming of all algorithms mentioned in §LABEL:intro is the requirement to know the Lipschitz constant, or at least some estimate of it. Usually it is difficult to estimate the Lipschitz constant with any precision, so the stepsizes have to be chosen quite small, which is not practical. For this reason, algorithms with a constant stepsize are not applicable in most cases of interest. The usual approaches to overcome this difficulty consist in predicting a stepsize and then correcting it (see [9, 14, 20]) or in using an Armijo-type linesearch procedure along a feasible direction (see [7, 17]). Usually the latter approach is more effective, since very often the former requires too many projections onto the feasible set per iteration.

Nevertheless, our modified method uses the prediction-correction strategy. However, in contrast to the algorithms in [9, 14, 20], we need at most two projections per iteration. This is explained by the fact that for the direction $y_n$ in Algorithm LABEL:alg we use a very simple and cheap formula: $y_n = 2x_n - x_{n-1}$. Although we cannot explain this theoretically, numerical experiments show that cases with two projections per iteration are quite rare, so usually we have only one projection per iteration, which is in drastic contrast to other existing methods.

Looking at the proof of Theorem LABEL:th_1, we can conclude that inequality (LABEL:est2) is the only place where we use the Lipschitz constant $L$. Therefore, choosing such that the inequality holds for some fixed we can obtain an estimate similar to (LABEL:est2). All this leads us to the following

Algorithm 4.1

Although numerical results showed us the effectiveness of Algorithm LABEL:algModif0, we cannot prove its convergence in all cases. Nevertheless, we note that we did not find any example where Algorithm LABEL:algModif0 failed. Thus, even Algorithm LABEL:algModif0 seems to be very reliable for many problems.
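The idea of replacing the global Lipschitz constant by a local estimate can be illustrated with a simple heuristic. The sketch below is not a transcription of either algorithm in this section. It assumes one particular rule: after each step the stepsize is reduced, if necessary, so that $\lambda \|F(y_n) - F(y_{n-1})\| \le \alpha \|y_n - y_{n-1}\|$ holds for a fixed `alpha`. The rule, the values `lam0 = 1.0` and `alpha = 0.4`, the test problem and all names are assumptions made here for illustration, and no convergence guarantee is claimed for this variant in general.

```python
import math


def F(x):
    """Illustrative monotone mapping with L = 1; L is not used by the method."""
    return (x[1], -x[0])


def proj_box(x, lo=-1.0, hi=1.0):
    return tuple(min(max(t, lo), hi) for t in x)


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def reflected_gradient_local_step(x0, lam0=1.0, alpha=0.4, iters=500):
    """Reflected iteration with a stepsize driven by a local Lipschitz estimate."""
    x = y = tuple(x0)
    lam = lam0
    Fy = F(y)
    for _ in range(iters):
        x_next = proj_box(tuple(a - lam * g for a, g in zip(x, Fy)))
        y_next = tuple(2 * a - b for a, b in zip(x_next, x))
        Fy_next = F(y_next)             # one new value of F per iteration
        dF = dist(Fy_next, Fy)
        if dF > 0.0:
            # keep lam * ||F(y_n) - F(y_{n-1})|| <= alpha * ||y_n - y_{n-1}||
            lam = min(lam, alpha * dist(y_next, y) / dF)
        x, y, Fy = x_next, y_next, Fy_next
    return x, lam


point, final_lam = reflected_gradient_local_step((1.0, 1.0))
print(point, final_lam)
```

For this particular mapping the ratio $\|y_n - y_{n-1}\| / \|F(y_n) - F(y_{n-1})\|$ equals 1, so the stepsize drops from `lam0` to `alpha` after the first step and the method then behaves like the constant-stepsize iteration.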

Now our task is to modify Algorithm LABEL:algModif0 in such a way that we are able to prove convergence of the resulting algorithm. For this we need to distinguish the good cases, where Algorithm LABEL:algModif0 works well, from the bad ones, where it possibly does not.

From now on we adopt the convention that . Clearly, it follows that as well. The following algorithm gets round the difficulty of bad cases of Algorithm LABEL:algModif0.

Algorithm 4.2
  • Choose , , , and some large . Compute

  • Given and , set and let be defined by



  • If then stop: is a solution. Otherwise compute

  • If then set and go to step LABEL:modifstep. Otherwise we have two cases and .

    • If then choose such that



      Set , and go to step LABEL:modifstep.

    • If then find such that


      Then choose such that



      Set , , , and go to step LABEL:modifstep.

It is clear that on every iteration in Algorithm LABEL:algModif1 we need to use a residual function with different , namely, .

First, let us show that Algorithm LABEL:algModif1 is correct, i.e., it is always possible to choose and on steps (LABEL:step_la_n.i) and (LABEL:step_la_n.ii). For this we need two simple lemmas.

Lemma 4.1

Step (LABEL:step_la_n.i) in Algorithm LABEL:algModif1 is well-defined.

Proof. From the inequality

we can see that it is sufficient to take . (However, it seems better for practical reasons to choose as great as possible).     

Lemma 4.2

Step (LABEL:step_la_n.ii) in Algorithm LABEL:algModif1 is well-defined.

Proof. First, let us show that for all and . It is clear that for every

Then and hence by induction

Therefore, it is sufficient to take . (But, as above, it seems better to choose as great as possible).

At last, we can prove the existence of such that (LABEL:la_nD) will hold by the same arguments as in Lemma LABEL:4i.     

The following lemma yields an analogous inequality to (LABEL:ineq_lemma).

Lemma 4.3

Let and be two sequences generated by Algorithm LABEL:algModif1 and let , . Then


Proof. Proceeding analogously as in (LABEL:fst) and in (LABEL:snd), we get


The same arguments as in (LABEL:est1) yield


Using (LABEL:est2) and the inequality