Complexity of the relaxed Peaceman-Rachford splitting method for the sum of two maximal strongly monotone operators
Abstract
This paper considers the relaxed Peaceman-Rachford (PR) splitting method for finding an approximate solution of a monotone inclusion whose underlying operator consists of the sum of two maximal strongly monotone operators. Using general results obtained in the setting of a non-Euclidean hybrid proximal extragradient framework, we extend a previous convergence result on the iterates generated by the relaxed PR splitting method, as well as establish new pointwise and ergodic convergence rate results for the method whenever an associated relaxation parameter is within a certain interval. An example is also discussed to demonstrate that the iterates may not converge when the relaxation parameter is outside this interval.
1 Introduction
In this paper, we consider the relaxed Peaceman-Rachford (PR) splitting method for solving the monotone inclusion
(1) 
where and are maximal strongly monotone (point-to-set) operators for some (with the convention that strongly monotone means simply monotone, and strongly monotone with means strongly monotone in the usual sense). Recall that the relaxed PR splitting method is given by
(2) 
where is a fixed relaxation parameter and . The special case of the relaxed PR splitting method in which is known as the Peaceman-Rachford (PR) splitting method and the one with is the widely-studied Douglas-Rachford (DR) splitting method. Convergence results for them are studied for example in [1, 2, 3, 4, 8, 13, 14, 22].
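The update can be illustrated on a toy scalar problem. The sketch below assumes the standard form of the relaxed PR iteration, x_k = x_{k-1} + theta * (J_B(2 J_A(x_{k-1}) - x_{k-1}) - J_A(x_{k-1})), where J_T = (I + T)^{-1} denotes the resolvent, so that theta = 1 gives DR and theta = 2 gives PR. The names A, B, theta, c and lam, as well as the particular operators chosen (A the subdifferential of lam*|u| and B(u) = u - c), are illustrative assumptions and are not taken from the text above.

```python
def soft_threshold(x, lam):
    """Resolvent (I + A)^{-1}(x) of A = subdifferential of lam * |u|."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def relaxed_pr(theta, c=3.0, lam=1.0, x0=0.0, iters=60):
    """Run the (assumed) relaxed PR iteration for 0 in A(u) + B(u), B(u) = u - c."""
    J_A = lambda x: soft_threshold(x, lam)
    J_B = lambda x: (x + c) / 2.0  # solves u + (u - c) = x for u
    x = x0
    for _ in range(iters):
        u = J_A(x)
        x = x + theta * (J_B(2.0 * u - x) - u)
    # J_A(x) is the candidate solution of the inclusion
    return x, J_A(x)


x_dr, u_dr = relaxed_pr(theta=1.0)  # Douglas-Rachford
x_pr, u_pr = relaxed_pr(theta=2.0)  # Peaceman-Rachford
```

For this instance the inclusion has the unique solution u = 2 (the soft-thresholding of c = 3 at level lam = 1), and both runs reach it. The example only illustrates the mechanics of the update; it says nothing about convergence in general.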
The analysis of the relaxed PR splitting method for the case in which has been undertaken in a number of papers which are discussed in this paragraph. Convergence of the sequence of iterates generated by the relaxed PR splitting method is well-known when (see for example [1, 7, 14]) and, according to [16], its limiting behavior for the case in which is not known. We actually show in Subsection 5.2 that the sequence (2) does not necessarily converge when . An (strong) pointwise convergence rate result is established in [18] for the relaxed PR splitting method when . Moreover, when and where and are proper lower semicontinuous convex functions, papers [9, 10, 11] derive strong pointwise (resp., ergodic) convergence rate bounds for the relaxed PR method when (resp., ) under different assumptions on the functions. Assuming only strong monotonicity of , where , some smoothness property on , and maximal monotonicity of , [16] shows that the relaxed PR splitting method has linear convergence rate for for some . A linear rate of convergence of the relaxed PR splitting method and its two special cases, namely, the DR splitting and PR splitting methods, is established in [2, 3, 4, 11, 15, 16, 22] under relatively strong assumptions on and/or (see also Table 2).
This paper assumes that , and hence its analysis applies to the case in which both and are monotone () and the case in which both and are strongly monotone (). This paragraph discusses papers dealing with the latter case. Paper [12] establishes convergence of the sequence generated by the relaxed PR splitting method for any and, under some strong assumptions on and , establishes its linear convergence rate. We complement the convergence results in [12] by showing that for , the sequence of iterates generated by the relaxed PR splitting method also converges, and describe an instance showing its nonconvergence when . Moreover, we establish strong pointwise and ergodic convergence rate results (Theorems 4.6 and 4.8) for the relaxed PR splitting method when and , respectively.
Finally, by imposing strong assumptions requiring one of the operators to be strongly monotone and one of them to be Lipschitz (and hence point-to-point), [11, 15, 16] establish linear convergence rate of the relaxed PR splitting method. As opposed to these papers, the assumptions in [12] and this paper do not imply the operators or to be point-to-point.
Our analysis of the relaxed PR splitting method for solving (1) is based on viewing it as an inexact proximal point method, more specifically, as an instance of a non-Euclidean hybrid proximal extragradient (HPE) framework for solving the monotone inclusion problem. The proximal point method, proposed by Rockafellar [28], is a classical iterative scheme for solving the latter problem. Paper [29] introduces a Euclidean version of the HPE framework which is an inexact version of the proximal point method based on a certain relative error criterion. Iteration-complexities of the latter framework are established in [24] (see also [25]). Generalizations of the HPE framework to the non-Euclidean setting are studied in [17, 21, 30]. Applications of the HPE framework can be found for example in [19, 20, 24, 25].
This paper is organized as follows. Section 2 describes basic concepts and notation used in the paper. Section 3 discusses the non-Euclidean HPE framework which is used to study the convergence properties of the relaxed PR splitting method in Sections 4 and 5. Section 4 derives convergence rate bounds for the relaxed Peaceman-Rachford (PR) splitting method. Section 5, which consists of two subsections, discusses a convergence result of the relaxed PR splitting method in the first subsection and provides an example showing that its iterates may not converge when in the second subsection. Section 6 discusses the numerical performance of the relaxed PR splitting method for solving the weighted Lasso minimization problem. Finally, Section 7 gives some concluding remarks.
2 Basic concepts and notation
This section presents some definitions, notation and terminology which will be used in the paper.
We denote the set of real numbers by and the set of nonnegative real numbers by . Let and be functions with the same domain and whose values are in . We write that if there exists a constant such that . Also, we write if and .
Let be a finite-dimensional real vector space with inner product denoted by (an example of is endowed with the standard inner product) and let denote an arbitrary seminorm in . Its dual (extended) seminorm, denoted by , is defined as . It is easy to see that
(3) 
The following straightforward result states some basic properties of the dual seminorm associated with a matrix seminorm. Its proof can be found for example in Lemma A.1(b) of [23].
Proposition 2.1
Let be a self-adjoint positive semidefinite linear operator and consider the seminorm in given by for every . Then, and for every .
Given a set-valued operator , its domain is denoted by and its inverse operator is given by . The graph of is defined by . The operator is said to be monotone if
Moreover, is maximal monotone if it is monotone and, additionally, if is a monotone operator such that for every , then . The sum of two set-valued operators is defined by for every . Given a scalar , the enlargement of a monotone operator is defined as
(4) 
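As a small numerical illustration of an enlargement, the sketch below assumes that (4) is the usual epsilon-enlargement, in which v belongs to the enlargement of T at z whenever <z - z', v - v'> >= -eps for every pair (z', v') in the graph of T. For the scalar operator T(z') = z' (chosen here only for illustration), minimizing over z' shows that this holds exactly when |v - z| <= 2*sqrt(eps). The names `in_enlargement`, `grid`, `eps`, `inside` and `outside` are hypothetical, and the check is done on a finite sample of the graph rather than on all of it.

```python
import math


def in_enlargement(z, v, eps, grid):
    """Sampled test of <z - z', v - T(z')> >= -eps for T(z') = z' on the real line."""
    tol = 1e-12  # guards against rounding on the boundary case
    return all((z - zp) * (v - zp) >= -eps - tol for zp in grid)


# Sample points z' in [-5, 5] with spacing 1e-3
grid = [i / 1000.0 for i in range(-5000, 5001)]
eps = 0.25
z = 1.0

# v on the boundary |v - z| = 2 * sqrt(eps): expected to pass
inside = in_enlargement(z, z + 2.0 * math.sqrt(eps), eps, grid)
# v slightly beyond the boundary: expected to fail
outside = in_enlargement(z, z + 2.0 * math.sqrt(eps) + 0.1, eps, grid)
```

With eps = 0 the test reduces to v = z, that is, to membership in the graph of T itself, which matches the fact that the enlargement of a maximal monotone operator with eps = 0 is the operator.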
3 A non-Euclidean hybrid proximal extragradient framework
This section discusses the non-Euclidean hybrid proximal extragradient (NEHPE) framework and describes its associated convergence and iteration complexity results. The results of the section will be used in Sections 4 and 5 to study the convergence and iteration complexity properties of the relaxed PR splitting method (2). It contains two subsections. The first one describes a class of distance generating functions introduced in [17] and derives some of its basic properties. The second one describes the NEHPE framework and its corresponding convergence and iteration complexity results.
3.1 A class of distance generating functions
We start by introducing a class of distance generating functions (and its corresponding Bregman distances) which is needed for the presentation of the NEHPE framework in Subsection 3.2.
Definition 3.1
For a given convex set , a seminorm in and scalars , we let denote the class of real-valued functions which are differentiable on and satisfy
(5)  
(6) 
A function is referred to as a distance generating function with respect to the seminorm and its associated Bregman distance is defined as
(7) 
Throughout our presentation, we use the second notation instead of the first one although the latter one makes it clear that is a function of two arguments, namely, and . Clearly, it follows from (5) that is a convex function on which is in fact strongly convex on whenever is a norm.
The following simple result summarizes the main identities about the Bregman distance .
Lemma 3.2
For some convex set and scalars , let be given. Then, the following identities hold for every :
(8)  
(9)  
(10)  
(11) 
Proof: Identities (8) and (9) follow straightforwardly from the definition of the Bregman distance in (7). The first inequality in (10) follows easily from (5) and the definition of in (7). The second inequality in (10) follows from (3), (6), the definition of in (7), and the identity
It is easy to see that (11) immediately follows from (6), (8) and (10).
Note that if the seminorm in Definition 3.1 is a norm, then (5) implies that is strongly convex on , in which case the corresponding is said to be nondegenerate on . However, since Definition 3.1 does not necessarily assume that is a norm, it admits the possibility of being not strongly convex on , or equivalently, being degenerate on .
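The degenerate case can be seen on a quadratic example. The sketch below assumes the usual Bregman distance w(z') - w(z) - <grad w(z), z' - z> and takes w(z) = (1/2) <z, M z> with M symmetric positive semidefinite, for which the distance equals (1/2) <z' - z, M (z' - z)>, i.e., half the squared seminorm induced by M. The matrix M = diag(1, 0) and all function names are illustrative choices, not objects defined in the paper.

```python
def matvec(M, z):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_ij * z_j for m_ij, z_j in zip(row, z)) for row in M]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def w(M, z):
    """Quadratic distance generating function w(z) = 0.5 * <z, M z>."""
    return 0.5 * dot(z, matvec(M, z))


def bregman(M, z, z_prime):
    """Bregman distance w(z') - w(z) - <grad w(z), z' - z>, with grad w(z) = M z."""
    grad = matvec(M, z)
    diff = [b - a for a, b in zip(z, z_prime)]
    return w(M, z_prime) - w(M, z) - dot(grad, diff)


M = [[1.0, 0.0], [0.0, 0.0]]  # positive semidefinite but singular
z = [1.0, 2.0]

d_null = bregman(M, z, [1.0, -5.0])  # points differ only in the null direction of M
d_first = bregman(M, z, [3.0, 2.0])  # points differ by 2 in the first coordinate
```

Here `d_null` is zero although the two points are distinct, which is the degeneracy described above, while `d_first` equals 0.5 * 2**2 = 2.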
The following result gives some useful properties of distance generating functions.
Lemma 3.3
For some convex set and scalars , let be given. Then, for every and , we have
(12) 
3.2 The NEHPE framework
This subsection describes the NEHPE framework and its corresponding convergence and iteration complexity results.
Throughout this subsection, we assume that scalars , convex set , seminorm and distance generating function with respect to are given. Our problem of interest in this section is the MIP
(13) 
where is a maximal monotone operator satisfying the following conditions:

;

the solution set of (13) is nonempty.
We now state a non-Euclidean HPE (NEHPE) framework for solving the MIP (13) which generalizes its Euclidean counterparts studied in the literature (see for example [24, 26, 29]).
Framework 1 (An NEHPE framework for solving (13)).
(0) Let and be given, and set ;
(1) choose and find such that
(14)
(15)
(2) set and go to step 1.
end
We now make some remarks about Framework 1. First, it does not specify how to find and satisfying (14) and (15). The particular scheme for computing and will depend on the instance of the framework under consideration and the properties of the operator . Second, if is strongly convex on and , then (15) implies that and for every , and hence that in view of (14). Therefore, the HPE error conditions (14)-(15) can be viewed as a relaxation of an iteration of the exact non-Euclidean proximal point method, namely,
We observe that NEHPE frameworks have already been studied in [17], [21] and [30]. The approach presented in this section differs from these three papers as follows. Assuming that is an open convex set, is continuously differentiable on and continuous on its closure, [30] studies a special case of the NEHPE framework in which for every , and presents results on convergence of sequences rather than iteration complexity. Paper [21] deals with distance generating functions which do not necessarily satisfy conditions (5) and (6), and as a consequence, obtains results which are more limited in scope, i.e., only an ergodic convergence rate result is obtained for operators with bounded feasible domains (or, more generally, for the case in which the sequence generated by the HPE framework is bounded). Paper [17] introduces the class of distance generating functions but only analyzes the behavior of an HPE framework for solving inclusions whose operators are strongly monotone with respect to a fixed (see condition A1 in Section 2 of [17]). This section on the other hand assumes that but it does not assume any strong monotonicity of with respect to .
Before presenting the main results about the NEHPE framework, namely, Theorems 3.8 and 3.9 establishing its pointwise and ergodic iteration complexities, respectively, and Propositions 3.10 and 3.11 showing that and/or approach in terms of the Bregman distance , we first establish a few preliminary technical results.
Lemma 3.4
For every and , we have:
(16)  
(17)  
(18) 
Proof: Using (9) twice and the definition of in (14), we conclude that
and hence that (16) holds. Inequality (17) follows immediately from (16) and (15). Moreover, (18) follows by adding (17) from to .
Proposition 3.5
For every and , we have
(19) 
As a consequence, the following statements hold:

(a) is nonincreasing;
(b) ;
(c) .
Proof: Let be given. The first inequality in (19) follows from (17) with and the last inequality in (19) follows from the fact that and , and the definition of . Finally, statements (a) and (b) follow immediately from (19) while (c) follows by adding (19) over and using the fact that for every .
For the purpose of stating the convergence rate results below, define
(20) 
Lemma 3.6
For every , define
(21) 
Then, .
Proof: For every , it follows from (14), (8), (11), (15), the triangle inequality for norms and the above definition of , that
The last inequality, (15) and the definition of then imply that for every . Hence, if , it follows that
where the last inequality follows from Proposition 3.5(c). The lemma now follows from the latter relation and the definition of in (20).
Lemma 3.7
Proof: It follows from Lemma 3.6 that
which, in view of the definition of in (21), can be easily seen to be equivalent to the conclusion of the lemma.
The following pointwise convergence rate result describes the convergence rate of the sequence of residual pairs associated to the sequence . Note that its convergence rate bounds are derived on the best residual pair among for rather than on the last residual pair .
Theorem 3.8
Proof: Statement (a) (resp., (b)) follows from Lemma 3.7 with (resp., ).
From now on, we focus on the ergodic convergence rate of the NEHPE framework. For , define and the ergodic sequences
(24) 
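A minimal sketch of how such ergodic quantities are typically formed is given below. It assumes the construction that is standard in the HPE literature: with stepsizes lam_i, iterates ztilde_i, residuals r_i and errors eps_i, the ergodic iterate and residual are the lam-weighted averages of ztilde_i and r_i, and the ergodic error is the lam-weighted average of eps_i + <r_i, ztilde_i - ztilde_avg>. Whether this matches (24) term by term is an assumption, and the function and variable names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def ergodic_averages(lams, z_tildes, rs, epss):
    """Return (z_avg, r_avg, eps_avg) for the assumed standard ergodic construction."""
    Lam = sum(lams)  # sum of the stepsizes
    n = len(z_tildes[0])
    z_avg = [sum(l * z[j] for l, z in zip(lams, z_tildes)) / Lam for j in range(n)]
    r_avg = [sum(l * r[j] for l, r in zip(lams, rs)) / Lam for j in range(n)]
    eps_avg = sum(
        l * (e + dot(r, [zj - aj for zj, aj in zip(z, z_avg)]))
        for l, z, r, e in zip(lams, z_tildes, rs, epss)
    ) / Lam
    return z_avg, r_avg, eps_avg


# Toy data for T(z) = z on the real line: r_i = ztilde_i and eps_i = 0
lams = [1.0, 1.0]
z_tildes = [[2.0], [0.0]]
rs = [[2.0], [0.0]]
epss = [0.0, 0.0]

z_avg, r_avg, eps_avg = ergodic_averages(lams, z_tildes, rs, epss)
```

In this toy run the averaged pair is (1, 1) with ergodic error 1. The error is positive even though every eps_i is zero, because the cross terms <r_i, ztilde_i - ztilde_avg> contribute to it.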
The following ergodic convergence result describes the association between the ergodic iterate and the residual pair , and gives a convergence rate bound on the latter residual pair.
Theorem 3.9
Proof: The inequality and the inclusion follow from (24) and the transportation formula (see [5, Theorem 2.3]). Now, let be given. Using (8), (14) and (24), we easily see that
Hence, in view of Proposition 3.5(a), and relations (11) and (21), we have
This inequality together with the definition of clearly implies the bound on . We now establish the bound on . Using inequality (18) with , noting (24), and using the fact that is convex and , we conclude that
On the other hand, (12) with implies that for every and ,
where the last inequality is due to Proposition 3.5(a). Combining the above two relations and using the definitions of and , we then conclude that the bound on holds.
We now establish the bounds on under either one of the conditions (a) or (b). First, if , then it follows from (15) and Proposition 3.5 that
for every and . Noting (20) and (25), we then conclude that (26) holds. Assume now that is bounded. Using (12) with and Proposition 3.5(a), and noting the definition of in (b), we conclude that
for every and . Hence, noting (20) and (25), we conclude that (27) holds.
In the remaining part of this subsection, we state some results about the sequence generated by an instance of the NEHPE framework. We assume from now on that such an instance generates an infinite sequence of iterates, i.e., the instance does not terminate in a finite number of steps and no termination criterion is checked. Since we are not assuming that the distance generating function is nondegenerate on , it is not possible to establish convergence of the sequence generated by the NEHPE framework to a solution of (13). However, under some mild assumptions, it is possible to establish that approaches a point if the proximity measure used is the actual Bregman distance.
Proposition 3.10
Assume that for some infinite index set and some , we have
(28) 
Then, . If, in addition, , then .
Proof: Using the two limits in (28), and the fact that every maximal monotone operator is closed and for every , we conclude that . This conclusion together with Assumption A0 then implies that the first assertion of the proposition holds and that is nonincreasing in view of Proposition 3.5(a). To show the second assertion, assume that . Since Lemma 3.3 with implies
and the second limit in (28) clearly implies that , we then conclude that . Clearly, since is nonincreasing, we have that , and hence that the second assertion holds.
Proposition 3.11
Assume that , and is bounded. Then, there exists such that
(29) 
Proof: The assumption that and together with Theorem 3.8(b) imply that there exists a subsequence converging to zero. Since is bounded, we may assume without loss of generality (by passing to a subsequence if necessary) that converges to some . Hence, by the first part of Proposition 3.10, we conclude that