Topological Interference Management with User Admission Control via Riemannian Optimization
Abstract
Topological interference management (TIM) provides a promising way to manage interference based only on the network connectivity information. Previous works on the TIM problem mainly focus on using the index coding approach and graph theory to establish conditions on network topologies under which topological interference management is feasible. In this paper, we propose a novel user admission control approach via sparse and low-rank optimization to maximize the number of admitted users for which topological interference management is feasible. To assist efficient algorithm design for the formulated rank-constrained (i.e., degrees-of-freedom (DoF) allocation) $\ell_0$-norm maximization (i.e., user capacity maximization) problem, we propose a regularized smoothed $\ell_1$-norm minimization approach to induce a sparsity pattern, thereby guiding the user selection. We further develop a Riemannian trust-region algorithm to solve the resulting rank-constrained smooth optimization problem by exploiting the quotient manifold of fixed-rank matrices. Simulation results demonstrate the effectiveness and near-optimal performance of the proposed Riemannian algorithm in maximizing the number of admitted users for topological interference management.
Topological interference management, user admission control, sparse and low-rank modeling, Riemannian optimization, quotient manifold.
1 Introduction
The popularization of innovative applications and new services, such as Internet of Things (IoT) and wearable devices [1], is driving the era of wireless big data [2], thereby revolutionizing various segments of society. In particular, with ultra-low latency and ultra-reliable requirements, Tactile Internet [3] enables a paradigm shift from content-delivery to skillset-delivery networks. Network densification [4, 5], supported by advanced wireless technologies (e.g., massive MIMO [6], Cloud-RAN [7, 8], and small cells [9, 10]), becomes the key enabling technology to accommodate the exponential mobile data traffic growth, as well as to provide ubiquitous connectivity for massive numbers of devices. However, as more radio access points are added per unit volume, interference becomes the bottleneck to harnessing the benefits of network densification. Although recent developments in interference alignment [11] and interference coordination [12] have been shown to be effective in interference-limited communication scenarios, the significant signaling overhead of obtaining the global channel state information (CSI) limits their applicability to dense wireless networks [13].
To reduce the CSI acquisition overhead and make it scalable in dense wireless networks, the topological interference management (TIM) approach was proposed in [13] to manage interference based only on the network connectivity information. However, establishing the feasibility of topological interference management is a challenging task. In the slow fading scenario, i.e., when channels stay constant during transmission, the TIM problem turns out to be equivalent to the index coding problem [14], which is NP-hard in general and has been solved only in some special cases [15, 13]. Furthermore, topological interference management with transmitter cooperation and with multiple transmitter antennas was investigated in [16] and [17], respectively. In the fast fading scenario, graph theory and matroid theory were adopted to find the conditions on network topologies to achieve a certain amount of DoF allocation [18, 19]. A low-rank matrix completion approach with Riemannian algorithms has recently been proposed in [20] to find the minimum number of channel uses needed to achieve feasibility for any network topology.
In contrast, in this paper, we propose a different viewpoint: given any network topology and DoF allocation, we aim at finding the maximum number of admitted users for which topological interference management is feasible. We call this problem user admission control in topological interference management. User admission control is critical in wireless communication networks (e.g., cognitive radio access networks [21], heterogeneous networks [22] and Cloud-RAN [23]) when quality-of-service (QoS) requirements are unsatisfied or the channel conditions are unfavorable [24]. Although user admission control problems are normally nonconvex mixed combinatorial optimization problems, a large body of recent work has demonstrated the effectiveness of convex relaxation for solving such problems [21, 22, 23, 24] based on the sum-of-infeasibilities heuristic in optimization theory [25]. This is achieved by relaxing the original nonconvex $\ell_0$-norm minimization problem for user admission control to a convex $\ell_1$-norm minimization problem [25, 26].
Unfortunately, the user admission control problem in topological interference management turns out to be highly intractable, as it requires optimizing over both continuous and combinatorial variables. To address the intractability, in this paper, we propose a sparse and low-rank modeling framework that computes solutions in polynomial time. In this model, the sparsity of the diagonal entries of the matrix (i.e., the number of nonzero entries) represents the number of admitted users. The fixed low-rank constraint indicates the DoF allocation [20]. However, unique challenges arise in the proposed sparse and low-rank optimization model, including the nonconvex fixed-rank constraint and the user capacity maximization objective function, i.e., $\ell_0$-norm maximization. Novel algorithms thus need to be developed.
1.1 Related Works
User Admission Control
In dense wireless networks, user admission control is critical to maximize the user capacity while satisfying the QoS requirements for all the admitted users. To address the NP-hardness of the mixed combinatorial optimization problem, the sparse optimization (e.g., $\ell_0$-norm minimization) approach, supported by efficient algorithms (e.g., $\ell_1$-norm convex relaxation [22, 21] and the iterative reweighted algorithm [23]), provides an efficient way to find high-quality solutions. However, the convex relaxation approach is inapplicable to our sparse and low-rank optimization problem because the objective is $\ell_0$-norm maximization. The $\ell_1$-norm relaxation approach yields an $\ell_1$-norm maximization problem, which is still nonconvex. Furthermore, maximizing the $\ell_1$-norm yields unbounded values.
Low-Rank Models
Low-rank models [27, 28] have inspired numerous applications in machine learning, recommendation systems, sensor localization, etc. Due to the nonconvexity of the low-rank constraint or objective, many heuristic algorithms with optimality guarantees have been proposed in the last few years. In particular, the convex relaxation approach using the nuclear norm [29] provides a polynomial-time algorithm with optimality guarantees via convex geometry and conic integral geometry analysis [30].
The other popular way for low-rank optimization is based on matrix factorization, e.g., alternating minimization [31, 28] and the Riemannian optimization method [32]. In particular, the Riemannian optimization approach requires smoothness of the objective function, while the alternating approach requires convexity of the objective function. However, because our objective function is nonconvex and nonsmooth, we cannot directly apply the existing matrix factorization approaches to solve the proposed sparse and low-rank optimization framework for user admission control.
Based on the above discussions, in contrast to the previous works on user admission control [21, 22, 23, 24] and low-rank optimization problems [27, 31, 28, 32], we need to address the following coupled challenges to solve the sparse and low-rank optimization for user admission control in topological interference management:

The objective of maximizing the nonconvex $\ell_0$-norm to maximize the user capacity, i.e., the number of admitted users;

The nonconvex fixed-rank constraint to achieve a certain amount of DoF allocation.
Therefore, unique challenges arise in the user admission control problem for topological interference management. We need to redesign the sparsity-inducing function and develop an efficient approach to deal with the fixed-rank constraint.
1.2 Contributions
In this paper, we propose a sparse and low-rank optimization framework for user admission control in topological interference management. A Riemannian trust-region algorithm is developed to solve the proposed regularized smoothed $\ell_1$-norm sparsity-inducing minimization problem, thereby guiding user selection. The main contributions are summarized as follows:

We propose a novel sparse and low-rank optimization framework to maximize the number of admitted users for which topological interference management is feasible.

To avoid unboundedness in the relaxed $\ell_1$-norm maximization problem, a regularized smoothed $\ell_1$-norm is proposed to induce a sparsity pattern with bounded values, thereby guiding user selection.

A Riemannian trust-region algorithm is developed to solve the resulting rank-constrained smooth optimization problem for sparsity inducing. This is achieved by exploiting the quotient manifold of fixed-rank matrices.

Simulation results demonstrate the effectiveness and near-optimal performance of the proposed Riemannian algorithm in maximizing the user capacity for topological interference management.
1.3 Organization
The remainder of the paper is organized as follows. Section 2 presents the system model and problem formulation. A sparse and low-rank optimization framework for user admission control is proposed in Section 3. The Riemannian optimization algorithm is developed in Section 4. The ingredients of optimization on quotient manifold are presented in Section 5. Numerical results are illustrated in Section 6. Finally, conclusions and discussions are presented in Section 7.
Notations
Throughout this paper, $\|\cdot\|_p$ is the $\ell_p$-norm. Boldface lower case and upper case letters represent vectors and matrices, respectively. $(\cdot)^{-1}$, $(\cdot)^{T}$, $(\cdot)^{\mathsf{H}}$ and $\mathrm{Tr}(\cdot)$ denote the inverse, transpose, Hermitian and trace operators, respectively. We use $\mathbb{C}$ and $\mathbb{R}$ to represent complex domain and real domain, respectively. $\mathbb{E}[\cdot]$ denotes the expectation of a random variable. $|\cdot|$ stands for either the size of a set or the absolute value of a scalar, depending on the context. We denote and as a diagonal matrix of order and the identity matrix of order , respectively.
2 System Model and Problem Formulation
In this section, we present the channel model, followed by the user admission control problem to achieve the feasibility of topological interference management.
2.1 Channel Model
Consider the topological interference management problem in the partially connected user interference channel with each node equipped with a single antenna [13, 20]. Let be the index set of the connected transceiver pairs such that the channel coefficient between the transmitter and receiver is nonzero if , and is zero otherwise. Each transmitter wishes to send a message to its corresponding receiver . The message is encoded into a vector of length . Therefore, over the channel uses, the received signal at receiver is given by
(1) 
where is the additive noise at receiver . We consider the block fading channel, where the channel coefficients stay constant during transmission, i.e., the channel coherence time is larger than channel uses for transmission. We assume each transmitter has an average power constraint, i.e., with as the maximum average transmit power.
The rate tuple is said to be achievable if there exists a coding scheme such that the average decoding error probability vanishes as the code length approaches infinity. Here, we assume that each message is chosen uniformly and independently over the message sets . In this paper, we choose the symmetric DoF [13, 16] as our performance metric, i.e., the highest DoF achieved by all the users simultaneously,
(2) 
where is the capacity region defined as the set of all the achievable rate tuples. The DoF metric gives a first-order measure of data rates [33].
2.2 Topological Interference Management
In this paper, we restrict attention to the class of linear interference management strategies [11, 13, 20]. Specifically, each transmitter encodes its message by a linear precoding vector over channel uses:
(3) 
where is the transmitted data symbol. Here the precoding vectors ’s only depend on the knowledge of network topology . In this paper, we assume that the network connectivity information is available at the transmitters. Therefore, over the channel uses, the received signal at receiver can be rewritten as
(4) 
Let be the decoding vector for each message at receiver . In the regime of asymptotically high signal-to-noise ratio (SNR), to accomplish decoding, we impose the following interference alignment conditions [11, 13, 20] on the precoding and decoding vectors:
(5)  
(6) 
where the first condition preserves the desired signal and the second condition aligns and cancels the interference signals. If conditions (5) and (6) are satisfied, parallel interference-free channels are obtained over channel uses. Therefore, the symmetric DoF of is achieved for each message [13]. We call this problem topological interference management [13], as only network topology information is required to establish the interference alignment conditions.
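The two alignment conditions can be checked numerically for a candidate set of vectors. The following is a minimal sketch rather than the authors' implementation: the function name `ia_conditions_hold`, the row-wise storage of the decoding vectors in `U` and the precoding vectors in `V`, the `edges` list standing in for the connectivity set, and the tolerance are all assumptions made for illustration.

```python
import numpy as np

def ia_conditions_hold(U, V, edges, tol=1e-9):
    """Check the two interference alignment conditions for linear TIM.

    U, V  : (K, r) arrays; row k holds the decoding vector u_k and the
            precoding vector v_k of user k (assumed layout).
    edges : iterable of (k, i) pairs meaning transmitter i is heard at
            receiver k (hypothetical encoding of the network topology).
    """
    X = U.conj() @ V.T                      # X[k, i] = u_k^H v_i
    desired_ok = np.all(np.abs(np.diag(X)) > tol)
    interference_ok = all(abs(X[k, i]) <= tol for k, i in edges if k != i)
    return bool(desired_ok and interference_ok)

# Two fully connected users.
edges = [(0, 0), (0, 1), (1, 0), (1, 1)]
orthogonal = np.eye(2)        # r = 2 channel uses, one slot per user
single_use = np.ones((2, 1))  # r = 1 channel use shared by both users
```

With this toy topology, orthogonal signalling over two channel uses satisfies both conditions, while the single-channel-use choice leaves the cross links uncancelled.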
However, establishing the conditions on , and to achieve feasibility of the interference alignment conditions (5) and (6) is challenging. In particular, given a number of users and channel uses (or DoF allocation ), the index coding approach [13] and graph theory [16, 19, 18] were adopted to establish the conditions on the network topologies to achieve feasibility for the interference alignment conditions (5) and (6). The low-rank matrix completion approach [20] has recently been proposed to find the minimum number of channel uses satisfying conditions (5) and (6), given any network topology information and the number of users . The feasibility conditions on antenna configurations for interference alignment in the MIMO interference channel have also been extensively investigated using algebraic geometry [34, 35, 36].
In this paper, we put forth a different point of view on the feasibility conditions of topological interference management: given a number of users with any network topology and the symmetric DoF allocation , we present a novel user admission control approach to find the maximum number of admitted users while satisfying the interference alignment conditions (5) and (6). Although user admission control has been extensively investigated in the scenarios of multiuser coordinated beamforming [24], cognitive radio networks [21], heterogeneous cellular networks [22] and Cloud-RAN [23], this is the first work to apply the principle of user admission control in the framework of topological interference management. This provides a systematic framework for efficient algorithm design, as well as numerical insights into this challenging problem of topological interference management.
3 A Sparse and Low-Rank Optimization Framework for User Admission Control
In this section, we present a user admission control approach to maximize the user capacity, i.e., find the maximum number of admitted users while satisfying the interference alignment conditions (5) and (6). This viewpoint is different from the previous works on finding the conditions of network topologies to achieve the feasibility of interference alignment [13, 19, 16, 18].
3.1 Feasibility of Interference Alignment
Given any network connectivity information for the partially connected user interference channel, we say that the symmetric DoF allocation is feasible if there exist precoding vectors and decoding vectors such that the interference alignment conditions (5) and (6) are satisfied. Specifically, the feasibility problem of topological interference management can be formulated as
(7)  
where and are optimization variables.
However, the solution to the feasibility problem (7) is unknown in general. In particular, the index coding approach [13] and graph theory [16, 19, 18] were adopted to establish the conditions on the network topology to achieve feasibility of interference alignment. On the other hand, the low-rank matrix completion approach was proposed in [20] to find the minimum number of channel uses to achieve interference alignment feasibility for any network topology .
In contrast, in this paper, our goal is to maximize the user capacity, i.e., to find the maximum number of admitted users while satisfying the interference alignment conditions:
(8)  
where is the set of admitted users, and . This problem is called the user admission control problem. Unfortunately, it turns out to be highly intractable due to the nonconvex quadratic constraints and the nonconvex combinatorial objective function. To assist efficient algorithm design, in this paper, we propose a sparse and low-rank optimization framework for user admission control by exploiting the sparse and low-rank structures in problem (8).
3.2 Sparse and Low-Rank Optimization Paradigms for User Admission Control
Let with . The interference alignment conditions (5) and (6) thus can be rewritten as
(9)  
(10) 
For other entries , they can be any values. Observing that the achievable symmetric DoF is given by
(11) 
a low-rank matrix completion problem was proposed in [20] to find the minimum number of channel uses while satisfying the interference alignment conditions. Fig. 1 demonstrates the procedure of transforming the topological interference alignment conditions (5) and (6) into the associated incomplete matrix .
Define as the submatrix of , i.e., . The rank of the submatrix equals . The user admission control problem (8) can be further reformulated as follows:
(12)  
where the first constraint preserves the symmetric DoF allocation as . However, problem (12) is still a highly intractable mixed combinatorial optimization problem with a nonconvex fixed-rank constraint and a combinatorial objective function.
To enable polynomial-time algorithm design, we further reveal the sparsity structure in problem (12) for user admission control. We notice that
(13) 
where extracts the diagonal of a matrix and is the $\ell_0$-norm of a vector, i.e., the count of nonzero entries. Problem (12) can be further reformulated as the following sparse and low-rank optimization problem, i.e.,
(14)  
Notice that we only need to consider problem (14) in the real field without losing any performance in terms of admitted users. The reason is that the affine constraint (14) is restricted to the real field and the diagonal entries of matrix can be further restricted to the real field while achieving the same value of as in the complex field.
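To make the sparse and low-rank formulation concrete, the sketch below evaluates its objective (the number of nonzero diagonal entries) and its two constraints (fixed rank, zeros on the interfering links) for a candidate real matrix. The helper names, the `edges` encoding of the topology and the three-user example are hypothetical and chosen only for illustration.

```python
import numpy as np

def admitted_user_count(X, tol=1e-9):
    """l0-norm of diag(X): the number of nonzero diagonal entries."""
    return int(np.sum(np.abs(np.diag(X)) > tol))

def satisfies_constraints(X, r, edges, tol=1e-9):
    """True if rank(X) == r and X[k, i] == 0 on every interfering link (k, i)."""
    rank_ok = np.linalg.matrix_rank(X, tol=tol) == r
    zeros_ok = all(abs(X[k, i]) <= tol for k, i in edges if k != i)
    return bool(rank_ok and zeros_ok)

# Three users: user 2 interferes with users 0 and 1, and vice versa.
edges = [(0, 2), (2, 0), (1, 2), (2, 1)]

# Rank-one candidate that admits users 0 and 1 and drops user 2.
u = np.array([1.0, 1.0, 0.0])
X = np.outer(u, u)
```

In this toy case a rank-one matrix can keep two nonzero diagonal entries, whereas the identity matrix would admit all three users only at rank three.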
Sparse optimization has been shown to be powerful for user admission problems [24, 21, 22, 23] via $\ell_1$-norm minimization using the sum-of-infeasibilities convex relaxation heuristic in optimization theory [25, Section 11.4]. In particular, maximizing the number of admitted users is equivalent to minimizing the number of violated inequalities for the quality-of-service (QoS) constraints. Although problem (14) adopts the same philosophy of using the $\ell_0$-norm to count the number of admitted users (13), it poses unique challenges due to the $\ell_0$-norm maximization and the nonconvex fixed-rank constraint. However, compared with the original formulation (12), the sparse and low-rank optimization formulation (14) holds algorithmic advantages, which are demonstrated in the sequel via the Riemannian optimization approach [37].
3.3 Problem Analysis
In this subsection, we reveal the unique challenges of solving the sparse and low-rank optimization problem for user admission control in topological interference management.
Nonconvex Objective Function
Although the $\ell_1$-norm serves as the convex surrogate for the nonconvex $\ell_0$-norm [25, 26], it is inapplicable in problem (14) for $\ell_0$-norm maximization, as it yields unbounded values. To aid efficient algorithm design, we propose a novel regularized $\ell_1$-norm to induce sparsity with bounded values. This is achieved by adding a quadratic term to the $\ell_1$-norm as follows:
(15) 
where and is a weighting parameter. A typical example with and is illustrated in Fig. 2, which upper bounds all the diagonal values by .
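As an illustration of why a quadratic term bounds the objective, assume the regularized function has the form $\|x\|_1 - \lambda\|x\|_2^2$ (the symbol $\lambda$ and this exact form are assumptions, since only the verbal description is available here). Each entry then contributes $|t| - \lambda t^2$, which is maximized at $|t| = 1/(2\lambda)$ with value $1/(4\lambda)$. A short numerical check of that claim:

```python
import numpy as np

def regularized_l1(x, lam):
    """Assumed form of the regularized l1 surrogate: ||x||_1 - lam * ||x||_2^2."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x)) - lam * np.sum(x ** 2)

lam = 0.5                                    # illustrative weighting parameter
grid = np.linspace(-3.0, 3.0, 6001)          # scalar arguments with step 1e-3
values = np.array([regularized_l1([t], lam) for t in grid])
best = grid[np.argmax(values)]               # a maximizer on the grid
```

Without the quadratic term the scalar function $|t|$ grows without bound, so the maximization would have no finite solution.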
Nonconvex Fixed-Rank Constraint
Matrix factorization serves as a powerful way to address the nonconvexity of the set of fixed-rank matrices. One popular way is to factorize a fixed-rank matrix (in real field) as with and , followed by alternately optimizing over and while holding the other fixed [28, 31]. However, due to the nonconvex objective function in problem (14), the resulting optimization problem over or is still nonconvex. Furthermore, such a factorization is not unique, as remains unchanged under the transformation of the factors
(16) 
for all nonsingular matrices of size . As a result, the critical points of an objective function parameterized with and are not isolated on . This profoundly affects the performance of second-order optimization algorithms, which require nondegenerate critical points, a requirement that is no longer met here. We propose to address this issue by exploiting the quotient manifold geometry of the set of fixed-rank matrices [38]. The resulting nonconvex optimization problem is then solved within the Riemannian optimization framework, which provides systematic ways to develop algorithms on quotient manifolds [37].
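The non-uniqueness of the factorization is easy to verify numerically. The sketch below assumes the transformation $(U, V) \mapsto (UM^{-1}, VM^{T})$, which is a standard form that leaves $UV^{T}$ unchanged; the variable names, sizes and the particular matrix `M` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
K, r = 6, 2
U = rng.standard_normal((K, r))
V = rng.standard_normal((K, r))
M = np.array([[2.0, 1.0],
              [0.5, 3.0]])          # nonsingular, det = 5.5

U_new = U @ np.linalg.inv(M)        # U M^{-1}
V_new = V @ M.T                     # V M^T
X = U @ V.T
X_new = U_new @ V_new.T             # (U M^{-1})(V M^T)^T = U V^T
```

The factors change while the product does not, so every point `(U, V)` sits on a whole family of equivalent factorizations.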
In summary, in this paper, we propose a new approach to induce sparsity in the solution to problem (14), followed by the Riemannian optimization approach that exploits the quotient manifold geometry of fixed-rank matrices. The induced sparsity pattern guides user selection for user admission control.
4 Regularized Smoothed $\ell_1$-Minimization for Sparse and Low-Rank Optimization via Riemannian Optimization
In this section, we present a Riemannian framework for the sparse and low-rank optimization problem via regularized smoothed $\ell_1$-minimization by exploiting the quotient manifold geometry of fixed-rank matrices. The induced sparse solution to problem (14) provides a guideline for user admission control, supported by a user selection procedure. In the final stage, a low-rank matrix completion approach with Riemannian optimization is adopted to design the linear topological interference management strategy. The proposed three-stage Riemannian framework for user admission control in topological interference management is presented in Fig. 3.
4.1 Stage One: Regularized Smoothed $\ell_1$-Minimization for Sparsity Inducing
In order to make problem (14) numerically tractable, we relax the nonconvex $\ell_0$-norm objective function to its convex surrogate, the $\ell_1$-norm, resulting in the following optimization problem:
(17) 
Although the $\ell_1$-norm is tractable, the objective is unbounded from above because the $\ell_1$-norm is maximized, which makes problem (17) ill-posed. Note that maximizing a convex norm is still a nonconvex problem.
To circumvent the unboundedness issue, we add the quadratic term to the objective function in problem (17), where is a weighting parameter that bounds the overall objective function from above, leading to the formulation
(18) 
For example, if , then the diagonal values of are upper bounded by . It should be emphasized that the role of in (18) is to upper bound the objective function and it does not affect the sparsity pattern that is expected from (17). This is further confirmed via simulations in Section 6. Additionally, if is a solution to (14), then is also a solution to (14) for every nonzero scalar . Equivalently, there exists a continuum of solutions, an ambiguity that is effectively resolved by the objective function in (18).
Although problem (18) is still nonconvex due to the nonconvex objective (i.e., maximizing a convex function) and the nonconvex fixed-rank constraint, it has the algorithmic advantage that it can be solved efficiently (i.e., numerically) in the framework of Riemannian optimization [37].
Riemannian Optimization for Fixed-Rank Optimization
In this subsection, we propose a Riemannian optimization algorithm to solve the nonconvex optimization problem (18), which is equivalent to
(19) 
However, the intersection of the rank constraint and the affine constraint is challenging to characterize. We therefore propose to solve problem (19) via a regularized version as follows:
(20) 
where is the regularization parameter and is the parameter that approximates with the smooth term that makes the objective function differentiable. A very small leads to ill-conditioning of the objective function in (20). Since we intend to obtain only the sparsity pattern of the optimal , we set to a relatively high value, e.g., , to make problem (20) well conditioned. Problem (20) is an optimization problem over the set of fixed-rank matrices and can be solved via a Riemannian trust-region algorithm [37].
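A smoothed objective of this kind and its Euclidean gradient can be sketched as follows. The exact form used here, $\tfrac12\|P_\Omega(X)\|_F^2 + \rho\big(\lambda\|d\|_2^2 - \sum_i \sqrt{d_i^2+\epsilon^2}\big)$ with $d = \mathrm{diag}(X)$, is an assumption reconstructed from the verbal description; the symbols `lam`, `rho`, `eps`, the 0/1 `mask` for the interfering entries and the parameter values are all illustrative.

```python
import numpy as np

def smoothed_objective(X, mask, lam, rho, eps):
    """Assumed smoothed cost: least-squares penalty on the entries that must
    vanish plus rho times the smoothed, regularized sparsity-inducing term."""
    d = np.diag(X)
    fit = 0.5 * np.sum((mask * X) ** 2)
    sparsity = lam * np.sum(d ** 2) - np.sum(np.sqrt(d ** 2 + eps ** 2))
    return fit + rho * sparsity

def smoothed_gradient(X, mask, lam, rho, eps):
    """Euclidean gradient of smoothed_objective with respect to X."""
    d = np.diag(X)
    grad = mask * X                      # mask is 0/1, so mask**2 == mask
    diag_part = 2.0 * lam * d - d / np.sqrt(d ** 2 + eps ** 2)
    return grad + rho * np.diag(diag_part)

rng = np.random.default_rng(1)
K = 5
X = rng.standard_normal((K, K))
mask = (rng.random((K, K)) < 0.4).astype(float)  # 1 marks an interfering link
np.fill_diagonal(mask, 0.0)                       # diagonal is not forced to zero
value = smoothed_objective(X, mask, lam=0.5, rho=1.0, eps=0.01)
grad = smoothed_gradient(X, mask, lam=0.5, rho=1.0, eps=0.01)
```

The square-root term is differentiable at zero for any positive `eps`, which is what allows a gradient-based (and hence Riemannian trust-region) method to be applied; its curvature near zero grows as `eps` shrinks, which matches the ill-conditioning remark above.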
4.2 Stage Two: Finding Sparsity Pattern for User Admission Control
Let be the solution to the regularized smoothed $\ell_1$-minimization problem (20). We order the diagonal entries of matrix , i.e., the vector , in descending order: . Users with larger coefficients have a higher priority to be admitted. We adopt a bisection search procedure to find the maximum number of admitted users. Specifically, let be the maximum number of users that can be admitted while satisfying the interference alignment conditions. To determine the value of , a sequence of the following size-reduced topological interference management feasibility problems needs to be solved,
(21)  
where .
To check the feasibility, we rewrite problem (21) as follows:
(22) 
where and is the orthogonal projection operator onto the subspace of matrices which vanish outside such that the -th component of equals if and zero otherwise. If the objective value approaches zero, we say that the set of users can be admitted. Problem (22) can be solved by Riemannian trust-region algorithms [39] via Manopt [40]. Note that, theoretically, the Riemannian algorithm can only guarantee convergence to a first-order critical point, but empirically, we observe convergence to critical points that are local minima.
4.3 Stage Three: Low-Rank Matrix Completion for Topological Interference Management
Let be the set of admitted users. We need to solve the following size-reduced rank-constrained matrix completion problem:
(23) 
to find the precoding vectors ’s and decoding vectors ’s for the admitted users in .
Therefore, the proposed three-stage Riemannian optimization based user admission control algorithm is presented in Algorithm 1.
Step 0: Solve the sparsity-inducing optimization problem (20) using the Riemannian trust-region algorithm in Section 5. Obtain the solution and sort the diagonal entries in descending order: , then go to Step 1.
Step 1: Initialize , , .
Step 2: Repeat
Set .
Step 3: Until , obtain and obtain the admitted users set .
Step 4: Solve problem (23) to obtain the precoding and decoding vectors for the admitted users.
End
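The bisection in Steps 1 to 3 can be sketched as below. This is a hedged illustration, not the authors' code: `select_admitted_users` and the `is_feasible` oracle are hypothetical names, the oracle stands in for solving the size-reduced feasibility problem, ordering by magnitude is an assumption, and the search assumes that feasibility is monotone along the priority ordering.

```python
from typing import Callable, List, Sequence

def select_admitted_users(diag_values: Sequence[float],
                          is_feasible: Callable[[List[int]], bool]) -> List[int]:
    """Bisection over the number of admitted users.

    diag_values : diagonal of the stage-one solution, one entry per user.
    is_feasible : hypothetical oracle returning True when interference
                  alignment is feasible for the given list of user indices.
    """
    # Users with larger diagonal magnitude get higher admission priority.
    order = sorted(range(len(diag_values)), key=lambda k: -abs(diag_values[k]))
    if is_feasible(order):              # every user can be admitted
        return order
    low, high = 0, len(order)           # low users feasible, high users infeasible
    while high - low > 1:
        mid = (low + high) // 2
        if is_feasible(order[:mid]):
            low = mid
        else:
            high = mid
    return order[:low]

# Toy oracle: at most three users can be admitted simultaneously.
admitted = select_admitted_users([0.1, 0.9, 0.5, 0.7, 0.2],
                                 lambda users: len(users) <= 3)
```

Each oracle call corresponds to one feasibility problem, so the number of such problems grows only logarithmically with the number of users.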
4.4 The Framework of Fixed-Rank Riemannian Manifold Optimization
The optimization problems (20), (22), and (23) are least-squares optimization problems with a fixed-rank constraint. A rank matrix is parameterized as , where and are full column-rank matrices. Such a factorization, however, is not unique, as remains unchanged under the transformation of the factors
(24) 
for all nonsingular matrices , the set of nonsingular matrices. Equivalently, for all nonsingular matrices . As a result, the critical points of an objective function parameterized with and are not isolated on .
The classical remedy to remove this indeterminacy requires further (triangular-like) structure in the factors and . For example, LU decomposition is a way forward. In contrast, we encode the invariance map (24) in an abstract search space by optimizing directly over a set of equivalence classes
(25) 
The set of equivalence classes is termed the quotient space and is denoted by
(26) 
where the total space is the product space .
Consequently, if an element has the matrix characterization , then (20), (22), and (23) are of the form
(27) 
where is defined in (25) and is a smooth function on , but now induced (with slight abuse of notation) on the quotient space (26).
The quotient space has the structure of a smooth Riemannian quotient manifold of by [38]. The Riemannian structure conceptually transforms a rank-constrained optimization problem into an unconstrained optimization problem over the nonlinear manifold . Additionally, it allows one to compute objects such as the gradient (of an objective function) and to develop a Riemannian trust-region algorithm on that uses second-order information for faster convergence [37].
Table 1: Manifold-related ingredients.
Matrix representation
Total space
Group action
Quotient space
Vectors in the ambient space
Matrix representation of a tangent vector in
Metric for any
Vertical tangent vectors in
Horizontal tangent vectors in
Projection of a tangent vector on the horizontal space: , where .
Retraction of a horizontal vector onto the manifold
Matrix representation of the Riemannian gradient: , where and are the partial derivatives of with respect to and , respectively.
Matrix representation of the Riemannian Hessian along a horizontal vector: , where has the representation shown above. The matrix representation of the Riemannian connection is shown in (34). Finally, the projection operator is defined in (31).
5 Optimization on Quotient Manifold
Consider an equivalence relation in the total (computational) space . The quotient manifold generated by this equivalence property consists of elements that are equivalence classes of the form . Equivalently, if is an element in , then its matrix representation in is . In the context of the rank constraint, is identified with , i.e., the fixed-rank manifold. Fig. 4 shows a schematic viewpoint of optimization on a quotient manifold. In particular, we need the notion of a “linearization” of the search space, a “search” direction, and a way to “move” on a manifold. Below we show the concrete development of these objects, which allows us to develop a second-order trust-region algorithm on manifolds. The concrete manifold-related ingredients are shown in Table 1, which are based on the developments in [41].
Since the manifold is an abstract space, the elements of its tangent space at also call for a matrix representation in the tangent space that respects the equivalence relation . Equivalently, the matrix representation of should be restricted to the directions in the tangent space on the total space at that do not induce a displacement along the equivalence class . This is realized by decomposing into complementary subspaces, the vertical and horizontal subspaces such that . The vertical space is the tangent space of the equivalence class . On the other hand, the horizontal space , which is any complementary subspace to in , provides a valid matrix representation of the abstract tangent space [37, Section 3.5.8]. An abstract tangent vector at has a unique element in the horizontal space that is called its horizontal lift. Our specific choice of the horizontal space is the subspace of that is the orthogonal complement of in the sense of a Riemannian metric (an inner product).
A Riemannian metric or an inner product at in the total space defines a Riemannian metric , i.e.,
(28) 
on the quotient manifold , provided that the expression does not depend on a specific representation along the equivalence class . Here and are tangent vectors in , and are their horizontal lifts in at . Equivalently, if is another element that belongs to and and are the horizontal lifts of and at , then the metric in (28) obeys the equality . Such a metric is then said to be invariant to the equivalence relation .
In the context of fixed-rank matrices, there exist metrics which are invariant. A particular invariant Riemannian metric on the total space that takes into account the symmetry (24) imposed by the factorization model and that is well suited to a least-squares objective function [41] is
(29) 
where and . It should be noted that the tangent space has the matrix characterization , i.e., (and similarly ) has the matrix representation .
To show that (29) is invariant to the transformation (24), we assume that another element has matrix representation for a nonsingular square matrix . Similarly, we assume that the tangent vector (similarly ) has matrix representation . If and (similarly for and ) are the horizontal lifts of at and , respectively, then we have and [37, Example 3.5.4], and similarly for . A few computations then show that , which implies that the metric (29) is invariant to the transformation (24) along the equivalence class . This implies that we have a unique metric on the quotient space .
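The invariance argument can be checked numerically. The sketch below assumes the metric $\mathrm{Tr}\big((V^{T}V)\,\xi_U^{T}\eta_U\big) + \mathrm{Tr}\big((U^{T}U)\,\xi_V^{T}\eta_V\big)$ and the transformation $(U, V) \mapsto (UM^{-1}, VM^{T})$; both are assumed forms consistent with the description above, and all names and sizes are illustrative.

```python
import numpy as np

def metric(U, V, xi, eta):
    """Assumed metric: Tr((V^T V) xi_U^T eta_U) + Tr((U^T U) xi_V^T eta_V)."""
    xi_U, xi_V = xi
    eta_U, eta_V = eta
    return (np.trace((V.T @ V) @ xi_U.T @ eta_U)
            + np.trace((U.T @ U) @ xi_V.T @ eta_V))

rng = np.random.default_rng(0)
K, r = 6, 2
U, V = rng.standard_normal((K, r)), rng.standard_normal((K, r))
xi = (rng.standard_normal((K, r)), rng.standard_normal((K, r)))
eta = (rng.standard_normal((K, r)), rng.standard_normal((K, r)))
M = np.array([[2.0, 1.0], [0.5, 3.0]])      # nonsingular, det = 5.5
Minv = np.linalg.inv(M)

# Move the point and both tangent vectors along the equivalence class.
g_here = metric(U, V, xi, eta)
g_there = metric(U @ Minv, V @ M.T,
                 (xi[0] @ Minv, xi[1] @ M.T),
                 (eta[0] @ Minv, eta[1] @ M.T))
```

The two values agree up to rounding, which is the numerical counterpart of the statement that the metric does not depend on the chosen representative.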
Motivation for the metric (29) comes from the fact that it is induced from a block diagonal approximation of the Hessian of a simpler cost function , which is strictly convex in and individually. This block diagonal approximation ensures that the cost of computing (29) depends linearly on and that the metric is well suited for least-squares problems. Similar ideas have also been exploited in [20, 42, 43], which show robust performance of Riemannian algorithms for various least-squares problems.
Once the metric (29) is defined on , the development of the geometric objects required for second-order optimization follows [37, 41]. The matrix characterizations of the tangent space , vertical space , and horizontal space are straightforward with the expressions:
(30) 
Apart from the characterization of the horizontal space, we need a linear mapping that projects vectors from the tangent space onto the horizontal space. Projecting an element onto the horizontal space is accomplished with the operator
(31) 
where is uniquely obtained by ensuring that belongs to the horizontal space characterized in (30). Finally, the expression of is
5.1 Gradient and Hessian Computations
The choice of the metric (29) and of the horizontal space (as the orthogonal complement of ) turns the quotient manifold into a Riemannian submersion of [37, Section 3.6.2]. This special construction allows for a convenient matrix representation of the gradient [37, Section 3.6.2] and the Hessian [37, Proposition 5.3.3] on the quotient manifold . Below we show the gradient and Hessian computations for the problem (27).
The Riemannian gradient of on is uniquely represented by its horizontal lift in which has the matrix representation
(32) 
where is the gradient of in and and are the partial derivatives of with respect to and , respectively.
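The gradient and projection computations can be sketched as follows, under the same assumed metric $\mathrm{Tr}\big((V^{T}V)\,\xi_U^{T}\eta_U\big) + \mathrm{Tr}\big((U^{T}U)\,\xi_V^{T}\eta_V\big)$. With that choice the Riemannian gradient scales the Euclidean partial derivatives by $(V^{T}V)^{-1}$ and $(U^{T}U)^{-1}$, and horizontal vectors satisfy $U^{T}\zeta_U V^{T}V = U^{T}U\,\zeta_V^{T}V$. The function names and the least-squares example cost are hypothetical.

```python
import numpy as np

def horizontal_projection(U, V, eta):
    """Project (eta_U, eta_V) onto the horizontal space of the assumed metric."""
    eta_U, eta_V = eta
    UtU, VtV = U.T @ U, V.T @ V
    Lam = 0.5 * (eta_V.T @ V @ np.linalg.inv(VtV)
                 - np.linalg.solve(UtU, U.T @ eta_U))
    return eta_U + U @ Lam, eta_V - V @ Lam.T

def is_horizontal(U, V, zeta, tol=1e-9):
    """Horizontal vectors satisfy U^T zeta_U V^T V = U^T U zeta_V^T V."""
    zeta_U, zeta_V = zeta
    lhs = U.T @ zeta_U @ (V.T @ V)
    rhs = (U.T @ U) @ zeta_V.T @ V
    return bool(np.allclose(lhs, rhs, atol=tol))

def riemannian_gradient(U, V, f_U, f_V):
    """Scale the Euclidean partial derivatives by the inverse metric blocks."""
    return f_U @ np.linalg.inv(V.T @ V), f_V @ np.linalg.inv(U.T @ U)

# Example cost f(U, V) = 0.5 * ||U V^T - A||_F^2, which depends only on U V^T.
rng = np.random.default_rng(0)
K, r = 6, 2
U, V = rng.standard_normal((K, r)), rng.standard_normal((K, r))
A = rng.standard_normal((K, K))
R = U @ V.T - A                                  # residual
grad = riemannian_gradient(U, V, R @ V, R.T @ U)
```

Because the example cost depends only on the product $UV^{T}$, its gradient is already horizontal, and projecting a purely vertical direction $(-U\Lambda, V\Lambda^{T})$ returns zero.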
In addition to the Riemannian gradient computation (32), we also require the directional derivative of the gradient along a search direction. This is captured by a connection , which is the covariant derivative of vector field with respect to the vector field . The Riemannian connection on the quotient manifold is uniquely represented in terms of the Riemannian connection in the total space [37, Proposition 5.3.3] which is
(33) 
where and are vector fields in and and are their horizontal lifts in . Here is the projection operator defined in (31). It now remains to find out the Riemannian connection in the total space . We find the matrix expression by invoking the Koszul formula [37, Theorem 5.3.1]. After a routine calculation, the final expression is [41]
(34) 
and is the Euclidean directional derivative