Topological Interference Management with User Admission Control via Riemannian Optimization

Abstract

Topological interference management (TIM) provides a promising way to manage interference based only on network connectivity information. Previous works on the TIM problem mainly use the index coding approach and graph theory to establish conditions on network topologies under which topological interference management is feasible. In this paper, we propose a novel user admission control approach via sparse and low-rank optimization to maximize the number of admitted users for which topological interference management is feasible. To assist efficient algorithm design for the formulated rank-constrained (i.e., degrees-of-freedom (DoF) allocation) $\ell_0$-norm maximization (i.e., user capacity maximization) problem, we propose a regularized smoothed $\ell_1$-norm minimization approach to induce a sparsity pattern, thereby guiding the user selection. We further develop a Riemannian trust-region algorithm to solve the resulting rank-constrained smooth optimization problem by exploiting the quotient manifold of fixed-rank matrices. Simulation results demonstrate the effectiveness and near-optimal performance of the proposed Riemannian algorithm in maximizing the number of admitted users for topological interference management.

Index Terms

Topological interference management, user admission control, sparse and low-rank modeling, Riemannian optimization, quotient manifold.

1 Introduction

The popularization of innovative applications and new services, such as Internet of Things (IoT) and wearable devices [1], is driving the era of wireless big data [2], thereby revolutionizing many segments of society. In particular, with ultra-low latency and ultra-reliability requirements, the Tactile Internet [3] enables a paradigm shift from content-delivery to skill-set delivery networks. Network densification [4, 5], supported by advanced wireless technologies (e.g., massive MIMO [6], Cloud-RAN [7, 8], and small cells [9, 10]), becomes the key enabling technology to accommodate the exponential growth of mobile data traffic, as well as to provide ubiquitous connectivity for massive numbers of devices. However, as more radio access points are added per unit volume, interference becomes the bottleneck to harnessing the benefits of network densification. Although recent developments in interference alignment [11] and interference coordination [12] have been shown to be effective in interference-limited communication scenarios, the significant signaling overhead of obtaining global channel state information (CSI) limits their applicability to dense wireless networks [13].

To reduce the CSI acquisition overhead and make it scalable in dense wireless networks, the topological interference management (TIM) approach was proposed in [13] to manage interference based only on network connectivity information. However, establishing the feasibility of topological interference management is a challenging task. In the slow fading scenario, i.e., when channels stay constant during transmission, the TIM problem turns out to be equivalent to the index coding problem [14], which is NP-hard in general and has been solved only in some special cases [15, 13]. Furthermore, topological interference management with transmitter cooperation and with multiple transmitter antennas was investigated in [16] and [17], respectively. In the fast fading scenario, graph theory and matroid theory were adopted to find the conditions on network topologies under which a certain DoF allocation is achievable [18, 19]. A low-rank matrix completion approach with Riemannian algorithms has recently been proposed in [20] to find the minimum number of channel uses required to achieve feasibility for any network topology.

In contrast, in this paper, we propose a different viewpoint: given any network topology and DoF allocation, we aim to find the maximum number of admitted users for which topological interference management is feasible. We call this problem user admission control in topological interference management. User admission control is critical in wireless communication networks (e.g., cognitive radio access networks [21], heterogeneous networks [22] and Cloud-RAN [23]) when quality-of-service (QoS) requirements cannot be satisfied or the channel conditions are unfavorable [24]. Although user admission control problems are normally non-convex mixed combinatorial optimization problems, a large body of recent work has demonstrated the effectiveness of convex relaxation for solving such problems [21, 22, 23, 24], based on the sum-of-infeasibilities heuristic in optimization theory [25]. This is achieved by relaxing the original non-convex $\ell_0$-norm minimization problem for user admission control to a convex $\ell_1$-norm minimization problem [25, 26].

Unfortunately, the user admission control problem in topological interference management turns out to be highly intractable, as it requires optimizing over both continuous and combinatorial variables. To address the intractability, in this paper, we propose a sparse and low-rank modeling framework whose solutions can be computed in polynomial time. In this model, the sparsity of the diagonal entries of the matrix (i.e., the number of non-zero entries) represents the number of admitted users. The fixed low-rank constraint indicates the DoF allocation [20]. However, unique challenges arise in the proposed sparse and low-rank optimization model, including the non-convex fixed-rank constraint and the user capacity maximization objective function, i.e., $\ell_0$-norm maximization. Novel algorithms thus need to be developed.

1.1 Related Works

User Admission Control

In dense wireless networks, user admission control is critical to maximize the user capacity while satisfying the QoS requirements of all admitted users. To address the NP-hardness of the mixed combinatorial optimization problem, the sparse optimization (e.g., $\ell_0$-norm minimization) approach, supported by efficient algorithms (e.g., $\ell_1$-norm convex relaxation [22, 21] and the iterative reweighted algorithm [23]), provides an efficient way to find high-quality solutions. However, the convex relaxation approach is inapplicable to our sparse and low-rank optimization problem because its objective is $\ell_0$-norm maximization. The $\ell_1$-norm relaxation approach yields an $\ell_1$-norm maximization problem, which is still non-convex. Furthermore, maximizing the $\ell_1$-norm yields unbounded values.

Low-Rank Models

Low-rank models [27, 28] inspire enormous applications in machine learning, recommendation systems, sensor localization, etc. Due to the non-convexity of low-rank constraint or objective, many heuristic algorithms with optimality guarantees have been proposed in the last few years. In particular, convex relaxation approach using nuclear norm [29] provides a polynomial time complexity algorithm with optimality guarantees via convex geometry and conic integral geometry analysis [30].

The other popular approach to low-rank optimization is based on matrix factorization, e.g., alternating minimization [31, 28] and Riemannian optimization [32]. In particular, the Riemannian optimization approach requires smoothness of the objective function, while the alternating approach requires convexity of the objective function. However, because our objective function is non-convex and non-smooth, we cannot directly apply the existing matrix factorization approaches to the proposed sparse and low-rank optimization framework for user admission control.

Based on the above discussions, in contrast to the previous works on user admission control [21, 22, 23, 24] and low-rank optimization problems [27, 31, 28, 32], we need to address the following coupled challenges to solve the sparse and low-rank optimization for user admission control in topological interference management:

  • The objective of maximizing the non-convex $\ell_0$-norm to maximize the user capacity, i.e., the number of admitted users;

  • Non-convex fixed-rank constraint to achieve a certain amount of DoF allocation.

Therefore, unique challenges arise in the user admission control problem for topological interference management. We need to re-design the sparsity-inducing function and the efficient approach to deal with the fixed-rank constraint.

1.2 Contributions

In this paper, we propose a sparse and low-rank optimization framework for user admission control in topological interference management. A Riemannian trust-region algorithm is developed to solve the proposed regularized smoothed $\ell_1$-norm sparsity-inducing minimization problem, thereby guiding user selection. The main contributions are summarized as follows:

  1. We propose a novel sparse and low-rank optimization framework to maximize the number of admitted users for achieving the feasibility of topological interference management.

  2. To avoid unboundedness in the relaxed $\ell_1$-norm maximization problem, a regularized smoothed $\ell_1$-norm is proposed to induce a sparsity pattern with bounded values, thereby guiding user selection.

  3. A Riemannian trust-region algorithm is developed to solve the resulting rank-constrained smooth optimization problem for sparsity inducing. This is achieved by exploiting the quotient manifold of fixed-rank matrices.

  4. Simulation results demonstrate the effectiveness and near-optimal performance of the proposed Riemannian algorithm to maximize the user capacity for topological interference management.

1.3 Organization

The remainder of the paper is organized as follows. Section 2 presents the system model and problem formulation. A sparse and low-rank optimization framework for user admission control is proposed in Section 3. The Riemannian optimization algorithm is developed in Section 4. The ingredients of optimization on quotient manifold are presented in Section 5. Numerical results are illustrated in Section 6. Finally, conclusions and discussions are presented in Section 7.

Notations

Throughout this paper, $\|\cdot\|_p$ is the $\ell_p$-norm. Boldface lower case and upper case letters represent vectors and matrices, respectively. $(\cdot)^{-1}$, $(\cdot)^{T}$, $(\cdot)^{\mathsf{H}}$ and $\mathrm{Tr}(\cdot)$ denote the inverse, transpose, Hermitian and trace operators, respectively. We use $\mathbb{C}$ and $\mathbb{R}$ to represent the complex domain and the real domain, respectively. $\mathbb{E}[\cdot]$ denotes the expectation of a random variable. $|\cdot|$ stands for either the size of a set or the absolute value of a scalar, depending on the context. We denote $\mathbf{X}=\mathrm{diag}\{x_1,\dots,x_N\}$ and $\mathbf{I}_N$ as a diagonal matrix of order $N$ and the identity matrix of order $N$, respectively.

2 System Model and Problem Formulation

In this section, we present the channel model, followed by the user admission control problem to achieve the feasibility of topological interference management.

2.1 Channel Model

Consider the topological interference management problem in the partially connected $K$-user interference channel with each node equipped with a single antenna [13, 20]. Let $\mathcal{V}$ be the index set of the connected transceiver pairs such that the channel coefficient $h_{ij}$ between transmitter $j$ and receiver $i$ is non-zero if $(i,j)\in\mathcal{V}$, and is zero otherwise. Each transmitter $i$ wishes to send a message $W_i$ to its corresponding receiver $i$. The message $W_i$ is encoded into a vector $\mathbf{x}_i\in\mathbb{C}^{r}$ of length $r$. Therefore, over the $r$ channel uses, the received signal $\mathbf{y}_i\in\mathbb{C}^{r}$ at receiver $i$ is given by

$$\mathbf{y}_i = h_{ii}\mathbf{x}_i + \sum_{(i,j)\in\mathcal{V},\, j\neq i} h_{ij}\mathbf{x}_j + \mathbf{z}_i, \quad \forall i = 1,\dots,K, \tag{1}$$

where $\mathbf{z}_i$ is the additive noise at receiver $i$. We consider the block fading channel, where the channel coefficients stay constant during transmission, i.e., the channel coherence time is larger than the $r$ channel uses for transmission. We assume each transmitter has an average power constraint, i.e., $\mathbb{E}[\|\mathbf{x}_i\|^2]\le r\rho$ with $\rho>0$ as the maximum average transmit power.

A rate tuple is said to be achievable if there exists a coding scheme such that the average decoding error probability vanishes as the code length approaches infinity. Here, we assume that each message is chosen uniformly and independently from its message set. In this paper, we choose our performance metric as the symmetric DoF [13, 16], i.e., the highest DoF achieved by all the users simultaneously,

(2)

where is the capacity region defined as the set of all the achievable rate tuples. The metric of DoF gives the first-order measurement of data rates [33].

2.2 Topological Interference Management

In this paper, we restrict attention to the class of linear interference management strategies [11, 13, 20]. Specifically, each transmitter $i$ encodes its message $W_i$ by a linear precoding vector $\mathbf{v}_i\in\mathbb{C}^{r}$ over $r$ channel uses:

$$\mathbf{x}_i = \mathbf{v}_i s_i, \tag{3}$$

where $s_i\in\mathbb{C}$ is the transmitted data symbol. Here the precoding vectors $\mathbf{v}_i$ depend only on the knowledge of the network topology $\mathcal{V}$. In this paper, we assume that the network connectivity information is available at the transmitters. Therefore, over the $r$ channel uses, the received signal at receiver $i$ can be rewritten as

$$\mathbf{y}_i = h_{ii}\mathbf{v}_i s_i + \sum_{(i,j)\in\mathcal{V},\, j\neq i} h_{ij}\mathbf{v}_j s_j + \mathbf{z}_i. \tag{4}$$

Let $\mathbf{u}_i\in\mathbb{C}^{r}$ be the decoding vector for message $W_i$ at receiver $i$. In the regime of asymptotically high signal-to-noise ratio (SNR), to accomplish decoding, we impose the following interference alignment conditions [11, 13, 20] on the precoding and decoding vectors:

$$\mathbf{u}_i^{\mathsf{H}}\mathbf{v}_i \neq 0, \quad \forall i = 1,\dots,K, \tag{5}$$
$$\mathbf{u}_i^{\mathsf{H}}\mathbf{v}_j = 0, \quad \forall i\neq j,\ (i,j)\in\mathcal{V}, \tag{6}$$

where the first condition preserves the desired signal and the second condition aligns and cancels the interference signals. If conditions (5) and (6) are satisfied, parallel interference-free channels are obtained over $r$ channel uses. Therefore, a symmetric DoF of $1/r$ is achieved for each message [13]. We call this problem topological interference management [13], as only network topology information is required to establish the interference alignment conditions.
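As a concrete check of conditions (5) and (6), the following minimal Python sketch verifies them for a toy three-user network over two channel uses. The names `inner`, `satisfies_alignment`, `decoders`, `precoders` and `links`, the (receiver, transmitter) index convention, and the numerical tolerance are illustrative assumptions, not part of the original formulation.

```python
from typing import List, Set, Tuple

Vector = List[complex]


def inner(u: Vector, v: Vector) -> complex:
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def satisfies_alignment(decoders: List[Vector],
                        precoders: List[Vector],
                        links: Set[Tuple[int, int]],
                        tol: float = 1e-9) -> bool:
    """Check the two alignment conditions for all users.

    links holds pairs (i, j) meaning receiver i hears transmitter j
    (an assumed convention for this sketch).
    """
    # Desired signal must be preserved at every receiver.
    for u, v in zip(decoders, precoders):
        if abs(inner(u, v)) <= tol:
            return False
    # Every connected interfering link must be nulled.
    for i, j in links:
        if i != j and abs(inner(decoders[i], precoders[j])) > tol:
            return False
    return True


# Toy topology: receiver 0 hears transmitter 1, 1 hears 2, 2 hears 0.
links = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (2, 0)}
precoders = [[1, 0], [0, 1], [1, 1]]
decoders = [[1, 0], [1, -1], [0, 1]]

print(satisfies_alignment(decoders, precoders, links))             # True
print(satisfies_alignment(decoders, precoders, links | {(0, 2)}))  # False
```

In this toy example three users share two channel uses; adding the extra interfering link from transmitter 2 to receiver 0 breaks the alignment for the chosen vectors.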

However, establishing the conditions on $K$, $r$ and $\mathcal{V}$ under which the interference alignment conditions (5) and (6) are feasible is challenging. In particular, given the number of users $K$ and the number of channel uses $r$ (or DoF allocation $1/r$), the index coding approach [13] and graph theory [16, 19, 18] were adopted to establish the conditions on network topologies under which the interference alignment conditions (5) and (6) are feasible. The low-rank matrix completion approach [20] has recently been proposed to find the minimum number of channel uses $r$ satisfying conditions (5) and (6), given any network topology information $\mathcal{V}$ and the number of users $K$. The feasibility conditions on the antenna configuration for interference alignment in the MIMO interference channel have also been extensively investigated using algebraic geometry [34, 35, 36].

In this paper, we put forth a different point of view on the feasibility conditions of topological interference management: given $K$ users with any network topology $\mathcal{V}$ and the symmetric DoF allocation $1/r$, we present a novel user admission control approach to find the maximum number of admitted users while satisfying the interference alignment conditions (5) and (6). Although user admission control has been extensively investigated in the scenarios of multiuser coordinated beamforming [24], cognitive radio networks [21], heterogeneous cellular networks [22] and Cloud-RAN [23], this is the first work to apply the principle of user admission control in the framework of topological interference management. It provides a systematic framework for efficient algorithm design, as well as numerical insights into the challenging problem of topological interference management.

3 A Sparse and Low-Rank Optimization Framework for User Admission Control

In this section, we present a user admission control approach to maximize the user capacity, i.e., find the maximum number of admitted users while satisfying the interference alignment conditions (5) and (6). This viewpoint is different from the previous works on finding the conditions of network topologies to achieve the feasibility of interference alignment [13, 19, 16, 18].

3.1 Feasibility of Interference Alignment

Given any network connectivity information $\mathcal{V}$ for the partially connected $K$-user interference channel, we say that the symmetric DoF allocation $1/r$ is feasible if there exist precoding vectors $\mathbf{v}_i\in\mathbb{C}^{r}$ and decoding vectors $\mathbf{u}_i\in\mathbb{C}^{r}$ such that the interference alignment conditions (5) and (6) are satisfied. Specifically, the feasibility problem of topological interference management can be formulated as

(7)

where and are optimization variables.

However, the solution to the feasibility problem (7) is unknown in general. In particular, the index coding approach [13] and graph theory [16, 19, 18] were adopted to establish the conditions on the network topology under which interference alignment is feasible. On the other hand, the low-rank matrix completion approach was proposed in [20] to find the minimum number of channel uses that achieves interference alignment feasibility for any network topology.

In contrast, in this paper, our goal is to maximize the user capacity, i.e., to find the maximum number of admitted users while satisfying the interference alignment conditions:

(8)

where $\mathcal{S}\subseteq\{1,\dots,K\}$ is the set of admitted users. This problem is called the user admission control problem. Unfortunately, it turns out to be highly intractable due to the non-convex quadratic constraints and the non-convex combinatorial objective function. To assist efficient algorithm design, in this paper, we propose a sparse and low-rank optimization framework for user admission control by exploiting the sparse and low-rank structures in problem (8).

3.2 Sparse and Low-Rank Optimization Paradigms for User Admission Control

Figure 1: (a) The topological interference alignment problem for the partially connected -user interference channel with only the knowledge of the network connectivity information available. The interference links are marked as red while the desired links are marked as black. (b) The corresponding incomplete matrix with “0” indicating interference alignment and cancellation and “1” representing desired signal preserving.

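Let $\mathbf{X}=[X_{ij}]\in\mathbb{C}^{K\times K}$ with $X_{ij}=\mathbf{u}_i^{\mathsf{H}}\mathbf{v}_j$. The interference alignment conditions (5) and (6) can thus be rewritten as

$$X_{ii} \neq 0, \quad \forall i = 1,\dots,K, \tag{9}$$
$$X_{ij} = 0, \quad \forall i\neq j,\ (i,j)\in\mathcal{V}. \tag{10}$$

The other entries $X_{ij}$, $(i,j)\notin\mathcal{V}$, can take any values. Observing that the achievable symmetric DoF is given by

$$\mathrm{DoF} = 1/\mathrm{rank}(\mathbf{X}), \tag{11}$$

a low-rank matrix completion problem was proposed in [20] to find the minimum number of channel uses while satisfying the interference alignment conditions. Fig. 1 demonstrates the procedure of transforming the topological interference alignment conditions (5) and (6) into the associated incomplete matrix.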

Define $\mathbf{X}_{\mathcal{S}}$ as the submatrix of $\mathbf{X}$ indexed by the set of admitted users $\mathcal{S}$, i.e., $\mathbf{X}_{\mathcal{S}}=[X_{ij}]_{i,j\in\mathcal{S}}$. The rank of the submatrix equals $r$. The user admission control problem (8) can be further reformulated as follows:

(12)

where the first constraint preserves the symmetric DoF allocation $1/r$. However, problem (12) is still a highly intractable mixed combinatorial optimization problem with a non-convex fixed-rank constraint and a combinatorial objective function.

To enable polynomial-time algorithm design, we further reveal the sparsity structure in problem (12) for user admission control. We notice that

$$|\mathcal{S}| = \|\mathrm{diag}(\mathbf{X})\|_0, \tag{13}$$

where $\mathrm{diag}(\cdot)$ extracts the diagonal of a matrix and $\|\cdot\|_0$ is the $\ell_0$-norm of a vector, i.e., the count of non-zero entries. Problem (12) can be further reformulated as the following sparse and low-rank optimization problem, i.e.,

(14)

Notice that we only need to consider problem (14) in the real field, without any loss in the number of admitted users. The reason is that the affine constraint in (14) is restricted to the real field, and the diagonal entries of the matrix can likewise be restricted to the real field while achieving the same value of $\|\mathrm{diag}(\mathbf{X})\|_0$ as in the complex field.
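The counting identity in (13) is easy to illustrate numerically. The sketch below, in which the function name `admitted_users`, the tolerance `tol` and the example matrix are illustrative assumptions, reads the admitted users off the diagonal of a candidate matrix: a user counts as admitted when its diagonal entry is non-zero.

```python
from typing import List, Sequence


def admitted_users(X: Sequence[Sequence[float]], tol: float = 1e-9) -> List[int]:
    """Indices whose diagonal entry is non-zero (up to tol)."""
    return [k for k in range(len(X)) if abs(X[k][k]) > tol]


# Hypothetical 3 x 3 candidate matrix: user 1 has a zero diagonal entry.
X = [[1.0, 0.0, 0.3],
     [0.0, 0.0, 0.0],
     [0.7, 0.0, -2.0]]

users = admitted_users(X)
print(users, len(users))  # [0, 2] 2
```

The length of the returned list plays the role of the $\ell_0$-norm of the diagonal; the sign and magnitude of the non-zero entries do not matter for the count.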

Sparse optimization has been shown to be powerful for user admission problems [24, 21, 22, 23] via $\ell_1$-norm minimization, using the sum-of-infeasibilities convex relaxation heuristic in optimization theory [25, Section 11.4]. In particular, maximizing the number of admitted users is equivalent to minimizing the number of violated inequalities for the quality-of-service (QoS) constraints. Although problem (14) adopts the same philosophy of using the $\ell_0$-norm to count the number of admitted users, as in (13), it poses unique challenges due to the $\ell_0$-norm maximization and the non-convex fixed-rank constraint. However, compared with the original formulation (12), the sparse and low-rank optimization formulation (14) holds algorithmic advantages, which are demonstrated in the sequel via the Riemannian optimization approach [37].

3.3 Problem Analysis

In this subsection, we reveal the unique challenges of solving the sparse and low-rank optimization problem for user admission control in topological interference management.

Non-convex Objective Function

Figure 2: The regularized sparsity inducing norm with bounded values in .

Although the $\ell_1$-norm serves as the convex surrogate for the non-convex $\ell_0$-norm [25, 26], it is inapplicable to the $\ell_0$-norm maximization in problem (14), as maximizing it yields unbounded values. To aid efficient algorithm design, we propose a novel regularized $\ell_1$-norm to induce sparsity with bounded values. This is achieved by adding a quadratic term to the $\ell_1$-norm as follows:

(15)

where and is a weighting parameter. A typical example with and is illustrated in Fig. 2, which upper bounds all the diagonal values by .

Non-convex Fixed-rank Constraint

Matrix factorization serves as a powerful way to address the non-convexity of the fixed-rank constraint. One popular way is to factorize a fixed rank-$r$ matrix $\mathbf{X}$ (in the real field) as $\mathbf{X}=\mathbf{U}\mathbf{V}^{T}$ with $\mathbf{U}\in\mathbb{R}^{K\times r}$ and $\mathbf{V}\in\mathbb{R}^{K\times r}$, followed by alternately optimizing over $\mathbf{U}$ and $\mathbf{V}$ while holding the other fixed [28, 31]. However, due to the non-convex objective function in problem (14), the resulting optimization problem over $\mathbf{U}$ or $\mathbf{V}$ is still non-convex. Furthermore, such a factorization is not unique, as $\mathbf{X}$ remains unchanged under the transformation of the factors

$$(\mathbf{U},\mathbf{V}) \mapsto (\mathbf{U}\mathbf{M}^{-1},\ \mathbf{V}\mathbf{M}^{T}) \tag{16}$$

for all non-singular matrices $\mathbf{M}$ of size $r\times r$. As a result, the critical points of an objective function parameterized with $\mathbf{U}$ and $\mathbf{V}$ are not isolated on $\mathbb{R}^{K\times r}\times\mathbb{R}^{K\times r}$. This profoundly affects the performance of second-order optimization algorithms, which require non-degenerate critical points. We propose to address this issue by exploiting the quotient manifold geometry of the set of fixed-rank matrices [38]. The resulting non-convex optimization problem is then solved within the Riemannian optimization framework, which provides systematic ways to develop algorithms on quotient manifolds [37].
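The non-uniqueness of the factorization can be verified numerically. The following sketch assumes that the factorization has the form X = U V^T and that the factors transform as (U, V) -> (U M^{-1}, V M^T) for a non-singular M; the helper names and the small integer matrices are illustrative choices, and exact rational arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction
from typing import List

Matrix = List[List[Fraction]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A: Matrix) -> Matrix:
    return [list(col) for col in zip(*A)]


def inverse_2x2(M: Matrix) -> Matrix:
    """Closed-form inverse of a non-singular 2 x 2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("M must be non-singular")
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def to_fractions(A) -> Matrix:
    return [[Fraction(x) for x in row] for row in A]


U = to_fractions([[1, 0], [0, 1], [1, 1]])   # 3 x 2 factor
V = to_fractions([[1, 2], [3, 4], [5, 6]])   # 3 x 2 factor
M = to_fractions([[2, 1], [1, 3]])           # any non-singular 2 x 2 matrix

U_new = matmul(U, inverse_2x2(M))            # U M^{-1}
V_new = matmul(V, transpose(M))              # V M^T

X = matmul(U, transpose(V))
X_new = matmul(U_new, transpose(V_new))
print(X_new == X, U_new != U)                # True True
```

Different factor pairs therefore represent the same rank-2 matrix, which is exactly the ambiguity that the quotient manifold geometry removes.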

In summary, in this paper, we propose a new approach to induce sparsity in the solution to problem (14), followed by the Riemannian optimization approach that exploits the quotient manifold geometry of fixed-rank matrices. The induced sparsity pattern guides user selection for user admission control.

4 Regularized Smoothed $\ell_1$-Minimization for Sparse and Low-Rank Optimization via Riemannian Optimization

In this section, we present a Riemannian framework for the sparse and low-rank optimization problem via regularized smoothed $\ell_1$-minimization, exploiting the quotient manifold geometry of fixed-rank matrices. The sparsity pattern induced in the solution provides a guideline for user admission control, supported by a user selection procedure. In the final stage, a low-rank matrix completion approach with Riemannian optimization is adopted to design the linear topological interference management strategy. The proposed three-stage Riemannian framework for user admission control in topological interference management is presented in Fig. 3.

Figure 3: The proposed three-stage Riemannian framework for user admission control in topological interference alignment via sparse and low-rank optimization. is the induced sparsity pattern for user selection and is set of admitted users.

4.1 Stage One: Regularized Smoothed $\ell_1$-Minimization for Sparsity Inducing

In order to make problem (14) numerically tractable, we relax the non-convex $\ell_0$-norm objective function to its convex surrogate, the $\ell_1$-norm, resulting in the following optimization problem:

(17)

Although the $\ell_1$-norm is tractable, the objective is unbounded from above because the $\ell_1$-norm is maximized, which makes problem (17) ill-posed. Note that maximizing a convex $\ell_1$-norm is still a non-convex problem.

To circumvent the unboundedness issue, we add a quadratic term to the objective function in problem (17), weighted by a parameter that bounds the overall objective function from above, leading to the formulation

(18)

For example, if , then the diagonal values of are upper bounded by . It should be emphasized that the role of the weighting parameter in (18) is to upper bound the objective function; it does not affect the sparsity pattern that is expected from (17). This is further confirmed in Section 4.4 via simulations. Additionally, if $\mathbf{X}$ is a solution to (14), then $\alpha\mathbf{X}$ is also a solution to (14) for every non-zero scalar $\alpha$. Equivalently, there exists a continuum of solutions, an ambiguity that is effectively resolved by the objective function in (18).

Although problem (18) is still non-convex due to the non-convex objective (i.e., maximizing a convex function) and non-convex fixed-rank constraint, it has the algorithmic advantage that it can be solved efficiently (i.e., numerically) in the framework of Riemannian optimization [37].

Riemannian Optimization for Fixed-Rank Optimization

In this subsection, we propose a Riemannian optimization algorithm to solve the non-convex optimization problem (18), which is equivalent to

(19)

However, the intersection of the rank constraint and the affine constraint is challenging to characterize. We therefore propose to solve problem (19) via a regularized version as follows:

(20)

where the regularization parameter balances the terms of the objective and the smoothing parameter replaces the non-differentiable $\ell_1$-norm with a smooth approximation, which makes the objective function differentiable. A very small smoothing parameter leads to ill-conditioning of the objective function in (20). Since we only intend to obtain the sparsity pattern of the optimal solution, we set the smoothing parameter to a relatively high value to make problem (20) well conditioned. Problem (20) is an optimization problem over the set of fixed-rank matrices and can be solved via a Riemannian trust-region algorithm [37].
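To see why a quadratic term bounds the objective and why smoothing helps, consider the following sketch. It assumes that the regularized sparsity-inducing function acts entrywise on the diagonal as |x| - rho * x^2 and that the smoothing replaces |x| by sqrt(x^2 + eps^2). These functional forms, the names `rho` and `eps`, and the sample values are assumptions made purely for illustration; they are not taken from (18) or (20).

```python
import math
from typing import Sequence


def smooth_abs(x: float, eps: float) -> float:
    """Differentiable surrogate of |x|; equals |x| when eps == 0."""
    return math.sqrt(x * x + eps * eps)


def regularized_value(diag: Sequence[float], rho: float, eps: float = 0.0) -> float:
    """Assumed sparsity-inducing score: sum_i (|x_i|_eps - rho * x_i^2)."""
    return sum(smooth_abs(x, eps) - rho * x * x for x in diag)


def scalar_maximizer(rho: float) -> float:
    """Maximizer of x - rho * x^2 over x >= 0 (derivative set to zero)."""
    return 1.0 / (2.0 * rho)


rho = 0.5  # illustrative value only
x_star = scalar_maximizer(rho)

# Without the quadratic term the score grows without bound in |x|;
# with it, a grid search finds the peak at x_star.
grid = [k / 100.0 for k in range(0, 501)]
best = max(grid, key=lambda x: regularized_value([x], rho))
print(x_star, best)
```

Under these assumptions each diagonal entry is driven either towards zero or towards a bounded magnitude, which is the behaviour needed to read off a sparsity pattern.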

4.2 Stage Two: Finding Sparsity Pattern for User Admission Control

Let $\mathbf{X}^{\star}$ be the solution to the regularized smoothed $\ell_1$-minimization problem (20). We sort the diagonal entries of $\mathbf{X}^{\star}$, i.e., the entries of the vector $\mathrm{diag}(\mathbf{X}^{\star})$, in descending order. A user with a larger coefficient has a higher priority to be admitted. We adopt a bisection search procedure to find the maximum number of admitted users. Specifically, let $N^{\star}$ be the maximum number of users that can be admitted while satisfying the interference alignment conditions. To determine the value of $N^{\star}$, a sequence of the following size-reduced topological interference management feasibility problems needs to be solved:

(21)

where .

To check the feasibility, we rewrite problem (21) as follows:

(22)

where $\mathcal{P}_{\Omega}$ is the orthogonal projection operator onto the subspace of matrices that vanish outside the index set $\Omega$, such that the $(i,j)$-th component of $\mathcal{P}_{\Omega}(\mathbf{X})$ equals $X_{ij}$ if $(i,j)\in\Omega$ and zero otherwise. If the objective value approaches zero, we say that the corresponding set of users can be admitted. Problem (22) can be solved by Riemannian trust-region algorithms [39] via Manopt [40]. Note that, in theory, the Riemannian algorithm is only guaranteed to converge to a first-order critical point; empirically, however, we observe convergence to critical points that are local minima.
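The feasibility test can be pictured with a small sketch of the projection operator. The code below assumes that the least-squares objective in (22) measures the distance between the projected matrix and the identity on the index set (unit diagonal for desired links, zeros for connected interfering links). This target, the function names `project` and `residual`, and the example matrices are assumptions for illustration only.

```python
from typing import List, Set, Tuple

Matrix = List[List[float]]


def project(X: Matrix, omega: Set[Tuple[int, int]]) -> Matrix:
    """Keep entries indexed by omega and zero out all others."""
    n = len(X)
    return [[X[i][j] if (i, j) in omega else 0.0 for j in range(n)]
            for i in range(n)]


def residual(X: Matrix, omega: Set[Tuple[int, int]]) -> float:
    """0.5 * || P_omega(X) - I ||_F^2, an assumed feasibility measure."""
    P = project(X, omega)
    n = len(X)
    return 0.5 * sum((P[i][j] - (1.0 if i == j else 0.0)) ** 2
                     for i in range(n) for j in range(n))


# Constrained entries: the diagonal plus three connected interfering links.
omega = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (2, 0)}
X_ok = [[1.0, 0.0, 5.0],
        [7.0, 1.0, 0.0],
        [0.0, 9.0, 1.0]]
X_bad = [[1.0, 2.0, 5.0],
         [7.0, 1.0, 0.0],
         [0.0, 9.0, 1.0]]

print(residual(X_ok, omega), residual(X_bad, omega))
```

Entries outside the index set are free, so `X_ok` attains a zero residual even though it contains large unconstrained values, whereas the single violated entry in `X_bad` yields a strictly positive residual.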

4.3 Stage Three: Low-Rank Matrix Completion for Topological Interference Management

Let $\mathcal{S}^{\star}$ be the set of admitted users. We need to solve the following size-reduced rank-constrained matrix completion problem:

(23)

to find the precoding vectors and decoding vectors for the admitted users in $\mathcal{S}^{\star}$.

The proposed three-stage Riemannian optimization based user admission control algorithm is summarized in Algorithm 1.

Algorithm 1: User Admission Control for Topological Interference Management via Riemannian Optimization

Step 0: Solve the sparsity-inducing optimization problem (20) using the Riemannian trust-region algorithm in Section 5. Obtain the solution $\mathbf{X}^{\star}$ and sort its diagonal entries in descending order. Go to Step 1.
Step 1: Initialize $N_{\mathrm{low}}=0$, $N_{\mathrm{up}}=K$, $i=0$.
Step 2: Repeat

  1. Set $i\leftarrow\lfloor (N_{\mathrm{low}}+N_{\mathrm{up}})/2\rfloor$.

  2. Solve problem (21) via (22) using the Riemannian trust-region algorithm in Section 5: if
    it is feasible, set $N_{\mathrm{low}}=i$; otherwise, set $N_{\mathrm{up}}=i$.

Step 3: Until $N_{\mathrm{up}}-N_{\mathrm{low}}=1$, obtain $N^{\star}=N_{\mathrm{low}}$ and obtain the admitted users set $\mathcal{S}^{\star}$.
Step 4: Solve problem (23) to obtain the precoding and decoding vectors for the admitted users.
End
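The bisection search over the ranked users can be sketched as follows. The function `bisection_admission` and the callback `is_feasible` are hypothetical names; the callback stands in for the size-reduced feasibility problem solved by the Riemannian trust-region algorithm. The sketch further assumes that users are ranked by the magnitude of their diagonal coefficients and that feasibility is monotone (every subset of a feasible prefix is feasible), and it adds an initial check of the full user set that is not spelled out in the steps above.

```python
from typing import Callable, List, Sequence


def bisection_admission(diag: Sequence[float],
                        is_feasible: Callable[[List[int]], bool]) -> List[int]:
    """Return the largest feasible prefix of users ranked by |diag| (descending).

    is_feasible is a hypothetical oracle for the size-reduced
    feasibility problem; the empty set is taken to be feasible.
    """
    order = sorted(range(len(diag)), key=lambda k: -abs(diag[k]))
    if is_feasible(order):           # all users can be admitted
        return order
    low, high = 0, len(order)        # low users feasible, high users infeasible
    while high - low > 1:
        mid = (low + high) // 2
        if is_feasible(order[:mid]):
            low = mid
        else:
            high = mid
    return order[:low]


# Toy oracle: pretend that at most two users can be supported.
admitted = bisection_admission([0.1, 0.9, 0.5, 0.0],
                               lambda users: len(users) <= 2)
print(admitted)  # [1, 2]
```

Each call to the oracle corresponds to one rank-constrained solve, so the number of solves grows only logarithmically with the number of users.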

Figure 4: Optimization on a quotient manifold. The dotted lines represent abstract objects and the solid lines are their matrix representations. The points and in the total (computational) space belong to the same equivalence class (shown in solid blue color) and they represent a single point in the quotient space . An algorithm by necessity is implemented in the computation space, but conceptually, the search is on the quotient manifold. Given a search direction at , the updated point on is given by the retraction mapping .

4.4 The Framework of Fixed-Rank Riemannian Manifold Optimization

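The optimization problems (20), (22), and (23) are least-squares optimization problems with a fixed-rank constraint. A rank-$r$ matrix $\mathbf{X}$ is parameterized as $\mathbf{X}=\mathbf{U}\mathbf{V}^{T}$, where $\mathbf{U}\in\mathbb{R}_{*}^{K\times r}$ and $\mathbf{V}\in\mathbb{R}_{*}^{K\times r}$ are full column-rank matrices. Such a factorization, however, is not unique, as $\mathbf{X}$ remains unchanged under the transformation of the factors

$$(\mathbf{U},\mathbf{V}) \mapsto (\mathbf{U}\mathbf{M}^{-1},\ \mathbf{V}\mathbf{M}^{T}) \tag{24}$$

for all non-singular matrices $\mathbf{M}\in\mathrm{GL}(r)$, the set of $r\times r$ non-singular matrices. Equivalently, $\mathbf{U}\mathbf{V}^{T}=\mathbf{U}\mathbf{M}^{-1}(\mathbf{V}\mathbf{M}^{T})^{T}$ for all non-singular matrices $\mathbf{M}$. As a result, the critical points of an objective function parameterized with $\mathbf{U}$ and $\mathbf{V}$ are not isolated on $\mathbb{R}_{*}^{K\times r}\times\mathbb{R}_{*}^{K\times r}$.

The classical remedy to remove this indeterminacy requires further (triangular-like) structure in the factors $\mathbf{U}$ and $\mathbf{V}$. For example, LU decomposition is a way forward. In contrast, we encode the invariance map (24) in an abstract search space by optimizing directly over the set of equivalence classes

$$[x] := [(\mathbf{U},\mathbf{V})] = \{(\mathbf{U}\mathbf{M}^{-1},\ \mathbf{V}\mathbf{M}^{T}) : \mathbf{M}\in\mathrm{GL}(r)\}. \tag{25}$$

The set of equivalence classes is termed the quotient space and is denoted by

$$\mathcal{M}_r := \overline{\mathcal{M}}_r/\mathrm{GL}(r), \tag{26}$$

where the total space $\overline{\mathcal{M}}_r$ is the product space $\mathbb{R}_{*}^{K\times r}\times\mathbb{R}_{*}^{K\times r}$.

Consequently, if an element $[x]\in\mathcal{M}_r$ has the matrix characterization $(\mathbf{U},\mathbf{V})\in\overline{\mathcal{M}}_r$, then (20), (22), and (23) are of the form

(27)

where $[x]$ is defined in (25) and the objective is a smooth function on $\overline{\mathcal{M}}_r$, but now induced (with slight abuse of notation) on the quotient space $\mathcal{M}_r$ in (26).

The quotient space $\mathcal{M}_r$ has the structure of a smooth Riemannian quotient manifold of $\overline{\mathcal{M}}_r$ by $\mathrm{GL}(r)$ [38]. The Riemannian structure conceptually transforms a rank-constrained optimization problem into an unconstrained optimization problem over the non-linear manifold $\mathcal{M}_r$. Additionally, it allows us to compute objects such as the gradient of an objective function and to develop a Riemannian trust-region algorithm on $\mathcal{M}_r$ that uses second-order information for faster convergence [37].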


Matrix representation

Total space

Group action

Quotient space


Vectors in the ambient space
Matrix representation of a tangent vector in
Metric for any

Vertical tangent vectors in

Horizontal tangent vectors in


Projection of a tangent vector on the horizontal space
, where .


Retraction of a horizontal vector onto the manifold





Matrix representation of the Riemannian gradient
, where and are the partial derivatives of with respect to and , respectively.




Matrix representation of the Riemannian Hessian along a horizontal vector
, where has the representation shown above. The matrix representation of the Riemannian connection is shown in (34). Finally, the projection operator is defined in (31).

Table 1: Manifold-related ingredients

5 Optimization on Quotient Manifold

Consider an equivalence relation in the total (computational) space . The quotient manifold generated by this equivalence property consists of elements that are equivalence classes of the form . Equivalently, if is an element in , then its matrix representation in is . In the context of rank constraint, is identified with , i.e., the fixed-rank manifold. Fig. 4 shows a schematic viewpoint of optimization on a quotient manifold. Particularly, we need the notion of “linearization” of the search space, “search” direction, and a way “move” on a manifold. Below we show the concrete development of these objects that allow to do develop a second-order trust-region algorithm on manifolds. The concrete manifold-related ingredients are shown in Table 1, which are based on the developments in [41].

Since the quotient manifold is an abstract space, the elements of its tangent space at a given point also call for a matrix representation in the tangent space of the total space that respects the equivalence relation. Equivalently, the matrix representation of a tangent vector should be restricted to the directions in the tangent space of the total space that do not induce a displacement along the equivalence class. This is realized by decomposing the tangent space of the total space into two complementary subspaces, the vertical space and the horizontal space, whose direct sum is the whole tangent space. The vertical space is the tangent space of the equivalence class. The horizontal space, which can be any subspace complementary to the vertical space, provides a valid matrix representation of the abstract tangent space [37, Section 3.5.8]. An abstract tangent vector has a unique representative in the horizontal space, called its horizontal lift. Our specific choice of the horizontal space is the orthogonal complement of the vertical space in the sense of a Riemannian metric (an inner product).
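To make the role of the vertical space concrete, the sketch below checks that a vertical direction produces no first-order change in the represented matrix. It assumes the two-factor parametrization (the matrix is U times the transpose of V) and the group action that maps (U, V) to (U M^{-1}, V M^T); differentiating that action at the identity gives vertical vectors of the form (-U Lam, V Lam^T) for an arbitrary r x r matrix Lam. All names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
K, r = 6, 2
U = rng.standard_normal((K, r))
V = rng.standard_normal((K, r))
Lam = rng.standard_normal((r, r))        # arbitrary r x r matrix

def vertical_vector(U, V, Lam):
    """Direction tangent to the equivalence class (assumed group action)."""
    return -U @ Lam, V @ Lam.T

def differential_of_product(U, V, xi_U, xi_V):
    """Directional derivative of (U, V) -> U V^T along (xi_U, xi_V)."""
    return xi_U @ V.T + U @ xi_V.T

z_U, z_V = vertical_vector(U, V, Lam)
vertical_change = differential_of_product(U, V, z_U, z_V)

# A generic tangent direction, for comparison, does change the matrix.
e_U = rng.standard_normal((K, r))
e_V = rng.standard_normal((K, r))
generic_change = differential_of_product(U, V, e_U, e_V)

print(np.linalg.norm(vertical_change), np.linalg.norm(generic_change))
```

The vertical direction leaves the product unchanged to first order, so only the complementary (horizontal) part of a tangent vector carries information about motion on the quotient manifold.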

A Riemannian metric, i.e., an inner product on each tangent space of the total space, defines a Riemannian metric on the quotient manifold through

(28)

provided that the expression does not depend on the specific representative chosen along the equivalence class. Here the metric on the quotient manifold is evaluated on two abstract tangent vectors, and the metric on the total space is evaluated on their horizontal lifts at the chosen representative. Equivalently, if a second representative of the same equivalence class is chosen and the two tangent vectors are lifted horizontally at that representative instead, then the metric in (28) takes the same value. Such a metric is said to be invariant to the equivalence relation.

In the context of fixed-rank matrices, there exist metrics which are invariant. A particular invariant Riemannian metric on the total space that takes into account the symmetry (24) imposed by the factorization model and that is well suited to a least-squares objective function [41] is

(29)

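where the two arguments are tangent vectors of the total space at the given point. Note that the tangent space of the total space has a simple matrix characterization: each tangent vector is represented by a pair of matrices, one per factor, with the same dimensions as the corresponding factor.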

To show that (29) is invariant to the transformation (24), consider a second element of the same equivalence class, obtained from the first through (24) with a nonsingular square matrix. Consider the horizontal lifts of an abstract tangent vector at the two representatives, and similarly for a second tangent vector. The lift at the second representative is obtained from the lift at the first by applying the same transformation (24) to its matrix representation [37, Example 3.5.4]. A short computation then shows that (29) takes the same value at both representatives, which implies that the metric (29) is invariant to the transformation (24) along the equivalence class. Consequently, (29) induces a unique metric on the quotient space.
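The invariance argument can be checked numerically. The sketch below assumes a specific form for the metric, namely the two-factor preconditioned metric commonly associated with [41], in which the inner product of (xi_U, xi_V) and (eta_U, eta_V) at (U, V) is Tr(V^T V xi_U^T eta_U) + Tr(U^T U xi_V^T eta_V). This form, the group action (U M^{-1}, V M^T), and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
K, r = 6, 2
U = rng.standard_normal((K, r))
V = rng.standard_normal((K, r))
M = np.array([[2.0, 1.0], [0.0, 3.0]])   # fixed nonsingular matrix
M_inv = np.linalg.inv(M)

def metric(U, V, xi, eta):
    """Assumed metric: Tr(V^T V xi_U^T eta_U) + Tr(U^T U xi_V^T eta_V)."""
    (xi_U, xi_V), (eta_U, eta_V) = xi, eta
    return (np.trace((V.T @ V) @ xi_U.T @ eta_U)
            + np.trace((U.T @ U) @ xi_V.T @ eta_V))

def transform(vec, M, M_inv):
    """Apply the same transformation to a tangent vector's two blocks."""
    vec_U, vec_V = vec
    return vec_U @ M_inv, vec_V @ M.T

xi = (rng.standard_normal((K, r)), rng.standard_normal((K, r)))
eta = (rng.standard_normal((K, r)), rng.standard_normal((K, r)))

# Metric at the first representative.
g_x = metric(U, V, xi, eta)

# Metric at a second representative of the same class, with the
# tangent vectors transformed in the same way as the factors.
U2, V2 = U @ M_inv, V @ M.T
g_y = metric(U2, V2, transform(xi, M, M_inv), transform(eta, M, M_inv))

print(g_x, g_y)
```

Under the assumed metric the two values agree for any pair of tangent vectors, and in particular for horizontal lifts, because the factors of M cancel inside each trace.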

The motivation for the metric (29) is that it is induced by a block-diagonal approximation of the Hessian of a simpler cost function, which is strictly convex in each of the two factors individually. This block-diagonal approximation ensures that the cost of computing (29) grows only linearly with the problem dimension and that the metric is well suited to least-squares problems. Similar ideas have been exploited in [20, 42, 43], which report robust performance of Riemannian algorithms for various least-squares problems.

Once the metric (29) is defined on the total space, the development of the geometric objects required for second-order optimization follows [37, 41]. The matrix characterizations of the tangent space, the vertical space, and the horizontal space are straightforward, with the expressions

(30)

Apart from the characterization of the horizontal space, we need a linear mapping that projects vectors from the tangent space of the total space onto the horizontal space. This projection is accomplished with the operator

(31)

where the matrix parametrizing the removed vertical component is uniquely obtained by ensuring that the projected vector belongs to the horizontal space characterized in (30). Finally, the expression of this matrix is
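As an illustration of how such a projection can be computed, the sketch below works under the assumed two-factor geometry: vertical vectors of the form (-U Lam, V Lam^T), the metric Tr(V^T V xi_U^T eta_U) + Tr(U^T U xi_V^T eta_V), and hence a horizontal space described by the condition U^T z_U V^T V = U^T U z_V^T V. Solving that condition for the vertical component gives Lam = 0.5 (eta_V^T V (V^T V)^{-1} - (U^T U)^{-1} U^T eta_U). These expressions are derived from the stated assumptions and are not copied from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
K, r = 6, 2
U = rng.standard_normal((K, r))
V = rng.standard_normal((K, r))

def horizontal_residual(U, V, z_U, z_V):
    """Zero exactly when (z_U, z_V) satisfies the assumed horizontal condition."""
    return U.T @ z_U @ (V.T @ V) - (U.T @ U) @ z_V.T @ V

def project_horizontal(U, V, eta_U, eta_V):
    """Remove the vertical component of (eta_U, eta_V) under the assumed metric."""
    UtU, VtV = U.T @ U, V.T @ V
    Lam = 0.5 * (eta_V.T @ V @ np.linalg.inv(VtV)
                 - np.linalg.solve(UtU, U.T @ eta_U))
    return eta_U + U @ Lam, eta_V - V @ Lam.T

eta_U = rng.standard_normal((K, r))
eta_V = rng.standard_normal((K, r))

p_U, p_V = project_horizontal(U, V, eta_U, eta_V)
residual = horizontal_residual(U, V, p_U, p_V)

# Projecting a second time should change nothing (idempotence).
q_U, q_V = project_horizontal(U, V, p_U, p_V)

print(np.linalg.norm(residual))
```

The residual vanishes up to rounding and a second projection leaves the vector unchanged, which is the behavior expected of a projector onto the horizontal space. Only r x r linear systems are solved, so the cost stays linear in K.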

5.1 Gradient and Hessian Computations

The choice of the metric (29) and of the horizontal space (as the orthogonal complement of the vertical space) turns the quotient manifold into a Riemannian submersion of the total space [37, Section 3.6.2]. This special construction allows for a convenient matrix representation of the gradient [37, Section 3.6.2] and the Hessian [37, Proposition 5.3.3] on the quotient manifold. Below we show the gradient and Hessian computations for the problem (27).

The Riemannian gradient of the objective function on the quotient manifold is uniquely represented by its horizontal lift in the total space, which has the matrix representation

(32)

Here the gradient of the objective function in the total space appears, together with the partial derivatives of the objective function with respect to each of the two factors.
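A small numerical check can clarify how a gradient of this kind is formed. The sketch below assumes the two-factor metric Tr(V^T V xi_U^T eta_U) + Tr(U^T U xi_V^T eta_V), under which the gradient in the total space is obtained by scaling the partial derivatives: (f_U (V^T V)^{-1}, f_V (U^T U)^{-1}). Because the objective of problem (27) is not reproduced here, a simple least-squares cost 0.5 * ||U V^T - A||_F^2 with random data A is used as a stand-in. The cost, the data, and all names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)
K, r = 6, 2
U = rng.standard_normal((K, r))
V = rng.standard_normal((K, r))
A = rng.standard_normal((K, K))          # data for the stand-in cost

def partials(U, V):
    """Partial derivatives of 0.5 * ||U V^T - A||_F^2 w.r.t. U and V."""
    R = U @ V.T - A
    return R @ V, R.T @ U

def metric(U, V, xi, eta):
    """Assumed metric on the total space."""
    (xi_U, xi_V), (eta_U, eta_V) = xi, eta
    return (np.trace((V.T @ V) @ xi_U.T @ eta_U)
            + np.trace((U.T @ U) @ xi_V.T @ eta_V))

def riemannian_gradient(U, V):
    """Partial derivatives scaled by the inverse metric blocks."""
    f_U, f_V = partials(U, V)
    return f_U @ np.linalg.inv(V.T @ V), f_V @ np.linalg.inv(U.T @ U)

grad = riemannian_gradient(U, V)
f_U, f_V = partials(U, V)
xi = (rng.standard_normal((K, r)), rng.standard_normal((K, r)))

# Defining property: metric(grad, xi) equals the Euclidean directional derivative.
lhs = metric(U, V, grad, xi)
rhs = np.sum(f_U * xi[0]) + np.sum(f_V * xi[1])

# For a cost that depends only on U V^T, the gradient is already horizontal.
horizontal_gap = U.T @ grad[0] @ (V.T @ V) - (U.T @ U) @ grad[1].T @ V

print(lhs, rhs, np.linalg.norm(horizontal_gap))
```

With these assumptions the scaled partial derivatives satisfy the defining property of a gradient with respect to the metric, and no extra projection is needed because the stand-in cost is invariant along equivalence classes.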

In addition to the Riemannian gradient computation (32), we also require the directional derivative of the gradient along a search direction. This is captured by a connection, which gives the covariant derivative of one vector field with respect to another. The Riemannian connection on the quotient manifold is uniquely represented in terms of the Riemannian connection in the total space [37, Proposition 5.3.3], namely

(33)

where the two vector fields on the quotient manifold are represented by their horizontal lifts in the total space, and the projection operator is the one defined in (31). It remains to find the Riemannian connection in the total space. We obtain its matrix expression by invoking the Koszul formula [37, Theorem 5.3.1]. After a routine calculation, the final expression is [41]

(34)

and is the Euclidean directional derivative