Multiplicative Iteration for Nonnegative Quadratic Programming

# Multiplicative Iteration for Nonnegative Quadratic Programming

## Abstract

In many applications, it makes sense to solve the least square problems with nonnegative constraints. In this article, we present a new multiplicative iteration that monotonically decreases the value of the nonnegative quadratic programming (NNQP) objective function. This new algorithm has a simple closed form and is easily implemented on a parallel machine. We prove the global convergence of the new algorithm and apply it to solving image super-resolution and color image labelling problems. The experimental results demonstrate the effectiveness and broad applicability of the new algorithm.

Nonnegative Constraints; Multiplicative Iteration; NNQP; NNLS
\runningheads

X. Xiao and D. ChenMultiplicative Iteration for NNQP

\corraddr

School of Securities and Futures, Southwestern University of Finance and Economics, Sichuan, China, 610041. E-mail: chendonghui@swufe.edu.cn

## 1 Introduction

Numerical problems with nonnegativity constraints on solutions are pervasive throughout science, engineering and business. These constraints usually come from physical grounds corresponding to amounts and measurements , such as solutions associated with image restoration and reconstruction [6, 16, 18, 26] and chemical concentrations , etc.. Nonnegativity constraints very often arise in least squares problems, i.e. nonnegative least squares (NNLS)

 argmin xF(x)=argmin x||Ax−b||22s.tx≥0. (1)

The problem can be stated equivalently as the following nonnegative quadratic programing (NNQP),

 argmin xF(x)=argmin x12xTQx−xThs.tx≥0. (2)

NNLS \eqrefeq:nnls and NNQP \eqrefeq:nnqp have the same unique solution. In this article, we assume is a symmetric positive definite matrix, and vector .

Since the first NNLS algorithm introduced by Lawson and Hanson , researchers have developed many different techniques to solve \eqrefeq:nnls and \eqrefeq:nnqp, such as active set methods [3, 5, 15], interior point methods , iterative approaches  etc.. Among these methods, gradient projection methods are known as the most efficient methods in solving problems with simple constraints . In this paper, we develop a new multiplicative gradient projection algorithm for NNQP problem. Other similar research can be found in the literature [4, 9, 23].

In the paper , the authors studied a special NNLS problem with nonnegative matrix and vector , and proposed an algorithm called the image space reconstruction algorithm (ISRA). The corresponding multiplicative iteration is

 xi←xi[hi(Qx)i]. (3)

The proof of the convergence property of ISRA can be found in [1, 10, 12, 20]. More recently, Lee and Seung generalized the idea of ISRA to the problem of non-negative matrix factorization (NMF) . For general matrix and vector which have both negative and positive entries, the authors proposed another multiplicative iteration 

 xi←xi⎡⎢ ⎢⎣hi+√h2i+4(Q+x)i(Q−x)i2(Q+x)i⎤⎥ ⎥⎦.

In , Brand and Chen also introduced a multiplicative iteration

 xi←xi[(Q−x)i+h+i(Q+x)i+h−i],

where , , , , “max” is element-wise comparison of two matrices or vectors. Both above algorithms are proved monotonically converging to global minimum of NNQP objective function \eqrefeq:nnqp.

In this paper, we present a new iterative NNLS algorithm along with its convergence analysis. We prove that the quality of the approximation improves monotonically, and the iteration is guaranteed to converge to the global optimal solution. The focus of this paper is theoretical proof of the monotone convergence of the new algorithm. We leave the comparison with other NNLS algorithms for future research. The remainder of this paper is organized as follows. Section 2 presents the new multiplicative NNLS algorithm, we prove the algorithm monotonically decrease the NNQP objective function. In section 3, we discuss two applications of the new algorithm to image processing problems, including image super-resolution and color image labelling. Finally, in section 4, we conclude by summarizing the main advantage of our approach..

## 2 Multiplicative Iteration and Its Convergence Analysis

In this section we derive the multiplicative iteration and discuss its convergence properties. Consider the NNQP problem \eqrefeq:nnqp

 argmin xF(x)=argmin x12xTQx−xThs.t.x≥0,

where is positive definite, .

### 2.1 Multiplicative Iteration

The proposed multiplicative iteration for solving \eqrefeq:nnqp is

 xi←xi[2(Q−x)i+h+i+δ(|Q|x)i+h−i+δ], (4)

where , constant stablizer guarantees the iteration monotonically convergent. We will discuss how to choose later in this section. If all the entries in and are nonnegative, the multiplicative update \eqrefeq:update reduced to ISRA .

Since all the components of the multiplicative factors, i.e. matrices , , , and vectors , , are nonnegative,

 2(Q−x)i+h+i+δ(|Q|x)i+h−i+δ,

given a nonnegative starting initial guess, , all the generated iterations, , are nonnegative. In generating the sequence , the iteration computes the new update by using only the previous vector . It does not need to know all the previous updates, . And the major computational tasks to be performed at each iteration are computations of the matrix-vector products, and . These remarkable properties imply that the multiplicative iteration requires little storage and computation for each iteration.

The iteration \eqrefeq:update is a gradient projection method which can be shown by

 xk+1i−xki = [2(Q−xk)i+h+i+δ(|Q|xk)i+h−i+δ]xki−xki = [2(Q−xk)i+h+i−(|Q|xk)i−h−i(|Q|xk)i+h−i+δ]xki = −[(Qxk)i−hi(|Q|xk)i+h−i+δ]xki = −[xki(|Q|xk)i+h−i+δ]((Qxk)i−hi) = −γk∇(F(xk)),

where the step-size , and the gradient of the NNQP objection function \eqrefeq:nnqp .

### 2.2 Fixed Point

The proposed iteration \eqrefeq:update is motivated by the Karush-Kuhn-Tucker (KKT) first-order optimal condition . Consider the Lagrangian function of NNQP objective function \eqrefeq:nnqp defined by

 L(x,μ)=12xTQx−xTh−μx, (5)

with the scalar Lagrangian multiplier , assume is the optimal solution of Lagrangian function \eqrefeq:nnqplag, the KKT conditions are

 x∗∘(Qx∗−h−μ) = 0 μ∘x∗ = 0,

with denoting the Hadamard product. Above two equalities imply that either th constraint is active , or and when the th constraint is inactive (). Because , equality implies , we obtain , which is equivalent to

 2(Q−x∗)i+h+i+δ(|Q|x∗)i+h−i+δ=1,

which means the th multiplicative factor is constant .Therefore, any optimal solution satisfying the KKT conditions conrresponds to a fixed point of the multiplicative iteration.

### 2.3 Convergence Analysis

In this section, we prove the proposed multiplicative iteration \eqrefeq:update monotonically decrease the value of the NNQP objective function \eqrefeq:nnqp to its global minimum. This analysis is based on construction of an auxiliary function of . Similar techniques have been used in the papers [16, 23, 24].

For the sake of completeness, we begin our discussion with a brief review of the definition of auxiliary function.

###### Definition 2.1

Let and be two positive vectors, function is an auxiliary function of if it satisfies the following two properties

• ;

• . Figure 1: Graph illustrating how to searching the minimum of the objective function F(x) by minimizing the auxiliary function G(x,y) iteratively. Each new update xi is computed by searching the minimum of the auxiliary function G(x,y) in every step.

Figure 1 illustrates the relationship between the auxiliary function and the corresponding objective function . In each iteration, the updated is computed by minimizing the auxiliary function. The iteration stops when it reaches a stationary point. The following lemma, which is also presented in [16, 23, 24], proves the iteration by minimizing auxiliary function in each step decreases the value of the objective function .

###### Lemma 2.2

Let be an auxiliary function of , then is strictly decreasing under the update

 xk+1=argmin xG(x,xk),

if .

{proof}

By the definition of auxiliary function, if , we have

 F(xk+1)

The middle inequality is because of the assumption .

Deriving a suitable auxiliary function for NNQP objective function  \eqrefeq:nnqp is a key step to prove the convergence of our multiplicative iteration. In the following lemma, we prove two positive semi-definite matrices which are used to build our auxiliary function later.

###### Lemma 2.3

Let be a nonnegative real symmetric matrix without all-zero rows and be a vector whose entries are positive. Define the diagonal matrix ,

 Dij={0ifi≠j(Px)ixiotherwise

Then, the matrices, , are positive semi-definite.

{proof}

Consider the matrices,

 M1=diag(xi)(D+P)diag(xi),M2=diag(xi)(D−P)diag(xi),

where represents the diagonal matrix whose entries on the main diagonal are the entries of vector . Since is a positive vector, is invertible. Hence, are congruent with , , correspondingly. The matrices are positive semi-definite if and only if and are positive semi-definite .

Given any nonzero vector ,

 zTM1z = ∑ij(Dij+Pij)xixjzizj = ∑ijDijxixjzizj+∑ijPijxixjzizj = ∑iDiix2iz2i+∑ijPijxixjzizj = ∑i(Px)ixiz2i+∑ijPijxixjzizj = ∑ijPijxixjz2i+∑ijPijxixjzizj = 12∑ijPijxixj(zi+zj)2≥0

Hence, is positive semi-definite. Similarly, we have is positive semi-definite,

 zTM2z=12∑ijPijxixj(zi−zj)2≥0.

Therefore, are positive semi-definite.

Combining two previous lemmas, we construct an auxiliary function for NNQP \eqrefeq:nnqp as follows.

###### Lemma 2.4 (Auxiliary Function)

Let vectors and represent two positive vectors, define the diagonal matrix, , with diagonal entries

 Dii=(|Q|y)i+h−i+δyi,i=1,2,⋯,n,δ>0.

Then the function

 G(x,y)=F(y)+(x−y)T∇F(y)+12(x−y)TD(y)(x−y) (6)

is an auxiliary function for quadratic model

 F(x)=12xTQx−xTh.
{proof}

First of all, it is obvious that which is the second property in Definition 2.1. Next, we have to show the first property, for .

Notice that is the Hesssian matrix of . The Taylor expansion of at is

 F(x)=F(y)+(x−y)T∇F(y)+12(x−y)TQ(x−y).

The difference between and is

 G(x,y)−F(x)=12(x−y)T(D(y)−Q)(x−y).

for if and only if is positive definite.

Recall that , ,

 D(y)−Q = diag((|Q|y)i+h−i+δyi)−Q = diag((|Q|y)i+h−i+δyi)−(Q+−Q−) = diag((Q+y)iyi)−Q++% diag((Q−y)iyi)+Q−+diag(h−i+δyi)

Because is positive definite for , and by Lemma 2.3, and are positive semi-definite. Thus, is positive definite.

Hence, we obtain for any vectors . Therefore, is an auxiliary of .

In previous proof, we use the fact to prove the diagonal matrix is positive definite. Because matrix is positive semi-definite for vector with all positive entries, we can choose to be any positive number. In our experiments, it is chosen to be . Armed with previous lemmas, we are ready to prove the convergence theorem for our multiplicative iteration \eqrefeq:update.

###### Theorem 2.5 (Monotone Convergence)

The value of the objective function in \eqrefeq:nnqp is monotonically decreasing under the multiplicative update

 xk+1i=xki[2(Q−xk)i+h+i+δ(|Q|xk)i+h−i+δ].

It attains the global minimum of at the stationary point of the iteration.

{proof}

By the auxiliary function definition 2.1 and Lemma 2.4, in \eqrefeq:aux is an auxialiry of function . Lemma 2.2 shows that the objective function is monotonically decreasing under the update

 xk+1=argmin x G(x,xk)ifxk+1≠xk.

It remains to show that the proposed iteration \eqrefeq:update approaches the minimum of .

By Fermat’s theorem , taking the first partial derivative of with respect to , and setting it to , we obtain that

 ∇xG(x,xk)=∇F(xk)+D(xk)(x−xk)=0. (7)

Hence,

 x = xk−(D(xk))−1∇F(xk) = xk−(D(xk))−1(Qxk−h) = xk−(D(xk))−1(|Q|xk+h−+δ−2Q−xk−h+−δ) = xk−(D(xk))−1(|Q|xk+h−+δ)+(D(xk))−1(2Q−xk+h++δ) = (D(xk))−1(2Q−xk+h++δ) = diag(2(Q−xk)i+h+i+δ(|Q|xk)i+h−i+δ)xk

where we used the facts that .

The decreasing sequence is bounded below . By Monotone Convergence Theorem , the sequence converges to the limit . Because is continuous, given any compact domain, there exists such that . Since is the global minimum of , the gradient of at is zero, i.e.

 ∇F(x∗)=Qx∗−h=0,

which is equivalent to

 2(Q−x∗)i+h+i+δ(|Q|x∗)i+h−i+δ=1,

which means is a stationary point. Thus, the sequence converges to the global minimum as approaches to the limit point .

## 3 Numerical Experiments

We now illustrate two applications of the proposed NNLS algorithm in image processing problems.

### 3.1 Image Super-Resolution (a) 4 frames of 30 input low-resolution frames

Image super-resolution (SR) refers to the process of combining a set of low resolution images into a single high-resolution image [7, 8]. Each low-resolution image is assumed to be generated from an ideal high-resolution image via a displacement , a decimation , and a noise process :

 yk=DkSkx+nk,k=1,2,⋯,K. (8)

We use the bilinear interpolation proposed by Chung, Haber and Nagy  2 to estimate the displacement matrices . Then we reconstruct the high-resolution image by iteratively solving the NNLS

 argmin x12K∑k=1||DkSkx−yk||2,s.t.x≥0

The low-resolution test data set is taken from the Multi-Dimensional Signal Processing Research Group (MDSP) . Figure a shows 4 of the uncompressed low-resolution text images of size pixels. The reconstructed high-resolution image of size pixels is computed by our algorithm is shown in Figure b. Figure a shows 4 of low-resolution EIA images of size pixels. Figure b shows the reconstructed pixels high-resolution image computed by the proposed multiplicative iteration. As shown in the figures, the high-resolution images are visually much better than the low-resolution images.

### 3.2 Color Image Labeling

In Markov random fields (MRF)-based interactive image segmentation techniques, the user labels a small subset of pixels, and the MRF propagates these labels across the image, typically finding high-gradient contours as segmentation boundaries where the labeling changes . These techniques require users to impose hard constraints by indicating certain pixels (seeds) that absolutely have to be part of the labeling . Intuitively, the hard constraints provide clues as to what the user intends to segment. Denote as the -by- test RGB images, represent a -by- vector at pixel . (a) flowers

The class set is denoted by . The probabilistic labeling approaches compute a probability measure field for each pixel ,

 X={Xkij:k∈C,i=1,2,⋯,m,j=1,2,⋯,n}

with the constraints

 K∑k=1Xkij=1,Xkij≥0,∀k∈C. (9)

Denoting as the set of neighbors of pixel , the cost function are in the following quadratic form

 argmin xK∑k=1m∑i=1n∑j=1⎛⎝α2∑(i′,j′)∈Nijωiji′j′(Xki′j′−Xkij)2+DkijXkij⎞⎠, (10)

with the constraints \eqrefeq:segcon. is the cost of assigning label to pixel . The first term, , which controls the granularity of the regions, promotes smooth regions. The spatial smoothness is controlled by the positive parameter, , and weight, , which is chosen such that if the neighbouring pixels and are likely to belong to the same class and otherwise. In these experiments, is defined to be the cosine of the angle between two neighbouring pixels,

 ωiji′j′=XTijXi′j′|Xij|⋅|Xi′j′|.

The cost of labeling at each pixel , , is trained with a Gaussian mixture model  using seeds labeled by the user. Given sample mean, , and variance, , for the seeds with labeling , is computed as the Mahalanobis distance  between each pixel of the image and the seeds,

 Dkij=12K∑k=1(Xij−μk)T(Σk)−1(Xij−μk)+12log(Σk).

The KKT optimality conditions

 α∑(i′,j′)∈Nijωiji′j′(Xkij−Xki′j′)+Dkij−λij = 0 αXkij⎛⎜⎝∑(i′,j′)∈Nijωiji′j′⎞⎟⎠−α∑(i′,j′)∈Nijωiji′j′Xkij+Dkij−λij = 0

yield a two-step Algorithm 1.

In the experiments, we implement our NNLS algorithm without explicitly constructing the matrix . Figure 4 shows the results of the labeled images.

## 4 Summary

In this paper, we presented a new graduent projection NNLS algorithm and its convergence analysis. By expressing the KKT first-order optimality conditions as a ratio, we obtained a multiplicative iteration that could quickly solves large quadratic programs on low-cost parallel compute devices. The iteration monotonically converges from any positive initial guess. We demonstrated applications to image super-resolution and color image labelling. Our algorithm is also applicable to solve other optimization problems involving nonnegativity contraints. Future research includes comparing the performance of this new NNLS algorithm with other existing NNLS algorithms.

### Footnotes

1. E-mail: xxiao@swufe.edu.cn
2. Thanks to Julianne Chung for providing the Matlab code.

### References

1. G.E.B. Archer and D.M. Titterington. The iterative image space reconstruction algorithm as an alternative to the EM algorithm for solving positive linear inverse problems. Statistica Sinica, 5:77–96, 1995.
2. S. Bellavia, M. Macconi, and B. Morini. An interior point Newton-like method for non-negative least-squares problems with degenerate solution. Numerical Linear Algebra with Applications, 13(10):825–846, 2006.
3. M. Van Benthem and M. Keenan. Fast algorithm for the solution of large-scale non-negativity-constrained least squares problems. J. of Chemometrics, 18(10):441–450, 2004.
4. M. Brand and D. Chen. Parallel quadratic programming for image processing. In 18th IEEE International Conference on Image Processing (ICIP), pages 2261 –2264, September 2011.
5. R. Bro and S. De Jong. A fast non-negativity-constrained least squares algorithm. J. of Chemometrics, 11(5):393–401, 1997.
6. D. Calvetti, G. Landi, L. Reichel, and F. Sgallari. Non-negativity and iterative methods for ill-posed problems. Inverse Problems, 20(6):1747, 2004.
7. D. Chen. Comparisons of multiframe super-resolution algorithms for pure translation motion. Master’s thesis, Wake Forest University, Winston-Salem, NC, August 2008.
8. J. Chung, E. Haber, and J. Nagy. Numerical methods for coupled super-resolution. Inverse Problem, 22:1261–1272, 2006.
9. M. E. Daube-Witherspoon and G. Muehllehner. An iterative image space reconstruction algorthm suitable for volume ECT. Medical Imaging, IEEE Transactions on, 5(2):61 –66, June 1986.
10. P. P. B. Eggermont. Multiplicative iterative algorithms for convex programming. Linear Algebra and its Applications, 130:25–42, 1990.
11. S. Farsiu, D. Robinson, M. Elad, and P. Milanfar. Fast and robust multi-frame super-resolution. IEEE Transactions on Image Processing, 13:1327–1344, 2003.
12. J. Han, L. Han, M. Neumann, and U. Prasad. On the rate of convergence of the image space reconstruction algorithm. Operators and Matrices, 3(1):41–58, 2009.
13. R.A. Horn and C.R. Johnson. Matrix Analysis. Cambridge University Press, 1990.
14. D. Kim, S. Sra, and I. Dhillon. A new projected quasi-newton approach for the non-negative least squares problem. Technical report, The University of Texas at Austin, 2006.
15. C. Lawson and R. Hanson. Solving Least Squares Problems. SIAM, 3rd edition, 1995.
16. D. Lee and S. Seung. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems 13, pages 556–562. MIT Press, April 2001.
17. P. Mahalanobis. On the generalised distance in statistics. In Proceedings National Institute of Science, India, number 2 in 1, pages 49–55, April 1936.
18. J. Nagy and Z. Strako. Enforcing nonnegativity in image reconstruction algorithms. In SPIE Conference Series, volume 4121 of SPIE Conference Series, pages 182–190, October 2000.
19. J. Nocedal and S. Wright. Numerical Optimization. Springer, New York, 2nd edition, 2006.
20. A. De Pierro. On the convergence of the iterative image space reconstruction algorithm for volume ECT. Medical Imaging, IEEE Transactions on, 6(2):174 –175, June 1987.
21. M. Rivera, O. Dalmau, and J. Tago. Image segmentation by convex quadratic programming. In 19th International Conference on Pattern Recognition, 2008, pages 1–5, 2008.
22. W. Rudin. Principles of mathematical analysis. McGraw-Hill Book Co., New York, third edition, 1976. International Series in Pure and Applied Mathematics.
23. F. Sha, Y. Lin, L. Saul, and D. Lee. Multiplicative updates for nonnegative quadratic programming. Neural Comput., 19(8):2004–2031, 2007.
24. F. Sha, L. Saul, and D. Lee. Multiplicative updates for nonnegative quadratic programming in support vector machines. In Advances in Neural Information Processing Systems 15, pages 1041–1048. MIT Press, 2002.
25. L. Xu and M. Jordan. On convergence properties of the EM algorithm for gaussian mixtures. Neural Computation, 8:129–151, 1995.
26. R. Zdunek and A. Cichocki. Nonnegative matrix factorization with quadratic programming. Neurocomput., 71:2309–2320, June 2008.
Comments 0
You are adding the first comment!
How to quickly get a good reply:
• Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
• Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
• Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters Loading ...
130532 You are asking your first question!
How to quickly get a good answer:
• Keep your question short and to the point
• Check for grammar or spelling errors.
• Phrase it like a question Test
Test description