Demystifying the characterization of SDP matrices in mathematical
programming
Daniel Porumbel
October 25, 2022
Argument
This manuscript was written because I found no other introduction to SDP programming that targets the same
audience. This work is intended to be accessible to anybody who does not hate maths, who knows what a derivative
is and accepts (or has a proof of) results like det(AB) = det(A) det(B). If you know this, I think you can understand
most of this text without buying other books; my goal is not to remind/enumerate a list of results but to (try to)
make the reader examine (the proofs of) these results to get full insight into them.
A first difference compared to other existing introductions to SDP is that this work comes out of a mind that was
itself struggling to understand. This may seem to be only a weakness, but, paradoxically, it is both a weakness and a
strength. First, I did not try to overpower the reader, but I tried to minimize the distance between the author and
the reader as much as possible, even hoping to achieve a small level of mutual empathy. This enabled me to avoid a
quite common pitfall: many long-acknowledged experts tend to forget the difficulties of beginners. Other experts try
to make all proofs as short as possible and to dismiss as unimportant certain key results they have seen thousands of
times in their career. I also avoided this, even if I did shorten a few proofs when I revised this manuscript two years
after it was first written. However, I also kept certain proofs that seem longer than necessary because I feel they
offer more insight; an important goal is to capture the “spirit” of each proven result instead of reducing it to a flow
of formulae.
The very first key step towards mastering SDP programming is to get full insight into the eigen-decomposition
of real symmetric matrices. It is enough to see the way many other introductions to SDP programming address
this eigen-decomposition to understand that their target audience is different from mine. They often list the eigen-
decomposition without proof, while I give two proofs to really familiarize the reader with it.1
If you can say “Ok, I still vaguely remember the eigen-decomposition (and other key SDP properties) from my
(under-)graduate studies some n ≥ 5 years ago; I don’t need a proof”, then you do not belong to my target audience.
I am highly skeptical that such an approach can lead to anything but superficial learning. Anyhow, my brain functions
in exactly the opposite manner. I like to learn by checking all proofs by myself and I can’t stand taking things for
granted. The only unproven facts from this manuscript are the fundamental theorem of algebra and two results from
Section 5.3.2.3. But I do provide complete proofs, for instance, for the Cholesky decomposition of SDP matrices, the
strong duality theorem for linear conic programming (including SDP programming), six equivalent formulations of the
Lovász theta number, the copositive formulation of the maximum stable set problem, a few convexification results for quadratic
programs and many others. I tried to prove everything by myself, so that certain proofs are original although this
introduction was not intended to be research work; of course I got help multiple times from the internet, articles and
books, the references being indicated as footnote citations.
The most essential building blocks are presented in the first part. One should really master this first part before
jumping to the second one; the essentials from the first part may even be generally useful for reading other SDP
1In “Handbook of Semidefinite Programming: Theory, Algorithms, and Applications” by H. Wolkowicz, R. Saigal and
L. Vandenberghe, the eigen-decomposition (called the spectral theorem) is listed with no proof in Chapter 2, “Convex Analysis on
Symmetric Matrices”. The introduction of the “Handbook on Semidefinite, Conic and Polynomial Optimization” by M. Anjos
and J.B. Lasserre refers the reader to the (700-page) book “Matrix Analysis” by Horn and Johnson. As a side
remark, the introductions of both these handbooks are rather short (14 and 22 pages, respectively) and they mainly remind/enumerate
different key results, pointing to other books for proofs. In “Semidefinite Programming for Combinatorial Optimization” by
C. Helmberg, the eigen-decomposition is presented in an appendix that redirects the reader to the same “Matrix Analysis” book.
The slides of the course “Programmation linéaire et optimisation combinatoire” of Frédéric Roupin for the “Master Parisien
de Recherche Opérationnelle” (lipn.univ-paris13.fr/~roupin/docs/MPROSDPRoupin2018-partie1.pdf) cover many results
from my manuscript but give no proofs. The MIT course “Introduction to Semidefinite Programming” by R. Freund does
not even provide the SDP definition or the eigen-decomposition. The book “Convex Optimization” by S. Boyd and L. Vanden-
berghe starts using SDP matrices from the beginning (e.g., to define ellipsoids in Section 2.2.2) without defining the concept of
SDP matrix, not even in an appendix. The argument could extend to other non-trivial concepts that are taken as prerequisites in the
above works. For instance, the above “Convex Optimization” book introduces the square root of an SDP matrix (in five lines
in Appendix A.5.2) without showing its uniqueness – the proof takes half a page in Appendix B.4 of this manuscript.
work. In fact, the goal of this manuscript is to give you all the tools needed to move to the next level and carry out
research work.
Contents

PART 1 THE ESSENTIAL BUILDING BLOCKS

1 Characterization of semidefinite positive (SDP) matrices . . . 4
  1.1 Real symmetric matrices, eigenvalues and the eigendecomposition . . . 4
  1.2 Equivalent SDP definitions . . . 5
  1.3 Schur complements, the self-duality of the SDP cone and related properties . . . 8
  1.4 Three easy ways to generate (semi-)definite positive matrices . . . 10
  1.5 Positive definite matrices: unique Cholesky factorization and Sylvester criterion . . . 11
  1.6 Cholesky decomposition of semidefinite positive matrices . . . 13
  1.7 Any $A \succeq 0$ has infinitely many factorizations $A = VV^\top$ related by rotations and reflections . . . 18
  1.8 Convex functions have an SDP Hessian assuming the Hessian is symmetric . . . 19

2 Primal-Dual SDP programs and optimization considerations . . . 20
  2.1 Primal and dual SDP programs . . . 20
  2.2 Relations between the primal optimum and the dual optimum . . . 27
  2.3 Strong duality . . . 29
  2.4 The difficulty of exactly solving (SDP) and algorithmic comments . . . 33

3 Interesting SDP programs . . . 34
  3.1 An SDP program does not always reach its min (inf) or max (sup) value . . . 34
  3.2 The lowest and greatest eigenvalue using the SDP duality . . . 34
  3.3 Change of variable in SDP programs . . . 35
  3.4 Convex quadratic programming is a particular case of SDP programming . . . 35
  3.5 An LP with equality constraints as an SDP program in the dual form . . . 38

PART 2 MORE ADVANCED SDP PROGRAMMING

4 Six equivalent formulations of the Lovász theta number $\vartheta(G)$ . . . 39
  4.1 A first SDP formulation of the theta number . . . 39
  4.2 A second SDP formulation $\vartheta'(G)$ of the theta number . . . 42
  4.3 A formulation $\vartheta''(G)$ of the theta number without SDP matrices . . . 43
  4.4 A fourth formulation $\vartheta^{\ell}(G)$ of the theta number . . . 44
  4.5 Two formulations of the theta number using maximum eigenvalues . . . 45
  4.6 The theta number $\vartheta(G)$ is bounded by the fractional chromatic number $\chi_f(G)$ of $G$ . . . 46

5 A taste of copositive optimization and sum-of-squares hierarchies . . . 48
  5.1 Introducing the completely positive and the copositive cones . . . 48
  5.2 Reformulating a homogeneous quadratic program as a copositive problem . . . 49
  5.3 Relaxations of the copositive formulation of the maximum stable set . . . 52
  5.4 Further characterization of the completely positive and the copositive cones . . . 58
  5.5 A final short property: the Schur complement does not apply in $\mathcal{C}_n$ . . . 60

6 SDP relaxations and convexifications of quadratic programs . . . 60
  6.1 The most general quadratic program: SDP relaxation and total Lagrangian . . . 61
  6.2 Partial and total Lagrangians for quadratic programs with linear equality constraints . . . 62
  6.3 The case of 0-1 quadratic programs: partial and total Lagrangians . . . 67

7 Basic elements of several other research topics: under construction . . . 73
  7.1 Approximation algorithms using SDP programming . . . 73
  7.2 Strong duality in the more general context of linear conic programming . . . 74
  7.3 Polynomial optimization . . . 79
  7.4 Algorithms for SDP optimization . . . 79

APPENDIX

A On ranks, determinants and space dimensions . . . 79
  A.1 The rank-nullity theorem and other interesting rank properties . . . 79
  A.2 Results on determinants and space dimensions . . . 81

B Three decompositions: eigenvalue, QR and square root . . . 84
  B.1 Preliminaries on eigenvalues/eigenvectors and similar matrices . . . 84
  B.2 The eigenvalue decomposition . . . 85
  B.3 The QR decomposition of real matrices . . . 89
  B.4 An SDP matrix has a unique SDP square root factor . . . 90

C Useful related facts . . . 91
  C.1 Optimality conditions for linearly-constrained quadratic programs . . . 91
  C.2 More insight and detail into the convexifications from Section 6 . . . 93
  C.3 A convex function with an asymmetric Hessian . . . 97
  C.4 The separating hyperplane theorem . . . 99
PART 1: THE ESSENTIAL BUILDING BLOCKS
1 Characterization of semidefinite positive (SDP) matrices
1.1 Real symmetric matrices, eigenvalues and the eigendecomposition
This light introduction aims at familiarizing the reader with the main concepts of real symmetric matrices,
eigenvalues and the eigenvalue decomposition. Experts on this topic can skip to Section 1.2 or even to further
sections. Absolute beginners should first consult Appendix A to get the notion of matrix rank, (sub-)space
dimension, (principal) minor, or to recall how $\det(A) = 0 \Rightarrow \exists x \neq 0$ s.t. $Ax = 0$. To familiarize with such
introductory concepts it is also useful to solve a few exercises, but the only exercise I propose is to ask the
reader to prove by himself all theorems whose proof does not exceed half a page.
Given a (real symmetric) matrix $A$, we say $\lambda$ is a (real) eigenvalue of $A$ if there exists an eigenvector $v$ such
that $Av = \lambda v$. Notice that by multiplying the eigenvector by a constant we obtain another eigenvector.
An eigenvector is called unitary if it has norm 1, i.e., $\|v\|^2 = v^\top v = \sum_{i=1}^n v_i^2 = 1$. We say $v$ is an
eigenvector if it satisfies $\lambda I_n v - Av = 0$, equivalent to $(\lambda I_n - A)v = 0$, which also means $\det(\lambda I_n - A) = 0$.
We can also define the eigenvalues as the roots of the characteristic polynomial $\det(xI_n - A)$. Indeed, if
$\det(\lambda_1 I_n - A) = 0$, there exists a (real) eigenvector $v_1$ such that $(\lambda_1 I_n - A)v_1 = 0$, equivalent to $Av_1 = \lambda_1 v_1$.
The fact that $v_1$ needs to exist is formally proven in Prop. A.2.2; this Prop. A.2.2 proves the statement in
the more general context of complex matrices because we should not yet take for granted that the root $\lambda_1$ is not
complex, i.e., $\lambda_1 I_n - A$ could have complex entries in principle. However, it is possible to show that a real
symmetric matrix has only real eigenvalues and real proper eigenvectors. The proof is not completely
obvious; it takes a third of a page and it is given in the appendix (Prop. B.1.2). By developing the characteristic
polynomial, one can also prove that the determinant is the product of the eigenvalues (Prop. A.2.4).
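For readers who like to experiment, here is a minimal numerical sanity check of these two facts – real eigenvalues for real symmetric matrices, and the determinant being the product of the eigenvalues. The specific matrix and the numpy calls below are my own illustration, not part of the manuscript's formal development.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                  # symmetrizing yields a real symmetric matrix

    # eigvalsh is specialized to symmetric matrices: it returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(A)
    print(eigenvalues.dtype)           # float64: the eigenvalues are real

    # det(A) equals the product of the eigenvalues (Prop. A.2.4)
    print(np.isclose(np.linalg.det(A), np.prod(eigenvalues)))   # True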
The characteristic polynomial has degree $n$, and so, it has $n$ roots (eigenvalues), but some of them can have
multiplicities greater than 1, i.e., some eigenvalues can appear more than once. However, each eigenvalue is
associated to at least one eigenvector. An eigenvalue with multiplicity greater than 1 can have more than
one eigenvector. The following result is called the eigendecomposition of the real symmetric matrix A.
$$A = \begin{bmatrix} v_{1,1} & v_{2,1} & \cdots & v_{n,1} \\ v_{1,2} & v_{2,2} & \cdots & v_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ v_{1,n} & v_{2,n} & \cdots & v_{n,n} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \begin{bmatrix} v_{1,1} & v_{1,2} & \cdots & v_{1,n} \\ v_{2,1} & v_{2,2} & \cdots & v_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n,1} & v_{n,2} & \cdots & v_{n,n} \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \begin{bmatrix} v_1^\top \\ v_2^\top \\ \vdots \\ v_n^\top \end{bmatrix} = U \Lambda U^\top = \sum_{i=1}^{n} \lambda_i v_i v_i^\top, \tag{1.1.1}$$

in which the columns of the left factor are the unitary eigenvectors of $A$ (with $v_{i,j}$ the $j$-th entry of $v_i$), the middle factor is diagonal with the eigenvalues of $A$, and the right factor contains the transposed eigenvectors,
where $v_1, v_2, \ldots, v_n$ are the unitary (column) eigenvectors of $A$ associated to the (possibly repeated) eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$ and $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is the diagonal matrix with $\Lambda_{ii} = \lambda_i \ \forall i \in [1..n]$. The eigenvectors
are unitary and orthogonal, meaning that $v_i^\top v_j = 0 \ \forall i, j \in [1..n],\ i \neq j$ and $v_i^\top v_i = 1 \ \forall i \in [1..n]$. This
directly leads to
$$[v_1\, v_2\, \ldots\, v_n]^\top [v_1\, v_2\, \ldots\, v_n] = I_n, \text{ equivalent to } [v_1\, v_2\, \ldots\, v_n][v_1\, v_2\, \ldots\, v_n]^\top = I_n, \tag{1.1.2}$$
using the well-known property $XY = I \Rightarrow YX = I$ (Prop. A.2.5).
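A short numerical verification of (1.1.1) and (1.1.2) may help fix the ideas. This is my own sketch with numpy, whose eigh routine computes exactly this decomposition for symmetric matrices; the matrix itself is an arbitrary example.

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((5, 5))
    A = (B + B.T) / 2                        # a real symmetric matrix

    lam, U = np.linalg.eigh(A)               # columns of U are unitary eigenvectors

    print(np.allclose(U.T @ U, np.eye(5)))   # (1.1.2): U^T U = I_n, hence U U^T = I_n
    print(np.allclose(A, U @ np.diag(lam) @ U.T))   # (1.1.1): A = U Λ U^T
    # the rank-one expansion A = Σ_i λ_i v_i v_i^T
    A_sum = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(5))
    print(np.allclose(A, A_sum))             # True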
The simple case of distinct eigenvalues
It is important to familiarize with this decomposition. For this, let us first examine a proof that works for
symmetric matrices $A$ with distinct eigenvalues (all with multiplicity one). We will first show that the eigenvectors
of a symmetric $A$ are orthogonal. Let $v_1, v_2, \ldots, v_n$ be the unitary eigenvectors of $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively.
For any $i, j \in [1..n]$, $i \neq j$, we can write $v_i^\top A v_j = v_i^\top \lambda_j v_j = \lambda_j v_i^\top v_j$ and also $v_i^\top A v_j = \lambda_i v_i^\top v_j$, based on
$v_i^\top A = ((v_i^\top A)^\top)^\top = (A^\top v_i)^\top = (A v_i)^\top = \lambda_i v_i^\top$. This leads to $\lambda_j v_i^\top v_j = \lambda_i v_i^\top v_j$, and, using $\lambda_i \neq \lambda_j$, we
obtain $v_i^\top v_j = 0 \ \forall i, j \in [1..n],\ i \neq j$.

We now construct the eigen-decomposition. Using $\lambda_i v_i = A v_i \ \forall i \in [1..n]$, we obtain $A[v_1\, v_2\, \ldots\, v_n] =
[\lambda_1 v_1\ \lambda_2 v_2\ \ldots\ \lambda_n v_n]$, equivalent to $A[v_1\, v_2\, \ldots\, v_n] = [v_1\, v_2\, \ldots\, v_n]\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. We now multiply
both sides at the right by $[v_1\, v_2\, \ldots\, v_n]^\top$, which is equal to $[v_1\, v_2\, \ldots\, v_n]^{-1}$ by virtue of (1.1.2); we thus obtain
$A = [v_1\, v_2\, \ldots\, v_n]\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)[v_1\, v_2\, \ldots\, v_n]^\top$, which is exactly (1.1.1).
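To make the construction concrete, consider a small worked example (of my own choosing, not from the manuscript) with two distinct eigenvalues:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \det(xI_2 - A) = (x-2)^2 - 1 = (x-1)(x-3),$$
so $\lambda_1 = 1$ and $\lambda_2 = 3$, with unitary eigenvectors $v_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $v_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. One checks $v_1^\top v_2 = 0$ and, indeed,
$$1 \cdot v_1 v_1^\top + 3 \cdot v_2 v_2^\top = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} + \frac{3}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = A.$$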
Generalizing the proof for the case of eigenvalues with different multiplicities
We now prove the eigen-decomposition in the general case by extending the above construction. We need
two steps.

Step 1. We first show that the above construction for eigenvalues with unitary multiplicities can be readily
generalized to the case where there is a different eigenvector for each of the $k_i$ repetitions of each
root $\lambda_i$ of the characteristic polynomial (i.e., for any eigenvalue $\lambda_i$). In this case, there are $k_i$ linearly
independent eigenvectors associated to $\lambda_i$; we say that the eigenspace dimension $k_i'$ of $\lambda_i$ (the geometric
multiplicity $k_i'$) is equal to the algebraic multiplicity $k_i$. This step is relatively straightforward and it
is proved in full detail (and slightly generalized) in Prop. B.2.2.

Step 2. We need to show that the eigenspace of $\lambda_i$ has dimension $k_i$. To show this, we use the notion
of similar matrices: $X$ and $Y$ are similar if there exists $U$ such that $X = UYU^{-1}$; we say that
$Y$ is the representation of $X$ in the basis composed of the columns of $U$. Two simple properties
(Prop. B.1.4 and B.1.5) show that: (i) two similar matrices have the same characteristic polynomial, so
that any common eigenvalue $\lambda_i$ is repeated $k_i$ times, and (ii) similar matrices have the same eigenspace
dimension $k_i'$ for any common eigenvalue $\lambda_i$. Assuming $k_i' < k_i$ for symmetric $A \in \mathbb{R}^{n \times n}$, we can
construct $U \in \mathbb{R}^{n \times n}$ by putting $k_i'$ orthogonal unitary eigenvectors of $\lambda_i$ on the first $k_i'$ columns and by
filling the other columns (introducing $n - k_i'$ unitary vectors perpendicular to the first $k_i'$ columns) such
that $U^\top U = I_n$. We obtain $U^\top A U = \left[\begin{smallmatrix} \lambda_i I_{k_i'} & 0 \\ 0 & D \end{smallmatrix}\right]$. Since similar matrices have the same characteristic
polynomial and $k_i' < k_i$, $\lambda_i$ needs to be a root of the characteristic polynomial of $D$, and so, $\lambda_i$ needs
to have an eigenvector in $D$. This means that the eigenspace of $\lambda_i$ in $\left[\begin{smallmatrix} \lambda_i I_{k_i'} & 0 \\ 0 & D \end{smallmatrix}\right]$ has dimension at least
$k_i' + 1$, contradicting the fact that similar matrices have the same eigenspace dimension for $\lambda_i$. As such,
the assumption $k_i' < k_i$ was false and we obtain $k_i' = k_i$, i.e., $\lambda_i$ has the same algebraic and geometric
multiplicity $k_i$. For the skeptical, this Step 2 is proven in full detail in Prop. B.2.3.
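The conclusion of Step 2 can also be observed numerically. In the sketch below (my own example, assuming a tolerance of 1e-8 for the rank computation), a symmetric matrix is built with the eigenvalue 2 repeated twice, and its geometric multiplicity indeed matches:

    import numpy as np

    # Build a symmetric matrix whose eigenvalue 2 has algebraic multiplicity 2:
    # rotate diag(2, 2, 5) by an orthogonal matrix Q from a QR factorization.
    rng = np.random.default_rng(2)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    A = Q @ np.diag([2.0, 2.0, 5.0]) @ Q.T

    # Geometric multiplicity of λ = 2: dim ker(A - 2I) = n - rank(A - 2I).
    rank = np.linalg.matrix_rank(A - 2.0 * np.eye(3), tol=1e-8)
    print(3 - rank)   # prints 2: geometric multiplicity = algebraic multiplicity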
1.2 Equivalent SDP definitions
Definition 1.2.1. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semidefinite (SDP) if the following holds for
any vector $x \in \mathbb{R}^n$:
$$x^\top A x = x \cdot Ax = A \cdot xx^\top = \operatorname{trace}(A\,xx^\top) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \geq 0, \tag{1.2.1}$$
where $\cdot$ stands for the scalar product. If the above inequality is always strict for $x \neq 0$, the matrix is positive definite.
If the inequality is reversed, the matrix is negative semidefinite.
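As a quick numerical illustration of the chain of identities in (1.2.1) (my own example, not taken from the manuscript), one can check that the three expressions coincide; here $A = BB^\top$ is symmetric and SDP, so the common value is nonnegative:

    import numpy as np

    rng = np.random.default_rng(3)
    B = rng.standard_normal((4, 4))
    A = B @ B.T                          # A = B B^T is symmetric and SDP
    x = rng.standard_normal(4)

    # The three expressions in (1.2.1) agree (and are >= 0 since A is SDP):
    print(x @ A @ x)                     # x^T A x
    print(np.trace(A @ np.outer(x, x)))  # trace(A x x^T)
    print(sum(A[i, j] * x[i] * x[j] for i in range(4) for j in range(4)))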
Proposition 1.2.2. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semidefinite (resp. definite) if and only if all
its eigenvalues $\lambda_i$ verify $\lambda_i \geq 0$ (resp. $\lambda_i > 0$).
Proof.
($\Rightarrow$) We consider $A$ positive semidefinite (resp. definite). Assume there exists an eigenvalue $\lambda < 0$ (resp. $\lambda \leq 0$)
associated to an eigenvector $v$. We have $v^\top A v = v^\top \lambda v = \lambda(v_1^2 + v_2^2 + \cdots + v_n^2)$. If $\lambda < 0$ (resp. $\lambda \leq 0$), then
$v^\top A v < 0$ (resp. $v^\top A v \leq 0$), contradicting (1.2.1) because an eigenvector is nonzero by definition; hence all
eigenvalues verify $\lambda_i \geq 0$ (resp. $\lambda_i > 0$).
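Proposition 1.2.2 immediately suggests a practical test for positive semidefiniteness. The sketch below is my own, with a tolerance parameter I chose to absorb floating-point error; it checks the sign of the smallest eigenvalue:

    import numpy as np

    def is_psd(A, tol=1e-10):
        """Test positive semidefiniteness of a symmetric A via Prop. 1.2.2:
        A is SDP iff its smallest eigenvalue is >= 0 (here >= -tol numerically)."""
        return np.linalg.eigvalsh(A)[0] >= -tol  # eigvalsh sorts eigenvalues ascending

    print(is_psd(np.array([[2.0, -1.0], [-1.0, 2.0]])))  # True: eigenvalues 1 and 3
    print(is_psd(np.array([[0.0, 1.0], [1.0, 0.0]])))    # False: eigenvalues -1 and 1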