Demystifying the characterization of SDP matrices in mathematical
programming
Daniel Porumbel
October 25, 2022
Argument
This manuscript was written because I found no other introduction to SDP programming that targets the same
audience. This work is intended to be accessible to anybody who does not hate maths, who knows what a derivative
is and accepts (or has a proof of) results like det(AB) = det(A) det(B). If you know this, I think you can understand
most of this text without buying other books; my goal is not to remind/enumerate a list of results but to (try to)
make the reader examine (the proofs of) these results to get full insight into them.
A first difference compared to other existing introductions to SDP is that this work comes out of a mind that was
itself struggling to understand. This may seem to be only a weakness, but, paradoxically, it is both a weakness and a
strength. First, I did not try to overpower the reader, but I tried to minimize the distance between the author and
the reader as much as possible, even hoping to achieve a small level of mutual empathy. This enabled me to avoid a
quite common pitfall: many long-acknowledged experts tend to forget the difficulties of beginners. Other experts try
to make all proofs as short as possible and to dismiss as unimportant certain key results they have seen thousands of
times in their career. I also avoided this, even if I did shorten a few proofs when I revised this manuscript two years
after it was first written. However, I also kept certain proofs that seem longer than necessary because I feel they
offer more insight; an important goal is to capture the “spirit” of each proven result instead of reducing it to a flow
of formulae.
The very first key step towards mastering SDP programming is to get full insight into the eigen-decomposition
of real symmetric matrices. It is enough to see the way many other introductions to SDP programming address
this eigen-decomposition to understand that their target audience is different from mine. They often list the eigen-
decomposition without proof, while I give two proofs to really familiarize the reader with it.1
If you can say “Ok, I still vaguely remember the eigen-decomposition (and other key SDP properties) from my
(under-)graduate studies some n ≥ 5 years ago; I don’t need a proof”, then you do not belong to my target audience.
I am highly skeptical that such an approach can lead to anything but superficial learning. Anyhow, my brain functions
in exactly the opposite manner. I like to learn by checking all proofs by myself and I can’t stand taking things for
granted. The only unproven facts from this manuscript are the fundamental theorem of algebra and two results from
Section 5.3.2.3. But I do provide complete proofs, for instance, for the Cholesky decomposition of SDP matrices, the
strong duality theorem for linear conic programming (including SDP programming), six equivalent formulations of the
Lovász theta number, the copositive formulation of the maximum stable set problem, a few convexification results for quadratic
programs and many others. I tried to prove everything by myself, so that certain proofs are original although this
introduction was not intended to be research work; of course I got help multiple times from the internet, articles and
books, the references being indicated as footnote citations.
The most essential building blocks are presented in the first part. One should really master this first part before
jumping to the second one; the essentials from the first part may even be generally useful for reading other SDP
1In “Handbook of Semidefinite Programming: Theory, Algorithms, and Applications” by H. Wolkowicz, R. Saigal and
L. Vandenberghe, the eigen-decomposition (called the spectral theorem) is listed with no proof in Chapter 2, “Convex Analysis on
Symmetric Matrices”. The introduction of the “Handbook on Semidefinite, Conic and Polynomial Optimization” by M. Anjos
and J.B. Lasserre refers the reader to the (700-page) book “Matrix Analysis” by Horn and Johnson. As a side
remark, the introductions of both these handbooks are rather short (14 and 22 pages, respectively) and they mainly remind/enumerate
different key results, pointing to other books for proofs. In “Semidefinite Programming for Combinatorial Optimization” by
C. Helmberg, the eigen-decomposition is presented in an appendix that redirects the reader to the same “Matrix Analysis” book.
The slides of the course “Programmation linéaire et optimisation combinatoire” of Frédéric Roupin for the “Master Parisien
de Recherche Opérationnelle” (lipn.univ-paris13.fr/~roupin/docs/MPROSDPRoupin2018-partie1.pdf) cover many results
from my manuscript but give no proofs. The MIT course “Introduction to Semidefinite Programming” by R. Freund does
not even provide the SDP definition or the eigen-decomposition. The book “Convex Optimization” by S. Boyd and L. Vanden-
berghe starts using SDP matrices from the beginning (e.g., to define ellipsoids in Section 2.2.2) without defining the concept of
SDP matrix, not even in an appendix. The argument could extend to other non-trivial concepts that are taken as prerequisites in the
above works. For instance, the above “Convex Optimization” book introduces the square root of an SDP matrix (in five lines
in Appendix A.5.2) without showing its uniqueness – the proof takes half a page in Appendix B.4 of this manuscript.
work. In fact, the goal of this manuscript is to give you all the tools needed to move to the next level and carry out
research work.
Contents

PART 1 THE ESSENTIAL BUILDING BLOCKS

1 Characterization of semidefinite positive (SDP) matrices . . . 4
  1.1 Real symmetric matrices, eigenvalues and the eigendecomposition . . . 4
  1.2 Equivalent SDP definitions . . . 5
  1.3 Schur complements, the self-duality of the SDP cone and related properties . . . 8
  1.4 Three easy ways to generate (semi-)definite positive matrices . . . 10
  1.5 Positive definite matrices: unique Cholesky factorization and Sylvester criterion . . . 11
  1.6 Cholesky decomposition of semidefinite positive matrices . . . 13
  1.7 Any $A \succeq 0$ has infinitely many factorizations $A = VV^\top$ related by rotations and reflections . . . 18
  1.8 Convex functions have an SDP Hessian assuming the Hessian is symmetric . . . 19

2 Primal-Dual SDP programs and optimization considerations . . . 20
  2.1 Primal and dual SDP programs . . . 20
  2.2 Relations between the primal optimum and the dual optimum . . . 27
  2.3 Strong duality . . . 29
  2.4 The difficulty of exactly solving (SDP) and algorithmic comments . . . 33

3 Interesting SDP programs . . . 34
  3.1 An SDP program does not always reach its min (inf) or max (sup) value . . . 34
  3.2 The lowest and greatest eigenvalue using the SDP duality . . . 34
  3.3 Change of variable in SDP programs . . . 35
  3.4 Convex quadratic programming is a particular case of SDP programming . . . 35
  3.5 An LP with equality constraints as an SDP program in the dual form . . . 38

PART 2 MORE ADVANCED SDP PROGRAMMING

4 Six equivalent formulations of the Lovász theta number $\vartheta(G)$ . . . 39
  4.1 A first SDP formulation of the theta number . . . 39
  4.2 A second SDP formulation $\vartheta'(G)$ of the theta number . . . 42
  4.3 A formulation $\vartheta''(G)$ of the theta number without SDP matrices . . . 43
  4.4 A fourth formulation $\vartheta^{\ell}(G)$ of the theta number . . . 44
  4.5 Two formulations of the theta number using maximum eigenvalues . . . 45
  4.6 The theta number $\vartheta(G)$ is bounded by the fractional chromatic number $\chi_f(G)$ of $G$ . . . 46

5 A taste of copositive optimization and sum-of-squares hierarchies . . . 48
  5.1 Introducing the completely positive and the copositive cones . . . 48
  5.2 Reformulating a homogeneous quadratic program as a copositive problem . . . 49
  5.3 Relaxations of the copositive formulation of the maximum stable set . . . 52
  5.4 Further characterization of the completely positive and the copositive cones . . . 58
  5.5 A final short property: the Schur complement does not apply in $\mathcal{C}_n$ . . . 60

6 SDP relaxations and convexifications of quadratic programs . . . 60
  6.1 The most general quadratic program: SDP relaxation and total Lagrangian . . . 61
  6.2 Partial and total Lagrangians for quadratic programs with linear equality constraints . . . 62
  6.3 The case of 0-1 quadratic programs: partial and total Lagrangians . . . 67

7 Basic elements of several other research topics: under construction . . . 73
  7.1 Approximation algorithms using SDP programming . . . 73
  7.2 Strong duality in the more general context of linear conic programming . . . 74
  7.3 Polynomial optimization . . . 79
  7.4 Algorithms for SDP optimization . . . 79

APPENDIX

A On ranks, determinants and space dimensions . . . 79
  A.1 The rank-nullity theorem and other interesting rank properties . . . 79
  A.2 Results on determinants and space dimensions . . . 81

B Three decompositions: eigenvalue, QR and square root . . . 84
  B.1 Preliminaries on eigenvalues/eigenvectors and similar matrices . . . 84
  B.2 The eigenvalue decomposition . . . 85
  B.3 The QR decomposition of real matrices . . . 89
  B.4 An SDP matrix has a unique SDP square root factor . . . 90

C Useful related facts . . . 91
  C.1 Optimality conditions for linearly-constrained quadratic programs . . . 91
  C.2 More insight and detail into the convexifications from Section 6 . . . 93
  C.3 A convex function with an asymmetric Hessian . . . 97
  C.4 The separating hyperplane theorem . . . 99
PART 1: THE ESSENTIAL BUILDING BLOCKS
1 Characterization of semidefinite positive (SDP) matrices
1.1 Real symmetric matrices, eigenvalues and the eigendecomposition
This light introduction aims at familiarizing the reader with the main concepts of real symmetric matrices,
eigenvalues and the eigenvalue decomposition. Experts on this topic can skip to Section 1.2 or even to further
sections. Absolute beginners should first consult Appendix A to get the notion of matrix rank, (sub-)space
dimension, (principal) minor, or to recall how $\det(A) = 0 \Rightarrow \exists x \neq 0$ s.t. $Ax = 0$. To familiarize with such
introductory concepts it is also useful to solve a few exercises, but the only exercise I propose is to ask the
reader to prove by himself all theorems whose proof does not exceed half a page.
Given a (real symmetric) matrix $A$, we say $\lambda$ is a (real) eigenvalue of $A$ if there exists an eigenvector $v$ such
that $Av = \lambda v$. Notice that by multiplying the eigenvector by a constant we obtain another eigenvector.
An eigenvector is called unitary if it has norm 1, i.e., $\|v\|^2 = v^\top v = \sum_{i=1}^n v_i^2 = 1$. We say $v$ is an
eigenvector if it satisfies $\lambda I_n v - Av = 0$, equivalent to $(\lambda I_n - A)v = 0$, which also means $\det(\lambda I_n - A) = 0$.
We can also define the eigenvalues as the roots of the characteristic polynomial $\det(xI_n - A)$. Indeed, if
$\det(\lambda_1 I_n - A) = 0$, there exists a (real) eigenvector $v_1$ such that $(\lambda_1 I_n - A)v_1 = 0$, equivalent to $Av_1 = \lambda_1 v_1$.
The fact that $v_1$ needs to exist is formally proven in Prop. A.2.2; this Prop. A.2.2 proves the statement in
the more general context of complex matrices because we should not yet take for granted that the root $\lambda_1$ is not
complex, i.e., $\lambda_1 I_n - A$ could have complex entries in principle. However, it is possible to show that a real
symmetric matrix has only real eigenvalues and real proper eigenvectors. The proof is not completely
obvious; it takes a third of a page and it is given in the appendix (Prop. B.1.2). By developing the characteristic
polynomial, one can also prove that the determinant is the product of the eigenvalues (Prop. A.2.4).
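For readers who like to experiment, here is a minimal numerical sanity check of these two facts – real eigenvalues for real symmetric matrices, and the determinant being the product of the eigenvalues. The specific matrix and the numpy calls below are my own illustration, not part of the manuscript's formal development.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                  # symmetrizing yields a real symmetric matrix

    # eigvalsh is specialized to symmetric matrices: it returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(A)
    print(eigenvalues.dtype)           # float64: the eigenvalues are real

    # det(A) equals the product of the eigenvalues (Prop. A.2.4)
    print(np.isclose(np.linalg.det(A), np.prod(eigenvalues)))   # True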
The characteristic polynomial has degree $n$, and so, it has $n$ roots (eigenvalues), but some of them can have
multiplicities greater than 1, i.e., some eigenvalues can appear more than once. However, each eigenvalue is
associated to at least one eigenvector. An eigenvalue with multiplicity greater than 1 can have more than
one eigenvector. The following result is called the eigendecomposition of the real symmetric matrix A.
$$A = \begin{bmatrix} v_{1,1} & v_{2,1} & \cdots & v_{n,1} \\ v_{1,2} & v_{2,2} & \cdots & v_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ v_{1,n} & v_{2,n} & \cdots & v_{n,n} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \begin{bmatrix} v_{1,1} & v_{1,2} & \cdots & v_{1,n} \\ v_{2,1} & v_{2,2} & \cdots & v_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n,1} & v_{n,2} & \cdots & v_{n,n} \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \begin{bmatrix} v_1^\top \\ v_2^\top \\ \vdots \\ v_n^\top \end{bmatrix} = U \Lambda U^\top = \sum_{i=1}^{n} \lambda_i v_i v_i^\top, \tag{1.1.1}$$

in which the columns of the left factor are the unitary eigenvectors of $A$ (with $v_{i,j}$ the $j$-th entry of $v_i$), the middle factor is diagonal with the eigenvalues of $A$, and the right factor contains the transposed eigenvectors,
where $v_1, v_2, \ldots, v_n$ are the unitary (column) eigenvectors of $A$ associated to the (possibly repeated) eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$ and $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is the diagonal matrix with $\Lambda_{ii} = \lambda_i \ \forall i \in [1..n]$. The eigenvectors
are unitary and orthogonal, meaning that $v_i^\top v_j = 0 \ \forall i, j \in [1..n],\ i \neq j$ and $v_i^\top v_i = 1 \ \forall i \in [1..n]$. This
directly leads to
$$[v_1\, v_2\, \ldots\, v_n]^\top [v_1\, v_2\, \ldots\, v_n] = I_n, \text{ equivalent to } [v_1\, v_2\, \ldots\, v_n][v_1\, v_2\, \ldots\, v_n]^\top = I_n, \tag{1.1.2}$$
using the well-known property $XY = I \Rightarrow YX = I$ (Prop. A.2.5).
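A short numerical verification of (1.1.1) and (1.1.2) may help fix the ideas. This is my own sketch with numpy, whose eigh routine computes exactly this decomposition for symmetric matrices; the matrix itself is an arbitrary example.

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((5, 5))
    A = (B + B.T) / 2                        # a real symmetric matrix

    lam, U = np.linalg.eigh(A)               # columns of U are unitary eigenvectors

    print(np.allclose(U.T @ U, np.eye(5)))   # (1.1.2): U^T U = I_n, hence U U^T = I_n
    print(np.allclose(A, U @ np.diag(lam) @ U.T))   # (1.1.1): A = U Λ U^T
    # the rank-one expansion A = Σ_i λ_i v_i v_i^T
    A_sum = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(5))
    print(np.allclose(A, A_sum))             # True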
The simple case of distinct eigenvalues
It is important to familiarize with this decomposition. For this, let us first examine a proof that works for
symmetric matrices $A$ with distinct eigenvalues (all with multiplicity one). We will first show that the eigenvectors
of a symmetric $A$ are orthogonal. Let $v_1, v_2, \ldots, v_n$ be the unitary eigenvectors of $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively.
For any $i, j \in [1..n]$, $i \neq j$, we can write $v_i^\top A v_j = v_i^\top \lambda_j v_j = \lambda_j v_i^\top v_j$ and also $v_i^\top A v_j = \lambda_i v_i^\top v_j$, based on
$v_i^\top A = ((v_i^\top A)^\top)^\top = (A^\top v_i)^\top = (A v_i)^\top = \lambda_i v_i^\top$. This leads to $\lambda_j v_i^\top v_j = \lambda_i v_i^\top v_j$, and, using $\lambda_i \neq \lambda_j$, we
obtain $v_i^\top v_j = 0 \ \forall i, j \in [1..n],\ i \neq j$.

We now construct the eigen-decomposition. Using $\lambda_i v_i = A v_i \ \forall i \in [1..n]$, we obtain $A[v_1\, v_2\, \ldots\, v_n] =
[\lambda_1 v_1\ \lambda_2 v_2\ \ldots\ \lambda_n v_n]$, equivalent to $A[v_1\, v_2\, \ldots\, v_n] = [v_1\, v_2\, \ldots\, v_n]\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. We now multiply
both sides at the right by $[v_1\, v_2\, \ldots\, v_n]^\top$, which is equal to $[v_1\, v_2\, \ldots\, v_n]^{-1}$ by virtue of (1.1.2); we thus obtain
$A = [v_1\, v_2\, \ldots\, v_n]\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)[v_1\, v_2\, \ldots\, v_n]^\top$, which is exactly (1.1.1).
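To make the construction concrete, consider a small worked example (of my own choosing, not from the manuscript) with two distinct eigenvalues:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \det(xI_2 - A) = (x-2)^2 - 1 = (x-1)(x-3),$$
so $\lambda_1 = 1$ and $\lambda_2 = 3$, with unitary eigenvectors $v_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $v_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. One checks $v_1^\top v_2 = 0$ and, indeed,
$$1 \cdot v_1 v_1^\top + 3 \cdot v_2 v_2^\top = \frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} + \frac{3}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = A.$$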
Generalizing the proof for the case of eigenvalues with different multiplicities
We now prove the eigen-decomposition in the general case by extending the above construction. We need
two steps.

Step 1. We first show that the above construction for eigenvalues with unitary multiplicities can be readily
generalized to the case where there is a different eigenvector for each of the $k_i$ repetitions of each
root $\lambda_i$ of the characteristic polynomial (i.e., for any eigenvalue $\lambda_i$). In this case, there are $k_i$ linearly
independent eigenvectors associated to $\lambda_i$; we say that the eigenspace dimension $k_i'$ of $\lambda_i$ (the geometric
multiplicity $k_i'$) is equal to the algebraic multiplicity $k_i$. This step is relatively straightforward and it
is proved in full detail (and slightly generalized) in Prop. B.2.2.

Step 2. We need to show that the eigenspace of $\lambda_i$ has dimension $k_i$. To show this, we use the notion
of similar matrices: $X$ and $Y$ are similar if there exists $U$ such that $X = UYU^{-1}$; we say that
$Y$ is the representation of $X$ in the basis composed of the columns of $U$. Two simple properties
(Prop. B.1.4 and B.1.5) show that: (i) two similar matrices have the same characteristic polynomial, so
that any common eigenvalue $\lambda_i$ is repeated $k_i$ times, and (ii) similar matrices have the same eigenspace
dimension $k_i'$ for any common eigenvalue $\lambda_i$. Assuming $k_i' < k_i$ for symmetric $A \in \mathbb{R}^{n \times n}$, we can
construct $U \in \mathbb{R}^{n \times n}$ by putting $k_i'$ orthogonal unitary eigenvectors of $\lambda_i$ on the first $k_i'$ columns and by
filling the other columns (introducing $n - k_i'$ unitary vectors perpendicular to the first $k_i'$ columns) such
that $U^\top U = I_n$. We obtain $U^\top A U = \left[\begin{smallmatrix} \lambda_i I_{k_i'} & 0 \\ 0 & D \end{smallmatrix}\right]$. Since similar matrices have the same characteristic
polynomial and $k_i' < k_i$, $\lambda_i$ needs to be a root of the characteristic polynomial of $D$, and so, $\lambda_i$ needs
to have an eigenvector in $D$. This means that the eigenspace of $\lambda_i$ in $\left[\begin{smallmatrix} \lambda_i I_{k_i'} & 0 \\ 0 & D \end{smallmatrix}\right]$ has dimension at least
$k_i' + 1$, contradicting the fact that similar matrices have the same eigenspace dimension for $\lambda_i$. As such,
the assumption $k_i' < k_i$ was false and we obtain $k_i' = k_i$, i.e., $\lambda_i$ has the same algebraic and geometric
multiplicity $k_i$. For the skeptical, this Step 2 is proven in full detail in Prop. B.2.3.
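The conclusion of Step 2 can also be observed numerically. In the sketch below (my own example, assuming a tolerance of 1e-8 for the rank computation), a symmetric matrix is built with the eigenvalue 2 repeated twice, and its geometric multiplicity indeed matches:

    import numpy as np

    # Build a symmetric matrix whose eigenvalue 2 has algebraic multiplicity 2:
    # rotate diag(2, 2, 5) by an orthogonal matrix Q from a QR factorization.
    rng = np.random.default_rng(2)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    A = Q @ np.diag([2.0, 2.0, 5.0]) @ Q.T

    # Geometric multiplicity of λ = 2: dim ker(A - 2I) = n - rank(A - 2I).
    rank = np.linalg.matrix_rank(A - 2.0 * np.eye(3), tol=1e-8)
    print(3 - rank)   # prints 2: geometric multiplicity = algebraic multiplicity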
1.2 Equivalent SDP definitions
Definition 1.2.1. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semidefinite (SDP) if the following holds for
any vector $x \in \mathbb{R}^n$:
$$x^\top A x = x \cdot Ax = A \cdot xx^\top = \operatorname{trace}(A\,xx^\top) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j \geq 0, \tag{1.2.1}$$
where $\cdot$ stands for the scalar product. If the above inequality is always strict for $x \neq 0$, the matrix is positive definite.
If the inequality is reversed, the matrix is negative semidefinite.
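As a quick numerical illustration of the chain of identities in (1.2.1) (my own example, not taken from the manuscript), one can check that the three expressions coincide; here $A = BB^\top$ is symmetric and SDP, so the common value is nonnegative:

    import numpy as np

    rng = np.random.default_rng(3)
    B = rng.standard_normal((4, 4))
    A = B @ B.T                          # A = B B^T is symmetric and SDP
    x = rng.standard_normal(4)

    # The three expressions in (1.2.1) agree (and are >= 0 since A is SDP):
    print(x @ A @ x)                     # x^T A x
    print(np.trace(A @ np.outer(x, x)))  # trace(A x x^T)
    print(sum(A[i, j] * x[i] * x[j] for i in range(4) for j in range(4)))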
Proposition 1.2.2. A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive semidefinite (resp. definite) if and only if all
its eigenvalues $\lambda_i$ verify $\lambda_i \geq 0$ (resp. $\lambda_i > 0$).
Proof.
($\Rightarrow$) We consider $A$ positive semidefinite (resp. definite). Assume there exists an eigenvalue $\lambda < 0$ (resp. $\lambda \leq 0$)
associated to an eigenvector $v$. We have $v^\top A v = v^\top \lambda v = \lambda(v_1^2 + v_2^2 + \cdots + v_n^2)$. If $\lambda < 0$ (resp. $\lambda \leq 0$), then
$v^\top A v < 0$ (resp. $v^\top A v \leq 0$), contradicting (1.2.1) because an eigenvector is nonzero by definition; hence all
eigenvalues verify $\lambda_i \geq 0$ (resp. $\lambda_i > 0$).
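Proposition 1.2.2 immediately suggests a practical test for positive semidefiniteness. The sketch below is my own, with a tolerance parameter I chose to absorb floating-point error; it checks the sign of the smallest eigenvalue:

    import numpy as np

    def is_psd(A, tol=1e-10):
        """Test positive semidefiniteness of a symmetric A via Prop. 1.2.2:
        A is SDP iff its smallest eigenvalue is >= 0 (here >= -tol numerically)."""
        return np.linalg.eigvalsh(A)[0] >= -tol  # eigvalsh sorts eigenvalues ascending

    print(is_psd(np.array([[2.0, -1.0], [-1.0, 2.0]])))  # True: eigenvalues 1 and 3
    print(is_psd(np.array([[0.0, 1.0], [1.0, 0.0]])))    # False: eigenvalues -1 and 1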