Matching Map Recovery with an Unknown Number of Outliers

Arshak Minasyan* (CREST, ENSAE, IP Paris; 5 av. Henry Le Chatelier, 91764 Palaiseau; arshak.minasyan@ensae.fr)
Tigran Galstyan (RAU, YerevaNN; 20 Charents street, 0025 Yerevan; tigran@yerevann.com)
Sona Hunanyan (Yerevan State University; 1 Alek Manukyan, 0025 Yerevan; hunan.sona@gmail.com)
Arnak Dalalyan (CREST, ENSAE, IP Paris; 5 av. Henry Le Chatelier, 91764 Palaiseau; arnak.dalalyan@ensae.fr)

Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023, Valencia, Spain. PMLR: Volume 206. Copyright 2023 by the author(s).
Abstract

We consider the problem of finding the matching map between two sets of $d$-dimensional noisy feature-vectors. The distinctive feature of our setting is that we do not assume that all the vectors of the first set have their corresponding vector in the second set. If $n$ and $m$ are the sizes of these two sets, we assume that the matching map that should be recovered is defined on a subset of unknown cardinality $k^* \le \min(n, m)$. We show that, in the high-dimensional setting, if the signal-to-noise ratio is larger than $5\big(d\log(4nm/\alpha)\big)^{1/4}$, then the true matching map can be recovered with probability $1 - \alpha$. Interestingly, this threshold does not depend on $k^*$ and is the same as the one obtained in prior work in the case of $k^* = \min(n, m)$. The procedure for which the aforementioned property is proved is obtained by a data-driven selection among candidate mappings $\{\hat\pi_k : k \in [\min(n, m)]\}$. Each $\hat\pi_k$ minimizes the sum of squares of distances between two sets of size $k$. The resulting optimization problem can be formulated as a minimum-cost flow problem, and thus solved efficiently. Finally, we report the results of numerical experiments on both synthetic and real-world data that illustrate our theoretical results and provide further insight into the properties of the algorithms studied in this work.
1 INTRODUCTION

The problem of finding the best matching between two point clouds has been extensively studied, both theoretically and experimentally. The matching problem arises in various applications, for instance in computer vision and natural language processing. In computer vision, finding the correspondence between two sets of local descriptors extracted from two images of the same scene is a well-known example of a matching problem. In natural language processing, in particular in machine translation, the correspondence between vector representations of the same text in two different languages is another example of a matching problem. Clearly, in these problems, not all the points have a matching point, and one can hardly know in advance how many of them do. The goal of the present work is to focus on this setting and to gain a theoretical understanding of the statistical limitations of the matching problem.
To formulate the problem and to state the main result, let $X = (X_1, \ldots, X_n)$ and $X^\# = (X^\#_1, \ldots, X^\#_m)$ be two sequences of feature vectors of sizes $n$ and $m$ such that $m \ge n \ge 2$. We assume that these sequences are noisy versions of some original feature-vectors, i.e.,
$$X_i = \theta_i + \sigma\,\xi_i, \qquad X^\#_j = \theta^\#_j + \sigma^\#\,\xi^\#_j, \qquad i \in [n] \text{ and } j \in [m], \qquad (1)$$
where $\theta = (\theta_1, \ldots, \theta_n)$ and $\theta^\# = (\theta^\#_1, \ldots, \theta^\#_m)$ are two sequences of deterministic vectors from $\mathbb R^d$, corresponding to the original feature-vectors. The noise components of $X$ and $X^\#$ are two independent sequences of i.i.d. isotropic Gaussian random vectors. Formally,
$$\xi_1, \ldots, \xi_n, \xi^\#_1, \ldots, \xi^\#_m \overset{\text{i.i.d.}}{\sim} \mathcal N(0, \mathbf I_d),$$
where $\mathbf I_d$ is the identity matrix of size $d \times d$. We assume that for some $S^* \subset [n]$ of cardinality $k^*$, there exists an injective mapping $\pi^* : S^* \to [m]$ such that $\theta_i = \theta^\#_{\pi^*(i)}$ holds for all $i \in S^*$. We call the observations $(X_i : i \in S^*)$ and $(X^\#_{\pi^*(i)} : i \in S^*)$ inliers, while the other vectors from the sequences $X$ and $X^\#$ are considered to be outliers. The ultimate goal is to recover $\pi^*$ based on the observations $X$ and $X^\#$ only.
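To make the setting concrete, the following minimal sketch simulates data from model (1). The sizes $n, m, d, k^*$, the noise levels, and the Gaussian draw of the signal vectors $\theta$ are arbitrary illustrative assumptions, not values taken from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d, k_star = 50, 60, 100, 30        # illustrative sizes, with k* <= min(n, m)
sigma, sigma_sharp = 1.0, 1.0            # noise levels of the two sequences

# Choose the support S* of size k* and an injective map pi*: S* -> [m].
S_star = rng.choice(n, size=k_star, replace=False)
pi_star = dict(zip(S_star.tolist(),
                   rng.choice(m, size=k_star, replace=False).tolist()))

# Signals: matched pairs share the same theta; all other vectors are outliers.
theta = rng.normal(0.0, 10.0, size=(n, d))
theta_sharp = rng.normal(0.0, 10.0, size=(m, d))
for i, j in pi_star.items():
    theta_sharp[j] = theta[i]

# Noisy observations, as in model (1).
X = theta + sigma * rng.standard_normal((n, d))
X_sharp = theta_sharp + sigma_sharp * rng.standard_normal((m, d))
```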
Various versions of this problem have been studied in the literature. Collier and Dalalyan [2013, 2016] considered the outlier-free case with equal sizes of the sequences $X$ and $X^\#$ (i.e., $m = n$ and $S^* = [n]$), whereas Galstyan et al. [2022] investigated the case with outliers in one of the sequences only (i.e., $m \ge n$ and $S^* = [n]$). Other variations of the matching problem under the Hamming loss have been studied by Wang et al. [2022], Chen et al. [2022b], Kunisky and Niles-Weed [2022]. These papers obtain minimax-optimal separation rates and, in most cases, despite the discrete nature of the matching problem, provide computationally tractable procedures to achieve these rates.

Figure 1: Matching as a Minimum Cost Flow (MCF) problem. The idea is to augment the bipartite graph between $X$ and $X^\#$ (whose edges carry costs $d_{i,j}$) with two nodes, a source and a sink, and $n + m$ edges. The capacities of these added edges are set to $1$, while their costs are set to $0$. Setting the total flow sent through the graph to $k$, the solution of the MCF becomes a matching of size $k$.
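The reduction of Figure 1 can be prototyped with an off-the-shelf solver. Below is a minimal sketch using networkx; the choice of networkx, the node naming, and the integer scaling of the costs (network-simplex solvers expect integral weights) are implementation assumptions, not the paper's own code.

```python
import networkx as nx
import numpy as np

def lss_via_mcf(X, X_sharp, k, scale=10**6):
    """Solve the k-LSS problem as a min-cost flow of value k (cf. Figure 1).

    Squared distances are scaled and rounded to integers because the
    network simplex algorithm works with integral edge weights."""
    n, m = len(X), len(X_sharp)
    G = nx.DiGraph()
    G.add_node("src", demand=-k)                 # k units of flow leave the source
    G.add_node("sink", demand=k)                 # and must be absorbed by the sink
    for i in range(n):                           # unit-capacity, zero-cost edges
        G.add_edge("src", ("x", i), capacity=1, weight=0)
    for j in range(m):
        G.add_edge(("y", j), "sink", capacity=1, weight=0)
    for i in range(n):                           # bipartite edges carry the costs d_{i,j}
        for j in range(m):
            w = int(scale * np.sum((X[i] - X_sharp[j]) ** 2))
            G.add_edge(("x", i), ("y", j), capacity=1, weight=w)
    flow = nx.min_cost_flow(G)
    pi_hat = {i: v[1] for i in range(n)
              for v, f in flow[("x", i)].items() if f > 0}
    phi = sum(np.sum((X[i] - X_sharp[j]) ** 2) for i, j in pi_hat.items())
    return pi_hat, phi
```

On the simulated data above, `pi_k, phi_k = lss_via_mcf(X, X_sharp, k=10)` returns a size-10 matching together with the corresponding LSS value, up to the integer rounding of the costs.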
When $S^*$ is an arbitrary subset of $[n]$, which is the setting we focus on in this work, one can wonder whether the minimax separation rate is the same as in the case of known $S^*$. Since the absence of knowledge of $S^*$ brings additional combinatorial complexity to the problem, one can also wonder whether it is still possible to reconcile statistical optimality and computational tractability. We show in this work that the answers to these questions are affirmative.
To explain our result, let us introduce the quantity
$$\kappa_{i,j} = \|\theta_i - \theta^\#_j\|_2 \big/ (\sigma^2 + \sigma^{\#2})^{1/2},$$
which is the signal-to-noise ratio of the difference $X_i - X^\#_j$ of a pair of feature-vectors. Clearly, for matching pairs this difference vanishes. Furthermore, if $\kappa_{i,j}$ vanishes or is very small for a non-matching pair, then there is an identifiability issue and consistent recovery of the underlying true matching is impossible. Therefore, a natural condition for making consistent recovery possible is to assume that the quantity
$$\bar\kappa_{\rm all} \triangleq \min_{i \in [n]}\ \min_{j \in [m] \setminus \{\pi^*(i)\}} \kappa_{i,j}$$
is bounded away from zero. A recovery procedure $\hat\pi$ is considered to be good if the threshold $\lambda$, such that $\hat\pi$ recovers $\pi^*$ with high probability as soon as $\bar\kappa_{\rm all} \ge \lambda$, is as small as possible. It was proved in [Collier and Dalalyan, 2016] that when $k^* = n = m$, one can recover $\pi^*$ with probability $1 - \alpha$ for $\lambda = 4\big(d\log(4n^2/\alpha)\big)^{1/4} \vee 8\big(\log(4n^2/\alpha)\big)^{1/2}$. Furthermore, it was proved that this threshold is minimax optimal, i.e., optimal in the family of all possible recovery procedures. This implies that there are two regimes. In the low-dimensional regime $d \lesssim \log n$, the separation rate is dimension independent. In contrast with this, the separation rate scales roughly as $d^{1/4}$ in the (moderately) high-dimensional regime $d \gtrsim \log n$.

Let us set
$$\lambda_{n,m,d,\alpha} = 4\Big(d\log\frac{4nm}{\alpha}\Big)^{1/4} \vee 8\Big(\log\frac{4nm}{\alpha}\Big)^{1/2}. \qquad (2)$$
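For later reference, here is a small helper that evaluates the threshold (2); the function name is ours, and the body is just a transcription of the displayed formula.

```python
import math

def lam_threshold(n, m, d, alpha):
    """Evaluate lambda_{n,m,d,alpha} from equation (2)."""
    t = math.log(4 * n * m / alpha)
    return max(4 * (d * t) ** 0.25, 8 * t ** 0.5)
```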
The main contributions of this work are the following.

• For any given $k \in [\min(n, m)]$, we show that the $k$ Least Sum of Squares ($k$-LSS) procedure, based on maximizing the profile likelihood among matching maps between two sets of size $k$, can be efficiently computed by solving a minimum cost flow problem. We denote the matching obtained using $k$-LSS by $\hat\pi^{\rm LSS}_k$.

• If the value $k$ turns out to be smaller than $k^*$ and $\bar\kappa_{\rm all} \ge \lambda_{n,m,d,\alpha}$, we prove that $\hat\pi^{\rm LSS}_k$ makes no mistake with probability $1 - \alpha$.

• We design a data-driven model selection algorithm that adaptively chooses $\hat k$ such that with probability $1 - \alpha$, we have $\hat k = k^*$ and $\hat\pi^{\rm LSS}_{\hat k} = \pi^*$ as soon as $\bar\kappa_{\rm all} \ge (5/4)\lambda_{n,m,d,\alpha}$.

The last item above implies that our data-driven algorithm $\hat\pi^{\rm LSS}_{\hat k}$ achieves the minimax separation rate. More surprisingly, this shows that there is no gap in statistical complexity between the problems of recovering matching maps in the outlier-free and the outliers-present-on-both-sides settings.
2 RELATED WORK

In statistical hypothesis testing, separation rates have become key objects for measuring the quality of statistical procedures; see the seminal papers [Burnashev, 1979, Ingster, 1982] as well as the monographs [Ingster and Suslina, 2003, Juditsky and Nemirovski, 2020]. Currently, this approach is widely adopted in the machine learning literature [Xing et al., 2020, Wolfer and Kontorovich, 2020, Blanchard et al., 2018, Ramdas et al., 2016, Wei et al., 2019, Collier, 2012]. Beyond the classical setting of two hypotheses, it can also be applied to multiple testing frameworks, for instance, variable selection [Ndaoud and Tsybakov, 2020, Azaïs and de Castro, 2020, Comminges and Dalalyan, 2012, 2013] or the matching problem considered here.
In computer vision, feature matching is a well-studied problem. One of the main directions is to accelerate matching algorithms using fast approximate methods (see, e.g., Malkov and Yashunin [2020], Wang et al. [2018], Harwood and Drummond [2016], Jiang et al. [2016]). Another direction is to improve the matching quality by considering alternative local descriptors [Rublee et al., 2011, Chen et al., 2010, Calonder et al., 2010] for given keypoints. The choice of keypoints is considered in Tian et al. [2020], Bai et al. [2020].
The minimum cost flow problem was first studied in the context of the Hungarian algorithm [Kuhn, 2012] and the assignment problem, which is a special case of minimum cost flow on bipartite graphs with all edges having unit capacity. The generalization of the Hungarian algorithm to graphs with arbitrary edge costs guarantees $O((n + F)m)$ time complexity, where $n$ is the number of nodes in the graph, $m$ is the number of edges, and $F$ is the total flow sent through the graph. There have also been other algorithms with similar complexity guarantees [Fulkerson, 1961, Ahuja et al., 1992]. Since then, many algorithms have been proposed for solving minimum cost flow problems in strongly polynomial time [Orlin et al., 1993, Orlin, 1993, 1996, Goldberg and Tarjan, 1989, Galil and Tardos, 1988], with the fastest runtime of around $O(nm)$. Recent advances for solving MCF problems have been proposed in Goldberg et al. [2015] and Chen et al. [2022a]; the latter proposes an algorithm with almost-linear computational time.
Permutation estimation and related problems have recently been investigated in different contexts such as statistical seriation [Flammarion et al., 2019, Giraud et al., 2021, Cai and Ma, 2022], noisy sorting [Mao et al., 2018], regression with shuffled data [Pananjady et al., 2017, Slawski and Ben-David, 2019], isotonic regression and matrices [Mao et al., 2020, Pananjady and Samworth, 2020, Ma et al., 2020], crowd labeling [Shah et al., 2021], recovery of general discrete structures [Gao and Zhang, 2019], and multitarget tracking [Chertkov et al., 2010, Kunisky and Niles-Weed, 2022].
3 MAIN THEORETICAL RESULT

This section contains the main theoretical contribution of the present work. In order to be able to recover $S^*$ and the matching map $\pi^*$, the key ingredient we use is the maximization of the profile likelihood. This corresponds to looking for the least sum of squares (LSS) of errors over all injective mappings defined on a subset of $[n]$ of size $k$. Formally, if we define
$$\mathcal P_k := \big\{\pi : S \to [m] \ \text{such that}\ S \subset [n],\ |S| = k,\ \pi \text{ is injective}\big\}$$
to be the set of all $k$-matching maps, we can define the procedure $k$-LSS as a solution to the optimization problem
$$\hat\pi^{\rm LSS}_k \in \arg\min_{\pi \in \mathcal P_k} \sum_{i \in S_\pi} \|X_i - X^\#_{\pi(i)}\|_2^2, \qquad (3)$$
where $S_\pi$ denotes the support of the function $\pi$. In the particular case of $k = n$, the optimization above is conducted over all the injective mappings from $[n]$ to $[m]$. This coincides with the LSS method from [Galstyan et al., 2022].

Let $\hat\Phi(k)$ be the error of $\hat\pi^{\rm LSS}_k$, that is,
$$\hat\Phi(k) = \min_{\pi \in \mathcal P_k} \sum_{i \in S_\pi} \|X_i - X^\#_{\pi(i)}\|_2^2.$$
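For very small instances, $\hat\Phi(k)$ can also be computed by brute force directly from the definition, which is useful as a sanity check on the flow-based solver; the sketch below enumerates all of $\mathcal P_k$ and is exponential in $k$, so it is only usable for tiny $n$, $m$, $k$.

```python
from itertools import combinations, permutations
import numpy as np

def phi_hat_bruteforce(X, X_sharp, k):
    """Exhaustive minimization of the LSS objective (3) over P_k.

    Enumerates every k-subset S of [n] and every injective image of S
    in [m]; for testing on tiny instances only."""
    n, m = len(X), len(X_sharp)
    D = np.sum((X[:, None, :] - X_sharp[None, :, :]) ** 2, axis=2)  # all d_{i,j}
    best = np.inf
    for S in combinations(range(n), k):           # domain of pi
        for T in permutations(range(m), k):       # ordered k-tuples = injective maps
            best = min(best, sum(D[i, j] for i, j in zip(S, T)))
    return best
```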
For some values of the tuning parameters $\lambda > 0$ and $\gamma > 0$, as well as for some $k_{\min} \in [n]$, initialize $k \leftarrow k_{\min}$ and

1. Compute $\hat\Phi(k)$ and $\hat\Phi(k+1)$.
2. Set $\bar\sigma_k^2 = \hat\Phi(k)/(kd)$.
3. If $k = n$ or $\hat\Phi(k+1) - \hat\Phi(k) > \frac{d + \lambda}{1 - \gamma}\,\bar\sigma_k^2$, then output $(k, \bar\sigma_k, \hat\pi^{\rm LSS}_k)$.
4. Otherwise, increase $k \leftarrow k + 1$ and go to Step 1.
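The steps above translate directly into a loop. In this sketch, `lss(k)` stands for any routine returning $(\hat\pi^{\rm LSS}_k, \hat\Phi(k))$, such as the min-cost-flow sketch earlier; its name and interface are our assumptions.

```python
def select_k(lss, n, d, lam, gamma, k_min=1):
    """Forward model selection following Steps 1-4 above.

    lss(k) -> (pi_hat_k, Phi_hat_k); lam and gamma are the tuning
    parameters lambda > 0 and gamma in (0, 1) of the criterion."""
    k = k_min
    pi_k, phi_k = lss(k)
    while k < n:
        pi_next, phi_next = lss(k + 1)                 # Step 1
        sigma2_bar = phi_k / (k * d)                   # Step 2
        if phi_next - phi_k > (d + lam) / (1 - gamma) * sigma2_bar:
            break                                      # Step 3: large jump, stop
        k, pi_k, phi_k = k + 1, pi_next, phi_next      # Step 4
    return k, (phi_k / (k * d)) ** 0.5, pi_k           # (k_hat, sigma_bar, pi_hat)
```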
In the sequel, we denote by $(\hat k, \bar\sigma_{\hat k}, \hat\pi^{\rm LSS}_{\hat k})$ the output of this procedure. Notice that we start with the value $k = k_{\min}$, which in the absence of any information on the number of inliers might be set to $k_{\min} = 1$. However, using a higher value of $k_{\min}$ might considerably speed up the procedure and improve its quality.

For appropriately chosen values of $\gamma$ and $\lambda$, as stated in the next theorem, the described procedure outputs the correct values of $k^*$ and $\pi^*$ with high probability.
Theorem 1. Let $\alpha \in (0,1)$ and let $\lambda_{n,m,d,\alpha}$ be defined by (2). If $\bar\kappa_{\rm all} > (5/4)\lambda_{n,m,d,\alpha}$, then the output $(\hat k, \hat\pi^{\rm LSS}_{\hat k})$ of the model selection algorithm with parameters $\lambda = (1/4)\lambda^2_{n,m,d,\alpha}$ and $\gamma = \lambda/d$ satisfies $\mathbb P\big(\hat\pi^{\rm LSS}_{\hat k} = \pi^*\big) \ge 1 - \alpha$.
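In terms of the helper for (2) introduced earlier, the parameter choice of Theorem 1 reads as follows (the problem sizes are illustrative).

```python
n, m, d, alpha = 50, 60, 100, 0.05        # illustrative values
lam_nmda = lam_threshold(n, m, d, alpha)  # lambda_{n,m,d,alpha} from (2)
lam = 0.25 * lam_nmda ** 2                # lambda = (1/4) lambda_{n,m,d,alpha}^2
gamma = lam / d                           # gamma = lambda / d
```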
Since the condition on the separation distance $\bar\kappa_{\rm all}$ differs from that of the known-$k^*$ case by only a slightly larger constant, from the perspective of statistical accuracy, the case of unknown $k^*$ is not more challenging than that of known $k^*$.
In the sequel, without much loss of generality, we assume that the sizes of $X$ and $X^\#$ are equal, i.e., $n = m$. Indeed, in the case $m > n$, one can add $m - n$ points arbitrarily far from the rest of the points to the smaller set $X$, obtaining equal-size sets $X^+$ and $X^\#$.
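This padding step is straightforward to implement. In the sketch below, the magnitude of the added points is an arbitrary large constant, chosen only so that the padded points are never plausibly matched as inliers.

```python
import numpy as np

def pad_to_equal_size(X, X_sharp, far=1e9):
    """If m > n, append m - n points, arbitrarily far from the data,
    to the smaller set X so that both sets have equal size."""
    n, m = len(X), len(X_sharp)
    if m > n:
        offsets = far * (1.0 + np.arange(m - n))[:, None]   # distinct far points
        X = np.vstack([X, offsets * np.ones((m - n, X.shape[1]))])
    return X, X_sharp
```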
Notice that in the optimization problem (3) the domain of $\pi$ ranges over a finite set of injective functions. For a given value of $k$, the number of such functions is $k!\binom{n}{k}^2$, thus making an exhaustive search computationally infeasible. Instead, we show in Section 5 that the optimization problem formulated in (3) can indeed be solved efficiently with complexity $\widetilde O(kn^2)$, where the notation $\widetilde O$ hides polylogarithmic factors, i.e., up to polylogarithmic factors, the computational cost is of order $kn^2$.
4 INTERMEDIATE RESULTS AND PROOF OF THEOREM 1

This section is devoted to the proof of our main result. Along the way, we establish some intermediate results that are of interest in their own right. The proofs of some technical lemmas are deferred to Appendix A.
4.1 Sub-mapping Recovery by LSS for $k \le k^*$
The first question we address in this section is under which conditions the LSS estimator $\hat\pi^{\rm LSS}_k$ from (3) recovers correct matches. Of course, the only way of correctly estimating the entire true matching is to choose $k = k^*$. However, it turns out that even if we overestimate the number of outliers and choose a value $k$ smaller than the true value $k^*$, with high probability the $k$-LSS estimator makes no wrong matches. Naturally, this result, stated in the next theorem, is valid under the condition that the relative signal-to-noise ratio of all incorrect pairs of original features is larger than some threshold.
Theorem 2 (Quality of $k$-LSS when $k \le k^*$). Let $\hat S = \mathrm{supp}(\hat\pi)$ for $\hat\pi = \hat\pi^{\rm LSS}_k$ defined by (3), let $\alpha \in (0, 1)$, and set
$$\lambda_{n,d,\alpha} = 4\big(d\log(4n^2/\alpha)\big)^{1/4} \vee 8\big(\log(4n^2/\alpha)\big)^{1/2}. \qquad (4)$$
If $k \le k^*$ and the signal-to-noise ratio satisfies the condition $\bar\kappa_{\rm all} \ge \lambda_{n,d,\alpha}$, then, with probability at least $1 - \alpha$, the support of the estimator $\hat\pi$ is included in $S^*$ and $\hat\pi$ coincides with $\pi^*$ on the set $\hat S$. Formally,
$$\mathbb P\big(\hat S \subset S^* \ \text{and}\ \hat\pi(i) = \pi^*(i),\ \forall i \in \hat S\big) \ge 1 - \alpha.$$
Proof of Theorem 2. Note that the random vectors
$$\eta_{ij} = (\sigma\,\xi_i - \sigma^\#\,\xi^\#_j)\big/\sqrt{\sigma^2 + \sigma^{\#2}}$$
are standard Gaussian, and define the following quantities:
$$\zeta_1 \triangleq \max_{i,\ j \ne \pi^*(i)} \frac{|(\theta_i - \theta^\#_j)^\top \eta_{ij}|}{\|\theta_i - \theta^\#_j\|_2}, \qquad \zeta_2 \triangleq d^{-1/2} \max_{i,j} \big|\,\|\eta_{ij}\|_2^2 - d\,\big|. \qquad (5)$$
For ease of notation, for any matching map $\pi$ we also define $L(\pi)$ as follows:
$$L(\pi) = \sum_{i \in S_\pi} \frac{\|X_i - X^\#_{\pi(i)}\|_2^2}{\sigma^2 + \sigma^{\#2}}.$$
We start with two auxiliary lemmas that will be used in other proofs as well. The proofs of these lemmas are deferred to the appendix.

Lemma 1. Let $\pi$ be any matching map that cannot be obtained as a restriction of $\pi^*$ to a subset of $[n]$. Let $S_0 \subset S^*$ be an arbitrary set satisfying $|S_0| \le |S_\pi|$ and $\{i \in S_\pi \cap S^* : \pi(i) = \pi^*(i)\} \subset S_0$, and let $\pi_0$ be the restriction of $\pi^*$ to $S_0$. On the event $\Omega_0 = \{8\zeta_1 \le \bar\kappa_{\rm all}\} \cap \{4\sqrt d\,\zeta_2 \le \bar\kappa^2_{\rm all}\}$, we have
$$L(\pi) - L(\pi_0) \ge (1/4)\,\bar\kappa^2_{\rm all} + d\,\big(|S_\pi| - |S_0|\big).$$
Let $\pi$ be any matching map from $\mathcal P_k$ that is not a restriction of $\pi^*$. Since $|S_\pi| = k \le k^*$, there necessarily exists a $\pi_0$ as in Lemma 1 such that $|S_0| = |S_\pi|$. For this $\pi_0$, we have $L(\pi) - L(\pi_0) \ge (1/4)\,\bar\kappa^2_{\rm all} > 0$. This implies that $\pi$ cannot be a minimizer of $L(\cdot)$ over $\mathcal P_k$. As a consequence, on $\Omega_0$, any minimizer of $L(\cdot)$ over $\mathcal P_k$ is a restriction of $\pi^*$. Therefore, on $\Omega_0$, we have $\hat S \subset S^*$ and $\hat\pi_k = \pi^*|_{\hat S}$. It remains to prove that $\mathbb P(\Omega_0) \ge 1 - \alpha$.
Lemma 2. Let $\Omega_{0,x} = \{8\zeta_1 \le x\} \cap \{4\sqrt d\,\zeta_2 \le x^2\}$ with $\zeta_1, \zeta_2$ defined as in (5). Then, for every $x > 0$, $\mathbb P(\Omega_{0,x}^\complement)$ is upper bounded by
$$2n^2 \exp\Big\{-\frac{x^2}{128}\Big\} + \exp\Big\{-\frac{x^2}{128\,d}\,\big(x^2 \wedge 4d\big)\Big\}.$$
We apply Lemma 2 with $x = \bar\kappa_{\rm all}$ to show that $\mathbb P(\Omega_0) \ge 1 - \alpha$. Clearly, a sufficient condition for the latter is
$$2n^2 \exp\big\{-\bar\kappa^2_{\rm all}/128\big\} \le \alpha/2, \qquad 2n^2 \exp\Big\{-\frac{(\bar\kappa_{\rm all}/16)^2}{d}\,\big(2\bar\kappa^2_{\rm all} \wedge 8d\big)\Big\} \le \alpha/2.$$
This system is equivalent to
$$\bar\kappa_{\rm all} \ge 8\Big(2\log\frac{4n^2}{\alpha}\Big)^{1/2} \qquad \text{and} \qquad \bar\kappa_{\rm all} \ge 4\Big(\frac d2\,\log\frac{4n^2}{\alpha}\Big)^{1/4}.$$
Therefore, if the signal-to-noise ratio satisfies
$$\bar\kappa_{\rm all} \ge 4\big(d\log(4n^2/\alpha)\big)^{1/4} \vee 8\big(\log(4n^2/\alpha)\big)^{1/2},$$
we have $\mathbb P(\Omega_0) \ge 1 - \alpha$.
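As a quick numerical illustration of Theorem 2 (not a proof), one can run the flow-based solver on the simulated data from the earlier sketches with some $k < k^*$ and check that no returned match is wrong; `lss_via_mcf`, `X`, `X_sharp`, `pi_star`, and `k_star` all refer to those sketches.

```python
# Empirical check of Theorem 2: for k <= k*, pi_hat_k should make no wrong match
# with high probability when the signal-to-noise ratio is large.
k = k_star // 2                                   # deliberately underestimate k*
pi_hat, _ = lss_via_mcf(X, X_sharp, k)
wrong = [i for i, j in pi_hat.items() if pi_star.get(i) != j]
print(f"k = {k}: {len(wrong)} wrong matches among {len(pi_hat)}")
```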
4.2 Matching Map Recovery for Unknown $k^*$

If no information on $k^*$ is available, and the goal is to recover the entire mapping $\pi^*$, one can proceed by model selection. More precisely, one can compute the collection of estimators $\{\hat\pi^{\rm LSS}_k : k \in [n]\}$ and select one of them using a suitable criterion. To define the selection criterion proposed in this paper, let us remark that
$$\hat\Phi(k) = \min_{\pi \in \mathcal P_k} \sum_{i \in S_\pi} \|X_i - X^\#_{\pi(i)}\|_2^2$$
is an increasing function of $k$. The increments of this function for $k \le k^*$ are not large, since they essentially correspond to the squared norm of a pure noise vector, distributed according to a scaled $\chi^2$ distribution with $d$ degrees of freedom. The main idea behind the criterion we propose below is that the increment of $\hat\Phi$ at $k^*$ is significantly larger than the previous ones.