Local and Global Structure Preservation Based Spectral Clustering

2025-05-02 0 0 449.31KB 19 页 10玖币
侵权投诉
Local and Global Structure Preservation Based Spectral
Clustering
Kajal Eybpoosha, Mansoor Rezghib, Abbas Heydaria
aDepartment of Mathematics, Tarbiat Modares University, Tehran, Iran
bDepartment of Computer Science, Tarbiat Modares University, Tehran, Iran
Abstract
Spectral Clustering (SC) is widely used for clustering data on a nonlinear manifold.
SC aims to cluster data by considering the preservation of the local neighborhood
structure on the manifold data. This paper extends Spectral Clustering to Local and
Global Structure Preservation Based Spectral Clustering (LGPSC) that incorporates
both global structure and local neighborhood structure simultaneously. For this exten-
sion, LGPSC proposes two models to extend local structures preservation to local and
global structures preservation: Spectral clustering guided Principal component anal-
ysis model and Multilevel model. Finally, we compare the experimental results of
the state-of-the-art methods with our two models of LGPSC on various data sets such
that the experimental results confirm the effectiveness of our LGPSC models to cluster
nonlinear data.
Keywords: Spectral clustering, Local structure, Global structure, Similarity
1. Introduction
Clustering is an important task in data analysis, machine learning, and data mining.
The primary purpose of clustering is to group data points into corresponding categories
according to their intrinsic similarities [1, 2, 3]. Most of the clustering approaches can
be categorized as local or global. Among existing clustering methods, techniques based
on applying spectral clustering to a similarity matrix have become extremely popular
due to their both local and global variants as follows:
1. Local spectral clustering-based approaches such as classical spectral clustering
algorithm uses kernel functions [4] or k-nearest neighbors (KNN) [5], scalable
Preprint submitted to arxiv October 25, 2022
arXiv:2210.12778v1 [cs.LG] 23 Oct 2022
spectral clustering with cosine similarity [6], Local Subspace Affinity (LSA) [7],
Spectral Local Best-fit Flats (SLBF) [8], and [9] use locality of each data point
to build a similarity between pairs of data points.
2. Global spectral clustering-based approaches such as Spectral Curvature Cluster-
ing (SCC) [10], Sparse Subspace Clustering (SSC) [11, 12], Low-Rank Subspace
Clustering (LRSC) [13], Scalable Sparse Subspace Clustering by Orthogonal
Matching Pursuit (SSC-OMP) [14], and Sparse Subspace Clustering-matching
pursuit (SSC-MP) [15] algorithms that build similarities between data points us-
ing global information.
The idea of spectral clustering [16, 17, 18] is often of interest to cluster a set of ndata
points on a nonlinear manifold into a finite number of clusters. This algorithm is widely
used in various fields, such as image segmentation [19], speaker diarization [20], and
multi-type relational data [21]. Spectral clustering (SC) is a general graph-theoretic
framework. SC separates the clustering task into two main steps: first converts a data
set into a graph or a data similarity matrix such that clustering results of spectral clus-
ters are sensitive to conversion similarity matrices or graphs, and then applies a graph
cutting method to identify clusters. We know that the similarity matrix has an impor-
tant role in clustering performance. So, most spectral clustering methods are designed
to obtain a high-quality graph from the data set [22, 23]. However, there are still some
problems to limit the spectral clustering method performance, which includes some
of the proposed spectral clustering-based algorithms only consider the local similar-
ity of data, some of the methods consider the global similarity of data separately and
overlook the local similarity of the data. To address this, [24] has merged the local
method [25] and the global method [26] to learn similarities of the original data in the
kernel space. Furthermore, [24] uses low-rank constraint to make the adaptive graph to
achieve the purpose of one-step clustering. Compared to spectral clustering, the com-
putational cost of the method in [24] is very expensive, which is not suitable for large
data sets.
To solve these problems, we propose a novel spectral clustering method called Lo-
2
cal and Global Structure Preservation Based Spectral Clustering (LGPSC). The LGPSC
algorithm simultaneously considers preserving the local and the global structure of data
to provide comprehensive similarities for nonlinear clustering tasks. The remainder of
this paper is structured as follows. Some preliminaries are reviewed in section 2. The
proposed method (including two models) is introduced in section 3. Section 4 presents
experimental results for several data sets. The last section is devoted to some conclu-
sions.
2. Spectral Clustering
Spectral clustering (SC) has became a popular clustering technique due to the pi-
oneering works [27, 28, 29] at the beginning of the century. Spectral clustering is a
general graph-theoretic framework based on two main steps: first converts data points
into a data similarity matrix or a graph and then uses a graph cutting method to identify
clusters. Here we briefly outline the algorithm of spectral clustering.
For given data set X= [x1,x2,...,xn]Rm×nSpectral clustering (SC) algorithm
constructs local similarity matrix W= (Wi j)Rn×nthat Wi j measures the similarity
between two points xiand xj. The similarities between any two points in Wwill be
assigned to the value of a Gaussian kernel. If xjis a knearest neighbor of xi,Wi j is
computed as follows:
Wi j =exp(d(xi,xj)2/σ2)(1)
Subject to the constraint Wi j =0 if xjis not a knearest neighbor of xi.
Wi j =
e
dis(xi,xj)2
σ2xjknn(xi)
0otherwise
By this similarity matrix, Spectral clustering (SC) method tries to find embedding
vectors yifor each xisuch that minimize the following object function:
min tr(YTLY )s.t.YTY=I(2)
3
where Y= [y1,y2,..., yn]TRn×d,dis the number of clustering, L=DWis a
Laplacian matrix and D=diag(d1,d2,··· ,dn),dii =n
j=1Wi j . It is known that the
keigenvectors of Lassociated with the dsmallest eigenvalues are the solution of the
problem. It is easy to show that tr(YTLY ) = 1
2n
i,j=1wi jkyiyjk2. So, the Spectral
Clustering (SC) model could be presented as follows:
min
n
i,j=1
wi jkyiyjk2s.t.YTY=I(3)
3. Local and Global Structure Preservation Based Spectral Clustering (LGPSC)
This section proposes a new clustering method called Local and Global Structure
Preservation Based Spectral Clustering (LGPSC), a clustering method based on modi-
fied Spectral Clustering (SC) to preserve the global and local structure of the nonlinear
manifold data. To maintain the global structure of manifold data, the Local and Global
Structure Preservation Based Spectral Clustering (LGPSC) technique provides two new
models:
First model: The LGPSC(SC-PCA) technique combines Principal component
analysis (PCA) [30, 31] as a classical low-dimensional representation method
and Spectral Clustering method. This model aims to preserve local and global
structure by combining the SC’s similarities in local manifold structure and
PCAs good preservation of the data’s global structure simultaneously.
Second model: The LGPSC(Multilevel) method uses the arithmetic mean of
each input point along with its k nearest neighbors in the second model. This
model of the LGPSC algorithm employs the arithmetic means to calculate the
similarity between the local neighborhood of each input point and the local
neighborhood of other input points, which makes the global properties to be
considered in calculating the similarity matrix.
Our proposed methods based spectral clustering can have the following benefits:
4
摘要:

LocalandGlobalStructurePreservationBasedSpectralClusteringKajalEybpoosha,MansoorRezghib,AbbasHeydariaaDepartmentofMathematics,TarbiatModaresUniversity,Tehran,IranbDepartmentofComputerScience,TarbiatModaresUniversity,Tehran,IranAbstractSpectralClustering(SC)iswidelyusedforclusteringdataonanonlinearma...

展开>> 收起<<
Local and Global Structure Preservation Based Spectral Clustering.pdf

共19页,预览4页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:19 页 大小:449.31KB 格式:PDF 时间:2025-05-02

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 19
客服
关注