1. Introduction
Challenges surrounding high-dimensional data have attracted considerable attention since the emergence of the internet and the rapid development of information technology. The curse of dimensionality has become prominent in machine learning, pattern recognition, data mining, computer vision, and recommender systems. Accordingly, dimensionality reduction techniques have developed rapidly. Feature selection and feature extraction are the two main categories of dimensionality reduction. Feature extraction maps the data to a lower-dimensional representation, whereas feature selection seeks the most discriminative subset of the original features, which can be used to interpret medical, industrial, financial, and other data.
Feature selection is particularly important for medical data, since the selected features can be used in disease diagnosis or in optimizing treatments. To this end, a line of research in bioinformatics and pattern recognition is devoted to high-dimensional DNA microarray datasets. These data have significantly more features than samples and must therefore be treated as an underdetermined system [1,2]. Although a DNA microarray stores thousands of genes from different samples, only a small portion of the encoded information is related to disease [3]. The gene selection problem in bioinformatics is thus equivalent to a feature selection task in machine learning.
In the context of disease diagnosis and progression analysis, demographic, spirometry, and computed tomography (CT) features have been collected and combined to predict chronic obstructive pulmonary disease (COPD) progression, exacerbation, and hospitalization. Selecting the most prominent and discriminative of these features improved the accuracy of hospitalization prediction and of discrimination between COPD and asthma [4-6].
Features with no effect on the output are considered irrelevant, while redundant features are combinations of other features and therefore add no information. Including irrelevant and redundant features increases the complexity of the system, which degrades the performance of the learning algorithm and increases the computation time [7].
Feature selection is used for dimensionality reduction, obtaining the most informative subset of features by minimizing redundancy and maximizing relevance, with the aim of increasing accuracy and decreasing computation time. Feature selection also decreases task complexity and thus the probability of overfitting [8]. The peaking phenomenon states that, for a fixed amount of training data, the error of a classifier first decreases as features are added but rises again once too many features are included [9]. There are three strategies for feature selection: (a) filter-based, (b) wrapper-based, and (c) embedded techniques. In filter-based methods, features are ranked and assessed according to criteria computed from the statistical and intrinsic characteristics of the dataset; there is no interaction with any classifier. The Laplacian score [10], mutual information, and minimum-redundancy maximum-relevance (MRMR) are among the best-known filter-based techniques [11-15]. In wrapper-based methods, the features are evaluated through the learning algorithm itself, so the best subset of features is selected according to the output of the learner. Sequential forward selection, sequential backward selection, floating search [16], and recursive SVM [17] are conventional wrapper-based feature selection techniques in machine learning.
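To make the filter strategy concrete, the following minimal sketch ranks features by their mutual information with the class label, independently of any classifier. The synthetic data, the scikit-learn routines, and the choice of k are illustrative assumptions, not the specific methods of [11-15]:

```python
# Minimal filter-style feature ranking via mutual information.
# Data and parameter choices here are hypothetical, for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Synthetic high-dimensional data: many more features than are informative.
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=10, random_state=0)

# Score every feature against the label without involving any classifier.
scores = mutual_info_classif(X, y, random_state=0)

# Keep the k highest-scoring features (k is an arbitrary choice here).
k = 10
selected = np.argsort(scores)[::-1][:k]
print("selected feature indices:", selected)
```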
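By contrast, a wrapper method scores candidate subsets through the learner itself. The sketch below uses sequential forward selection driven by a classifier's cross-validated accuracy; the estimator, subset size, and data are again illustrative assumptions rather than a prescribed configuration:

```python
# Minimal wrapper-style sketch: sequential forward selection,
# where the learning algorithm's performance guides the search.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=8, random_state=0)

# The learner itself evaluates candidate subsets, unlike a filter method.
estimator = LogisticRegression(max_iter=1000)
sfs = SequentialFeatureSelector(estimator, n_features_to_select=8,
                                direction="forward", cv=5)
sfs.fit(X, y)
print("selected feature mask:", sfs.get_support())
```

Because every candidate subset is refit and scored by the learner, wrapper methods tend to find subsets better matched to the classifier than filter methods do, at a substantially higher computational cost.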