
PRINCIPAL COMPONENT CLASSIFICATION
Rozenn Dahyot
Department of Computer Science, Maynooth University, Ireland
This work was funded by the SFI Research Centre ADAPT (13/RC/2106 P2), and is co-funded by the European Regional Development Fund.
ABSTRACT
We propose to directly compute classification estimates by learning features encoded with their class scores. Our resulting model has an encoder-decoder structure suitable for supervised learning; it is computationally efficient and performs well for classification on several datasets.
Index Terms—Supervised Learning, PCA, classification
1. INTRODUCTION
The choice of data encoding for defining the inputs and outputs of machine learning pipelines contributes substantially to their performance. For instance, adding positional encoding to the inputs has proven useful for Convolutional Neural Networks [1] and for Neural Radiance Fields [2]. Here, we propose to add vectors of class scores as part of the inputs in order to learn principal components suitable for predicting classification scores. The performance of our proposed frugal model is validated experimentally on the datasets wine, australian [3, 4] and MNIST [5], for comparison with metric learning classification [6, 4, 7] and deep learning [5, 8].
2. PRINCIPAL COMPONENT CLASSIFICATION
In supervised learning, we consider available a dataset $\mathcal{B}=\{(\mathbf{x}^{(i)},\mathbf{y}^{(i)})\}_{i=1,\cdots,N}$ of $N$ observations, with $\mathbf{x}\in\mathbb{R}^{d_x}$ denoting the feature vector of dimension $d_x$ and $\mathbf{y}\in\mathbb{R}^{n_c}$ the indicator class vector, where $n_c$ is the number of classes. All coordinates of $\mathbf{y}^{(i)}$ are equal to zero except for its coordinate $y^{(i)}_j$, which is equal to 1 when the feature vector $\mathbf{x}^{(i)}$ belongs to class $j\in\{1,\cdots,n_c\}$.
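To make this indicator encoding concrete, the following minimal NumPy sketch (our own illustration, not from the paper; the function name and 0-based class labels are assumptions) builds the one-hot vectors $\mathbf{y}^{(i)}$ from integer class labels and stacks them as columns:

```python
import numpy as np

def indicator_vectors(labels, n_c):
    """Build the matrix of one-hot class vectors [y^(1), ..., y^(N)]:
    entry (j, i) equals 1 iff sample i belongs to class j (0-based)."""
    N = len(labels)
    Y = np.zeros((n_c, N))
    Y[labels, np.arange(N)] = 1.0
    return Y

# Example: N = 4 samples, n_c = 3 classes
Y = indicator_vectors(np.array([0, 2, 1, 2]), n_c=3)
# column i of Y is the indicator vector y^(i) of sample i
```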
Principal Component Analysis (PCA) [9] is a standard technique for dimension reduction, often used in conjunction with classification techniques [6]. In PCA, the principal components correspond to the eigenvectors of the covariance matrix $\Sigma$ ranked in descending order of their associated eigenvalues, where $\Sigma = \frac{1}{N}\mathbf{X}\mathbf{X}^T$ and $\mathbf{X} = [\mathbf{x}^{(1)},\cdots,\mathbf{x}^{(N)}]$. These principal components provide an orthonormal basis in the feature space. Retaining only the components associated with the highest eigenvalues allows projecting $\mathbf{x}$ into a very low-dimensional eigenspace (data embedding). Such a PCA-based representation has been used for learning images of objects, to perform detection and registration [10, 11, 12], and has a probabilistic interpretation [13]. PCA for dimensionality reduction of the feature space ignores information from the class labels; we next propose a new data encoding suitable for learning principal components that can be used for classification.
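For reference, this standard embedding step can be sketched as follows (an illustrative sketch only, assuming the feature vectors are already centred and stored as the columns of X; function and variable names are ours, not the paper's):

```python
import numpy as np

def pca_embedding(X, k):
    """Standard PCA: project the columns of X (d_x x N, assumed centred)
    onto the k eigenvectors of Sigma = (1/N) X X^T with largest eigenvalues."""
    N = X.shape[1]
    Sigma = (X @ X.T) / N
    eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # rank in descending order
    U_k = eigvecs[:, order[:k]]                # top-k principal components
    return U_k.T @ X                           # k x N low-dimensional embedding
```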
2.1. Data encoding with Class
Class score vectors have recently been used as node attributes in a graph model for image segmentation [14]. We likewise propose to use that information explicitly by creating a training dataset noted $\mathcal{T}_{\alpha}=\{\mathbf{z}^{(i)}_{\alpha}\}_{i=1,\cdots,N}$ from the dataset $\mathcal{B}$, where each instance $\mathbf{z}^{(i)}_{\alpha}$ concatenates the feature vector $\mathbf{x}^{(i)}$ with its class vector $\mathbf{y}^{(i)}$ as follows:
$$\mathbf{z}_{\alpha} = (1-\alpha)\cdot\begin{bmatrix}\mathbf{x}\\ \mathbf{0}_y\end{bmatrix} + \alpha\cdot\begin{bmatrix}\mathbf{0}_x\\ \mathbf{y}\end{bmatrix} \quad (1)$$
where $\mathbf{0}_x$ and $\mathbf{0}_y$ are the null vectors of the feature space $\mathbb{R}^{d_x}$ and the class space $\mathbb{R}^{n_c}$ respectively. The scalar $0\leq\alpha\leq 1$ controls the weight of the class vector w.r.t. the feature vector, and it is a hyper-parameter in this new framework.
The training dataset $\mathcal{T}_{\alpha}$ is stored in a data matrix noted $\mathbf{Z}_{\alpha}= [\mathbf{z}^{(1)}_{\alpha},\cdots,\mathbf{z}^{(N)}_{\alpha}]$. The matrix $\mathbf{Z}_{\alpha}$ concatenates vertically the matrix $\mathbf{X}$ and the matrix $\mathbf{Y}=[\mathbf{y}^{(1)},\cdots,\mathbf{y}^{(N)}]$ as follows:
$$\mathbf{Z}_{\alpha} = \begin{bmatrix}(1-\alpha)\cdot\mathbf{X}\\ \alpha\cdot\mathbf{Y}\end{bmatrix} \quad (2)$$
We note $d_z=d_x+n_c$ the dimension of the vectors $\mathbf{z}_{\alpha}$, and the matrix $\mathbf{Z}_{\alpha}$ is of size $d_z\times N$.
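A minimal sketch of this encoding (our own illustration, not the authors' code; it assumes X of size d_x x N and the one-hot matrix Y of size n_c x N are already built, and the names are ours):

```python
import numpy as np

def build_Z(X, Y, alpha):
    """Stack (1 - alpha) * X on top of alpha * Y as in Eq. (2).
    X: d_x x N feature matrix, Y: n_c x N one-hot class matrix.
    Returns Z_alpha of size d_z x N with d_z = d_x + n_c."""
    assert X.shape[1] == Y.shape[1], "X and Y need one column per sample"
    return np.vstack([(1.0 - alpha) * X, alpha * Y])

# Example: d_x = 4 features, n_c = 3 classes, N = 5 samples
X = np.random.randn(4, 5)
Y = np.eye(3)[:, np.random.randint(0, 3, size=5)]
Z_alpha = build_Z(X, Y, alpha=0.5)   # shape (7, 5)
```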
2.2. Principal components
The $d_z\times d_z$ covariance matrix $\Sigma_{\alpha}$ is computed as follows:
$$\Sigma_{\alpha}=\frac{1}{N}\mathbf{Z}_{\alpha}\mathbf{Z}_{\alpha}^T= \mathbf{U}_{\alpha}\Lambda_{\alpha}\mathbf{U}_{\alpha}^T \quad (3)$$
In our experiments, we used Singular Value Decomposition (SVD) to compute the diagonal matrix $\Lambda_{\alpha}$ of eigenvalues $\{\lambda_i\}_{i=1,\cdots,d_z}$ of $\Sigma_{\alpha}$, with the corresponding eigenvectors stored as columns in the matrix $\mathbf{U}_{\alpha}= [\mathbf{u}_1,\cdots,\mathbf{u}_{d_z}]$.
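A minimal sketch of this decomposition (our own illustration, assuming Z_alpha is stored as a d_z x N NumPy array; names are assumptions): since $\Sigma_{\alpha}$ is symmetric positive semi-definite, the SVD of $\mathbf{Z}_{\alpha}/\sqrt{N}$ yields its eigenvectors directly, with the squared singular values as eigenvalues.

```python
import numpy as np

def principal_components(Z):
    """Eigendecomposition of Sigma_alpha = (1/N) Z Z^T via the SVD of Z (Eq. (3)).
    Z: d_z x N data matrix. Returns U_alpha (eigenvectors as columns, d_z x d_z)
    and lam (the d_z eigenvalues of Sigma_alpha in descending order)."""
    d_z, N = Z.shape
    U, s, _ = np.linalg.svd(Z / np.sqrt(N), full_matrices=True)
    lam = np.zeros(d_z)
    lam[:len(s)] = s ** 2     # squared singular values; zero beyond the rank of Z
    return U, lam
```

Forming $\Sigma_{\alpha}$ explicitly and calling np.linalg.eigh would give the same decomposition; working on $\mathbf{Z}_{\alpha}$ directly avoids squaring the conditioning of the problem.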
For large training datasets ($N \gg 0$), more efficient algorithms