UNSUPERVISED PARTICLE SORTING FOR CRYO-EM USING PROBABILISTIC PCA
Gili Weiss-Dicker⋆, Amitay Eldar†, Yoel Shkolinsky†, Tamir Bendory⋆
⋆School of Electrical Engineering, Tel Aviv University
†Department of Applied Mathematics, School of Mathematical Sciences, Tel Aviv University
ABSTRACT
Single-particle cryo-electron microscopy (cryo-EM) is a lead-
ing technology to resolve the structure of molecules. Early
in the process, the user detects potential particle images in
the raw data. Typically, there are many false detections as a
result of high levels of noise and contamination. Currently,
removing the false detections requires human intervention
to sort the hundreds of thousands of images. We propose a
statistically-established unsupervised algorithm to remove
non-particle images. We model the particle images as a union
of low-dimensional subspaces, assuming non-particle images
are arbitrarily scattered in the high-dimensional space. The
algorithm is based on an extension of the probabilistic PCA
framework to robustly learn a non-linear model of a union of
subspaces. This provides a flexible model for cryo-EM data,
and allows automatic removal of images that correspond
to pure noise and contamination. Numerical experiments
corroborate the effectiveness of the sorting algorithm.
Index Terms: Unsupervised learning, single-particle
cryo-EM, probabilistic PCA, expectation-maximization
1 INTRODUCTION
Single-particle cryo-electron microscopy (cryo-EM) is an
emerging technology to determine the structure of molecules.
In the cryo-EM process, the acquired “raw data” image, called
a micrograph, contains a few dozen 2-D tomographic particle
projection images with unknown random orientations
and locations. The micrograph suffers from a low signal-to-noise
ratio (SNR), as low as 1/100. Typically, it also contains
undesired contamination. For the purpose of this paper, the
pixels in a micrograph can be broadly divided into three cat-
egories: regions of particles with additive noise, regions of
contamination, and regions of noise only.
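To give a feel for what an SNR of 1/100 means, the following minimal sketch adds white Gaussian noise to a synthetic image. It assumes the common convention that SNR is the ratio of signal variance to noise variance; the Gaussian blob standing in for a particle projection and the function name `add_noise_at_snr` are hypothetical and not taken from the paper.

```python
import numpy as np

def add_noise_at_snr(clean, snr, rng):
    """Add white Gaussian noise so that var(clean) / var(noise) is about snr.

    Assumption: SNR is defined as signal variance over noise variance.
    """
    noise_var = clean.var() / snr
    return clean + rng.normal(0.0, np.sqrt(noise_var), size=clean.shape)

rng = np.random.default_rng(0)

# Hypothetical stand-in for a particle projection: a smooth 2-D Gaussian blob.
y, x = np.mgrid[-32:32, -32:32]
clean = np.exp(-(x**2 + y**2) / (2 * 8.0**2))

noisy = add_noise_at_snr(clean, snr=1 / 100, rng=rng)

# Empirical SNR of the simulated image (close to the requested 1/100).
empirical_snr = clean.var() / (noisy - clean).var()
```

At this noise level the blob is essentially invisible in a single image, which is why picked images are hard to separate from pure noise by eye.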
During the cryo-EM workflow, particle images are detected
and extracted from micrographs in a process called particle
picking [1, 2]. Ideally, the extracted images are the individual
particles within each micrograph. If only particles were picked,
the images chosen by the particle picker could be used directly
to reconstruct the 3-D molecular structure. Figure 1 illustrates
a schematic sequence of computational steps typically
used to convert the raw data into 3-D molecular structures.
While many particle picking algorithms have been developed, e.g.,
[3, 4, 5], the very low SNR causes them to pick contamination
and pure-noise images along with the particle
images. Typical images chosen by a picking algorithm are
shown in Figure 2.
A common approach to removing non-particle images, called
“2-D classification,” is semi-automatic and involves an expert
practitioner; it relies heavily on subjective criteria that
are neither consistent nor reproducible among different users.
We propose an automatic and statistically-established unsu-
pervised algorithm to remove non-particle images from the
data. Specifically, we assume that all particle images approx-
imately lie on a union of subspaces, whereas the non-particle
images are scattered in the high-dimensional space. Similar
parsimonious models are ubiquitous in many signal process-
ing tasks, and specifically in different stages of the cryo-EM
computational pipeline [7, 8]. The main computational tool
in this work is an extension of principal component analysis
(PCA). PCA has been applied to cryo-EM data for several
tasks [7, 9]. However, PCA is limited since it learns a single
subspace. We build on a maximum likelihood formulation,
called probabilistic PCA (PPCA) [10, 11, 12]. In particu-
lar, we iteratively estimate the union of subspaces using an
expectation-maximization (EM) algorithm, while sorting out
images that do not lie on the subspaces.
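For orientation, the standard PPCA generative model and its mixture extension can be written as follows. This is the textbook formulation, not necessarily the notation or the exact robust variant used in this paper; the symbols $d$ (image dimension), $q$ (subspace dimension), $K$ (number of subspaces), $W_k$, $\mu_k$, $\sigma_k^2$, and $\pi_k$ are assumptions introduced here for illustration.

```latex
% Single-subspace PPCA: a q-dimensional latent variable plus isotropic noise.
x = W z + \mu + \varepsilon, \qquad
z \sim \mathcal{N}(0, I_q), \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 I_d),
\qquad W \in \mathbb{R}^{d \times q}.

% Marginalizing over z gives a Gaussian with low-rank-plus-isotropic covariance.
x \sim \mathcal{N}\!\left(\mu,\; W W^{\top} + \sigma^2 I_d\right).

% Mixture of K such components: a probabilistic union-of-subspaces model.
p(x) = \sum_{k=1}^{K} \pi_k \,
       \mathcal{N}\!\left(x;\; \mu_k,\; W_k W_k^{\top} + \sigma_k^2 I_d\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1.
```

Under such a model, an image that lies far from every subspace receives a low likelihood under all $K$ components, which is the intuition behind sorting out non-particle images.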
PPCA offers several attractive advantages over PCA. First,
PPCA can be readily extended to multiple subspaces, leading
to a nonlinear flexible mixture model. Second, we work in the
dimension of the problem (i.e., the number of parameters that
define the sought subspaces), in contrast to standard PCA, which
requires estimating the full covariance matrix.
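The second advantage can be illustrated with a minimal sketch of the standard EM iteration for a single PPCA subspace. This is the textbook single-subspace algorithm, not the union-of-subspaces sorting method proposed in this paper; the function name `ppca_em` and its parameters are hypothetical. The only large arrays are of size N x d, d x q, and N x q, so the d x d covariance matrix is never formed.

```python
import numpy as np

def ppca_em(X, q, n_iter=100, seed=0):
    """Fit a single PPCA subspace by EM (textbook algorithm, illustrative only).

    X: (N, d) data matrix, one vectorized image per row.
    q: assumed subspace dimension.
    Returns the mean (d,), the loading matrix W (d, q), and the noise variance.
    """
    rng = np.random.default_rng(seed)
    N, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu                       # centered data, (N, d)
    W = rng.standard_normal((d, q))   # random initial loadings
    sigma2 = 1.0                      # initial noise variance
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables (q x q algebra only).
        M = W.T @ W + sigma2 * np.eye(q)
        M_inv = np.linalg.inv(M)
        Ez = Xc @ W @ M_inv                         # (N, q), rows are E[z_n]
        sum_Ezz = N * sigma2 * M_inv + Ez.T @ Ez    # sum_n E[z_n z_n^T], (q, q)
        # M-step: update loadings and noise variance.
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)    # (d, q)
        sigma2 = (
            np.sum(Xc**2)
            - 2.0 * np.sum((Xc @ W) * Ez)
            + np.trace(sum_Ezz @ (W.T @ W))
        ) / (N * d)
    return mu, W, sigma2
```

The columns of the returned `W` span the estimated subspace but are not orthonormal; an orthonormal basis can be obtained with `np.linalg.qr(W)` if needed.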
Fig. 1: A typical data processing pipeline for single-particle cryo-EM. The main contribution of this work is the fully automated sorting block,
which aims to remove the need for an expert practitioner in the sorting step.
arXiv:2210.12811v2 [eess.IV] 7 Mar 2023