JOINT RIGID MOTION CORRECTION AND SPARSE-VIEW CT
VIA SELF-CALIBRATING NEURAL FIELD
Qing Wu†, Xin Li†, Hongjiang Wei‡, Jingyi Yu†, Yuyao Zhang†,∗
†School of Information Science and Technology, ShanghaiTech University, Shanghai, China
‡School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
ABSTRACT
Neural Radiance Field (NeRF) has received wide attention in Sparse-View Computed Tomography (SVCT) reconstruction as a self-supervised deep learning framework. NeRF-based SVCT methods represent the desired CT image as a continuous function of spatial coordinates and train a Multi-Layer Perceptron (MLP) to learn the function by minimizing the loss on the SV sinogram. Benefiting from the continuous representation provided by NeRF, high-quality CT images can be reconstructed. However, existing NeRF-based SVCT methods strictly assume that there is no relative motion during CT acquisition, because they require accurate projection poses to model the X-rays that produce the SV sinogram. These methods therefore suffer severe performance drops in real SVCT imaging with motion. In this work, we propose a self-calibrating neural field to recover an artifact-free image from a rigid motion-corrupted SV sinogram without using any external data. Specifically, we parameterize the inaccurate projection poses caused by rigid motion as trainable variables and then jointly optimize these pose variables and the MLP. We conduct numerical experiments on a public CT image dataset. The results indicate that our model significantly outperforms two representative NeRF-based methods on SVCT reconstruction tasks with four different levels of rigid motion.
Index Terms—Sparse-View CT Reconstruction, Rigid Motion
Correction, Neural Radiance Field, Self-Supervised Learning.
1. INTRODUCTION
Sparse-View Computed Tomography (SVCT) can significantly re-
duce the radiation dose and shorten the scanning time by decreasing
the number of radiation views. However, reconstructions from the insufficient projection measurements (i.e., the SV sinogram) suffer from severe streaking artifacts when conventional analytical algorithms such as Filtered Back-Projection (FBP) [1] are applied, which significantly degrades image quality.
Recently, several self-supervised SVCT methods [2–6] based on Neural Radiance Field (NeRF) [7] have emerged. Different from supervised deep learning models [8,9], these NeRF-based
methods can recover the high-quality CT image from the SV sino-
gram without using any external data. Specifically, NeRF-based
methods first represent the unknown CT image as a continuous func-
tion that maps coordinates to intensities and then train a Multi-Layer
Perceptron (MLP) to learn the function by minimizing prediction er-
rors on the SV sinogram. Benefiting from the implicit continuous
prior imposed by the function and the neural network prior [10–12],
the function can be approximated well, and thus the desired high-
quality CT image will be reconstructed.
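The coordinate-to-intensity network described above can be sketched in PyTorch as follows. This is an illustrative sketch only: the layer widths, number of encoding frequencies, and class names are assumptions for exposition, not the exact architectures used in [2–6].

```python
import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    """Maps a 2-D coordinate (x, y) in [-1, 1]^2 to a scalar intensity.

    A Fourier-feature (positional) encoding is the standard NeRF-style
    trick that lets an MLP fit high-frequency image content. All sizes
    here are illustrative choices.
    """

    def __init__(self, num_freqs: int = 8, hidden: int = 256):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 2 * 2 * num_freqs  # sin & cos for each of the 2 coordinates
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def encode(self, p: torch.Tensor) -> torch.Tensor:
        # p: (..., 2) coordinates -> (..., 4 * num_freqs) Fourier features
        freqs = 2.0 ** torch.arange(self.num_freqs, device=p.device) * torch.pi
        angles = p[..., None] * freqs                       # (..., 2, L)
        feats = torch.cat([angles.sin(), angles.cos()], dim=-1)
        return feats.flatten(-2)                            # (..., 4L)

    def forward(self, p: torch.Tensor) -> torch.Tensor:
        return self.net(self.encode(p)).squeeze(-1)         # (...,) intensities
```

Training such a network against the SV sinogram, rather than against ground-truth images, is what makes these methods self-supervised.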
Existing NeRF-based SVCT methods [2–6] assume that there is no relative motion during the CT acquisition process, so that accurate projection poses are accessible for modeling the X-rays that produce the SV sinogram. However, relative motion, especially rigid motion, is common and even inevitable [13–16] in real CT acquisition due to various factors (e.g., the imaging subject's movement and the CT scanner's system error). Therefore, this overly strict assumption results in severe performance drops in real SVCT reconstruction with motion.
In this paper, we propose a self-calibrating neural field that can reconstruct an artifact-free image from a rigid motion-corrupted SV measurement without involving any external data. Like previous works [2–6], our model is built on the NeRF [17] framework (i.e., an MLP learns the function of the desired CT image by minimizing loss on the SV sinogram). The major novelty of our model is that it additionally models the rigid motion in the CT acquisition process and thus produces robust results for SVCT imaging with rigid motion. More specifically, we
first parameterize inaccurate projection poses caused by rigid mo-
tion as three trainable variables (a rotation angle and two transla-
tion offsets). Then, we jointly optimize these pose variables and
the MLP representation. After pose calibration and MLP optimization, the final high-quality CT image can be reconstructed. We
conduct numerical experiments on a public COVID-19 CT image
dataset [18]. Experimental results show that the proposed model significantly outperforms two representative NeRF-based methods [2,3] on SVCT reconstruction tasks with four different levels of rigid motion. We also perform an ablation study of the pose correction module in our model, and the results confirm its effectiveness.
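The per-view rigid-motion parameterization above (one in-plane rotation angle and two translation offsets per projection) can be sketched as follows. The module and variable names are illustrative assumptions; the key idea is that the corrections are plain trainable tensors, initialized to zero (the nominal, motion-free geometry) and optimized jointly with the image MLP by the same gradient descent.

```python
import torch
import torch.nn as nn

class ViewPoses(nn.Module):
    """Trainable rigid-motion corrections for each projection view:
    a rotation angle (radians) and two translation offsets, all
    registered as nn.Parameter so an optimizer updates them jointly
    with the MLP weights. Illustrative sketch, not the exact module."""

    def __init__(self, num_views: int):
        super().__init__()
        self.dtheta = nn.Parameter(torch.zeros(num_views))   # rotation corrections
        self.dt = nn.Parameter(torch.zeros(num_views, 2))    # (tx, ty) offsets

    def transform(self, view_idx: int, points: torch.Tensor) -> torch.Tensor:
        """Rigidly transform the sample points of one view's X-rays."""
        th = self.dtheta[view_idx]
        c, s = torch.cos(th), torch.sin(th)
        R = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])  # 2x2 rotation
        return points @ R.T + self.dt[view_idx]
```

Because the transform is differentiable in `dtheta` and `dt`, the sinogram loss of the NeRF framework backpropagates into the pose variables with no extra machinery.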
2. BACKGROUND
Formally, NeRF-based SVCT methods [2–6] represent the unknown, high-quality CT image x ∈ R^{N×N} as a continuous function:

M : p → I, (1)

where p = (x, y) ∈ R^2 is any spatial coordinate in the normalized 2D Cartesian coordinate system [−1, 1] × [−1, 1] and I ∈ R denotes the corresponding intensity value in the CT image x.
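Given such a continuous representation, a differentiable parallel-beam forward projection can be approximated by sampling the function along each ray and summing, i.e., a Riemann-sum version of the Radon transform. The sketch below uses a toy closed-form image function in place of the MLP; the sampling scheme and names are illustrative assumptions, not the exact operator used in the cited methods.

```python
import torch

def parallel_beam_projection(image_fn, phi, num_rays=32, num_samples=64):
    """Riemann-sum approximation of the Radon transform at angle phi
    (radians): sample every ray of the view on a regular grid inside
    [-1, 1]^2 and sum the predicted intensities along each ray."""
    s = torch.linspace(-1.0, 1.0, num_rays)       # detector coordinate
    t = torch.linspace(-1.0, 1.0, num_samples)    # coordinate along the ray
    d = torch.stack([torch.cos(phi), torch.sin(phi)])    # ray direction
    n = torch.stack([-torch.sin(phi), torch.cos(phi)])   # detector axis
    pts = s[:, None, None] * n + t[None, :, None] * d    # (rays, samples, 2)
    vals = image_fn(pts.reshape(-1, 2)).reshape(num_rays, num_samples)
    return vals.sum(dim=1) * (2.0 / num_samples)  # per-ray line integral

# Toy differentiable image function standing in for the MLP F_Phi:
# an isotropic Gaussian blob with a trainable center.
center = torch.zeros(2, requires_grad=True)
image_fn = lambda p: torch.exp(-8.0 * (p - center).pow(2).sum(-1))

proj = parallel_beam_projection(image_fn, torch.tensor(0.5))
proj.sum().backward()  # gradients flow back to the image parameters
```

Because every step is differentiable, minimizing a loss between such rendered projections and the measured sinogram trains the image parameters end to end, which is exactly the mechanism formalized next.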
Given the acquired SV sinogram y ∈ R^{M×N}, where M and N denote the number of projections and the number of X-rays per projection, respectively, NeRF-based methods leverage an MLP F_Φ to learn the continuous function M by optimizing the following objective:

Φ* = arg min_Φ L(y, A F_Φ), (2)

where A ∈ R^{N×M} represents a differentiable projection operator (e.g., the Radon transform for parallel X-ray beam CT acquisition) and
arXiv:2210.12731v2 [eess.IV] 6 Nov 2022