anatomies from PET-CT images is learned by training an inpainting model.
Specifically, a Partial Convolution Neural Network (PCNN) [4] was employed
to capture the distributions of healthy anatomies. Prior information regarding
the appearance of the tumors was then estimated by calculating the residuals
between the reconstructed pseudo-healthy images and the original tumoral ones.
Second, the prior information highlighting candidate tumoral regions was added
as an additional channel to a supervised segmentation network in order to guide
the attention of the model toward those regions.
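As a concrete illustration of this two-step pipeline, consider the minimal sketch below, which assumes a trained inpainting model pcnn that takes a corrupted image and a validity mask; the helper name and interface are hypothetical and do not reflect our actual implementation.

```python
# Minimal sketch (PyTorch assumed): residual prior + channel concatenation.
import torch

def tumor_appearance_prior(pcnn, image, valid_mask):
    """Hypothetical helper: estimate a tumor-appearance prior as the residual
    between the original image and its pseudo-healthy reconstruction."""
    with torch.no_grad():
        # The inpainting model reconstructs a pseudo-healthy image; here we
        # assume it takes the corrupted image and a validity mask (1 = keep).
        pseudo_healthy = pcnn(image * valid_mask, valid_mask)
    # Residuals highlight regions that deviate from healthy anatomy.
    prior = (image - pseudo_healthy).abs()
    # Append the prior as an additional channel for the segmentation network.
    return torch.cat([image, prior], dim=1)
```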
2.1 Estimating the tumor appearance
Estimating the prior information regarding the appearance of tumors can be
achieved by first modeling the healthy anatomies and then detecting the tumors
as anomalies. To model the distribution of complicated healthy anatomies from
whole-body PET-CT volumes, we employed a PCNN model as a robust inpainting
network (its core masked-convolution operation is sketched below). This
inpainting model can replace the pathological regions with the characteristics
of nearby healthy tissues and generate plausible pathology-free images with
realistic-looking and anatomically meaningful visual patterns. This is achieved
in two steps: 1) forcing the model to learn the appearance of healthy
anatomies, and 2) guiding the model to inpaint only the tumoral regions.
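For clarity, the following illustrative re-implementation shows the masked convolution at the heart of the PCNN [4]: outputs are computed from valid (unmasked) pixels only, re-normalized by the fraction of valid pixels in each receptive field, and the mask is updated so that holes shrink layer by layer. This is a sketch of the published operation, not the code used in this work.

```python
# Illustrative partial-convolution layer in the spirit of [4].
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding,
                              bias=True)
        # Fixed all-ones kernel used only to count valid pixels per window.
        self.register_buffer("ones",
                             torch.ones(1, 1, kernel_size, kernel_size))
        self.window = kernel_size * kernel_size
        self.stride, self.padding = stride, padding

    def forward(self, x, mask):
        # mask: float tensor of shape (N, 1, H, W); 1 = valid, 0 = hole.
        with torch.no_grad():
            valid = F.conv2d(mask, self.ones, stride=self.stride,
                             padding=self.padding)
        out = self.conv(x * mask)
        bias = self.conv.bias.view(1, -1, 1, 1)
        # Re-normalize by the share of valid pixels in each window.
        scale = self.window / valid.clamp(min=1)
        out = (out - bias) * scale + bias
        # Windows that saw at least one valid pixel become valid themselves.
        new_mask = (valid > 0).float()
        return out * new_mask, new_mask
```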
To learn the attributes of healthy anatomies, healthy image slices from the
PET-CT dataset were used as the training set of the inpainting model.
Specifically, more than 30,000 healthy image slices were used for training,
while the pathological slices were reserved for testing. Considering the large
diversity in the shape, size, and location of the tumoral regions, random
irregular shapes were synthesized by combining regular geometric shapes,
including circles, squares, and ellipses, to corrupt the healthy images (a
minimal sketch follows this paragraph). The PCNN model was trained to fill
these random holes with anatomically meaningful image patterns. The objective
function of the PCNN model is composed of several loss terms, including
per-pixel loss, perceptual loss, style loss, and total variation loss. This
multi-objective optimization improves the quality of the inpainted images,
producing high-quality reconstructions while preserving anatomical details.
The performance of the PCNN model was evaluated with the following metrics:
peak signal-to-noise ratio, mean squared error, and structural similarity
index.
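To illustrate the mask-synthesis step, the sketch below unions randomly placed circles, squares, and ellipses into an irregular binary mask. The number of shapes and the size ranges are illustrative assumptions rather than the exact settings used in our experiments.

```python
# Illustrative irregular-mask synthesis from regular geometric shapes.
import numpy as np

def random_irregular_mask(h, w, n_shapes=5, rng=None):
    """Union random circles, squares, and ellipses into a binary mask.
    Assumes slices of at least ~64 px per side."""
    rng = rng or np.random.default_rng()
    mask = np.zeros((h, w), dtype=bool)
    yy, xx = np.mgrid[0:h, 0:w]
    for _ in range(n_shapes):
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        shape = rng.choice(["circle", "square", "ellipse"])
        if shape == "circle":
            r = rng.integers(5, h // 8)
            mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
        elif shape == "square":
            s = rng.integers(5, h // 8)
            mask |= (np.abs(yy - cy) <= s) & (np.abs(xx - cx) <= s)
        else:  # ellipse with random semi-axes
            ay, ax = rng.integers(5, h // 8), rng.integers(5, w // 8)
            mask |= ((yy - cy) / ay) ** 2 + ((xx - cx) / ax) ** 2 <= 1.0
    return mask  # True marks pixels to corrupt (holes to inpaint)
```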
During the training step, the model learns to fill the random holes with the
attributes of nearby healthy tissues. This process forces the model to capture
the distribution of healthy anatomies. Therefore, in the test phase, the
learned model can be used to replace the tumoral regions with the visual
characteristics of healthy tissues. However, such a tumor-removal step
requires tumoral masks. While in the NAA model a second autoencoder was
employed to remove the tumors automatically, in this study we utilized the
learned PCNN model to directly inpaint the tumoral regions by exploiting the
hyperintensity patterns of PET images. Specifically, tumoral regions in PET
volumes often appear with higher FDG uptake with respect to nearby