Strong Gravitational Lensing Parameter
Estimation with Vision Transformer
Kuan-Wei Huang1,*, Geoff Chih-Fan Chen2,*, Po-Wen Chang3, Sheng-Chieh
Lin4, Chia-Jung Hsu5, Vishal Thengane6, and Joshua Yao-Yu Lin7,*
1Carnegie Mellon University
2University of California, Los Angeles
3Ohio State University
4University of Kentucky
5Chalmers University of Technology
6Mohamed bin Zayed University of Artificial Intelligence
7University of Illinois at Urbana-Champaign
* Equal contribution
yaoyuyl2@illinois.edu
Abstract. Quantifying the parameters and corresponding uncertainties
of hundreds of strongly lensed quasar systems holds the key to resolving
one of the most important scientific questions: the Hubble constant
(H0) tension. The commonly used Markov chain Monte Carlo (MCMC)
method has been too time-consuming to achieve this goal, yet recent work
has shown that convolutional neural networks (CNNs) can be an alternative
with seven orders of magnitude improvement in speed. With 31,200
simulated strongly lensed quasar images, we explore the use of the Vision
Transformer (ViT) for simulated strong gravitational lensing for the first
time. We show that ViT can reach results competitive with CNNs, and is
particularly good at some lensing parameters, including the most important
mass-related parameters such as the center of the lens θ1 and θ2, the
ellipticities e1 and e2, and the radial power-law slope γ′. With this
promising preliminary result, we believe the ViT (or attention-based)
network architecture can be an important tool for strong lensing science
for the next generation of surveys. Our code and data are open-sourced
at https://github.com/kuanweih/strong_lensing_vit_resnet.
1 Introduction
The discovery of the accelerated expansion of the Universe [1,2] and observations
of the Cosmic Microwave Background (CMB; e.g., [3]) established the standard
cosmological paradigm: the so-called Λ cold dark matter (ΛCDM) model, where
Λ represents a constant dark energy density. Intriguingly, the recent direct 1.7%
H0 measurement from Type Ia supernovae (SNe), calibrated by the traditional
Cepheid distance ladder (H0 = 73.2 ± 1.3 km s⁻¹ Mpc⁻¹; SH0ES collaboration
[4]), shows a 4.2σ tension with the Planck result (H0 = 67.4 ± 0.5 km s⁻¹ Mpc⁻¹
[5]). However, a recent measurement of H0 from SNe Ia calibrated by the Tip of
the Red Giant Branch (H0 = 69.8 ± 0.8 (stat) ± 1.7 (sys) km s⁻¹ Mpc⁻¹; CCHP
collaboration [6]) agrees with both the Planck and SH0ES results. The spread
in these results, whether due to systematic effects or not, clearly demonstrates
that it is crucial to reveal unknown systematics through different methodologies.
Strongly lensed quasar systems provide such a technique: they constrain H0
at low redshift in a way that is completely independent of the traditional distance-ladder
approach (e.g., [7,8,9]). When a quasar is strongly lensed by a foreground galaxy,
its multiple images have light curves that are offset by a well-defined time delay,
which depends on the mass profile of the lens and the cosmological distances to
the galaxy and the quasar [10]. However, the bottleneck of using strongly lensed
quasar systems is the expensive cost in computational resources and manpower.
With the commonly used Markov chain Monte Carlo (MCMC) procedure, modeling
a single strongly lensed quasar system requires experienced modelers and a few
months of effort to obtain robust uncertainty estimates, and up to years
to check the systematics (e.g., [11,12,13,14,15,16]). This is infeasible, as about 2,600
such systems with well-measured time delays are expected to be discovered in
the upcoming survey with the Large Synoptic Survey Telescope [17,18].
Fig. 1. Left panel: simulated strong-lensing images with real point spread functions
(top two: space-based telescope images; bottom: ground-based adaptive-optics image).
Each image contains the lensing galaxy in the middle, the multiple lensed quasar
images, and the lensed background host galaxy (arc). Right panel: Vision Transformer
attention maps. The overall averaged attention focuses on the strong-lens system,
while each individual head attends to a different feature: head #2 focuses on the
center of the lens, heads #1 and #3 look at particular lensed quasar images, and
head #4 deals with the arc.
Deep learning provides a workaround for the time-consuming lens-modeling
task by directly mapping the underlying relationships between the input lensing
images and the corresponding lensing parameters and their uncertainties.
Hezaveh et al. [19] and Perreault Levasseur et al. [20] first demonstrated that
convolutional neural networks (CNNs) can be an alternative to maximum
likelihood procedures with seven orders of magnitude improvement in speed.
Since then, other works have adopted CNNs for strong-lensing-related inference
[21,22,23,24,25,26,27,28,29,30].
In this work, instead of using traditional CNN-based models, we explore the
attention-based Vision Transformer (ViT; [31,32]), which has been shown to be
more robust than CNN-based models [33]. Furthermore, ViT retains
more spatial information than ResNet [34] and is therefore well suited to
strong-lensing imaging, as the quasar configuration and the spatially extended
background lensed galaxy provide rich information on the foreground mass
distribution (see Figure 1).
2 Data and Models
In Section 2.1, we describe the strong-lensing simulation used to generate the
datasets in this work. In Section 2.2, we describe the deep learning models we
train on the simulated dataset for strong-lensing parameter and uncertainty
estimation.
2.1 Simulation and Datasets
Simulating strong-lensing imaging requires four major components: the mass
distribution of the lensing galaxy, the source light distribution, the lens light
distribution, and the point spread function (PSF), which convolves the images
according to atmospheric distortion and telescope structures. We use the
lenstronomy package [35,36] to generate 31,200 strong-lensing images with the
corresponding lensing parameters for our image multi-regression task. For the
mass distribution, we adopt the commonly used (e.g., [37,15]) elliptically symmetric
power-law distribution [38] to model the dimensionless surface mass density of
lens galaxies,
\kappa_{\rm pl}(\theta_1, \theta_2) = \frac{3 - \gamma'}{1 + q} \left( \frac{\theta_{\rm E}}{\sqrt{\theta_1^2 + \theta_2^2 / q^2}} \right)^{\gamma' - 1} ,    (1)
where γ′ is the radial power-law slope (γ′ = 2 corresponds to isothermal), θE
is the Einstein radius, and q is the axis ratio of the elliptical isodensity contours.
The light distributions of the lens galaxy and the source galaxy are described by
an elliptical Sérsic profile,
I_{\rm S}(\theta_1, \theta_2) = I_{\rm s} \exp\left\{ -k \left[ \left( \frac{\sqrt{\theta_1^2 + \theta_2^2 / q_{\rm L}^2}}{R_{\rm eff}} \right)^{1/n_{\rm sersic}} - 1 \right] \right\} ,    (2)
where Is is the amplitude, k is a constant such that Reff is the effective radius, qL
is the minor-to-major axis ratio, and nsersic is the Sérsic index [39]. For the PSFs,
we use six different PSF structures, including three real Hubble Space Telescope
(HST) PSFs generated by TinyTim [40] and corrected with real HST imaging [15],
and three adaptive-optics (AO) PSFs reconstructed from ground-based Keck
AO imaging [41,42,43]. Three example images are shown in Figure 1.
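To make these components concrete, the following is a minimal lenstronomy sketch of rendering a single mock image with a power-law mass profile (Eq. 1) and elliptical Sérsic light profiles (Eq. 2). The pixel grid, noise levels, and parameter values are illustrative placeholders rather than our actual settings, a simple Gaussian PSF stands in for the real HST/AO PSFs, and the lensed quasar point images are omitted for brevity.

```python
import lenstronomy.Util.simulation_util as sim_util
from lenstronomy.Data.imaging_data import ImageData
from lenstronomy.Data.psf import PSF
from lenstronomy.LensModel.lens_model import LensModel
from lenstronomy.LightModel.light_model import LightModel
from lenstronomy.ImSim.image_model import ImageModel

# Illustrative image grid and noise settings: 224x224 pixels, pixel scale,
# exposure time, and background rms (not the values used for our dataset).
kwargs_data = sim_util.data_configure_simple(224, 0.05, 1000, 0.01)
data_class = ImageData(**kwargs_data)

# Placeholder Gaussian PSF; the paper uses real HST (TinyTim) and Keck AO PSFs.
psf_class = PSF(psf_type='GAUSSIAN', fwhm=0.1, pixel_size=0.05)

# Elliptical power-law mass profile (Eq. 1).
lens_model = LensModel(lens_model_list=['EPL'])
kwargs_lens = [{'theta_E': 1.2, 'gamma': 2.0, 'e1': 0.1, 'e2': -0.05,
                'center_x': 0.0, 'center_y': 0.0}]

# Elliptical Sérsic light profiles (Eq. 2) for the source and the lens galaxy.
source_model = LightModel(light_model_list=['SERSIC_ELLIPSE'])
kwargs_source = [{'amp': 10.0, 'R_sersic': 0.3, 'n_sersic': 1.5,
                  'e1': 0.05, 'e2': 0.0, 'center_x': 0.05, 'center_y': 0.02}]
lens_light_model = LightModel(light_model_list=['SERSIC_ELLIPSE'])
kwargs_lens_light = [{'amp': 20.0, 'R_sersic': 0.8, 'n_sersic': 4.0,
                      'e1': 0.1, 'e2': -0.05, 'center_x': 0.0, 'center_y': 0.0}]

# Render the mock lensed image (lens light + lensed source, convolved with the PSF).
image_model = ImageModel(data_class, psf_class, lens_model_class=lens_model,
                         source_model_class=source_model,
                         lens_light_model_class=lens_light_model)
image = image_model.image(kwargs_lens=kwargs_lens, kwargs_source=kwargs_source,
                          kwargs_lens_light=kwargs_lens_light)
```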
We split the whole simulated dataset of 31,200 images into a training set of
27,000 images, a validation set of 3,000 images, and a test set of 1,200 images.
We rescale each image to 3 × 224 × 224 and normalize the pixel values in each
color channel by the means [0.485, 0.456, 0.406] and the standard deviations
[0.229, 0.224, 0.225] of the datasets. Each image has eight target variables to
be predicted in this task: the Einstein radius θE, the ellipticities e1 and e2, the
radial power-law slope γ′, the coordinates of the mass center θ1 and θ2, the
effective radius Reff, and the Sérsic index nsersic.
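A minimal torchvision sketch of this preprocessing step is shown below; the transform pipeline is our assumption of a typical implementation, not code taken from the released repository.

```python
from torchvision import transforms

# Rescale to 3 x 224 x 224 and normalize each color channel
# with the per-channel means and standard deviations quoted above.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),  # HxWxC PIL image -> CxHxW float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```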
2.2 Models
We use the Vision Transformer (ViT) as the main model for our image multi-regression
task of strong-lensing parameter estimation. Inspired by the original
Transformer model [31] for natural language processing tasks, Google Research
proposed the ViT models [32] for computer vision tasks. In this paper,
we leverage the base-sized ViT model (ViT-Base), which was pre-trained on the
ImageNet-21k dataset and fine-tuned on the ImageNet 2012 dataset [44].
Taking advantage of transfer learning, we start with the pre-trained ViT-Base
model downloaded from HuggingFace's Transformers library [45], and replace
the last layer with a fully connected layer whose number of outputs matches the
number of target variables in our regression task. The ViT model we use thus
has 85,814,036 trainable parameters, a patch size of 16, a depth of 12, and 12
attention heads.
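A minimal sketch of this adaptation is shown below. The LensViT wrapper and the choice to emit 2 × 8 = 16 outputs (a parameter estimate plus a log-variance per target, as required by the loss in Eq. 3) are our assumptions about the head layout, and the checkpoint name follows the standard HuggingFace ViT-Base release rather than the authors' exact configuration.

```python
import torch.nn as nn
from transformers import ViTModel

class LensViT(nn.Module):
    """Hypothetical ViT-Base regression head: 8 lensing parameters + 8 log-variances."""

    def __init__(self, num_targets: int = 8):
        super().__init__()
        # ViT-Base (patch size 16, depth 12, 12 attention heads), pre-trained backbone.
        self.backbone = ViTModel.from_pretrained("google/vit-base-patch16-224")
        hidden_size = self.backbone.config.hidden_size  # 768 for ViT-Base
        # Replace the classification head with a fully connected regression layer.
        self.head = nn.Linear(hidden_size, 2 * num_targets)

    def forward(self, pixel_values):
        outputs = self.backbone(pixel_values=pixel_values)
        cls_token = outputs.last_hidden_state[:, 0]        # [CLS] token embedding
        y_hat, s_hat = self.head(cls_token).chunk(2, dim=-1)
        return y_hat, s_hat                                # estimates and log-variances
```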
Alongside the ViT model, we also train a ResNet152 model [46] on the same
task as a comparison between ViT and a classic benchmark CNN-based model.
We leverage the pre-trained ResNet152 model from the torchvision package
[47] and modify its last layer accordingly for our multi-regression purpose.
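A corresponding sketch for the ResNet152 baseline might look as follows; as with the ViT head, the 16-output layer reflects our assumed head layout, and newer torchvision versions prefer the weights= argument over pretrained=True.

```python
import torch.nn as nn
from torchvision import models

# Pre-trained ResNet152 with its final fully connected layer swapped out
# for a regression head (8 parameter estimates + 8 log-variances, assumed).
resnet = models.resnet152(pretrained=True)
resnet.fc = nn.Linear(resnet.fc.in_features, 2 * 8)
```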
For regression tasks, the log-likelihood can be written as a Gaussian log-likelihood
[48]. Thus, for our task of K targets, we use the negative log-likelihood
as the loss function [20]:
\mathrm{Loss}_n = -\mathcal{L}(y_n, \hat{y}_n, \hat{s}_n) = \frac{1}{2} \sum_{k=1}^{K} \left( e^{-\hat{s}_{n,k}} \, \lVert y_{n,k} - \hat{y}_{n,k} \rVert^2 + \hat{s}_{n,k} + \ln 2\pi \right) ,    (3)
where (yn, ŷn, ŝn) are the (target, parameter estimation, uncertainty estimation)
for the n-th sample, and (yn,k, ŷn,k, ŝn,k) are the (target, parameter estimation,
uncertainty estimation) for the n-th sample of the k-th target. We note that, in
practice, working with the log-variance ŝn = ln σ̂n² instead of the variance σ̂n²
improves the numerical stability of training.
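A direct PyTorch transcription of Eq. (3) is sketched below, under the assumption that the network returns the parameter estimates ŷ and log-variances ŝ as separate tensors; the function name and the batch reduction (mean over samples) are our choices.

```python
import math
import torch

def gaussian_nll_loss(y, y_hat, s_hat):
    """Negative Gaussian log-likelihood of Eq. (3), with s_hat = ln(sigma_hat^2).

    y, y_hat, s_hat: tensors of shape (batch_size, K) for K regression targets.
    """
    per_target = torch.exp(-s_hat) * (y - y_hat) ** 2 + s_hat + math.log(2.0 * math.pi)
    # Sum over the K targets, then average over the batch.
    return 0.5 * per_target.sum(dim=-1).mean()
```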