ESPNN: a novel Electronic Stopping Power neural-network code built on the IAEA
stopping power database. I. Atomic targets
F. Bivort Haiek,1 A.M.P. Mendez,2 C.C. Montanari,2 and D.M. Mitnik2, a)
1)Minería de Datos y Descubrimiento del Conocimiento,
Universidad de Buenos Aires, Argentina.
2)Instituto de Astronomía y Física del Espacio, CONICET and Universidad de
Buenos Aires, Argentina.
(Dated: 16 December 2022)
The International Atomic Energy Agency (IAEA) stopping power database is a highly
valued public resource compiling most of the experimental measurements published
over nearly a century. The database, accessible to the global scientific community, is
continuously updated and has been extensively employed in theoretical and experimental
research for more than thirty years. This work aims to employ machine learning
algorithms on the 2021 IAEA database to predict accurate electronic stopping
power cross sections for any ion and target combination in a wide range of
incident energies. Unsupervised machine learning methods are applied to clean the
database in an automated manner. These techniques purge the data by removing
suspicious outliers and old isolated values. A large portion of the remaining data is
used to train a deep neural network, while the rest is set aside, constituting the test
set. The present work considers only collisional systems with atomic targets. The
first version of the espnn (electronic stopping power neural-network code), openly
available to users, is shown to yield predicted values in excellent agreement with the
experimental results of the test set.
a) Electronic mail: dmitnik@df.uba.ar
arXiv:2210.10950v2 [physics.atm-clus] 14 Dec 2022
I. INTRODUCTION
At the end of 2015, the Nuclear Data Section of the International Atomic Energy Agency [1]
(IAEA) inherited the monumental work done by Paul [2-7]. He collected about 1000
experimental stopping power measurements made in multiple laboratories worldwide, including
publications from as early as 1928. He kick-started his database project in 1990 at the
University of Linz, and it has been available to the scientific community since then. The IAEA
assumed the responsibility of maintaining, updating, and disseminating this data collection [6],
which included tables, figures, and comparisons of the published stopping data for ions in
atomic targets, compounds, and new materials of technological interest. An overview of the
database contents can be found in the review of Montanari and Dimitriou [8].
The stopping power is the mean energy loss per unit path length of the projectile in
many collisional processes. Calculating the electronic stopping power involves determining
the probabilities that the target system occupies any electronic state different from the initial
one due to the transfer of energy from the ion to the target electrons. Several reviews on
this subject are available in the literature [9-11]. Different methods and semiempirical codes
freely available online are linked and scrutinized on the IAEA stopping power website [12].
The most widely used code is certainly srim, which is based on a semiempirical method
developed by Ziegler [13]. As reported in their work [14], it reproduces 64% of the data with an
overall accuracy of 5%. Notably, the latest version of srim includes measurements only
up until 2013. Essential differences between the srim predictions and new measurements
have been reported since then [15-18].
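For reference, the quantities mentioned above can be written compactly. These are the standard definitions (not equations reproduced from this paper), with n the atomic number density and rho the mass density of the target:

\[
  S(E) = -\left\langle \frac{dE}{dx} \right\rangle ,
  \qquad
  \varepsilon(E) = \frac{S(E)}{n},
  \qquad
  S_m(E) = \frac{S(E)}{\rho},
\]

where S is the stopping power per unit path length, ε the stopping cross section per atom, and S_m the mass stopping power; the database entries discussed below are reported in one of these conventions.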
The present work is the first of a series of publications where we design a robust and
general model to accurately calculate the electronic stopping power in different target materials
and over an extended energy range. To fulfill this task, we developed a machine learning
(ML) model based on a clustering technique and a deep neural-network (NN) method.
The IAEA database was built by gathering published articles from diverse authors; hence,
in its original form, it is not standardized. The data are presented in various units and
formats. Depending on the ion-target system, the experimental values are given as stopping
power per unit length or as cross sections per unit mass or per atom. The database contains
tens of thousands of input values for mono-elemental targets alone. As preliminary work, we
devoted significant efforts to reorganizing the database, unifying the units, and arranging
the data in a standard (csv) format, which enabled easy and quick access to the compiled
data. The first task consisted of removing curated data that lie outside the general trend. Purging
these data by hand would require a considerable amount of work and is not recommended. Instead,
we developed an unsupervised-machine-learning-based method to clean up the database
by implementing a filtering algorithm and a cluster analysis. This clustering technique,
called dbscan, identifies outlier values and determines which data to keep in cases of
inconsistent overlap. The cleaning procedure is shown schematically in the left dashed
box of Fig. 1.
FIG. 1. Graphic scheme of the electronic stopping power with neural-network (espnn) model. See
the text for details.
As illustrated in Fig. 1, the cleaned database becomes the input of the second, supervised
machine-learning method, consisting of a deep neural network. This network has many basic
units (neurons) arranged in layers. Each neuron receives input signals from the previous
layer, processes them through a weighted non-linear function, and transmits the resulting
output to the neurons of the next layer. The neural network is given known inputs
(projectiles, targets, energies) and known outputs (the corresponding experimental
stopping power). The training is performed by adjusting the weights, minimizing the differences
between the final processed output of the network (predictions) and the experimental
results. In this way, the model can accurately reproduce the experimental values and, hopefully,
predict new results in cases not included in the training procedure (the test set).
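As an illustration of this supervised step only, the sketch below trains a small fully connected network with a mean-absolute-error loss; the feature encoding (projectile, target, energy), layer sizes, and hyperparameters are placeholders chosen for the example and do not reflect the actual espnn architecture described in Sec. III.

import torch
from torch import nn

# Placeholder features: e.g., projectile atomic number, target atomic number, log10(energy).
# The label is the experimental stopping power for that data point.
model = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()  # mean absolute error between predictions and experimental values

def train_step(features, stopping_power):
    # One weight update: reduce the difference between network output and experiment.
    optimizer.zero_grad()
    prediction = model(features).squeeze(-1)
    loss = loss_fn(prediction, stopping_power)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random placeholder data (a batch of 32 points):
features = torch.randn(32, 3)
labels = torch.rand(32)
print(train_step(features, labels))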
The resulting model and code espnn (electronic stopping power with neural-network) are
presented in this work. In this first article, we report results obtained only for atomic
targets. Our results show excellent accuracy according to different error metrics, such
as the MAPE (Mean Absolute Percentage Error) and the MAE (Mean Absolute Error).
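These metrics follow their usual definitions over the N points of the test set, with y_i the experimental value and \hat{y}_i the network prediction:

\[
  \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\bigl|y_i-\hat{y}_i\bigr|,
  \qquad
  \mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i-\hat{y}_i}{y_i}\right| .
\]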
Section II describes the machine learning (ML) methods employed to clean the
database, showing a few examples of the original data and the remaining final input
employed in the network training. In Section III, the deep neural-network architecture is
described. An overview of the training, validation, and test sets is presented; some details
concerning the training procedure are also discussed. Section IV presents selected results
and the analysis of distinct error metrics, which give a sense of the method's accuracy. In
Appendix B, we provide instructions for installing and using the espnn code.
II. DATABASE CLEANSING
A. Database review
The database (updated in December 2021) consists of 60173 experimental measurements,
representing stopping power values for 1491 ion-target combinations of 49 projectiles
colliding with 283 targets, across the energy range 10^-4 to 10^4 MeV/amu and ion and target
atomic masses from 1 to 240 amu. Concerning only the mono-elemental targets, there are
706 collision cases composed of 44 different projectiles and 73 targets, resulting in 36544
experimental data points. The experimental data summarize 1190 publications covering
the period 1928-2021. The reorganization of the database allows for extensive
statistical analysis. Such an analysis would indicate, for instance, the lack of data in certain
energy regions and over- or under-measured systems, which may guide experimental groups
regarding the need for new measurements. A detailed analysis of the database's current status
will be presented in a forthcoming article.
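As a minimal sketch of such a statistical survey, assuming the curated data are stored in a csv file with (hypothetical) columns ion, target, and energy in MeV/amu, the coverage per collision system and per energy decade could be counted as follows; the file and column names are illustrative, not those of the actual espnn data files.

import numpy as np
import pandas as pd

# Hypothetical file and column names; the real curated csv layout may differ.
df = pd.read_csv("stopping_atomic_targets.csv")

# Number of experimental points per ion-target combination:
counts = df.groupby(["ion", "target"]).size().sort_values(ascending=False)
print(counts.head())   # most-measured systems
print(counts.tail())   # least-measured systems

# Coverage per energy decade (energy in MeV/amu):
decade = np.floor(np.log10(df["energy"]))
print(df.groupby(decade).size())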
The raw data collected come from several publications; the results for the same ion-target
collision may show significant discrepancies (much larger than the errors of the individual
sets of experimental data). Cleaning the database is therefore crucial; well-thought-out criteria
must be adopted for selecting the most reliable values, accompanied by careful examination of the
outcome. The immense difficulty of scrutinizing such an extensive dataset is resolved
by implementing a straightforward ML-based method. First, the dbscan classification
algorithm is used to group similar results into clusters and to identify outliers, i.e., values
suspected of being erroneous. Then, an algorithm is developed to assess clusters and outliers
by introducing different criteria for overlapping and isolated data. In the following, we briefly
explain these algorithms.
B. The dbscan algorithm
Clustering algorithms are attractive for the task of class identification in spatial databases.
However, most well-known unsupervised classification algorithms suffer severe drawbacks
when applied to large spatial databases: elements in the same cluster may not share
enough similarities, or performance may be poor. Also, while partition-based algorithms
such as K-means may be easy to understand and implement in practice, they have
no notion of outliers; all points are assigned to a cluster, even if they do not belong to
any. Moreover, anomalous points draw the cluster's centroid toward them, making it more
difficult to identify them as anomalous. In contrast, density-based clustering locates
regions of high density that are separated by regions of low density (in this context, density
is defined as the number of points within a specified radius).
In this work, we used the density-based spatial clustering of applications with noise
algorithm, dbscan [19,20]. The main idea behind this technique is the following: for a set of
points in some space, the algorithm groups together values that are closely packed (points
with many nearby neighbors), marking as outliers points that lie alone in low-density regions
(whose nearest neighbors are too far away). It requires two input parameters: the radius
of the neighborhood, ε, and the minimum number of reachable points (within a distance ε),
N_min, required to form a dense region. All points belonging to an ε-neighborhood configure a
cluster. All points not reachable from any other point are outliers or noise points. dbscan
is significantly effective at discovering clusters of arbitrary shapes, which makes it a standard
clustering algorithm and one of the most cited in the scientific literature. This method has
numerous advantages: it does not require one to specify the number of clusters in the data a
priori (as opposed to K-means, for example); it can find arbitrarily shaped clusters;
it has a notion of noise, is robust to outliers, and is mostly insensitive to the data order.
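A minimal sketch of this outlier-flagging step with scikit-learn's DBSCAN is shown below on synthetic data; the two-dimensional feature space (log-energy, log-stopping-power) and the values of ε (eps) and N_min (min_samples) are illustrative assumptions, not the parameters actually tuned for the IAEA database.

import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for one ion-target system: a smooth trend in
# (log-energy, log-stopping-power) space plus a few displaced, suspicious points.
rng = np.random.default_rng(0)
log_energy = np.sort(rng.uniform(-2.0, 2.0, 200))
log_stopping = np.log10(1.0 / (1.0 + 10.0**(-log_energy))) + rng.normal(0.0, 0.02, 200)
log_stopping[[10, 120]] += 0.5  # two isolated values far from the general trend

X = np.column_stack([log_energy, log_stopping])

# eps is the neighborhood radius; min_samples is the number of neighbors needed
# to form a dense region. Points labelled -1 are not reachable from any dense
# region and are flagged as outliers/noise.
labels = DBSCAN(eps=0.15, min_samples=5).fit_predict(X)
n_outliers = np.sum(labels == -1)
print(f"{n_outliers} of {len(X)} points flagged as outliers")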