ESPNN: a novel Electronic Stopping Power neural-network code built on the IAEA
stopping power database. I. Atomic targets
F. Bivort Haiek,1 A.M.P. Mendez,2 C.C. Montanari,2 and D.M. Mitnik2, a)
1)Minería de Datos y Descubrimiento del Conocimiento,
Universidad de Buenos Aires, Argentina.
2)Instituto de Astronomía y Física del Espacio, CONICET and Universidad de
Buenos Aires, Argentina.
(Dated: 16 December 2022)
The International Atomic Energy Agency (IAEA) stopping power database is a highly
valued public resource compiling most of the experimental measurements published
over nearly a century. The database, accessible to the global scientific community, is
continuously updated and has been extensively employed in theoretical and experimental
research for more than thirty years. This work aims to employ machine learning
algorithms on the 2021 IAEA database to predict accurate electronic stopping
power cross sections for any ion and target combination in a wide range of
incident energies. Unsupervised machine learning methods are applied to clean the
database in an automated manner. These techniques purge the data by removing
suspicious outliers and old isolated values. A large portion of the remaining data is
used to train a deep neural network, while the rest is set aside, constituting the test
set. The present work considers only collisional systems with atomic targets. The
first version of the espnn (electronic stopping power neural-network code), openly
available to users, is shown to yield predicted values in excellent agreement with the
experimental results of the test set.
a) Electronic mail: dmitnik@df.uba.ar
arXiv:2210.10950v2 [physics.atm-clus] 14 Dec 2022
I. INTRODUCTION
At the end of 2015, the Nuclear Data Section of the International Atomic Energy Agency [1]
(IAEA) inherited the monumental work done by Paul [2-7]. He collected about 1000
experimental stopping power measurements made in multiple laboratories worldwide, including
publications from as early as 1928. He kick-started his database project in 1990 at the
University of Linz, and it has been available to the scientific community since then. The IAEA
assumed the responsibility of maintaining, updating, and disseminating this data collection [6],
which included tables, figures, and comparisons of the published stopping data for ions in
atomic targets, compounds, and new materials of technological interest. An overview of the
database contents can be found in the review of Montanari and Dimitriou [8].
The stopping power is the mean energy loss per unit path length of the projectile in
many collisional processes. Calculating the electronic stopping power involves determining
the probabilities that the target system occupies any electronic state different from the initial
one due to the transfer of energy from the ion to the target electrons. Several reviews on
this subject are available in the literature [9-11]. Different methods and semiempirical codes
freely available online are linked and scrutinized on the IAEA stopping power website [12].
The most widely used code is certainly srim, which is based on a semiempirical method
developed by Ziegler [13]. As reported in their work [14], it reproduces 64% of the data with an
overall accuracy of 5%. Notably, the latest version of srim includes measurements only
up until 2013. Essential differences between the srim predictions and new measurements
have been reported since then [15-18].
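For reference, the quantities mentioned above can be written compactly. These are the standard definitions (not equations reproduced from this paper), with n the atomic number density and rho the mass density of the target:

\[
  S(E) = -\left\langle \frac{dE}{dx} \right\rangle ,
  \qquad
  \varepsilon(E) = \frac{S(E)}{n},
  \qquad
  S_m(E) = \frac{S(E)}{\rho},
\]

where S is the stopping power per unit path length, ε the stopping cross section per atom, and S_m the mass stopping power; the database entries discussed below are reported in one of these conventions.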
The present work is the first of a series of publications where we design a robust and
general model to accurately calculate the electronic stopping power in different target materials
and over an extended energy range. To fulfill this task, we developed a machine learning
(ML) model based on a clustering technique and a deep neural-network (NN) method.
The IAEA database was built by gathering published articles from diverse authors; hence,
in its original form, it is not standardized. The data are presented in various units and
formats. Depending on the ion-target system, the experimental values are given as stopping
power per unit length or as cross sections per unit mass or per atom. The database contains
tens of thousands of input values for mono-elemental targets alone. As preliminary work, we
devoted significant efforts to reorganizing the database, unifying the units, and arranging
the data in a standard (csv) format, which enabled easy and quick access to the compiled
data. The first task consisted of removing curated data that lie outside the general trend. Purging
these data by hand would require a considerable amount of work and is not recommended. Instead,
we developed an unsupervised-machine-learning-based method to clean up the database
by implementing a filtering algorithm and a cluster analysis. This clustering technique,
called dbscan, identifies outlier values and determines which data to keep in cases of
inconsistent overlap. The cleaning procedure is shown schematically in the left dashed
box of Fig. 1.
FIG. 1. Graphic scheme of the electronic stopping power with neural-network (espnn) model. See
the text for details.
As illustrated in Fig. 1, the cleaned database becomes the input of the second, supervised
machine-learning method, consisting of a deep neural network. This network has many basic
units (neurons) arranged in layers. Each neuron receives input signals from the previous
layer, processes them through a weighted non-linear function, and transmits the resulting
output to the neurons of the next layer. The neural network is given known inputs
(projectiles, targets, energies) and known outputs (the corresponding experimental
stopping power). The training is performed by adjusting the weights, minimizing the differences
between the final processed output of the network (predictions) and the experimental
results. In this way, the model can accurately reproduce the experimental values and, hopefully,
predict new results in cases not included in the training procedure (the test set).
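As an illustration of this supervised step only, the sketch below trains a small fully connected network with a mean-absolute-error loss; the feature encoding (projectile, target, energy), layer sizes, and hyperparameters are placeholders chosen for the example and do not reflect the actual espnn architecture described in Sec. III.

import torch
from torch import nn

# Placeholder features: e.g., projectile atomic number, target atomic number, log10(energy).
# The label is the experimental stopping power for that data point.
model = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()  # mean absolute error between predictions and experimental values

def train_step(features, stopping_power):
    # One weight update: reduce the difference between network output and experiment.
    optimizer.zero_grad()
    prediction = model(features).squeeze(-1)
    loss = loss_fn(prediction, stopping_power)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random placeholder data (a batch of 32 points):
features = torch.randn(32, 3)
labels = torch.rand(32)
print(train_step(features, labels))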
The resulting model and code espnn (electronic stopping power with neural-network) are
presented in this work. In this first article, we report results obtained only for atomic
targets. Our results show excellent accuracy according to different error metrics, such
as the MAPE (Mean Absolute Percentage Error) and the MAE (Mean Absolute Error).
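These metrics follow their usual definitions over the N points of the test set, with y_i the experimental value and \hat{y}_i the network prediction:

\[
  \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\bigl|y_i-\hat{y}_i\bigr|,
  \qquad
  \mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i-\hat{y}_i}{y_i}\right| .
\]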
Section II describes the machine learning (ML) methods employed to clean the
database, showing a few examples of the original data and the remaining final input
employed in the network training. In Section III, the deep neural-network architecture is
described. An overview of the training, validation, and test sets is presented; some details
concerning the training procedure are also discussed. Section IV presents selected results
and the analysis of distinct error metrics, which give a sense of the method's accuracy. In
Appendix B, we provide instructions for installing and using the espnn code.
II. DATABASE CLEANSING
A. Database review
The database (updated in December 2021) consists of 60173 experimental measurements,
representing stopping power values for 1491 ion-target combinations of 49 projectiles
colliding with 283 targets, across the energy range 10^-4 to 10^4 MeV/amu and ion and target
atomic masses from 1 to 240 amu. Concerning only the mono-elemental targets, there are
706 collision cases composed of 44 different projectiles and 73 targets, resulting in 36544
experimental data points. The experimental data summarize 1190 publications covering
the period 1928-2021. The reorganization of the database allows for extensive
statistical analysis. Such an analysis would indicate, for instance, the lack of data in certain
energy regions and over- or under-measured systems, which may guide experimental groups
regarding the need for new measurements. A detailed analysis of the database's current status
will be presented in a forthcoming article.
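As a minimal sketch of such a statistical survey, assuming the curated data are stored in a csv file with (hypothetical) columns ion, target, and energy in MeV/amu, the coverage per collision system and per energy decade could be counted as follows; the file and column names are illustrative, not those of the actual espnn data files.

import numpy as np
import pandas as pd

# Hypothetical file and column names; the real curated csv layout may differ.
df = pd.read_csv("stopping_atomic_targets.csv")

# Number of experimental points per ion-target combination:
counts = df.groupby(["ion", "target"]).size().sort_values(ascending=False)
print(counts.head())   # most-measured systems
print(counts.tail())   # least-measured systems

# Coverage per energy decade (energy in MeV/amu):
decade = np.floor(np.log10(df["energy"]))
print(df.groupby(decade).size())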
The raw data collected come from several publications; the results for the same ion-target
collision may show significant discrepancies (much larger than the errors of the individual
sets of experimental data). Cleaning the database is therefore crucial; well-thought-out criteria
must be adopted for selecting the most reliable values, accompanied by careful examination of the
outcome. The immense difficulty of scrutinizing such an extensive dataset is resolved
by implementing a straightforward ML-based method. First, the dbscan classification
algorithm is used to group similar results into clusters and to identify outliers, i.e., values
suspected of being erroneous. Then, an algorithm is developed to assess clusters and outliers
by introducing different criteria for overlapping and isolated data. In the following, we briefly
explain these algorithms.
B. The dbscan algorithm
Clustering algorithms are attractive for the task of class identification in spatial databases.
However, most well-known unsupervised classification algorithms suffer severe drawbacks
when applied to large spatial databases: elements in the same cluster may not share
enough similarities, or performance may be poor. Also, while partition-based algorithms
such as K-means may be easy to understand and implement in practice, they have
no notion of outliers; all points are assigned to a cluster, even if they do not belong to
any. Moreover, anomalous points draw the cluster's centroid toward them, making it more
difficult to identify them as anomalous. In contrast, density-based clustering locates
regions of high density that are separated by regions of low density (in this context, density
is defined as the number of points within a specified radius).
In this work, we used the density-based spatial clustering of applications with noise
algorithm, dbscan [19,20]. The main idea behind this technique is the following: for a set of
points in some space, the algorithm groups together values that are closely packed (points
with many nearby neighbors), marking as outliers points that lie alone in low-density regions
(whose nearest neighbors are too far away). It requires two input parameters: the radius
of the neighborhood, ε, and the minimum number of reachable points (within a distance ε),
N_min, required to form a dense region. All points belonging to an ε-neighborhood configure a
cluster. All points not reachable from any other point are outliers or noise points. dbscan
is significantly effective at discovering clusters of arbitrary shapes, which makes it a standard
clustering algorithm and one of the most cited in the scientific literature. This method has
numerous advantages: it does not require one to specify the number of clusters in the data a
priori (as opposed to K-means, for example); it can find arbitrarily shaped clusters;
it has a notion of noise, is robust to outliers, and is mostly insensitive to the data order.
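A minimal sketch of this outlier-flagging step with scikit-learn's DBSCAN is shown below on synthetic data; the two-dimensional feature space (log-energy, log-stopping-power) and the values of ε (eps) and N_min (min_samples) are illustrative assumptions, not the parameters actually tuned for the IAEA database.

import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for one ion-target system: a smooth trend in
# (log-energy, log-stopping-power) space plus a few displaced, suspicious points.
rng = np.random.default_rng(0)
log_energy = np.sort(rng.uniform(-2.0, 2.0, 200))
log_stopping = np.log10(1.0 / (1.0 + 10.0**(-log_energy))) + rng.normal(0.0, 0.02, 200)
log_stopping[[10, 120]] += 0.5  # two isolated values far from the general trend

X = np.column_stack([log_energy, log_stopping])

# eps is the neighborhood radius; min_samples is the number of neighbors needed
# to form a dense region. Points labelled -1 are not reachable from any dense
# region and are flagged as outliers/noise.
labels = DBSCAN(eps=0.15, min_samples=5).fit_predict(X)
n_outliers = np.sum(labels == -1)
print(f"{n_outliers} of {len(X)} points flagged as outliers")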