kafe2 - a Modern Tool for Model Fitting in Physics Lab Courses
Johannes Gäßler(1), Günter Quast(1), Daniel Savoiu(2), Cedric Verstege(1)
(1) Karlsruhe Institute of Technology (KIT), (2) Hamburg University
October 25, 2022
Abstract
Fitting models to measured data is one of the standard tasks in the natural sciences, typically
addressed early on in physics education in the context of laboratory courses, in which statistical methods
play a central role in analysing and interpreting experimental results. The increased emphasis placed on
such methods in modern school curricula, together with the availability of powerful free and open-source
software tools geared towards scientific data analysis, form an excellent premise for the development
of new teaching concepts for these methods at the university level. In this article, we present kafe2,
a new tool developed at the Faculty of Physics at the Karlsruhe Institute of Technology, which has
been used in physics laboratory courses for several years. Written in the Python programming language
and making extensive use of established numerical and optimization libraries, kafe2 provides simple
but powerful interfaces for numerically fitting model functions to data. The tools provided allow for
fine-grained control over many aspects of the fitting procedure, including the specification of the input
data and of arbitrarily complex model functions, the construction of complex uncertainty models, and
the visualization of the resulting confidence intervals of the model parameters.
1 Model fitting in lab courses: pitfalls and limitations
A primary goal of data analysis in physics is to check whether an assumed model adequately describes
the experimental data within the scope of their uncertainties. A regression procedure is used to infer the
causal relationships between independent and dependent variables, the abscissa values x_i and the ordinate
values y_i. The dependency is typically represented by a functional relation y_i = f(x_i; {p_k}) with a set of
model parameters {p_k}. Both the y_i and the x_i are the results of measurements and hence are affected
by unavoidable measurement uncertainties that must be taken into account when determining confidence
intervals for the model parameters. It has become a common standard in the natural sciences to determine
confidence intervals for the parameter values based on a maximum-likelihood estimation (MLE). In practice,
one typically uses the negative logarithm of the likelihood function (NLL), which is minimised by means of
numerical methods.
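
As an illustration of the general approach (a minimal sketch with invented data values, not kafe2's internal implementation), the NLL of a straight-line model with independent Gaussian uncertainties can be minimised numerically, for example with scipy.optimize:

    import numpy as np
    from scipy.optimize import minimize

    # invented example data: abscissa values, ordinate values and Gaussian ordinate uncertainties
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
    sigma_y = np.full_like(y, 0.3)

    def model(x, a, b):
        return a * x + b                       # straight-line model

    def nll(params):
        # negative log-likelihood for independent Gaussian uncertainties
        a, b = params
        residuals = y - model(x, a, b)
        return 0.5 * np.sum((residuals / sigma_y) ** 2 + np.log(2.0 * np.pi * sigma_y ** 2))

    result = minimize(nll, x0=[1.0, 0.0])      # numerical minimisation of the NLL
    a_hat, b_hat = result.x

If the Gaussian uncertainties do not depend on the parameters, the logarithmic term is constant and the minimisation reduces to the method of least squares discussed below.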
In the basic laboratory courses, parameter estimation is traditionally based on the method of least
squares (LSQ), which in many cases is equivalent to the NLL method. For example, fitting a straight line
to data affected by statistical Gaussian uncertainties on the ordinate values y_i is an analytically solvable problem
and the necessary calculations can be carried out by means of a pocket calculator or a simple spreadsheet
application. For simplicity, non-linear problems are often converted to linear ones by transforming the
y values. However, the Gaussian nature of the uncertainties is lost in this process, which is why the
transformed problem only gives accurate results if the uncertainties are sufficiently small. Simple error
propagation is often used to transform the fitted parameters like the slope or y-intercept to the model
parameters of interest. As a result, the parameter uncertainties obtained in this way are no longer Gaussian,
and it is non-trivial to determine sensible confidence intervals. When the independent variables are also
affected by measurement uncertainties, or when relative uncertainties with respect to the true values are
present, this traditional approach reaches its limits.
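
For the special case of a straight line with independent Gaussian uncertainties on the ordinate values only, the least-squares estimates and their Gaussian uncertainties follow from standard textbook formulas. A minimal sketch with invented data:

    import numpy as np

    # invented example data with independent Gaussian uncertainties on y
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
    sigma_y = np.array([0.3, 0.3, 0.4, 0.3, 0.5])

    w = 1.0 / sigma_y**2                                  # weights
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2

    slope = (S * Sxy - Sx * Sy) / delta
    intercept = (Sxx * Sy - Sx * Sxy) / delta
    sigma_slope = np.sqrt(S / delta)                      # Gaussian parameter uncertainties
    sigma_intercept = np.sqrt(Sxx / delta)

As soon as the uncertainties affect the abscissa values, depend on the parameters, or the model is non-linear in its parameters, no such closed form exists and numerical methods are required.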
Such simple analytical methods are very limited in their general applicability to real-world problems. For
instance, the presence of uncertainties on the abscissa values in addition to those on the ordinate values,
as they would be encountered when measuring a simple current-voltage characteristic of an electronic
component, already results in a problem that can generally no longer be solved analytically. Unfortunately,
most of the widely used numerical tools for parameter estimation provide only limited support for such
non-linear problems, even though they occur frequently in practice.
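
A common numerical strategy for such problems, often called the effective-variance method, projects the abscissa uncertainties onto the ordinate via the local slope of the model, sigma_eff,i² = sigma_y,i² + (f'(x_i) sigma_x,i)²; because f' depends on the current parameter estimates, the problem becomes non-linear even for a straight-line model. A minimal sketch with invented data (not necessarily identical to kafe2's internal treatment):

    import numpy as np
    from scipy.optimize import minimize

    # invented example data with independent uncertainties on both axes
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.2, 4.1, 5.8, 8.3, 9.9])
    sigma_x = np.full_like(x, 0.1)
    sigma_y = np.full_like(y, 0.3)

    def chi2(params):
        a, b = params
        # project the x uncertainties onto y via the model slope df/dx = a
        sigma_eff2 = sigma_y**2 + (a * sigma_x)**2
        return np.sum((y - (a * x + b))**2 / sigma_eff2)

    result = minimize(chi2, x0=[1.0, 0.0])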
Prior to drawing any conclusions on model parameters, the most important task in physics is to check
the validity of a model hypothesis assumed to describe a set of measurement data. The full task therefore
consists of the following steps:
1. carefully quantifying the uncertainties of all input measurements;
2. defining a model hypothesis for the measured data;
3. testing whether the measurements are compatible with the model hypothesis;
4. if so, determining the model parameters, e.g. the slope and intercept of a straight line.
It is precisely the hypothesis testing in step 3 that drives progress in scientific understanding by providing
the relation between measurements and theoretical models. A simple example for such a hypothesis test is
the χ² test, which results directly from the frequently used method of least squares. A test for the validity
of the model assumption should definitely be performed before drawing any conclusions from the values and
uncertainties of the fitted model parameters. Unfortunately, many of the common tools come with a default
setting that assumes the correctness of the given parameterization and scales the parameter uncertainties
so that the model perfectly describes the data within the parameter uncertainties so determined. This
behaviour can usually be switched off by choosing appropriate options, but in practice this is often neglected.
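
For a least-squares fit with Gaussian uncertainties, such a test compares the χ² value at the minimum with a χ² distribution with n_data − n_par degrees of freedom; a very small p-value indicates that the model hypothesis should be rejected rather than the parameter uncertainties rescaled. A sketch with invented numbers:

    from scipy import stats

    chi2_min = 11.3      # value of the least-squares cost function at the minimum (invented)
    ndf = 8              # number of data points minus number of fitted parameters (invented)

    p_value = stats.chi2.sf(chi2_min, df=ndf)     # probability of chi2 >= chi2_min under the hypothesis
    print(f"chi2/ndf = {chi2_min / ndf:.2f}, p-value = {p_value:.3f}")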
The parameters of a complex model can be highly correlated. In this case it is no longer sufficient to
consider the distribution of just a single parameter. Instead the shared distribution of multiple parameters
needs to be considered, for example when conducting hypothesis testing. Correlations between model
parameters can also be problematic for numerical optimisation. For this reason strategies for choosing
appropriate parameterizations to reduce these correlations also need to be addressed. It is then indispensable
that the correlations are also shown when presenting the fit results. Ideally the correlation between two
model parameters can be expressed with a simple correlation coefficient. However, if non-linear models are
fitted, this is often not sufficient for the evaluation of the result; in this case the corresponding confidence
regions should be determined and presented as contour graphs for pairs of parameters.
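
In the Gaussian (linear) case the correlation coefficient follows directly from the parameter covariance matrix returned by the fit, ρ_ij = V_ij / sqrt(V_ii V_jj). A minimal sketch with an invented covariance matrix for two parameters:

    import numpy as np

    # invented parameter covariance matrix, e.g. for (slope, intercept)
    cov = np.array([[ 0.040, -0.012],
                    [-0.012,  0.090]])

    sigma = np.sqrt(np.diag(cov))               # parameter uncertainties
    corr = cov / np.outer(sigma, sigma)         # correlation matrix
    rho = corr[0, 1]                            # correlation coefficient of the two parameters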
The distinction between "statistical" and "systematic" uncertainties leads to many misunderstandings
among students. For example, the systematic error of a measuring instrument becomes a statistical one
if several measuring instruments of the same type are used. A much more suitable approach is to differentiate
whether uncertainties affect all measured values or groups of measured values equally or whether
they are independent. This approach, however, requires dealing with the covariance matrix of measurement
uncertainties and constructing it from the individual uncertainties on a problem-by-problem basis.
Unfortunately, there are very few simple tools that allow a full covariance matrix to be considered in the
fitting process. None of the tools commonly used in physics lab courses allow students to take into account
correlated uncertainties of both the abscissa and ordinate values.
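
As a concrete illustration of this approach (with invented numbers), consider n measurements that each carry an independent uncertainty and in addition share a fully correlated uncertainty, e.g. from a common calibration; the covariance matrix is then the sum of a diagonal part and a fully populated part:

    import numpy as np

    n = 5
    sigma_indep = np.full(n, 0.3)    # independent uncertainty of each measurement
    sigma_corr = 0.2                 # uncertainty common to all measurements (fully correlated)

    # total covariance matrix: independent part on the diagonal plus fully correlated part
    cov = np.diag(sigma_indep**2) + sigma_corr**2 * np.ones((n, n))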
While there are tools with which the requirements listed here could be partially implemented, these do
not provide the simplicity needed for undergraduate physics education. To address the issues described
above it became necessary to write a tool of our own, kafe2, an open-source Python package designed to
provide a flexible Python interface for the estimation of model parameters from measured data.
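
A minimal usage sketch, following the introductory examples in the kafe2 documentation (data values invented; argument names may differ slightly between versions):

    from kafe2 import XYContainer, Fit, Plot

    # invented example data with independent uncertainties on both axes
    xy_data = XYContainer(x_data=[1.0, 2.0, 3.0, 4.0],
                          y_data=[2.3, 4.2, 7.5, 9.4])
    xy_data.add_error(axis='x', err_val=0.1)
    xy_data.add_error(axis='y', err_val=0.4)

    def linear_model(x, a=1.0, b=0.0):      # model function; defaults serve as start values
        return a * x + b

    fit = Fit(data=xy_data, model_function=linear_model)
    fit.do_fit()       # numerical parameter estimation
    fit.report()       # parameter values, uncertainties and correlations

    plot = Plot(fit)
    plot.plot()        # data, model curve and uncertainty band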
2 The Open-Source Python Package kafe2
Following an earlier prototype package, kafe, development continued with kafe2 [1]. Both tools make use
of contemporary methods for the visualization and evaluation of measurement data and are based on freely
available, open-source software that is also used in scientific contexts and is thus relevant for later professional
practice.
2.1 Implementation
Due to its widespread use in scientific communities as well as in the burgeoning professional field of data
science, Python [2] was chosen as the programming language for kafe2. As a high-level language with
particular emphasis on clear and intuitive syntax, and convenient features such as dynamic typing and