kafe2 - a Modern Tool for Model Fitting in Physics Lab Courses
Johannes Gäßler(1), Günter Quast(1), Daniel Savoiu(2), Cedric Verstege(1)
(1) Karlsruhe Institute of Technology (KIT), (2) Hamburg University
October 25, 2022
Abstract
Fitting models to measured data is one of the standard tasks in the natural sciences, typically
addressed early on in physics education in the context of laboratory courses, in which statistical methods
play a central role in analysing and interpreting experimental results. The increased emphasis placed on
such methods in modern school curricula, together with the availability of powerful free and open-source
software tools geared towards scientific data analysis, forms an excellent basis for the development
of new teaching concepts for these methods at the university level. In this article, we present kafe2,
a new tool developed at the Faculty of Physics at the Karlsruhe Institute of Technology, which has
been used in physics laboratory courses for several years. Written in the Python programming language
and making extensive use of established numerical and optimization libraries, kafe2 provides simple
but powerful interfaces for numerically fitting model functions to data. The tools provided allow for
fine-grained control over many aspects of the fitting procedure, including the specification of the input
data and of arbitrarily complex model functions, the construction of complex uncertainty models, and
the visualization of the resulting confidence intervals of the model parameters.
1 Model fitting in lab courses: pitfalls and limitations
A primary goal of data analysis in physics is to check whether an assumed model adequately describes
the experimental data within the scope of their uncertainties. A regression procedure is used to infer the
causal relationships between independent and dependent variables, the abscissa values x_i and the ordinate
values y_i. The dependency is typically represented by a functional relation y_i = f(x_i; {p_k}) with a set of
model parameters {p_k}. Both the y_i and the x_i are the results of measurements and hence are affected
by unavoidable measurement uncertainties that must be taken into account when determining confidence
intervals for the model parameters. It has become a common standard in the natural sciences to determine
confidence intervals for the parameter values based on a maximum-likelihood estimation (MLE). In practice,
one typically uses the negative logarithm of the likelihood function (NLL), which is minimised by means of
numerical methods.
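For independent Gaussian uncertainties, the NLL reduces, up to a parameter-independent constant, to half the sum of squared normalised residuals. A minimal sketch of its numerical minimisation with SciPy, using illustrative synthetic data (kafe2 itself builds on such libraries; this is not its API):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative synthetic data: straight line with Gaussian ordinate uncertainties
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
sigma_y = np.full_like(y, 0.2)

def model(x, a, b):
    return a * x + b

def nll(params):
    # Negative log-likelihood for independent Gaussian uncertainties;
    # up to a parameter-independent constant this is half the chi-square sum.
    a, b = params
    residuals = (y - model(x, a, b)) / sigma_y
    return 0.5 * np.sum(residuals ** 2)

result = minimize(nll, x0=[1.0, 0.0])
a_hat, b_hat = result.x  # best-fit slope and intercept
```

For this Gaussian case the minimiser coincides with the weighted least-squares solution; the advantage of the NLL formulation is that it generalises to non-Gaussian and more complex uncertainty models.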
In the basic laboratory courses, parameter estimation is traditionally based on the method of least
squares (LSQ), which in many cases is equivalent to the NLL method. For example, fitting a straight line
to data affected by statistical Gaussian uncertainties on the ordinate values y_i is an analytically solvable problem
and the necessary calculations can be carried out by means of a pocket calculator or a simple spreadsheet
application. For simplicity, non-linear problems are often converted to linear ones by transforming the
y values. However, the Gaussian nature of the uncertainties is lost in this process, which is why the
transformed problem only gives accurate results if the uncertainties are sufficiently small. Simple error
propagation is often used to transform the fitted parameters, like the slope or y intercept, to the model
parameters of interest. As a result, the parameter uncertainties obtained in this way are no longer Gaussian,
and it is non-trivial to determine sensible confidence intervals. When the independent variables are also
affected by measurement uncertainties, or when relative uncertainties with respect to the true values are
present, this traditional approach reaches its limits.
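The linearisation approach described above can be sketched for an exponential law y = A·exp(k·x): fitting log(y) with the closed-form straight-line formulas yields k as the slope and A from the intercept. The example below uses noiseless data for clarity; with real data, the transformed uncertainties σ_y/y are only approximately Gaussian, which is exactly the limitation discussed in the text (all names are illustrative):

```python
import numpy as np

# Analytic least-squares fit of a straight line y = a*x + b
# (closed-form solution for equal Gaussian ordinate uncertainties)
def linear_lsq(x, y):
    x_mean, y_mean = x.mean(), y.mean()
    a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b = y_mean - a * x_mean
    return a, b

# Linearising an exponential law y = A * exp(k*x) by fitting log(y)
x = np.linspace(0.0, 2.0, 6)
A_true, k_true = 3.0, 1.2
y = A_true * np.exp(k_true * x)

k_fit, logA_fit = linear_lsq(x, np.log(y))
A_fit = np.exp(logA_fit)  # simple error propagation would act on logA_fit
```

The back-transformation A = exp(logA_fit) is where the Gaussian character of the parameter uncertainty is lost: a symmetric confidence interval on logA_fit maps to an asymmetric one on A.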
Such simple analytical methods are very limited in their general applicability to real-world problems. For
instance, the presence of uncertainties on the abscissa values in addition to those on the ordinate values,
as would be encountered when measuring a simple current-voltage characteristic of an electronic
component, already results in a problem that can generally no longer be solved analytically. Unfortunately,
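One common numerical treatment of abscissa uncertainties, shown here purely as an illustration and not as kafe2's implementation, is the effective-variance approximation: the x uncertainty is projected onto the ordinate via the local model slope, and because the resulting variance depends on the fit parameters, the logarithmic normalisation term of the NLL must be kept. A sketch with SciPy on synthetic straight-line data:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative straight-line data with uncertainties on both axes
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.8, 4.1, 5.9, 8.2])
sigma_x = np.full_like(x, 0.1)
sigma_y = np.full_like(y, 0.3)

def nll_xy(params):
    a, b = params
    # Effective-variance approximation: project the abscissa uncertainty
    # onto the ordinate via the local model slope (df/dx = a for a line).
    sigma_eff2 = sigma_y ** 2 + (a * sigma_x) ** 2
    residuals2 = (y - (a * x + b)) ** 2 / sigma_eff2
    # The log term cannot be dropped here, since sigma_eff2 depends on a.
    return 0.5 * np.sum(residuals2 + np.log(sigma_eff2))

result = minimize(nll_xy, x0=[2.0, 0.0])
a_hat, b_hat = result.x
```

Unlike the pure least-squares case, this NLL is no longer quadratic in the parameters, so no analytic solution exists in general and numerical minimisation becomes mandatory.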
arXiv:2210.12768v1 [physics.ed-ph] 23 Oct 2022