Vector graphics extraction and analysis of electrical resistance data
in Nature volume 586, pages 373–377 (2020)
J. J. Hamlin
Department of Physics, University of Florida, Gainesville, FL 32611, USA
(Dated: September 2022)
In this paper, I present an analysis of the electrical resistance graphs in Ref. [1], which reported
the discovery of room temperature superconductivity in a carbonaceous sulfur hydride and was
subsequently retracted on September 26th, 2022. I show that, over a single temperature interval,
the electrical resistance data can be decomposed into at least two signals of differing digital precision,
thus raising questions concerning the methods used to obtain the published data. Since the raw data-
files for the electrical resistance measurements have not been made available, in order to perform
this analysis, I have developed a set of Python scripts to extract the data-points with high precision
from the internal structure of the vector graphics image files. I describe the data extraction method.
Example code and the resulting electrical resistance vs temperature data-files are made available in
public repositories.
INTRODUCTION
In October 2020, a paper was published reporting the
discovery of room temperature superconductivity in a
carbonaceous sulfur hydride (CSH) under high pressure
conditions [1]. The claim was based primarily on three
pieces of evidence: diamagnetic transitions in the ac mag-
netic susceptibility, zero electrical resistance, and sup-
pression of the transition in an applied magnetic field.
On Monday, September 26th, 2022, a retraction was published,
with the notice focusing on the validity of the background
subtraction method applied to the ac magnetic suscepti-
bility data. The retraction notice notes that the authors
maintain that the raw data provide strong support for
the main claims of the original paper.
In this work, I analyze the electrical resistance
measurements presented in Fig. 1a and Fig. 2b of Ref. [1].
Hereafter, I refer to these figures as CSH-Fig1a and
CSH-Fig2b, respectively.
I find artifacts similar to those that appear in the mag-
netic susceptibility data [2]. In the case of the magnetic
susceptibility data, two of the authors of Ref. [1] stated
that such artifacts are a consequence of the user-defined
background that was subtracted from the data [3]. No
such background subtraction for the published electrical
resistance data was indicated in Ref. [1].
VECTOR-BASED DATA EXTRACTION
The authors of Ref. [1] have not made the electrical
resistance raw data files publicly available. In such
situations, interested parties are left to analyze the published
graphs. Typically, this is done using bitmap-based data
extraction methods implemented by programs such as
DataThief [4] and WebPlotDigitizer [5]. In this method,
the axis scales are manually calibrated and the data
points are selected either by hand or, in some cases, using
automated methods that function with varying levels of
success. Bitmap-based data extraction faces several lim-
itations. The precision with which the data points can
be extracted is limited by the resolution of the image file.
In favorable cases, the image resolution may be approx-
imately 600 pixels-per-inch. However, figures are often
included in formats such as jpg, where image compres-
sion may introduce artifacts that blur the edges of data
points. Most importantly, data points that are visually
covered by other data points will be invisible and thus im-
possible to extract. This is often the case for graphs that
contain a large number of densely spaced data points.
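As a rough illustration of the resolution limit, the following Python sketch estimates how many data units a single pixel covers along one axis. The axis span and printed axis length are hypothetical values chosen for illustration and are not taken from Ref. [1]. The function name is likewise an assumption of this sketch.

```python
def bitmap_resolution(axis_span, axis_length_inches, pixels_per_inch=600):
    """Return the number of data units covered by one pixel along an axis."""
    return axis_span / (axis_length_inches * pixels_per_inch)


# Hypothetical axis: 100 K of temperature drawn across 2.5 inches at 600 ppi.
step = bitmap_resolution(axis_span=100.0, axis_length_inches=2.5)
print(f"{step:.4f} K per pixel")
```

Under these assumed values, no bitmap-based extraction could place a point more finely than a few hundredths of a kelvin, before accounting for compression artifacts or overlapping markers.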
Many journals adopt policies that graphics should be
included in vector format when possible. Common vector
graphics formats include pdf, eps, and svg (scalable
vector graphics). A simple way to determine whether a
graph in a paper is a vector-based image is to use the
text-selection tool to try to select some of the text in the
graph. If you are able to select the text, it is likely a
vector-based image. A vector format image will remain
sharp no matter how much you zoom in on it. Rather
than describing graphics as a grid of pixels, these for-
mats can include instructions describing a set of paths to
draw. The full svg instruction specification is available
at Ref. [6]. For example, a square data point might be
described by specifying the coordinates of a set of four
lines. Figure 1 shows examples of the path instructions
for (a) a single red, Run 2 data point in CSH-Fig1a and
(b) a single red 9 T data point in CSH-Fig2b, after con-
verting the corresponding pages of the article pdf into
svg format using pdf2svg [7]. Typically, the coordinates
associated with these paths are stored with substantially
higher precision than a corresponding bitmap equivalent.
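As a minimal sketch of how such path instructions can be read, the following Python example parses the path elements of a small svg fragment and computes the center of each marker from its vertices. The svg fragment, its coordinates, and the function names are hypothetical. They are not taken from Ref. [1] or from the scripts described in this paper. The sketch assumes paths that use only absolute M, L, and Z commands, and it ignores any transform attributes and the calibration from page coordinates to axis units.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical svg fragment: two overlapping square markers drawn as closed paths.
# The coordinates are invented for illustration only.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg">
  <path style="fill:rgb(100%,0%,0%)" d="M 10.125 20.5 L 12.125 20.5 L 12.125 22.5 L 10.125 22.5 Z"/>
  <path style="fill:rgb(100%,0%,0%)" d="M 10.625 20.75 L 12.625 20.75 L 12.625 22.75 L 10.625 22.75 Z"/>
</svg>"""

# Matches signed decimal numbers, with an optional exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def path_vertices(d):
    """Return the (x, y) vertices of a path using only absolute M/L/Z commands."""
    values = [float(token) for token in NUMBER.findall(d)]
    return list(zip(values[0::2], values[1::2]))


def marker_centers(svg_text):
    """Return the mean vertex position of every path element, in page units."""
    root = ET.fromstring(svg_text)
    centers = []
    for path in root.iter(SVG_NS + "path"):
        vertices = path_vertices(path.get("d", ""))
        if not vertices:
            continue
        xs, ys = zip(*vertices)
        centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centers


centers = marker_centers(SVG_TEXT)
for x, y in centers:
    print(x, y)
```

Both markers are recovered even though the second one overlaps the first on the page, because each path is stored as a separate element in the file.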
In principle then, it is possible to analyze the instruc-
tions in a vector image file to extract the coordinates of
the data points. Working from the vector image data
also recovers points that are completely covered by
other data points and are therefore not directly
visible when viewing the image in a pdf
arXiv:2210.10766v1 [cond-mat.supr-con] 19 Oct 2022