

Vector graphics extraction and analysis of electrical resistance data

in Nature volume 586, pages 373–377 (2020)

J. J. Hamlin1

1Department of Physics, University of Florida, Gainesville, FL 32611, USA

(Dated: September 2022)

In this paper, I present an analysis of the electrical resistance graphs in Ref. [1], which reported the discovery of room temperature superconductivity in a carbonaceous sulfur hydride and was subsequently retracted on September 26th, 2022. I show that, over a single temperature interval, the electrical resistance data can be decomposed into at least two signals of differing digital precision, thus raising questions concerning the methods used to obtain the published data. Since the raw data files for the electrical resistance measurements have not been made available, in order to perform this analysis, I have developed a set of Python scripts to extract the data points with high precision from the internal structure of the vector graphics image files. I describe the data extraction method. Example code and the resulting electrical resistance vs temperature data files are made available in public repositories.

INTRODUCTION

In October 2020, a paper was published reporting the discovery of room temperature superconductivity in a carbonaceous sulfur hydride (CSH) under high pressure conditions [1]. The claim was based primarily on three pieces of evidence: diamagnetic transitions in the ac magnetic susceptibility, zero electrical resistance, and suppression of the transition in an applied magnetic field. On Monday, September 26th, 2022, a retraction was published, with the notice focusing on the validity of the background subtraction method applied to the ac magnetic susceptibility data. The retraction notice notes that the authors maintain that the raw data provide strong support for the main claims of the original paper.

In this work, I have analyzed the electrical resistance graphs in Fig. 1a and Fig. 2b of Ref. [1]. Hereafter, I refer to these figures as CSH-Fig1a and CSH-Fig2b, respectively. I find artifacts similar to those that appear in the magnetic susceptibility data [2]. In the case of the magnetic susceptibility data, two of the authors of Ref. [1] stated that such artifacts are a consequence of the user-defined background that was subtracted from the data [3]. No such background subtraction for the published electrical resistance data was indicated in Ref. [1].

VECTOR-BASED DATA EXTRACTION

The authors of Ref. [1] have not made the electrical resistance raw data files publicly available. In such situations, interested parties are left to analyze the published graphs. Typically, this is done using bitmap-based data extraction methods implemented by programs such as DataThief [4] and WebPlotDigitizer [5]. In this method, the axis scales are manually calibrated and the data points are selected either by hand or, in some cases, using automated methods that function with varying levels of success. Bitmap-based data extraction faces several limitations. The precision with which the data points can be extracted is limited by the resolution of the image file. In favorable cases, the image resolution may be approximately 600 pixels per inch. However, figures are often included in formats such as jpg, where image compression may introduce artifacts that blur the edges of data points. Most importantly, data points that are visually covered by other data points will be invisible and thus impossible to extract. This is often the case for graphs that contain a large number of densely spaced data points.
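As a rough illustration of the resolution limit described above, the following Python sketch estimates the size of one pixel in data units. The axis length (3 inches) and temperature span (300 K) are assumed values chosen only for illustration; they are not taken from Ref. [1] or from any particular figure.

```python
def pixel_size_in_data_units(axis_length_in, pixels_per_inch, data_span):
    """Return the width of a single pixel along one axis, in data units."""
    n_pixels = axis_length_in * pixels_per_inch
    return data_span / n_pixels


# Assumed example: a 3-inch-wide temperature axis spanning 300 K at 600 ppi.
step = pixel_size_in_data_units(axis_length_in=3.0,
                                pixels_per_inch=600,
                                data_span=300.0)
print(f"one pixel corresponds to about {step:.3f} K")
```

Under these assumptions, one pixel covers about one sixth of a kelvin, so a bitmap-based tool cannot resolve temperature differences much smaller than that.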

Many journals adopt policies that graphics should be included in vector format when possible. Common vector graphics formats include pdf, eps, and svg (scalable vector graphics). A simple way to determine whether a graph in a paper is a vector-based image is to use the text-selection tool to try to select some of the text in the graph. If you are able to select the text, it is likely a vector-based image. A vector format image will remain sharp no matter how much you zoom in on it. Rather than describing graphics as a grid of pixels, these formats can include instructions describing a set of paths to draw. The full svg instruction specification is available at Ref. [6]. For example, a square data point might be described by specifying the coordinates of a set of four lines. Figure 1 shows examples of the path instructions for (a) a single red, Run 2 data point in CSH-Fig1a and (b) a single red 9 T data point in CSH-Fig2b, after converting the corresponding pages of the article pdf into svg format using pdf2svg [7]. Typically, the coordinates associated with these paths are stored with substantially higher precision than a corresponding bitmap equivalent. In principle then, it is possible to analyze the instructions in a vector image file to extract the coordinates of the data points. Extraction of the data points from the vector image data allows extraction of points that are completely covered by other data points and are therefore not directly visible when viewing the image in a pdf

arXiv:2210.10766v1 [cond-mat.supr-con] 19 Oct 2022

(a)

<path style="stroke:none;fill-rule:evenodd;fill:rgb(94.117737%,32.156372%,32.548523%);fill-opacity:1;" d="M 208.292969 217.472656 C 208.292969 218.171875 207.726562 218.738281 207.03125 218.738281 C 206.332031 218.738281 205.765625 218.171875 205.765625 217.472656 C 205.765625 216.777344 206.332031 216.210938 207.03125 216.210938 C 207.726562 216.210938 208.292969 216.777344 208.292969 217.472656 "/>

(b)

<path style="fill-rule:evenodd;fill:rgb(92.941284%,13.33313%,14.117432%);fill-opacity:1;stroke-width:0.139;stroke-linecap:round;stroke-linejoin:round;stroke:rgb(52.941895%,9.01947%,9.803772%);stroke-opacity:1;stroke-miterlimit:10;" d="M 0.00045777 -0.00143164 L 1.340302 1.338412 L 2.684052 -0.00143164 L 1.340302 -1.337369 Z M 0.00045777 -0.00143164 " transform="matrix(1,0,0,-1,463.726105,81.529818)"/>

FIG. 1. Example of instructions used to draw data points in SVG files. (a) shows an example of a red, Run 2 data point from CSH-Fig1a, while (b) shows an example of a red 9 T data point from CSH-Fig2b. Note the different structure of the two sets of instructions: (b) uses a transform matrix, while (a) does not. Different graphs may describe the data in slightly different ways, but the instructions can be straightforwardly parsed to extract the coordinates of each data point.

of a manuscript. This is not possible with a bitmap image. While not common, vector-based data extraction has been implemented and shown to be reliable in at least one other study [8]. The technique has the potential to aid in future data-mining efforts. Databases of experimental information, such as The Pauling File [9], may consider the feasibility of adding vector-extracted data to their collections.
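To make the parsing idea concrete, here is a minimal Python sketch that reads the two path strings of Fig. 1 and estimates the center of each marker. It is only an illustration, not the code used in this work: it relies on a regular expression from the standard library rather than a full svg parser, it assumes the paths contain only absolute M, L, C, and Z commands (as in Fig. 1), and it averages all distinct coordinate pairs, including Bezier control points, as a simple stand-in for the centroid. The names `path_points` and `marker_center` are hypothetical helpers.

```python
import re

NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

# Path data copied from Fig. 1(a) and Fig. 1(b).
PATH_A = ("M 208.292969 217.472656 C 208.292969 218.171875 207.726562 218.738281 "
          "207.03125 218.738281 C 206.332031 218.738281 205.765625 218.171875 "
          "205.765625 217.472656 C 205.765625 216.777344 206.332031 216.210938 "
          "207.03125 216.210938 C 207.726562 216.210938 208.292969 216.777344 "
          "208.292969 217.472656")
PATH_B = ("M 0.00045777 -0.00143164 L 1.340302 1.338412 L 2.684052 -0.00143164 "
          "L 1.340302 -1.337369 Z M 0.00045777 -0.00143164")
MATRIX_B = (1, 0, 0, -1, 463.726105, 81.529818)  # transform attribute of (b)


def path_points(d):
    """Return the distinct (x, y) pairs found in a path string.

    Assumes absolute M, L, C and Z commands only, so that the numbers
    always come in x, y pairs.
    """
    values = [float(v) for v in NUMBER.findall(d)]
    pairs = list(zip(values[0::2], values[1::2]))
    return list(dict.fromkeys(pairs))  # drop repeated points, keep order


def marker_center(d, matrix=(1, 0, 0, 1, 0, 0)):
    """Mean of the distinct points, mapped through an svg matrix(a,b,c,d,e,f)."""
    pts = path_points(d)
    x = sum(p[0] for p in pts) / len(pts)
    y = sum(p[1] for p in pts) / len(pts)
    a, b, c, d_, e, f = matrix
    # svg convention: x' = a*x + c*y + e, y' = b*x + d*y + f
    return (a * x + c * y + e, b * x + d_ * y + f)


center_a = marker_center(PATH_A)
center_b = marker_center(PATH_B, MATRIX_B)
print(center_a, center_b)
```

The resulting centers are in svg page coordinates. Converting them to temperature and resistance still requires the axis calibration described in the workflow.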

The data extraction workflow employed in this work is as follows:

1. Convert an entire page of the pdf into svg format using the pdf2svg utility [7].

2. Identify the hexadecimal color for each data set.

3. Use the svgpathtools Python library [10] to extract the svg path objects that have the correct color and compute the centroid of the vertices of each path (i.e., the center of each data point) or, in the case of line plots, the coordinates of each line segment.

4. Remove data points from the series that come from the figure legends or other figures on the same page.

5. Analyze the locations of the tick marks in the svg file in order to calibrate the scale.
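Steps 2, 3, and 5 of the workflow above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the notebooks referenced in this paper: it uses only the standard library instead of svgpathtools, the svg text is a small synthetic stand-in for a converted page, and the fill color and tick positions are invented. The helper names `fill_of`, `centroid`, and `linear_map` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

# Synthetic stand-in for a page converted by pdf2svg (not real data).
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg">
  <path style="fill:rgb(100%,0%,0%);" d="M 9 49 L 11 49 L 11 51 L 9 51 Z"/>
  <path style="fill:rgb(100%,0%,0%);" d="M 59 24 L 61 24 L 61 26 L 59 26 Z"/>
  <path style="fill:rgb(0%,0%,100%);" d="M 29 39 L 31 39 L 31 41 L 29 41 Z"/>
</svg>"""


def fill_of(style):
    """Return the fill value of an inline style string, or None."""
    for item in style.split(";"):
        key, _, value = item.partition(":")
        if key.strip() == "fill":
            return value.strip()
    return None


def centroid(d):
    """Mean of the path vertices; assumes absolute M, L and Z commands only."""
    v = [float(s) for s in NUMBER.findall(d)]
    xs, ys = v[0::2], v[1::2]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def linear_map(p0, v0, p1, v1):
    """Map an svg coordinate to a data value using two tick marks."""
    slope = (v1 - v0) / (p1 - p0)
    return lambda p: v0 + slope * (p - p0)


# Steps 2 and 3: keep only paths with the chosen color, then take centroids.
root = ET.fromstring(SVG_TEXT)
centers = [centroid(el.get("d")) for el in root.iter(SVG_NS + "path")
           if fill_of(el.get("style", "")) == "rgb(100%,0%,0%)"]

# Step 5 (invented ticks): x = 10 is 0 K and x = 110 is 200 K;
# y = 50 is 0 and y = 0 is 1 (svg y increases downward, so the slope is negative).
to_temperature = linear_map(10.0, 0.0, 110.0, 200.0)
to_resistance = linear_map(50.0, 0.0, 0.0, 1.0)
data = [(to_temperature(x), to_resistance(y)) for x, y in centers]
print(data)
```

Step 4 is omitted here. In practice it could be handled by discarding centers that fall outside the svg coordinate range spanned by the axis tick marks.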

Full details of the data extraction method and example code are available in extensively commented JupyterLab notebooks hosted on GitHub [11] and archived on Zenodo [12]. The program Inkscape [13] can also be useful for initial exploration of the structure of the svg file. Inkscape’s pdf to svg conversion assigns a path identification (ID) property to every path, which can be helpful for identifying the parts of the svg code that correspond to certain objects in the graph. One can open the svg in Inkscape, click on an item to find the path ID, and then open the svg in a text editor and search for that path ID to view the corresponding code.
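The same lookup can be done programmatically instead of in a text editor. The short Python sketch below searches an svg document for an element with a given ID using the standard library. The svg text and the ID values are made up for illustration, and `find_by_id` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up svg fragment; real files produced by Inkscape are far larger.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg">
  <path id="path1234" style="fill:#ff0000" d="M 0 0 L 1 1"/>
  <path id="path5678" style="fill:#0000ff" d="M 2 2 L 3 3"/>
</svg>"""


def find_by_id(svg_text, element_id):
    """Return the first element whose id attribute matches, or None."""
    root = ET.fromstring(svg_text)
    for el in root.iter():
        if el.get("id") == element_id:
            return el
    return None


element = find_by_id(SVG_TEXT, "path5678")
print(element.get("style"), element.get("d"))
```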

VALIDATION

In order to validate the extraction method, I include here an analysis of the extracted data for Fig. 3 of Ref. [14] (Be22Re-Fig3), for which I have access to the original raw data files. The paper was recently published in Physical Review B, and Be22Re-Fig3 was included as a vector graphics image. This figure presents normalized electrical resistivity vs temperature measured at several pressures for the material Be22Re. Superconducting transitions are present in each measurement. For this analysis, I focus on the measurement at 15 GPa because, at that pressure, the data at both high and low temperature are mostly obscured by measurements at other pressures. Bitmap-based data extraction would not be able to recover data in those regions.

Figure 2 shows a comparison of the extracted data and the original raw data. The raw resistance data have been normalized to the value at 10 K, consistent with the axis labeling in Be22Re-Fig3. The insets of Fig. 2 show zoomed comparisons of the extracted and raw data. Very small differences between the two data sets are caused by the limited precision of the svg path data.
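A comparison of this kind can be sketched as follows. The numbers are invented for illustration and are not the Be22Re measurements; the pairing of each extracted point with the raw point nearest in temperature is an assumption made for this sketch, and `normalize_at` and `max_abs_difference` are hypothetical helper names.

```python
def normalize_at(temps, values, t_ref):
    """Divide all values by the value measured closest to t_ref."""
    i_ref = min(range(len(temps)), key=lambda i: abs(temps[i] - t_ref))
    return [v / values[i_ref] for v in values]


def max_abs_difference(extracted, raw):
    """Largest |difference| after pairing each extracted point with
    the raw point nearest in temperature."""
    worst = 0.0
    for t, r in extracted:
        _, r_raw = min(raw, key=lambda p: abs(p[0] - t))
        worst = max(worst, abs(r - r_raw))
    return worst


# Invented raw data: temperature in K, resistance in ohm.
raw_T = [2.0, 5.0, 10.0, 20.0]
raw_R = [0.0, 0.5, 2.0, 2.2]
raw_norm = list(zip(raw_T, normalize_at(raw_T, raw_R, t_ref=10.0)))

# Invented "extracted" points, already in normalized units.
extracted = [(2.001, 0.0002), (4.999, 0.2499), (10.0, 1.0001), (20.002, 1.0998)]

worst = max_abs_difference(extracted, raw_norm)
print(f"largest deviation: {worst:.4f}")
```

A small largest deviation, comparable to the step size of the svg coordinates once converted to data units, would be consistent with differences caused only by the limited precision of the svg path data.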

PRECISION OF EXTRACTION

While a vector image plot in a publication has higher precision than a corresponding bitmap, the precision is not unlimited. The precision of data in a raw data file is set not only by the number of digits recorded in the file but also by the number of bits in the analog to digital
