

ARTIFICIAL ASMR: A CYBER-PSYCHOLOGICAL APPROACH

Zexin Fang⋆, Bin Han⋆, C. Clark Cao†, Hans D. Schotten⋆‡

⋆RPTU Kaiserslautern-Landau, †Lingnan University, ‡German Research Center of Artificial Intelligence

ABSTRACT

The popularity of Autonomous Sensory Meridian Response (ASMR) has skyrocketed
over the past decade, but scientific studies on what exactly triggers the ASMR
effect remain few and immature. The most commonly acknowledged trigger is that
ASMR clips typically provide rich semantic information. With our attention
caught by the common acoustic patterns in ASMR audios, we investigate the
correlation between the cyclic features of audio signals and their
effectiveness in triggering ASMR effects. We take a cyber-psychological
approach that combines signal processing, artificial intelligence, and
experimental psychology. With it, we are able to quantify ASMR-related acoustic
features and thereby synthesize ASMR clips that have random cyclic patterns but
deliver no identifiable scenarios to the audience. These clips proved effective
in triggering ASMR effects.

Index Terms—ASMR, auditory, cyclostationary, GAN

1. INTRODUCTION

Autonomous Sensory Meridian Response (ASMR), a term coined in the 2010s, is
widely used to describe an intriguing phenomenon in which specific visual and
auditory stimuli trigger tingling sensations accompanied by positive emotions
as well as a feeling of deep relaxation [1]. With its blooming cultural
popularity and growing commercial market [2], ASMR has attracted emerging
research interest [3], and its cognitive effect has been verified by
significant behavioral and neurological evidence [4,5]. However, to the best of
our knowledge, existing work has only identified some semantic elements that
trigger the ASMR effect [6], while the acoustic features of auditory triggers
remain poorly understood. Although it is generally claimed that low-frequency
sounds are widely observed in effective ASMR audios, this claim does not
capture their repetitive character.

This work is supported in part by the German Federal Min-

istry of Education and Research within the project Open6GHub

(16KISK003K/16KISK004), in part by the European Commission within the

Horizon Europe project Hexa-X (101015956), in part by the Network for

the Promotion of Young Scientists at RPTU Kaiserslautern-Landau within

the project A-SIREN (Individual Research Funding 2022-2), and in part by

the Lam Woo Research Fund at Lingnan University (F871223). B. Han

(bin.han@rptu.de) is the corresponding author.

Inspired by a study that reveals the correlation between the
trypophobia-triggering effect and cyclostationary features of images [7], we
suspect that the time-frequency and cyclic features of audios may also play an
important role in triggering the ASMR experience. In this paper, we verify this
correlation with a cyber-psychological approach that combines various
techniques of signal processing, artificial intelligence, and experimental
psychology. More specifically, we apply the short-time Fourier transform (STFT)
and cyclic spectral analysis to recorded ASMR audio clips to extract their
acoustic features. Combining generative adversarial networks (GANs) with a
post-processing module, we are able to synthesize ASMR clips based on generated
cyclic patterns. We also design a psychological survey to evaluate the
effectiveness of both the recorded and the synthesized ASMR audios in
triggering the ASMR effect on humans, and to verify the correlation between
this effect and the selected acoustic features.

2. AUDIO DATA PROVISIONING

As the object of our study, we obtained off-the-shelf ASMR audios from
non-commercial online open sources. We collected four audios recorded from real
sounds of different origins: i) breathing, ii) mixing soft cream, iii) puffing
a spray, and iv) clicking a keyboard. Every audio was recorded in 16-bit stereo
at a sampling rate of 22.05 kHz and lasts about 1 h. We intentionally selected
these four sound types as our experiment objects because they are largely
distinct in cyclic and spectral features. For instance, keyboard-clicking clips
generally have dense cyclic patterns and more high-frequency components,
whereas breathing clips manifest sparse cyclic patterns with a low-pass
characteristic.

3. ANALYSIS AND FEATURE EXTRACTION

Spectrograms of the provisioned ASMR clips generally exhibit a low-pass
characteristic and cyclic patterns of broadband bursts, as exemplified in
Fig. 1. To further investigate the periodic patterns within the power spectral
density (PSD) of an audio signal $x$, the spectral correlation density (SCD)
and the cyclic coherence function (CCF) can be applied as suggested by [8]:

arXiv:2210.14321v3 [eess.AS] 5 Jul 2023

Fig. 1: Spectrogram of a pufﬁng spray audio clip

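$$S_{XX}(f,\alpha) = \mathrm{E}\left\{X\left(f+\tfrac{\alpha}{2}\right)X^{*}\left(f-\tfrac{\alpha}{2}\right)\right\} \tag{1}$$

$$C_{XX}(f,\alpha) = \frac{S_{XX}(f,\alpha)}{\sqrt{\mathrm{E}\left\{\left|X\left(f+\tfrac{\alpha}{2}\right)\right|^{2}\right\}\,\mathrm{E}\left\{\left|X\left(f-\tfrac{\alpha}{2}\right)\right|^{2}\right\}}} \tag{2}$$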

where $\alpha$ is the cyclic frequency that indicates the periodicity within
the spectrum, and $X(f)$ is the complex-valued spectrum of $x$ at frequency
$f$. For an $N$-sample signal frame $x$, $X$ can be estimated by the discrete
Fourier transform (DFT):

$$X(f) = \frac{1}{N}\sum_{k=0}^{N-1}\left[\sum_{n=0}^{N-1} x(n)\,e^{-j\frac{2\pi nk}{N}}\,\delta\left(2\pi f-\frac{2\pi k}{N}\right)\right]. \tag{3}$$
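As a rough illustration of Eqs. (1) and (2), the following standard-library Python sketch estimates the SCD and CCF on a discrete grid. It is not the authors' implementation. The names `dft`, `scd_ccf`, and `max_shift` are hypothetical, and so are three simplifications: the expectation is replaced by an average over non-overlapping frames, the cyclic frequency is restricted to $\alpha = 2d\,f_s/N$ so that $f \pm \alpha/2$ fall on DFT bins $k \pm d$, and bin indices wrap around modulo $N$.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(N^2) DFT of one frame, scaled by 1/N (the scaling is an assumption)."""
    n_len = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * math.pi * n * k / n_len) for n in range(n_len)) / n_len
        for k in range(n_len)
    ]


def scd_ccf(x, n_len, max_shift):
    """Estimate SCD and CCF as nested lists indexed [d][k].

    Assumption: alpha = 2 * d * fs / n_len, so f + alpha/2 and f - alpha/2
    correspond to DFT bins k + d and k - d (taken modulo n_len).
    The expectation is approximated by averaging over non-overlapping frames.
    """
    frames = [x[i:i + n_len] for i in range(0, len(x) - n_len + 1, n_len)]
    spectra = [dft(frame) for frame in frames]
    count = len(spectra)
    scd, ccf = [], []
    for d in range(max_shift + 1):
        scd_row, ccf_row = [], []
        for k in range(n_len):
            hi, lo = (k + d) % n_len, (k - d) % n_len
            s = sum(spec[hi] * spec[lo].conjugate() for spec in spectra) / count
            p_hi = sum(abs(spec[hi]) ** 2 for spec in spectra) / count
            p_lo = sum(abs(spec[lo]) ** 2 for spec in spectra) / count
            norm = math.sqrt(p_hi * p_lo)
            scd_row.append(s)
            ccf_row.append(s / norm if norm > 0 else 0j)
        scd.append(scd_row)
        ccf.append(ccf_row)
    return scd, ccf


# Toy input: white noise whose amplitude is modulated with a period of 8 samples.
rng = random.Random(0)
N = 32
x = [(1 + math.cos(2 * math.pi * n / 8)) * rng.gauss(0, 1) for n in range(N * 200)]
scd, ccf = scd_ccf(x, N, max_shift=3)
mean_coherence = [sum(abs(c) for c in row) / N for row in ccf]
```

In this toy case the modulation period of 8 samples corresponds to a cyclic frequency of 4 bins, i.e. `d = 2`, so `mean_coherence[2]` is expected to stand out clearly against `d = 1` and `d = 3`, while `d = 0` trivially yields a coherence of 1.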

The SCD and CCF of the spray-puffing sound clip in Fig. 1, for example, are
illustrated in Fig. 2. The figures show a distinguishing feature that is also
observed in other ASMR audios: narrow vertical stripes that are distributed
smoothly over a wide range in the $f$ domain, while being discrete and sparse
in the $\alpha$ domain, occurring only at low cyclic frequencies.

Fig. 2: SCD (left) and CCF (right) of the spray audio clip

While both SCD and CCF are bivariate, we further define two univariate
functions of $\alpha$ to characterize the cyclic behavior of an audio signal
$x$, namely $\bar{S}_{XX}(\alpha) = \sum_{f\in\mathcal{F}} S_{XX}(f,\alpha)$ and
$S^{\max}_{XX}(\alpha) = \max_{f\in\mathcal{F}} S_{XX}(f,\alpha)$, where
$\mathcal{F}$ is the set of $N$ frequency bins in $S_{XX}$. $\bar{S}_{XX}$ is
sensitive to broadband cyclic patterns in $x$, while $S^{\max}_{XX}$ is more
sensitive to narrowband cyclic components, as illustrated in Fig. 3.

Fig. 3: $\bar{S}_{XX}$ (left) and $S^{\max}_{XX}$ (right) of the spray audio clip
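The different sensitivity of the two univariate functions (the sum over $f$ and the maximum over $f$ of the SCD) can be seen on a tiny hand-made example. The sketch below is only illustrative: the function name `cyclic_profiles` and the toy matrix are hypothetical, and it assumes that the SCD has already been reduced to real magnitudes, since the text does not state how complex values are compared.

```python
def cyclic_profiles(scd_mag):
    """Collapse an SCD magnitude matrix scd_mag[a][k] over the frequency bins k.

    Returns (s_bar, s_max): the per-alpha sum and the per-alpha maximum.
    Assumption: scd_mag holds real, non-negative magnitudes.
    """
    s_bar = [sum(row) for row in scd_mag]
    s_max = [max(row) for row in scd_mag]
    return s_bar, s_max


# Toy matrix with three cyclic frequencies and eight frequency bins.
toy_scd = [
    [0.0] * 8,                                  # no cyclic component
    [0.5] * 8,                                  # weak but broadband component
    [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0],   # strong narrowband component
]
s_bar, s_max = cyclic_profiles(toy_scd)
print(s_bar)  # [0.0, 4.0, 3.0]
print(s_max)  # [0.0, 0.5, 3.0]
```

Here the sum peaks at the broadband row, whereas the maximum peaks at the narrowband row, which matches the qualitative behavior described above.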

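On top of $X(f)$, $\bar{S}_{XX}(\alpha)$, and $S^{\max}_{XX}(\alpha)$, we define
eight features: $\Phi_1(x) = \mathrm{Mean}(\bar{S}_{XX})$,
$\Phi_2(x) = \mathrm{Var}(\bar{S}_{XX})$, $\Phi_3(x) = G(\bar{S}_{XX})$,
$\Phi_4(x) = \mathrm{Mean}(S^{\max}_{XX})$, $\Phi_5(x) = \mathrm{Var}(S^{\max}_{XX})$,
$\Phi_6(x) = G(S^{\max}_{XX})$, $\Phi_7(x) = \mathrm{Var}(|X|)$, and
$\Phi_8(x) = G(|X|)$, where

$$G(S) = \frac{\sum_{i\in S}\sum_{j\in S}|i-j|}{2\,\|S\|_0\sum_{i\in S} i}$$

is the Gini coefficient of set $S$. Here, $\Phi_1$ and $\Phi_4$ assess the
overall cyclostationarity, $\Phi_2$ and $\Phi_5$ assess the $\alpha$-variance,
$\Phi_3$ and $\Phi_6$ reflect the $\alpha$-sparsity, $\Phi_7$ describes the
$f$-variance of the PSD, and $\Phi_8$ the $f$-sparsity of the PSD.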
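A minimal sketch of how the eight features could be computed from already available sequences is given below. The helper names `gini` and `features` are hypothetical. Two points are assumptions rather than statements of the source: $\|S\|_0$ is read as the number of elements in $S$ (which gives the usual Gini coefficient), and the variance is taken as the population variance.

```python
from statistics import fmean, pvariance


def gini(values):
    """Gini coefficient: sum of |i - j| over all ordered pairs / (2 * n * sum).

    Assumption: n is the number of elements; an all-zero input returns 0.0.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    pair_diff = sum(abs(a - b) for a in values for b in values)
    return pair_diff / (2 * n * total)


def features(s_bar, s_max, spec_mag):
    """Return the features phi1..phi8 from the two cyclic profiles and |X|."""
    return {
        "phi1": fmean(s_bar),
        "phi2": pvariance(s_bar),   # population variance (assumption)
        "phi3": gini(s_bar),
        "phi4": fmean(s_max),
        "phi5": pvariance(s_max),
        "phi6": gini(s_max),
        "phi7": pvariance(spec_mag),
        "phi8": gini(spec_mag),
    }


# A single spike is maximally sparse for four elements; a flat sequence is not sparse.
print(gini([0.0, 0.0, 0.0, 4.0]))  # 0.75
print(gini([1.0, 1.0, 1.0, 1.0]))  # 0.0
```

The Gini coefficient grows as the energy concentrates in fewer elements, which is why it serves as a sparsity measure in the $\alpha$ and $f$ domains.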

We sampled three 10 s clips from each recorded audio mentioned in Sec. 2 and
extracted the features $\Phi_1$–$\Phi_8$ of every clip. A 10 s stereo
white-noise signal sampled at 22.05 kSPS was also analyzed as a benchmark. The
results are listed in Tab. 1.
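The sampling step can be sketched as follows. The text does not say how the three clips were positioned within each recording, so the random, non-overlapping placement below is purely an assumption, as are the Gaussian distribution of the white noise and the hypothetical names `sample_clip_ranges` and `white_noise_stereo`.

```python
import random

FS = 22050          # sampling rate in samples per second (22.05 kSPS)
CLIP_SECONDS = 10   # clip duration in seconds


def sample_clip_ranges(total_samples, n_clips=3, seed=0):
    """Pick n_clips non-overlapping (start, end) sample ranges of 10 s each.

    Assumption: clips are drawn at random from a grid of 10 s slots.
    """
    clip_len = FS * CLIP_SECONDS
    slots = total_samples // clip_len
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(slots), n_clips))
    return [(slot * clip_len, (slot + 1) * clip_len) for slot in chosen]


def white_noise_stereo(n_samples, seed=0):
    """Generate stereo Gaussian white noise as a list of (left, right) pairs."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_samples)]


# Three clips from a recording of about 1 h, plus a 10 s noise benchmark.
clips = sample_clip_ranges(3600 * FS)
noise = white_noise_stereo(FS * CLIP_SECONDS)
```

Each returned range can be used to slice the decoded recording before feature extraction, and the noise signal is processed in the same way as a reference.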

4. RANDOM ASMR AUDIO SYNTHESIS

Since its proposal in 2014 [9], the GAN has rapidly demonstrated its great
capability of generating artificial data with certain patterns, and has thereby
attracted intensive attention from the research and development fields of
artificial intelligence (AI). In general, a GAN system consists of two neural
networks: the generator and the discriminator. They work against each other and
jointly evolve, so that the generator is eventually capable of generating fake
data that can hardly be distinguished from real data [10]. Besides its iconic
success in generating fake images, the GAN has also proven effective in
generating and converting audio signals such as music and speech. Several
recent efforts are reported in [11–14], which have inspired us to generate
random ASMR audios with GANs.
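The adversarial interplay between generator and discriminator can be made concrete with the loss functions of the standard GAN formulation. The sketch below uses the common binary cross-entropy objective with the non-saturating generator loss. Whether the authors train with exactly these losses is not stated in the text, so this is an assumption, and the function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Cross-entropy loss of the discriminator.

    d_real: discriminator outputs (probabilities) for real samples.
    d_fake: discriminator outputs (probabilities) for generated samples.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: small when fakes are rated as real."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


# A confident discriminator has a low loss, and the generator a high one.
confident_d = discriminator_loss([0.9, 0.9], [0.1, 0.1])
fooled_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
struggling_g = generator_loss([0.1, 0.1])
successful_g = generator_loss([0.5, 0.5])
```

When the discriminator can no longer tell real from fake data, it outputs 0.5 everywhere, and its loss settles at $2\ln 2$, which is the equilibrium the two networks evolve toward.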

Because conventional GAN-based audio solutions are commonly specialized for the
acoustic features of natural languages or musical instruments, they cannot be
straightforwardly applied to our problem. Noticing that the spectrogram of an
audio signal i) contains most of the audio information, and ii) exhibits
significant 2D patterns for cyclic audio components, we propose to indirectly
synthesize ASMR audios by generating artificial ASMR-like cyclic patterns in
spectrograms with GANs. More specifically, considering the drawbacks of
original GANs, such as low learning stability and the inability to train with
large high-definition datasets, we invoke deep convolutional GANs (DCGANs),
which integrate convolutional neural networks (CNNs) into GANs to
