ARTIFICIAL ASMR: A CYBER-PSYCHOLOGICAL APPROACH
Zexin Fang⋆, Bin Han⋆, C. Clark Cao†, Hans D. Schotten⋆‡
⋆RPTU Kaiserslautern-Landau, †Lingnan University, ‡German Research Center of Artificial Intelligence
ABSTRACT
The popularity of Autonomous Sensory Meridian Response (ASMR) has skyrocketed over the past decade, but scientific studies on what exactly triggers the ASMR effect remain few and immature. One of the most commonly acknowledged triggers is that ASMR clips typically provide rich semantic information. With our attention caught by the common acoustic patterns in ASMR audios, we investigate the correlation between the cyclic features of audio signals and their effectiveness in triggering ASMR effects. We take a cyber-psychological approach that combines signal processing, artificial intelligence, and experimental psychology, with which we are able to quantify ASMR-related acoustic features and thereby synthesize ASMR clips that have random cyclic patterns but do not deliver identifiable scenarios to the audience. These synthesized clips were proven to be effective in triggering ASMR effects.
Index Terms: ASMR, auditory, cyclostationary, GAN
1. INTRODUCTION
Autonomous Sensory Meridian Response (ASMR), a term coined in the 2010s, is widely used to describe an intriguing phenomenon in which specific visual and auditory stimuli trigger tingling sensations accompanied by positive emotions as well as a feeling of deep relaxation [1]. With its blooming cultural popularity and growing commercial market [2], ASMR has attracted emerging research interest [3], and its cognitive effect has been verified by significant behavioral and neurological evidence [4,5]. However, to the best of our knowledge, existing work has only identified some semantic elements that trigger the ASMR effect [6], while the acoustic features of auditory triggers remain poorly understood. Though there is a generic claim that low-frequency sounds are widely observed in ASMR audios that are effective triggers, this claim does not capture their repetitive characteristic.
This work is supported in part by the German Federal Min-
istry of Education and Research within the project Open6GHub
(16KISK003K/16KISK004), in part by the European Commission within the
Horizon Europe project Hexa-X (101015956), in part by the Network for
the Promotion of Young Scientists at RPTU Kaiserslautern-Landau within
the project A-SIREN (Individual Research Funding 2022-2), and in part by
the Lam Woo Research Fund at Lingnan University (F871223). B. Han
(bin.han@rptu.de) is the corresponding author.
Inspired by a study that reveals the correlation between the trypophobia-triggering effect and cyclostationary features of images [7], we suspect that the time-frequency and cyclic features of audios may also play an important role in triggering the ASMR experience. In this paper, we prove this correlation with a cyber-psychological approach that combines various techniques of signal processing, artificial intelligence, and experimental psychology. More specifically, we apply the short-time Fourier transform (STFT) and cyclic spectral analysis to recorded ASMR audio clips to extract their acoustic features. Combining generative adversarial networks (GANs) and a post-processing module, we are able to synthesize ASMR clips based on generated cyclic patterns. We also design a psychological survey to evaluate the effectiveness of both the recorded and synthesized ASMR audios in triggering the ASMR effect in humans, and verify the correlation between this effect and the selected acoustic features.
2. AUDIO DATA PROVISIONING
As the object of our study, we obtained off-the-shelf ASMR audios from non-commercial online open sources. We collected four audios recorded from real sounds of different origins: i) breathing, ii) mixing soft cream, iii) puffing a spray, and iv) clicking a keyboard. Every audio was recorded in 16-bit stereo at a sampling rate of 22.05 kHz and lasts about 1 h. We intentionally selected these four sound types as our experiment objects because they are largely distinct in cyclic and spectral features. For instance, keyboard-clicking clips generally have dense cyclic patterns and more high-frequency components, whereas breathing clips manifest sparse cyclic patterns with a low-pass characteristic.
3. ANALYSIS AND FEATURE EXTRACTION
Spectrograms of the provisioned ASMR clips generally exhibit a low-pass characteristic and cyclic patterns of broadband bursts, as exemplified in Fig. 1. To further investigate the periodic patterns within the power spectral density (PSD) of an audio signal $x$, the spectral correlation density (SCD) and cyclic coherence function (CCF) can be applied as suggested by [8]:
arXiv:2210.14321v3 [eess.AS] 5 Jul 2023
Fig. 1: Spectrogram of a puffing spray audio clip
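$$S_{XX}(f, \alpha) = \mathrm{E}\left\{ X\left(f + \tfrac{\alpha}{2}\right) X^{*}\left(f - \tfrac{\alpha}{2}\right) \right\} \tag{1}$$

$$C_{XX}(f, \alpha) = \frac{S_{XX}(f, \alpha)}{\sqrt{\mathrm{E}\left\{ \left| X\left(f + \tfrac{\alpha}{2}\right) \right|^{2} \right\} \mathrm{E}\left\{ \left| X\left(f - \tfrac{\alpha}{2}\right) \right|^{2} \right\}}} \tag{2}$$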
where $\alpha$ is the cyclic frequency that indicates the periodicity within the spectrum, and $X(f)$ the complex-valued spectrum of $x$ at frequency $f$. For an $N$-sample signal frame $x$, $X$ can be estimated by the Discrete Fourier Transform (DFT):
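$$X(f) = \frac{1}{N} \sum_{k=0}^{N-1} \left[ \sum_{n=0}^{N-1} x(n)\, e^{-j \frac{2\pi n k}{N}}\, \delta\!\left( 2\pi f - \frac{2\pi k}{N} \right) \right]. \tag{3}$$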
The SCD and CCF of the spray puffing sound clip in Fig. 1, for example, are illustrated in Fig. 2. From the figures we see a distinguishing feature that is also observed in other ASMR audios: narrow vertical stripes that are distributed smoothly over a wide range in the $f$ domain while being discrete and sparse in the $\alpha$ domain, occurring only at low cyclic frequencies.
Fig. 2: SCD (left) and CCF (right) of the spray audio clip
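To make the definitions in Eqs. (1) to (3) concrete, the following is a minimal Python sketch of an SCD/CCF estimator, not the estimator used in this paper. It assumes that the expectation is replaced by an average over non-overlapping frames and that cyclic frequencies are restricted to the DFT grid, so that $f \pm \alpha/2$ always fall on DFT bins. The function name `scd_ccf` and the parameters `frame_len` and `max_shift` are hypothetical. The test signal is amplitude-modulated white noise, which is cyclostationary at its modulation frequency.

```python
import numpy as np

def scd_ccf(x, frame_len=256, max_shift=8):
    """Estimate SCD and CCF on a grid of DFT bins (illustrative sketch).

    Column a of each output corresponds to the cyclic frequency
    alpha = 2 * a * fs / frame_len, so that f +/- alpha/2 fall on DFT bins.
    """
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    # Eq. (3): DFT with 1/N normalization, one row per frame
    X = np.fft.fft(frames, axis=1) / frame_len
    # E{|X(f)|^2}, approximated by the average over frames (assumption)
    power = np.mean(np.abs(X) ** 2, axis=0)
    scd = np.empty((frame_len, max_shift + 1), dtype=complex)
    ccf = np.empty_like(scd)
    for a in range(max_shift + 1):
        upper = np.roll(X, -a, axis=1)   # X(f + alpha/2), circular in f
        lower = np.roll(X, a, axis=1)    # X(f - alpha/2), circular in f
        scd[:, a] = np.mean(upper * np.conj(lower), axis=0)          # Eq. (1)
        norm = np.sqrt(np.roll(power, -a) * np.roll(power, a))
        ccf[:, a] = scd[:, a] / np.maximum(norm, 1e-12)              # Eq. (2)
    return scd, ccf

# White noise modulated with a period of 64 samples, i.e. 4 DFT bins (a = 2)
rng = np.random.default_rng(0)
n = np.arange(256 * 200)
x = (1.0 + np.cos(2 * np.pi * n / 64)) * rng.standard_normal(n.size)
scd, ccf = scd_ccf(x)
profile = np.abs(ccf).mean(axis=0)   # average coherence per cyclic frequency
```

In this sketch the coherence is high across all frequency bins at the modulation frequency and close to zero at cyclic frequencies that are not present, which mirrors the stripe structure described above. The resolution in $\alpha$ is coarse for short frames, so much longer frames would be needed to resolve the low cyclic frequencies of real ASMR audio.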
While both SCD and CCF are bivariate, we further define two univariate functions of $\alpha$ to characterize the cyclic behavior of an audio signal $x$, namely

$$\bar{S}_{XX}(\alpha) = \frac{1}{N} \sum_{f \in \mathcal{F}} S_{XX}(f, \alpha) \quad \text{and} \quad S^{\max}_{XX}(\alpha) = \max_{f \in \mathcal{F}} S_{XX}(f, \alpha),$$

where $\mathcal{F}$ is the set of $N$ frequency bins in $S_{XX}$. $\bar{S}_{XX}$ is sensitive to broadband cyclic patterns in $x$, while $S^{\max}_{XX}$ is more sensitive to narrow-band cyclic components, as illustrated in Fig. 3.
Fig. 3: $\bar{S}_{XX}$ (left) and $S^{\max}_{XX}$ (right) of the spray audio clip
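The different sensitivity of the two univariate functions can be illustrated with a small Python sketch on a synthetic SCD matrix. The helper name `cyclic_profiles` is hypothetical, and taking magnitudes before averaging and maximizing is an assumption made here because the SCD is complex-valued. The toy matrix contains one broadband stripe (all frequency bins at one cyclic frequency) and one strong narrow-band component (a single bin at another cyclic frequency).

```python
import numpy as np

def cyclic_profiles(scd):
    """Collapse an SCD matrix (rows: f, columns: alpha) into two alpha profiles."""
    mag = np.abs(scd)           # assumption: work on magnitudes
    s_bar = mag.mean(axis=0)    # average over the N frequency bins
    s_max = mag.max(axis=0)     # maximum over the N frequency bins
    return s_bar, s_max

# Synthetic SCD with 64 frequency bins and 6 cyclic frequencies
scd = np.zeros((64, 6))
scd[:, 2] = 0.5      # weak but broadband cyclic pattern at alpha index 2
scd[10, 4] = 2.0     # strong but narrow-band component at alpha index 4

s_bar, s_max = cyclic_profiles(scd)
broadband_peak = int(np.argmax(s_bar))    # dominated by the broadband stripe
narrowband_peak = int(np.argmax(s_max))   # dominated by the single strong bin
```

The averaged profile peaks at the broadband stripe because the narrow-band component is diluted by the factor $1/N$, whereas the maximum profile peaks at the narrow-band component.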
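On top of $X(f)$, $\bar{S}_{XX}(\alpha)$ and $S^{\max}_{XX}(\alpha)$, we define eight features: $\Phi_1(x) = \mathrm{Mean}(\bar{S}_{XX})$, $\Phi_2(x) = \mathrm{Var}(\bar{S}_{XX})$, $\Phi_3(x) = G(\bar{S}_{XX})$, $\Phi_4(x) = \mathrm{Mean}(S^{\max}_{XX})$, $\Phi_5(x) = \mathrm{Var}(S^{\max}_{XX})$, $\Phi_6(x) = G(S^{\max}_{XX})$, $\Phi_7(x) = \mathrm{Var}(|X|)$, and $\Phi_8(x) = G(|X|)$, where

$$G(S) = \frac{\sum_{i \in S} \sum_{j \in S} |i - j|}{2 \|S\|_0 \sum_{i \in S} i}$$

is the Gini coefficient of set $S$. Here, $\Phi_1$ and $\Phi_4$ assess the overall cyclostationarity, $\Phi_2$ and $\Phi_5$ assess the $\alpha$-variance, $\Phi_3$ and $\Phi_6$ reflect the $\alpha$-sparsity, $\Phi_7$ describes the $f$-variance of the PSD, and $\Phi_8$ the $f$-sparsity of the PSD.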
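The eight features can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the names `gini` and `asmr_features` are hypothetical, the population variance (`np.var` default) is an assumption, and the Gini coefficient is evaluated in its sorted form, which equals the double-sum definition above but avoids forming all pairwise differences.

```python
import numpy as np

def gini(values):
    """Gini coefficient G(S) of a set of non-negative values."""
    v = np.sort(np.abs(np.asarray(values, dtype=float)).ravel())
    total = v.sum()
    if total == 0.0:
        return 0.0  # convention chosen here for an all-zero input
    n = v.size
    ranks = np.arange(1, n + 1)
    # Sorted form of sum_i sum_j |v_i - v_j| / (2 * n * sum_i v_i)
    return float(np.sum((2 * ranks - n - 1) * v) / (n * total))

def asmr_features(spectrum, s_bar, s_max):
    """Features Phi_1..Phi_8 from X(f) and the two alpha profiles."""
    mag = np.abs(spectrum)
    return {
        "phi1": float(np.mean(s_bar)),   # overall cyclostationarity
        "phi2": float(np.var(s_bar)),    # alpha-variance
        "phi3": gini(s_bar),             # alpha-sparsity
        "phi4": float(np.mean(s_max)),
        "phi5": float(np.var(s_max)),
        "phi6": gini(s_max),
        "phi7": float(np.var(mag)),      # f-variance of the spectrum
        "phi8": gini(mag),               # f-sparsity of the spectrum
    }

flat = gini([1.0, 1.0, 1.0, 1.0])      # evenly spread values
sparse = gini([0.0, 0.0, 0.0, 1.0])    # everything concentrated in one entry
feats = asmr_features(np.array([0, 0, 0, 1j]), [0.2, 0.2], [0.0, 1.0])
```

An evenly spread set yields a Gini coefficient of zero, while a set concentrated in a single entry approaches one as the set grows, which is why the coefficient serves as a sparsity measure here.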
We sampled three 10 s clips from each recorded audio mentioned in Sec. 2, and extracted the features $\Phi_1$ to $\Phi_8$ of every clip. A 10 s stereo white noise signal sampled at 22.05 kSPS was also analyzed as a benchmark. The results are listed in Tab. 1.
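The clip sampling and the white noise benchmark can be sketched in Python as below. Several details are assumptions not stated in the text: the clips are drawn at uniformly random offsets, the noise is Gaussian, and the one-minute random array merely stands in for a real recording. The names `white_noise_clip` and `sample_clips` are hypothetical.

```python
import numpy as np

FS = 22050          # sampling rate in samples per second (22.05 kSPS)
CLIP_SECONDS = 10   # clip duration in seconds

def white_noise_clip(rng, channels=2):
    """10 s stereo white noise benchmark (Gaussian is an assumption)."""
    return rng.standard_normal((FS * CLIP_SECONDS, channels))

def sample_clips(audio, rng, count=3):
    """Cut `count` clips of 10 s from a (samples, channels) recording."""
    n = FS * CLIP_SECONDS
    # Random start offsets are an assumption; the text only says "sampled"
    starts = rng.integers(0, audio.shape[0] - n + 1, size=count)
    return [audio[s:s + n] for s in starts]

rng = np.random.default_rng(0)
recording = rng.standard_normal((FS * 60, 2))   # stand-in for a real recording
clips = sample_clips(recording, rng)
noise = white_noise_clip(rng)
```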
4. RANDOM ASMR AUDIO SYNTHESIS
GAN, since its proposal in 2014 [9], has rapidly demonstrated its great capability of generating artificial data with certain patterns, and has thereby attracted intensive attention from the research and development fields of artificial intelligence (AI). In general, a GAN system consists of two neural networks: the generator and the discriminator. They work against each other and jointly evolve, so that the generator is eventually capable of generating fake data that can hardly be distinguished from real data [10]. Besides its iconic success in generating fake images, GAN has also proven effective in generating and converting audio signals such as music and speech. Several recent efforts are reported in [11–14], which have inspired us to generate random ASMR audios with GAN.
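For reference, the adversarial interplay between the two networks is commonly written as the minimax objective of the original GAN formulation. The symbols below are introduced only for this illustration and are not used elsewhere in this paper: $G$ denotes the generator, $D$ the discriminator, $p_{\mathrm{data}}$ the distribution of real samples, and $p_z$ the distribution of the generator's input noise $z$.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_z} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator is trained to assign high scores to real samples and low scores to generated ones, while the generator is trained to make $D(G(z))$ large, which corresponds to the joint evolution described above.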
Since conventional GAN-based audio solutions are commonly specialized for the acoustic features of natural languages or musical instruments, they cannot be straightforwardly applied to our problem. Noticing that the spectrogram of an audio signal i) contains most of the audio information, and ii) exhibits significant 2D patterns upon cyclic audio components, we propose to indirectly synthesize ASMR audios by generating artificial ASMR-like cyclic patterns in spectrograms with GANs. More specifically, concerning the drawbacks of original GANs such as low learning stability and incapability of training with large high-definition datasets, we invoke deep convolutional GANs (DCGANs), which integrate convolutional neural networks (CNNs) into GANs to