Optimized Decoders for Mixed-Order Ambisonics*
Aaron Heller1, Eric Benjamin2, and Fernando Lopez-Lezcano3
1Artificial Intelligence Center, SRI International, Menlo Park, CA
2Surround Research, Pacifica, CA
3Center for Computer Research in Music and Acoustics (CCRMA), Stanford University, Stanford, CA
May 23, 2021
Abstract
In this paper we discuss the motivation, design, and analysis of Ambisonic decoders for systems where the vertical order is less than the horizontal order, known as mixed-order Ambisonic systems. This can be due to the use of microphone arrays that emphasize horizontal spatial resolution or speaker arrays that provide sparser coverage vertically. First, we review Ambisonic reproduction criteria, as defined by Gerzon, and summarize recent results on the relative perceptual importance of the various criteria. Then we show that using full-order decoders with mixed-order program material results in poorer performance than with a properly designed mixed-order decoder. We then introduce a new implementation of a decoder optimizer that draws upon techniques from machine learning for quick and robust convergence, discuss the construction of the objective function, and apply it to the problem of designing two-band decoders for mixed-order signal sets and non-uniform loudspeaker layouts. Results of informal listening tests are summarized and future directions are discussed.
1 Introduction
There is a renewed interest in decoders for mixed-order Ambisonics due to the availability of mixed-order microphones and the current COVID-19 restrictions placing an emphasis on loudspeaker arrays that can be deployed in domestic settings, where it is relatively easy to deploy a third-order horizontal array comprising eight loudspeakers. However, installing more than a few elevated speakers is difficult and placing speakers significantly below the listener is nearly impossible. While mixed-order operation is frequently cited as an advantage of Ambisonics, little has been written about creating or analyzing the performance of decoders specifically for mixed-order signal sets or highly non-uniform loudspeaker arrays. We also introduce a new implementation of the Ambisonic Decoder Toolbox (ADT) in Python/NumPy, which includes a fast and robust non-linear optimizer and a new design procedure for dual-band decoders where we first optimize the high-frequency performance of the decoder and then optimize the low-frequency performance to match the high-frequency performance [1].
* This paper has been accepted for presentation at the 150th Audio Engineering Society Convention, May 25-28, 2021 (virtual).
1 aaron.heller@sri.com
2 ericmbenj@gmail.com
3 nando@ccrma.stanford.edu
2 Ambisonics
Ambisonics is an extensible, hierarchical system for representing sound fields. It defines how something should sound as opposed to specifying the signals going to particular speakers. Sound fields can be recorded using an Ambisonic microphone or created using an Ambisonic panner to position a sound in full 3D space. It is an isotropic representation of the sound field that can be rotated in the renderer, making it attractive for virtual and augmented reality applications.
arXiv:2210.00378v1 [eess.AS] 1 Oct 2022
An Ambisonic signal set is a representation of the sound field as the time-varying coefficients of a spherical harmonic series. The spatial accuracy increases with the number of harmonics being used. A first-order Ambisonic signal set is four channels wide, third-order is sixteen channels, fifth-order is 36, and so forth. Each increase in Ambisonic order adds spherical harmonics to the signal set and increases the spatial accuracy of the representation of the sound field. We use a shorthand notation to specify the signal set. For example, 3H2V means third-order horizontal, second-order vertical, with the set of spherical harmonics according to the HV convention [2].
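The channel counts quoted above follow from the (N+1)^2 harmonics of a full-sphere set of order N. The sketch below reproduces them and also enumerates a mixed-order set. It assumes that the HV convention of [2] keeps every harmonic up to the vertical order and, above it, only the sectoral harmonics (|m| = l) up to the horizontal order. The function names are illustrative and are not taken from the ADT.

```python
def full_order_channels(order):
    """(degree l, order m) pairs of a full-sphere set: (order + 1)**2 entries."""
    return [(l, m) for l in range(order + 1) for m in range(-l, l + 1)]


def hv_channels(h_order, v_order):
    """Mixed-order set under the assumed HV rule.

    Keeps (l, m) when l <= v_order, or when the harmonic is sectoral
    (|m| == l) and l <= h_order.
    """
    return [(l, m) for (l, m) in full_order_channels(h_order)
            if l <= v_order or abs(m) == l]


for n in (1, 3, 5):
    print(n, len(full_order_channels(n)))   # 4, 16 and 36 channels
print("3H2V", len(hv_channels(3, 2)))       # 9 full-sphere + 2 sectoral = 11
```

Under this assumed rule, setting the vertical order equal to the horizontal order recovers the full-sphere set.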
Once an Ambisonic signal set has been captured or generated, appropriate speaker feeds are produced by a decoder. Designing an optimal decoder, specifically the low- and high-frequency matrices, for a given signal set and loudspeaker array is the central topic of this paper. Other aspects of decoder design have been covered in earlier papers by the present authors [3].
2.1 Mixed-Order Ambisonics
A physical encoder (an Ambisonic microphone) needs to have enough capsules covering the sphere to accurately sample the spherical harmonics of the order it is intended to capture. Conversely, a speaker array needs to have enough loudspeakers covering the sphere to excite the spherical harmonics for the maximum order it is intended to reproduce. That is not always the case, leading to arrays with different densities of transducers in different directions. The consequence is that the order that can be encoded or decoded will change according to the direction.
For example, nine years ago, one of the present authors published the design for a second-order Ambisonic microphone [4]. There have been four proprietary [5, 6, 7, 8] and one free and open-source implementation [9] of this design. A compromise made was to use only eight capsules. This simplifies calibration and allows the use of widely available eight-channel recorders.
While commonly referred to as a second-order microphone, only eight of the nine spherical harmonic components needed for the second-order signal set can be derived from the capsule signals. The missing spherical harmonic is degree 2 and order 0, which is called “R” in the Furse-Malham convention. R is a “zonal” harmonic and varies only with elevation. Eliminating this component coarsens the description of the sound field at elevations other than horizontal, making it a 2HV1 mixed-order encoder. As we shall see, decoding this signal set with a decoder designed for full second order is suboptimal.
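To make the missing component concrete, the sketch below lists the nine second-order components by their Furse-Malham letters and drops the zonal degree-2 harmonic R. The letter-to-index table follows the usual Furse-Malham naming; the variable names are illustrative only.

```python
# Furse-Malham letters for degrees 0..2, stored as (degree l, order m).
FUMA_SECOND_ORDER = {
    "W": (0, 0),
    "X": (1, 1), "Y": (1, -1), "Z": (1, 0),
    "R": (2, 0), "S": (2, 1), "T": (2, -1), "U": (2, 2), "V": (2, -2),
}

# The eight-capsule microphone described above cannot derive R.
available = {name: lm for name, lm in FUMA_SECOND_ORDER.items() if name != "R"}

# Zonal harmonics have m == 0 and vary only with elevation.
zonal = [name for name, (l, m) in FUMA_SECOND_ORDER.items() if m == 0]

print(len(available))  # 8 of the 9 second-order components remain
print(zonal)           # ['W', 'Z', 'R']
```

W and Z are still available, so the only elevation-only term that is lost is the degree-2 one.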
Small speaker arrays with a limited number of speakers in the vertical direction are another case in which the array does not have uniform density of speakers and cannot excite the spherical harmonics in all directions equally. Physical restrictions in the placement of speakers can also dictate that an array might not be capable of rendering the same order in both the horizontal and vertical directions. Such an array will need a mixed-order decoder.
3 Ambisonic Decoders
The task of the decoder is to create the best perceptual impression possible that the sound field is being reproduced accurately, given the available resources. In practical terms, the following criteria are necessary [10]:
1. Constant amplitude gain for all source directions
2. Constant energy gain for all source directions
3. At low frequencies, correct reproduced wavefront direction and velocity (Gerzon’s velocity-model localization vector, $r_V$)
4. At high frequencies, maximum concentration of energy in the source direction (Gerzon’s energy-model localization vector, $r_E$)
5. Matching high- and low-frequency perceived directions ($\hat{r}_E = \hat{r}_V$)
Recent work shows that (4) is the most important [11]; it is also the most difficult to get right. After that, (2) and (5) are important, as it is thought that we use a majority voting system to resolve conflicting directional cues [10]. Decoders that ignore (5) can be fatiguing due to conflicting perceptual cues [12]. Note that to satisfy all of these criteria we must use decoders that have different gain matrices for high and low frequencies, so-called “two-band” or “Vienna” decoders [13].
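The quantities behind these criteria can be computed directly from a set of loudspeaker gains. The sketch below is a minimal illustration, not the ADT's implementation. It assumes a regular horizontal ring of eight loudspeakers, a first-order horizontal signal set, and the simple gain law g_i = (1 + 2 g1 cos(phi - theta_i)) / N, with g1 = 1 for the low-frequency matrix and g1 = cos(pi/4) for an energy-optimized high-frequency matrix. The function names are hypothetical. The vectors use Gerzon's definitions: rV is the gain-weighted mean of the speaker directions and rE is the mean weighted by squared gain.

```python
import numpy as np


def speaker_gains(source_az, speaker_az, g1):
    """First-order horizontal gains for a regular ring (assumed gain law)."""
    return (1.0 + 2.0 * g1 * np.cos(source_az - speaker_az)) / len(speaker_az)


def gerzon_metrics(gains, speaker_az):
    """Return amplitude gain P, energy gain E, and the vectors rV and rE."""
    u = np.stack([np.cos(speaker_az), np.sin(speaker_az)], axis=1)  # unit vectors
    P = gains.sum()                 # criterion 1: amplitude gain
    E = (gains ** 2).sum()          # criterion 2: energy gain
    rV = gains @ u / P              # criterion 3: velocity vector
    rE = (gains ** 2) @ u / E       # criterion 4: energy vector
    return P, E, rV, rE


speaker_az = np.arange(8) * 2 * np.pi / 8   # regular octagon
source_az = np.radians(20.0)

for label, g1 in (("LF, g1 = 1", 1.0), ("HF, g1 = cos(pi/4)", np.cos(np.pi / 4))):
    gains = speaker_gains(source_az, speaker_az, g1)
    P, E, rV, rE = gerzon_metrics(gains, speaker_az)
    print(label, np.linalg.norm(rV), np.linalg.norm(rE))
```

Under these assumptions the first setting gives |rV| = 1 and |rE| = 2/3, while the second raises |rE| to cos(pi/4), about 0.707, at the cost of lowering |rV| to the same value. Both vectors point at the source in either case, so criterion (5) holds. No single matrix maximizes both magnitudes, which is the motivation for two-band decoding.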
The ADT includes a full-featured decoder engine written in the FAUST DSP specification language [14] that implements dual-band decoding, near-field correction, and level and time-of-arrival compensation. The ADT incorporates several design techniques that produce decoders that perform well according to these criteria for partial-coverage loudspeaker arrays, such as domes and stacked rings, but assumes that within those limits the speakers are (more or less) uniformly distributed. It also assumes that the decoders produced by these techniques are optimal for mixed-order signal sets.
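Level and time-of-arrival compensation can be illustrated with a short calculation. The sketch below assumes free-field 1/r level decay and a speed of sound of 343 m/s, and aligns every loudspeaker to the most distant one. The function name and the example distances are hypothetical, and the ADT's FAUST engine may implement the compensation differently.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def distance_compensation(distances_m):
    """Per-speaker delay (s) and linear gain aligning all speakers to the farthest."""
    r = np.asarray(distances_m, dtype=float)
    r_max = r.max()
    delays = (r_max - r) / SPEED_OF_SOUND   # nearer speakers are delayed more
    gains = r / r_max                       # nearer speakers are attenuated more
    return delays, gains


delays, gains = distance_compensation([2.0, 1.5, 1.0])
print(delays * 1e3)  # delays in milliseconds
print(gains)         # linear gain factors
```

The farthest loudspeaker gets zero delay and unit gain, so the compensation never requires amplification or negative delay.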