Optimized Decoders for Mixed-Order Ambisonics*

Aaron Heller1, Eric Benjamin2, and Fernando Lopez-Lezcano3

1Artificial Intelligence Center, SRI International, Menlo Park, CA

2Surround Research, Pacifica, CA

3Center for Computer Research in Music and Acoustics (CCRMA), Stanford University, Stanford, CA

May 23, 2021

Abstract

In this paper we discuss the motivation, design, and analysis of Ambisonic decoders for systems where the vertical order is less than the horizontal order, known as mixed-order Ambisonic systems. This can be due to the use of microphone arrays that emphasize horizontal spatial resolution or speaker arrays that provide sparser coverage vertically. First, we review Ambisonic reproduction criteria, as defined by Gerzon, and summarize recent results on the relative perceptual importance of the various criteria. Then we show that using full-order decoders with mixed-order program material results in poorer performance than with a properly designed mixed-order decoder. We then introduce a new implementation of a decoder optimizer that draws upon techniques from machine learning for quick and robust convergence, discuss the construction of the objective function, and apply it to the problem of designing two-band decoders for mixed-order signal sets and non-uniform loudspeaker layouts. Results of informal listening tests are summarized and future directions discussed.

1 Introduction

There is a renewed interest in decoders for mixed-order Ambisonics due to the availability of mixed-order microphones and the current COVID-19 restrictions placing an emphasis on loudspeaker arrays that can be deployed in domestic settings, where it is relatively easy to deploy a third-order horizontal array comprising eight loudspeakers. However, installing more than a few elevated speakers is difficult and placing speakers significantly below the listener is nearly impossible. While mixed-order operation is frequently cited as an advantage of Ambisonics, little has been written about creating or analyzing the performance of decoders specifically for mixed-order signal sets or highly non-uniform loudspeaker arrays. We also introduce a new implementation of the Ambisonic Decoder Toolbox (ADT) in Python/NumPy, which includes a fast and robust non-linear optimizer and a new design procedure for dual-band decoders where we first optimize the high-frequency performance of the decoder and then optimize the low-frequency performance to match it [1].

*This paper has been accepted for presentation at the 150th Audio Engineering Society Convention, May 25-28, 2021 (virtual).

1aaron.heller@sri.com

2ericmbenj@gmail.com

3nando@ccrma.stanford.edu

2 Ambisonics

Ambisonics is an extensible, hierarchical system for representing sound fields. It defines how something should sound as opposed to specifying the signals going to particular speakers. Sound fields can be recorded using an Ambisonic microphone or created using an Ambisonic panner to position a sound in full 3D space. It is an isotropic representation of the sound field that can be rotated in the renderer, making it attractive for virtual and augmented reality applications.

arXiv:2210.00378v1 [eess.AS] 1 Oct 2022

An Ambisonic signal set is a representation of the sound field as the time-varying coefficients of a spherical harmonic series. The spatial accuracy increases with the number of harmonics being used. A first-order Ambisonic signal set is four channels wide, third-order is sixteen channels, fifth-order is 36, and so forth. Each increase in Ambisonic order adds spherical harmonics to the signal set and increases the spatial accuracy of the representation of the sound field. We use a shorthand notation to specify the signal set. For example, 3H2V means third-order horizontal, second-order vertical, with the set of spherical harmonics according to the HV convention [2].
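The channel counts quoted above follow from the structure of the spherical harmonic series: a full-sphere signal set of order N contains every harmonic of degree 0 through N, and degree l contributes 2l + 1 harmonics, for a total of (N + 1)^2. A minimal Python sketch of that count follows; the function name is illustrative and is not part of the ADT.

```python
def full_order_channels(order: int) -> int:
    """Number of channels in a full-sphere Ambisonic signal set of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    # Degree l contributes 2*l + 1 spherical harmonics (orders m = -l .. l).
    return sum(2 * degree + 1 for degree in range(order + 1))


for n in (1, 3, 5):
    print(f"order {n}: {full_order_channels(n)} channels")
```

The sum reduces to (N + 1)^2, which reproduces the four, sixteen, and 36 channels cited for first, third, and fifth order.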

Once an Ambisonic signal set has been captured or generated, appropriate speaker feeds are produced by a decoder. Designing an optimal decoder, specifically the low- and high-frequency matrices, for a given signal set and loudspeaker array is the central topic of this paper. Other aspects of decoder design have been covered in earlier papers by the present authors [3].

2.1 Mixed-Order Ambisonics

A physical encoder (an Ambisonic microphone) needs to have enough capsules covering the sphere to accurately sample the spherical harmonics of the order it is intended to capture. Conversely, a speaker array needs to have enough loudspeakers covering the sphere to excite the spherical harmonics for the maximum order it is intended to reproduce. That is not always the case, leading to arrays with different densities of transducers in different directions. The consequence is that the order that can be encoded or decoded will change according to the direction.

For example, nine years ago, one of the present authors published the design for a second-order Ambisonic microphone [ ]. There have been four proprietary [ ] and one free and open-source implementation [ ] of this design. A compromise made was to use only eight capsules. This simplifies calibration and allows the use of widely-available eight-channel recorders.

While commonly referred to as a second-order microphone, only eight of the nine spherical harmonic components needed for the second-order signal set can be derived from the capsule signals. The missing spherical harmonic is degree 2 and order 0, which is called “R” in the Furse-Malham convention. R is a “zonal” harmonic and varies only with elevation. Eliminating this component coarsens the description of the sound field at elevations other than horizontal, making it a 2H1V mixed-order encoder. As we shall see, decoding this signal set with a decoder designed for full second order is suboptimal.
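To make the channel bookkeeping concrete, the following sketch lists the harmonics through second order by (degree, order), attaches their Furse-Malham letter names, and drops the zonal degree-2 component. The table layout and helper names are illustrative assumptions; they do not describe the microphone's actual signal processing or the ADT's data structures.

```python
# Furse-Malham letter names keyed by (degree, order), through second order.
FUMA_NAMES = {
    (0, 0): "W",
    (1, 1): "X", (1, -1): "Y", (1, 0): "Z",
    (2, 0): "R", (2, 1): "S", (2, -1): "T", (2, 2): "U", (2, -2): "V",
}


def is_zonal(degree: int, order: int) -> bool:
    """Zonal harmonics have order 0 and vary only with elevation."""
    return order == 0


# "R" is the component that cannot be derived from the eight capsule signals.
MISSING = (2, 0)

available = {lm: name for lm, name in FUMA_NAMES.items() if lm != MISSING}

print("full second order:", len(FUMA_NAMES), "components")
print("available from eight capsules:", len(available), "".join(available.values()))
print("zonal components still present:",
      [name for (l, m), name in available.items() if is_zonal(l, m)])
```

Only W and Z remain as zonal components once R is removed, so the elevation dependence of the signal set is described no more finely than at first order.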

Small speaker arrays with a limited number of speakers in the vertical direction are another case in which the array does not have uniform density of speakers and cannot excite the spherical harmonics in all directions equally. Physical restrictions in the placement of speakers can also dictate that an array might not be capable of rendering the same order in both the horizontal and vertical directions. Such an array will need a mixed-order decoder.

3 Ambisonic Decoders

The task of the decoder is to create the best perceptual impression possible that the sound field is being reproduced accurately, given the available resources. In practical terms, the following criteria are necessary [10]:

1. Constant amplitude gain for all source directions

2. Constant energy gain for all source directions

3. At low frequencies, correct reproduced wavefront direction and velocity (Gerzon’s velocity-model localization vector, $\mathbf{r}_V$)

4. At high frequencies, maximum concentration of energy in the source direction (Gerzon’s energy-model localization vector, $\mathbf{r}_E$)

5. Matching high- and low-frequency perceived directions ($\hat{\mathbf{r}}_E = \hat{\mathbf{r}}_V$)
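The first four criteria can be checked numerically for any set of loudspeaker gains. The sketch below uses the usual definitions of Gerzon's quantities: amplitude gain P = Σ g_i, energy gain E = Σ g_i², r_V = Σ g_i u_i / P, and r_E = Σ g_i² u_i / E, where u_i is the unit vector toward loudspeaker i. The square horizontal array and the first-order "basic" panning law are assumptions chosen for illustration and are not taken from the paper; all function names are hypothetical.

```python
import math


def unit_vector(azimuth_deg, elevation_deg=0.0):
    """Unit vector for a direction given as azimuth and elevation in degrees."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))


def gerzon_metrics(gains, directions):
    """Return (P, E, rV, rE) for real speaker gains and unit direction vectors."""
    P = sum(gains)                      # amplitude (pressure) gain
    E = sum(g * g for g in gains)       # energy gain
    rV = tuple(sum(g * u[k] for g, u in zip(gains, directions)) / P for k in range(3))
    rE = tuple(sum(g * g * u[k] for g, u in zip(gains, directions)) / E for k in range(3))
    return P, E, rV, rE


# Assumed layout: four loudspeakers in a horizontal square.
SPEAKER_AZ_DEG = [45.0, 135.0, 225.0, 315.0]
SPEAKERS = [unit_vector(az) for az in SPEAKER_AZ_DEG]


def basic_gains(source_az_deg):
    """First-order horizontal 'basic' panning gains for the assumed square array."""
    n = len(SPEAKER_AZ_DEG)
    return [(1.0 + 2.0 * math.cos(math.radians(az - source_az_deg))) / n
            for az in SPEAKER_AZ_DEG]


for source_az in (0.0, 30.0, 90.0):
    P, E, rV, rE = gerzon_metrics(basic_gains(source_az), SPEAKERS)
    print(source_az, round(P, 4), round(E, 4),
          round(math.hypot(*rV), 4), round(math.hypot(*rE), 4))
```

For this layout the gains give P = 1 and E = 0.75 for every source azimuth, with |r_V| = 1 and |r_E| = 2/3. The first three criteria are met, while the energy concentration of criterion 4 is left short of what first-order horizontal reproduction can achieve.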

Recent work shows that (4) is the most important [ ]; it is also the most difficult to get right. After that, (2) and (5) are important, as it is thought that we use a majority voting system to resolve conflicting directional cues [ ]. Decoders that ignore (5) can be fatiguing due to conflicting perceptual cues [ ]. Note that to satisfy all of these criteria we must use decoders that have different gain matrices for high and low frequencies, so-called “two-band” or “Vienna” decoders [13].
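As a sketch of why two gain matrices are needed, the example below builds both for a hypothetical first-order horizontal decoder on a square array. The low-frequency matrix is the "basic" solution, which gives |r_V| = 1. The high-frequency matrix scales the first-order components by cos(π/4), the commonly used max-r_E gain for first-order horizontal reproduction. The layout, the (W, X, Y) channel ordering, the normalization, and all names are assumptions for illustration; this does not reproduce the ADT's design procedure or its optimizer.

```python
import math

# Assumed layout: four loudspeakers in a horizontal square.
SPEAKER_AZ_DEG = [45.0, 135.0, 225.0, 315.0]


def decoder_matrix(first_order_gain):
    """Rows are loudspeakers; columns are (W, X, Y), with W = 1 for a unit source."""
    n = len(SPEAKER_AZ_DEG)
    rows = []
    for az in SPEAKER_AZ_DEG:
        a = math.radians(az)
        rows.append([1.0 / n,
                     2.0 * first_order_gain * math.cos(a) / n,
                     2.0 * first_order_gain * math.sin(a) / n])
    return rows


def encode(source_az_deg):
    """Horizontal first-order encoding of a unit plane-wave source."""
    a = math.radians(source_az_deg)
    return [1.0, math.cos(a), math.sin(a)]


def decode(matrix, signal):
    """Speaker gains produced by applying a decoder matrix to one signal vector."""
    return [sum(m * s for m, s in zip(row, signal)) for row in matrix]


def r_magnitudes(gains):
    """Return (|rV|, |rE|) for the assumed horizontal layout."""
    P = sum(gains)
    E = sum(g * g for g in gains)
    cs = [(math.cos(math.radians(az)), math.sin(math.radians(az)))
          for az in SPEAKER_AZ_DEG]
    rv = math.hypot(sum(g * c for g, (c, s) in zip(gains, cs)) / P,
                    sum(g * s for g, (c, s) in zip(gains, cs)) / P)
    re = math.hypot(sum(g * g * c for g, (c, s) in zip(gains, cs)) / E,
                    sum(g * g * s for g, (c, s) in zip(gains, cs)) / E)
    return rv, re


LF_MATRIX = decoder_matrix(1.0)                     # basic: exact velocity vector
HF_MATRIX = decoder_matrix(math.cos(math.pi / 4))   # max-rE: concentrated energy

signal = encode(30.0)
print("LF |rV|, |rE|:", r_magnitudes(decode(LF_MATRIX, signal)))
print("HF |rV|, |rE|:", r_magnitudes(decode(HF_MATRIX, signal)))
```

In this sketch the low-frequency matrix yields |r_V| = 1 with |r_E| = 2/3, while the high-frequency matrix raises |r_E| to about 0.707 at the cost of reducing |r_V| to the same value. Neither matrix satisfies both criteria, which is the motivation for using one per band. A practical decoder would also balance the levels of the two bands, a step omitted here.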

The ADT includes a full-featured decoder engine written in the FAUST DSP specification language [ ] that implements dual-band decoding, near-field correction, and level and time-of-arrival compensation. The ADT incorporates several design techniques that produce decoders that perform well according to these criteria for partial-coverage loudspeaker arrays, such as domes and stacked rings, but assumes that within those limits the speakers are (more or less) uniformly distributed. It also assumes that the decoders produced by these techniques are optimal for mixed-order signal sets.
