Analog IC Tips

How does artificial intelligence relate to immersive audio?

November 20, 2023 By Jeff Shepard Leave a Comment

Tools like neural networks (NNs), machine learning (ML), and artificial intelligence (AI) are being applied to hard problems related to immersive audio.

This FAQ examines how NNs and ML are being used to up-mix audio tracks into their original constituent parts, how NNs are being used to produce personalized head-related transfer functions (HRTFs), and how the European SONICOM project aims to employ AI to produce personalized HRTFs for use in virtual-reality and augmented-reality (VR/AR) environments.

A software package called 3D Soundstage relies on deep-learning software running on a neural network to up-mix audio tracks. Up-mixing refers to the process of splitting audio tracks into a larger number of tracks that can be used for channel-based or object-based audio processing. 3D Soundstage is implemented in two steps: training followed by up-mixing.
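3D Soundstage's internal architecture is not published, but a common technique in learned source separation is time-frequency masking: the network predicts a per-frequency mask that, multiplied by the mixture's spectrum, isolates one stem. The sketch below illustrates the idea with an oracle mask computed from known synthetic stems (standing in for the mask a trained network would predict); the signals, frequencies, and variable names are illustrative only.

```python
import numpy as np

# Two synthetic "stems" occupying different frequencies, plus their mixture
t = np.linspace(0, 1, 8000, endpoint=False)   # 1 s at 8 kHz
vocals = np.sin(2 * np.pi * 440 * t)
bass = np.sin(2 * np.pi * 80 * t)
mix = vocals + bass

# Oracle ratio mask in the frequency domain -- a stand-in for the mask
# a trained separation network would predict from the mixture alone
MIX = np.fft.rfft(mix)
V = np.fft.rfft(vocals)
B = np.fft.rfft(bass)
mask_v = np.abs(V) / (np.abs(V) + np.abs(B) + 1e-12)

# Apply the mask to the mixture spectrum and invert to the time domain
vocals_est = np.fft.irfft(mask_v * MIX, n=len(mix))

# With well-separated tones, the masked mixture recovers the stem
err = np.max(np.abs(vocals_est - vocals))
```

A real up-mixer predicts masks frame by frame from a short-time transform of the mixture, but the mask-and-invert step is the same.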

During training, audio tracks move through a deep neural network with dozens of layers, where they are gradually separated and sent to the output. The output is compared with the expected output, and machine learning adjusts the neural network's parameters to improve its accuracy.

Training this NN used tens of thousands of songs plus their separated tracks and took thousands of hours of processing. Training is iterative and continues until the output matches the expected result; at that point, the NN is ready to use for up-mixing (Figure 1).
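The compare-and-adjust loop described above is ordinary gradient descent. The toy sketch below shrinks the "network" to a single parameter so the mechanism is visible: the forward pass produces an output, the output is compared with the expected output via a loss, and the parameter is nudged to reduce that loss until the two match. The target relationship (y = 3x) and learning rate are arbitrary choices for illustration.

```python
import numpy as np

# Toy data: the expected output is y = 3x, so training should drive w -> 3
x = np.linspace(-1, 1, 100)
y = 3.0 * x

w = 0.0        # the single "network" parameter, initially untrained
lr = 0.5       # learning rate
for epoch in range(200):
    pred = w * x                        # forward pass
    loss = np.mean((pred - y) ** 2)     # compare output with expected output
    grad = np.mean(2 * (pred - y) * x)  # how the loss changes with w
    w -= lr * grad                      # adjust the parameter
```

A separation network repeats exactly this loop over millions of parameters, which is why training took thousands of processing hours.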

Figure 1. An NN (center) can be trained to separate a single audio track (left) into its constituent parts (right) (Image: IEEE).

HRTFs, HRIRs and MLP ANNs
In another application of NNs, a research team used a multilayer perceptron artificial neural network (MLP ANN) to generate personalized HRTFs and head-related impulse responses (HRIRs). HRTFs are defined in the frequency domain, while HRIRs are the corresponding functions in the time domain. An MLP ANN is a feedforward neural network with a single hidden layer.
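The HRTF/HRIR pair are Fourier transforms of one another, so either representation fully determines the other. The snippet below demonstrates this with a toy decaying impulse response (the waveform is made up for illustration; real HRIRs are measured per ear and direction):

```python
import numpy as np

# A toy HRIR: a short, decaying impulse response (time domain)
n = 256
samples = np.arange(n)
hrir = np.exp(-samples / 20.0) * np.cos(samples / 3.0)

# The HRTF is its frequency-domain counterpart
hrtf = np.fft.rfft(hrir)

# Inverting the transform recovers the HRIR exactly
hrir_back = np.fft.irfft(hrtf, n=n)
```

This round trip is why a model can be trained on whichever representation is more convenient and still deliver the other.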

A group of 17 pinnae (outer ear structures), with 1,671,865 instances, formed the training dataset used to build the models. Two pinnae, with 196,690 instances, made up the testing dataset used to evaluate the models' predictive ability. Finally, the standard KEMAR pinna, with 98,345 instances, was used to validate the selected models. The KEMAR pinna meets the requirements of ANSI S3.36/ASA58-2012 and IEC 60318-7:2011 and is based on globally averaged male and female head and torso dimensions. Model validation focused on how the predicted HRTFs improved the sound experience relative to what can be obtained using the standard pinna alone. The resulting MLP ANN was able to predict HRTFs with relatively low errors for new individuals using basic personal morphological features.
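Structurally, a single-hidden-layer MLP that maps morphological features to HRTF values is just two matrix multiplications with a nonlinearity between them. The sketch below shows that forward pass with random, untrained weights; the feature count, hidden width, and number of output bins are hypothetical stand-ins, not the published model's dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 8    # morphological inputs (e.g., pinna dimensions) -- hypothetical count
n_hidden = 32     # single hidden layer, per the MLP ANN described above
n_bins = 128      # predicted HRTF magnitude bins -- hypothetical count

# Untrained weights; a real model would learn these from the pinna dataset
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_bins))
b2 = np.zeros(n_bins)

def predict_hrtf(features):
    """Feedforward pass: features -> hidden layer (tanh) -> HRTF magnitudes."""
    hidden = np.tanh(features @ W1 + b1)
    return hidden @ W2 + b2

person = rng.normal(size=n_features)   # one individual's measured morphology
hrtf_mag = predict_hrtf(person)
```

Training would fit W1/W2 so that measured pinna features reproduce measured HRTFs, which is the regression task the researchers' dataset splits were built for.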

Figure 2. This HRTF measurement chamber at Imperial College London is being used to generate the acoustic measurements needed to validate the SONICOM models (Image: IEEE).

SONICOM
The SONICOM project seeks to develop a data-driven approach that uses AI for HRTF personalization. It expects to develop accurate HRTF models that combine minimal data related to ear morphology with listener preferences. The project combines AI-driven HRTF development based on a parametric pinna model (PPM) with exploratory work on using AI to pair new individuals with high-quality HRTFs in an existing database for a higher-quality result. The acoustic simulations are being validated against acoustic measurements (Figure 2).

The SONICOM project will also dig into using AI to blend virtual objects (and related sounds) into a larger VR/AR environment. One key will be to develop techniques for rapidly estimating the reverberant characteristics of the environment. The project expects to use AI to extract data about the acoustic environment surrounding the VR/AR user to produce a realistic, immersive experience. The team expects to use geometrical acoustics and computational models like scattering delay networks to produce real-time simulations of the listener’s environment.
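A scattering delay network models a room as interconnected delay lines with scattering junctions; its basic building block is a delay line with feedback. The minimal sketch below shows that building block as a single feedback comb filter producing a train of decaying echoes (the delay length and feedback gain are arbitrary illustration values, far simpler than a full SDN):

```python
import numpy as np

def feedback_comb(x, delay, gain):
    """One delay line with feedback: each output sample adds a scaled,
    delayed copy of the output -- producing decaying, repeating echoes."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        fb = y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + gain * fb
    return y

# Feed in a single impulse and observe the echo train
impulse = np.zeros(2000)
impulse[0] = 1.0
ir = feedback_comb(impulse, delay=441, gain=0.7)
```

An SDN interconnects several such delay lines through scattering junctions placed at the walls, which is what makes the real-time room simulations the project describes computationally tractable.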

Summary
NNs, ML, and AI are being applied to a variety of uses in audio processing. NNs trained with ML are being used to up-mix single audio tracks into a larger number of tracks, enabling channel-based or object-based audio processing. ANNs, AI, and ML are being used to develop techniques for rapidly producing personalized HRTFs for immersive audio processing. Some of those efforts are expected to be particularly attractive for AR and VR environments.

References
Deep learning could bring the concert experience home, IEEE Spectrum
Optimization of HRTF Models with Deep Learning, Mathworks
Prediction of Head Related Transfer Functions Using Machine Learning Approaches, MDPI acoustics
The SONICOM Project: Artificial Intelligence-Driven Immersive Audio, From Personalization to Modeling, IEEE
Spatial Audio Meets AI, Steinberg
