Spatial Awareness: Inside the world of immersive sound design

Although the term Immersive Audio (or spatial audio) might at first glance appear to describe a cutting-edge technology, the engineering challenge of placing an audience in a three-dimensional acoustic space has a long history, and several methods have been used over the years with varying degrees of success. Stephen Bennett reports…

Binaural recording is one early technique, where small omnidirectional microphones are placed in the ‘ears’ of a dummy head that has a similar mass and shape to a human bonce. Multi-microphone techniques, including what has come to be known as Ambisonics, have a long tradition amongst experimenters in sound, as has the use of the many-capsuled SoundField microphone. Various combinations of psychoacoustic and Digital Signal Processing (DSP) effects have been used to simulate ‘surround sound’ from stereo sources, such as the QSound system – which was used on Madonna’s 1990 album, The Immaculate Collection – and Roland’s RSS processor system.

The disadvantage of the multi-microphone technique is obvious – you need multi-speaker setups to play back the audio, which is inconvenient for most listeners. Until recently, the limitation of binaural playback was that the listener was forced into using headphones to experience the full effect – a problem that has resolved itself somewhat with the rise in the use of smartphones and earbud headphones.

Whatever technology is used, immersive audio is a growing field and some major players are experimenting with various techniques to enhance the listener experience. The potential for computer-based gameplay, Virtual Reality (VR), Augmented Reality (AR) and 3D/360-degree video has spurred on the expansion of creative practices in immersive sound.

The BBC has a unique place in British audio with a long history of technical innovation, and the corporation has experimented with binaural recording in the past. I recall my first experience of the technique while listening to a binaurally-recorded Sherlock Holmes dramatisation in the 1970s. I was lying on my bed with headphones on when a horse and cab appeared to run right over me. The BBC has continued to work with binaural sound in both radio and visual formats, culminating in a live immersive sound broadcast from the 2016 Proms in London, UK.

Tom Parnell has worked in the studio and on outside broadcasts for BBC Radio for fifteen years, balancing classical music for Radio 3, band sessions for Radio 2 and 6Music, and mixing documentary and speech programmes for Radio 4 and sport for Five Live. More recently he has been working closely with BBC Research & Development, helping to develop tools for producing high quality spatial audio content, creating binaural radio programmes and creating dynamic audio mixes for VR.

The Proms are part of a long British broadcasting tradition and you meddle with these production formats at your peril, so why did Parnell feel the time was right for an ‘Immersive Proms’?

“BBC R&D and BBC Radio 3 were keen to offer a live binaural stream from the Royal Albert Hall, optimising the production process and basing it on the music mix crafted by Radio 3’s dedicated sound engineers,” he says. According to Parnell, binaural recordings were in the past only possible using in-ear or dummy head microphones, which offered limited creative potential. But now, DSP techniques exist which can ‘binauralise’ any mono, stereo or multi-channel audio recording. “This allows us to pan audio in any position around the listener’s head when they are listening on headphones,” he adds. “This approach offers much more freedom when recording and more creative control when crafting immersive audio mixes in post-production.”

In Soho, London, Jungle Studios, who specialise in sound design and music for advertising, promotion and broadcast, have also been bitten by the immersive audio bug, with clients including Standard Chartered and Liverpool FC. “Immersive sound is, for us, sound that’s perceived to be all around the listener, including above and below,” says Steven Boardman, part of the Jungle Studios tech team, specialising in R&D for immersive sound. Jungle have delivered their immersive audio technology for some unique applications.

“After ten years of research, I’ve designed a room that houses a full-sphere 31-speaker array with four subs, which creates the perfect environment for immersive audio,” says Boardman. He says that one example of the use of this technology is a speaker array that allows video-based discussions to take place inside an immersive soundfield. When the video is rotated, the soundfield rotates too. As you can imagine, this type of remote networked communication system has the potential to improve the quality of engagement for attendees at virtual meetings and, consequently, the environment too.

Contemporary immersive audio techniques build on past developments, as Boardman explains: “The techniques we use are not proprietary and as they are based on sound physics and psychoacoustics, they will continue to work way into the future. They are also format exclusive, non-dependent on playback system, and totally down-mix/up-mix compatible. This means all past and future formats are catered for within one system. It is open ended and allows audio to be upgraded to any future resolutions easily, with very little re-mixing.”

Parnell says that the cutting edge in immersive audio is combining binaural technology with head-tracking data, so that as one moves one’s head the sound field seems to stay in position. “This works by rendering in real-time the source material into the headphones in order to compensate for the head movements as the listener ‘explores’ the sound world around them, much like looking around a 360-degree video,” he adds.
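
The compensation Parnell describes can be illustrated with a minimal sketch. It assumes the tracker reports only yaw (left-right head rotation) in degrees and that each source has a fixed azimuth in the room; the function names and the sign convention (positive angles to the listener's left) are illustrative assumptions and are not taken from any BBC tool.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def source_azimuth_relative_to_head(source_azimuth_deg, head_yaw_deg):
    """Return the azimuth at which a source must be rendered in the
    headphones so that it appears to stay fixed in the room.

    Assumed convention (hypothetical): 0 = straight ahead at the start,
    positive angles = towards the listener's left, for both arguments.
    """
    # Turning the head left by some angle moves every fixed source
    # to the right by the same angle, as seen from the head.
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)
```

For example, a source straight ahead (0 degrees) is rendered at -90 degrees, directly to the right, once the listener has turned their head 90 degrees to the left. A real renderer would repeat this calculation for every tracker update and would also account for pitch and roll.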

Ambisonic recording has been in use since the 1970s, and Jungle say they have improved on the techniques used in the past. “We cater for Ambisonics at high resolutions,” says Boardman. “This is a bit of a buzzword right now, with Facebook, YouTube/Google, HTML5 and every HMD jumping on the bandwagon.” Boardman’s colleague Chris Turner, senior sound designer at Jungle, adds that the rise of immersive audio has been driven by VR, AR and 360 video applications and the need to match audio to the visual. “If you have a beautiful full sphere of visual objects that respond to head movements, then the audio needs to match this too. When all of the senses are aligned, then we are truly immersed in the scene, and only then does it become believable.” Jungle are also experimenting with a technology called Spatial PCM Sampling (SPS). 

“This has the ability to carry more spatial resolution per channel than Ambisonics,” says Boardman. “Its point source rendering is more accurate and it has the ability to ‘beam in’ on specific points in space without much bleed.” Turner says that Jungle use specialist 360-degree microphones to capture the best immersive sound at source. “We recently recorded A Day In The Life, the story of a fan going to Anfield football ground to watch Liverpool play,” he says. “You hear the entire day as he heard it, and it climaxes with the sound of thousands of fans singing You’ll Never Walk Alone. The experience is incredible and something you could never get with a cheat in post.”

Parnell says that a third of BBC Radio audiences now listen using headphones, so it is timely that the BBC are producing content that is especially suited for this mode of listening. “The main benefit is a more immersive experience – the sense that the sounds are coming from all around you, perceived as coming from outside your head, and with better localisation,” he says. Parnell explains that these attributes all combine to produce a more realistic listening experience – and that preliminary research suggests that binaural sound can even be used to improve speech intelligibility for hearing-impaired audiences.

“This new technology is becoming more popular with BBC Radio producers, with binaural radio dramas and live music recordings now regularly in production.” Binaural sound was used for Radio 3’s coverage of the 2017 Cut and Splice festival in Manchester, while the BBC Radio 4 programme Quake presented a series of binaural audio dramas exploring modern responses to environmental disaster, with one produced in ‘cinematic VR’. The short series Pod Plays has also been specially written to showcase the potential of immersive audio.

Parnell says that the BBC R&D audio team has developed an object-based approach to immersive audio, where each sound source is treated as an ‘object’, placed at a 3D position and rendered into a binaural mix using the Head-Related Transfer Function (HRTF) filter that corresponds to that position. “This is a very fast-moving area, particularly with regard to audio tools for VR, and there are multiple DAWs and plugins that support 3D panning and/or binaural rendering,” he adds.
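
The object-based idea can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the BBC R&D implementation: the three-entry impulse-response table is invented toy data (a real renderer would use a measured HRTF set with hundreds of positions), only azimuth is considered, and the names AudioObject, TOY_HRIRS and render_binaural are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    samples: list        # mono signal for this source
    azimuth_deg: float   # 0 = front, positive = listener's left (assumed)


# Toy impulse responses: azimuth -> (left ear, right ear).
# Invented values for illustration only, NOT measured HRTF data.
TOY_HRIRS = {
    -90.0: ([0.0, 0.0, 0.3], [1.0, 0.0, 0.0]),  # source on the right
    0.0: ([0.7, 0.0, 0.0], [0.7, 0.0, 0.0]),    # source in front
    90.0: ([1.0, 0.0, 0.0], [0.0, 0.0, 0.3]),   # source on the left
}


def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def nearest_hrir(azimuth_deg):
    """Pick the table entry closest in azimuth (no interpolation,
    no wrap-around handling: enough for this toy table)."""
    key = min(TOY_HRIRS, key=lambda az: abs(az - azimuth_deg))
    return TOY_HRIRS[key]


def render_binaural(objects):
    """Filter each object with the left/right responses for its
    position and sum the results into a two-channel mix."""
    left, right = [], []
    for obj in objects:
        left_ir, right_ir = nearest_hrir(obj.azimuth_deg)
        for mix, ir in ((left, left_ir), (right, right_ir)):
            wet = convolve(obj.samples, ir)
            mix.extend([0.0] * (len(wet) - len(mix)))  # grow mix if needed
            for n, value in enumerate(wet):
                mix[n] += value
    return left, right
```

Calling render_binaural([AudioObject([1.0, 0.5], 80.0), AudioObject([1.0], 0.0)]) places the first object to the listener's left, where the toy data makes the left ear louder and earlier than the right, and the second object dead ahead. Because each object keeps its own position until the final render, moving a source only means changing its azimuth and rendering again.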

“For the Proms I used IRCAM’s Panoramix console, which is designed for live 3D music production mixing and can render simultaneous outputs in stereo, Ambisonics, multiple loudspeaker, and binaural formats, enabling your choice of SOFA file (Spatially Oriented Format for Acoustics is an industry standard file format for HRTF sets) to be used in binaural.” Parnell says that the binaural Proms trial has highlighted the benefit that immersive audio can bring to listening to acoustic music, particularly where there is a spatial element to the performance. Concert halls are particularly suitable venues for immersive audio, their acoustics being architecturally designed to fill the hall with sound. Using immersive audio technology brings this experience right into the home (or pockets) of the listener. The BBC has gathered some statistics from those who accessed the live binaural stream on BBC Taster, where a majority felt that the results were “like being there in person” while a similar majority believed that Radio 3 should broadcast more binaural sound.

“I worked on a project called Cinime a few years ago with Chris,” says Jungle’s Boardman. “This was an interactive advert portal/platform for cinema that allowed users to interact with content on the big screen via the small screen on their smart phones.” Turner says that Jungle are currently working on a project that highlights the lasting effects on children when a parent commits suicide. “It’s a difficult piece to get right and less is proving to be more but even subtle immersive movement within the music and recording the narrators in binaural audio is proving incredibly powerful.” Although immersive audio has long been a feature of both audio and video applications, its use in the field of recorded music has been limited and sporadic – however, Jungle think that is about to change. “We’re extremely excited about the prospect of the music industry getting on board the spatial bandwagon,” says Boardman.

“At the moment music presented as immersive audio isn’t common. Its use is generally driven by game engines interactively in real-time and according to game play, while music and non-diegetic audio is usually replayed in plain old stereo.” Boardman feels that music, most of all, can benefit from immersive audio techniques and, if presented in this format, will ultimately allow the listener to better connect with the performance. “This is how we hear music live,” he says. “Immersive audio will enable the end-user to customise how they listen to music. In actual fact it will allow them inside the music and allow their ears and brains to choose what and how to listen.” Universal Music Group (UMG) appears to agree and is working with the company Within to create an app that can deliver immersive audio to consumers.

Chris Milk, co-founder and CEO of Within, believes that music is one of the most uniquely transformative mediums of human expression and that combining it with immersive AR and VR experiences will create a new art form that is more powerful than the sum of its parts. “This partnership with Universal allows us the incredible opportunity to work with top artists at UMG to create ever more meaningful and expressive immersive music experiences,” he says.

We’ve come a long way from the early experiments in immersive audio, and new technological developments have given the audio engineer greater control over the placement of sound, the overall acoustic environment and where a listener is placed therein. As VR headsets and the likes of Google Glass become more common, they will present content creators with new creative opportunities and challenges, and immersive audio is likely to become an ever larger part of the audience’s everyday audio experience.

Stephen Bennett has been involved in music production for over 30 years. Based in Norwich, he splits his time between writing books and articles on music technology, recording and touring, and lecturing at the University of East Anglia.