Designer Audio: Marco Perry on building spatial audio systems

Renowned immersive sound artist and designer Marco Perry has collaborated on various cutting-edge audio projects, such as creating a spatial audio system for Björk at New York’s Museum of Modern Art and an immersive setup for the Dark MoFo festival in Australia last year. In this Q&A he takes us behind the scenes of his boundary-pushing sound work and gives us his predictions on the future of 3D audio technology.

What are the biggest developments you have seen in the immersive and spatial audio arena over the last few years?

When I first decided to devote time to the commercial application of spatial audio formats, there were none readily available on the market. There was nothing to simply “buy into”, nor anything out there that I thought was flexible or adaptable enough to cater for my vision of what we should be aiming for. So we made our own Ambisonic sound systems and, using psychoacoustic techniques, created a palette of spatial audio methods. I designed and built my own production rigs for decoding object- and channel-based audio.

AES colleagues Dave Hunt, Richard Furse, Bruce Wiggins and Dave Malham, among others, were developing software using Reaper, Max/MSP, Linux, etc. These were exciting times with artist-led ideas and challenging briefs, using ambitious interfaces, haptic controllers and lots of boxes and wires.

In 2012, Dolby bought IMM Sound of Barcelona and eventually entered the 3D audio market with a branded format. They worked hard at putting studios in place to render the Dolby-exclusive file content, and put file playback systems in place to convince us all that Dolby Atmos was the way to go.

Auro-3D became an excellent spatial sound stage audio format, while TiMax and Merging are notable purveyors of sophisticated multichannel surround sound hardware for use in theatres and installations. Boundaries have been pushed in general from all corners of the audio industry, including from manufacturers. The desire for more immersive audio experiences has been very much driving this movement.

The gaming market is also a huge driving force, with major players always keen to make the audio in their product as sophisticated and as real as possible for gamers. The large and emerging VR companies, Magic Leap along with Apple, Microsoft, AMD, Sony, Samsung, HTC and all the rest, are most likely 90% of the driving force behind the interest in immersive audio, because positional audio information in a VR headset is important.

For my own part, I think a binaural decode for VR is often best derived from an Ambisonic sound field, in either a moving or a static experience, and the use of 360-degree convolution reverb is crucial to creating either a static or a real-time moving interactive experience.
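
As a rough illustration of why an Ambisonic sound field suits head-tracked VR, here is a minimal Python sketch. It is not taken from any of the production systems described here: the function names are hypothetical, and it assumes first-order B-format with the traditional W, X, Y, Z channels (W scaled by 1/√2) and azimuth measured anticlockwise from straight ahead. It encodes a mono sample at a given direction, then counter-rotates the whole field for a head turn. A real binaural decode would follow by filtering the rotated field with HRTFs, which is left out here.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: azimuth anticlockwise from front, W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(az) * math.cos(el)   # front-back component
    y = sample * math.sin(az) * math.cos(el)   # left-right component
    z = sample * math.sin(el)                  # up-down component
    return (w, x, y, z)

def rotate_for_head_yaw(bformat, head_yaw_deg):
    """Counter-rotate the sound field when the listener's head turns by head_yaw_deg.

    Only X and Y change for a yaw rotation; W and Z are unaffected.
    """
    w, x, y, z = bformat
    t = math.radians(head_yaw_deg)
    x_rot = x * math.cos(t) + y * math.sin(t)
    y_rot = -x * math.sin(t) + y * math.cos(t)
    return (w, x_rot, y_rot, z)

# A source hard left (90 degrees); the listener then turns to face it.
field = encode_first_order(1.0, 90.0)
facing = rotate_for_head_yaw(field, 90.0)
print(facing)
```

Because the rotation is a small matrix operation on the whole field rather than a re-pan of every source, the same recorded or rendered sound field can serve both a static and a moving listener.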

Other new and established microphone companies are making spatial audio microphones that have gone beyond the confines of the dummy head technique traditionally known for its deployment in classical and orchestral recordings. Now we have quad binaural mics, 8 Ball, MH Acoustics, Eigen etc. Spatial audio mic techniques are also becoming available in a variety of affordable hand-held hard disc recorders and, of course, directional microphones and highly advanced recording and hearing devices are now being developed for deployment in your phones.

How did you go about creating the system for Björk’s MoMA gig?

I have worked with artists all my life and I think because of this I’m able to deal with the rigours of an unexpectedly changing schedule and to cope with the little surprises and demands that will come with this kind of gig.

In my Immersive Audio production studios in London, she said: “I don’t want to hear the bass so much as feel it,” so we went on a journey to find the most appropriate decode of a bass “feeling” across a range of frequencies that are seriously low and have long wavelengths. With technical conundrums like this, you then adapt to make the feeling work every time in different venue locations and on different loudspeaker systems.
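
To give a sense of how long those wavelengths are, here is a small worked sketch. It assumes a speed of sound of roughly 343 m/s (air at about 20°C); the function name and the example frequencies are illustrative only and are not figures from the project.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def wavelength_m(frequency_hz, speed_m_s=SPEED_OF_SOUND_M_S):
    """Return the wavelength in metres: speed of sound divided by frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / frequency_hz

# Illustrative low frequencies, from deep sub-bass upwards.
for f in (20.0, 40.0, 80.0):
    print(f"{f:.0f} Hz -> {wavelength_m(f):.2f} m")
```

At these frequencies a single wave can be comparable in size to the room itself, which is one reason the same bass content can feel so different from venue to venue.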

Plus, you know how she should sound and the balance of the piece. She trusts your judgement to make the immersive audio mix containing her original emotional intent in that content, so whatever piece you work on and on whatever sound system you have, and whatever acoustic you are presented with, you strive to always make the production work best. That is always the challenge.

Could you tell us how you went about satisfying the spec for the Dark MoFo Festival?

The Dark MoFo sound designs were by invitation from the festival organisers. The outdoor piece was commissioned by the festival to be the headline art installation and ran throughout the festival performance nights up to the winter solstice on 21 June.

Built in Melbourne, shipped to Tasmania then assembled on site, three 35-metre-high towers were positioned at the points of an equilateral triangle contained in a 45-metre-diameter circle painted on the ground. Lasers and sound hardware were positioned within the tripod legs of the towers and masts. 170 kg of our immersive audio control gear was flown from London, including Macs, PCs, FX boxes, my trusty RME converters, and DiGiCo and Midas mixers.

Taking into account the dispersion characteristics of the d&b boxes and the coverage required for the main system, I deployed three full-range hangs including flown subs, with infra subs on the floor, plus six ground-stacked full-range systems. My team designed and built software for sound control and object-based audio panning within the circle area and across the site.
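
The team's actual panning software is not described in detail here, so the following is only a hedged sketch of one common approach to object-based panning, a simple distance-based amplitude pan. It assumes the three towers sit exactly on the 45-metre circle (radius 22.5 m) at 120-degree intervals; the function names, the `rolloff` and `blur` parameters and the chosen tower angles are all hypothetical.

```python
import math

CIRCLE_RADIUS_M = 22.5  # half of the 45-metre diameter

def tower_positions(radius=CIRCLE_RADIUS_M):
    """Place three towers on the circle at the corners of an equilateral triangle.

    Assumption: the triangle is inscribed in the circle, with one tower at 90 degrees.
    """
    return [
        (radius * math.cos(math.radians(a)), radius * math.sin(math.radians(a)))
        for a in (90.0, 210.0, 330.0)
    ]

def pan_gains(source_xy, speakers, rolloff=1.0, blur=1.0):
    """Distance-based amplitude panning: nearer loudspeakers get more gain.

    'blur' (metres) keeps the distance above zero when the source sits on a speaker.
    Gains are normalised so their squares sum to 1 (constant total power).
    """
    sx, sy = source_xy
    raw = []
    for px, py in speakers:
        distance = math.sqrt((sx - px) ** 2 + (sy - py) ** 2 + blur ** 2)
        raw.append(1.0 / distance ** rolloff)
    norm = math.sqrt(sum(g * g for g in raw))
    return [g / norm for g in raw]

towers = tower_positions()
print(pan_gains((0.0, 0.0), towers))    # object at the centre of the circle
print(pan_gains((0.0, 15.0), towers))   # object moved towards the first tower
```

An object at the centre of the circle is shared equally between the three towers, and as it moves towards one tower that tower's gain rises while the total power stays constant. A production system would add per-loudspeaker delay, EQ and level trims on top of this.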

I created the soundtrack for this with my long-time musical collaborator David Clayton and with Rob Del Naja of Massive Attack. Renowned artist Chris Levine worked with Tyler Le Dent from ER Productions to fine-tune the choreography of the laser program to synchronise with the audio map I’d compiled for us in London.

I had begun in the Immersive Audio London production studio, where I have a 43-loudspeaker production and mix rig which I can configure appropriately for any spatial audio production and mixing work. Ultimately, live mixing on site was required to best decode the program to this extraordinary sound system, so I set up a temporary studio in a ground floor office overlooking the site and refined the content there on a compact six-point loudspeaker system mounted on stands. When I was confident with the parts and had experimented with my performance effects, I built a mobile audio control system and mixed the whole piece live from the centre of the circle.

I recorded this performance with all the desk and live analogue effects feeds including object pans and levels. This became the basis for the subsequent live playback performances of the work.

To what extent are we seeing a move towards more of these live 3D sound installations?

The rising popularity of immersive audio experiences is completely unstoppable. Now that we have VR in our lives, with HMDs and goggles on sale in the high street, there is big money available from audio, video gaming and media companies. We have the virtues of spatial broadcast formats being expounded and explored by the BBC, for example, which is researching ever more mass-consumable live 3D audio and visual broadcast techniques.

The live performance scene is changing rapidly, certainly at the top end. If you’re a top-end touring production outfit, why would you now invest big money in another traditional stereo sound reinforcement or PA system when there are so many more exciting and ever more sophisticated spatial audio options for your buck? Investors, promoters, artists, producers and venues all want to be ahead of the game and to wow audiences with great-sounding gigs.

Audiences are becoming more discerning. They may not know what it is or how it was achieved but that incredible sound they heard is what they will remember and what they will talk about and what they’ll want to hear again and again. Creative decisions in theatre productions and concert halls also have more scope in immersive audio.

What has been your favourite project and what is the most important aspect of creating an immersive mix?

If I’m designing a sound field in an installation for an audience in a specific venue, I start by thinking backwards. I’ll visualise the finished work audibly and see how close I can get to doing what’s necessary in the audio content production, in the installation fabrication and in the physical delivery of the work to achieve what I see in my mind’s eye.

When I hear sound, I also have a vision of it. Often it’s a landscape-style picture, and often it also has three-dimensional form. If you’re producing a sound field for your audience, then the final appraisal of it will be the sum of all your efforts. Go on a site visit if possible; understand the fabrication, size and acoustic of a space; memorise it. This will also help in planning any room treatments.

When mixing in the installation, traditional skills still apply. First take measurements, make drawings and calculations, and work out loudspeaker characteristics, dispersion angles, coverage and frequency response. Use spectrum analysers, sweep generators, impulse response measurements and room equalisers.

My favourite project is one which I’m working on right now. Basically it’s a two-part AV experience where we’ve created complementary installations to run both inside a gallery and outside in the grounds, with sound and visual composition from the AVarts team in both installations.

Outside the gallery is a public open space which means anyone can enjoy it and be impacted by the art. Hopefully they will be, and if it doesn’t raise the level of their consciousness and awareness of the space then at least it should raise a smile.

Inside the gallery the work is more personal and challenging, which should lead to heightened sensitivity, stimulating a reaction to both the audio and visual program content we’ve created. I can virtually guarantee that no one will have ever seen or heard anything like this in their lives before, so it should be both provocative and entertaining. I’d love to be a fly on the wall in any of my installation designs; this one should be interesting and I’m looking forward to seeing and hearing the feedback.