A press release recently announced a new PlayStation3 game called DiRT, and stated that it used Ambisonic sound. For those of us in the surround sound community, implementing Ambisonics in games is an obvious match … but for those in the gaming world, the word Ambisonics is met with reserved excitement.
Codemasters are the people behind DiRT, and Simon Goodwin (Principal Programmer) is the pioneer. I tracked Simon down and asked him the how/what/why behind this significant event. I discovered Simon is not a newcomer to Ambisonics …
How was the team at Codemasters exposed to Ambisonics?
It predates my work at Codemasters. I became interested in it when I was working as a contractor for Amiga Inc, leading their audio development team around 2000 and optimistically planning a ten-year audio strategy for the corporation. Don Cox investigated Ambisonics for us, and I spent a while lurking on the Sursound list and got to know the theory and the main names like Rumsey, Farina, Malham and of course Gerzon. I spoke to several people and the same ideas kept coming up.
So I visited Dave Malham and Tony Myatt at York MRC – our local centre of excellence – a couple of times, to gain a better understanding of Ambisonic routines, and hear it working well. I then implemented a hybrid third-order version on the PS3, taking advantage of vector processing and maths optimisations and integration with our Neon game engine and cross-platform audio systems.
Why did you choose Ambisonics?
It’s still the only mathematically and psychoacoustically valid way to pan in more than one dimension, as far as I know! And I’ve heard it done wrongly (even in 1D, surprisingly often) so I realise it’s an important problem.
When we surveyed our customers and found out that between a third and half of them were set up for surround – depending on the platform, and though they didn’t always use it – it was obvious that we needed a strategy to make our games sound really good for them, which did not neglect the needs of people listening on headphones, or stereo speakers, or the (still significant) mono minority. Rather than make or fake a 5.1, 4.0, stereo speaker, headphone (not the same!) and mono mix from some subset on the game disc, with all the problems of matrixing, phasing and asset management, Ambisonics was the obvious way to go.
Are you getting good feedback on the sound for this game?
Certainly. IGN say:
“These truly next-gen graphics are complimented by some great sound effects. As well as the accurate navigation being piped to you by your navigator, the ambient sound effects are terrific. Engines have that rough “baaaarp, baaaarp” that rally cars are renowned for, while the sound of the gravel bouncing off the underside of the car is a subtle touch. I love the way that sounds become a little echoey when driving next to a wall, rather than just in tunnels.”
Pro-G Website: (overall score 8/10)
“A special mention has to go to the game’s audio, which when running through a 5.1 setup is incredibly immersive; even when racing alone, you get a great sense of moving through a tight course. Music is used sparingly, but menus and replays all feature suitably trendy tunes, and a simple but cool slow-motion mode during replays never grows tiresome, with the audio slowing down and becoming ever so slightly subdued to great effect. Sound: 9/10”
Edge magazine preview (June 2007)
“The environments are varied, challenging and – as you might expect – beautiful. As you might not expect, however, they also sound fantastic. The noise of grass stems whipping against the mud flaps and gravel peppering the underside of the chassis adds such personality and immersion to the tracks that you never find racing solo to be any less diverting than when racing alongside others.”
I’m sure you’ll find more via Google. Of course the credit is due not just to my tech and Gerzon’s Ambisonic theory but also to Adam Sawkins and Stafford Bawler, the game audio programming and sound design leads, and their teams.
Do you feel it is a step up on surround experience?
Well, true multichannel reverb (not just mono or stereo panned) and high-order Ambisonics via uncompressed 24 bit 7.1 channels are original features in gaming, as far as I know, along with the creative use of per-voice filters and often well over 100 voices playing at a time. But you ought to listen and make up your own mind.
I think it sounds a lot better than games I’ve worked on for the previous console generation, and other people’s “next-gen” game audio so far, but I’m biased, and aware that there’s still a long way to go.
Any plans to play around with height?
Of course, and we experimented with it for DiRT, though the option didn’t make it into the front-end of the game because we wanted to concentrate on existing consumer audio set-ups. Height is a solved problem, technically, with Ambisonics, and using the 3D data available in games – but the trick is to work out where to put the speakers! I talked about this at the IASIG meeting in San Francisco and am raising discussions with major hardware manufacturers (who we’d need onboard to set a new standard) and the AES, of which I’m a member. We could do it now, but it’s pointless if customers haven’t got the set-up required to hear the results, and we don’t want to clutter the game interface to support an ill-defined option, let alone a raft of them.
Despite all the APIs claiming 3D audio and barely managing 2D, I feel it’s not viable with 5.1, but 7.1 can just about cope – the trick is to remain compatible with ITU and the competing de-facto game standard quad layout (used in XAudio, PS3 MultiStream and Apple’s OpenAL panner) which expects speakers (other than the cinema-legacy front centre and sub) at right angles. That suits games fairly well, but not Hollywood cinema-packing fake surround conventions, and it’s probably even less well-publicised than the seldom-honoured ITU layout; in fact consumers who know no better seem more likely to opt for an approximation to quad – or a pile of speakers under the telly! – than get out their protractors for an ITU-compliant setup, even though the quad layout is not good for stereo music.
There’s still no standard for 7.1 layout, and I’m interested to hear from people who’d like to help discerning listeners into 3D; I feel that it’s more important to have a standard, which is a step forward but not incompatible with existing media (including stereo as well as Cinema mixes), and ergonomically practical, even if this means the vertical expression is less extreme than you might get if you started from scratch. Even so, there are options well worth exploring.
Have you considered using B-Format material (e.g. recordings) as background audio? (Not sure how you would cope with moving scenes.)
Yes. Rotation is not a problem with Ambisonics, though it doesn’t work for other panners, which go in and out of focus, which is why it’s been avoided by well-made games – it’s been better up till now to leave the pre-rendered soundfield in place than to track the camera. Ambisonics gives us both options, and we’re not necessarily restricted to first order horizontal, either! B-Format is obviously the most sensible way to capture ambience in future, and though we’ve done eight-mic recordings of stadium sound for in-game surround hitherto, we plan to use soundfield mics from now on. And the same applies to synthetic soundfields, like battlefields, especially with multiple layers or shells of surround audio…
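To illustrate why rotation is cheap with Ambisonics, here is a minimal Python sketch of rotating one first-order B-Format sample frame about the vertical axis. It is not Codemasters' code: the function name and the axis convention (X forward, Y left, Z up, angles counter-clockwise seen from above) are assumptions made for this example, and only first order is shown.

```python
import math

def rotate_about_vertical(w, x, y, z, angle):
    """Rotate a first-order B-Format frame by `angle` radians about the vertical axis.

    Assumed convention (for this sketch only): X points forward, Y left, Z up,
    and a positive angle turns the soundfield counter-clockwise seen from above.
    W (omnidirectional) and Z (height) are unaffected by a rotation about Z.
    """
    c = math.cos(angle)
    s = math.sin(angle)
    return w, x * c - y * s, x * s + y * c, z

# A sound encoded straight ahead (X = 1, Y = 0) ends up on the left (Y = 1)
# after a quarter turn counter-clockwise.
frame = rotate_about_vertical(0.7, 1.0, 0.0, 0.3, math.pi / 2)
```

To keep a recorded ambience fixed in the world while the camera turns, one would rotate the soundfield by the negative of the camera's yaw. The same two multiply-adds apply to every sample, whatever the soundfield contains, so nothing drifts in or out of focus as it turns.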
Can you give us any details on the implementation? Are you doing any spatial effects (like reverb, or sound bouncing off the ground etc.)?
The PS3 is so fast – tens of GigaFLOPs on each of seven CPUs available to us – that high-order Ambisonics suits it very well. Most of the optimisation effort went into the trigonometry needed to go from game-style orthogonal vectors and matrices to the azimuth and elevation model now standard for Ambisonics. After that, the encoder and decoder are very fast, especially as they parallelise well, without pipeline bottlenecks like division and tight operand dependencies.
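The step from a game-style direction vector to azimuth and elevation, followed by encoding, can be sketched in a few lines of Python. This is a first-order illustration only (the game used a hybrid third-order scheme whose details are not given here), and the function names, the axis convention (x forward, y left, z up) and the traditional 1/sqrt(2) weighting on W are assumptions made for the example.

```python
import math

def direction_to_angles(x, y, z):
    """Convert a direction vector (x forward, y left, z up) to (azimuth, elevation) in radians."""
    azimuth = math.atan2(y, x)                    # counter-clockwise from straight ahead
    elevation = math.atan2(z, math.hypot(x, y))   # upwards from the horizontal plane
    return azimuth, elevation

def encode_first_order(sample, azimuth, elevation):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z)."""
    w = sample / math.sqrt(2.0)                   # traditional B-Format W weighting (assumed)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z

def encode_from_vector(sample, x, y, z):
    """Same first-order result without trigonometry, by normalising the vector.

    A zero-length vector has no direction, so it is encoded as W only here.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return sample / math.sqrt(2.0), 0.0, 0.0, 0.0
    return sample / math.sqrt(2.0), sample * x / r, sample * y / r, sample * z / r

# A source directly to the left of the listener.
az, el = direction_to_angles(0.0, 2.0, 0.0)
bformat = encode_first_order(1.0, az, el)
```

At first order the direction components are just the normalised vector, so the trigonometry can be skipped entirely, as encode_from_vector shows. Higher orders need more elaborate functions of the angles, which is where optimisation effort of the kind described above would pay off.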
Overall Ambisonics complements other aspects of nextGen PS3 game audio, like good quality sample-rate-conversion – rather than the noisy LERPs still sadly common on PCs – plus modern psychoacoustically-modelled decompression, and phase-coherent 512 band filtering on each voice. There’s so much CPU power on PS3 that all this, and multiple reverbs, can run on a single SPU (Synergistic Processing Unit, an eighth of the PS3’s Cell processor array) with time to spare.
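For readers unfamiliar with the term, a "LERP" resampler is plain linear interpolation between neighbouring samples. The Python sketch below shows the idea only, so the cheap approach being criticised is concrete; the function name and the ratio parameter are hypothetical, and it says nothing about the higher-quality converter used in the game.

```python
def resample_linear(samples, ratio):
    """Resample by linear interpolation (LERP).

    `ratio` is the number of input samples advanced per output sample:
    0.5 doubles the sample rate, 2.0 halves it.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out = []
    last = len(samples) - 1
    pos = 0.0
    while pos <= last:
        i = int(pos)                       # index of the sample at or before pos
        frac = pos - i                     # fractional distance towards the next sample
        nxt = samples[min(i + 1, last)]    # clamp at the end of the buffer
        out.append(samples[i] + frac * (nxt - samples[i]))
        pos += ratio
    return out

# Doubling the rate of a short ramp.
doubled = resample_linear([0.0, 1.0, 2.0], 0.5)
```

Linear interpolation is only a weak low-pass filter, so aliases and images of the original spectrum leak through as the pitch changes, which is the noise referred to above. Better converters use longer interpolation filters at a higher CPU cost.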
There are six independent reverb units running in the PS3 version, versus two stereo ones on Xbox360. These are not just for reflections in tunnels or when you get close to trackside objects – they work beautifully for reflections from other vehicles too, and give exciting effects when the car goes out of control – the sort of emergent behaviour you look forward to getting when you combine several advanced systems in one game!
What kind of decoding are you doing? (Are you using shelf filters? Are you aiming at ITU speaker layouts?)
We have per-voice FIR filters but we are not using them as much as we could, and plan more research into this – listening tests will be crucial, and we have to cope with the world as it is, for our customers, rather than as we’d like it in a lab.
The HDMI 7.1 on PS3 already allows us to have six speakers in a regular hexagon, ideal for Ambisonics, without breaking the Blumlein stereo panning rules or Dolby cinema guidelines (so the front centre and sub are available for audio conceptually outside the soundfield, like co-driver calls, checkpoint notifications and front-end sounds).
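As a rough illustration of why a regular hexagon suits Ambisonics, here is a basic first-order horizontal decode in Python. It is a textbook-style sketch, not the decoder used in DiRT: the speaker azimuths (a hexagon with the front pair at plus and minus 30 degrees), the function name and the W weighting of 1/sqrt(2) are assumptions, and no shelf filters or other psychoacoustic refinements are applied.

```python
import math

# Assumed layout: regular hexagon, azimuths counter-clockwise from straight ahead,
# so 30 is front left and 330 is front right.
HEXAGON_AZIMUTHS_DEG = (30, 90, 150, 210, 270, 330)

def decode_horizontal(w, x, y, speaker_azimuths_deg=HEXAGON_AZIMUTHS_DEG):
    """Basic first-order decode of horizontal B-Format (W, X, Y) to a regular speaker ring."""
    n = len(speaker_azimuths_deg)
    feeds = []
    for deg in speaker_azimuths_deg:
        phi = math.radians(deg)
        # Undo the 1/sqrt(2) weighting on W, then add the directional part
        # projected onto this speaker's direction.
        feed = (math.sqrt(2.0) * w + 2.0 * (x * math.cos(phi) + y * math.sin(phi))) / n
        feeds.append(feed)
    return feeds

# A unit-amplitude source encoded at 30 degrees, exactly on the front-left speaker.
theta = math.radians(30)
feeds = decode_horizontal(1.0 / math.sqrt(2.0), math.cos(theta), math.sin(theta))
```

Because the speakers are evenly spaced, every direction is treated identically and the feeds always sum to the source amplitude. That uniformity is what an irregular layout such as ITU 5.1 cannot offer without per-speaker compromises.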
5.1 has to be a compromise in any game, because ITU leaves big holes at the sides and rear which matter much more in games than films. In a game you really need audio to tell you what’s happening outside your narrow TV or monitor view. So we use a compromise between ITU and quad layouts if listeners are limited to compressed AC3 or 5.1 over HDMI, and some gain tweaks to improve consistency. We did consider giving people a choice of speaker layouts, or implementing auto-calibration, but the design team reckoned there were enough audio options – in the PS3 dashboard as well as in the game – for the time being, without adding more.
What kind of backup / help would you like to see from the ambisonics community for extending the use of ambisonics in gaming? (documentation? code libraries?…)
The more people using Ambisonics, demonstrating its value and sharing ideas, the better. We’ve only got so much time and this is just one aspect of our Central Tech Audio team effort. We already co-operate with open-source developers and while console development NDAs limit code sharing to some extent, I’m sure it’s not just Codemasters games that will benefit. We’re always looking out for good C++ programmers who understand audio, to expand our pool of specialists in the development studio, so if you’re keen to turn a hobby into a career it’s certainly worth filling in the speculative application form at http://www.thenewbreed.info – saying you heard about Codemasters Audio via Ambisonia.com, of course.
Are you planning on using Ambisonics in other games?
Yes, and we plan to do much more with moving soundfields, impulse response reverbs and dynamic reflections, as well as true 3D on speakers and headphones. The Ambisonic support in DiRT PS3 is just a first step, though an exciting one.
Would be great to have some B-Format material to publish on Ambisonia.com … if you ever get around to extracting B-Format, please do let me know.
Will do, but don’t hold your breath. Your best bet for the time being is to find a well-configured PS3 with HDMI in 7.1 on matched speakers, and hear the game respond to you directly. It’s a lot of fun, especially if you’re a good listener.