Skyview: Spatial Audio AR app experience for an iconic setting in Stockholm


Skyview is a gondola ride that takes visitors to the top of the world’s largest spherical building, AEG’s Globe Arena in Stockholm, Sweden. Building on this remarkable site-specific setting, a few of us from Microsoft Research’s Soundscape team created a “world-first” Audio AR experience that adds immersive audio storytelling to the Skyview gondola ride.

Visitors are given headphones and an iPhone with our Skyview app upon boarding the ride, and then embark on a guided interactive audio tour featuring ambisonic field recordings of the city’s top landmarks that I captured and edited over four trips to Stockholm in 2018-19.

As the Audio Experience Developer, I owned the end-to-end ambisonic design and Audio UI asset creation process. I worked in close partnership with our Swedish clients at AEG, and of course with Microsoft’s project lead Jarnail Chudge and our SDE Michael Pramateftakis, to integrate these interactive audio stories through repeated rounds of prototyping, research and design. I also wrote the copy for the narrative track overlay, which added a storytelling element for contextual understanding, and I ended up recording and performing the English version of the Narrator role.

Happily, the project was honored with the 2020 Golden Wheel Silver Award in the Craft of the Year category (Sweden’s annual Advertising and Experience Industry Awards)!

 

Spatial Audio Development Process

The Audio Experience Design effort began with audio wireframes and dummy assets, so that I could iterate very quickly while working closely with SDE Pramateftakis to implement the initial prototypes of the app. Early on I developed the idea of a Home Base Layer that would play for the entirety of the gondola experience (~20 mins) and mimic the real-time height and sonic landscape of the gondola at that point in its journey. This supports a continuity of psychoacoustic trickery: whether or not visitors are currently listening to a landmark selection, they hear Base Layer ambisonic audio that feels representative of their position. Close to the ground, that means bicycles passing by and snippets of pedestrians’ conversations; at 130 meters in the air, it means wind, distant highway noise, seagulls passing, an airplane flying overhead, etc. After a landmark vignette ends and fades out, the Base Layer fades back in and returns guests to this naturalistic soundtrack at the appropriate place in their real-world journey.
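To make the Base Layer idea concrete, here is a minimal Python sketch of how height-dependent gains and the fade around a vignette could be computed. It is only an illustration under stated assumptions, not the code that shipped in the app: it assumes just two ambience beds (ground and sky), an equal-power crossfade driven by height, and a linear two-second fade. The function names, the two-bed split and the fade length are all hypothetical; only the 130 m peak height comes from the description above.

```python
import math

MAX_HEIGHT_M = 130.0  # peak gondola height mentioned in the text


def base_layer_gains(height_m, max_height_m=MAX_HEIGHT_M):
    """Equal-power crossfade between a ground bed and a sky bed.

    Hypothetical helper: returns (ground_gain, sky_gain) for a given height.
    """
    # Normalize height to 0..1 and clamp out-of-range sensor values.
    t = min(max(height_m / max_height_m, 0.0), 1.0)
    ground_gain = math.cos(t * math.pi / 2)
    sky_gain = math.sin(t * math.pi / 2)
    return ground_gain, sky_gain


def base_layer_duck(seconds_since_change, vignette_playing, fade_s=2.0):
    """Overall Base Layer gain around a landmark vignette.

    Assumption: the layer fades out linearly when a vignette starts and
    fades back in when it ends; fade_s is an illustrative value.
    """
    progress = min(max(seconds_since_change / fade_s, 0.0), 1.0)
    return 1.0 - progress if vignette_playing else progress


if __name__ == "__main__":
    # Halfway up, one second after a vignette has ended.
    ground, sky = base_layer_gains(65.0)
    duck = base_layer_duck(1.0, vignette_playing=False)
    print(ground * duck, sky * duck)
```

The equal-power curve is used here because it keeps the combined loudness of the two beds roughly constant during the climb; a real mix would likely blend more than two beds.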

Here are some pictures from my field recording site visits. I used the first-order ambisonic Sennheiser AMBEO VR mic, a Sennheiser MKH 418-S shotgun mic, and the Zoom F8 recorder. I’d often spend 30-60 minutes at each site and distill that down to an edited 30-second 3D audio vignette. Then I’d overlay about 15 seconds of narration describing the setting and scene, its historical or cultural significance, architectural characteristics, etc.

Screen Shot 2020-02-08 at 4.01.59 PM.png

Here’s a brief screen recording that visualizes the ambisonic audio editing process within Reaper, using the Waves Ambisonic Toolkit set of editing plugins (with a Bluetooth headtracking device attached to the monitoring headphones to simulate the interaction). Colors represent frequency range, and it is important to keep an eye on spatial balance as the vignettes get refined and directional assets are laid into the 3D soundfield.
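For readers curious about what head tracking does to an ambisonic mix, the sketch below shows the usual idea in Python: when the listener turns their head, the first-order soundfield is counter-rotated around the vertical axis so that sounds stay anchored to the world. This is a generic illustration, not the Waves plugin's or the app's implementation. The function name and the axis convention (X front, Y left, Z up, positive yaw meaning a turn to the left) are assumptions made for this example.

```python
import math


def rotate_for_head_yaw(w, x, y, z, head_yaw_rad):
    """Counter-rotate one first-order B-format sample for a head turn.

    Assumed convention: X points front, Y points left, Z points up, and
    positive yaw means the listener turns to the left.
    """
    c = math.cos(head_yaw_rad)
    s = math.sin(head_yaw_rad)
    # A source at azimuth a is heard at (a - yaw) relative to the head.
    x_rot = x * c + y * s
    y_rot = y * c - x * s
    # W (omni) and Z (height) are unaffected by a rotation about Z.
    return w, x_rot, y_rot, z


if __name__ == "__main__":
    # A sound straight ahead, heard after turning the head 90 degrees left,
    # should now arrive from the right (negative Y in this convention).
    print(rotate_for_head_yaw(0.7, 1.0, 0.0, 0.0, math.pi / 2))
```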

 

Key takeaways from the project development

Iterate quickly – Tight loop with developer allows for Build/Study/Learn

Exaggerate spatialization – Stretch the movement of mono sources layered on top of the ambisonic recording to make their motion clear.

Subtle sound design without a visual overlay will be lost on the average guest.

Overlaid Sound Design should be:

In service of spatialness.

Slow-paced or looping, giving guests time to localize the sound source within the invisible geography.

Narration should:

Give explicit context or explanation for the sounds in real time.
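As an illustration of the "exaggerate spatialization" takeaway, here is a small Python sketch of one way a mono source's position could be stretched before it is encoded into a first-order soundfield. It is a sketch under assumptions rather than the method used on the project: the function names and the 1.5x stretch factor are hypothetical, and the encoder uses the traditional B-format equations with W scaled by 1/sqrt(2) (X front, Y left, Z up).

```python
import math


def exaggerate_azimuth(azimuth_deg, stretch=1.5):
    """Scale an angle away from straight ahead, clamped to +/-180 degrees.

    The stretch factor is an illustrative value, not a project setting.
    """
    return max(-180.0, min(180.0, azimuth_deg * stretch))


def encode_mono_foa(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample into first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z


if __name__ == "__main__":
    # A seagull that really passed 40 degrees to the left is placed at 60.
    placed = exaggerate_azimuth(40.0)
    print(placed, encode_mono_foa(1.0, placed))
```

Widening the angle this way makes a pass-by sweep across more of the soundfield than it did in reality, which is the kind of overstatement that helped guests notice directional cues.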


Project Launch + Press

Launched in November 2019, the project has been warmly received by press and visitors and has quantifiably added value to the Skyview gondola experience. That has in turn spurred new conversations about how we might use 3D audio-focused AR to transport the ears and imaginations of visitors in other site-specific projects.

Stockholm Direkt press / IT Kanalen press / Expressen press

Microsoft 3D Audio Symposium PowerPoint presentation (link)

In Spring 2019 I was invited to give a presentation at an on-campus event called the 3D Audio Symposium. It was an internal event where we shared learnings and experiences among different teams that are all in some way linked to the implementation of 3D audio. It was great to share the Skyview project with my colleagues, and these slides will give a sense of my talk that day.