A Dynamic Audio Spatialisation Prototype using Puck.js and the Web Audio API

I developed a mobile, headphone-mounted interface that tracks the orientation of the listener’s head and delivers dynamic, real-time spatialised audio. A sound can be positioned anywhere within a 360-degree audio field, and its perceived position remains fixed in physical space regardless of the rotation of the listener’s head and ears.

This was initially prototyped by mapping the magnetometer readings from an iPhone to OSC (Open Sound Control) messages using the GyrOSC [1] mobile app. These messages controlled the Envelop [2] surround-sound panner, inserted on a single track playing a piece of music in Ableton Live, with the music delivered through a pair of wireless headphones to allow the listener to rotate freely through 360 degrees.

Using the above prototype, the music’s perceived source in physical space could be located quite accurately by spinning around on an office chair, or by standing up and shuffling around on the spot. This was the case for both the on-ear headphones and the bone-conducting headphones, the latter resulting, to varying degrees, in the perception of a virtual audio source within the natural ambience of the physical space.

Because the panner was controlled by the hand-held magnetometer in the iPhone, its failure to react to movements of just the head (and the ears) quickly became apparent; this had also been experienced previously with other smartphone-sensor-based mobile sound experiences. It was therefore decided that a head-tracking device would perform much better.

Alongside the above experiment I had been exploring the use of Bluetooth beacons as a way of deploying physical markers in space for the positioning of virtual audio sources, or for tracking the position of listeners.

Other than assessing a beacon’s RSSI (Received Signal Strength Indicator) value from its proximity to a smartphone via a Bluetooth scanning app, most other features proved difficult or impossible to carry out. These include determining a distance from multiple beacon sources (also known as ‘ranging’) and adjusting the TX power and the Major and Minor values of the beacons. This is because most beacons are ‘locked’ into their manufacturer’s API via their UUIDs, in order to fulfil the IPS (Indoor Positioning System) service offered to customers who purchase them.
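The ‘ranging’ mentioned above is typically approximated from RSSI with the standard log-distance path-loss model. The following sketch illustrates the idea; the calibration values shown are illustrative assumptions, not figures taken from any particular beacon.

```javascript
// Approximate distance (in metres) from an RSSI reading using the
// log-distance path-loss model. txPower is the expected RSSI at 1m
// (advertised by most iBeacons), and n is an environmental attenuation
// factor (roughly 2 in free space, higher indoors).
function rssiToDistance(rssi, txPower, n) {
  return Math.pow(10, (txPower - rssi) / (10 * n));
}

// A beacon read at its calibrated 1m power is ~1m away:
// rssiToDistance(-59, -59, 2) → 1
```

This is also why locked-down TX values matter: without a trustworthy `txPower` calibration point, the model’s distance estimates drift badly.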

It also became clear that scanning for, receiving and forwarding the RSSI values of multiple beacons to a central DAW (Digital Audio Workstation), or some other kind of audio engine, would require a BLE (Bluetooth Low Energy) gateway, or GATT (Generic Attribute Profile) server. Again, there are mobile apps available that can do this, but most, if not all, are specific to a single beacon manufacturer.

This emerging investigation into BLE networks, with both central and peripheral roles for BLE beacon devices, led me to the Puck.js [3] product, which presented itself as a potentially useful and versatile research tool in this area.

The Puck.js is a JavaScript-programmable BLE beacon. Built around a Nordic BLE chip, it is programmed using Espruino, an open-source JavaScript interpreter. As well as being a BLE beacon, Puck.js can be programmed to act as a GATT server, scan for other BLE devices, and send and receive data over a Bluetooth connection. It is approximately 35mm in diameter and around 8mm thick, ships with an on-board magnetometer, ambient light sensor and temperature sensor, has a physical push-button on its surface, and can be expanded to include other sensors and features such as GPS.

The small, lightweight and mobile Puck.js enabled me to test a headphone/head-mountable magnetometer, and also opened up further avenues of experimentation, such as listener positioning, scanning for the proximity of other beacons, and relaying data to and from other BLE-capable devices (Figure 1).

Figure 1. The Puck.js mounted on top of a set of Bluetooth headphones.
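The Puck.js side of such a setup can be sketched in a few lines of Espruino JavaScript. This is an illustrative sketch, not the exact firmware used: it assumes the bearing is derived from the horizontal magnetometer axes and streamed one value per line over the Bluetooth UART (the device-only calls, which run on the Puck itself, are shown commented out).

```javascript
// Convert raw magnetometer x/y readings into a compass bearing in
// radians, normalised to the range 0..2*PI.
function toBearing(x, y) {
  var b = Math.atan2(y, x);
  return b < 0 ? b + 2 * Math.PI : b;
}

// On-device wiring (Puck.js/Espruino only - uncomment when uploading):
// Puck.magOn(5); // sample the magnetometer at 5Hz
// Puck.on('mag', function (m) {
//   Bluetooth.println(toBearing(m.x, m.y).toFixed(3));
// });
```

A real deployment would also need the per-environment calibration offsets discussed later; the raw axes alone give a bearing relative to magnetic, not true, north.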

<html>
<head>
<script src="https://www.puck-js.com/puck.js"></script>
<script src="https://cdn.jsdelivr.net/npm/resonance-audio/build/resonance-audio.min.js"></script> 
<style>
  button {
    width: 200px;
    padding: 20px;
    margin-bottom: 10px;
  }
</style>
</head>
<body>
 
<button id="connectButton">CONNECT</button><br/>
 
<button id="playButton">PLAY</button>
 
<p id="magValue"></p>
  
<script type="text/javascript">

var connectButton = document.getElementById("connectButton");
var playButton = document.getElementById("playButton");

// Create an AudioContext (with the webkit prefix for older browsers)
let audioContext = new (window.AudioContext || window.webkitAudioContext)();

// Create a (first-order Ambisonic) Resonance Audio scene and pass it
// the AudioContext.
let resonanceAudioScene = new ResonanceAudio(audioContext);

// Connect the scene’s binaural output to stereo out.
resonanceAudioScene.output.connect(audioContext.destination);

// Define room dimensions.
// By default, room dimensions are undefined (0m x 0m x 0m).
let roomDimensions = {
  width: 10,
  height: 3.5,
  depth: 10,
};

// Define materials for each of the room’s six surfaces.
// Room materials have different acoustic reflectivity.

let roomMaterials = {
  // Room wall materials
  left: 'grass',
  right: 'grass',
  front: 'grass',
  back: 'grass',
  // Room floor
  down: 'grass',
  // Room ceiling
  up: 'transparent',
};

// Add the room definition to the scene.
resonanceAudioScene.setRoomProperties(roomDimensions, roomMaterials);

// Create an AudioElement.
let audioElement = document.createElement('audio');

// Load an audio file into the AudioElement.
audioElement.src = 'resources/EMR_recording_samples.mp3';

// Generate a MediaElementSource from the AudioElement.
let audioElementSource = audioContext.createMediaElementSource(audioElement);

// Add the MediaElementSource to the scene as an audio input source.
let source = resonanceAudioScene.createSource();
audioElementSource.connect(source.input);

// Called when we receive a line of data from the Puck.js
function onLine(v) {
  console.log("Received: " + JSON.stringify(v));

  // Display the source position value
  document.getElementById("magValue").innerHTML = "Source Position: " + v;

  // Set the source position relative to the listener's compass bearing
  // (the bearing is assumed to arrive in radians, one value per line)
  var bearing = parseFloat(v);
  if (!isNaN(bearing)) {
    source.setPosition(Math.cos(bearing), 0, Math.sin(bearing));
  }
}
    
// When clicked, connect or disconnect
var connection;
connectButton.addEventListener("click", function() {
  // If already connected, disconnect and stop.
  if (connection) {
    connection.close();
    connection = undefined;
    return;
  }
  Puck.connect(function(c) {
    if (!c) {
      alert("Couldn't connect!");
      return;
    }
    connection = c;
    // Buffer incoming data and call 'onLine'
    // whenever we get a complete line
    var buf = "";
    connection.on("data", function(d) {
      buf += d;
      var i = buf.indexOf("\n");
      while (i >= 0) {
        onLine(buf.substr(0, i));
        buf = buf.substr(i + 1);
        i = buf.indexOf("\n");
      }
    });
  });
});

// Play the audio when the play button is clicked.
playButton.onclick = function (event) {
  // Browsers keep the AudioContext suspended until a user gesture.
  audioContext.resume();
  audioElement.play();
};

</script>
</body>
</html>

 

Figure 2. The progressive web application source code

The Puck.js also enables experimentation with listener-mounted beacons, acting as either central or peripheral devices within a BLE network through Espruino’s GATT server module.

This head-mounted position could also prove useful for indoor positioning and communication between participants’ headsets, as it may eliminate much of the interference generated by human traffic that Bluetooth signals are prone to, especially if the device communicated with other beacons positioned directly overhead or attached to a ceiling. With this in view, it could eventually be embedded in the headband of the headphones, or provided as a clip-on accessory.

Additional prototypes were made using Espruino’s BLE MIDI module and its Web Bluetooth capabilities: the first sent the calibrated magnetometer data as MIDI control messages over Bluetooth to a DAW, while the latter used a combination of Web Bluetooth and the Web Audio API to realise a standalone, mobile, dynamic and responsive surround-sound experience (Figure 2). This was achieved by mapping the Puck.js’ magnetometer data to the Web Audio API’s PannerNode parameters. Although this approach was successful in moving the audio source dynamically around the listener, the Web Audio API also provides an AudioListener interface with orientation properties, and it would make better sense, both semantically and functionally, to use this going forward.

Figure 3. A sketch of the prototype system.
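Switching to the AudioListener interface mentioned above might look like the following sketch, which rotates the listener rather than the source. This is a hedged illustration of the idea rather than code from the prototype, and it assumes the bearing arrives in radians as before.

```javascript
// Derive a horizontal forward vector from a compass bearing (radians).
// With bearing 0 the listener faces down the negative z axis, the Web
// Audio API's default forward direction.
function bearingToForward(bearing) {
  return { x: Math.sin(bearing), y: 0, z: -Math.cos(bearing) };
}

// Apply the bearing to an AudioContext's listener. Newer browsers expose
// the orientation as AudioParams; older ones only have setOrientation().
function setListenerBearing(ctx, bearing) {
  var f = bearingToForward(bearing);
  var l = ctx.listener;
  if (l.forwardX) {
    l.forwardX.value = f.x;
    l.forwardY.value = f.y;
    l.forwardZ.value = f.z;
    l.upX.value = 0;
    l.upY.value = 1;
    l.upZ.value = 0;
  } else {
    l.setOrientation(f.x, f.y, f.z, 0, 1, 0);
  }
}
```

Under this scheme each source keeps its fixed world position, and a single listener update per magnetometer reading rotates the whole scene at once.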

Potential future studies

To summarise, some of the possible future avenues of study this initial prototyping has revealed include:

  • Assessing the potential accessibility and participatory value of web apps over mobile apps.
  • Assessing the potential for dynamic audio spatialisation as a navigational tool, and possibly for geolocation-based audio content delivery.
  • Assessing the potential for extending the geographic footprint of cultural experiences and narratives through dynamic audio spatialisation and location-based audio content delivery. This would use the web audio prototype outlined previously, along with the Web Geolocation API.
  • Exploring the potential for mobile adaptive music sequencing with the Physical Web and the Web Audio API, a combination in which there may be room for innovation.
  • Developing a local, infrastructure-less, mobile and self-contained location- and direction-aware audio framework; the combination of Web Audio and Web Bluetooth may enable a more ubiquitous distribution of location- and direction-aware audio experiences.

All of the above objectives could potentially form part of one carefully designed study.

Things that require attention

Some points that need addressing based on the use and evaluation of the outlined prototypes are:

  • Calibration of the magnetometer for different environments needs to be an integrated part of the system.
  • Smoother mapping of magnetometer data to the web audio panner.
  • iOS compatibility issues for Web Bluetooth: will this be resolved in iOS 12? If not, how can I best deal with it (a third-party app)?
  • Build logging capabilities into the program for capturing data from the listener’s interactions.
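One candidate for the smoother mapping noted above is an exponential moving average that wraps correctly at the 0/2π boundary, so the source does not swing the long way round when the bearing crosses north. This is an assumption about a possible fix, not the implemented solution:

```javascript
// Returns a function that exponentially smooths a stream of bearings
// (radians, 0..2*PI). alpha in (0,1]: higher alpha = less smoothing.
function makeBearingSmoother(alpha) {
  var prev = null;
  return function (bearing) {
    if (prev === null) {
      prev = bearing;
      return prev;
    }
    var diff = bearing - prev;
    // Always take the short way round the circle.
    if (diff > Math.PI) diff -= 2 * Math.PI;
    if (diff < -Math.PI) diff += 2 * Math.PI;
    prev = (prev + alpha * diff + 2 * Math.PI) % (2 * Math.PI);
    return prev;
  };
}
```

In the web app, such a smoother would sit between the line parser and the panner update, trading a little latency for stability in the perceived source position.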

Refinement and clarification of my research area

The outlined prototyping activities have led me to think about several possible refinements, and some clarifications, of my area of research:

  • That my focus should be on dynamic, adaptive and responsive sequencing and remixing of pre-recorded audio content, as opposed to the generation of musical content. This draws on the strengths of my music production experience, whilst also making the framework more applicable to various potential applications (audio archival content delivery, music and the sound arts).
  • This focus, along with the development of a more ubiquitous delivery technology, may enable me to revisit the potential for the framework to have both mobile indoor and outdoor applications. This is evidenced to some degree by UrbanRemix (Freeman et al., 2011) and by Healy & Smeaton (2009), who present web-based and wearable-technology-based outdoor sound positioning systems respectively.
  • The ‘Noise’ element maintains an important role within the framework, but as a functional tool, rather than a resource for musical generation and sonification. (For example, the use of data-noise from Bluetooth and wireless activity to facilitate listener positioning or content re-sequencing).

Potential Practice-based studies

This focus has given the practice-based aspect of my research more clarity too. I can now identify potential applications and target collaborative practice with the developing framework around them. These include:

  • An adaptive soundtrack to accompany a cultural event or exhibition
  • Geolocation-based musical production game
  • Musical event promotional tool
  • Sound art installation
  • Production of innovative net art – combining net art with the physical web
  • Audience interaction with a live electronic music performance
  • Archive audio content retrieval and delivery (Cultural / Heritage sites)
  • Extending cultural interaction ‘beyond the gallery walls’

Notes

  1. GyrOSC is a lightweight utility that sends your iPhone, iPod Touch, or iPad’s motion sensors over your local wireless network to any OSC capable host application. Control your live audio or video application with your device’s built-in gyroscope, accelerometer, and compass. http://www.bitshapesoftware.com/instruments/gyrosc/
  2. Envelop for Live is a collection of free, open-source spatial audio production tools that work with Ableton Live. Three-dimensional sound placement enables artists to create immersive mixes for multichannel spaces and headphone-based VR/AR environments, amplifying emotional impact by placing the audience inside the music. http://www.envelop.us/software/
  3. Puck.js is an open source JavaScript Bluetooth Low-Energy (BLE) beacon with on-board sensors that you can program and debug wirelessly. https://www.puck-js.com

References

  1. Freeman, J., DiSalvo, C., Nitsche, M., & Garrett, S. (2011). Soundscape composition and field recording as a platform for collaborative creativity. Organised Sound, 16(3), 272–281.
  2. Healy, G., & Smeaton, A. F. (2009). An outdoor spatially-aware audio playback platform exemplified by a virtual zoo. In Proceedings of the 17th ACM International Conference on Multimedia (pp. 837–840). New York, NY: ACM Press.