Web-based ASR (Augmented Sound Reality) proof of concept: Connecting the Physical Web to the Web Audio API

Both SuperCollider [1] and PureData [2] remain interesting potential candidates for facilitating mobile user audio interaction within installation environments, as both platforms can be rendered into mobile applications: using iSuperColliderKit [3] for SuperCollider, and either PDParty [4] or MobMuPlat [5,6] in the case of PureData. The case for PDParty is evidenced by the outdoor, location-based audio experiences of Robert Thomas [7], namely the University of Nottingham’s commissioned ‘Caves’ audio experience and the most recent manifestation of the RJDJ [8] application.

Unfortunately, both of the above platforms have proven incompatible with scanning for and ranging iBeacons: I have found no use cases and little documentation, and the existing location-based examples all appear to rely on GPS tracking for outdoor mobile experiences. The only way iBeacon communication seems possible with either PureData or SuperCollider is through a custom Ionic-based mobile application, such as the Semantic Player [9], acting as a Generic Attribute Profile (GATT) server.

Therefore, my current exploration of the use of iBeacons for the tagging and tracking of users and physical objects within an indoor installation environment led me to investigate the Web Audio API [10] in conjunction with Web Bluetooth [11] as a potential solution.

It should be noted that I have been unable to find any use cases or documentation regarding the use of physical sensor input to control or manipulate audio playback through the Web Audio API, though I believe my initial explorations (outlined below) demonstrate that it is possible.

It would therefore appear that combining the Web Audio API with the Physical Web [12] to create progressive, web-based audio applications could offer new and innovative possibilities for study and for accessible, dynamic audio content delivery.

An overview of the prototype design can be found in my previous post.

The Puck.js scans for an iBeacon with a specific ID every 2 seconds and stores its RSSI reading in a variable. Additionally, at a rate of 2.5Hz, the Puck stores the current bearing of the listener (based on readings from its onboard, calibrated magnetometer) in a second variable. These two values (the RSSI reading and the bearing) are then printed over Bluetooth for the web app to receive once it is connected.

The Puck.js code:

var myValues = { prox: 100, bearing: 0 };

// Start scanning for the beacon: listen to a handful of advertising
// packets, record the RSSI of the matching device, then stop scanning.
function checkPresence() {
  var packets = 5;
  NRF.setScan(function(d) {
    packets--;
    if (packets <= 0) {
      NRF.setScan(); // stop scanning
      return;
    }
    // Match the beacon by its Bluetooth address
    if (d.id === "d8:3c:52:4c:71:89 random") {
      myValues.prox = d.rssi;
    }
  });
}
// check once every 2 seconds
setInterval(checkPresence, 2000);


// Ambient magnetic offset values
const avg = { x: 1039, y: 282, z: 2967 };

// Sample the magnetometer at 2.5Hz
Puck.magOn(2.5);
Puck.on('mag', function(xyz) {
  // Remove the ambient offset
  xyz.x -= avg.x;
  xyz.y -= avg.y;
  xyz.z -= avg.z;

  // Get compass bearing value in degrees (-180 to 180), rounded to the nearest degree.
  var bearing = Math.round((Math.atan2(xyz.y, xyz.x) * 180) / Math.PI);
  myValues.bearing = bearing;

  // Send both values to the connected web app as a line of JSON
  Bluetooth.println(JSON.stringify(myValues));
});

You’ll see that the physical object’s bearing is stored in the variable ‘sourcePosition’. The values of the Web Audio nodes can then be manipulated in response to the listener’s bearing in relation to the object’s bearing. In this example the gain node value is ramped up exponentially when the listener looks in the direction of the physical object, and ramped back down when the listener looks away. The listener’s bearing relative to the object is only resolved to within 30 degrees, which allows for fluctuations in the bearing data and helps smooth out the transitions. A similar approach could be applied to a panner node to create a dynamic audio spatialisation experience.

The web app code:

<html>
 <head>
 <script src="https://www.puck-js.com/puck.js"></script>
  
   <style>
     body { margin:0;  }
     
   button {
     width:200px;
     padding:20px;
     margin-bottom:10px;
   }
   #proximity {
     font-weight:bold;
     font-size:30px;
   }
   </style>    
 </head>
 <body>
 <button id="connectButton">CONNECT</button><br/>
 <button id="playButton1">PLAY 1</button>
   <p id="beacon1rssiValue"></p>
   <p id="listenerOrientation"></p>
  <script type="text/javascript">
  
var startProximity = 75; //Proximity value at which intro audio starts.
var sourcePosition = -140; //Bearing of the audio source.

var audioContext = new AudioContext();
var audioBuffer;
var getSound = new XMLHttpRequest();
getSound.open("get", "resources/farrar.mp3", true);
getSound.responseType = "arraybuffer";
getSound.onload = function() { audioContext.decodeAudioData(getSound.response, function(buffer) {
    audioBuffer = buffer;
  });
};
var gainNode = audioContext.createGain();
var pannerNode = audioContext.createStereoPanner();
var myListener = audioContext.listener;
gainNode.gain.value = 0.1;

getSound.send();
function playback() {
  // Resume the context in case the browser created it in a suspended state (autoplay policy).
  audioContext.resume();
  var playSound = audioContext.createBufferSource();
  playSound.loop = true;
  playSound.buffer = audioBuffer;
  playSound.connect(pannerNode);
  pannerNode.connect(gainNode);
  gainNode.connect(audioContext.destination);
  playSound.start(audioContext.currentTime);
}
document.getElementById("playButton1").addEventListener("click", playback);


// Called when we get a line of data from the puck and call the useReceivedData function.
    function onLine(v) {
     useReceivedData(v);
    }   

// Connect and disconnect from puck button.
    var connection;
    document.getElementById("connectButton").addEventListener("click", function() {
      if (connection) {
        connection.close();
        connection = undefined;
      }
      Puck.connect(function(c) {
        if (!c) {
          alert("Couldn't connect!");
          return;
        }
        connection = c;
        
        
// Handle the data we get back, and call 'onLine' whenever we get a line.
        var buf = "";
        connection.on("data", function(d) {
          buf += d;
          var i = buf.indexOf("\n");
          while (i>=0) {
            onLine(buf.substr(0,i));
            buf = buf.substr(i+1);
            i = buf.indexOf("\n");
          }
        });
              });
    });
    
// Unpack JSON values and use them.    
    
function useReceivedData(d) {

  var values = JSON.parse(d);

  document.getElementById("listenerOrientation").innerHTML = "Listener Orientation: " + values.bearing;

  document.getElementById("beacon1rssiValue").innerHTML = "Beacon 1 RSSI: " + values.prox;

  // Update audio playback values based on listener proximity and bearing
  var proximity = Math.abs(values.prox);

  // Difference between the listener's bearing and the object's bearing
  var approxBearing = values.bearing - sourcePosition;

  console.log(approxBearing);

  // Within roughly 30 degrees of the object: ramp the gain up; otherwise ramp it back down
  if (Math.abs(approxBearing) <= 30) {
    gainNode.gain.exponentialRampToValueAtTime(1.5, audioContext.currentTime + 1);
  } else {
    gainNode.gain.exponentialRampToValueAtTime(0.1, audioContext.currentTime + 1);
  }

}

  </script>
 </body>
</html>
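
As a minimal sketch of the panner approach mentioned above (the updatePanner helper, the +/-90 degree mapping and the 0.5 second ramp time are my own illustrative assumptions, not part of the prototype), the bearing difference already computed in useReceivedData could be mapped onto the existing stereo panner node:

// Hypothetical sketch: drive the existing stereo panner from the bearing
// difference (in degrees) between the listener and the source.
function updatePanner(bearingDifference) {
  // Normalise the difference to the -180..180 range to handle wrap-around
  var diff = ((bearingDifference + 540) % 360) - 180;
  // Treat +/-90 degrees as fully right/left and clamp anything beyond that
  var pan = Math.max(-1, Math.min(1, diff / 90));
  pannerNode.pan.linearRampToValueAtTime(pan, audioContext.currentTime + 0.5);
}

Calling updatePanner(approxBearing) from useReceivedData would then pan the source left or right as the listener turns their head.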

Identified Issues

At the moment this remains something of a proof of concept, and there are still issues to resolve, including:

  • There is currently no native support for Web Bluetooth in iOS, though it is possible to load the web application through a third-party app.
  • The reliance on bearing data for determining the user’s field of vision works well in small areas where ‘tagged’ objects are located around the edge of the space. However, if the user can walk around, or behind, tagged objects, their bearing in relation to the user will change dramatically. A point of entry (POE) or point of arrival (POA) may therefore need to be determined, in order to log the user’s position at given points and present each object’s location within the application relative to the user’s current position.
  • Fluctuations in the bearing data made the direct mapping of the bearing data onto the surround sound panner very jumpy and disorientating, so more approximate values were used at given bearing points (one possible smoothing approach is sketched after this list).
  • Using iBeacon RSSI readings as a measurement of proximity is hugely unreliable and also highly demanding on the Puck.js’s battery life.
  • No vertical audio spatialisation or object detection.
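
One possible way to reduce the jumpiness noted above, sketched here purely as an illustration (the smoothBearing helper and the alpha value are assumptions, not part of the current prototype), would be to pass each incoming bearing reading through a simple exponential moving average in the web app before mapping it onto any audio parameters:

// Hypothetical smoothing of the raw bearing readings received from the Puck.
// 'alpha' controls how quickly the smoothed value follows new readings;
// 0.2 is an arbitrary starting point, not a tested value. Note that this
// naive average misbehaves where the bearing wraps around at +/-180 degrees,
// which would also need handling.
var smoothedBearing = null;
var alpha = 0.2;

function smoothBearing(rawBearing) {
  if (smoothedBearing === null) {
    smoothedBearing = rawBearing; // first reading: no history yet
  } else {
    smoothedBearing = alpha * rawBearing + (1 - alpha) * smoothedBearing;
  }
  return smoothedBearing;
}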

Further Work

I believe it may be possible to extend the narrative power and enticement to interaction outlined by Bishop [23] by delivering different audio content based on proximity, going beyond the manipulation of volume and spatialisation parameters alone.

In relation to my cultural case study example, a musical loop could change to a singing loop when the user looks towards the poster, and the singing loop could change to a spoken-word interview when the user is in close proximity to the poster, perhaps giving the impression of the poster talking directly to the user. Dissecting the currently used archive material into looped regions could support this exploration; a sketch of how such switching might be driven follows below.
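
As a rough sketch of this idea (the buffer names and the threshold are hypothetical, not part of the current prototype), the proximity value already received from the Puck could simply select which decoded buffer should be playing:

// Hypothetical content selection based on proximity (absolute RSSI, so
// smaller numbers mean closer) and whether the listener is facing the poster.
// musicBuffer, singingBuffer and interviewBuffer are assumed to have been
// decoded elsewhere; the threshold of 60 is illustrative and untested.
function chooseBuffer(proximity, facingPoster) {
  if (facingPoster && proximity < 60) return interviewBuffer; // very close: spoken word
  if (facingPoster) return singingBuffer;                     // looking towards the poster
  return musicBuffer;                                         // default musical loop
}

Actually switching between sources (stopping the current AudioBufferSourceNode and starting a new one, or running all three in sync and crossfading their gain nodes) would then need handling on top of this.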

Such an approach could add to the overall narrative structure which is presented to the user through a fusion of techniques (audio volume, audio spatialisation and visual representation) and create a further enticement to explore.

Along with the above, I’d like to continue to develop two more example use cases (creative sound art and adaptive music) as discussed previously, as I believe the techniques developed in this cultural / exhibition example could transfer across to these different contexts.

If measuring proximity using iBeacons is no longer going to be pursued as an option, then this again opens up the possibility of a PureData or SuperCollider based solution, or potentially an AR Toolkit or Unity based mobile solution. The latter two options appear desirable given their existing inbuilt audio listener objects, head and object tracking, and audio spatialisation capabilities.

Rather than focusing on the audio augmentation of individual objects, it may be useful to think in terms of ‘zones’ or static ‘listening points’ within a gallery or installation. Such an approach could help resolve the issues involved in using fixed-point bearings for the associated objects and the inaccuracies of the listener’s bearing data.
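
A rough illustration of this zone idea (the zone names are placeholders, and the readings are assumed to be kept up to date by scanning code similar to that shown earlier):

// Hypothetical zone selection: given the most recent RSSI reading for each
// beacon, treat the strongest (least negative) signal as the active zone.
var zoneReadings = { entrance: -90, posterWall: -90, listeningPoint: -90 };

function activeZone() {
  var best = null;
  for (var zone in zoneReadings) {
    if (best === null || zoneReadings[zone] > zoneReadings[best]) best = zone;
  }
  return best;
}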

Related work and technologies

In relation to the headphone-mounted sensor prototype that I’ve been developing to facilitate a mobile, audio-based augmented reality and dynamic audio experience, two emerging products have come to my attention. The first is Bose’s augmented reality glasses [17], still in development. The glasses have embedded speakers that rest just in front of the ears; although not bone-conducting, these speakers enable a mixed-reality audio experience, blending the external ambient sound with the audio delivered through the speakers. The glasses also carry head-tracking motion sensors that, when paired with your smartphone’s GPS, enable the overlay of audio information onto real-world physical objects based on the user’s location and the direction in which they are looking. Bose is promising an accompanying Bose AR Wearable Software Development Kit (SDK), which could offer interesting possibilities regarding the delivery of dynamic musical, creative and cultural audio content. I have requested access to the SDK via their website [18].

The other related technology is the RondoMotion [19], which more closely resembles the type of interface I’ve been attempting to develop. The RondoMotion is essentially a clip-on 3-axis accelerometer and 3-axis gyroscope sensor that can be attached to the headband of your existing headphones and communicates with your smartphone via Bluetooth Low Energy (BLE). The developer’s primary concern is the realisation of head-tracked dynamic audio spatialisation, which it appears to deliver through an accompanying mobile application. This product very closely resembles my current prototype interface, though it includes a gyroscope for better user tracking and is not capable of scanning for other BLE devices.

In terms of other related works and projects, the Plymouth iSound installation [20] and Sennheiser’s AMBEO system [21], as utilised by the V&A gallery, remain interesting points of reference. The former relies on a network, or mesh, of iBeacons to create an Indoor Positioning System (IPS), while the latter utilises an intensive infrastructure of electromagnetic transmitters, or identifiers, and receivers, as outlined for the V&A’s ‘David Bowie Is’ exhibition [22].

Notes & References

  1. SuperCollider – https://supercollider.github.io
  2. PureData – https://puredata.info
  3. iSuperColliderKit – https://github.com/wdkk/iSuperColliderKit
  4. PDParty – https://github.com/danomatika/PdParty
  5. MobMuPlat – http://danieliglesia.com/mobmuplat/
  6. Iglesia, D. (2016). The Mobility is the Message: the Development and Uses of MobMuPlat. Iglesia Intermedia; Google, Inc. California, USA.
  7. Robert Thomas – https://thenextweb.com/apps/2016/04/25/hear-rjdj-ios-app/
  8. RJDJ – https://thenextweb.com/apps/2016/04/25/hear-rjdj-ios-app/
  9. Semantic Player – https://code.soundsoftware.ac.uk/projects/semantic-player
  10. Web Audio API – https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
  11. Web Bluetooth – https://developers.google.com/web/updates/2015/07/interact-with-ble-devices-on-the-web
  12. Physical Web – https://google.github.io/physical-web/
  13. GyrOSC – http://www.bitshapesoftware.com/instruments/gyrosc/
  14. Envelop – http://www.envelop.us/software/
  15. Puck.js – https://www.espruino.com/Puck.js
  16. Espruino – https://www.espruino.com
  17. Bose’s augmented reality glasses – https://www.theverge.com/2018/3/12/17106688/bose-ar-audio-augmented-reality-glasses-demo-sxsw-2018
  18. Bose Wearable SDK – https://developer.bose.com/wearable-sdk
  19. RondoMotion – https://www.kickstarter.com/projects/261641446/bring-your-headphones-to-life
  20. iSound installation – Frontiers: Expanding Musical Imagination With Audience Participation (2016).
  21. Sennheiser’s AMBEO system – https://en-uk.sennheiser.com/finalstop
  22. ‘David Bowie Is’ – https://en-de.sennheiser.com/news-david-bowie-is-sennheiser-helps-the-va-bring-together-sound-and-vision-
  23. Bishop, C. (2005). Installation Art: A Critical History. Tate: London.