Building re.flow Part 3: Web AudioContext and Sequencer

Jason Marsh

This is the third part of a four-part set of posts about re.flow. If you missed part 1, go here.

Hear and see the end result here: http://dolby.flow.gl.

This post is about programming the Web AudioContext and the sequencer.

How the AudioContext works together with Dolby Digital Plus

For this project, the audio motion is hard-wired into the sound files themselves, so each file can be created as a 5.1 mix. At this stage of using Dolby tech, you can’t move sounds through 3D space in real time. To be clear, it is not doing real-time re-positioning of each sound: instead it processes the 5.1 mixes into the appropriate positioning for the audio hardware attached. If the hardware is an HDMI output to a surround system, you get 5.1 surround sound. If you are using headphones, it uses Head Related Transfer Functions to place each sound in a precise location. That’s why getting beyond the built-in laptop speakers is so important; Dolby’s real-time processing gives truly amazing spatialization.

I’ll touch on the project-specific topics regarding the Web Audio API, but Boris Smus has laid out the full documentation, so check out his (free) book.

The AudioContext has built-in positioning of 3D objects, described well here: Mixing Positional Audio and WebGL. It sounds to me like this is just doing basic panning within a stereo mix, without volume changes for distance. But if you are running on a browser without Dolby Digital Plus enabled, this is a good fallback, so re.flow uses it.

Here’s the code for querying the browser’s capabilities: both the AudioContext itself and EC-3 (the codec for Dolby Digital Plus):

[code language="javascript"]
var isAudioContextSupported;
var isEC3supported;

function checkAudioCompatibility() {
    var contextClass = (window.AudioContext ||
                        window.webkitAudioContext ||
                        window.oAudioContext ||
                        window.msAudioContext);
    if (!contextClass) {
        // Web Audio API is not available.
        alert("This browser won't work! Multichannel audio is not available with this browser. \n\n" +
              "Try the following browsers: \n" +
              " Desktop: Edge, Chrome, Firefox, Opera, Safari \n " +
              " Mobile: Firefox Mobile, Firefox OS, Chrome for Android. ");
        isAudioContextSupported = false;
        return;
    }
    isAudioContextSupported = true;

    var myAudio = document.createElement('audio');
    myAudio.id = "audioTest";
    if (myAudio.canPlayType) {
        // canPlayType returns "maybe", "probably", or an empty string.
        var playMsg = myAudio.canPlayType('audio/mp4; codecs="ec-3"');
        if ("" != playMsg) {
            isEC3supported = true;
            console.log("ec-3 is " + playMsg + " supported");
        } else {
            isEC3supported = false;
        }
    }
}
[/code]

And here’s the sweet part, which tells the browser that we are dealing with 5.1 surround (6 channels):

[code language="javascript"]
var audioContext;
var finalMixNode;

function initializeAudio() {
    audioContext = new AudioContext();

    finalMixNode = audioContext.destination;
    console.log("maxChannelCount: " + audioContext.destination.maxChannelCount);

    if (audioContext.destination.maxChannelCount >= 6) {
        audioContext.destination.channelCount = 6;
    }
[/code]

The above code will report 2 channels in Chrome (or any non-EC-3 browser), and 6 channels in Microsoft Edge. If 6 channels are not available, we leave channelCount at its default of 2.

[code language="javascript"]
    // If not using EC-3, then use the AudioContext's positional panner.
    if (!isEC3supported) {
        audioContext.listener.setPosition(0, 0, 0);
    }
}
[/code]

Just to complete the soft-switching on browser capabilities, here is the code for preparing a clip to play, which I manage by creating an “AudioTrack” object. An AudioTrack is a track that I can swap different clips into. It has functions like prepareClip, playClip, and stopClip. Here is the prepareClip code:

[code language="javascript"]
AudioTrack.prototype.prepareClip = function (buffer) {
    if (this.isPlaying && this.bufferSource) {
        this.bufferSource.stop();
    }
    this.isPlaying = true;
    this.bufferSource = audioContext.createBufferSource();
    this.bufferSource.buffer = buffer;
    this.bufferSource.loop = false;

    if (isEC3supported) {
        this.bufferSource.connect(masterGainNode);
    } else { // Since we are not using Dolby Digital, fall back to the built-in panner.
        this.volume = audioContext.createGain();
        // Connect the sound source to the volume control.
        this.bufferSource.connect(this.volume);
        this.panner = audioContext.createPanner();
        this.panner.setPosition(0, 0, 0);
        // Instead of hooking up the volume to the main volume, hook it up to the panner.
        this.volume.connect(this.panner);
        // And hook up the panner to the main volume.
        this.panner.connect(finalMixNode);
    }
}
[/code]
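
prepareClip is the interesting one; for completeness, here’s roughly what playClip and stopClip might look like, sketched from how prepareClip wires things up (a sketch, not the exact project code):

[code language="javascript"]
// Sketch only: play the prepared buffer at an AudioContext time (default: now).
AudioTrack.prototype.playClip = function (when) {
    if (!this.bufferSource) { return; }
    this.bufferSource.start(when || 0);
    this.isPlaying = true;
};

// Sketch only: stop whatever this track is currently playing.
AudioTrack.prototype.stopClip = function () {
    if (this.isPlaying && this.bufferSource) {
        this.bufferSource.stop();
        this.isPlaying = false;
    }
};
[/code]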

There is quite a bit of underlying object structure I created to manage the clips, the tracks, the 3D objects, the visualizers, and the animation. Instead of walking through all that code, here’s a diagram to give you a sense of it:

[Diagram: the object structure for clips, tracks, visualizers, and animation]

There are 16 AudioClips, each with a sound file. Based on the isEC3supported flag, the file will be either an mp4/ec-3 file or an mp3 file.
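
In practice that just means picking a different file per clip; something along these lines (the naming scheme here is only illustrative, not the project’s actual file layout):

[code language="javascript"]
// Illustrative naming convention: each clip exists on the server in both encodings.
function clipUrl(clipName) {
    return "audio/" + clipName + (isEC3supported ? ".mp4" : ".mp3");
}
[/code]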

There are 8 AudioTracks, each of which can have a currently playing AudioClip.

An AudioVisualizer is the AudioTrack plus the connections to the visualization for that track. It holds the currently playing AudioClip on that track, the 3D Object, the Shader on the 3D Object, and the key-framed Animation for the 3D Object. This will be explained in more detail in Part 4, which describes the visualization aspects of the project. For now, think of it as a visual track, and we’ll have 8 of them.

The ClipKit manages a kit (think drum kit) of clips available to swap in and out of an AudioTrack. It is a singleton. It manages loading the audio files, and playing and stopping a particular clip or track.
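
Loading follows the standard Web Audio pattern: fetch the file as an ArrayBuffer and hand it to decodeAudioData. Roughly like this, using the clipUrl helper sketched above (the ClipKit internals shown here are assumptions, not the exact code):

[code language="javascript"]
// Sketch of a loader using the standard XHR + decodeAudioData pattern.
ClipKit.prototype.loadClip = function (clipName, onLoaded) {
    var request = new XMLHttpRequest();
    request.open("GET", clipUrl(clipName), true);
    request.responseType = "arraybuffer";
    request.onload = function () {
        audioContext.decodeAudioData(request.response, function (buffer) {
            onLoaded(clipName, buffer); // hand the decoded AudioBuffer back to the kit
        }, function (err) {
            console.log("decodeAudioData failed for " + clipName, err);
        });
    };
    request.send();
};
[/code]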

The Sequencer manages which clips play on which tracks, and when. It has an internal data structure called "measures" that determines when to play which clips. Keeping all the tracks carefully synchronized is well described here: A Tale of Two Clocks – Scheduling Web Audio with Precision.
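
The core idea from that article is a short setTimeout loop that uses the AudioContext’s own clock to schedule anything starting within a small look-ahead window. A simplified sketch of that pattern (the shape of the measures structure and the helper names below are placeholders, not the exact ones in re.flow):

[code language="javascript"]
// Simplified look-ahead scheduler, after "A Tale of Two Clocks".
// Assumed shape of 'measures': [{ track: 3, clip: "pad1", startBeat: 16 }, ...]
var lookahead = 0.1;       // seconds of audio to schedule ahead
var scheduleInterval = 25; // ms between scheduler wake-ups

function scheduler() {
    var now = audioContext.currentTime;
    measures.forEach(function (event) {
        var startTime = sequenceStartTime + event.startBeat * secondsPerBeat;
        if (!event.scheduled && startTime < now + lookahead) {
            // Queue the clip slightly ahead of time on the AudioContext clock.
            tracks[event.track].prepareClip(clipKit.getBuffer(event.clip));
            tracks[event.track].playClip(startTime);
            event.scheduled = true;
        }
    });
    setTimeout(scheduler, scheduleInterval);
}
[/code]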

The sequencer interface is a bit different from a typical drum machine, such as WebAudio Drum Machine – Chromium. Each column represents 4 measures, rather than, say, a sixteenth note.

[Screenshot: the re.flow sequencer interface]

It is also different from the usual looping interface: no clips are looped. Instead, each clip starts, plays its full length, and then stops. Different clips have different lengths, as shown above. This keeps the audio flowing and remixing in much more complex ways than a bunch of short looped clips, and that complexity can make the experience more interesting over long sessions.

The clip start times can be edited in this interface, or by clicking directly on the 3D Objects. Since each track can have multiple objects, successive clicks toggle through the clips available on that track.
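
The toggling itself is just a modulo walk through the track’s clip list, something like this (illustrative only):

[code language="javascript"]
// Illustrative click handler: advance to the next clip available on this track.
function onTrackClicked(track) {
    track.clipIndex = (track.clipIndex + 1) % track.availableClips.length;
    track.currentClip = track.availableClips[track.clipIndex];
}
[/code]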

Once users have built their own sequence, they might want to save it for later, or share it with their friends! So my friend Dustin Butler created a DynamoDB instance on Amazon Web Services, and we save the measures structure as JSON, attached to a unique key. Hopefully this simple save-and-share will promote some viral sharing!
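
The save call itself is just a POST of the measures JSON to a small service in front of DynamoDB; something like this (the endpoint and response shape here are placeholders, not the real API):

[code language="javascript"]
// Sketch only: the endpoint and response shape are placeholders.
function saveSequence(measures, onSaved) {
    var request = new XMLHttpRequest();
    request.open("POST", "/api/sequences", true);
    request.setRequestHeader("Content-Type", "application/json");
    request.onload = function () {
        var result = JSON.parse(request.responseText);
        onSaved(result.key); // unique key used to build a shareable URL
    };
    request.send(JSON.stringify({ measures: measures }));
}
[/code]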

So now we have audio clips, panning around in space, organized via a sequencer, savable in the cloud. But now we need some visuals! Move on to Part 4.