Oppen Do Down--first Web Audio API piece

In my previous post, I made notes on my reading and preliminary understanding of Chris Wilson's article on precision event scheduling in the Web Audio API--in preparation for creating my first Web Audio API piece. I've created it. I'd like to share it with you and talk about the programming of it.

Oppen Do Down, an interactive audio piece

The piece is called Oppen Do Down. I first created it in the year 2000 with Director. It was viewable on the web via Shockwave, a Flash-like plugin--sort of Flash's big brother. But hardly any contemporary browsers support the Shockwave plugin anymore--or any other plugins, for that matter. The trend is toward web apps that don't use plugins at all but, instead, rely on newish native web technologies such as the Web Audio API, which requires nothing to be installed before you can view the content. The old Director version is still on my site, but nobody can view it anymore cuz of the above. I will, however, eventually release a bunch of downloadable desktop programs of my interactive audio work.

You can see the Director version of Oppen Do Down in a video I put together not long ago on Nio, Jig-Sound, and my other heap-based interactive audio work.

I sang/recorded/mixed the sounds in Oppen Do Down myself in 2000 using an old multi-track recording program called Cakewalk. First I recorded a track of me snapping my fingers. Then I played that back over headphones, looping, while I recorded a looping vocal track. Then I'd play it back; if I liked it, I'd keep it. Then I'd play the finger snapping and the vocal track back over headphones while I recorded another vocal track. Repeat that for, oh, probably 60 or 70 tracks. Then I'd pick a few tracks to mix down into a loop. Most of the sounds in Oppen Do Down are multi-track mixes.

As you can hear if you play Oppen Do Down, the sounds are synchronized. You click words to toggle their sounds on/off. The programming needs to be able to download a bunch of sound files, play them on command, and keep the ones that are playing synchronized. As you turn sounds on, the sounds are layered.

As it turns out, the programming of Oppen Do Down was easier in the Web Audio API than it was in Director. That has everything to do with the relative deluxeness of the Web Audio API versus Director's less featureful audio API.

Maybe the most powerful feature of the Web Audio API that Director didn't offer is the high-performance clock. It's high-performance in two ways. First, it has terrific resolution: it's accurate to better than a millisecond, so you can use it to schedule events right down to the level of the individual sound sample, if you need that sort of accuracy. And the Web Audio API does indeed support getting your hands on the very data of the individual samples, if you need that sort of resolution. Second, the high-performance clock stops for nothing. Which isn't how the timers and clocks programmers use normally work. They're usually not the highest-priority processes in the computer, so they can get bumped by what the operating system or even the browser construes as more important processes, which can result in inaccuracies. Often these inaccuracies are not big enough to notice. But in Oppen Do Down--and pretty much all other rhythmic music--we need accurate rhythmic timing.

Director didn't offer such a high-performance clock. What it had was the ability to insert cue-points into sounds and define a callback handler that would execute when a cue-point was passed. That was how you stayed in touch with the actual physical state of the audio in Director. The Web Audio API doesn't let you insert cue-points into sounds, but you don't need to: you can schedule events, like the playing of sounds, in the time coordinate system of the high-performance clock.

This makes synchronization more or less a piece of cake in the Web Audio API. You can look at the clock any time you want with great accuracy (AudioContext.currentTime is how you access the clock), and you can schedule sounds to start playing at time t and they indeed start exactly at time t. And the scheduling strategy Chris Wilson advocates, which I talked about in my previous post--whereby you schedule events a little in advance of the time they need to happen--works really well.
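
The lookahead idea boils down to a small loop. Here's a minimal sketch of the logic--the function name, the constants, and the shape of the return value are all hypothetical, not the names used in oppen.js, and in real code the "schedule" step would be a call like bufferSource.start(nextNoteTime):

```javascript
// Sketch of the lookahead scheduling idea from Chris Wilson's article.
// Each time the timer ticks, schedule every note that falls within the
// lookahead window. Hypothetical names, not the ones in oppen.js.

var SCHEDULE_AHEAD = 0.1;  // schedule notes up to 100 ms in the future
var LOOP_DURATION = 2.0;   // length of one loop, in seconds

// currentTime would be AudioContext.currentTime in the real program.
// Returns the note times that were scheduled and the new nextNoteTime.
function scheduler(currentTime, nextNoteTime) {
  var scheduled = [];
  while (nextNoteTime < currentTime + SCHEDULE_AHEAD) {
    scheduled.push(nextNoteTime);   // real code: source.start(nextNoteTime)
    nextNoteTime += LOOP_DURATION;  // advance to the next loop boundary
  }
  return { scheduled: scheduled, nextNoteTime: nextNoteTime };
}
```

The point is that the scheduling happens in the audio clock's time coordinate system, so even if the tick that triggers the scheduler arrives a few milliseconds late, the sounds still start exactly on the boundary.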

There are other features the Web Audio API has that Director didn't. But, then, Director was actually started in 1987, whereas the Web Audio API has only been around for a few years as of this date in 2018. You can synthesize sounds in the browser, though that isn't my interest; I'm more interested in recording vocal and other sounds and doing things with those recorded sounds. You can also process live input from the microphone, or from video, or from a remote stream. And you can create filters. And probably other things I don't know anything about, at this point.

Anyway, Oppen Do Down links to two JavaScript files. One, oppen.js, is for this particular app and its particular interface. The other, sounds.js, is the important one for understanding sound in Oppen Do Down; it defines the Sounds constructor. In oppen.js, we create an instance of it:

gSounds = new Sounds(['1.wav','2.wav','3.wav','4.wav','5.wav','6.wav']);
gSounds.notifyMeWhenReady(soundsAreLoaded);

Actually there are 14 sounds, not 6, but just to make it prettier on this page I deleted the other 8. I used wav files in my Director work. I was happy to see that the Web Audio API could use them. They are uncompressed audio files. Also, unlike mp3 files, they do not pose problems for seamless looping; mp3 files pad the beginnings and ends of files with silence. I hate mp3 files for that very reason. Well, I don't hate them. I just show them the symbol of the cross when I see them.

The gSounds object will download the sounds 1.wav, etc, and will store those sounds, and offers an API for playing them.

'soundsAreLoaded' is a function in oppen.js that gets called when all the sounds have been downloaded and are ready to be played.

gSounds adds each sound (1.wav, 2.wav, ... 14.wav) via its 'add' method, which creates an instance of the Sound (not Sounds) constructor for each sound. The newly created Sound object then downloads its sound and, when it's downloaded, the 'makeAvailable' function puts the Sound object in the pAvailableSounds array.

When all the sounds have been downloaded, the gSounds object runs a function that notifies subscribers that the sounds are ready to be played. At that point, the program makes the screen clickable; the listener has to click the screen to initiate play.
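
The download-and-notify flow can be sketched like this. This is a simplified stand-in, not the real sounds.js: the real Sound objects download and decode audio asynchronously (via something like AudioContext.decodeAudioData), which I've replaced here with an immediate callback so the pattern is easy to see:

```javascript
// Simplified sketch of the Sounds loading/notification pattern.
// Hypothetical stand-in: real Sound objects would fetch a wav file and
// decode it before calling makeAvailable.
function Sounds(names) {
  var pAvailableSounds = [];
  var pSubscribers = [];
  var pTotal = names.length;

  // Called by each Sound when its download finishes.
  function makeAvailable(sound) {
    pAvailableSounds.push(sound);
    if (pAvailableSounds.length === pTotal) {
      // All sounds are in; tell everyone who asked to be told.
      pSubscribers.forEach(function (cb) { cb(); });
    }
  }

  this.notifyMeWhenReady = function (cb) {
    // If the sounds are already loaded, notify immediately;
    // otherwise remember the subscriber for later.
    if (pAvailableSounds.length === pTotal) { cb(); }
    else { pSubscribers.push(cb); }
  };

  this.numReady = function () { return pAvailableSounds.length; };

  // Kick off the "downloads" (immediate here; asynchronous in reality).
  names.forEach(function (name) {
    makeAvailable({ name: name });
  });
}
```

The subscriber pattern is what lets oppen.js stay ignorant of how the downloading works; it just hands over soundsAreLoaded and waits.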

It's important that no sounds are played until the user clicks the screen. If it's done this way, the program will work OK in iOS. iOS will not play any sound until the user clicks the screen. After that, iOS releases its death grip on the audio and sounds can be played. Apparently, at that point, if you're using the Web Audio API, you can even play sounds that aren't triggered by a user click. As, of course, you should be able to, unless Apple is trying to kill the browser as a delivery system for interactive multimedia.

I've tested Oppen Do Down on Android, the iPad, the iPhone, and on Windows under Chrome, Edge, Firefox and Opera. Under OSX, I've tested it with Chrome, Safari and Firefox. It runs on them all. The Web Audio API seems to be well-supported on all the systems I've tried it on.

After the sounds are loaded and the user clicks the screen to begin playing with Oppen Do Down, we find the sound we want to play initially. Its name is '1'. It's the sound associated with the word 'badly'. We turn the word 'badly' blue and we play sound '1'. We also make the opening screen invisible and display the main screen of Oppen Do Down (which is held in the div with id='container'):

var badly = gSounds.getSound('1');
document.getElementById('1').style.color = '#639cff';
gSounds.play(badly);
document.getElementById('openingScreen').style.display = 'none';
document.getElementById('container').style.display = 'block';

The 'gSounds.play' method is, of course, crucial to the program cuz it plays the sounds.

It also checks to see if the web worker thread is running. This separate thread is used, as in Chris Wilson's metronome program, to continually set a timer that times out just before the playing sounds finish, so the next sounds can be scheduled. If the web worker isn't running, 'gSounds.play' starts it. Then it plays the '1' sound.

Just before '1' finishes playing--actually, pLookAhead milliseconds before it finishes, where pLookAhead is currently set to 25--the web worker's timer times out and it sends the main thread a message to that effect. The main thread then calls the 'scheduler' function to schedule the sounds that will start playing in pLookAhead milliseconds.

If the listener did nothing else, this process would repeat indefinitely. Play the sound. The worker thread's timer ticks just before the sound finishes, and then sounds are scheduled to play.

But, of course, the listener clicks on words to start and stop sounds. When the listener clicks a word to start the associated sound, 'gSounds.play' checks how far into the current loop we are and starts the new sound so that it's synchronized with the playing sounds. Even if no sounds are playing, the web worker is busy ticking and sending messages at the right times, so that new sounds can be started at the right time.
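
The synchronization arithmetic is simple because every sound is a loop of the same length. Here's a sketch of the idea--the names are hypothetical (loopStartTime would be the clock reading at which looping began), and the real currentTime would come from AudioContext.currentTime:

```javascript
// When the listener clicks a word mid-loop, the new sound has to line up
// with whatever is already playing. Hypothetical names, not oppen.js's.
var LOOP_DURATION = 2.0; // seconds

function loopPosition(currentTime, loopStartTime) {
  var elapsed = currentTime - loopStartTime;  // time since looping began
  var offset = elapsed % LOOP_DURATION;       // how far into the current pass
  return {
    offset: offset,
    nextBoundary: currentTime + (LOOP_DURATION - offset) // next loop boundary
  };
}
```

With those two numbers in hand, the Web Audio API gives you a choice: schedule the new sound to start exactly at nextBoundary, or start it immediately part-way through its buffer via source.start(when, offset). Either way the clock's accuracy keeps everything locked together.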

Anyway, that's a sketch of how the programming in Oppen Do Down works.

Chris Joseph gave me some good feedback. He noticed that as he added sounds to the mix, the volume increased and distortion set in after about 3 or 4 sounds were playing. He suggested that I put in a volume control to control the distortion. He further suggested that each sound have a gain node and there also be a master gain node, so that the volume of each sound could be adjusted.

The idea is that as the listener adds sounds, the volume remains constant. Which is what the 'adjustVolumes' function is about. It works well.
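
I won't reproduce 'adjustVolumes' here, but the gist can be sketched like this. The scaling curve is an assumption on my part--1/n keeps the summed peak amplitude constant; the real function may well use a different curve--and the node wiring in the comment is the standard Web Audio pattern Chris Joseph suggested, one GainNode per sound feeding a master GainNode:

```javascript
// Sketch of keeping the overall level constant as sounds are added.
// Scaling each sound's gain by 1/n is an assumption here; it guarantees
// the sum of n full-scale signals can't clip.
function perSoundGain(numPlaying) {
  if (numPlaying === 0) return 0;
  return 1 / numPlaying;
}

// In the real program, something along these lines would run whenever
// the number of playing sounds changes:
//   playingSounds.forEach(function (s) {
//     s.gainNode.gain.value = perSoundGain(playingSounds.length);
//   });
// with each s.gainNode connected to a master GainNode, which is
// connected to audioContext.destination.
```

One thing worth noting about this design: because each sound has its own gain node, you could later give each word its own volume (or fade sounds in and out) without touching the master level.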

I am happy with my first experiment with the Web Audio API. Onward and upward.

However, it's hard to be happy with some of the uses that the Web Audio API is being put to. The same is true of the Canvas API and the WebRTC API. And these, to me, are the three most exciting new web technologies. But, of course, when new, interesting, powerful tools arise on the web, the forces of dullness will conspire to use them in evil ways. These are precisely the three technologies being used to 'fingerprint' and track users on the web. This is the sort of crap that makes everything a security threat these days.