Interactive Video and Audio

Understanding and creating interactive motion and sound on the web



For a long time, capturing, processing and playing video and audio in the web browser was left to Adobe Flash. Today, modern browsers provide several HTML tags and a set of JavaScript features that let you easily load, play, pause and stop audio and video.

Taken together, the HTML tags and related JavaScript commands are known variously as HTML5 Media, Native Media, the Media API and HTML5 Video/Audio*. Why the multiple names? Don’t get me started.

*And that’s not all, folks. Modern browsers also let you go crazy with your audio: produce tones, create filters, control frequency, capture microphone input and heaps more audiophile goodness, all through the Web Audio API. Note: it can do more than the tags, but it is much more difficult to learn and use, and it is overkill if all you want to do is play an MP3 file. Have a look here for more info.

Want “Siri” in your browser? Google is experimenting with speech recognition at the moment, and you can get it working in Chrome. Check this demo out.

The Tags

Adding sound or video is very simple. Use the <audio> or <video> tag with a src attribute to link to your source file. Add the controls attribute to display a control panel for the content:

<audio src="path/to/audio.mp3" controls></audio>
<video src="path/to/video.mp4" controls></video>

Different browsers support different file formats, so to be cross-browser compatible you need to supply your audio/video in multiple formats. Use the structure below to do so. The browser will play the first file it supports.

<video controls>
  <source src="SampleVideo.ogv" type="video/ogg">
  <source src="SampleVideo.mp4" type="video/mp4">
</video>
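
The browser makes this choice for you when you list several <source> elements, but you can ask the same question from JavaScript with the canPlayType() method. The sketch below is only an illustration: the helper name pickPlayableSource and the way candidates are passed in are assumptions made for this example, not part of any API.

```javascript
// Hypothetical helper: returns the src of the first candidate the media
// element reports it can play, or null if none is playable.
// canPlayType() returns "probably", "maybe" or "" (empty string means no).
function pickPlayableSource(media, candidates) {
  for (var i = 0; i < candidates.length; i++) {
    if (media.canPlayType(candidates[i].type) !== "") {
      return candidates[i].src;
    }
  }
  return null;
}

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== "undefined") {
  var probe = document.createElement("video");
  var chosen = pickPlayableSource(probe, [
    { src: "SampleVideo.ogv", type: "video/ogg" },
    { src: "SampleVideo.mp4", type: "video/mp4" }
  ]);
  console.log("This browser would play: " + chosen);
}
```

In most pages the plain <source> list is all you need; a check like this is mainly useful when you load files from script.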

Tag Attributes

Add attributes to control loading and playback behaviour. In this example the file “audio.mp3” will preload, play immediately and loop when done.

<audio src="path/to/audio.mp3" controls loop autoplay preload="auto"></audio>

src = “path/to/file.mp3” – URL for the audio stream

preload = “none” – don’t preload || “metadata” – metadata only || “auto” – preload everything

autoplay – start playback automatically as soon as enough has loaded to play through without stopping

controls – display default controls

loop  – loop it
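
Each of these attributes has a matching JavaScript property on the element, so the same setup can be done from script. This is a minimal sketch; configureAudio is a hypothetical helper name and the file path is the placeholder used above.

```javascript
// Hypothetical helper: applies the same settings as the tag attributes.
function configureAudio(audio, src) {
  audio.src = src;         // URL for the audio stream
  audio.preload = "auto";  // "none", "metadata" or "auto"
  audio.controls = true;   // display the default controls
  audio.loop = true;       // start again when the end is reached
  audio.autoplay = true;   // start as soon as enough data has loaded
  return audio;
}

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== "undefined") {
  var player = configureAudio(document.createElement("audio"), "path/to/audio.mp3");
  document.body.appendChild(player);
}
```

Note that many current browsers block autoplay with sound until the user has interacted with the page, so do not rely on autoplay alone.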


Modern browsers also provide a number of JavaScript features we can apply to our content. Together, these form the HTML5 Media API, which includes:

  • Methods that allow us to control our content, for example telling the browser to play or pause it.
  • Events that allow us to respond or react to things that are happening, for example when the video has finished playing.
  • Properties that allow us to get information about the characteristics of our content, such as its duration or the current playback position.
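
To make the three categories concrete, here is a small sketch that uses one of each: the play() and pause() methods, the ended event, and the paused and duration properties. The helper name attachPlayer and the click-anywhere behaviour are assumptions made for this example.

```javascript
// Hypothetical helper: wires one event, two methods and two properties together.
// Returns a function that toggles playback each time it is called.
function attachPlayer(media, log) {
  // Event: fires when playback reaches the end of the file.
  media.addEventListener("ended", function () {
    log("Finished after " + media.duration + " seconds"); // Property: duration
  });
  return function toggle() {
    if (media.paused) { // Property: true while the media is not playing
      media.play();     // Method: start or resume playback
    } else {
      media.pause();    // Method: pause playback
    }
  };
}

// Browser-only usage: clicking anywhere toggles the first audio/video element.
if (typeof document !== "undefined") {
  var firstMedia = document.querySelector("audio, video");
  if (firstMedia) {
    var togglePlayback = attachPlayer(firstMedia, function (msg) { console.log(msg); });
    document.addEventListener("click", togglePlayback);
  }
}
```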

In the example below, we have two elements in our page, an <audio> element and a <button> element.

We use JavaScript to watch for a click event on the <button> element. When the click occurs, we run a set of instructions that tell the browser to start playing our song and to get some info about it.

<audio id="trackOne" src="path/to/audio.mp3"></audio>

<button id="playButton"> Play Audio </button>


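 var song = document.getElementById("trackOne");
 var button = document.getElementById("playButton");

 // On click, start playback and log some info about the track
 button.onclick = function() {
   song.play();
   console.log("Duration in seconds: " + song.duration);
 };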


In fact, you can find out all kinds of things about your audio with JavaScript, including how long the track has been playing, the duration of the track, the playback rate, metadata and more. A full list of JavaScript commands can be found at W3Schools HTML Audio/Video DOM Reference.
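
As a rough illustration, the sketch below reads three of those properties (currentTime, duration and playbackRate) and turns them into a readable string. The helper names formatTime and playbackSummary are made up for this example.

```javascript
// Hypothetical helper: turns a number of seconds into "m:ss".
function formatTime(seconds) {
  var whole = Math.floor(seconds);
  var mins = Math.floor(whole / 60);
  var secs = whole % 60;
  return mins + ":" + (secs < 10 ? "0" : "") + secs;
}

// Hypothetical helper: describes where playback currently is.
// duration is NaN until the browser has loaded the file's metadata,
// so call this once the "loadedmetadata" event has fired.
function playbackSummary(media) {
  return formatTime(media.currentTime) + " / " + formatTime(media.duration) +
    " at " + media.playbackRate + "x";
}

// Browser-only usage: log progress while the track with id "trackOne" plays.
if (typeof document !== "undefined") {
  var track = document.getElementById("trackOne");
  if (track) {
    track.addEventListener("timeupdate", function () {
      console.log(playbackSummary(track));
    });
  }
}
```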

Examples and Resources

Native Audio in the browser Article and examples by Mark Boas.

Using HTML5 Audio and Video Tutorials and examples from the Mozilla Developer Network

HTML5 Overview Examples from Opera Developer Resources

HTML5 Audio and Video Tutorials and examples from

MDN Video API Reference Javascript

MDN Audio API Reference Javascript

Source: Getting Started with Web Audio API – HTML5 Rocks

2. Howler.js

Howler.js is touted simply as a “JavaScript audio library for the modern web” that defaults to the Web Audio API and falls back to HTML5 audio.

Getting Started with Web Audio API

Before the HTML5 <audio> element, Flash or another plugin was required to break the silence of the web. While audio on the web no longer requires a plugin, the audio tag brings significant limitations for implementing sophisticated games and interactive applications.

The Web Audio API is a high-level JavaScript API for processing and synthesizing audio in web applications. The goal of this API is to include capabilities found in modern game audio engines and some of the mixing, processing, and filtering tasks that are found in modern desktop audio production applications. What follows is a gentle introduction to using this powerful API.

Getting started with the AudioContext

An AudioContext is for managing and playing all sounds. To produce a sound using the Web Audio API, create one or more sound sources and connect them to the sound destination provided by the AudioContext instance. This connection doesn’t need to be direct, and can go through any number of intermediate AudioNodes which act as processing modules for the audio signal. This routing is described in greater detail at the Web Audio specification.
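
The routing described above can be sketched as follows: one sound source (an oscillator), one intermediate node (a gain node for volume), and the context's destination. The function name playTone and the chosen volume are assumptions for this example, not something taken from the article.

```javascript
// Hypothetical helper: plays a tone by routing
// source (oscillator) -> intermediate node (gain) -> destination.
function playTone(context, frequency, seconds) {
  var oscillator = context.createOscillator(); // the sound source
  var gain = context.createGain();             // intermediate processing node
  oscillator.frequency.value = frequency;      // pitch in hertz
  gain.gain.value = 0.2;                       // keep the volume low
  oscillator.connect(gain);
  gain.connect(context.destination);           // the speakers
  oscillator.start(context.currentTime);
  oscillator.stop(context.currentTime + seconds);
  return oscillator;
}

// Browser-only usage: play a one-second A4 (440 Hz) on the first click.
// Browsers generally only allow audio to start after a user gesture.
if (typeof document !== "undefined") {
  document.addEventListener("click", function () {
    var ContextClass = window.AudioContext || window.webkitAudioContext;
    playTone(new ContextClass(), 440, 1);
  }, { once: true });
}
```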

Browser Support

If the library you’ve chosen uses HTML5 Audio, support is available everywhere including IE9+. But if the library uses the Web Audio API, as is the case with all of the above libraries except Fifer, then support is not as good.

Can I Use... support charts

There’s missing support in some mobile browsers and Safari requires vendor prefixes. The worst news, however, is the fact that there is no version of IE that supports the Web Audio API, not even IE11. It is an open issue with the IE team, so hopefully that will change very soon.
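
Because support varies, it is worth checking for the API before using it. This is a minimal feature-detection sketch; the helper name getAudioContextClass is hypothetical, and webkitAudioContext is the prefixed name used by older WebKit-based browsers.

```javascript
// Hypothetical helper: returns the AudioContext constructor if the browser
// has one (prefixed or not), or null if the Web Audio API is unavailable.
function getAudioContextClass(globalObject) {
  return globalObject.AudioContext || globalObject.webkitAudioContext || null;
}

// Browser-only usage; skipped where there is no window object.
if (typeof window !== "undefined") {
  if (getAudioContextClass(window)) {
    console.log("Web Audio API is available");
  } else {
    console.log("No Web Audio API here, fall back to the <audio> tag");
  }
}
```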

For many years recording webcam video on the web meant using Adobe’s Flash plugin. This is still true at the moment, but the new JavaScript Media Recorder API (previously known as the MediaSt…

Source: The New Media Recorder API – Pipe Blog


The new Web Audio API defined in the latest HTML 5 standards allows audio to be sourced, processed and filtered dynamically in real time. This opens a whole new world of creative possibilities in sound, at precisely the time that the potential audience for sonic experiences is booming. Next time you are on a crowded commuter train, have a look around. Notice how many people have their headphones plugged in to their mobile devices. This is the future, the new sound ecosystem, which is already starting to be exploited by many of the world’s top brand names.

You can add sound to a page with the <audio> and <video> tags or with the Web Audio API.

What’s the difference?

If you just want to get some sounds playing and stopping, the controls offered through HTML5 Audio will be fine.

The Web Audio API is a set of JavaScript methods and commands that give you a lot of control over audio. You can use JavaScript to trigger and monitor loading, playback, crossfading, mixing and a whole range of audiophile controls. You can also get information such as playback time and elapsed time, run spectrum analysis, and so on. Most of the time this kind of access is only required for building specialised interactive tools or high-end UIs.
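
As one example of that extra control, here is a sketch of a crossfade between two tracks. It uses an equal-power curve, a common technique that keeps the overall loudness roughly constant during the fade. The helper names and the idea of passing in two gain nodes are assumptions for this example.

```javascript
// Hypothetical helper: equal-power gains for a crossfade position
// between 0 (only the first track) and 1 (only the second track).
function equalPowerGains(position) {
  var x = Math.min(1, Math.max(0, position)); // clamp to the range 0..1
  return {
    first: Math.cos(x * 0.5 * Math.PI),
    second: Math.cos((1 - x) * 0.5 * Math.PI)
  };
}

// Hypothetical helper: applies the gains to two Web Audio gain nodes
// (each created with context.createGain() and connected to its own source).
function applyCrossfade(gainNodeA, gainNodeB, position) {
  var gains = equalPowerGains(position);
  gainNodeA.gain.value = gains.first;
  gainNodeB.gain.value = gains.second;
}
```

Halfway through the fade both tracks play at about 71% gain rather than 50%, which avoids the dip in volume a straight linear fade would produce.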



SoundManager 2 API


Web Audio API Resources


For more general info on APIs

What is an API?

What Are APIs, And How Are Open APIs Changing The Internet

What Is HTML5, And How Does It Change The Way I Browse? [MakeUseOf Explains]

10 Websites to See What HTML5 Is All About

Written by Matthew Hughes

February 19, 2015

What you might not know is that these individual components of HTML5 are largely considered to be APIs in the truest sense. How so? Well, firstly, like all APIs, there’s a published and carefully designed standard of how this functionality of the browser works, and how developers use it.

Tutorials and How-To

HTML5: Video and Audio in Depth Video tutorial series by Steve Heffernan on

Capturing Audio & Video in HTML5 Article and Tutorials by Eric Bidelman

Getting Started with Web Audio API Article and Tutorials by Boris Smus

Learning Web Audio API Article and links to resources by Aqilah Misuary at Sonoport.

Web Audio School A series of examples with step by step instructions by Matt McKegg

Web Audio API Spec at Mozilla Developer Network


Source: The HTML5 Speech Recognition API | Shape Shed

The MediaRecorder API (MediaStream Recording) aims to provide a really simple mechanism by which developers can record media streams from the user’s input devices and instantly use them in web apps, rather than having to perform manual encoding operations on raw PCM data, etc., which would be required when using Navigator.getUserMedia() alone.

Source: MediaRecorder API – Web APIs | MDN
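
A minimal sketch of that workflow is shown below. The helper name startRecording, the three-second duration and the choice to pass the recorder class in as a parameter are assumptions for this example; the MediaRecorder calls themselves (start(), stop(), and the dataavailable and stop events) are part of the API.

```javascript
// Hypothetical helper: starts recording a media stream and calls onFinished
// with the recorded chunks once the recorder is stopped.
// Recorder is the recorder class to use (MediaRecorder in a browser).
function startRecording(stream, onFinished, Recorder) {
  var chunks = [];
  var recorder = new Recorder(stream);
  recorder.ondataavailable = function (event) {
    if (event.data && event.data.size > 0) {
      chunks.push(event.data); // collect each piece of encoded media
    }
  };
  recorder.onstop = function () {
    onFinished(chunks);
  };
  recorder.start();
  return recorder;
}

// Browser-only usage: on the first click, record three seconds of microphone audio.
if (typeof document !== "undefined" && typeof MediaRecorder !== "undefined") {
  document.addEventListener("click", function () {
    navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
      var recorder = startRecording(stream, function (chunks) {
        var blob = new Blob(chunks, { type: recorder.mimeType });
        console.log("Recorded " + blob.size + " bytes");
      }, MediaRecorder);
      setTimeout(function () { recorder.stop(); }, 3000);
    });
  }, { once: true });
}
```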


A gentle introduction to loading and playing, cross-fading, and filtering sound using the Web Audio API.

Source: Getting Started with Web Audio API – HTML5 Rocks

Speech Recognition and Synthesis Using JavaScript

This post is part 16 of the Speech Recognition and Synthesis Using JavaScript post series.

In this post we will have a look at the Speech Recognition API, the Speech Synthesis API and the HTML5 Form Speech Input API.

Speech Recognition API

The Speech Recognition API allows websites to listen to audio through the microphone and convert the speech to text.

At present only Chrome supports this API. Other browsers are expected to support it in future.

Let’s have a look at the API
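
Here is a minimal sketch of how it is typically used. Chrome exposes the API under the prefixed name webkitSpeechRecognition; the helper name createRecognizer, the "en-US" language setting and the click-to-start behaviour are assumptions for this example.

```javascript
// Hypothetical helper: creates a recognizer that passes each recognised
// phrase to onText, or returns null if the browser has no support.
function createRecognizer(globalObject, onText) {
  var Recognition = globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition;
  if (!Recognition) {
    return null;
  }
  var recognizer = new Recognition();
  recognizer.lang = "en-US";          // language to listen for (assumed)
  recognizer.interimResults = false;  // only report final results
  recognizer.onresult = function (event) {
    // Each result holds alternatives; take the top one of the latest result.
    var latest = event.results[event.results.length - 1];
    onText(latest[0].transcript);
  };
  return recognizer;
}

// Browser-only usage: start listening on the first click.
if (typeof document !== "undefined") {
  var speech = createRecognizer(window, function (text) {
    console.log("You said: " + text);
  });
  if (speech) {
    document.addEventListener("click", function () { speech.start(); }, { once: true });
  }
}
```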
