Practical Web Audio

There is little debate that Web Audio is cool. Take for example Stepkit by Brent Jackson (embedded below).

It’s definitely a fun toy to play with, but most of us probably couldn’t think of how this might be relevant to our jobs. When I presented 8-bit game music with the Web Audio API at last year’s Fluent Conference, I readily admitted that it was intended to be purely fun rather than practical.

Recently I explored how to add audio to web apps, but I think the bigger problem isn’t that web developers are unsure how to add audio to their apps; it’s that they don’t think they should. In this article, I’d like to make the case that you should be considering audio when designing your web application user interface.

This article is based on my presentation at Fluent 2015.

Examples of Audio in UI

Once you begin to think about it, audio is actually very common in UIs. Let’s look at some examples of where audio usage is common and then we’ll explore why it is used.

Video Games

The most obvious example is in game UI. In the example below from the Watch Dogs game, you can see that we’re not just talking about music, but the interaction with menus and transitions all include audio effects.

Mobile Apps

Many, if not most, mobile apps include some level of audio effects for both interactions and notifications. When you pull to refresh, or perform certain actions, you often get some form of audio feedback. For example, here are the audio effects settings for the Twitter app on Android, which allow you to enable or disable sound effects.

Twitter for Android

Desktop Apps

Many, though by no means all, desktop applications have some level of audio-enhanced interactions or notifications. Skype may be among the first that come to mind (perhaps not entirely positively), as it plays all sorts of sound effects when a connection is made, a person signs on, a message is received, etc. Slack is another one that I use often that has a number of audio effects, mostly associated with notifications. Outlook, as shown below, adds sounds to a number of events that occur within the app.

Outlook audio settings

The Web is (mostly) Silent

So, desktop apps, mobile apps and games all include audio within their UIs but the web is mostly silent. In fact, I suspect most developers don’t even consider audio at all when designing their UI.

I think there are two reasons for this. The first is that we have only had access to the Web Audio API relatively recently. However, browser support is now very broad.

CanIUse Web Audio

The second reason is that I think there is a miserable legacy for audio on the web starting with MIDI and moving on to annoying Flash intros. Developers have, at times, abused the audio options available to them (case in point) and this has left many people with a bad taste in their mouths.

Why Apps Include Audio

The types of applications discussed above tend to include audio for a variety of reasons.

Sound Communicates Atmosphere

This is the reason that most game UIs use audio – it helps to create the atmosphere that the game is trying to represent. Looking back at the earlier example of Watch Dogs, it is conveying a slightly futuristic, perhaps slightly dystopian, atmosphere with its choice of sounds.

If you think about it another way, science fiction movies and television often depict computers that respond with all sorts of sounds when the user interacts with them. This isn’t intended to reflect real life – in reality, this would likely be annoying – but the sounds are part of what makes the scene feel futuristic.

We’ve all seen Star Trek etc and are used to the idea of ‘high tech’ machines beeping, whether to alert us or simply as user feedback devices when activated. Similarly many cellphones default to beeping on user actions eg entering txt etc. But it is rare to see it used on a website, not that anyone wants to be ‘beeping’ on every click, but there is the potential to add to the character & feel of a site via use of sound…
Tim Prebble

This use-case probably doesn’t apply frequently to web apps. However, there are brands that have sound as a key part of their brand identity. This can be taken into account when choosing the right audio to add to your UI for other reasons, which we’ll discuss below. Even if your app does not have a brand identity to adhere to, it’s important to consider the kind of atmosphere your sound choices may create.

Sound Communicates When You’re Distracted

It’s important to keep in mind what we mean by distracted here. We don’t exclusively mean when you aren’t paying any attention to the application, though that can be important too. Sometimes, “distracted” simply means that you are paying attention to some other aspect of the application than the part that I need you to notice at this moment.

Being a mostly visual medium, web application UIs tend to rely entirely on visual cues to communicate important information and changes. A new news item may glow briefly as it appears on a list or slide in using some form of animation. The goal here is to get your attention on a part of the app that may have changed or contain important new information.

However, when you are not looking at the application at all – say, for instance, you’re on another tab or outside of the browser entirely – we have very limited visual options to gain the user’s attention.

Audio is particularly useful when there is no screen or when looking at the screen is not possible or not desirable (such as when users want to multitask).
Karen Kaushansky

Sound Conveys Meaning

This is key, and goes beyond conveying atmosphere, which we discussed earlier. The distracted use-case is probably the easiest to grasp, but using sound to convey or enhance meaning is, in my opinion, the most important use when done properly. To illustrate what I mean, let’s first look at an (admittedly silly) example.

The sounds in this scene were intentionally unrealistic, but I bet you could have closed your eyes and still known, more or less, what was going on. This may seem unimportant to you until you consider the fact that, to take just one example, no one actually makes a springy noise when they jump. Yet, you hear that springy noise and you think “jump.” Why? Because that sound conveys a meaning, and that meaning was learned and reinforced over time.

In much the same way, if you design your use of sound with a good deal of thought, the users of your web application can learn the meaning and importance of different sounds.

Gaver (1986) investigated representational earcons, which he called auditory icons. His auditory icons are caricatures of naturally occurring sounds such as bumps, scrapes, or even files hitting mailboxes.
Meera M. Blattner, Denise A. Sumikawa, and Robert M. Greenberg (1989)

The quote above discusses the idea of “earcons,” which are effectively sound icons. Think about what an icon is: a very simple, abstract image that conveys a lot of meaning. For instance, the much-maligned “hamburger menu” is just a few simple lines and yet we generally know that it represents some sort of menu or navigation.

Earcons are similar in that they are simple but can convey a lot of meaning. However, unlike what the above quote implies, they do not need to be realistic sounds. In fact, the meaning of an earcon does not need to be obvious the first time, as its meaning can be learned over time (though this doesn’t mean you shouldn’t consider your sound choices carefully).

[Earcons] are audio messages used in the user-computer interface to provide information and feedback to the user about computer entities.
Karen Kaushansky
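
To make the idea of learned earcons a little more concrete, here is a minimal sketch of a table that gives each kind of event its own short tone sequence. The names `EARCONS`, `playEarcon` and `AudioContextLike`, as well as the specific pitches and durations, are my own assumptions for illustration and are not taken from any library. The `AudioContextLike` interface only describes the handful of real Web Audio members the sketch relies on (`createOscillator`, `currentTime`, `destination`).

```typescript
type EarconKind = "info" | "success" | "error";

interface Earcon {
  notes: number[];     // frequencies in Hz, played one after another
  noteSeconds: number; // duration of each note in seconds
}

// Hypothetical earcon table; the pitches are illustrative assumptions.
const EARCONS: Record<EarconKind, Earcon> = {
  info: { notes: [660], noteSeconds: 0.12 },
  success: { notes: [523.25, 783.99], noteSeconds: 0.1 }, // rising C5 -> G5
  error: { notes: [440, 311.13], noteSeconds: 0.15 },     // falling A4 -> D#4
};

// The subset of the Web Audio API that this sketch uses.
interface OscillatorLike {
  frequency: { value: number };
  connect(destination: unknown): unknown;
  start(when?: number): void;
  stop(when?: number): void;
}

interface AudioContextLike {
  currentTime: number;
  destination: unknown;
  createOscillator(): OscillatorLike;
}

// Schedules one oscillator per note and returns the total duration in seconds.
function playEarcon(ctx: AudioContextLike, kind: EarconKind): number {
  const { notes, noteSeconds } = EARCONS[kind];
  notes.forEach((frequency, index) => {
    const oscillator = ctx.createOscillator();
    oscillator.frequency.value = frequency;
    oscillator.connect(ctx.destination);
    const startAt = ctx.currentTime + index * noteSeconds;
    oscillator.start(startAt);
    oscillator.stop(startAt + noteSeconds);
  });
  return notes.length * noteSeconds;
}
```

In a browser you would pass in a real `AudioContext`. Because the same kind of event always produces the same sequence, users get the chance to learn what each sound means.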

Thinking About Sound

As we’ve seen, sound can be used to achieve a number of important goals within a UI that cannot be achieved through visuals alone. However, if you choose to add sound to your application, it’s important to do so very carefully. Much like overusing any sort of UI element (like too many animations or too many fonts), overusing or poorly using audio can be painful to your users – perhaps even more so.

Choose Your Sounds Carefully

This rule applies to several aspects of your sounds:

  • Implied meaning – while, as we discussed earlier, the meaning of sounds can be learned, some sounds come with an implied meaning – for instance, a rising pitch usually implies a rising value, and dissonant sounds usually imply some form of error.
  • Avoid being annoying – overuse of audio can be annoying, but so can the wrong sounds – beware of harsh sounds, especially if they will be used frequently (I’m looking at you, Slack!).

    “Would the addition of sound to this event provide redundant information and therefore better feedback? Or would this sound merely be superfluous and/or annoying?”
    Victor Lombardi
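
As an illustration of the “rising pitch implies rising value” idea, here is a small sketch that maps a value (say, a slider position) to a frequency. The function name `pitchForValue` and the default 220–880 Hz range are assumptions I picked for the example. The mapping is exponential rather than linear because we perceive pitch roughly in ratios, so equal steps in value sound like equal steps in pitch.

```typescript
// Hypothetical helper: maps a value in [min, max] to a frequency in hertz.
// The default range of 220-880 Hz (two octaves) is an illustrative assumption.
function pitchForValue(
  value: number,
  min: number,
  max: number,
  lowHz: number = 220,
  highHz: number = 880
): number {
  if (max <= min) {
    throw new RangeError("max must be greater than min");
  }
  // Clamp so out-of-range values never produce extreme pitches.
  const clamped = Math.min(Math.max(value, min), max);
  const t = (clamped - min) / (max - min);
  // Exponential interpolation between lowHz and highHz.
  return lowHz * Math.pow(highHz / lowHz, t);
}
```

The result could then be assigned to the `frequency.value` of a Web Audio `OscillatorNode`, so that a higher value is heard as a higher tone.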

Plan for silence

You cannot guarantee that your users will always be able to hear audio, so make sure you never communicate information only via sound – have the sound enhance a visual cue. Even if the user doesn’t have sound disabled, they could be using a browser that doesn’t fully support the Web Audio API.
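
One way to follow this rule is to treat sound strictly as an optional extra. The sketch below is a rough illustration under my own assumptions: `createAudioContextIfSupported` and `announce` are hypothetical names, and the `env` parameter stands in for the browser’s `window` object so that the detection logic does not depend on a global. It checks for `AudioContext` (and the older prefixed `webkitAudioContext`), and it always shows the visual cue before attempting any sound.

```typescript
// The only members of the global object this sketch looks at.
interface AudioCapableGlobal {
  AudioContext?: new () => object;
  webkitAudioContext?: new () => object;
}

// Returns a new audio context, or null when Web Audio is unavailable.
function createAudioContextIfSupported(env: AudioCapableGlobal): object | null {
  const Ctor = env.AudioContext ?? env.webkitAudioContext;
  if (!Ctor) {
    return null;
  }
  try {
    return new Ctor();
  } catch {
    // Construction can fail; fall back to silence.
    return null;
  }
}

// The visual cue carries the information; the sound only reinforces it.
function announce(
  message: string,
  showVisualCue: (message: string) => void,
  playSound: (() => void) | null
): void {
  showVisualCue(message);
  if (playSound) {
    try {
      playSound();
    } catch {
      // Sound is an enhancement, so a playback failure is ignored.
    }
  }
}
```

In a browser you might call `createAudioContextIfSupported(window)` once and pass `null` as `playSound` whenever it returns `null` or the user has muted your app.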

The following is from the Apple iOS Human Interface Guidelines, but the same rules generally apply to desktop web apps:

Users switch their devices to silent when they want to:

  • Avoid being interrupted by unexpected sounds, such as phone ringtones and incoming message sounds
  • Avoid hearing sounds that are the byproducts of user actions, such as keyboard or other feedback sounds, incidental sounds, or app startup sounds
  • Avoid hearing game sounds that are not essential to using the game, such as sound effects and soundtracks

Some Ideas

Hopefully I’ve shown that there’s a case to be made for including audio in your UI. Not all apps need audio, but I think it should be considered whenever planning the UI of a web app – even if, after consideration, you decide that it doesn’t make sense for your situation.

However, perhaps you are still unsure of situations where you might use audio. So, I’ve put together a list of some ideas I had, though it is by no means comprehensive. Some of these were discussed in my prior article, Adding Audio to Web Apps, which covers the actual implementation details. You can also find even more examples in my GitHub repository.

Here are just some of my own ideas to get your (better) ideas flowing (note: if I have an example of this in my repository, it will be linked):

  • Sliders

  • Inputs

    • Indicate a form input/validation error.
    • Indicate that a limit has been reached – useful for click-happy users who may think something isn’t working when, in fact, they’ve just hit a limit.
  • Page updates/status
    • Indicate changes to a page.
    • Redo/Undo – again, as a reinforcer of an action we don’t ever want our users to do accidentally.
    • Time limit – for instance, for a session timeout or a purchase that has a time limit (for example, tickets). This and other examples can be used in conjunction with the Page Visibility API to ensure that the warning noise only occurs when the page is not visible to the user.
  • Buttons
    • Enable or disable specific functionality – another example of reinforcing an important action.
    • Multiple pushes modify a value – again, especially useful if the button and the value are not in close proximity.
    • Loading/loaded state of asynchronous content – useful for long loading processes, this can reinforce a visual loading cue and also make the user aware that a process has finished even if the visual cue is offscreen or the user is distracted.
  • Notifications
    • On error or standard notification – this is a good case where the meaning of a sound can be learned – for example an error notification will have a different sound than a purely informational notification.
    • Data/email received – for notifying the user of important information being received – if a lot of these will occur, it should be combined with the Page Visibility API so that the user only hears these when the page is not visible.
  • Other
    • Motion/Gesture based interactions – as non-pointer-based interaction becomes more common, so do inadvertent actions and thus the importance of reinforcing certain actions with sound.
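
Several of the ideas above mention the Page Visibility API. As a rough sketch of how that check might look, the function below decides whether a notification sound should play. `VisibilitySource` mirrors the real `document.hidden` property, while `SoundPreferences`, `shouldPlaySound` and `handleIncoming` are hypothetical names I made up for this example.

```typescript
// The part of `document` that the Page Visibility check needs.
interface VisibilitySource {
  hidden: boolean;
}

// Hypothetical user settings for this sketch.
interface SoundPreferences {
  soundEnabled: boolean;   // the user has not muted the app
  onlyWhenHidden: boolean; // stay quiet while the page is visible
}

function shouldPlaySound(doc: VisibilitySource, prefs: SoundPreferences): boolean {
  if (!prefs.soundEnabled) {
    return false;
  }
  return prefs.onlyWhenHidden ? doc.hidden : true;
}

// Updates the visual indicator every time, and plays the sound only when allowed.
function handleIncoming(
  doc: VisibilitySource,
  prefs: SoundPreferences,
  showBadge: () => void,
  playChime: () => void
): void {
  showBadge();
  if (shouldPlaySound(doc, prefs)) {
    playChime();
  }
}
```

In a browser you would pass the global `document` as `doc`. With `onlyWhenHidden` set, a user who is already looking at the page sees the badge but is spared the chime.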

I’d love to hear your ideas for using sound in web applications. After my Fluent session, I heard from a number of people who gave me fantastic examples of where these very sorts of use cases came up in their apps. If you have any of your own, please feel free to share in the comments.

Header image courtesy of Iwan Gabovitch

Comments

  • Fantastic post, really the kind of thing that will help push web audio along beyond toys and towards a better internet.

  • Kyle Hayes

    I’m glad you brought this topic up. I think it’s another important consideration for UX that our teams are _not_ talking about today but I could see it being very useful. Google Calendar provides a simple chime for when I have an event coming up and that’s been so useful since it’s typically in the background. It’s unique enough that I learned it was Google Calendar and now I don’t even need to open the app if I know what the meeting is.

    I could also see it being logical for in-app notifications whilst interacting with it, as apps get more complex and our attention may not be set on the thing that needs it, as you alluded to.

    Concepts get adopted better when the path is paved. Responsive and grid based layouts took off after Bootstrap, Web fonts took off after Google Fonts and font-awesome. Look at what Glyphicons have done for vector imagery in web apps.

    I think for this concept of interface audio cues to get some traction it would be good to have a Glyphicons for sound. Open free licensed sounds that can be easily incorporated into an existing application as standalone sounds or a way of combining them to synchronize new sound patterns. I’ve not looked at the Web Audio API yet but I’m curious as to its API.

    Thank you for another great article.

  • https://danielrapp.github.io/doppler/ – using the Web Audio API to make… a new input method.

  • Andrew M. Sheppard

    What insanity be this? I agree: how quickly do we forget how MIDI was abused back in GeoCities heyday. So why is this ‘concept’ being revisited? Yeah, I understand the point re: UI/UX enhancements but this assumes a user _wants_ audio feedback. Opt-out or opt-in? What’s the default to be?

    Your Morpheus meme img was amusing but I don’t generally read blogs w/ the same focus as I do technical white papers or theses. Man pages & doc sites don’t need Star Trek’s ‘flare’.

    … it’s bad enough I had to install FlashBlock to get rid of those blasted Flash ads w/ their autostarted audio. I wear headphones whilst I work & normally have a dozen or so tabs open @ any one time.

    Why do you hate my eardrums? X-D