ALL ART BURNS

It does, you know. You just have to get it hot enough.

Thursday, June 30, 2005

Why does the iPod have a screen?

There are areas of industrial design that might seem esoteric but that I think every lay person can appreciate. One area is the use of sound (audio feedback) in user interfaces. I’m sure plenty of designers think about sound when they’re designing something, but it seems that more often than not it’s a secondary issue, relegated to simple feedback during input commands or emergency notification of a problem.

Just about everything in existence makes some sort of sound when used, either by design or by accident of design. Think about the last time you tried to do something only by touch, say plugging a cable into the back of a stereo receiver, threading a nut onto a bolt in a place you can’t see, or fumbling around in the dark looking for a light switch. Think about all the noises you hear and how you respond to them. One of the first things I was taught about MIG welding was to listen for the characteristic sound of a good weld: the sound of sizzling bacon. You might not be able to see the torch or the bead, but if you hear the sizzling sound, you know you’re probably making a good weld.

Now compare that to the Icom R-3 scanner, which has an amazing range of responses to key presses: “beep” and “beep beep”. Never mind that it has an internal tuner capable of going from DC to daylight, a speaker, and a video screen; the only thing it can do in response to user input is “beep” or “beep beep”. Sure, it’s got a dual-LCD display that can show live ATSC or wireless security cam video on one screen while displaying the frequency on the secondary screen, but odds are I’m listening to an audio channel and not staring at a tiny screen. Even if I am staring at the screens, all the feedback I get is “beep” or “beep beep”.

It never hit me just how annoying this was until I started reading about the Elecraft KX1. The KX1 is a miniature, low-power transceiver with only a three-digit LED display. The rest of the feedback is audio, sent in Morse code over the audio output. You have to be a licensed amateur radio operator to transmit with a KX1, and it’s only usable in CW (amateur radio lingo for Morse code transmissions), so having it send user interface data via Morse is a rather obvious design decision.

So this leads me back to the title: why does the iPod have a screen? The iPod is about listening to music, not watching video. It’s certainly inconvenient to have to pull the iPod out of your belt pouch while jogging (and unsafe to look at it while driving) just to figure out what music you want to play next. You could argue that the iPod Shuffle lacks this limitation, but since all it can do is random play or sequential play, it doesn’t need any sort of user interface.

Here’s what I think the iPod should do: read to you what is currently on the iPod. Not just in sequential order while you mouse around, but skipping around the way your housemate would while randomly poking through the CD rack. “Aphex Twin. White Zombie. Conlon Nancarrow. Apocalyptica. Sugarcubes. Dean Martin. Bauhaus. C-Tec. Love Spirals Downwards. Chris and Cosey. Dead Voices on Air. Jane Jensen. Orbital. Download. Kraftwerk. Dead Can Dance. Shriekback. Meat Beat Manifesto.” When it reads off something you like, you hit the “OK” button, and it starts reading off similar things by genre and alphabetically.

Let’s say you press “OK” right after it says “Orbital.” It will start listing artists that start with “O”, artists in genres related to “Orbital”, and finally, everything that begins with the letter “O”: “Orbital albums by title. The Orb. 187 Calm. Aphex Twin. KLF. Electronica. Trance. Rave. Roy Orbison. Underworld. The Ordinaires. William Orbit. Vidna Obmana. Infected Mushroom. Goa. Wipeout XL Soundtrack.” and then every disc, track and artist that begins with the letter “O”.
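
For the curious, here is a rough sketch of how that browsing logic might look in Python. It is only an illustration under my own assumptions: the `Artist` record, the function names, and the genre labels in the sample library are all made up for the example, and the actual speech output is left out entirely. It shuffles the artist list for the "poking through the CD rack" pass, then, once something is picked, returns artists filed under the same letter followed by artists in the same genre.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Artist:
    # Hypothetical record; a real player would read this from file tags.
    name: str
    genre: str


def sort_key(name: str) -> str:
    """Ignore a leading 'The ' so 'The Orb' files under 'O'."""
    lowered = name.lower()
    return lowered[4:] if lowered.startswith("the ") else lowered


def random_browse(library, seed=None):
    """Artist names in shuffled order, like poking through a CD rack."""
    names = [a.name for a in library]
    random.Random(seed).shuffle(names)
    return names


def related_to(library, chosen_name):
    """Names to read out after the listener presses OK on chosen_name.

    Artists filed under the same letter come first (alphabetically),
    followed by the remaining artists that share the chosen genre.
    """
    chosen = next(a for a in library if a.name == chosen_name)
    initial = sort_key(chosen.name)[0]
    others = [a for a in library if a.name != chosen_name]
    same_letter = sorted(
        (a.name for a in others if sort_key(a.name)[0] == initial),
        key=sort_key,
    )
    same_genre = sorted(
        (a.name for a in others
         if a.genre == chosen.genre and a.name not in same_letter),
        key=sort_key,
    )
    return same_letter + same_genre


# Sample data; the genre labels are assumptions made for the example.
LIBRARY = [
    Artist("Orbital", "electronica"),
    Artist("The Orb", "electronica"),
    Artist("The Ordinaires", "rock"),
    Artist("Aphex Twin", "electronica"),
    Artist("Underworld", "electronica"),
    Artist("Dean Martin", "vocal"),
]

if __name__ == "__main__":
    print(random_browse(LIBRARY))          # order varies from run to run
    print(related_to(LIBRARY, "Orbital"))
```

This sketch files everything by the first word of the name, so getting "Roy Orbison" under "O" would take an extra sort-by-surname field, and albums and tracks would need the same treatment as artists.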

Text-to-speech conversion isn’t free, but does it cost more than adding photo support to the iPod? Which would I rather have: the ability to display photos, or a touch/audio user interface that lets me keep my eyes on the road while picking the music I want to listen to?

Update:

I completely forgot about the PhatBox from PhatNoise. It’s a hard-drive-based MP3 player for your car that uses “PhatNoise Voice Indexing” technology to read off the titles of your MP3 files. A pal of mine has one of these in his R32 and loves it.


posted by jet at 00:28  

4 Comments »

  1. I don’t even *have* an iPod, and already I want this feature.

    Comment by SB — 2005/06/30 @ 09:05

  2. Have you used TTS engines much? If not, try ecards.veepers.com (with Windows, Java, and IE); the quality of voice is less than spectacular. Also, we have many servers (Windows only, because they don’t support Linux/Unix) that run that TTS generation, and they require an unusually high amount of maintenance. So TTS is just not that simple yet. So yes, I feel that it does cost more than photo support. Even just the voice licensing fees, minus all the technical goo, cost more.

    Comment by ra — 2005/06/30 @ 11:00

  3. This would be an interesting option, but I do not think it should get rid of the screen. Most of the time I am listening to my iPod, I am at a desk or walking. I would rather have the chance to look at the screen because I can use a screen much faster than I can use a voice interface.

    Many people use the screen when they are using a playlist. Often I do not know the song that is playing. I would not want to stop the song so I can hear the title or artist.

    Comment by rich — 2005/07/06 @ 11:50

  4. First, you do realize how much you sound like a UI engineer, don’t you?

    Second, in response to “ra”, there are plenty of open source TTS engines for Linux. Try out Festival. (If you have FC4, you can “yum install festival”.) It’s got a fairly simple command line interface and a pretty nice sounding voice. I’d say it was comparable to Apple’s speech engine in quality, although there aren’t as many voices installed by default. Maybe there are more/better voices on the ’net somewhere.

    Third, about the PhatBox: I was really disappointed when I found out that it pre-renders all the TTS and menuing system into MP3 files. This means that putting 1 or 2 more songs onto your PhatBox disk isn’t a trivial copy operation. You have to re-render the whole menuing system on your PC. I’m not sure whether these sorts of tools are available on Linux or can render using Festival.

    Comment by slacy — 2005/07/26 @ 17:53
