
Taking a critical look at market and technology development around the enterprise space.


ellementK: (ĕll'ǝ-mǝnt-kā) noun - A fundamental, essential, or irreducible constituent of a composite entity. Middle English, from Old French, from Latin elementum. In this case, also related to the modern French mentir, to lie. (adapted from Dictionary.com)


About Eleanor Kruszewski: I'm known variously as Eleanor or Elle. My last name is like that coach from Duke - kru-shef-ski.

Based in Menlo Park, CA, I work for Yahoo! in their Developer Network. The easiest description of what I do is the MBA shin kicker, handling community, marketing, commercial programs and sundry backend stuff.

Disclaimer: I've done big corps, midcorps, and startups, so I overstate and oversimplify as much as anyone else. These opinions are my own, not my employer's.


How can we tap into all this audio content?

I hate podcasts and other captured audio on the web - it just doesn’t work for me. To really listen to someone speak, I find I need to be able to see them - watch their lips move and their overall delivery. The only way I can deal with it is to transcribe it, which I did here for the recent Udell podcast on IVR applications and here back in August for the Meta Group talking about Workplace.

What a pain. Sign me up for that magic speech transcription technology you were looking for when you find it. Transcription is very tedious, but personally, it’s the only way I can stay focused on the content.

More seriously, the culture of podcasting may very well drive innovation in this space - nothing like having the innovators’ commentary marooned to get them focused on a problem! When I was at BloggerconIII (some talks of which are available via ITConversations audio stream), the session on podcasting addressed this topic. What was strange was the stance that several people took. Dave Winer (conference organizer and very influential among bloggers and podcasters alike) declared that he intended never to provide transcripts for his podcasts at iPodder.org. Steve Gillmor, ZDNet editor and columnist and participant in the eponymous Gillmor Gang (to learn more about it, and see an example of the “navel gazing” that I find to be characteristic of this kind of group, see their blog) said that they did one transcript of the show, shoving the task out to India, but they were not happy with the results. The message from these two guys was that podcasting would stay audio only, and people would just have to sit through it.

Now, I’ve said that audio doesn’t work for me personally, and that’s my bias. I’m not the only one though - this has been discussed by Marc Canter and Tim Bray. You can see Udell’s response here. Tim’s phrase “four guys talking” captures my problem exactly.

I don’t intend to be hypercritical here, but it’s important we look at what this mode of interaction means - what it allows and requires both for the creator and the listener. The creator - as we saw in Jon’s case with the IVR conversation - benefits by just recording the conversation, doing the necessary processing (which is work and requires special equipment - but it’s also tech tinkering, which is fun more than tedious), and serving it. They can share the content directly, without needing to mentally pre-process it. Listeners benefit too, as Jon says himself, by having direct access to the full context of a conversation, rather than have it distilled through the views and the prejudices of the interviewer.

It’s true that text is lossy, but in podcasting we often just think about the benefits. The costs for the users are fairly high. Skimming is impossible. Searching is impossible. Pacing is out of control - if it’s too fast, you must go back (which is very cumbersome given the poor interface of the web plugins I use here, but might be easier on, say, an iPod); if it’s too slow, you’re stuck. Take Eric Rice - podcaster extraordinaire - for example. Now I like Eric personally, but he is a showman. He loves podcasting because it puts him in control of the pacing and the delivery. Listening to his podcasts, you can tell he is a radio personality, and it is his personality that he’s sharing in these ‘casts. So Eric shows us that the line between content and entertainment blurs with podcasts. And that’s great for all the people who tune in to talk radio. But wouldn’t it be better if this medium were indexable, searchable, and fungible - more like text?

Reading this page - I bet that you don’t read every word. No one does. But with audio, the words come to us as delivered. You can’t skip to the bottom because it’s not “there” yet; audio doesn’t exist in our minds until we hear it and process it. Anyhow, this is getting too far off the path, but it’s important because I’m not hearing much discussion on this to temper the hype around podcasts. Sure, it’s democratizing broadcasting and making it so that newcomers like Eric can get famous and people like Adam Curry, a sort of washed-up icon of the 1980s, can get “airtime”. But it’s also proliferating information that’s hard to consume and which requires time - the most scarce resource of all - to consume. I mean, what are we supposed to do with these? Where is the context?

So there’s a problem, but this problem is one these guys will want to solve. After all, while they do want to control the “experience” and delivery of their unique content, they also want to see it reach the widest possible audience. They’re not famous, or influential, or rich if no one is “listening”, if the ideas are trapped in audio. And there are people like me and Tim in the world, never mind the non-native English speakers and the deaf for whom this data is not accessible.

This first wave of podcasting is important, but it needs to be integrated into the rest of our information architecture - and right now that means text. Fortunately, since these guys are at the forefront of technology development, this problem will get solved. In fact, this might be one of the first applications for the speech applications that IBM recently open sourced. Or maybe these guys will pick up a copy of ViaVoice and get started training it.

:-)
Though while looking into IVR tech for a recent research request, I did find this Gartner report that discusses the state of audio search technology (back in 2002, so it’s surely more advanced now). It sounds like the technology exists; it must just be expensive and difficult to implement. Sign me up.

Updated to reflect proper grammar - Without being coy, I’ve gotten used to sloppy proofreading since my audience has been mostly Japanese. I’ll have to proofread now that us picky native speakers are tuned in.

This entry was posted on Thursday, December 23rd, 2004 at 5:07 pm and is filed under Emergent.




Creative Commons License This work is licensed under a Creative Commons License
