ellementK: (ĕll'ǝ-mǝnt-kā)
noun - A fundamental, essential, or irreducible constituent of a composite entity. Middle English, from Old French, from Latin
About Eleanor Kruszewski: I'm known variously as Eleanor or Elle. My last name is like that coach from Duke - kru-shef-ski. Based in Menlo Park, CA, I work for Yahoo! in their Developer Network. The easiest description of what I do is the MBA shin kicker, handling community, marketing, commercial programs and sundry backend stuff. Disclaimer: I've done big corps, midcorps, and startups, so I overstate and oversimplify as much as anyone else. These opinions are my own, not my employer's.
How can we tap into all this audio content?
I hate podcasts and other captured audio on the web - it just doesn’t work for me. To really listen to someone speak, I find I need to be able to see them - watch their lips move and their overall delivery. The only way I can deal with it is to transcribe it, which I did here for the recent Udell podcast on IVR applications and here back in August for the Meta Group talking about Workplace. What a pain. Sign me up for that magic speech transcription technology you were looking for when you find it. Transcription is very tedious, but personally, it’s the only way I can stay focused on the content.
More seriously, the culture of podcasting may very well drive innovation in this space - nothing like having the innovators’ commentary marooned to get them focused on a problem!
When I was at BloggerCon III (some talks of which are available via the IT Conversations audio stream), the session on podcasting addressed this topic. What was strange was the stance a couple of the speakers took. Dave Winer (conference organizer and very influential among bloggers and podcasters alike) declared that he intended never to provide transcripts for his podcasts at iPodder.org. Steve Gillmor, ZDNet editor and columnist and participant in the eponymous Gillmor Gang (to learn more about it, and see an example of the “navel gazing” that I find to be characteristic of this kind of group, see their blog), said that they did one transcript of the show, shoving the task out to India, but they were not happy with the results. The message from these two guys was that podcasting would stay audio only, and people would just have to sit through it.
Now, I’ve said that audio doesn’t work for me personally, and that’s my bias. I’m not the only one though - this has been discussed by Marc Canter and Tim Bray. You can see Udell’s response here. Tim’s phrase “four guys talking” captures my problem exactly.
I don’t intend to be hypercritical here, but it’s important we look at what this mode of interaction means - what it allows and requires both for the creator and the listener. The creator - as we saw in Jon’s case with the IVR conversation - benefits by just recording the conversation, doing the necessary processing (which is work and requires special equipment - but it’s also tech tinkering, which is fun more than tedious), and serving it. They can share the content directly, without needing to mentally pre-process it. Listeners benefit too, as Jon says himself, by having direct access to the full context of a conversation, rather than having it distilled through the views and the prejudices of the interviewer.
It’s true that text is lossy, but in podcasting we often just think about the benefits. The costs for the users are fairly high. Skimming is impossible. Searching is impossible. Pacing is out of control - if it’s too fast, you must go back (which is very cumbersome given the poor interface of the web plugins I use here, but might be easier on, say, an iPod); if it’s too slow, you’re stuck.
Take Eric Rice - podcaster extraordinaire - for example. Now I like Eric personally, but he is a showman. He loves podcasting because it puts him in control of the pacing and the delivery. Listening to his podcasts, you can tell he is a radio personality, and it is his personality that he’s sharing in these ‘casts. So Eric shows us that the line between content and entertainment blurs with podcasts. And that’s great for all the people who tune in to talk radio.
But wouldn’t it be better if this medium were indexable, searchable, and fungible - more like text? Reading this page, I bet that you don’t read every word. No one does. But with audio, the words come to us as delivered. You can’t skip to the bottom because it’s not “there” yet; audio doesn’t exist in our minds until we hear it and process it.
Anyhow, this is getting too far off the path, but it’s important because I’m not hearing much discussion on this to temper the hype around podcasts. Sure, it’s democratizing broadcasting and making it so that newcomers like Eric can get famous and people like Adam Curry, a sort of washed-up icon of the 1980s, can get “airtime”. But it’s also proliferating information that’s hard to consume and which requires time - the most scarce resource of all - to consume. I mean, what are we supposed to do with these? Where is the context?
So there’s a problem, but this problem is one these guys will want to solve. After all, while they do want to control the “experience” and delivery of their unique content, they also want to see it reach the widest possible audience. They’re not famous, or influential, or rich if no one is “listening”, if the ideas are trapped in audio. And there are people like me and Tim in the world, never mind the non-native English speakers and the deaf for whom this data is not accessible.
This first wave of podcasting is important, but it needs to be integrated into the rest of our information architecture - and right now that means text. Fortunately, since these guys are at the forefront of technology development, this problem will get solved. In fact, this might be one of the first applications for the speech applications that IBM recently open sourced. Or maybe these guys will pick up a copy of ViaVoice and get started training it.
:-) Updated to reflect proper grammar - Without being coy, I’ve gotten used to sloppy proofreading since my audience has been mostly Japanese. I’ll have to proofread now that us picky native speakers are tuned in.