Speech Recognition Is Only Part of the Future

A week or so ago, Fred Wilson wrote a post called Dictated a Blog Post, which he dictated on his Nexus One phone.  He then discovered Swype, which now has an unofficial Android app.  As usual, the comment threads on AVC were very active, with lots of thoughts about the future (and past) of voice and keyboard input.

When I talk about Human Computer Interaction, I regularly say that “in 20 years from now, we will look back on the mouse and keyboard as input devices the same way we currently look back on punch cards.”

While I don’t have a problem with mice and keyboards, I think we are locked into a totally sucky paradigm.  The whole idea of having a software QWERTY keyboard on an iPhone amuses me to no end.  Yeah – I’ve taught myself to type pretty quickly on it but when I think of the information I’m trying to get into the phone, typing seems so totally outmoded.

Last year at CES “gestural input” was all the rage in the major CE booths (Sony, Samsung, LG, Panasonic, …).  In CES speak, this was primarily things like “changing the channel on a TV using a gesture”.  This year the silly basic gesture crap was gone, replaced with IP everywhere (very important in my mind) and 3D (very cute, but not important).  Elsewhere there was plenty of 2D multitouch, most notably front and center in the Microsoft and Intel booths.  I didn’t see much speech and I saw very little 3D UI stuff – one exception was the Sony booth, where our portfolio company Organic Motion had a last-minute installation, requested by Sony, that showed off markerless 3D motion capture.

So – while speech and 2D multitouch are going to be an important part of all of this, they are a tiny part.  If you want to envision what things could be like a decade from now, read Daniel Suarez’s incredible books Daemon and Freedom (TM).  Or, watch the following video that I just recorded from my glasses and uploaded to my computer (warning – cute dog alert).

  • Speech recognition will never totally replace the need for hand-operated controls, any more than software totally replaced paper or robots totally replaced human manual labor. People will always need hand-operated controls to take notes in meetings and enter confidential data while in public. It is also just faster and easier to hit an email button than to tell your computer to check your email. The need for mobility hampers the adoption of voice recognition. Voice recognition drains batteries quickly, which destroys mobility. That's why voice recognition is done in the cloud, but being connected to the cloud also drains batteries. The thing that is going to replace the keyboard can be found at singlehandtextentry.com and abolishthetinykeyboard.com

  • Take a lateral step here. There's a lot of video content on the internet that has a very hard time being "searchable." I'm waiting for an A / V spider to crawl through youtube, converting speech to text, and indexing it on search. I don't think it exists yet, but it will.

  • Cool video, Brad. I'm waiting for visual overlay in my heads up display. It would make annotation of the real world and interaction with the real world data rich.

  • Good lateral step.  Yes – it will exist.  Soon enough!

  • Yes – imagine all the applications especially with realtime geo!  This is at the core of Daemon and Freedom (TM) – Suarez does a great job incorporating it into the story.

  • I agree with you about keyboards – but I think it'll be sooner than 20 years before something better is available.

    The video bit is interesting as well. How long until I live a 24 hour augmented experience with both sight and sound running through my mobile device? With face recognition software catching up, we don't seem far from being able to look at someone through some glasses like yours and get a heads-up display of relevant information (profile data from Facebook, recent tweets, etc.) — all Terminator-style.

    Regardless of how data gets in (voice recognition or otherwise) getting people used to a heads-up display will allow all kinds of advanced (and way more efficient) ways to use software. Looking forward to it ….

  • Totally agree that it’ll be much less than 20 years.  My point was that in 20 years we will look back on mice and keyboards and think they are “quaint.”

  • Kyle

    "When I talk about Human Computer Interaction, I regularly say that “in 20 years from now, we will look back on the mouse and keyboard as input devices the same way we currently look back on punch cards.”"

    Doubtful. I can see the setup keeping pace with what tech allows, but the basic input paradigm will remain in place due to our physiology; the only changes will be evolutionary. Our nervous systems are set up so that 'brain to hand' is a high-bandwidth connection for output. HCI via hands in some form, whether it's a virtual keyboard (doubtful, no tactile feedback), gestures (Wii-ish), or whatnot, will always be our primary interface with most computers (unless we perfect direct CNS-to-machine interfaces far in the future). Speech is too messy for most HCI, and it is the only other system close to the output bandwidth of our hands.

  • I don’t think you are thinking creatively enough.  Our hands will be an important part of any input paradigm, but I’m not sure we’ll be typing on virtual QWERTY keyboards 20 years from now.

  • Hey Brad – Thought provoking stuff! I think the technology might far-outpace the cultural implications of alternate input methods.

    I think there are deep cultural obstacles to overcome before things like high-tech glasses are acceptable in public. I think of the Urban Dictionary entry for Bluetool: anybody who wears their Bluetooth in public.

    Oh, and nice boxers! Love your stuff. Still go back regularly and re-read "Play the Point, Not the Score"

  • I encourage you to read Suarez’s Freedom (TM) – it’s got great stuff on the societal impact of this stuff.

  • Kyle

    QWERTY keyboards existed long before they were used as HCI devices for a reason. I would agree the keyboard's role could become less important in 20 years, but I would bet they will still be the primary input device for written communication, and would bet some future derivative will still be used enough to make the statement "we will look back on the mouse and keyboard as input devices the same way we currently look back on punch cards" false.

  • I think what you're looking at in twenty years is a re-imagining of how we interact with technology in the first place. We're currently constraining ourselves to figuring out how to better interact with today's technology rather than stepping back and asking how we'd really like to interact with our experiences and data. I'll coin the phrase here, "naturalized interfaces", in which technology simply augments the more natural way of looking at data. Companies like Oblong and the new interfaces that you're seeing this year and last at CES are still exploring the potential of what those different technologies can deliver, but in twenty years we should be moving past that. Your ultimate interface is a *blend* of all these different technologies in a way that feels incredibly seamless and natural.

    So, maybe instead of doing spreadsheet analysis on a screen, I have a virtual display in front of me (it doesn't matter if it's projected or augmented reality onto my eyeballs – the question is whether it's for a shared or personal experience) and I can physically raise and lower inputs into my analysis either through a tangible object or haptic feedback – the point is that I *feel* the physical manipulation of the input. If I need more granular control, maybe I use the pinch or inverse-pinch popular on early multitouch devices like the iPhone. If I'm reviewing a couple of companies, I toss another business card onto my table which begins to pull the financial data from the cloud to add into my scenario… you get the idea.

    I want an interface world where I'm thinking about my interactions in the most natural ways, not conforming to another technology out there. I want my interaction verbs to be the ones I learned in kindergarten — I want more of this product by adding to a pile of 'stuff' not product++ or sum(a13:a47).

    My virtual world and physical world shouldn't be separate worlds to my perception.

  • DaveJ

    The video reminds me of "Strange Days." Was hoping to see Juliette Lewis.

    20 years is plenty of time for direct brain/machine interfaces. There are already retinal prosthetics that provide sensory input for the blind, and see the work of Nicolelis (http://en.wikipedia.org/wiki/Miguel_Nicolelis) and others doing BMI for control. I'll be a little surprised if I have to move at all to control a computer in 20 years. All the clever stuff people are doing right now is just interim.

  • I've been playing with Dragon Dictation for the iPhone the past several days. While the reco quality is higher than I was expecting (things have gotten a lot better since my last use over a decade ago), the impact it had on the content (words) I created was surprising. Typing this blog comment gives me parallel processing abilities and context that I don't have with speech (serial). The content I create when speaking is interestingly different from the stuff that comes out of my fingers on a keyboard. Keyboard is more concise and structured… voice is more verbose and loose. I've been trained over the decades, via email, to produce one way on the keyboard, and another w/ my voice.

    My latest gripe about modern HCI is display (you touch on one dimension of it in this post). Right now, my wife and I are sitting in front of a large TV screen (passively watching a show), each w/ laptops in our laps, and iPhones at our sides. If something pops up on my iPhone that I want her to see, I have to hand her the phone (despite the fact she has a better screen sitting right on her lap). If I want to talk to her about our schedules tomorrow, we have to share a laptop screen, as opposed to just looking at the big screen 10' from us and having a communal experience around shared content. Display is horribly broken (as is input (keyboard/mouse)). Need a fix. Unfortunately, while we can all dream up solutions, some replete with technical architecture within today's state of the art, I fear the deployment of said solutions is going to be akin to auto-industry investment. So much capital is required to change the physical interaction world around us (as opposed to software alone) that relatively few swings at bat can be made.

    Longing for highly iterative hardware I suppose.

  • I love the old story that QWERTY keyboards were created to “slow down” typists so they wouldn’t jam the typewriter.  I’m not suggesting there won’t be some sort of “keyboard like” device – it’s just not going to be a “keyboard”.  No way 20 years from now “the keyboard” is still the best way to “input” information.

  • Perfectly said.

  • If implants are involved, I’m ready to sign up.

  • Great example.  We just went through this at Ryan’s house.  His TV downstairs wasn’t connected to the Comcast box.  We snaked cables through the wall.  It then had to reload all the data.  He had to run upstairs to figure out the channel.  We then watched 24 with him and his Mac in his lap and me and my iPhone.  We’re watching, checking things online on our various devices, and just generally disconnected electronically while watching T+5 years (theoretically) on the screen, which is better but not that much (same basic paradigm).  Bruce’s comment above is right on the money.

    Now – I don’t think this will take “auto industry investment” – I think it’s going to be radical discontinuous innovation that happens in the next decade.  Buckle up.

  • John Dean

    Actually Everyzing (powered by BB&N) has this technology available. MIT's open courseware (Classroom?) has been doing this with lectures from their professors (using their own Sphynx VR software).

  • John Dean

    Another MIT project is doing something like this- SixthSense at the MIT Media lab (http://www.pranavmistry.com/projects/sixthsense/index.ht...

  • Yes – SixthSense is a neat project – still very hacky from the outside looking in but definitely a sense of where things can go.

  • Mike Greczyn

    I used to crew AWACS planes for USAF (as an aside, if you want to see a crappy UI, go fool with pretty much anything operated by DOD). There was an old story floating around that the first AWACS crew stations had keyboards with the keys arranged in alphabetical order so that folks who were whizzes on QWERTY boards wouldn't jam up the computer memory. I always thought that sounded oddly familiar, and then someone reminded me of the origin of the QWERTY arrangement.

  • Good point about the difference between what content you say and what you type. I've had this experience when reviewing transcripts of con-calls I've been on, and even when I'm trying to be concise and summarising (rather than creative / discursive) it's not like anything I'd type.

    Watching 'Mad Men' recently reminded me that dictation isn't new – it just used to be processed by a human before it was committed to 'paper'. Perhaps the dictation s/w of the future will be able to learn from how I correct what I dictate using the (virtual) keyboard.

  • Awesome – yet another great anecdote about slowing down humans to make it easier for the computers.

  • Dictation as a metaphor is a good one.  I never think of it because I grew up after dictation.  But I fondly remember my father sitting at home at night at his desk with a stack of medical charts in front of him dictating his notes on each one.  Someone in his office then transcribed all the dictation the next day.

  • The problem with any sort of gestural interface for the mainstream consumer: people are lazy. The keyboard and mouse may be outdated, but you can do a heck of a lot with them, with very little energy or movement. (By gestural, I'm excluding touch-based interfaces.)

    As cool as gestural interfaces are, I have yet to see a good application concept for everyday use of gestural interfaces (if you count the Wii, the Wii doesn't improve human-computer interaction, it provides a new gaming experience.) I would love to see them if they exist or have been talked about.

    There are many situations where they certainly would be valuable: any time you need to pan through a 3D space or play with a 3D object. But why would the general consumer need that, other than for exploring Google Earth? If you deal with large data, or are an artist, there certainly are applications.

    Perhaps that will be the mainstream use, playing with the data sets that now surround us? At least once we "grow into" the interface.

    I do think Oblong's stuff is awesome, as I'm technically working on something that could greatly benefit from having a gestural interface. However, I haven't gotten a good answer to the question: "What will I use it for?"

    It's hard to predict where the applications will go. But that's what will ultimately drive the adoption. And I haven't heard enough talk about what we truly and deeply need them for.

  • I agree. These are just interim. Computer-Brain interfaces will be the true paradigm. This will change everything: the human condition won't be the same.

    Once I have a little cash and credibility under my belt, I'll spend all my days working on this. Something will surely have gone awry if we can't achieve this in the next 60 years.

  • I agree that the applications will drive mainstream usage.  To date Oblong has been focused on specialized heavy data apps, but there are some very interesting broader apps coming from them (look for a few announcements soon).  In addition, as they move from a “glove” world to a “gloveless” world (all software on their end) really interesting things start to appear.  For example, imagine what you could do in your car if you could use your hands to interact with the computer that now resides (stupidly) in the very middle of your console.

  • Mike Greczyn

    Imagine an early 80s vintage computer whose main job is to process a bazillion radar returns every 10 seconds while also dealing with 20 crew members frantically pounding in keyboard commands (you can't accurately describe the way a person uses a keyboard in a high-stress environment as "typing"), and the story is plausible.

  • Chris Emery

    I believe that all these methods suffer from tunnel vision. They all involve the machine as something separate and non-evolving. I believe that we will create a network of machines that are in constant communication with all our forms of expression and which will always be aware of what we are aware of and alter their operations based upon all our past history.
