Thursday, April 14, 2011

Text To Speech for e-Learning

Text-to-speech technology has come a long way since I first heard a computer speak.  The following video is a TED talk by Roger Ebert.  Roger lost the ability to speak, and for that matter eat, when he lost his lower jaw to cancer a few years ago.  For part of his talk he uses the Alex voice on his MacBook.  Take a listen.  It's not bad, but still far from perfect.



I've learned that one of the difficulties is that a computer doesn't breathe.  In addition to the natural pauses indicated in sentences by commas, we also pause elsewhere in our speech simply because, as humans, we need to breathe.  I have been experimenting with this in Adobe Captivate's voice narration capabilities.  The North American voices included with the product are from a company called NeoSpeech.  Their voices use a text-to-speech markup language known as VTML.  In addition to the text you want spoken, you can include VTML tags that control things like speed, pitch, pauses, and a few other settings.  As I experiment with this, I find my narration is sounding less and less like a robot and more and more human.  It's not perfect, but it is getting better.
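To give a feel for the "breathing" idea, here is a rough Python sketch that adds pause tags to a script before it is pasted into the narration box. The function names, the pause lengths, and the 12-word limit are my own guesses. I'm also assuming the pause tag is written `<vtml_pause time="..."/>` with the time in milliseconds, so check the NeoSpeech documentation before relying on it.

```python
# Assumption: VTML pauses look like <vtml_pause time="500"/> (milliseconds).
def pause_tag(ms):
    return f'<vtml_pause time="{ms}"/>'


def add_breath_pauses(text, comma_ms=250, breath_ms=400, max_words=12):
    """Add a short pause after each comma, and a 'breath' pause after
    every max_words words in a row that contain no punctuation."""
    words = text.split()
    out = []
    run = 0  # words spoken since the last pause or sentence end
    for i, word in enumerate(words):
        out.append(word)
        if i == len(words) - 1:
            break  # no pause after the final word
        run += 1
        if word.endswith(","):
            out.append(pause_tag(comma_ms))
            run = 0
        elif word.endswith((".", "!", "?")):
            run = 0  # the voice already pauses at sentence ends
        elif run >= max_words:
            out.append(pause_tag(breath_ms))
            run = 0
    return " ".join(out)


if __name__ == "__main__":
    script = ("In addition to the text you want spoken you can include tags "
              "which will indicate items like speed, pitch, and pauses.")
    print(add_breath_pauses(script))
```

The numbers are only starting points. In my experience the right pause length depends on the voice and the speed setting, so expect to tune them by ear.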