Exams always seem to present far too much opportunity for distraction… Well, now that they’re past (and, i hope, passed), let me report on some text-to-speech synthesis software with which i’ve been playing.
My initial motivation to play with text-to-speech stuff was simply that, by the end of a long day in front of the PC, and several more hours with the books, i often can’t keep my eyes open any longer, but my brain still wants some food… Add to that the vast, largely unexplored corpus of free, public-domain “e-books” at Project Gutenberg, and i thought it a good idea to experiment again with some text-to-speech toys.
At the heart of my text-to-speech setup is Festival, a framework being developed by the Centre for Speech Technology Research at the University of Edinburgh for general-purpose, multilingual speech synthesis. After reading the excellent user documentation on it, i’m looking forward to spending some more time tweaking intonation and phrase breaks to make the synthesised speech more natural.
My desktop environment, KDE, includes kttsmgr, the KDE Text-to-Speech Manager, which provides a nice interface to Festival. kttsmgr offers a simple means of synthesising the contents of the clipboard, a file, etc. It also enables intuitive replacement of words (e.g., abbreviations) which Festival does not otherwise pronounce correctly.
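The word-replacement idea can be illustrated outside kttsmgr with a few lines of Python. This is only a rough sketch of the general technique, not how kttsmgr itself implements it; the `REPLACEMENTS` table and the `expand_abbreviations` name are made up for the example.

```python
import re

# Hypothetical replacement table: abbreviation -> spoken form.
# The entries are illustrative, not taken from kttsmgr.
REPLACEMENTS = {
    "e.g.": "for example",
    "etc.": "et cetera",
    "Dr.": "Doctor",
}


def expand_abbreviations(text, table=REPLACEMENTS):
    """Replace each abbreviation in `table` with its spoken form."""
    # Try longer keys first so a long abbreviation is never shadowed
    # by a shorter one that shares its prefix.
    keys = sorted(table, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in keys)
    # The lookarounds stop matches that start or end inside a longer word.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda m: table[m.group(1)], text)


if __name__ == "__main__":
    print(expand_abbreviations("Dr. Franklin wrote letters, essays, etc."))
```

The replacement swallows the abbreviation's own full stop, so a sentence ending in "etc." loses its final period; for speech output that is usually harmless.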
After retrieving them from Project Gutenberg, i run e-books through GutenMark, a tool to create more readable HTML or LaTeX documents from the Project Gutenberg markup. GutenMark does a fair job of marking up headings, direct speech, etc., though i have to make one or two quick global replacements of HTML entities on which kttsmgr/Festival chokes.
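Those global replacements could also be scripted. The sketch below is one possible approach in Python, assuming the troublesome entities are typographic ones such as curly quotes and dashes (the post does not say which ones they are); the `ASCII_FALLBACKS` table and `clean_entities` function are hypothetical names.

```python
import html

# Assumed fallbacks for typographic characters a speech engine might
# not handle; which characters actually cause trouble is a guess.
ASCII_FALLBACKS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2014": " - ",  # long dash
    "\u2026": "...",  # ellipsis
    "\u00a0": " ",    # non-breaking space
}


def clean_entities(markup):
    """Decode HTML entities, then map typographic characters to ASCII."""
    # html.unescape turns named and numeric entities into characters.
    text = html.unescape(markup)
    # str.maketrans accepts a dict of single characters to strings.
    return text.translate(str.maketrans(ASCII_FALLBACKS))


if __name__ == "__main__":
    print(clean_entities("&ldquo;Well&mdash;yes&hellip;&rdquo;"))
```

Note that this only rewrites entities and leaves the HTML tags themselves untouched.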
i then fire up Konqueror, the lightweight browser included in KDE. Konqueror interfaces directly with kttsmgr, allowing one to view a page or select a block of text and hit “Speak Text”.
So far, i’ve listened to some excellent reads:
- The autobiography of Benjamin Franklin
- Edison, his life and inventions
- Dickens’ A child’s history of England
- GK Chesterton’s Orthodoxy
The synthesised voice is a bit mechanical, but i found that, after fifteen minutes or so of listening, i was able to follow without much effort. Now i can “read” while messing around the flat, or put on my headphones and enjoy a good bedtime story.