The venerable eSpeak is a mainstay of Linux distributions. It is a clever Text-To-Speech (TTS) program which will read aloud the written word using a phenomenally wide variety of languages and accents. The only problem is that it sounds robotic. It has the same vocal fidelity as a 1980s Speak 'n' Spell toy. Monotonous, clipped, and painful to listen to. For some people, this is a feature, not a bug. I have blind friends who are so used to eSpeak that they can crank it up to hundreds of words per minute and navigate through complex documents with ease. For the rest of us, it is a steep and unpleasant learning curve. There are lots of modern TTS programs using all sorts of advanced AI. Many of them are paywalled or require you to post your text to a webserver - with all the privacy and latency problems that causes. Some are restricted to high-powered GPUs or other expensive equipment. Piper is different. It is local first, runs quickly on modest hardware, and is open source. The easiest…
No comments yet. Log in to reply on the Fediverse. Comments will appear here.