It started with a simple question: "Is there a way to stream text to on macOS so it sort of renders as words come in?" Fast forward to today, and I have a full TTS pipeline for Claude Code, with voice cloning, speaker swapping, regex-reading incidents, and an AI that narrates its own victory laps in a cloned voice. Do I need this? No. Do I even use it that often? No. But I also accidentally built something genuinely useful for accessibility. Skip to the TL;DR if you just want the architecture ...