
I find myself unreasonably excited about the work that Taalas is doing. They’re turning open-weight models directly into ASICs (application-specific integrated circuits), so that the chip essentially acts as the model - and only as the model. You get raw transistor switching speed within the model. Not only does this result in incredibly fast computation (~15k tokens a second for Llama 8B!), but it does so at ridiculously low power consumption, and therefore high efficiency.
