Bubbles

The iPhone 17 runs a 3 billion parameter language model on-device at 30 tokens per second. Obviously, the average consumer has no idea what that sentence means, and Apple hasn’t figured out how to make them care. I believe that’s about to change. Apple now has complete access to Google’s Gemini model in its own data centers, with the ability to distill it into smaller models built for iPhones and iPads. Knowledge distillation works like this: you take a large model, have it perform tasks with...
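The core of knowledge distillation is training the small model to match the large model's output distribution rather than hard labels. As a minimal sketch (not Apple's or Google's actual pipeline; the temperature value and function names here are illustrative), the standard distillation loss compares the student's and teacher's softened probability distributions:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Tempered softmax: higher temperature flattens the distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's tempered distribution against the
    teacher's tempered distribution (the teacher's 'soft targets')."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -np.sum(p_teacher * np.log(p_student + 1e-12))

# A student that mimics the teacher's logits incurs a lower loss
# than one that prefers a different answer.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
bad_student = [0.5, 4.0, 1.0]
print(distillation_loss(good_student, teacher) <
      distillation_loss(bad_student, teacher))
```

In practice this loss is summed over every token the teacher generates, so the student absorbs the teacher's behavior across many tasks at a fraction of the parameter count.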
