A popular way to explain how current LLMs work is to say that “all” they do is predict the next most likely word in a sentence. From one perspective, this is correct. Trained on a vast body of human writing, the LLMs distilled billions of word sequences so that they can produce authentic-sounding strings of words that have never been said before. These sentences sound plausible because, having been trained on millions of average human texts, the models are predicting what an average human might say next. They really did succeed at that expected task. What is harder to account for are the emergent creative abilities of the LLMs. The amount of intelligence required to compose one coherent sentence can almost be reduced to the rules in a grade-school grammar book. But the amount of intelligence needed to produce a string of sentences focused on one topic (a paragraph) far exceeds any rules. And the amount of intelligence wrapped up in a string of paragraphs, as in a conversation, begins to…