1 day ago · Tech · 0 comments

On other platforms: Web, Apple Podcasts, YouTube.

I had the chance to chat with Sara Marjanovic, a PhD student at the University of Copenhagen, about the thinking process of LLMs. DeepSeek R1 was the first open model with a visible thinking trace, and this opened the door to new ways of evaluating and researching LLMs. It made it possible to benchmark thinking against non-thinking models, compare different reasoning processes, inspect traces to see what the reasoning process looks like, find potential flaws or research directions for improving its effectiveness, and see how the reasoning influences the model's behaviour.

What's interesting about looking at the thinking process?

A few things stood out to me from the conversation with Sara:

Overthinking. Usually the agent defines the problem, finds an answer, and then verifies its thinking again before wrapping up the reasoning process and providing the answer to the user. Sometimes the agent enters a loop and keeps repeating the same sentence, without…
