
A small-model replication sketch inspired by Anthropic's emotion-concept work.

Anthropic's 2026 paper, Emotion Concepts and their Function in a Large Language Model, reports that Claude Sonnet 4.5 contains internal activation directions corresponding to emotion concepts such as calm, afraid, and desperate. The key claim is not that the model feels anything, but that these directions are measurable and can causally influence behavior.

When the Claude Code leak was circulating, I came across a Polymarket tweet about it. The tweet said the leaked code showed Claude Code detecting profanity in user prompts and logging it as a frustration signal. The claim stuck with me because it framed visible user frustration as telemetry.

I wanted to look at the other side of that loop. Users get frustrated when coding agents fail. What happens inside the coding model when it repeatedly sees test failures, broken patches, and a shrinking retry budget? Here is the coding-agent question I wanted to test:…
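To make "measurable direction" concrete, here is a minimal sketch of the kind of probe a small-model replication might start from: a difference-of-means direction estimated from contrastive activations, with a projection as the readout and an additive steering vector as the intervention. Everything here is synthetic and illustrative; the variable names, the toy data, and the difference-of-means method are my assumptions, not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy hidden size; real residual streams are much larger

# Stand-ins for hidden states collected from contrastive prompts
# ("calm" vs. "frustrated"). In a real replication these would come
# from a small open model; here the "frustrated" set has a planted
# offset along a random unit vector.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
calm_acts = rng.normal(size=(32, d))
frustrated_acts = rng.normal(size=(32, d)) + 3.0 * true_dir

# Difference-of-means probe: one common way to estimate a concept
# direction from contrastive examples (an assumption on my part).
direction = frustrated_acts.mean(axis=0) - calm_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def frustration_score(h):
    """Readout: project a hidden state onto the estimated direction."""
    return float(h @ direction)

# "Measurable": frustrated activations score higher than calm ones.
calm_score = frustration_score(calm_acts.mean(axis=0))
frus_score = frustration_score(frustrated_acts.mean(axis=0))

# "Causally influence": steering adds the direction to a hidden state,
# which in a real model would then be fed forward to observe behavior.
steered = calm_acts.mean(axis=0) + 2.0 * direction
```

On synthetic data this trivially works by construction; the interesting question in a real replication is whether the same probe, fit on a small coding model's activations, moves under failing tests and retries.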
