Only six weeks after Opus 4.7, we have Opus 4.8.

For everyone, that means another incremental upgrade to Claude. It is once again smarter, can do tasks for longer, and comes with a number of hot new features.

For me, that also means reading another 244-page system card. It was only April 20 when I did a full review of the Opus 4.7 system card, plus an additional post focusing on related issues of model welfare.

These updates are incremental and coming more rapidly, and this is still below the capability level of Claude Mythos, so the focus will be on the delta. What is different about Opus 4.8 versus what we already know about Opus 4.7 and Mythos?

It turns out there’s still a lot to talk about.

Image created as self-portrait for this post by Claude Opus 4.8

Table of Contents

Here We Go Again: Executive Summary.
Introduction (1).
RSP Evaluations (2).
Move That Goalpost.
The Failures Are News.
Alignment Risk Slowly Rises.
New Risk Pathways Just Dropped.
Cyber (3).
Harmful Requests…