“It’s Hard to Eval” Is a Product Smell

Hamel's Blog

For the past 3 years, AI evals have been my professional focus.[1] The most common objection I hear to evals is “our product is hard to eval”. This objection is a product smell. Artifacts that are hard for you to verify are often hard for users to verify too. In the worst case, users have to redo the work from scratch to verify the output. More importantly, designing your product for ease of verification should come before building evals.

In this post, I’ll walk through three products I advised on that faced this issue. I’ll also show before-and-after sketches to demonstrate design principles. After these examples, I’ll discuss how to apply this general pattern to your product.

Example 1: the AI data agent

Almost every company I’ve worked with builds an internal AI data agent. You ask it a business question, like “what was net revenue for Product A last quarter?”, and it finds relevant data sources, runs the queries, and provides an answer. The goal of this agent is to reduce dependency on data…