Patterns for Building Cybersecurity Evals 0 ▲ Eugene Yan 1 hour ago · Tech · 0 comments A sandboxed target, inputs that influence task difficulty, tools, and a grader. No comments yet. Log in to reply on the Fediverse. Comments will appear here.
No comments yet. Log in to reply on the Fediverse. Comments will appear here.