1 hour ago · Tech · 0 comments

You can't out-prompt an attacker — to the model, your system instructions and a malicious support ticket are the same text. So stop defending the prompt and lock down the boundaries you actually control: tools scoped to the authenticated user server-side, middleware that screens and logs, output handled as untrusted input, a human in front of anything irreversible, and a fake-free test that fails CI the moment someone drops the auth scope. Read more

No comments yet. Log in to reply on the Fediverse. Comments will appear here.