4 hours ago · Tech · 0 comments

As an AI trainer, I have a fun game for testing AI image models and seeing how far we still are from an “intuitive human” model. This is not a test of image generation, but of culturally grounded human reasoning. Ask the model to generate 12 eggs. It will likely give you a carton with 12 eggs. Ask for 13 eggs, and it may give you a carton with 14 eggs. Ask again for 13 eggs, and it may give you a carton with only 12 eggs. What is happening here?

From my perspective as an AI trainer, users should not have to adapt their prompts so that the model can understand the request or generate the right image. Instead, the model should evolve to understand implicit human requests. If a user wants an image of 13 eggs, does that really sound like a glitchy prompt? Why couldn’t the model generate 13 eggs in another context instead of assuming they should always be in a carton? The interesting part is that the model could not follow the prompt, which in this case is a major failure. We know that egg cartons are usually…
