Right after switching pages, it works with CSR. Reload your browser to see it in action.
I found that when editing photos of myself, the result looked weird, like a funky version of me. The cat arguably comes out "more attractive", but for humans (and, I'd imagine, for a cat with a keen eye for cat faces looking at the edited cat), the features often stop working together when they're changed even slightly.
Here's the best of a few attempts with a very similar prompt, made more detailed since Flash is a much smaller model: "Give the cat a detective hat and a monocle over his right eye, properly integrate them into the photo." You can see that the rest of the image is practically untouched to the naked human eye: https://ibb.co/zVgDbqV3
Honestly, Google has been really good at catching up in the LLM race, and their recent models like 2.0 Flash and 2.5 Pro are among the best (or the best) in their respective areas. I hope they'll scale up their image generation feature and base it on 2.5 Pro (or maybe 3 Pro by the time they get to it) for higher quality and better prompt adherence.
If you want to give 2.0 Flash image generation a try, it's free (with generous limits) at https://aistudio.google.com/prompts/new_chat; just select it in the model selector on the right.
Hm, no, I’ve never had this thought.
Humans do not learn these things by pure observation: newborns understand object permanence, and I suspect the same is true of all vertebrates. I doubt transformers are capable of learning it as robustly, even if trained on all of YouTube. There will always be "out of distribution" physical nonsense involving mistakes humans (or lizards) would never make, even with objects they've never seen before.