Qwen's flamingo is artistically far more interesting. It's a one-eyed flamingo with sunglasses and a bow tie who smokes pot. Meanwhile Opus just made a boring, somewhat dorky flamingo. Even the ground and sky are more interesting in Qwen's version.
But in terms of making something physically plausible, Opus certainly got a lot closer.
The fundamental challenge of AI is preventing unprompted creativity. I can spin up a random initialization and call all of its output avant-garde if we want to get creative.
I recently fell down the rabbit hole of AI-generated videos, and realised that many of the "flaws" that make them distinctive, such as objects morphing and doing unusual things, would've been nearly impossible to create, or would have required very advanced CGI.
"artistically interesting" is IMHO both subjective and a 'solved' problem. These models are trained with an "artistically interesting" reward model that tries to guide the model towards higher-quality photos.
I think getting the models to generate realistic, proportional objects is a much harder and more important challenge (remember when the models would generate 6 fingers?).