
June 1, 2023
Why All AI Art Looks the Same
The aesthetic monoculture problem, and why it's the opposite of what advertising needs.
Open LinkedIn. Scroll for thirty seconds. Count the AI-generated images. Now try to tell them apart.
You can't. They all have the same cinematic lighting. The same shallow depth of field. The same vaguely aspirational human figure gazing into the middle distance. Most of them were made with Midjourney, and every one of them looks like it. They could sell anything, which means they sell nothing.
I've spent the last twenty years making things for screens. TV commercials, film, branded content, advertising campaigns across the Middle East. I've also spent the last year obsessively generating AI images, building workflows, testing every tool I can get my hands on. And the thing that strikes me most about the current moment isn't how good the images are. It's how same they are.
The Midjourney Aesthetic
Let's name it. You know it when you see it. The "Midjourney look" is a specific visual language: hyperreal skin textures, dramatic rim lighting, bokeh backgrounds, color grading that splits the difference between a Fincher film and a perfume ad. It's technically impressive. It photographs well on a phone screen. And it has become the visual equivalent of muzak.
Midjourney V5, released in March 2023, made this worse, not better. V4, for all its roughness, had more stylistic range. You could coax strange, painterly, illustrative results out of it. The images had texture and personality, sometimes by accident. V5's leap to photorealism collapsed that range. The default output became so polished, so "good," that most users stopped pushing against it. Why would you? The path of least resistance produces a beautiful image every time.
V5.1 doubled down. Its "RAW mode" helps somewhat, but the gravitational pull toward that house style is immense. And the problem isn't Midjourney's fault, exactly. The problem is human behavior meeting statistical probability.
Why This Happens
The homogenization of AI art isn't a bug. It's the predictable result of how these systems work.
Training data shapes the default. Every image generator learns from millions of images scraped from the internet. The most common aesthetic in commercial photography, editorial work, and stock imagery is... the aesthetic you see in AI output. High production value. Dramatic lighting. Beautiful people in beautiful settings. The model learns that this is what "good" looks like because, statistically, this is what "good" looks like in the training set.
Default parameters reward the median. When you type a prompt into Midjourney or DALL-E 2 without specifying a style, the model gives you its best guess at the most probable image matching your description. "Most probable" means "most like the average of everything it's seen." That average is competent, polished, and generic.
Users optimize for likes, not distinction. Social media rewards the impressive, not the interesting. An AI image that looks like a movie poster gets engagement. An AI image that looks like a collage by Hannah Höch does not. So users learn to prompt for the thing that performs, and the feedback loop tightens.
Prompt culture is a monoculture. Go look at any prompt-sharing community. The same modifiers circulate endlessly: "cinematic lighting, 8K, ultra-detailed, trending on ArtStation." These aren't creative decisions. They're cargo cult incantations. People copy prompts that produced impressive results, and the results converge further.
This is not a new dynamic. It's the stock photography problem with a faster engine. Getty Images didn't homogenize advertising because its photos were bad. It homogenized advertising because the photos were good enough, cheap enough, and easy enough that lazy creative teams stopped trying.
Adobe Enters the Chat
Adobe Firefly, which launched in beta in March 2023, is fascinating for what it reveals about where this is heading. Adobe trained its model on licensed Adobe Stock images, openly licensed content, and public domain material. The legal angle is smart. The creative angle is telling.
Firefly's output looks like Adobe Stock. Because it was trained on Adobe Stock. The images are clean, professional, and instantly forgettable. They look like what they are: the statistical average of commercially licensed photography. For a certain kind of production work (mockups, comps, placeholder content), this is useful. For advertising that needs to stop someone mid-scroll, it's a dead end.
Adobe is integrating Firefly into Photoshop and the rest of Creative Cloud. This means millions of designers will have a "generate background" button that produces the same backgrounds as every other designer using the same button. The implications for visual culture are significant, and mostly depressing.
The Advertising Problem
Here's where this stops being an aesthetic debate and becomes a business problem.
Advertising exists to create distinction. That's the job. A brand campaign that looks like every other brand campaign has failed at the most fundamental level. It doesn't matter how beautiful the images are if they're interchangeable with your competitor's images.
I work in creative production in Dubai, making content for TV, film, and brands across the Middle East. The excitement about AI image generation is real, and it's not misplaced. These tools can do remarkable things for concept development, for rapid prototyping, for pre-visualization. But when the final output of any creative project is indistinguishable from a Midjourney community showcase, something has gone wrong.
Think about the most memorable ad campaigns of the last decade. Apple's "Shot on iPhone" series. Nike's "Dream Crazy." Old Spice's absurdist humor. Burger King's moldy Whopper. None of them looked like everything else. That was the point. Distinctiveness isn't a nice-to-have in advertising. It's the entire value proposition.
Now look at the AI-generated ad concepts flooding LinkedIn. Strip away the brand logos and tell me which company they're for. You can't. They all exist in the same visual universe: a sleek, hyperreal, vaguely futuristic nowhere. They're impressive in isolation. They're invisible in aggregate.
The Stable Diffusion Counterpoint
If the problem is aesthetic convergence, the most interesting counter-force right now is Stable Diffusion. Not because its default output is better (it's often worse). Because it's open.
Stable Diffusion 1.5 and 2.1, running through Automatic1111's web UI or similar interfaces, let you do something Midjourney doesn't: train your own models, merge model weights, use LoRAs and textual inversions to create custom aesthetics that are actually yours. You can train a model on Art Deco poster design, or Soviet propaganda art, or the specific color palette of a 1970s Italian horror film. The results require more work, more technical knowledge, and more creative direction. They also look like nothing else.
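To make that concrete, here is a minimal sketch using Hugging Face's diffusers library, assuming Stable Diffusion 1.5 as the base model. The LoRA and embedding file names are placeholders for artifacts you would train yourself:

    import torch
    from diffusers import StableDiffusionPipeline

    # Start from a stock Stable Diffusion 1.5 checkpoint.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Layer a custom aesthetic on top of the base model: a LoRA
    # trained on Art Deco poster design, plus a textual inversion
    # embedding for a specific color palette. Both file paths are
    # placeholders for your own trained files.
    pipe.load_lora_weights("./loras", weight_name="art-deco-poster.safetensors")
    pipe.load_textual_inversion("./embeddings/palette.pt", token="<palette>")

    # Prompt against the defaults: the negative prompt explicitly
    # steers away from the house style the base model drifts toward.
    image = pipe(
        "travel poster, desert skyline at dusk, <palette>",
        negative_prompt="photorealistic, cinematic lighting, bokeh, 8k",
        num_inference_steps=30,
        guidance_scale=7.5,
    ).images[0]
    image.save("concept.png")

The code isn't the point. The point is that every line encodes a choice Midjourney makes for you.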
This is the fork in the road that matters. The closed, easy-to-use tools (Midjourney, DALL-E, Firefly) optimize for quality and accessibility. The open tools (Stable Diffusion and its ecosystem) optimize for flexibility and customization. For casual use, the closed tools win. For professional creative work where distinction is the goal, the open tools have an enormous advantage that's barely being exploited.
The irony is that most agencies are gravitating toward the closed tools precisely because they're easier. Which means they're all gravitating toward the same aesthetic. Which means they're spending money on tools that undermine the very thing they're being paid to produce.
"Impressive" Is Not "Distinctive"
I want to draw a line here that I think matters.
When someone shows me an AI-generated image and says "look how good this is," they usually mean it's technically impressive. The rendering is photorealistic. The lighting is plausible. The composition follows established rules. This is the "impressive" axis, and AI image generators have gotten very good at it.
But impressive and distinctive are different things. A session musician who can play any style perfectly is impressive. Thelonious Monk missing notes on purpose is distinctive. A cinematographer who lights every scene like a car commercial is impressive. Roger Deakins choosing to light a battle sequence with a single flare in 1917 is distinctive.
Distinctiveness requires taste, intention, and the willingness to deviate from what's expected. It often means making something that looks "worse" by conventional standards. Less polished. Less immediately appealing. More specific.
I made an Arabic comic book a few years ago. The art wasn't polished. It was deliberately rough, influenced by underground European comics more than mainstream American ones. That roughness was a choice. It communicated something about the story's world. If I'd generated those pages with Midjourney, they would have looked "better" and said less.
The same principle applies at scale. When every AI-generated image in your campaign looks like it was lit by the same cinematographer, graded by the same colorist, and shot in the same conceptual studio, you haven't created a campaign. You've created wallpaper.
Art Direction Matters More Than Ever
Here's the part that should worry agencies and excite art directors.
If the tools produce a converging aesthetic by default, then the value of the person who can push against that default just went up. Art direction, the skill of defining and maintaining a distinct visual identity, is the thing that separates AI-assisted creative work from AI-generated slop.
This has always been the case with any production tool. After Effects didn't make every motion graphics piece look the same; talented motion designers used it to create wildly different work. Its lens flare plugin, on the other hand, did make a lot of work look the same, because people used it without intention. The parallel to AI image generation is almost too obvious.
The practitioners who will thrive are the ones who treat these tools the way a film director treats a camera: as an instrument that executes a vision, not as a vision in itself. That means starting with a clear aesthetic intention. It means knowing enough about art history, design movements, photography styles, and visual culture to describe what you want in terms the model can interpret. It means knowing when the "good" output is wrong for the project.
I've been using Stable Diffusion with custom models to develop visual styles for concept work. The images are often rougher than what Midjourney produces. They're also more interesting, because they don't look like everything else. The extra effort is the point. If it were easy, everyone would do it, and we'd be back to the same convergence problem.
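And if "merge model weights" sounds exotic, it isn't. A rough sketch of the naive version, which is essentially what Automatic1111's built-in checkpoint merger does in its simplest mode (file names here are hypothetical):

    import torch

    # Naive checkpoint merge: a weighted average of two models'
    # weights. File names are hypothetical stand-ins.
    a = torch.load("base_model.ckpt", map_location="cpu")["state_dict"]
    b = torch.load("style_model.ckpt", map_location="cpu")["state_dict"]

    alpha = 0.6  # how much of the base model to keep
    merged = {
        k: alpha * a[k] + (1 - alpha) * b[k]
        for k in a
        if k in b and a[k].shape == b[k].shape and a[k].is_floating_point()
    }
    # Keys that can't be blended carry over from the base model.
    merged.update({k: v for k, v in a.items() if k not in merged})

    torch.save({"state_dict": merged}, "merged_model.ckpt")

Crude as it is, even a merge like this is a creative decision: you're choosing which two aesthetics to collide, and in what proportion.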
The Multiplier Effect
The tool is a multiplier. If you multiply zero by ten, you still get zero.
AI image generators multiply your creative input. If your creative input is "make something that looks cool," you'll get something that looks cool in exactly the same way as everyone else's cool-looking thing. If your creative input is specific, informed, and intentional, you'll get something that reflects that specificity.
This is not a technology problem. It's a taste problem. It's a knowledge problem. It's a "most people using these tools have never heard of Saul Bass or Tadanori Yokoo or April Greiman" problem.
The fix isn't better AI. The fix is better art direction applied to the AI we already have.
What Comes Next
I don't know where this goes. Runway's Gen-2 is starting to do for video what Midjourney did for images, and I expect the same convergence dynamics to play out. The first wave of AI video will be stunning. The second wave will be generic. The people who break through will be the ones who already know what they want before they open the tool.
For now, we're in the "ooh, shiny" phase. Every AI-generated image gets attention just for being AI-generated. That won't last. When the novelty wears off, the only thing that will matter is whether the image communicates something specific to a specific audience in a way that no other image could.
That's always been the job. The tools changed. The job didn't.
Omar Kamel is an AI creative lead with two decades of experience in TV, film, and content production, based in Dubai.