DALL-E. Meta. Firefly. Stable Diffusion. Like it or not, it's clear that the AI image generation market is oversaturated right now. Still, there has always been one standout.
To me, it's obvious that Midjourney was the best AI image generator in the business. However, I do recognize that it still has some flaws, particularly with generating realistic images and handling long prompts and text.
That's why I've been patiently waiting for Midjourney V6, and last night, it finally came. I quickly hopped on Discord and started generating as many images as I could. Let me give you a quick spoiler: it's worth the wait.
Here are some of the best images I created using Midjourney V6, alongside the same prompt applied to Midjourney V5.2:
Midjourney V5 and V6 Output Comparison
It's been a little over 24 hours since Midjourney V6 came out and let me tell you: the hype is real. This has been, by far, my favorite image generator. It somehow fixed every single one of my problems with the previous version. Here are some of my favorite examples:
Portraits
a woman lying in bed with her eyes closed, golden hour, closeup
My biggest gripe with Midjourney was that it couldn't really generate realistic images on par with DALL-E or Meta. The release of V6 seems to have solved that problem. Its realism is on a whole different level now. No more waxy faces and exaggerated features. V6's output is so good that, even if you zoom in, you can see the imperfections that make us human. That's an immense improvement.
Landscapes
landscape, an autumn in the lake during dusk, tranquility
Don't get me wrong: V5.2's image is pretty good, but it's not exactly the look I'm going for. I'm looking for realistic lake images, something that V6 was able to give me. This upgraded version can create authentic-looking images without sacrificing artistic quality. It's way better than DALL-E 3 on this front, in my opinion.
Product Renders
product photography, a perfume, studio lighting, shadow play, jasmine, soft
I'll admit: I'm not too sure about this one. The key difference is that the product images I'm getting from V5.2 look processed and market-ready, while V6's look more raw, like they were taken straight out of a camera. It might have something to do with the phrasing of my prompts, since I've gotten used to cluttered V5 prompts, something I need to work on because of V6's advanced nuance.
I will say this though: if you're a seasoned editor looking for detailed, well-shot raw images, V6 is a lot better than V5.2.
Movie Stills (Animated)
animated movie still, a young girl following a magical cat to a tree,
inspired by hayao miyazaki, whimsical, magical realism, clean lines, detailed, 8k
This is a great time to talk about nuance. In my prompt, I specifically requested a film still that looks like Hayao Miyazaki's work. V5.2's output didn't follow this at all, instead going for a generic 3D DreamWorks style of animation. On the other hand, V6 followed this instruction to a tee. It looks straight out of Howl's Moving Castle.
I also highly suggest you zoom in and look at the details in V6's output. The still is so much more vivid and lively. It's genuinely mind-blowing how much Midjourney has improved over the last couple of months.
Movie Stills (Live Action)
film still, back shot of a man in a green jacket, symmetrical,
muted colors, directed by wes anderson
Midjourney V5 definitely had a problem with oversimplifying or overcomplicating prompts, especially ones with a lot of context. Look at the example above: I kept it minimal but still, V5 wasn't able to be creative with the prompt it was given. V6 solves this problem by filling in the gaps of my prompt while retaining its original idea.
PS. Yes, I know. The guy is missing his right ear but hey, it's V6's first week!
Flat Illustrations
logo for a shoe company, clear background, paul rand
I never really had any issue with generating logos with V5.2 but, after seeing these images side by side, I could really tell in hindsight that there was room for improvement. V6's output retains the minimalism of V5.2 while adding its unique spin to the illustrations, which gives them more identity.
Surrealism
the planets in the galaxy as hatching eggs of lovecraftian entities,
surrealism, cosmic, lovecraftian, ethereal, celestial bodies
I've always praised Midjourney's surrealist images as one of its strong points. However, it tends to overpopulate its outputs with so many subjects that you sometimes can't figure out what's going on, something you can see above.
V6, with its improved nuance, manages to strike a balance between fulfilling the prompt and being creative. Now you can clearly see what it's trying to portray, even with little to no information about the subject.
Text Generation
a restaurant in a quiet stylish neighborhood with a neon sign that says “Closed”,
night, streetlights
One of Midjourney's biggest promises before V6 came out was that it would fix its text generation, which is a huge problem across all AI generators. The only one I've tried that's decent on that end is DALL-E 3, but it looks like Midjourney's next in line.
It perfectly wrote “Closed” in the V6 image, even adding its own flair. As for V5.2, well, unless you've got a restaurant called “CORSTARB,” I don't think it's cut out for text generation.
However, it's still not perfect, as you can see here:
comic panel, panicked captain america yelling “Get out of here”, speech bubbles, gritty
This just shows that Midjourney still doesn't fully recognize letters, since it's still missing a word from my prompt. In my opinion, this works best with one- or two-word texts only. But hey, it's miles better than its competitors. Even DALL-E 3 isn't this good.
High Context
A detailed oil painting of an old sea captain, steering his ship through a storm.
Saltwater is splashing against his weathered face, determination in his eyes.
Twirling malevolent clouds are seen above and stern waves threaten to submerge
the ship while seagulls dive and twirl through the chaotic landscape.
Thunder and lights embark in the distance,
illuminating the scene with an eerie green glow
Just a heads-up, I borrowed this prompt from OpenAI's DALL-E 3 page. It's sometimes hard to think of elements to add to a prompt. This is also a prompt that OpenAI used to test DALL-E's nuance, so I could also test it with V5.2 and V6, and then compare.
V5.2 actually did a pretty good job, but still missed a couple of elements like the eerie green glow, seagulls, and thunder. V6 followed everything except the seagulls, though there's still one solitary seagull in the background, so this one passes the smell test.
So, Did It Improve?
It did improve, by a lot.
I couldn't show you every test I've done yet (I'm reserving some for my next article) but it's already 100 times better than V5.2 in my book. It managed to solve the text generation and nuance issues while simultaneously improving its creativity. Every image I've created so far with V6 is crisp, detailed, and accurate.
What else is there to ask for?
The Bottom Line
When V5 got here out, some stated that it was a backward step from V4.
Gradually, the team listened to the community and improved its creativity, even adding some functionalities in the process. The result was Midjourney V5.2, which was already my favorite AI image generator on the market.
Midjourney V6 is a huge improvement on V5.2. It took everything that was already good with V5.2 and significantly tweaked its model to create more detailed and accurate images. Everything that I've complained about with V5.2 (nuance, text, realism) has been fixed, and then some.
The best thing is that we can only expect it to get better from here on out. The Midjourney team is already crowdsourcing user image reviews through A/B testing to improve its model.
Mark my words: Midjourney V6 is a turning point in AI image generation. Overnight, tons of graphic designer jobs might've just been wiped out. This might've been the very first realization of that.