Just a few days ago, we got an early Christmas present from the Midjourney team with the sudden release of V6’s base model, promising better prompt comprehension and text generation than its previous model. A week before that, Meta also dropped a new AI image generator, which I believe is the best free model right now.
So, it’s that time of the year again.
No, I’m not talking about the holiday season. It’s time for a major comparison between the market’s most popular AI image generators: Midjourney, DALL-E, Firefly, Stable Diffusion, and Meta.
Which one will come out on top this time? Spoiler Alert: The answer may not surprise you.
The Ultimate Output Comparison
This is the biggest comparison we’ve ever made, so I’ll use the same prompt for each image to maintain fairness. I’ll also prominently display the ones I like the most, but don’t worry: I’ll label each picture to avoid confusion.
Realistic (Portraits)
close-up portrait of a weathered fisherman, wrinkles around his eyes,
salt-spray on his beard, hyperrealistic textures, cinematic lighting
Among the five image generators, only Midjourney and Meta managed to create images that could pass the smell test. Firefly’s portrait is too waxy and the fisherman’s beard looks fake. Stable Diffusion’s doesn’t look realistic at all; it looks more like an oil painting. DALL-E 3’s could’ve been good, but it overemphasizes the wrinkles.
Look at the details in Midjourney’s image. When you zoom in, you can see every strand of hair, the age lines, even the reflection in his eyes. It also has consistent lighting and depth of field. Meta is a close second, but it still has that “softened” effect which is a trademark of AI image generators at this point.
Realistic (Landscape)
a rugged coastline eroded by relentless waves,
towering cliffs that's been sculpted into dramatic arches and hidden coves,
seabirds soar above, mist swirls along the horizon, realism
Once again, Midjourney wins this round. V6 really has been a gamechanger when it comes to realistic images. The images it outputs are still a little stylized and vivid, but they can now pass as real photos. However, if you’re just looking for a landscape stock photo, then Firefly might be the better option for you.
As for the other three: Stable Diffusion and Meta were actually pretty decent, but the cliffs look like a lump of smooth clay when zoomed in. DALL-E 3 opted to make digital art, which isn’t what I’m looking for.
3D Product Renders
commercial photography, a perfume bottle,
pastel blue background, dreamy, soft lighting, centered, flowers
I’m actually impressed because all of these turned out to be good. However, Midjourney V6 is still in a league of its own with another stunning entry. It’s dreamy, well-shot, and has great contrast. Meta is, once again, a close second. The only letdown is the bad text generation.
Digital Art
pixel art scene, a quiet and empty grocery store at night,
atmospheric, 16-bit
It’s a matter of personal preference, but I vastly preferred Midjourney’s and DALL-E’s versions of this prompt because they perfectly emulated the “atmospheric” vibe I was looking for. This is also the first time that Midjourney comes in second for me, mostly because the “pixel art” illusion goes away when you zoom in.
Stable Diffusion actually had a great entry, but the food on the shelves isn’t properly rendered upon closer look. Firefly didn’t crack the top two because it generated food market stalls inside a grocery, which shows that it lacks nuance. Meta is, by far, the worst at pixel art, failing in both contextual understanding and pixel art imitation.
Logo
a logo for a barbershop, by paul rand, clear background, minimalist
This is a win for Midjourney, and it isn’t even close. Everyone else went for a generic logo, but Midjourney did something new by taking a barber’s pole and turning the colors into something that resembles brush strokes. It’s so simple yet so effective and unique. Aside from completely fulfilling a long prompt, this is probably the best case for Midjourney’s improved nuance.
DALL-E 3 also deserves a mention here because it managed to create a well-designed logo, albeit a common one. The biggest problem I have with it, though, is that it created two different logos when I asked for only one.
Text Generation
a comic book panel of a distraught Tony Stark saying “Captain is dead.”
It should come as no surprise that DALL-E 3 is in our Top 2 this round, but for the first time since I started comparing AI image generators, I don’t find it the best for text generation. But let’s start with Stable Diffusion, Meta, and Firefly first, none of which even attempted to create legible text. Oh, and I don’t think Firefly knows who Tony Stark is.
When Midjourney V6 came out, they put an emphasis on their text generation improvements and it really shows. Look at the accuracy of that text. That’s not even edited. I’ve said it earlier in my V5 vs. V6 comparison, but Midjourney really is the best at text now.
Now, let’s go to DALL-E 3. It may not be as good as V6 but it’s almost there. Almost. It certainly didn’t help that Tony Stark is shouting “Captan’s dead” while Captain America is behind him.
High Context
A middle-aged woman of Asian descent, her dark hair streaked with silver, appears fractured and splintered, intricately embedded within a sea of broken porcelain. The porcelain glistens with splatter paint patterns in a harmonious blend of glossy and matte blues, greens, oranges, and reds, capturing her dance in a surreal juxtaposition of movement and stillness. Her skin tone, a light hue like the porcelain, adds an almost mystical quality to her form.
This one’s actually impressive. If we’re only talking about comprehension, then all of these images passed this test. So, we have to factor in which one fulfilled it the best.
I took this prompt from DALL-E 3’s announcement page so there’s no question that their output is the best. From there, it’s tough to rank the others 1 to 4.
Stable Diffusion and Midjourney had the best looking outputs, but the tearing doesn’t look like “broken porcelain” to me, more like crumbling wallpaper. Firefly was almost perfect, but it missed the “splatter paint patterns.” Meanwhile, Meta fulfilled every aspect of the prompt, but it generated a subpar image, in my opinion.
So, What Are They Good At?
Midjourney V6 is an amazing improvement over V5.2, fixing every problem that its previous generation had. In my opinion, it’s now the best for both realistic and digital art, as well as text generation. It’s also the best at mimicking certain art styles, which other AI image generators can’t do due to policies and guidelines.
Midjourney may be the best at text generation, but it still has trouble generating long texts. The learning curve for prompts is also much steeper with the release of V6.

DALL-E 3 is still the best for prompt comprehension and a great alternative to Midjourney for generating text. It’s also the best at creating pixel art.
DALL-E could use some work in generating realistic images, especially ones with people.

Meta does realistic images really well, especially portraits and landscape photos. It’s also the best free AI image generator on the market.
Meta still can’t do text generation reliably. In all my testing, I’ve also found that it struggles a lot with pixel art.

Firefly is best used by digital artists who use the Adobe suite for editing.
Like most generators, Firefly still can’t generate text. It also struggles with creating artwork based on existing characters.

Stable Diffusion is a good AI image generator if you want one that can fulfill long prompts for free.
Stable Diffusion can’t generate realistic portraits without overemphasizing certain features.
Final Thoughts
With the release of Midjourney V6, it’s getting harder and harder to make a case for other AI image generators. The base model is in a league of its own right now, and it’s only going to get better when they officially release it, especially since they’re taking user feedback to improve their model.
Oh, and we haven’t even touched on its robust customization features, like improved upscaling, variations, and other prompt parameters. That’s how good it is.
However, if you’re just a casual user, Meta is a good alternative since it’s free. And if you’re looking for a model with great comprehension, DALL-E (with ChatGPT) is still the best one on the market.
V6 is a real turning point for AI art. The only question is, where do they go from here?