Midjourney vs DALL-E for Text Generation – Who Does It Better?

Let's take a quick history lesson and look back at the state of AI image generation a year ago. We couldn't reliably generate faces, DALL-E 2 had been released just a few months prior with mixed results, Midjourney V4 was starting to make some noise, and Stable Diffusion was leading the way with 2.0.

In just a year, AI art has become nearly perfect except for two important roadblocks: nuance and text generation.

Fast forward to today: DALL-E 3 arrived a few months back, and earlier this week, Midjourney V6 was finally released. Can these finally be the AI image generators that handle text perfectly? Let's find out.

Why Midjourney and DALL-E 3?

For a while now, DALL-E 3 has been the only AI image generator that can consistently create images with text. It's one of its main selling points, along with improved creativity and nuance. It's even showcased on the announcement page with this image:

Recently, Midjourney unveiled its newest model: V6. And what do you know, they're also highlighting better nuance, creativity, and, most importantly, minor text drawing as their improvements. I've always avoided using text generation when comparing Midjourney against other generators because it would be unfair, but now that we're getting this feature, it only makes sense to pit it against the best.

Head-to-Head: Midjourney vs. DALL-E for Text Generation

Each comparison will focus on text, but we'll also analyze each model's nuance and creativity in applying the text. So, without further ado, here's a direct comparison of Midjourney and DALL-E 3 using the same prompts:

Simple Text

Text: “This is text.”

In terms of the text itself, Midjourney performed better than DALL-E 3 because of a small mistake the latter made when writing the last part of the text. However, DALL-E's image is more cohesive, because the teacher in Midjourney's image is using a pen to write on a chalkboard.

Winner: Midjourney V6

Long Text

Text: “The quick brown fox jumps over a lazy dog, and promptly tripped over the dog’s tail, earning a disgruntled grumble.”

Both tried to add their own flair to a simple prompt (a piece of paper with writing on it), but neither actually produced readable text. This shows that AI image generators can write short phrases or sentences, but they get worse as you add more words.

Winner: None.

Keyboard

For this one, I didn't ask either model to write a specific phrase or sentence; instead, I tasked them with generating an accurate QWERTY keyboard. Obviously, neither is actually correct, but DALL-E couldn't even arrange the letters properly, while Midjourney somehow got the correct placement for more than half the letters.

Winner: Midjourney V6

Logo

Text: “Matcha.”

Both of these images demonstrate a great understanding of my original prompt (a green coffee mug logo) and showcase creativity. There's nothing wrong with either text, and each one even matches the art style its generator created for the logo.

Winner: Tie round.

Postcards

Text: “Happy Halloween.”

As AI image models evolve, I have to be extremely nitpicky with how I judge their text generation prowess. Case in point: I'd love to make this a tie round, but the minor errors in DALL-E's output (triple Ls in “Halloween” and inconsistent coloring in “Happy”) prevent me from doing so.

I'll say this though: I prefer DALL-E's postcard over Midjourney's.

Winner: Midjourney V6

Signs

Text: “Bacon and Eggs.”

This is a clear win for DALL-E. Midjourney V6 tried its best, but the unnecessary and out-of-place yellow “and” sign stops this round from becoming a tie.

DALL-E also shows great nuance this round by turning “and” into an ampersand and adding a separate “Diner” neon sign without me asking. It's not just readable; it's also creative, unique, and immersive.

Winner: DALL-E 3

Book Covers

Text: “Shapes and Stuff.”

I'll admit: DALL-E 3 created a much better book cover than V6. However, the book title generated by DALL-E has far too many errors, so I have to give this point to Midjourney, which perfectly rendered “Shapes and Stuff” in a consistent font. V6's cover design also showcases its improved comprehension by highlighting the text's keywords.

Winner: Midjourney V6.

Comic Panel

Text: “Knock knock!”

Midjourney V6 and DALL-E 3 both made minor errors in writing the text. Since both results are still readable and the artwork is amazingly done, I'm declaring this round another tie.

Winner: Tie round.

Surreal Settings

Text: “To infinity”

Just to provide a little background: my prompt for this round explicitly states that the text should be composed of stars. Although I mentioned that the focus would be on the text itself, which Midjourney did better this round, DALL-E's minor mistake won't prevent me from awarding this point to them because they did, in fact, create the text using stars.

Winner: DALL-E 3

The Final Tally and Observations


Category	DALL-E 3	Midjourney V6
Simple Text	Almost perfect text, and showcases a high level of nuance and creativity.	Perfect text, and showcases a great level of nuance and creativity.
Long Text	No readable text.	No readable text.
Keyboard	Letters aren't placed in the right order.	Around half of the letters are placed in the correct order.
Logo	Perfect text, and showcases a high level of nuance and creativity.	Perfect text, and showcases a high level of nuance and creativity.
Postcards	Almost perfect text, and showcases a high level of nuance and creativity.	Perfect text, and showcases a high level of nuance and good creativity.
Signs	Perfect text, and showcases an incredibly high level of nuance and creativity.	Almost perfect text with a noticeable mistake. Showcases a good level of nuance and creativity.
Book Covers	A good attempt with a few noticeable errors. Showcases a great level of creativity.	Perfect text, and showcases a great level of nuance and creativity.
Comic Panel	Almost perfect text, and showcases an incredibly high level of nuance and creativity.	Almost perfect text, and showcases an incredibly high level of nuance and creativity.
Surreal Settings	Almost perfect text, and showcases a high level of nuance and creativity.	Perfect text but shows low understanding of the prompt.
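For anyone who wants to double-check the score, here is a minimal Python sketch that adds up the round verdicts. The round names and winners are copied from the "Winner:" lines above; the variable names are just illustrative, and the "None" and "Tie" labels are my own shorthand for the rounds without a single winner.

```python
from collections import Counter

# Verdict for each round, as given in the "Winner:" line of each section.
rounds = {
    "Simple Text": "Midjourney V6",
    "Long Text": "None",
    "Keyboard": "Midjourney V6",
    "Logo": "Tie",
    "Postcards": "Midjourney V6",
    "Signs": "DALL-E 3",
    "Book Covers": "Midjourney V6",
    "Comic Panel": "Tie",
    "Surreal Settings": "DALL-E 3",
}

# Count how many rounds ended with each outcome.
tally = Counter(rounds.values())

for outcome, count in tally.most_common():
    print(f"{outcome}: {count}")
```

That works out to four rounds for Midjourney V6, two for DALL-E 3, two ties, and one round with no winner.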

One thing I've noticed in this testing is that DALL-E 3 appears to have a higher error rate than Midjourney. On the other hand, Midjourney tends to lack the same level of creativity and nuance when tasked with producing images that specifically ask for text. I believe that V6 is compromising a portion of its creativity when fed prompts that explicitly focus on text generation.

Wrapping Up

This head-to-head was a lot closer than I expected, but Midjourney V6 pulls through with a win. However, like I said earlier, V6's improved but still limited nuance is preventing it from generating text while making full use of its creativity.

That said, this is to be expected because this isn't the final version of V6 yet. Midjourney is only going to get better from here as they gradually improve the model behind it. There's no concrete news on DALL-E 4 yet, but we can expect the same improvements for that model too. For now, though, Midjourney is the one leading the space in text generation without a doubt.

That's it for this direct comparison. If you're looking for more articles about V6 and DALL-E 3, I highly suggest reading this article. Good luck!
