AI image generation, still a ways to go.

In early 2024, AI image generation has reached a fascinating paradox. At first glance, the technology seems almost magical – capable of creating stunning, photorealistic scenes that blend reality with imagination. Take, for instance, the task of generating a hybrid classic car design: when prompted to combine a ’67 Mustang with a ’57 MGA Roadster, AI can produce images with remarkable attention to automotive detail, set against perfectly rendered mountainous backgrounds with dramatic lighting. The chrome gleams, the curves flow, and the setting sun casts just the right shadows on the cliffs.

Yet upon closer inspection, the limitations become apparent. While AI excels at certain elements – like the mechanical precision of car bodies or the natural textures of landscapes – it still struggles with human subjects. In the classic car challenge, many generators either omitted the requested driver entirely or produced unconvincing human figures that break the illusion of photorealism. These inconsistencies reveal that while AI image generation has made tremendous strides, it remains a technology in development, particularly when it comes to naturally integrating human elements into complex scenes.

Here is a comparison of several free AI image generators that shows both the impressive capabilities and the persistent challenges of this rapidly evolving technology. All used the same prompt: “A hybrid car that combines elements of a 1967 Ford Mustang and a 1957 MGA Roadster. The design features the sleek, muscular stance of the Mustang with the graceful curves and open-top design of the MGA. The car is shown fully (not cropped) with a young woman driving. The driver’s position and the steering wheel are clearly on the left side of the car, with the top of the steering wheel visible. The car is driving on a winding mountain road, with lush greenery and rugged cliffs in the background under a setting sun, creating a dramatic and adventurous atmosphere.”

Here’s a list of the AI image generators covered in this article:

  • Perchance
  • DALL-E 3
  • DeepAI
  • Microsoft Designer
  • Deepimg.AI
  • FlatAI

Perchance

https://perchance.org/ai-photo-generator

While this is a nice image, there is no driver and little, if any, influence from the 1957 MGA Roadster. The site did offer multiple images, but the interface looked very crude.

DALL-E 3

https://openai.com/index/dall-e-3

This image does a better job of merging the Mustang and MGA, but the driver shows that AI still has difficulty generating realistic humans. Also, I was not pleased with the Mustang logos in the image. However, there was an edit option that let me prompt: “remove the mustang logos and text”.

This removed the logo and text, but gave me a new image that was far worse than the first. The car was less of a combination of a Mustang and an MGA. The driver is still unrealistic and is not positioned correctly. I tried one more time, asking, “No, keep the first image, just remove the logo and text”.

And DALL-E told me, “Here is the updated version of the original image with the logos and text removed. Let me know if you need further modifications!” To which I responded, “ARRRGGGG!!!!!”

DeepAI

https://deepai.org/machine-learning-model/text2img

This did a good job rendering the car, but still without much MGA influence. It sidestepped the unrealistic human issue by obscuring the driver. Overall, though, I like this one. It reminds me of driving through the canyon on the way to the John Day River in Oregon.

Microsoft Designer

https://designer.microsoft.com/image-creator

I think they hired the same model as DALL-E 3 :), but other than that this is not too bad. The rendering is good, and there are elements from both the Mustang and the MGA.

Deepimg.AI

https://deepimg.ai/ai-image-generator/

This is one of the best, in my opinion. The rendering is very good, and the driver is the most realistic of the group.

FlatAI

https://flatai.org/ai-image-generator-free-no-signup/

This is also very good.

Conclusions

As this comparison demonstrates, AI image generation has reached an intriguing inflection point in early 2024. While each platform showed remarkable capabilities in certain areas – particularly in rendering vehicles, landscapes, and atmospheric lighting – they also revealed consistent limitations that highlight the technology’s current state of development. Deepimg.AI and FlatAI emerged as standout performers, producing the most cohesive and convincing results, especially in their handling of human subjects – traditionally one of AI’s greatest challenges.

The varying success in merging the distinctive characteristics of the 1967 Mustang and 1957 MGA Roadster serves as a microcosm of AI’s current capabilities. While some generators produced beautiful vehicles, they often favored one model’s features over the other, suggesting that AI still struggles with truly creative hybrid designs that require deep understanding of multiple reference points.

Perhaps most tellingly, the experiment revealed the importance of iterative refinement in AI image generation. As seen with DALL-E 3’s attempts at logo removal, current AI systems can be surprisingly rigid when asked to make selective modifications to their outputs. This highlights a key area for future development: the ability to maintain desired elements while precisely adjusting others.

For users seeking to leverage these tools effectively, this comparison suggests that success lies in choosing the right platform for specific needs and being strategic with prompts. While no single generator proved perfect across all criteria, each demonstrated unique strengths that could be valuable in different contexts. As these technologies continue to evolve, we can expect to see improvements in their ability to handle complex requests and generate more consistent, customizable results.
