What is DALL-E 2 and will this AI transform the world?


Have you ever wondered what a tiger would look like wearing a lab coat with a 1980s Miami vibe? Probably not, and you’re unlikely to find anything that fits that exact description on Google. The only other option would be to pay a skilled artist to spend tens of hours painting it. Or perhaps there is another way, and this is where DALL-E 2 comes in. DALL-E 2 is a truly revolutionary tool from OpenAI that can generate a unique and accurate image from a simple text description like “a tiger dressed in a Miami Vice suit.”

Source: OpenAI
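
For readers who want to experiment, prompts like the tiger example can also be sent programmatically. The snippet below is only a minimal sketch using OpenAI’s Python SDK; it assumes the `openai` package is installed and an API key is configured, and the exact model identifier and parameters may differ depending on the API version you have access to.

```python
# Minimal sketch: generate an image from a text prompt via OpenAI's Images API.
# Assumes the `openai` package is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",  # assumed model identifier for the DALL-E 2 API
    prompt="a tiger dressed in a Miami Vice suit, 1980s style",
    n=1,               # number of images to generate
    size="1024x1024",  # requested resolution
)

# The API returns a URL (or base64 data) for each generated image.
print(response.data[0].url)
```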

DALL-E 2 can also edit and touch up existing photos in a very realistic way. Guided by a simple natural-language description, it can fill in gaps or replace attributes of an image so seamlessly that the changes blend in perfectly with the original.

This technique is referred to as “in-painting.”

Before, Source: OpenAI
After, Source: OpenAI
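
As a rough sketch of how in-painting can be requested through OpenAI’s API: you upload the original image together with a mask, and the transparent region of the mask marks the area the model should repaint to match your description. The file names and prompt below are placeholders, and the method signature assumes OpenAI’s Python SDK.

```python
# Minimal in-painting sketch: the transparent area of the mask marks the region
# the model should fill in according to the prompt.
# File names and the prompt are placeholders; assumes OpenAI's Python SDK.
from openai import OpenAI

client = OpenAI()

with open("original.png", "rb") as image, open("mask.png", "rb") as mask:
    response = client.images.edit(
        model="dall-e-2",  # assumed model identifier
        image=image,       # the photo to edit
        mask=mask,         # transparency indicates where to repaint
        prompt="a corgi wearing sunglasses on the sofa",  # placeholder edit description
        n=1,
        size="1024x1024",
    )

print(response.data[0].url)
```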

The first edition of DALL-E was introduced in January 2021 and was capable of generating images from text input, like this avocado armchair:

DALL-E (first edition), Source: OpenAI
DALL-E 2 (latest edition), Source: OpenAI

The latest edition, released in April 2022, takes the technology a step further with higher resolution and a much greater level of comprehension. DALL-E 2 also has enhanced features like in-painting. It can even take an existing picture as an input and generate variations of that image with different attributes, angles, and artistic styles.
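
Generating variations of an existing picture works in much the same way. The sketch below again assumes OpenAI’s Python SDK and uses a placeholder file name: it simply uploads an image and asks for several different takes on it.

```python
# Minimal sketch: request several variations of an existing image.
# The file name is a placeholder; assumes OpenAI's Python SDK.
from openai import OpenAI

client = OpenAI()

with open("koala.png", "rb") as image:
    response = client.images.create_variation(
        model="dall-e-2",  # assumed model identifier
        image=image,       # source image to vary
        n=3,               # number of variations to request
        size="1024x1024",
    )

for item in response.data:
    print(item.url)
```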

It’s not enough for the technology to understand objects like motorcycles and koala bears independently; it needs to learn how they are related. DALL-E 2 achieves this through deep learning: it was developed by training a neural network on images and their associated text descriptions. Because it learns how objects and actions relate to one another, DALL-E can produce a meaningful image of, for example, a koala bear riding a motorcycle, even if no such image has ever existed before.

Source: OpenAI

DALL-E 2 has three main objectives. The first is to help people express themselves visually in ways that were impossible before. The second is to gain an understanding, from AI-generated images, of how the technology interprets us. In other words, is DALL-E 2 just repeating what it has been taught? Finally, the development of this new technology is intended to show us how advanced AI systems have become by demonstrating how they see our world.

Source: OpenAI

Every revolutionary new technology has limitations that must be understood to avoid misuse, and DALL-E is no exception. If enough images on the internet have been incorrectly labeled, DALL-E may produce flawed output. For example, if a plane has been labeled a “car” enough times, DALL-E may return an image of an aircraft when a user requests a photo of a car. Think of it as talking to someone who was taught the incorrect word for something.

Another limitation is training gaps. For example, if you enter the text “baboon” and DALL-E has learned what a baboon is from the many accurately labeled images on the internet, it will generate a lot of great baboons.

Source: OpenAI

But what if you type something DALL-E hasn’t properly learned because of sparse or inaccurate labeling? For example, if you ask for a “howler monkey” and it hasn’t learned what a howler monkey is, DALL-E will do its best and might give you a “howling monkey” instead.

Source: OpenAI

What is fascinating about the techniques used to train DALL-E 2 is that it can take everything it has been taught about various objects and apply it to a new image. For example, you might provide a picture of a monkey and ask DALL-E to alter its appearance into something unique, like a photo of the monkey doing its taxes while wearing a funny hat.

Source: OpenAI
Source: OpenAI

DALL-E 2 is an example of how imaginative humans and clever systems can work together to make new things, amplifying our creative potential.

Thank you for reading.

