Have you ever wondered what a tiger would look like wearing a lab coat, with a 1980s Miami vibe? Probably not, and you’re unlikely to find anything matching that exact description on Google. Until now, the only way to get such an image was to pay a skilled artist to spend many hours painting it. Or perhaps there is another way, and this is where we introduce DALL-E 2. DALL-E 2 is a truly revolutionary tool from OpenAI that can generate a unique, accurate image from a simple text description like “a tiger dressed in a Miami Vice suit.”
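If you want to try this programmatically, OpenAI exposes DALL-E 2 through an API. Here is a minimal sketch using the OpenAI Python library; treat the endpoint and parameter names as illustrative of the library at the time of writing rather than a definitive reference:

```python
# Sketch: generating an image from a text prompt with the OpenAI API.
# Assumes the `openai` Python package and a valid API key.
import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Image.create(
    prompt="a tiger in a lab coat dressed in a Miami Vice suit, 1980s style",
    n=1,                # number of images to generate
    size="1024x1024",   # smaller sizes such as 512x512 are also supported
)
print(response["data"][0]["url"])  # link to the generated image
```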

DALL-E 2 can also edit and retouch existing photos in a very realistic way. Given a simple natural-language description, it can fill in gaps or replace attributes of an image so that the changes blend in seamlessly with the original.

This technique is referred to as “in-painting.”
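In API terms, in-painting is an “edit”: you supply the original image, a mask whose transparent region marks the area to repaint, and a prompt describing what should appear there. A sketch, again with endpoint names that are illustrative rather than authoritative:

```python
# Sketch: in-painting via the OpenAI image-edit endpoint.
# Transparent pixels in mask.png mark the region DALL-E 2 may repaint.
import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Image.create_edit(
    image=open("original.png", "rb"),  # the photo to touch up
    mask=open("mask.png", "rb"),       # transparency = editable region
    prompt="a flamingo floating in the swimming pool",
    n=1,
    size="1024x1024",
)
print(response["data"][0]["url"])
```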


The first version of DALL-E was introduced in January 2021 and was capable of generating images from text input, such as its now-famous avocado armchair.


The latest version, released in April 2022, takes the technology a step further with higher resolution and a much greater level of comprehension. DALL-E 2 also has enhanced features like in-painting, and it can even take an existing picture as input and generate variations of that image with different attributes, angles, and artistic styles.
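Generating variations needs no prompt at all; you simply hand over an image and ask for alternative takes. A sketch along the same lines as the earlier examples:

```python
# Sketch: requesting variations of an existing image.
import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Image.create_variation(
    image=open("original.png", "rb"),
    n=3,              # three different takes on the same picture
    size="512x512",
)
for item in response["data"]:
    print(item["url"])
```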
It’s not enough for the technology to understand objects like motorcycles and koala bears independently; it needs to learn how they are related. DALL-E 2 achieves this through deep learning: it was developed by training a neural network on images paired with their text descriptions. By understanding how objects and actions relate, DALL-E can produce a meaningful image of, for example, a koala bear riding a motorcycle, even if that image has never existed before.
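To make that concrete, here is a toy sketch of the kind of contrastive image-text training that DALL-E 2 builds on (via the CLIP model). The tiny encoders and random data below are placeholders for illustration only, not OpenAI’s actual architecture:

```python
# Toy sketch of CLIP-style contrastive training on image-text pairs.
# Matching pairs are pulled together in a shared embedding space;
# mismatched pairs are pushed apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyImageEncoder(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, embed_dim),
        )

    def forward(self, images):
        return self.net(images)

class ToyTextEncoder(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings into one vector per caption.
        return self.embed(token_ids).mean(dim=1)

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Score every image against every caption in the batch.
    logits = image_emb @ text_emb.t() / temperature
    # The true (image, caption) pairs sit on the diagonal.
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy batch: 4 random "images" paired with 4 random 8-token "captions".
images = torch.randn(4, 3, 64, 64)
captions = torch.randint(0, 1000, (4, 8))
loss = contrastive_loss(ToyImageEncoder()(images), ToyTextEncoder()(captions))
print(f"contrastive loss: {loss.item():.3f}")
```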

DALL-E 2 has three main objectives. The first is to help people express themselves visually in ways that were impossible before. The second is to learn, from an AI-generated image, how the technology interprets us; in other words, is DALL-E 2 just repeating what it has been taught? Finally, the development of this new technology is intended to show humans how advanced AI systems have become by demonstrating how they see our world.

Every revolutionary new technology has limitations that must be understood to avoid misuse, and DALL-E 2 is no exception. If enough images have been mislabeled on the internet, DALL-E may produce flawed output. For example, if a plane has been labeled a “car” often enough, DALL-E may return an image of an aircraft when a user requests a photo of a car. Think of it as talking to someone who was taught the wrong word for something.
Another limitation is training gaps. For example, if you enter the text “baboon,” and DALL-E has learned what a baboon is from the many accurately labeled images on the internet, it will generate a lot of great baboons.

But what if you typed something DALL-E never learned properly, due to inaccurate labeling or too few examples? If you ask for a “howler monkey” and it hasn’t learned what a howler monkey is, DALL-E will do its best and might instead give you a “howling monkey.”

What is fascinating about the techniques used to train DALL-E 2 is that it can take everything it has learned about various objects and apply it to a new image. For example, you might provide a picture of a monkey and ask DALL-E to alter its appearance into something unique, like a photo of the monkey doing its taxes while wearing a funny hat.


DALL-E 2 is an example of how imaginative humans and clever systems can work together to make new things, amplifying our creative potential.
Thank you for reading.