Can AI Help Design Our Mascot?
Archived from Labelf

We ordered designs from freelancers and compared them to what AI could generate. The results were surprising.

Viktor Alm

March 1, 2021

Background

We ordered some designs on a well-known freelancing site.

As reference, we added instructions to the orders such as "a kind, helpful-looking elf with a Santa hat", "I'd like it to be in the style of a mascot logo or combination logo", and "it should be cute". We also attached a few images of various robots, and I drew a very rough sketch.

Rough sketch of the mascot idea

Results from humans

Nine mascot designs delivered by human freelancers

Results from "AI"

Twenty-one AI-generated mascot designs

This is going to change the way graphic design is done, and it will probably take only a few years, since most of the components already exist. What's missing is putting them together into a polished service.


How did we do it?

We started experimenting with CLIP and BigGAN through Ryan Murdock's "Big Sleep". We were testing what kind of images we could get by asking it to draw a green robot rocket with big cute eyes. At first the results were not very promising, but then something clicked: by nudging the model with multiple different instructions, it would not get stuck at a certain spot, and instead produced an endless stream of robots.
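The actual Big Sleep loop backpropagates CLIP's image/text similarity score into BigGAN's latent input. As a rough, self-contained sketch of that feedback idea, here is a toy version with plain numpy vectors standing in for the CLIP embeddings and the GAN latent (every name and number here is illustrative, not Big Sleep's real API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: in Big Sleep, `target` would be CLIP's embedding of the
# text prompt and `latent` the BigGAN input being optimized.
target = rng.normal(size=64)
target /= np.linalg.norm(target)   # unit vector, so |target| == 1 below

latent = rng.normal(size=64)
latent /= np.linalg.norm(latent)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def ascend(latent, target, lr=0.1):
    # Closed-form gradient of cosine similarity w.r.t. `latent`
    # (valid because `target` is a unit vector).
    n = np.linalg.norm(latent)
    grad = target / n - cosine(latent, target) * latent / n**2
    return latent + lr * grad

before = cosine(latent, target)
for _ in range(200):
    latent = ascend(latent, target)
after = cosine(latent, target)
```

In the real pipeline the generated image sits between the latent and the score, and the gradient flows through both BigGAN and CLIP; the toy keeps only the core idea of climbing a similarity score step by step.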

BigGAN experiment screenshot

Around 02:00, I realized that it would probably be able to understand what I was after!


Phase 1




A frenzy of generation attempts followed, with all three of us trying various methods. We added reference images for the model to look at, and wrote instructions such as "a rocket-robot" and "a robot with two cute eyes". We realized this was a viable method for getting inspiration for our logo.
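Mixing reference images and written instructions can be thought of as optimizing a single latent against a weighted sum of similarity scores, one score per prompt or reference. A minimal numpy sketch of that combination, with random vectors standing in for CLIP's text and image embeddings (the weights, names, and dimensions are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 256

# Stand-ins for CLIP embeddings of two text prompts and one reference
# image; in the real setup these would come from CLIP's encoders.
targets = [rng.normal(size=dim) for _ in range(3)]
targets = [t / np.linalg.norm(t) for t in targets]  # unit vectors
weights = [0.5, 0.3, 0.2]  # how strongly each instruction pulls

latent = rng.normal(size=dim)
latent /= np.linalg.norm(latent)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def combined_grad(latent, targets, weights):
    # Gradient of the weighted sum of cosine similarities
    # (each target is a unit vector).
    n = np.linalg.norm(latent)
    g = np.zeros_like(latent)
    for w, t in zip(weights, targets):
        g += w * (t / n - cosine(latent, t) * latent / n**2)
    return g

for _ in range(400):
    latent = latent + 0.1 * combined_grad(latent, targets, weights)

scores = [cosine(latent, t) for t in targets]
```

With positive weights the latent settles near a weighted blend of the targets, so no single instruction completely dominates; adding one more reference nudges the output rather than replacing it.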


What is creativity? What is design?

I'd argue that compression is a big part of it. Take cubism: the real world exists, cubes exist; compress one into the other and a creative piece is born. If there is one thing neural networks and transformers are good at, it is compression. So if creativity is partly compression, we should see evidence of "creativity" in these experiments: searching and mixing between styles and objects to reach a goal.

Let's have a look at Filip's meme-moment. LEMON DWARF!

Four generations from the lemon-dwarf experiment

Filip used multiple instructions: "a robot with big eyes", an image of a robot, and another image of a dwarf. Every once in a while the instruction "a lemon" was added, forcing the model out of its usual struggle to output a mix of a robot and a dwarf. Now it had to work a lemon into the mix. The result is a LEMON DWARF ROBOT!
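Filip's lemon trick can be read as cycling the optimization target: most steps chase the robot and dwarf embeddings, and every tenth step the lemon embedding is swapped in, so the latent never settles on just the robot-dwarf blend. A toy numpy sketch of that schedule (again with random vectors standing in for the real CLIP embeddings):

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 256

def unit(v):
    return v / np.linalg.norm(v)

# Stand-ins for CLIP embeddings of the three instructions.
robot = unit(rng.normal(size=dim))
dwarf = unit(rng.normal(size=dim))
lemon = unit(rng.normal(size=dim))

latent = unit(rng.normal(size=dim))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def ascend(latent, target, lr=0.05):
    # Gradient ascent on cosine similarity (target is a unit vector).
    n = np.linalg.norm(latent)
    grad = target / n - cosine(latent, target) * latent / n**2
    return latent + lr * grad

for step in range(600):
    if step % 10 == 9:
        target = lemon                          # the occasional "a lemon" nudge
    else:
        target = robot if step % 2 == 0 else dwarf
    latent = ascend(latent, target)

sims = {"robot": cosine(latent, robot),
        "dwarf": cosine(latent, dwarf),
        "lemon": cosine(latent, lemon)}
```

The occasional lemon step leaves a measurable pull toward the lemon direction while the robot and dwarf still dominate, which is roughly the lemon-dwarf-robot behaviour the galleries show.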

Even if this is not true "AI creativity", the lines are definitely getting blurrier. The generator was never trained on dwarfs or robots; it is mixing its ability to draw a cassette player, a car, and a dog to create something different, in between those objects.


Phase 2

After generating over 200,000 assorted cute robots, we wanted something different. I created a new method for the interaction between BigGAN and CLIP to get different results, and we switched around the reference images and started experimenting a bit, creating reference images ourselves.


Here we used the old instructions and reference images but added "minimalism" as an instruction. With the new method, the results changed quite a bit.

Can we make our current logo more awesome?

Let's add these two images to the model, along with some other instructions such as "an astronaut's helmet" and "a big visor".

Reference images: nissepupil and Daft Punk style

I'd say it works very, very well. Scarily well. This is a fantastic tool for creativity.

Sometimes it's a hit or miss:

Four hit-or-miss generations


Thought experiment

Reference images: Daft Punk style and the thought-experiment reference

We "show" it the images above, together with the names of a few of our favourite artists, and this stunning image is generated.

In the end, we ask ourselves the final question: in what stage of all this did the creativity happen?
  • Was it the invention of artificial neural nets?
  • Was it the vast resources of images and text available online?
  • Was it DALL-E or VAEs?
  • Was it transformers and visual transformers?
  • Was it when Alec Radford et al designed CLIP?
  • Was it when Ryan Murdock put it all together?
  • Was it when we prompted it with images and texts and gave it a purpose?

Summary

One problem remains

How can we possibly pick a favourite among everything that is generated? We did not really solve the problem of designing our logo; if anything, we made it harder for ourselves by discovering an almost endless stream of robot designs.

The future of infinite content is approaching.


Thanks to:

BigGAN: "Large Scale GAN Training for High Fidelity Natural Image Synthesis", Andrew Brock, Jeff Donahue, Karen Simonyan. arxiv.org/abs/1809.11096

CLIP: "Learning Transferable Visual Models From Natural Language Supervision", Alec Radford, Jong Wook Kim, et al. (OpenAI)

DALL-E: "Zero-Shot Text-to-Image Generation", Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever. arxiv.org/abs/2102.12092

Methods inspired by and developed from Ryan Murdock's (@advadnoun) initial method.