Generating novel images using CLIP in Python.

Track: Scientific Computing

Type: Talk

Room: Main Hall

Time: Oct 14 (Fri): 11:30

Duration: 0:45

Generating images from text was long considered a distant goal. Recently, models such as DALL-E, Parti, and Stable Diffusion have shown that, given a text prompt, they can generate realistic and novel images. Many of these models build on a neural network called CLIP (Contrastive Language-Image Pre-training), which serves as the backbone of various text-to-image models. CLIP is trained on a large variety of image-text pairs and learns to match an image with the most relevant text snippet without being optimized directly for that task. Because of its popularity, anyone can use the publicly available CLIP models, but few people get the best images out of a given text prompt, which has given rise to prompt engineering. We propose to use a stable CLIP model written in Python and show the audience how to do prompt engineering, as well as how to modify CLIP models to do more than just generate images. We would also like to show how Python makes such complex generative models easy to code, modify, and deploy.
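As a rough illustration of the matching step described above, the following minimal sketch ranks candidate prompts against an image by cosine similarity followed by a softmax, which is the general shape of CLIP-style scoring. It does not load a real CLIP model: the three-dimensional embeddings, the prompt strings, and the helper names `cosine_similarity` and `rank_prompts` are hypothetical stand-ins chosen for this example, and the temperature of 0.07 is an assumed value.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_prompts(image_embedding, prompt_embeddings, temperature=0.07):
    """Return (prompt, probability) pairs, best match first.

    Similarities are divided by a temperature (assumed value) and
    passed through a softmax so the scores sum to 1.
    """
    logits = {
        prompt: cosine_similarity(image_embedding, emb) / temperature
        for prompt, emb in prompt_embeddings.items()
    }
    # Subtract the maximum logit for numerical stability.
    max_logit = max(logits.values())
    exps = {p: math.exp(v - max_logit) for p, v in logits.items()}
    total = sum(exps.values())
    probs = [(p, v / total) for p, v in exps.items()]
    return sorted(probs, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; a real model would produce
# vectors with hundreds of dimensions.
image_embedding = [0.9, 0.1, 0.2]
prompt_embeddings = {
    "a dog": [0.2, 0.9, 0.1],
    "a photo of a dog": [0.8, 0.3, 0.2],
    "a detailed studio photo of a golden retriever": [0.9, 0.1, 0.25],
}

ranking = rank_prompts(image_embedding, prompt_embeddings)
for prompt, prob in ranking:
    print(f"{prob:.3f}  {prompt}")
```

In this toy setup, prompt engineering amounts to trying alternative wordings and keeping the one whose embedding scores highest against the target image. With a real CLIP model the embeddings would come from its image and text encoders instead of hand-written lists.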

URLs