A simple tool for text-to-image generation using CLIP and BigGAN. The technique was created by https://twitter.com/knowrohit07

knowrohit/Text-to-Image-Bigger-GAN

It was fun while it lasted. I never anticipated the myriad of multimodal and image-generation models that would arrive within a year, making this project obsolete.

Install

$ pip install big-sleep

Usage

$ dream "amber turd "

Images are saved to the directory in which the command is invoked.

Advanced

You can invoke this in code with

from big_sleep import Imagine

dream = Imagine(
    text = "fire in the sky",
    lr = 5e-2,
    save_every = 25,
    save_progress = True
)

dream()

You can now train on more than one phrase by separating phrases with the delimiter "|".

Train on Multiple Phrases

In this example we train on three phrases:

  • an armchair in the form of pikachu
  • an armchair imitating pikachu
  • abstract

from big_sleep import Imagine

dream = Imagine(
    text = "an armchair in the form of pikachu|an armchair imitating pikachu|abstract",
    lr = 5e-2,
    save_every = 25,
    save_progress = True
)

dream()
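
To make the delimiter concrete, here is a minimal sketch of how a "|"-separated prompt breaks down into individual phrases. The helper name `split_phrases` is hypothetical and is not part of big_sleep; it only illustrates the prompt format, not how the library processes it internally.

```python
from typing import List


def split_phrases(text: str, delimiter: str = "|") -> List[str]:
    """Split a delimited prompt into phrases, dropping empty pieces.

    Hypothetical helper for illustration only; not a big_sleep function.
    """
    return [piece.strip() for piece in text.split(delimiter) if piece.strip()]


phrases = split_phrases(
    "an armchair in the form of pikachu|an armchair imitating pikachu|abstract"
)
for phrase in phrases:
    print(phrase)
```

Each phrase is its own optimization target, so a short style word such as "abstract" can sit alongside longer descriptions.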

Penalize certain prompts as well!

In this example we train on the three phrases from before and penalize the phrases:

  • blur
  • zoom

from big_sleep import Imagine

dream = Imagine(
    text = "an armchair in the form of pikachu|an armchair imitating pikachu|abstract",
    text_min = "blur|zoom",
)
dream()
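
One way to picture the penalty is as a loss that rewards similarity to the `text` phrases and punishes similarity to the `text_min` phrases. The sketch below uses toy 2-D vectors in place of real CLIP embeddings. The function names, the vectors, and the equal weighting of the two terms are all assumptions made for illustration, not big_sleep's actual implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prompt_loss(
    image_embed: Sequence[float],
    max_embeds: Sequence[Sequence[float]],
    min_embeds: Sequence[Sequence[float]],
) -> float:
    """Toy loss (lower is better): reward wanted phrases, penalize unwanted ones."""
    reward = sum(cosine_similarity(image_embed, e) for e in max_embeds) / len(max_embeds)
    penalty = 0.0
    if min_embeds:
        penalty = sum(cosine_similarity(image_embed, e) for e in min_embeds) / len(min_embeds)
    return -reward + penalty


# Made-up embeddings: one wanted direction and one penalized direction.
wanted = [[1.0, 0.0]]
unwanted = [[0.0, 1.0]]
print(prompt_loss([1.0, 0.0], wanted, unwanted))  # image aligned with the wanted phrase
print(prompt_loss([0.0, 1.0], wanted, unwanted))  # image aligned with the penalized phrase
```

Under this toy formulation, an image that drifts toward "blur" or "zoom" raises the loss, so the optimizer is pushed away from those concepts.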

You can also set a new text with the .set_text(<str>) method

dream.set_text("a quiet pond underneath the midnight moon")

And reset the latents with .reset()

dream.reset()

To save the progression of images during training, supply the --save-progress flag

$ dream "a bag full of genitals screaming the sentence 'my dog stepped on a bee'" --save-progress --save-every 100

Due to the class-conditioned nature of the GAN, Big Sleep often steers off the manifold into noise. You can use the --save-best flag to save the highest-scoring image (as judged by the CLIP critic) to {filepath}.best.png in your folder.

$ dream "a room with a view of the ocean" --save-best
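
The idea behind keeping a best image can be sketched as tracking the highest score seen so far across training steps. The scores below are made up, `best_step` is a hypothetical helper, and the sketch assumes that a higher critic score means a better image; big_sleep's own bookkeeping may differ.

```python
from typing import Iterable, Optional, Tuple


def best_step(scores: Iterable[float]) -> Optional[Tuple[int, float]]:
    """Return (step index, score) for the highest score, or None if there are none.

    Hypothetical helper for illustration; assumes higher scores are better.
    """
    best: Optional[Tuple[int, float]] = None
    for step, score in enumerate(scores):
        # Replace the stored best only on a strict improvement.
        if best is None or score > best[1]:
            best = (step, score)
    return best


# Made-up critic scores for four training steps.
history = [0.21, 0.34, 0.29, 0.12]
print(best_step(history))
```

Because later steps can score worse than earlier ones, the final image is not necessarily the one worth keeping, which is why a separate best image is useful.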

Larger model

If you have enough memory, you can also try using a bigger vision model released by OpenAI for improved generations.

$ dream "pink storm clouds rolling in over a white suv" --larger-model

Experimentation

You can restrict the number of BigGAN classes that Big Sleep may use with the --max-classes flag, as follows (e.g., 15 classes). This may make training more stable, at the cost of some expressivity.

$ dream 'a scarecrow dancing with grannies in a field full of parrots' --max-classes 15
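
As a rough illustration of what restricting classes means, the sketch below keeps only the largest entries of a class-weight vector and zeroes the rest. The function `restrict_to_top_classes` and the example weights are hypothetical, and this hard top-k masking is an assumption chosen for clarity rather than the mechanism big_sleep actually uses.

```python
from typing import List, Sequence


def restrict_to_top_classes(weights: Sequence[float], max_classes: int) -> List[float]:
    """Keep the max_classes largest weights and set every other weight to zero.

    Hypothetical helper for illustration only; not a big_sleep function.
    """
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    keep = set(ranked[:max_classes])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Made-up weights over four classes, restricted to the two strongest.
print(restrict_to_top_classes([0.1, 0.5, 0.3, 0.9], max_classes=2))
```

With fewer active classes the generator has fewer directions to wander in, which matches the trade-off described above between stability and expressivity.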
