CycleGAN with Self-Attention Layers

[architecture diagram]

In this repository, I have developed a CycleGAN architecture with embedded self-attention layers that can solve three different complex tasks. The same principal neural-network architecture has been used for all three tasks: colorizing a facial sketch, removing shades and glasses, and turning a male face into a female one. Truth be told, my model has not exceeded any state-of-the-art performance on these tasks, but the architecture was powerful enough to understand the task it was given and to produce considerably good results.

About the architecture

The concept of CycleGAN used in this project is the same as in the original work. The novel part of my approach is adding self-attention layers to the U-net generator and to the discriminator. The idea of self-attention is inspired by the research paper Self-Attention Generative Adversarial Networks. I have modified the self-attention layer discussed in that paper for better results. In my case, the base formula for attention is shown below.

Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V

Source - Attention Is All You Need

The base code for the self-attention layer is built around this formula. In the generator, the self-attention layers are added at the bottleneck of the U-net and right before the output convolution layer. In the discriminator, the self-attention layers are added right before the zero-padding layer and right before the output layer.
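
For illustration, below is a minimal sketch of such a layer in TensorFlow/Keras, built directly around the formula above. The class and variable names are my own and this is not the exact layer used in this repository; in particular, it leaves out the LayerNormalization and Dropout mentioned later in the Special cases section.

    import tensorflow as tf
    from tensorflow.keras import layers

    class SelfAttention(layers.Layer):
        # Minimal sketch of a spatial self-attention block. It assumes the
        # input already has `channels` feature maps so the residual sum works.
        def __init__(self, channels, **kwargs):
            super().__init__(**kwargs)
            self.channels = channels
            # 1x1 convolutions produce the query, key and value maps.
            self.query_conv = layers.Conv2D(channels // 8, 1)
            self.key_conv = layers.Conv2D(channels // 8, 1)
            self.value_conv = layers.Conv2D(channels, 1)

        def hw_flatten(self, x):
            # Collapse height and width into one "positions" axis: (B, H*W, C).
            s = tf.shape(x)
            return tf.reshape(x, (s[0], s[1] * s[2], s[3]))

        def call(self, x):
            q = self.hw_flatten(self.query_conv(x))   # (B, N, C//8)
            k = self.hw_flatten(self.key_conv(x))     # (B, N, C//8)
            v = self.hw_flatten(self.value_conv(x))   # (B, N, C)

            # The base attention formula: softmax(Q K^T / sqrt(d_k)) V
            d_k = tf.cast(tf.shape(k)[-1], tf.float32)
            scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
            out = tf.matmul(tf.nn.softmax(scores, axis=-1), v)

            # Restore the spatial layout and add a residual connection.
            s = tf.shape(x)
            out = tf.reshape(out, (s[0], s[1], s[2], self.channels))
            return x + out

A feature map with, for example, 256 channels would then pass through it as x = SelfAttention(256)(x).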

Technologies used:

  1. The entire architecture is built with TensorFlow.
  2. Matplotlib has been used for visualization.
  3. NumPy has been used for mathematical operations.
  4. OpenCV has been used for processing the images.

Tasks Solved by the Architecture

I have trained and validated the model with an image size of 256x256 and trained it for 800 epochs. The default parameters in the config.py file are the baseline parameters used for training on the three different tasks.
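
For reference, the baseline settings described above amount to something like the following; the actual variable names in config.py may differ, and only the values stated in this README are shown:

    IMG_HEIGHT = 256        # training and validation image size
    IMG_WIDTH = 256
    EPOCHS = 800            # number of training epochs
    LEAKY_RELU_ALPHA = 0.1  # default LeakyReLU slope for the first two tasks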

Colorize Sketch

The given task is to colorize an input facial sketch image.
On training examples: [result images]
On validation examples: [result images]

Gender Bender

The given task is to transform a male face into a female face.
On training examples: [result images]
On validation examples: [result images]

Shades and Glass Remover

The given task is to remove glasses and sunglasses from an input facial image. While training the model on this task, the alpha parameter of LeakyReLU was set to 0.4 instead of the default 0.1 used for the two tasks above.
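
In code, that change only swaps the activation slope used in the convolutional blocks; a rough sketch is shown below (the block layout and filter sizes are illustrative, not the exact ones in this repository):

    import tensorflow as tf
    from tensorflow.keras import layers

    def downsample(filters, alpha=0.1):
        # Illustrative downsampling block; alpha=0.4 was used for the
        # shades/glass-removal task instead of the default 0.1.
        return tf.keras.Sequential([
            layers.Conv2D(filters, 4, strides=2, padding="same", use_bias=False),
            layers.LeakyReLU(alpha=alpha),
        ])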
On training examples: [result images]
On validation examples: [result images]


Implementation

Training

python main.py --height 256 --width 256 --epoch 300 --dataset "./dataset/" --subject 1

Validation

python main.py --train False --dataset "./validate/" --validate "face-1001.png" --subject 1

This saves the predicted image as "Gan_Output_face-1001.png".
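
The flags above map to a command-line parser roughly like the sketch below; the real definitions live in main.py and config.py, so the help texts and defaults here are only assumptions.

    import argparse

    def str2bool(value):
        # argparse does not convert the string "False" to a boolean on its own.
        return str(value).lower() in ("true", "1", "yes")

    parser = argparse.ArgumentParser(description="CycleGAN with Self-Attention")
    parser.add_argument("--height", type=int, help="input image height, e.g. 256")
    parser.add_argument("--width", type=int, help="input image width, e.g. 256")
    parser.add_argument("--epoch", type=int, help="number of training epochs")
    parser.add_argument("--dataset", type=str, help="path to the dataset folder")
    parser.add_argument("--subject", type=int,
                        help="which of the three tasks to train or validate")
    parser.add_argument("--train", type=str2bool, default=True,
                        help="train the model; pass False to run validation")
    parser.add_argument("--validate", type=str,
                        help="file name of the image to translate")
    args = parser.parse_args()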

Special cases

As I have mentioned above, one principal architecture solved all three tasks, but I have also found that modifying the self-attention layer by using

   def hw_flatten(self, x):
        # keep the last two axes as separate dimensions
        return layers.Reshape((-1, x.shape[-2], x.shape[-1]))(x)

instead of

   def hw_flatten(self, x):
        # merge the last two axes into a single dimension
        return layers.Reshape((-1, x.shape[-2] * x.shape[-1]))(x)

improved the outcome of the model for a particular individual case. Removing the LayerNormalization and Dropout from the self-attention layer has also improved performance for individual cases.

CycleGAN with Attention Architecture

The self-attention layer has been used in both the generator and the discriminator networks.

Generator

[generator architecture diagram]

Discriminator

[discriminator architecture diagram]
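
To make the placement concrete, the sketch below shows a PatchGAN-style discriminator with the two attention layers inserted where described earlier, right before the zero-padding and right before the output layer. Filter counts and depth are assumptions rather than the exact values used in this repository, and the generator follows the same idea with attention at the U-net bottleneck and before the output convolution. The sketch reuses the SelfAttention class sketched above.

    import tensorflow as tf
    from tensorflow.keras import layers

    def build_discriminator(input_shape=(256, 256, 3)):
        # Illustrative PatchGAN-style discriminator; only the attention
        # placement follows the description above, the rest is assumed.
        inp = layers.Input(shape=input_shape)
        x = inp
        for filters in (64, 128, 256):
            x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
            x = layers.LeakyReLU(alpha=0.1)(x)

        x = SelfAttention(256)(x)   # attention right before the zero-padding
        x = layers.ZeroPadding2D()(x)
        x = layers.Conv2D(512, 4, strides=1, use_bias=False)(x)
        x = layers.LeakyReLU(alpha=0.1)(x)
        x = layers.ZeroPadding2D()(x)

        x = SelfAttention(512)(x)   # attention right before the output layer
        out = layers.Conv2D(1, 4, strides=1)(x)
        return tf.keras.Model(inputs=inp, outputs=out)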

Future Scopes

  1. The model could be further improved with additional tuning of the convolution layers or other layers.
  2. Creating a deeper U-net architecture could also help improve the performance of the model.
