
wheynelau/pix2pix-GAN-Experiment


UNET GAN pix2pix with VGG19 generator and discriminator


Motivation and goals

  • Explore the use cases of GAN
  • Experiment with cloud computing resources
  • Implement a GAN model with tensorflow
  • Use pretrained models as a discriminator and generator
  • Create a training pipeline for GAN models

Pre-requisites

Tensorflow was built from source using the following configuration:

python=3.10
tensorflow=2.9.3
cudatoolkit=11.2
cudnn=8.1

The project has also been checked to work in Docker with the following Dockerfile:

FROM tensorflow/tensorflow:2.9.3-gpu

RUN pip install hydra-core tqdm scipy matplotlib

A conda environment file will be provided in the root directory of this repository. It was only tested on a Windows machine.

Getting started

If you would like to use this project, follow these steps:

  1. Clone this repository
git clone https://github.com/wheynelau/VGG19-gan-experiment.git
  2. Install the requirements. This step requires conda to be installed on your machine.
conda env create -f environment.yml
  3. Set up conf/config.yaml

All available options are in the config.yaml file.

  4. Set up the folders and files

If your image is in the format of two images combined together, you can use the 'preprocess.py' file to split them into two images.

Here is an example of the image:

Your directory should look like this:

python src/preprocess.py
├───data
│   ├───mask (optional)
│   ├───test
│   └───train

Running the 'preprocess.py' file creates a new directory at the preprocess_path set in the configuration and splits each combined image into two images. It assumes that the files in mask and in test/train have the same names. The directory looks like this after running the 'preprocess.py' file:

$ python src/preprocess.py
├───$preprocess_path
│   ├───test
│   │   ├───image
│   │   └───target
│   └───train
│       ├───image
│       └───target
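
The split itself can be sketched as follows. This is a minimal illustration, not the code in src/preprocess.py: it assumes the two images sit side by side with equal width, and that the left half is the input image and the right half is the target. The function name `split_combined` is hypothetical.

```python
import numpy as np


def split_combined(combined: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a side-by-side image of shape (H, 2*W, C) into two (H, W, C) halves.

    Assumption: the left half is the input image and the right half is the
    target. Swap the return values if your dataset uses the opposite order.
    """
    width = combined.shape[1]
    if width % 2 != 0:
        raise ValueError(f"expected an even width, got {width}")
    half = width // 2
    image = combined[:, :half]
    target = combined[:, half:]
    return image, target


# Example with a dummy 256x512 RGB array standing in for a loaded image.
dummy = np.zeros((256, 512, 3), dtype=np.uint8)
image, target = split_combined(dummy)
print(image.shape, target.shape)
```

Each half would then be written to the matching image and target folder under the same file name.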

Training

$ python train.py

Inferencing

At the end of train.py, I've added a statement to save the generator of the GAN model. This is the generator that will be used for inferencing.

Run the below command to infer on a folder containing images:

Note: There is no exception handling for non-image files, so please input only image files. In addition, all images will be resized to 256x256.

$ python infer.py
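
Since non-image files are not handled, one option is to filter the input folder before running inference. The sketch below is not part of infer.py; the helper name `list_image_files` and the set of extensions are assumptions to adapt to your data.

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever your image loader supports.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_image_files(folder: str) -> list[Path]:
    """Return files in `folder` whose extension looks like an image, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Checking the extension only catches obvious mismatches; a corrupted file with an image extension would still fail when it is decoded.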

Results are in the RESULTS.md file

Takeaways

  1. Implemented a GAN model from scratch in tensorflow.
  2. Used pretrained models as the generator and discriminator.
  3. Improved the training pipeline by overriding the compile and train_step methods.
  4. Custom callbacks for Tensorboard, checkpointing and optimiser learning rate scheduling.

Problems

  1. GANs are hard to train, especially when the generator and discriminator are not balanced.
  2. Encountered mode collapse, where the generator generated almost the same image for all the images in the dataset.
  3. Difficult to achieve equilibrium between the generator and discriminator.

Future enhancements / TODO

  1. Explore other loss functions
  2. Introduce noise to the input images
  3. Use cGAN instead of GAN
  4. Implement a PyTorch version of the model
  5. Update the discriminator less often than the generator
  6. Experiment with whether CycleGAN works better
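
For item 5, one common way to train the discriminator less often than the generator is to update it only on every k-th step. The following is a minimal, framework-independent sketch; the name `should_update_discriminator` and the interval are hypothetical and not part of this repository.

```python
def should_update_discriminator(step: int, interval: int = 2) -> bool:
    """Return True on steps where the discriminator should be trained.

    The generator is assumed to be updated on every step; the discriminator
    is updated only when `step` is a multiple of `interval`.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return step % interval == 0


# With interval=3, the discriminator trains on steps 0, 3, 6, ...
schedule = [should_update_discriminator(step, interval=3) for step in range(6)]
print(schedule)  # [True, False, False, True, False, False]
```

Inside a custom train_step, this check would guard the discriminator's gradient update while the generator update runs unconditionally.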

Contributing

Feedback and contributions are welcome. As this is a learning project, I may not be active in maintaining this repository and accepting pull requests.

Sources

pix2pix

APDrawingGAN (the datasets were found in this repository)
