This repository contains code for training an image generator using a slight variant of the PixelCNN architecture described in "Conditional Image Generation with PixelCNN Decoders".
Most of the code is written in core Theano. 'keras' is used for loading data, and the optimizer implementation comes from 'lasagne'.
Dependencies:
Use install_dependencies.sh to install the dependencies and experiments.sh to train the model.
Notes on results:
- Images with 2-bit depth are used for both training and generation, i.e. every pixel is quantized into four levels before training. A four-way softmax predicts the quantization level of each pixel.
- The following results were obtained after 60 epochs of training, which took about 10 hours on a K6000 GPU. No hyperparameter search was performed.
Generated images
Training images
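The 2-bit quantization mentioned in the notes above can be sketched as follows. This is only an illustration: it assumes uniform binning of 8-bit pixels, and the helper names quantize and dequantize are hypothetical, not functions from this repository.

```python
import numpy as np

def quantize(images, levels=4):
    """Map uint8 pixels in [0, 255] to integer class labels in [0, levels - 1].

    Assumption: uniform bins, and `levels` divides 256 evenly.
    """
    assert 256 % levels == 0, "levels must divide 256 for uniform bins"
    images = np.asarray(images, dtype=np.uint8)
    bin_width = 256 // levels  # 64 intensity values per level when levels=4
    return (images // bin_width).astype(np.int32)

def dequantize(labels, levels=4):
    """Map class labels back to [0, 1] intensities, e.g. for display."""
    return np.asarray(labels, dtype=np.float32) / (levels - 1)

pixels = np.array([[0, 63, 64], [127, 128, 255]], dtype=np.uint8)
labels = quantize(pixels)          # softmax targets, one class per pixel
restored = dequantize(labels)      # values in {0, 1/3, 2/3, 1}
```

The labels serve as per-pixel targets for the four-way softmax. Passing levels=256 leaves the pixel values unchanged, which corresponds to the 256-way softmax setting.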
Salient features: no blind spots, efficient implementation of the vertical and horizontal stacks, residual connections, and good generation results :D
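To illustrate how vertical and horizontal stacks avoid blind spots, here is a minimal sketch of one common way to mask the convolution kernels. It is an assumption for illustration only: the repository may instead use cropped or shifted kernels for efficiency, and the function names below are hypothetical.

```python
import numpy as np

def vertical_mask(k):
    """k x k mask that keeps only the rows strictly above the centre row."""
    mask = np.zeros((k, k), dtype=np.float32)
    mask[: k // 2, :] = 1.0
    return mask

def horizontal_mask(k, include_centre):
    """1 x k mask that keeps pixels left of the centre, optionally the centre.

    The first layer must exclude the centre (the pixel being predicted);
    later layers may include it because they read features, not raw pixels.
    """
    mask = np.zeros((1, k), dtype=np.float32)
    mask[0, : k // 2] = 1.0
    if include_centre:
        mask[0, k // 2] = 1.0
    return mask

v = vertical_mask(3)
h_first = horizontal_mask(3, include_centre=False)
h_later = horizontal_mask(3, include_centre=True)
```

Multiplying a kernel elementwise by its mask before the convolution keeps each stack causal. The vertical stack covers every row above the current pixel and the horizontal stack covers the pixels to its left in the same row, so together they see the full context.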
For any comments/feedback, feel free to email me at [email protected] or open an issue here.
TODO: Implement gated activation and conditional generation.
If you have GPU resources, feel free to train on CIFAR10; a training script is provided for that. Let me know how it goes. You can also train with a 256-way softmax and perform a hyperparameter search on the MNIST dataset.