Pix2Pix (CVPR'2017)

Image-to-Image Translation with Conditional Adversarial Networks

Abstract

We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

Results and models

We evaluate the generation quality of pix2pix with Fréchet Inception Distance (FID, lower is better) and Inception Score (IS, higher is better).

| Method | FID | IS | Download |
| --- | --- | --- | --- |
| official facades | 119.135 | 1.650 | - |
| ours facades | 127.792 | 1.745 | model \| log |
| official maps-a2b | 149.731 | 2.529 | - |
| ours maps-a2b | 118.552 | 2.689 | model \| log |
| official maps-b2a | 102.072 | 3.552 | - |
| ours maps-b2a | 92.798 | 3.473 | model \| log |
| official edges2shoes | 75.774 | 2.766 | - |
| ours edges2shoes | 85.413 | 2.747 | model \| log |
| official average | 111.678 | 2.624 | - |
| ours average | 106.139 | 2.664 | - |
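
The "average" rows appear to be plain arithmetic means over the four tasks. The sketch below recomputes them from the per-task numbers in the table; the dictionary layout and the helper name `average_scores` are illustrative assumptions, not part of this repository.

```python
from statistics import mean

# (FID, IS) per task, copied from the results table.
official = {
    "facades": (119.135, 1.650),
    "maps-a2b": (149.731, 2.529),
    "maps-b2a": (102.072, 3.552),
    "edges2shoes": (75.774, 2.766),
}
ours = {
    "facades": (127.792, 1.745),
    "maps-a2b": (118.552, 2.689),
    "maps-b2a": (92.798, 3.473),
    "edges2shoes": (85.413, 2.747),
}


def average_scores(results):
    """Return the unweighted mean FID and mean IS over all tasks."""
    mean_fid = mean(fid for fid, _ in results.values())
    mean_is = mean(inception for _, inception in results.values())
    return mean_fid, mean_is


official_fid, official_is = average_scores(official)
ours_fid, ours_is = average_scores(ours)

print(f"official average: FID {official_fid:.3f}, IS {official_is:.3f}")
print(f"ours average:     FID {ours_fid:.3f}, IS {ours_is:.3f}")
```

Because FID is lower-is-better and IS is higher-is-better, the two averages have to be read in opposite directions when comparing the rows.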

Note: we strictly follow the inference setting described in Section 3.3 of the paper:

"At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch."

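That is, we run inference with the model in model.train() mode. Because dropout stays active and batch normalization uses the statistics of the current batch, inference results may differ slightly from run to run.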
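
The effect of the mode switch can be seen on a small PyTorch module. This is a minimal sketch under stated assumptions: `toy_generator` is a hypothetical stand-in containing dropout and batch normalization, not the pix2pix U-Net generator or any class from this repository.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy network (assumption): just enough layers to show the
# behavior of dropout and batch normalization in the two modes.
toy_generator = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Conv2d(8, 3, kernel_size=3, padding=1),
    nn.Tanh(),
)

x = torch.randn(1, 3, 16, 16)  # one random "input image"

# train() mode: dropout samples a new mask on every forward pass and
# batch norm normalizes with the statistics of the current batch.
toy_generator.train()
with torch.no_grad():
    train_a = toy_generator(x)
    train_b = toy_generator(x)

# eval() mode: dropout is disabled and batch norm uses its running statistics.
toy_generator.eval()
with torch.no_grad():
    eval_a = toy_generator(x)
    eval_b = toy_generator(x)

train_outputs_match = torch.equal(train_a, train_b)
eval_outputs_match = torch.equal(eval_a, eval_b)
print("train() outputs identical:", train_outputs_match)
print("eval() outputs identical:", eval_outputs_match)
```

Fixing the random seed before each forward pass is one way to make train-mode inference repeatable when exact reproducibility matters.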

Citation

@inproceedings{isola2017image,
  title={Image-to-image translation with conditional adversarial networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1125--1134},
  year={2017}
}