Abstract

Recent image-to-image translation methods attempt to extend the model from one-to-one mapping to multiple mappings by injecting a latent code. Based on the mathematical formulation of networks that use the existing method of latent code injection, we show that the role of the latent code is to control the mean of the feature maps after convolution. We then find that common normalization strategies may reduce the diversity of different mappings or the consistency of one specific mapping, which makes them unsuitable for multi-mapping tasks. We provide a mathematical derivation showing that the effects of the latent code are eliminated after instance normalization and that the distributions of the same mapping become inconsistent after batch normalization. To address these problems, we present the consistency-within-diversity design criteria for multi-mapping networks and propose central biasing normalization, which applies a slight yet significant change to existing normalization strategies. Instead of spatially replicating the latent code and concatenating it into the input layers, we inject it into the normalization layers, where the offset of the feature maps is eliminated to ensure output consistency for one specific mapping, and a bias calculated from the latent code is appended to achieve output diversity for different mappings. In this way, not only are the proposed design criteria met, but the modified generator network also has a much smaller number of parameters. We apply this technique to multi-modal and multi-domain translation tasks. Both quantitative and qualitative evaluations show that our method outperforms current state-of-the-art methods. Code and pretrained models are available at \url{https://github.com/Xiaoming-Yu/cbn}.

\begin{IEEEkeywords}
Normalization, image-to-image translation, multiple mappings.
\end{IEEEkeywords}

\thesection Introduction

Many image processing and computer vision problems can be framed as image-to-image translation tasks [isola2017pix2pix], which map an image from one specific domain to another, \eg, facial attribute transfer, grayscale image colorization, and edge maps to photos. Although recent studies have shown remarkable success in image-to-image translation between two domains [isola2017pix2pix, zhang2016colorful, CycleGAN2017, Yi2017DualGAN, kim2017disco, liu2017UNIT], these one-to-one mapping methods are not suitable for multi-mapping problems, since a separate model must be built for every pair of mappings, even when some mappings share common semantics. To address this problem, a latent code is introduced to indicate the different mappings, and it has been applied to multi-domain translation tasks [lample2017fader, choi2017stargan] and multi-modal translation tasks [zhu2017multimodal].
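
The claims in the abstract about the role of the latent code can be illustrated with a short sketch. The notation here is ours and is an assumption, not taken from the source: $x_i$ are the input channels, $c \in \mathbb{R}^n$ is the latent code, $w_{k,i}$, $w'_{k,j}$ and $b_k$ are the kernels and bias of one convolution, and boundary effects from padding are ignored. When $c$ is spatially replicated and concatenated with the input, each replicated channel $\tilde{c}_j$ is a constant map with value $c_j$, so for output channel $k$
\[
y_k = \underbrace{\sum_i w_{k,i} \ast x_i + b_k}_{h_k} + \sum_{j=1}^{n} w'_{k,j} \ast \tilde{c}_j
    = h_k + \underbrace{\sum_{j=1}^{n} c_j \sum_{u,v} w'_{k,j}(u,v)}_{\beta_k(c)} .
\]
The latent code therefore only adds a per-channel constant $\beta_k(c)$, which shifts the mean of the feature map. Writing $\mu(\cdot)$ and $\sigma(\cdot)$ for the spatial mean and standard deviation of a single channel of a single sample, and omitting the affine parameters and the stabilizing $\epsilon$, instance normalization gives
\[
\mathrm{IN}(y_k) = \frac{h_k + \beta_k(c) - \mu(h_k) - \beta_k(c)}{\sigma(h_k)} = \mathrm{IN}(h_k),
\]
so the shift, and with it the effect of $c$, cancels. One way to write the remedy described in the abstract is to add a code-dependent bias after the offset has been removed,
\[
\mathrm{CBN}(h_k; c) = \frac{h_k - \mu(h_k)}{\sigma(h_k)} + g_k(c),
\]
where $g_k$ is a hypothetical learned mapping from the latent code to a per-channel bias; its exact form is not specified in this section and is left open here.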

\includegraphics[height=3.6cm]{domain_modal}

Figure \thefigure: An illustration of multi-mapping indicated by latent code. (a) Multi-domain translation indicated by limited domain latent code. (b) Cross-domain translation indicated by potential attribute latent code.