Awesome Typography: Statistics-Based Text Effects Transfer
In this work, we explore the problem of generating special text effects for typography. It is quite challenging due to the diversity of text effects across different characters, which is hard to model uniformly. To address this issue, our key idea is to exploit analytics on the high regularity of the spatial distribution of text effects to guide the synthesis process. Specifically, we characterize stylized patches by their normalized positions and by the optimal scales that depict their style elements. Our method first estimates these two features and derives their correlation statistically. They are then converted into soft constraints for texture transfer, which accomplish adaptive multi-scale texture synthesis and make the style element distribution uniform. This allows our algorithm to produce artistic typography that matches both the local texture patterns and the global spatial distribution of the example. Experimental results demonstrate the superiority of our method over conventional style transfer methods on various text effects. In addition, we validate the effectiveness of our algorithm through extensive artistic typography library generation.
Typography is the art of designing special text effects that render characters into original and unique artworks. These text styles range from basic effects such as shadows, outlines and colors to sophisticated effects such as burning flames, flowing smoke and multicolored neon, as shown in Fig. 1. Texts decorated with well-designed special effects become much more attractive and better convey the designer's thoughts and emotions. The beauty and elegance of text effects are widely appreciated, making them popular in publishing and advertising. However, creating vivid text effects requires a series of subtle operations by an experienced designer using editing software: determining color styles, warping textures to match text shapes, adjusting transparency for visual plausibility, etc. These advanced editing skills are far beyond the abilities of most casual users. This practical need motivates our work: we investigate an approach to automatically transfer various text effects designed by artists onto raw plain text, as shown in Fig. 1.
Text effects transfer is a brand-new sub-topic of style transfer. Style transfer relates to color transfer and texture transfer, respectively. Color transfer matches the global or local color distributions of the target and source images. Texture transfer relies on texture synthesis technologies, where texture generation is constrained by guidance images. Texture synthesis itself can be divided into two categories: non-parametric methods [8, 7, 16, 36] and parametric methods [15, 10, 18, 12]. The former generates new textures by resampling pixels or patches from the original texture, while the latter models textures with statistical measurements and produces a new texture that matches the statistics of the original one.
From a technical perspective, directly applying traditional style transfer methods to generate new text effects is quite challenging and impractical. The challenges lie in three aspects: (i) The extreme diversity of text effects and character shapes: the style diversity makes the transfer task difficult to model uniformly, and the algorithm must further be robust to the tremendous variety of characters. (ii) The complicated composition of style elements: a text effects image often contains multiple intertwined style elements (we call them text sub-effects) that have very different textures and structures (see the denim fabric example in Fig. 1) and need specialized treatment. (iii) The simplicity of the guidance images: a raw plain text image gives few hints on how to place the different sub-effects. Textures in the white text and black background regions may not be stationary, which makes the traditional non-parametric texture-by-numbers method fail, since it assumes textures to be stationary in each region of the guidance map. Meanwhile, the plain text image provides little semantic information, so the recent successful parametric deep-based style transfer methods [11, 18] lose their advantage of representing high-level semantics. For these reasons, conventional style transfer methods for general styles perform poorly on text effects.
In this paper, we propose a novel text effects transfer algorithm to address these challenges. The key idea is to analyze and model the distance-based essential characteristics of high-quality text effects and to leverage them to guide the synthesis process. These characteristics are summarized into a general prior based on analytics over dozens of well-designed text effects. This prior guides our style transfer process to synthesize different sub-effects adaptively and to simulate their spatial distribution. All measurements are carefully designed to be robust to character shape. In addition, we consider a psycho-visual factor to enhance image naturalness. In summary, our contributions are threefold:
We raise a brand-new topic, text effects transfer, that turns plain text into fantastic artwork and enjoys wide application scenarios such as picture creation on social networks and commercial graphic design.
We investigate and analyze well-designed typography and summarize the key distance-based characteristics of high-quality text effects. We model these characteristics mathematically to form a general prior that significantly improves the style transfer process for text.
We propose the first method to generate compelling text effects that share both the local texture patterns and the global spatial distribution of the source example, while preserving image naturalness.
2 Related Work
Color Transfer. Pioneering color transfer methods [28, 26] transfer color between images by matching their global color distributions. Subsequently, local color transfer was achieved based on segmentation [32, 33] or user interaction, and was further improved using fine-grained patch or pixel [30, 25] correspondences. Recently, color transfer and colorization [17, 38] using deep neural networks have drawn much attention.
Non-Parametric Texture Synthesis and Transfer. Efros and Leung  proposed a pioneering pixel-by-pixel synthesis approach based on sampling similar patches. Subsequent works improved its quality and speed by synthesizing patches rather than pixels. To handle the overlapping regions of neighboring patches seamlessly, Liang et al.  proposed to blend patches, and Efros and Freeman  used dynamic programming to find an optimal seam in the overlapping regions, which was further improved via graph cut . Unlike previous methods that synthesize textures in a local manner, recent techniques synthesize globally using objective functions. A coherence-based function  was proposed to synthesize textures in an iterative coarse-to-fine fashion; this method alternates patch matching and voting operations and achieves good local structures. It was then extended to non-stationary textures through geometric and photometric patch transformations [2, 6].
Texture transfer, also known as Image Analogies , generates textures while keeping the structure of the target image. Structures are usually preserved by penalizing differences between the source and target guidance maps [13, 24]. In , texture boundaries are synthesized first to constrain the structure. Frigo et al.  proposed an adaptive patch partition to precisely capture source textures and preserve target structures, followed by a Markov Random Field (MRF) function for global texture synthesis.
Parametric Texture Synthesis and Transfer. The idea of modeling textures with statistical measurements led to the development of textons and their variants [15, 27]. Nowadays, deep-based texture synthesis  is trending thanks to the great descriptive ability of deep neural networks. Gatys et al. proposed to use the Gram matrix in the Convolutional Neural Network (CNN) feature space to represent textures  and adapted it to style transfer by incorporating content similarities . This remarkable generic painting transfer technique attracted many follow-ups on loss function improvement [21, 29] and algorithm acceleration [14, 34]. Recently, methods that replace the Gram matrix with an MRF regularizer have been proposed for photographic synthesis  and semantic texture transfer . Meanwhile, Generative Adversarial Networks (GANs)  provide another route to texture generation via discriminator and generator networks, which iteratively improve by playing a minimax game. Their extension, conditional GANs , fulfills the challenging task of generating images from abstract semantic labels. Li and Wand  further showed that their Markovian GANs have certain advantages over Gram-matrix-based methods [11, 34] in coherent texture preservation.
3 Proposed Method
In this section, we first formulate our text effects transfer problem. A visual analysis is then presented of our observation that patch patterns (i.e., color and scale) are highly correlated with their spatial distributions in text effects images (Sec. 3.1). Based on this observation, we extract text effects statistics from the source images (Sec. 3.2) and employ them to adapt the texture synthesis algorithm for high-quality text effects transfer (Sec. 3.3).
3.1 Problem Formulation and Analysis
Text effects transfer takes as input a set of three images, the source raw text image S, the source stylized image S′ and the target raw text image T, and then automatically produces the target stylized image T′ with the text effects such that S : S′ :: T : T′.
It is a quite challenging task to transfer arbitrary text effects automatically, due to the variety of text effects, the complex composition of sub-effects and the simplicity of the guidance maps. To address this problem, we investigate preferable text effects from two aspects: (i) how to determine the essential characteristics of text effects and (ii) how to characterize them mathematically. We start with a basic observation: the patterns of patches are highly dominated by their locations. We choose to represent the pattern of a patch by two factors: the pixel color and the optimal patch scale. As intuitively shown in Figs. 2(a)-(d), patches at similar locations (marked with the same color) tend to have similar patterns.
To quantitatively evaluate the locations of patches, we divide a text effects image into classes, namely partitions. Possible partition modes are extremely diverse and it is impractical to compare all of them; in this work, we compare five typical ones: (i) random: all pixels are randomly divided into equal partitions; (ii) grid: partitions are evenly distributed according to the horizontal and vertical coordinates of pixels in the image; (iii) angle: partitions are evenly distributed according to the angular coordinate, with the pole of the polar coordinate system at the geometric center of the image; (iv) ring: partitions are evenly distributed according to the radial coordinate, with the same pole; and (v) distance: partitions are evenly distributed according to the geometric distance of pixels (the distance calculation is given in Sec. 3.2.2) to the skeleton of the text. In Fig. 2(e), the grid, angle, ring and distance partition modes are illustrated, with each partition tinted differently.
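For concreteness, the distance mode can be sketched as follows; reading "evenly distributed" as equal-count quantile bins over the distance values, as well as the function name, are our assumptions for illustration, not the authors' code:

```python
import numpy as np

def distance_partition(distances, k):
    """Divide pixels into k partitions by their distance to the text
    skeleton; 'evenly distributed' is taken here as equal-count
    quantile bins over the distance values (an assumption)."""
    ranks = np.argsort(np.argsort(distances))  # rank of each pixel's distance
    return (ranks * k) // len(distances)       # k equal-count bins

# four pixels, two partitions: the two smallest distances form partition 0
parts = distance_partition(np.array([0.1, 0.5, 0.2, 0.9]), 2)
```

The other four modes differ only in the quantity being binned (random index, image coordinates, angular or radial coordinate).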
Then, for each partition mode, we investigate the relationship between the partitions and the distributions of the corresponding patterns. For the factor of color, we measure the reliability of a partition mode by its classification accuracy, c = 1 − ε, where ε is the training error (empirical risk) obtained by training an SVM  to classify pixels into partitions given their colors. We tested on text effects images created by designers to obtain the color classification reliability of each mode. The average reliabilities are shown in Table 1, where only the relative values are instructive for our design. From this table, distance is demonstrated to be the most reliable factor for depicting pixel colors. In Fig. 2(g), pixels of the flame image are tinted in RGB space according to their distance partition (see the top left image of Fig. 2(e)). We note that points with the same class color appear in the same neighborhood, which intuitively shows that color and distance are highly correlated in text effects.
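A light-weight sketch of this reliability measure follows. The paper trains an SVM; the nearest-centroid classifier below is a simplified stand-in (an assumption), keeping the definition reliability = 1 − training error:

```python
import numpy as np

def color_reliability(colors, partitions):
    """Classification accuracy (1 - empirical risk) of predicting a
    pixel's partition from its color; a nearest-centroid classifier
    stands in for the SVM used in the paper."""
    labels = np.unique(partitions)
    centroids = np.stack([colors[partitions == k].mean(axis=0) for k in labels])
    # predict each pixel's partition from its color alone
    d2 = ((colors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = labels[d2.argmin(axis=1)]
    return (pred == partitions).mean()

# toy data: colors perfectly separated by partition -> reliability 1.0
rng = np.random.default_rng(0)
colors = np.vstack([rng.normal(0.9, 0.01, (50, 3)),    # bright, near skeleton
                    rng.normal(0.1, 0.01, (50, 3))])   # dark background
partitions = np.array([0] * 50 + [1] * 50)
rel = color_reliability(colors, partitions)
```

A partition mode whose pixels have distinctive colors per partition scores near 1; the random mode scores near chance level.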
Distance is also important in characterizing the scale of patterns. First, for each patch size, we calculate the average patch difference between all patches in a partition and their best matches in the same image, which forms a response curve over scales. Then, for all partitions under the same partition mode, we obtain a family of response curves that shows the impact of scale. Two example families of response curves for the denim fabric image are shown in Figs. 2(h) and (i), where each point shows the mean and standard deviation of patch differences for one partition and scale. To compare the reliability of the partition modes, two terms are used: (i) the inter-curve standard deviation σ_inter: the average over scales of the standard deviation of the mean responses across partitions; and (ii) the intra-curve standard deviation σ_intra: the average of the point-wise standard deviations over all scales and partitions. A higher σ_inter implies that sub-effects are easier to distinguish by their locations, while a lower σ_intra implies that patches in the same partition react uniformly to scale changes and likely share a common optimal scale. Therefore, we evaluate the reliability by the ratio σ_inter / σ_intra.
The reliabilities of all five partition modes are given in Table 1, where the distance factor achieves the highest score for characterizing patch scales.
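A sketch of this scale-reliability measure, assuming the ratio form implied by the two definitions (array layout and names are ours):

```python
import numpy as np

def scale_reliability(responses):
    """responses[k, s, i]: patch difference of sample i of partition k
    at scale s. sigma_inter measures how separable the per-partition
    mean response curves are; sigma_intra measures how uniformly one
    partition reacts to a given scale."""
    means = responses.mean(axis=2)               # (K, S) mean response curves
    sigma_inter = means.std(axis=0).mean()       # spread between the curves
    sigma_intra = responses.std(axis=2).mean()   # spread within each point
    return sigma_inter / sigma_intra

# two partitions with well-separated curves and small point-wise spread
responses = np.zeros((2, 3, 4))
responses[0] = [0.9, 1.1, 0.9, 1.1]   # partition 0 fluctuates around 1.0
responses[1] = [2.9, 3.1, 2.9, 3.1]   # partition 1 fluctuates around 3.0
rel = scale_reliability(responses)
```

Well-separated, internally consistent partitions give a high ratio; indistinguishable or noisy partitions give a low one.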
In conclusion, there are high correlations between patch patterns (i.e., color and scale) and their distances to the text skeleton. These correlations are reasonable essential characteristics of high-quality text effects.
3.2 Text Effects Statistics Estimation
We now convert the aforementioned analysis into patch statistics that can be directly used as transfer guidance. For our patch-based algorithm, in the following we use p and q to denote pixels in the source images S, S′ and the target images T, T′, respectively, and use P(p) and Q(q) to represent the patches centered at p in S and at q in T, respectively. The same goes for the patches P′(p) and Q′(q) in S′ and T′.
3.2.1 Optimal Patch Scale Detection
Inspired by , we propose a simple yet effective approach to detect the optimal patch scale for depicting the texture pattern around each pixel. Given a predefined downsample factor, we start from the coarsest scale to filter source patches and let the screened patches pass to the next finer scale.
We use a fixed patch size and resize the image to realize multiple scales. Let the source stylized image downsampled to scale l and the patch centered at p in it be denoted accordingly, with the raw text image treated in the same way. Given the best correspondence of p at scale l, i.e., the downsampled source patch most similar to the patch at p, our filter criterion at scale l compares their matching error against a threshold. Patches that satisfy the filter criterion pass through to the finer scale, while the filter residues record the current scale as their optimal scale. The optimal patch scale detection is summarized in Algorithm 1. An example of the optimal scales for the flame image is shown in Fig. 3(a): the textured region near the character requires finer patch scales than the outer flat region. For better visualization, Fig. 3(b) shows the optimal scale of each patch by resizing the patch according to its scale.
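The following 1-D sketch only illustrates the coarse-to-fine control flow of Algorithm 1; the stopping test used here (how much detail downsampling discards) and all names are our assumptions, since the exact filter criterion is defined on best-match patch differences:

```python
import numpy as np

def optimal_scale(patch, scales=(4, 2, 1), tol=1e-3):
    """Walk from the coarsest scale to the finest; the first scale whose
    downsample-upsample round trip loses little detail is recorded as
    the optimal one. Textured patches fall through to finer scales
    while flat patches stop early, matching the Fig. 3 observation."""
    for s in scales:                                  # coarsest -> finest
        coarse = patch.reshape(-1, s).mean(axis=1)    # downsample by s
        recon = np.repeat(coarse, s)                  # upsample back
        if np.mean((patch - recon) ** 2) <= tol:
            return s
    return scales[-1]

flat = np.full(8, 0.5)                                 # flat background
textured = np.array([0., 1., 0., 1., 0., 1., 0., 1.])  # fine texture
```

On this toy input, the flat region stops at the coarsest scale while the oscillating texture requires the finest one.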
3.2.2 Robust Normalized Distance Estimation
Here we first define some concepts. In the text image, the text region is the set of text pixels, and the skeleton is a medial path within the text region. For each pixel, we consider its distance to the nearest pixel of a given set; the quantity we are going to calculate is the pixel's distance to the skeleton. For a pixel on the text contour, this distance is also known as the text width or radius. Fig. 4(b) gives a visual interpretation.
We extract the skeleton from the text region using morphology operations. To make the distance invariant to the text width, we aim to normalize it so that the normalized text width is constant. Simply dividing the distance by the measured text width is unreliable, because inaccuracies in the extracted skeleton introduce errors into both the numerator and the denominator. To address this issue, we estimate a corrected text width statistically and use it to derive the normalized distance.
Specifically, we sort the text widths of all contour pixels and obtain their rankings. We observe that the relation between width and ranking is well modeled by linear regression, as shown in Fig. 4(d). From Figs. 4(b) and (d), we discover that the outliers cluster at small values. We therefore treat the leftmost points, identified with the help of the fitted regression coefficients, as outliers and eliminate them, and average the remaining widths into the corrected text width. Finally, the normalized distance of a pixel is obtained by rescaling its skeleton distance with the corrected text width at its nearest contour pixel and the mean text width.
For simplicity, we refer to this normalized distance simply as the distance in the following.
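The statistical width correction can be sketched as below; the fixed outlier fraction stands in for the regression-derived cut-off of the omitted equation (an assumption for illustration):

```python
import numpy as np

def corrected_mean_width(radii, outlier_frac=0.1):
    """Sort the per-contour-pixel text widths, fit width vs. rank with a
    line (the relation the paper observes in Fig. 4(d)), discard the
    leftmost fraction as outliers, and average the rest to obtain the
    corrected text width used for distance normalization."""
    r = np.sort(np.asarray(radii, dtype=float))
    ranks = np.arange(len(r))
    slope, intercept = np.polyfit(ranks, r, 1)  # linear model: r ~ slope*rank + b
    keep = r[int(outlier_frac * len(r)):]       # drop leftmost outliers
    return keep.mean(), slope

# two spurious near-zero widths caused by an inaccurate skeleton
mean_w, slope = corrected_mean_width([0.1, 0.2] + [5.0] * 18)
```

The two spurious widths no longer bias the mean, so dividing distances by the corrected width is stable.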
3.2.3 Optimal Scale Posterior Probability Estimation
In this section, we derive the posterior probability of the optimal patch scale to model the aforementioned high correlation between patch patterns and their spatial distributions.
We uniformly quantize all distances into bins and denote by bin(q) the bin that the distance of pixel q falls into. Then, a 2-d histogram over distance bins and optimal scales is computed:

hist(d, l) = Σ_q 1(bin(q) = d) · 1(l*(q) = l),

where 1(·) is 1 when its argument is true and 0 otherwise, and l*(q) is the optimal scale at q. The joint probability of the distance and the optimal scale can then be estimated as

P(d, l) = hist(d, l) / Σ_{d′, l′} hist(d′, l′).

Finally, the posterior probability of l being the appropriate scale to depict patches whose distances fall into bin d can be deduced:

P(l | d) = P(d, l) / Σ_{l′} P(d, l′).
We assume that the target images share the same posterior probability as the source image, and we will use this probability to select patch scales statistically during texture synthesis, adapting to widely varying text effects.
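The estimation of this section reduces to a few lines; array names are assumptions, and the histogram and posterior follow the description above:

```python
import numpy as np

def scale_posterior(dist_bins, opt_scales, n_bins, scales):
    """Accumulate the 2-d histogram over (distance bin, optimal scale),
    normalize it into a joint probability, and divide by the distance
    marginal to obtain the posterior P(scale | distance bin)."""
    hist = np.zeros((n_bins, len(scales)))
    scale_idx = {s: i for i, s in enumerate(scales)}
    for d, s in zip(dist_bins, opt_scales):
        hist[d, scale_idx[s]] += 1.0
    joint = hist / hist.sum()                    # joint P(d, l)
    marginal = joint.sum(axis=1, keepdims=True)  # P(d)
    return np.divide(joint, marginal, out=np.zeros_like(joint),
                     where=marginal > 0)

# five pixels: distance bin 0 mostly prefers scale 1, bin 1 always scale 2
post = scale_posterior([0, 0, 0, 1, 1], [1, 1, 2, 2, 2], n_bins=2, scales=(1, 2))
```

Each row of the result is the scale distribution conditioned on one distance bin, which the synthesis stage samples from.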
3.3 Text Effect Transfer
In this section, we describe how we adapt a conventional texture synthesis method to deal with challenging text effects. We build on the texture synthesis method of Wexler et al.  and its variants  using random search and propagation as in PatchMatch [1, 2]. We refer to these papers for details of the base algorithm.
We apply character shape constraints to the patch appearance measurement to build our baseline, and further incorporate the estimated text effects statistics to accomplish adaptive multi-scale style transfer (Sec. 3.3.2). Then a distribution term is introduced to adjust the spatial distribution of the text sub-effects (Sec. 3.3.3). Finally, we propose a psycho-visual term that prevents texture over-repetitiveness for naturalness (Sec. 3.3.4).
3.3.1 Objective Function
We augment the texture synthesis objective function in  with a distribution term and a psycho-visual term, so that our objective function takes the form

min Σ_q ( E_app(q, p) + λ_dist E_dist(q, p) + λ_psy E_psy(q, p) ),

where q is the center position of a target patch in T and T′, and p is the center position of the corresponding source patch in S and S′. The three terms E_app, E_dist and E_psy are the appearance, distribution and psycho-visual terms, respectively, weighted by λ_dist and λ_psy to together make up the patch distance.
3.3.2 Appearance Term: Texture Style Transfer
The original texture synthesis algorithm of Wexler et al.  minimizes the Sum of Squared Differences (SSD) between two patches sampled from the stylized image pair. We adapt it to texture transfer by adding the SSD between the two corresponding patches sampled from the raw text image pair, weighted by a factor that trades the color difference against the character shape difference. We take the objective function that minimizes only this appearance term in Eq. (11) as our baseline.
Stylized texts often contain multiple sub-effects with different optimal representation scales. Thus, on top of the baseline, we propose an adaptive scale-aware patch distance by incorporating the estimated posterior probability. The posterior probability helps to explore patches at multiple appropriate scales for better texture synthesis.
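A sketch of the two appearance costs follows; weighting the per-scale costs by the posterior is our reading of the omitted equation, and all parameter values are assumptions:

```python
import numpy as np

def appearance_term(Qs, Ps, Qt, Pt, lam=0.5):
    """SSD between stylized patches (Qs, Ps) plus a weighted SSD between
    raw-text guidance patches (Qt, Pt); lam trades color difference
    against character-shape difference (its value here is assumed)."""
    return float(((Qs - Ps) ** 2).sum() + lam * ((Qt - Pt) ** 2).sum())

def scale_aware_appearance(patch_pairs_by_scale, posterior_row):
    """Adaptive variant: appearance costs computed at each candidate
    scale, weighted by the posterior P(scale | distance) of the patch."""
    costs = [appearance_term(*pp) for pp in patch_pairs_by_scale]
    return float(np.dot(posterior_row, costs))

# identical stylized patches, guidance patches differing by 1
pair = (np.array([1.0, 2.0]), np.array([1.0, 2.0]),
        np.array([0.0]), np.array([1.0]))
base = appearance_term(*pair)                       # 0 + 0.5 * 1
multi = scale_aware_appearance([pair, pair], np.array([0.25, 0.75]))
```

The multi-scale cost degenerates to the baseline when the posterior puts all mass on a single scale.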
3.3.3 Distribution Term: Spatial Style Transfer
The distribution of sub-effects correlates highly with their distance to the text skeleton. Based on this prior, we introduce a distribution term that penalizes the difference between the skeleton distances of the matched target and source patches. It encourages the text effects of the target to share a distribution similar to that of the source image, thereby realizing spatial style transfer. To make the cost invariant to the image scale, it is normalized by a scale-dependent denominator.
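A minimal sketch of this cost under our assumptions: a squared difference of the matched patches' skeleton distances, divided by the squared mean text width so that uniformly rescaling the image leaves the cost unchanged (the exact expression is omitted in the text, so this form is a guess consistent with the surrounding description):

```python
def distribution_term(dist_q, dist_p, mean_width):
    """Penalize mismatched skeleton distances between the target patch
    (dist_q) and its matched source patch (dist_p). Dividing by
    mean_width**2 makes the cost invariant to a uniform rescaling of
    the image, since distances and widths scale together."""
    return (dist_q - dist_p) ** 2 / mean_width ** 2

base = distribution_term(3.0, 1.0, 2.0)      # (3 - 1)^2 / 2^2
scaled = distribution_term(6.0, 2.0, 4.0)    # same image, uniformly doubled
```

Doubling every distance and the text width leaves the cost unchanged, which is exactly the invariance the denominator provides.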
3.3.4 Psycho-Visual Term: Naturalness Preservation
Texture over-repetitiveness can seriously hurt human subjective evaluation of aesthetics. Therefore, we penalize the repeated selection of the same source patches.
Let Λ(p) be the set of target pixels that currently take the source pixel p as their correspondence, and let n(p) be the size of this set. We define the psycho-visual term of a target patch as being proportional to n(p) for its matched source patch. From the perspective of the source image, we can better understand this repetitiveness penalty: summed over all target patches, the total cost is Σ_p n(p)². Since Σ_p n(p) equals the number of target patches and is constant, Eq. (15) reaches its minimum when all n(p) are equal. Our psycho-visual term thus encourages source patches to be used evenly.
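The counting argument above can be verified with a short sketch (names assumed):

```python
from collections import Counter

def psycho_visual_costs(correspondences):
    """n(p) counts how many target pixels currently use source pixel p.
    Charging each target patch a cost of n(p) makes the total equal to
    sum_p n(p)^2, which, for a fixed number of target patches, is
    minimized when every source patch is used equally often."""
    n = Counter(correspondences)
    per_target = [n[p] for p in correspondences]   # cost paid by each target
    total = sum(c * c for c in n.values())         # == sum(per_target)
    return per_target, total

# uneven reuse of source patches costs more than even reuse
_, uneven = psycho_visual_costs(['a', 'a', 'a', 'b'])  # n = {a: 3, b: 1}
_, even = psycho_visual_costs(['a', 'a', 'b', 'b'])    # n = {a: 2, b: 2}
```

With four target patches, 3² + 1² = 10 exceeds 2² + 2² = 8, so the penalty indeed favors even usage.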
3.3.5 Function Optimization
We follow the iterative coarse-to-fine matching and voting steps as in . In the matching step, the PatchMatch algorithm [1, 2] is adopted, and the patch usage counts for the psycho-visual term are updated after each iteration of search and propagation. Meanwhile, the initialization of the target stylized image plays an important role in the final results, since our guidance map provides very few constraints on textures. We vote the source patches that are searched to minimize only Eq. (13) to form our initial guess. This simple strategy improves the final results significantly, as shown in Fig. 6.
Appearance Term. The advantages of the proposed appearance term lie in two aspects: (i) preserving coarse-grained texture structures and (ii) preserving texture details. Figs. 5(a) and (b) show the denim fabric style generated using a single small and a single large patch scale, respectively. Small patches capture very limited contextual information and thus cannot guarantee structural continuity: in Fig. 5(a), the sewing threads look cracked and do not follow uniform directions. Large patches, however, smooth out tiny thread residues, as in Fig. 5(b). These problems are well solved by jointly using patches over multiple scales, as in Fig. 5(c), where the overall shape is preserved and details like the sewing threads look more vivid.
Distribution Term. The distribution term ensures that the sub-effects in the target image and the source example are similarly distributed, which underlies our assumption in Sec. 3.2.3 that the source and target share the same posterior probabilities. Fig. 6 shows the effect of the distribution term on the flame style. Without the distribution constraint, flames appear randomly in the black background; with it, the flames are adjusted to better match the spatial distribution of the source example.
Psycho-Visual Term. The effects of our psycho-visual term are shown in Fig. 7. The lava textures synthesized without the psycho-visual penalty (Fig. 7(a)) densely repeat the red cracks in several regions, which causes obvious unnaturalness. By increasing the penalty, the reuse of the same source textures is greatly restrained (Fig. 7(b)), and our method tends to flexibly combine different source patches to create brand-new textures (Fig. 7(c)). Thus, the psycho-visual term effectively penalizes texture over-repetitiveness and encourages new texture creation.
Combination of the Three Terms. It is worth noting that the three proposed terms are complementary. First, the appearance and distribution terms emphasize local texture patterns and the global distribution of text sub-effects, respectively; the former depicts low-level color features while the latter exploits complementary mid-level position features. Second, the appearance and distribution terms jointly evaluate objective patch similarities, while the psycho-visual term complements them by incorporating subjective aesthetic evaluation.
5 Experimental Results
In the experiments, we use a fixed patch size and maximum scale, and build an image pyramid with a fixed coarsest size. At each pyramid level, joint patches over a range of scales are used. The weights balancing the appearance, distribution and psycho-visual terms, as well as the parameter of the filter criterion, are fixed across all results. In addition to the examples in this paper, all of our results and comparisons are included in the supplementary material.
In Fig. 8, we compare our algorithm with three state-of-the-art style transfer techniques as well as our baseline. The first method is the pioneering Image Analogies . The textures in its results repeat locally and look disordered globally, with evident patch boundaries. The second method is our implementation of Split and Match , which synthesizes textures using adaptive patch sizes. The original method directly transfers the style without the help of a guidance map; for a fair comparison, we incorporate the raw text guidance into its split stage. This method fails to generate textures in the background and produces plain stylized results. The third method, Neural Doodle , is based on the combination of MRF and CNN  and incorporates semantic maps for analogy guidance. While the color palette of the example text effects is transferred, fine textures are poorly synthesized and the text shape is lost. Our baseline transfers fine textures but fails to keep the overall sub-effects distribution and generates artifacts in the background. In comparison, the proposed method outperforms the state of the art, preserving both local textures and the global sub-effects distribution.
In Fig. 9, we illustrate style transfer from six very different text effects to three representative characters (Chinese, alphabetic, handwriting). This experiment covers challenging transformations between styles, languages and fonts. Thanks to the distance normalization and the multi-scale strategy, our algorithm succeeds in transferring the text effects regardless of character shapes and texture scales, providing a solid tool for artistic typography.
Finally, we show our flame typography library covering frequently used Chinese characters. Due to space limitations, only an initial subset is presented in Fig. 10. The whole library, as well as the other typography libraries, is included in our supplementary material. These extensive synthesis results demonstrate the robustness of our method to varied character shapes.
In this paper, we raised the text effects transfer problem and proposed a novel statistics-based method to solve it. We convert the high correlation between sub-effects patterns and their spatial distribution relative to the text skeleton into soft constraints for text effects generation. An objective function with three complementary terms jointly considers local multi-scale texture, the global sub-effects distribution and visual naturalness. We validated the effectiveness and robustness of our method through comparisons with state-of-the-art style transfer algorithms and extensive artistic typography generation. Future work will concentrate on compositing stylized text with background photos.
-  C. Barnes, E. Shechtman, A. Finkelstein, and D. B. Goldman. Patchmatch: a randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics, 28(3):341–352, August 2009.
-  C. Barnes, E. Shechtman, D. B. Goldman, and A. Finkelstein. The generalized patchmatch correspondence algorithm. In Proc. European Conf. Computer Vision, pages 29–43, 2010.
-  C. Barnes, F.-L. Zhang, L. Lou, X. Wu, and S.-M. Hu. Patchtable: Efficient patch queries for large datasets and applications. In ACM Transactions on Graphics, 2015.
-  A. J. Champandard. Semantic style transfer and turning two-bit doodles into fine artworks. 2016. arXiv preprint; https://arxiv.org/abs/1603.01768.
-  C. C. Chang and C. J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
-  S. Darabi, E. Shechtman, C. Barnes, D. B. Goldman, and P. Sen. Image melding: combining inconsistent images using patch-based synthesis. ACM Transactions on Graphics, 31(4):82:1–82:10, July 2012.
-  A. A. Efros and W. T. Freeman. Image quilting for texture synthesis and transfer. In Proc. ACM Conf. Computer Graphics and Interactive Techniques, pages 341–346, 2001.
-  A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In Proc. IEEE Int’l Conf. Computer Vision, 1999.
-  O. Frigo, N. Sabater, J. Delon, and P. Hellier. Split and match: example-based adaptive patch sampling for unsupervised style transfer. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2016.
-  L. A. Gatys, A. S. Ecker, and M. Bethge. Texture synthesis using convolutional neural networks. In Advances in Neural Information Processing Systems, 2015.
-  L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2016.
-  I. Goodfellow, J. Pougetabadie, M. Mirza, B. Xu, D. Wardefarley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
-  A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin. Image analogies. In Proc. Conf. Computer Graphics and Interactive Techniques, pages 327–340, 2001.
-  J. Johnson, A. Alahi, and F. F. Li. Perceptual losses for real-time style transfer and super-resolution. In Proc. European Conf. Computer Vision, 2016.
-  B. Julesz and J. R. Bergen. Textons, the fundamental elements in preattentive vision and perception of textures. Bell Labs Technical Journal, 62(6):243–256, 1983.
-  V. Kwatra, A. Schödl, I. Essa, G. Turk, and A. Bobick. Graphcut textures: image and video synthesis using graph cuts. ACM Transactions on Graphics, 22(3):277–286, 2003.
-  G. Larsson, M. Maire, and G. Shakhnarovich. Learning representations for automatic colorization. In Proc. European Conf. Computer Vision, 2016.
-  C. Li and M. Wand. Combining markov random fields and convolutional neural networks for image synthesis. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2016.
-  C. Li and M. Wand. Precomputed real-time texture synthesis with markovian generative adversarial networks. In Proc. European Conf. Computer Vision, 2016.
-  L. Liang, C. Liu, Y. Xu, B. Guo, and H. Shum. Real-time texture synthesis by patch-based sampling. ACM Transactions on Graphics, 20(3):127–150, 2001.
-  T. Lin and S. Maji. Visualizing and understanding deep texture representations. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2016.
-  M. Lukáč, J. Fišer, J. C. Bazin, O. Jamriška, A. Sorkine-Hornung, and D. Sýkora. Painting by feature: texture boundaries for example-based image creation. ACM Transactions on Graphics, 32(4):96–96, 2013.
-  M. Mirza and S. Osindero. Conditional generative adversarial nets. 2014. arXiv preprint; https://arxiv.org/abs/1411.1784.
-  F. Okura, K. Vanhoey, A. Bousseau, A. A. Efros, and G. Drettakis. Unifying color and texture transfer for predictive appearance manipulation. Computer Graphics Forum, 34(4):53–63, 2015.
-  J. Park, Y. Tai, S. N. Sinha, and I. S. Kweon. Efficient and robust color consistency for community photo collections. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2016.
-  F. Pitié, A. C. Kokaram, and R. Dahyot. Automated colour grading using colour distribution transfer. Computer Vision and Image Understanding, 107(1):123–137, 2007.
-  J. Portilla and E. P. Simoncelli. A parametric texture model based on joint statistics of complex wavelet coefficients. Int’l Journal of Computer Vision, 40(1):49–70, 2000.
-  E. Reinhard, M. Ashikhmin, B. Gooch, and P. Shirley. Color transfer between images. IEEE Computer Graphics and Applications, 21(5):34–41, 2001.
-  A. Selim, M. Elgharib, and L. Doyle. Painting style transfer for head portraits using convolutional neural networks. ACM Transactions on Graphics, 35(4):1–18, 2016.
-  Y. Shih, S. Paris, C. Barnes, W. T. Freeman, and F. Durand. Style transfer for headshot portraits. ACM Transactions on Graphics, 33(4):1–14, 2014.
-  Y. Shih, S. Paris, F. Durand, and W. T. Freeman. Data-driven hallucination of different times of day from a single outdoor photo. ACM Transactions on Graphics, 32(6):2504–2507, 2013.
-  Y. W. Tai, J. Jia, and C. K. Tang. Local color transfer via probabilistic segmentation by expectation-maximization. In Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, pages 747–754, 2005.
-  Y. W. Tai, J. Jia, and C. K. Tang. Soft color segmentation and its applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9):1520–1537, 2007.
-  D. Ulyanov, V. Lebedev, A. Vedaldi, and V. Lempitsky. Texture networks: feed-forward synthesis of textures and stylized images. In Proc. Int'l Conf. Machine Learning, 2016.
-  T. Welsh, M. Ashikhmin, and K. Mueller. Transferring color to greyscale images. ACM Transactions on Graphics, 21(3):277–280, 2002.
-  Y. Wexler, E. Shechtman, and M. Irani. Space-time completion of video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(3):463–476, March 2007.
-  Z. Yan, H. Zhang, B. Wang, S. Paris, and Y. Yu. Automatic photo adjustment using deep neural networks. ACM Transactions on Graphics, 35(1), 2016.
-  R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization. In Proc. European Conf. Computer Vision, 2016.
As supplementary material of our paper, we present the following contents:
Appendix A Correlations Between Patch Patterns and Different Modes
01: http://www.zcool.com.cn/work/ZMTg1MTgzMDA=.html 02: http://www.zcool.com.cn/work/ZMTc4MTM5MDQ=.html
03: http://www.zcool.com.cn/work/ZMTc1MzI4MzI=.html 04: http://www.zcool.com.cn/work/ZMTc0NDIxMDg=.html
05: http://www.zcool.com.cn/work/ZMTcwNjEwMTI=.html 06: http://www.zcool.com.cn/work/ZNTE3MTAxMg==.html
07: http://www.zcool.com.cn/work/ZMTU0Mzk3ODg=.html 08: http://www.zcool.com.cn/work/ZMTUxMDc0MDQ=.html
09: http://www.zcool.com.cn/work/ZMTQ0MjY4OTI=.html 10: http://www.zcool.com.cn/work/ZMTQ0MjU3NDA=.html
11: http://www.zcool.com.cn/work/ZNjExMzA0NA==.html 12: http://www.zcool.com.cn/work/ZNjg1MTg1Ng==.html
15: http://www.zcool.com.cn/work/ZMTg5NDk0NDg=.html 16: http://www.zcool.com.cn/work/ZMTMwMzM5NzI=.html
17: http://www.68ps.com/jc/big_ps_wz.asp?id=3773 18: http://www.zcool.com.cn/work/ZMTI5MDczMjQ=.html
21: http://www.zcool.com.cn/work/ZNTE3MTAxMg==.html 22: http://www.zcool.com.cn/work/ZNTEwNDA5Mg==.html
23: http://www.zcool.com.cn/work/ZNDEyMzc0NA==.html 24: http://www.zcool.com.cn/work/ZNzM3NDA5Ng==.html
25: http://www.zcool.com.cn/work/ZNDIzMDMzMg==.html 26: http://www.zcool.com.cn/work/ZMzM3NjQ4NA==.html
29: http://www.68ps.com/jc/big_ps_wz.asp?id=3916 30: http://www.68ps.com/jc/big_ps_wz.asp?id=3854
Appendix B Comparisons with State-of-the-Art Methods
Appendix C Text Effects Transfer between Different Styles, Languages and Fonts
rust: Created by us under the design tutorial of http://photo.renren.com/photo/249458089/photo-3396512370/v7
blink: Created by us under the design tutorial of http://www.zcool.com.cn/article/ZNTYxNTY=.html
drop: Created by us under the design tutorial of http://www.zcool.com.cn/work/ZNDk1MzQxMg==.html
silver: Created by us under the design tutorial of http://www.zcool.com.cn/work/ZMTc2NzU0OA==.html
Text effects used for experiments: water, flame, rust, blink, drop, silver.