# Alive Caricature from 2D to 3D

## Abstract

Caricature is an art form that depicts subjects in an abstract, simplified and exaggerated way. While many caricatures are 2D images, this paper presents an algorithm for creating expressive 3D caricatures from 2D caricature images with a minimum of user interaction. The key idea of our approach is to introduce an intrinsic deformation representation capable of extrapolation, which enables us to create a deformation space from a standard face dataset that maintains face constraints and is meanwhile sufficiently large for producing exaggerated face models. Built upon the proposed deformation representation, an optimization model is formulated to automatically find the 3D caricature that captures the style of the 2D caricature image. The experiments show that our approach expresses caricatures better than fitting approaches that directly use classical parametric face models such as 3DMM and FaceWareHouse. Moreover, our approach is based on standard face datasets and avoids constructing a complicated 3D caricature training set, which provides great flexibility in real applications.

## 1 Introduction

Caricature is a pictorial representation or description that deliberately exaggerates a person’s distinctive features or peculiarities to create an easily identifiable visual likeness with a comic effect [33]. This vivid art form contains the concepts of abstraction, simplification and exaggeration. It has been shown that the effect of producing caricature can increase face recognition rates [32, 35, 14]. Since Brennan presented the first interactive caricature generator in 1985 [6], many approaches and computer-assisted caricature generation systems have been developed [24, 27, 25, 40]. Most of these works focus on 2D caricature generation. Our goal is to develop techniques for creating 3D caricatures from 2D caricature images. Such expressive 3D models of caricatures are interesting and useful, for example, in cartoon and social media.

Creating 3D caricatures from 2D images is a problem of image-based modeling. A closely related problem is face reconstruction, which is widely studied in computer vision. Due to diverse geometric and texture variations and a large variety of identities and expressions, face reconstruction is a nontrivial task. Recently, face reconstruction has made great progress, and many excellent methods have been proposed for reconstructing real faces and their expressions. Among them, example-based methods first build a low-dimensional parametric representation of 3D face models from an example set and then fit the parametric model to the input 2D image [4, 9]. The shape-from-shading approach reconstructs faces from one or more images using shading variation [21, 22].

Compared to normal face reconstruction, caricature modeling is much more difficult. The challenges lie at least in several aspects outlined below. First, the diversity of caricatures is much more severe than normal faces, which means example-based methods in real face reconstruction cannot be simply transferred. For example, 3DMM [4] and FaceWareHouse [9] are very successful in modeling normal faces, but the space they define is not large enough for modeling caricatures in our experiments. Second, caricatures are by nature artwork and may not reflect the real physical environment such as lighting information, which implies caricature images may not provide accurate shading cues. Third, creating caricatures is an artistic process. Different caricaturists can develop very different styles for caricatures. Thus the construction process should also consider the individual styles of exaggeration.

Inspired by the rapid advance and power of machine learning techniques, learning-based approaches have also been proposed to create 3D caricature models [26, 17]. These approaches require a caricature dataset for training. However, creating a caricature dataset is time-consuming because it usually involves caricaturists, and caricatures carry more information, such as possible deformations and different styles.

Note that caricatures have two basic characteristics. The first is that they satisfy the face constraint: we can tell that they depict a face. The second is that the features of the face have been exaggerated. These characteristics suggest that a caricature can be viewed as a deformation from a standard face that keeps the inherent features of the original face. The effect of exaggeration implies large and nonuniform deformation, so extrapolation is usually needed. Classical parametric face models focus on the position of each vertex of 3D faces and usually use interpolation to create new faces, so they have difficulty in producing largely exaggerated faces. We borrow the concept of differential coordinates from mesh deformation and introduce a new deformation representation that is suitable for local and large deformation in a natural way. This representation allows a data-driven approach to generating a deformation space from a normal face dataset, which is flexible and keeps the result face-like. Moreover, we propose to use a set of facial landmarks to capture the exaggeration style of the input 2D caricature image and formulate an optimization problem based on landmark constraints to make sure that the generated 3D caricature has a similar exaggeration style (see Fig. 1). This avoids the need to create a 3D caricature dataset with the same style.

The main contributions of the paper are twofold. First, we propose a new intrinsic deformation representation that uses local differential coordinates and allows expressing face-like targets with nonuniform, large local deformation. This deformation representation could also be useful for other applications. Second, we formulate our 3D caricature generation into an optimization problem whose solution delivers a 3D caricature satisfying the face constraint and exaggeration styles.

## 2 Related Work

Face reconstruction and recognition [7, 16, 19, 20] are closely relevant to our work. For face reconstruction, data-driven approaches are becoming popular. For example, Blanz and Vetter proposed the 3D morphable model (3DMM) [4], which was built on an example set of 200 3D face models describing shapes and textures. Based on 3DMM, Convolutional Neural Networks (CNNs) were constructed to generate 3D face models [19, 16]. Cao et al. [9] used RGBD sensors to develop FaceWareHouse, a large face database with 150 identities and 47 expressions for each identity. Using FaceWareHouse, [20] regressed the parameters of the bilinear model of [39] to construct 3D faces from a single image. We also use FaceWareHouse to construct our standard face dataset.

Following Brennan’s work [6], many attempts have been made to develop computer-assisted tools or systems for creating 2D caricatures. Akleman et al. [1] developed an interactive tool to make caricatures using morphing. Liang et al. [24] used a caricature training database and learned exaggeration prototypes from the database using principal component analysis. Chiang et al. [25] developed an automatic caricature generation system by analyzing facial features and using one existing caricature image as the reference.

There is relatively much less work on 3D caricature generation [30, 29]. Clarke et al. [10] proposed an interactive caricaturization system to capture the deformation style of 2D hand-drawn caricatures. The method first constructed a 3D head model from an input facial photograph and then performed deformation to generate 3D caricatures. Liu et al. [26] proposed a semi-supervised manifold regularization method to learn a regressive model for mapping between 2D real faces and the enlarged 3D caricatures in the training set. With the development of deep learning, Han et al. [17] developed a sketch system using CNN to model 3D caricatures from simple sketches. In their approach, FaceWareHouse [9] was enlarged to handle the variation of 3D caricature models, since the lack of 3D caricature samples made it challenging to train a good model. Unlike these works, our approach does not require 3D caricature samples.

In geometric modeling, deformation is a common technique. Many surface-based deformation techniques are related to the underlying geometric representation. Local differential coordinates are a powerful representation that encodes local details and can benefit the deformation by preserving shape details [41, 36]. To provide high-level or semantic control, data-driven techniques learn deformation from examples. Sumner et al. [38] proposed a deformation method by blending the deformation gradients of example shapes. Baran et al. [3] proposed a semantic deformation transfer method using rotation-invariant coordinates. Gao et al. [12] proposed a deformation representation by blending rotation differences between adjacent vertices and scaling/shear at each vertex, and developed a sparse data-driven deformation for large rotation [13]. Our work borrows the concept of local differential coordinates to build the deformation representation.

## 3 Intrinsic Deformation Representation

To produce 3D caricatures from 2D images, we first build a new 3D representation for 3D caricature faces. Unlike previous methods that rely on a large set of carefully designed 3D caricature faces for training, our method takes *standard* 3D faces, and exploits the capability of extrapolation of an intrinsic deformation representation. Standard 3D faces are much easier to obtain and are readily available from standard datasets. By contrast, caricature faces are much richer. Particularly, different artists may have different styles for caricatures. Providing a full coverage is extremely difficult.

### 3.1 Deformation representation for 2 models

To make it easy to follow, we first introduce our intrinsic representation of the deformation between two models, which will then be extended to a collection of shapes. In particular, one model is chosen as the reference model and the other is the deformed model. We assume they have been globally aligned. Let us denote by $\mathbf{p}_i$ the position of the $i$-th vertex $v_i$ on the reference model, and by $\mathbf{p}'_i$ the position of $v_i$ on the deformed model. The deformation gradient in the 1-ring neighborhood of $v_i$ from the reference model to the deformed model is defined as the affine transformation matrix $\mathbf{T}_i$ that minimizes the following energy:

$$E(\mathbf{T}_i) = \sum_{j \in N_i} c_{ij} \left\| \mathbf{e}'_{ij} - \mathbf{T}_i \mathbf{e}_{ij} \right\|^2, \tag{1}$$

where $N_i$ is the 1-ring neighborhood of vertex $v_i$, $\mathbf{e}'_{ij} = \mathbf{p}'_i - \mathbf{p}'_j$, $\mathbf{e}_{ij} = \mathbf{p}_i - \mathbf{p}_j$, and $c_{ij}$ is the cotangent weight, which depends only on the reference model, to cope with irregular tessellation [5]. The matrix $\mathbf{T}_i$ can be decomposed into a rotation part $\mathbf{R}_i$ and a scaling/shear part $\mathbf{S}_i$ using polar decomposition: $\mathbf{T}_i = \mathbf{R}_i \mathbf{S}_i$.
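The fit in Eq. 1 and the polar decomposition can be sketched for a single vertex as follows. This is a minimal NumPy illustration rather than the authors' C++ implementation: the function names, the random toy edges and the uniform-random stand-in for cotangent weights are assumptions made for the example.

```python
import numpy as np

def fit_deformation_gradient(e_ref, e_def, c):
    """Return the 3x3 matrix T minimizing sum_j c[j] * ||e_def[j] - T @ e_ref[j]||^2."""
    A = (c[:, None] * e_def).T @ e_ref   # sum_j c_j e'_j e_j^T
    B = (c[:, None] * e_ref).T @ e_ref   # sum_j c_j e_j  e_j^T
    return A @ np.linalg.pinv(B)         # normal equations: T B = A

def polar_decompose(T):
    """Split T into a rotation R (det = +1) and a symmetric scaling/shear S, T = R @ S."""
    U, sigma, Vt = np.linalg.svd(T)
    d = 1.0 if np.linalg.det(U @ Vt) > 0 else -1.0
    D = np.diag([1.0, 1.0, d])           # flips one axis if U @ Vt is a reflection
    R = U @ D @ Vt
    S = Vt.T @ D @ np.diag(sigma) @ Vt
    return R, S

# Toy 1-ring: six reference edges deformed by a known rotation and stretch.
rng = np.random.default_rng(0)
e_ref = rng.normal(size=(6, 3))
c = rng.uniform(0.5, 1.5, size=6)        # hypothetical stand-in for cotangent weights
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
S_true = np.diag([1.5, 1.0, 0.8])
e_def = e_ref @ (R_true @ S_true).T      # rows are (R S e_j)^T

T = fit_deformation_gradient(e_ref, e_def, c)
R, S = polar_decompose(T)
print(np.allclose(R, R_true), np.allclose(S, S_true))
```

Because the toy edges are deformed by an exact affine map, the fit recovers that map regardless of the weights; on a real mesh the weights decide how the residual is distributed over the neighborhood.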

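To allow effective linear combination, we take the axis-angle representation [11] to represent the rotation matrix $\mathbf{R}_i$ of vertex $v_i$. The rotation matrix $\mathbf{R}_i$ can be represented using a rotation axis $\boldsymbol{\omega}_i$ and rotation angle $\theta_i$ pair with the mapping $\phi$, specifically $\mathbf{R}_i = \phi(\boldsymbol{\omega}_i, \theta_i)$, where $\boldsymbol{\omega}_i \in \mathbb{R}^3$ and $\|\boldsymbol{\omega}_i\| = 1$. Given two rotations in the axis-angle representation, it is not suitable to blend them linearly, so we convert the axis and angle to the matrix logarithm representation:

$$\log \mathbf{R}_i = \theta_i \begin{pmatrix} 0 & -\omega_{i,z} & \omega_{i,y} \\ \omega_{i,z} & 0 & -\omega_{i,x} \\ -\omega_{i,y} & \omega_{i,x} & 0 \end{pmatrix}. \tag{2}$$

The logarithm of rotation matrices allows effective linear combination [2], e.g., two rotations $\mathbf{R}_1$ and $\mathbf{R}_2$ can be blended using $\exp(w_1 \log \mathbf{R}_1 + w_2 \log \mathbf{R}_2)$ with scalar weights $w_1$ and $w_2$. The rotation matrix $\mathbf{R}_i$ can be recovered from its logarithm by the matrix exponential, $\mathbf{R}_i = \exp(\log \mathbf{R}_i)$.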
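To make the blending concrete, the sketch below builds the matrix logarithm from an axis-angle pair, blends logarithms linearly, and maps the result back with Rodrigues' formula. The helper names are hypothetical and the closed-form exponential is a standard identity for skew-symmetric matrices, not something specific to this paper.

```python
import numpy as np

def log_rotation(axis, angle):
    """Matrix logarithm of the rotation by `angle` (radians) about `axis`."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    return angle * np.array([[0.0, -z, y],
                             [z, 0.0, -x],
                             [-y, x, 0.0]])

def exp_rotation(L):
    """Matrix exponential of a skew-symmetric 3x3 matrix (Rodrigues' formula)."""
    theta = np.sqrt(L[2, 1] ** 2 + L[0, 2] ** 2 + L[1, 0] ** 2)
    if theta < 1e-12:
        return np.eye(3) + L                 # first-order term is enough near zero
    K = L / theta                            # unit-axis skew matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def blend_rotations(logs, weights):
    """Linear combination in log space, mapped back to a rotation matrix."""
    return exp_rotation(sum(w * L for w, L in zip(weights, logs)))

z_axis = [0.0, 0.0, 1.0]
log_20 = log_rotation(z_axis, np.deg2rad(20.0))
log_60 = log_rotation(z_axis, np.deg2rad(60.0))

halfway = blend_rotations([log_20, log_60], [0.5, 0.5])   # interpolation
doubled = blend_rotations([log_60], [2.0])                # extrapolation
print(np.round(halfway, 3))
print(np.round(doubled, 3))
```

Weights outside the range [0, 1] still yield valid rotation matrices, which is the property the representation relies on for exaggeration.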

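If the deformed model is the same as the reference model, $\mathbf{S}_i = \mathbf{I}$ and $\log \mathbf{R}_i = \mathbf{0}$ for all $v_i \in V$, where $V$ is the set of vertices, and $\mathbf{I}$ and $\mathbf{0}$ are the identity and zero matrices. Thus we define our deformation representation as

$$f = \{\, \log \mathbf{R}_i;\ \mathbf{S}_i - \mathbf{I} \mid \forall v_i \in V \,\}. \tag{3}$$

By subtracting the identity matrix from $\mathbf{S}_i$, the deformation representation of “no deformation” becomes a zero vector, which builds a natural coordinate system.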

### 3.2 Deformation representation for shape collections

We can extend the previous definition to a collection of models. Suppose we have $n+1$ 3D face models. In our experiments, we select face models from *FaceWareHouse* [9], a 3D face dataset with large variation in identity and expression. First, we mark some facial landmarks on the 3D face models. With the help of the landmarks, we apply rigid alignment to the 3D face models to remove global rigid transformations between them.

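We similarly choose one model as the reference model and let the others be the deformed models. Given the $n$ deformed models, we can obtain $n$ deformation representations $f_l$, $l = 1, \dots, n$, with

$$f_l = \{\, \log \mathbf{R}_{l,i};\ \mathbf{S}_{l,i} - \mathbf{I} \mid \forall v_i \in V \,\}, \tag{4}$$

and we write $F = \{f_1, \dots, f_n\}$ for simpler expression.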

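The deformation representation $F = \{f_1, \dots, f_n\}$ actually defines a deformation space. To generate a new deformed mesh based on $F$, we formulate the deformation gradient $\mathbf{T}_i(\mathbf{w})$ of a deformed mesh as a linear combination of the basis deformations $f_l$:

$$\mathbf{T}_i(\mathbf{w}) = \exp\Big(\sum_{l=1}^{n} w_{R,l} \log \mathbf{R}_{l,i}\Big) \Big(\mathbf{I} + \sum_{l=1}^{n} w_{S,l} (\mathbf{S}_{l,i} - \mathbf{I})\Big), \tag{5}$$

where $\mathbf{w} = (\mathbf{w}_R, \mathbf{w}_S)$ is the combination weight vector, consisting of the weights of rotation $\mathbf{w}_R = (w_{R,1}, \dots, w_{R,n})$ and the weights of scaling/shear $\mathbf{w}_S = (w_{S,1}, \dots, w_{S,n})$. $\log \mathbf{R}_{l,i}$ and $\mathbf{S}_{l,i}$ correspond to the rotation and scaling/shear of the vertex $v_i$ in the basis $f_l$. Given the representation basis $F$, different faces can be obtained by varying $\mathbf{w}$. By introducing two sets of weights for rotation and scaling/shear, our representation is flexible enough to cope with exaggerated 3D faces, as we will demonstrate later.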
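As an illustration of how a per-vertex deformation gradient could be assembled from a basis, the sketch below combines rotation logarithms and scaling/shear offsets with separate weight vectors and multiplies the two parts. The array layout (one stack of 3x3 matrices per vertex, indexed by basis model) and all names are assumptions made for this example.

```python
import numpy as np

def exp_rotation(L):
    """Matrix exponential of a skew-symmetric 3x3 matrix (Rodrigues' formula)."""
    theta = np.sqrt(L[2, 1] ** 2 + L[0, 2] ** 2 + L[1, 0] ** 2)
    if theta < 1e-12:
        return np.eye(3) + L
    K = L / theta
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def deformation_gradient(log_R, S, w_R, w_S):
    """Blend n basis deformations of one vertex.

    log_R, S : arrays of shape (n, 3, 3) holding log-rotations and scaling/shear
    w_R, w_S : length-n weights for the rotation and scaling/shear parts
    """
    L = np.tensordot(w_R, log_R, axes=1)                        # sum_l w_R[l] * log_R[l]
    S_blend = np.eye(3) + np.tensordot(w_S, S - np.eye(3), axes=1)
    return exp_rotation(L) @ S_blend

def skew_z(angle):
    """Log of a rotation about the z axis by `angle` radians."""
    return angle * np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

# Two hypothetical basis models for a single vertex.
log_R = np.stack([skew_z(np.deg2rad(30.0)), np.zeros((3, 3))])
S = np.stack([np.diag([1.2, 1.0, 1.0]), np.diag([1.0, 1.5, 1.0])])

T_rest = deformation_gradient(log_R, S, [0.0, 0.0], [0.0, 0.0])   # reference shape
T_basis = deformation_gradient(log_R, S, [1.0, 0.0], [1.0, 0.0])  # first basis model
T_extra = deformation_gradient(log_R, S, [2.0, 0.0], [2.0, 0.0])  # extrapolated
print(np.round(T_extra, 3))
```

Zero weights give the identity, a unit weight reproduces the corresponding basis model, and weights above 1 extrapolate beyond it.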

We give a simple example in Fig. 2 where $\mathbf{w}_R = \mathbf{w}_S$ (denoted as $\mathbf{w}$ for simplicity). We have two deformed example models, so $\mathbf{w}$ has only 2 dimensions. The golden face located at the origin is the reference model. The two deformed models are also shaded golden and located at $(1,0)$ (with the mouth open) and $(0,1)$ (a different subject). By increasing the first weight beyond 1, we can let the mouth open wider. If we combine both deformations, e.g., with $\mathbf{w} = (1,1)$, we can let the face at $(0,1)$ open his mouth. When the second weight is increased beyond 1, the face at $(0,1)$ is exaggerated. By a linear combination of the deformation basis using different weights $\mathbf{w}$, we can generate deformed meshes in the deformation space.

### 3.3 Deformation weight extraction and model reconstruction

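We can define the deformation energy $E_{def}$ as follows:

$$E_{def}(\mathbf{P}', \mathbf{w}) = \sum_{v_i \in V} \sum_{j \in N_i} c_{ij} \left\| (\mathbf{p}'_i - \mathbf{p}'_j) - \mathbf{T}_i(\mathbf{w}) (\mathbf{p}_i - \mathbf{p}_j) \right\|^2, \tag{6}$$

where $\mathbf{P}' = \{\mathbf{p}'_i\}$ represents the positions of the deformed vertices. By minimizing this energy, we are able to determine the position of each vertex on the deformed mesh given the weights $\mathbf{w}$, or obtain the combination weights $\mathbf{w}$ given the deformed mesh $\mathbf{P}'$.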

#### Model reconstruction from weights

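Given $\mathbf{w}$, the deformation gradient $\mathbf{T}_i(\mathbf{w})$ can be directly obtained using Eq. 5. Then model reconstruction is done by finding the optimal $\mathbf{P}'$ that minimizes:

$$\min_{\mathbf{P}'} \sum_{v_i \in V} \sum_{j \in N_i} c_{ij} \left\| (\mathbf{p}'_i - \mathbf{p}'_j) - \mathbf{T}_i(\mathbf{w}) (\mathbf{p}_i - \mathbf{p}_j) \right\|^2. \tag{7}$$

For each $\mathbf{p}'_i$, we tackle it by solving $\partial E_{def} / \partial \mathbf{p}'_i = 0$, which leads to:

$$2 \sum_{j \in N_i} c_{ij} \mathbf{e}'_{ij} = \sum_{j \in N_i} c_{ij} \big(\mathbf{T}_i(\mathbf{w}) + \mathbf{T}_j(\mathbf{w})\big) \mathbf{e}_{ij}, \tag{8}$$

with $\mathbf{e}'_{ij} = \mathbf{p}'_i - \mathbf{p}'_j$, $\mathbf{e}_{ij} = \mathbf{p}_i - \mathbf{p}_j$. The resulting linear system can be written in the form $\mathbf{A}\mathbf{P}' = \mathbf{b}$, where the matrix $\mathbf{A}$ is fixed and sparse, since the only non-zero entries are those whose corresponding vertices share an edge. By specifying the position of one vertex, we obtain a unique solution to the equations. This initial specification does not change the shape of the output. The models shown in Fig. 2 are obtained using this optimization.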
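The linear solve can be sketched on a tiny mesh. The example below assumes the per-vertex condition of Eq. 8 takes the form $2\sum_j c_{ij}(\mathbf{p}'_i - \mathbf{p}'_j) = \sum_j c_{ij}(\mathbf{T}_i + \mathbf{T}_j)(\mathbf{p}_i - \mathbf{p}_j)$, uses unit edge weights in place of cotangent weights, and builds a dense matrix for brevity; the function name and the tetrahedron data are hypothetical.

```python
import numpy as np

def reconstruct_vertices(P, edges, c, T, anchor, anchor_pos):
    """Solve for deformed positions given per-vertex deformation gradients T.

    P      : (n, 3) reference positions
    edges  : list of undirected edges (i, j)
    c      : one weight per edge
    T      : (n, 3, 3) deformation gradients
    anchor : index of the vertex whose position is pinned to anchor_pos
    """
    n = len(P)
    A = np.zeros((n, n))                  # dense for a toy mesh; sparse in practice
    b = np.zeros((n, 3))
    for (i, j), cij in zip(edges, c):
        for a, k in ((i, j), (j, i)):     # each edge contributes to both endpoints
            A[a, a] += 2.0 * cij
            A[a, k] -= 2.0 * cij
            b[a] += cij * (T[a] + T[k]) @ (P[a] - P[k])
    A[anchor] = 0.0                       # replace one equation by the pinned vertex
    A[anchor, anchor] = 1.0
    b[anchor] = anchor_pos
    return np.linalg.solve(A, b)

# Tetrahedron deformed by the same affine map G at every vertex.
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
c = np.ones(len(edges))
G = np.array([[1.2, 0.3, 0.0], [0.0, 0.9, 0.0], [0.1, 0.0, 1.5]])
T = np.repeat(G[None], len(P), axis=0)

P_new = reconstruct_vertices(P, edges, c, T, anchor=0, anchor_pos=G @ P[0])
print(np.allclose(P_new, P @ G.T))
```

Pinning a different position for the anchor vertex only translates the result, which matches the remark that the initial specification does not change the output shape.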

#### Optimizing for a given deformed model

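Given a deformed 3D model $\mathbf{P}'$, the optimal weights $\mathbf{w}$ to represent the deformation can be obtained by minimizing:

$$\min_{\mathbf{w}} \sum_{v_i \in V} \sum_{j \in N_i} c_{ij} \left\| (\mathbf{p}'_i - \mathbf{p}'_j) - \mathbf{T}_i(\mathbf{w}) (\mathbf{p}_i - \mathbf{p}_j) \right\|^2. \tag{9}$$

This is a non-linear least squares problem because $\mathbf{T}_i(\mathbf{w})$ depends non-linearly on $\mathbf{w}$. To solve it, we first compute the Jacobian of $\mathbf{T}_i(\mathbf{w})$ w.r.t. the weights of each example model $l$, which has two components: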

(10) |

which are the derivatives of $\mathbf{T}_i(\mathbf{w})$ w.r.t. the rotation weight $w_{R,l}$ and the scaling/shear weight $w_{S,l}$, respectively. The optimal $\mathbf{w}$ for a given deformed model can be calculated using the *Levenberg-Marquardt* algorithm [28].
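A small sketch of the weight extraction is given below. It is a simplified stand-in for the procedure described here: it uses a basic damped Gauss-Newton loop in the Levenberg-Marquardt style with a forward-difference Jacobian instead of the analytic Jacobian, and a toy problem with one vertex, one basis model and a rotation about the z axis. All names and data are hypothetical.

```python
import numpy as np

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def gradient_from_weights(w, basis_angle, basis_S):
    """One-basis toy: w[0] scales the rotation log, w[1] the scaling/shear offset."""
    return rot_z(w[0] * basis_angle) @ (np.eye(3) + w[1] * (basis_S - np.eye(3)))

def levenberg_marquardt(residual, w0, iters=100, mu=1e-2, eps=1e-6):
    """Minimize ||residual(w)||^2 with damped Gauss-Newton steps."""
    w = np.asarray(w0, dtype=float)
    r = residual(w)
    for _ in range(iters):
        # Forward-difference Jacobian, one column per weight.
        J = np.column_stack([(residual(w + eps * e) - r) / eps for e in np.eye(w.size)])
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -J.T @ r)
        r_new = residual(w + step)
        if r_new @ r_new < r @ r:         # accept the step and trust the model more
            w, r, mu = w + step, r_new, mu * 0.5
        else:                             # reject the step and increase damping
            mu *= 4.0
    return w

rng = np.random.default_rng(1)
e_ref = rng.normal(size=(8, 3))           # reference edges around one vertex
basis_angle, basis_S = np.deg2rad(30.0), np.diag([1.3, 1.0, 0.9])
w_true = np.array([1.5, 0.7])             # extrapolated rotation, partial stretch
e_def = e_ref @ gradient_from_weights(w_true, basis_angle, basis_S).T

def residual(w):
    return (e_def - e_ref @ gradient_from_weights(w, basis_angle, basis_S).T).ravel()

w_fit = levenberg_marquardt(residual, [0.0, 0.0])
print(np.round(w_fit, 4))
```

Starting from zero weights mirrors the initialization used later in the paper; an analytic Jacobian would replace the finite-difference column loop in a full implementation.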

## 4 Generation of 3D Caricature Models

Built upon our deformation representation, we now describe our algorithm to construct a 3D caricature model from a 2D caricature image. Assume that we already have a 3D reference face model and a deformation representation based on it. To capture exaggerated facial expressions, we use a set of landmark points in the 2D image and the 3D model, which correspond to the landmark points marked on the reference face.

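Reconstructing the 3D model from a 2D image is the inverse process of observing a 3D object by projecting it to an imaging plane. Therefore, this process is affected by view parameters. For simplicity, we choose orthographic projection to relate 3D and 2D. Without loss of generality, we assume that the projection plane is the $xy$-plane and thus the projection can be written as

$$\mathbf{q}_i = s \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \mathbf{R}\, \mathbf{p}'_i + \mathbf{t}, \tag{11}$$

where $\mathbf{p}'_i$ and $\mathbf{q}_i$ are the locations of vertex $v_i$ in the world coordinate system and in the image plane, respectively, $s$ is the scale factor, $\mathbf{R}$ is the rotation matrix constructed from Euler angles *pitch, yaw, roll*, and $\mathbf{t}$ is the translation vector. For convenience we introduce $\Pi$ as:

$$\Pi = s \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \tag{12}$$

Then we map 3D landmarks onto the image plane by $\Pi \mathbf{R} \mathbf{p}'_i + \mathbf{t}$. The landmark fitting loss can be defined as

$$E_{lan}(\Pi, \mathbf{R}, \mathbf{t}, \mathbf{P}') = \sum_{v_i \in L} \left\| \Pi \mathbf{R} \mathbf{p}'_i + \mathbf{t} - \mathbf{q}_i \right\|^2, \tag{13}$$

where $L$ and $Q = \{\mathbf{q}_i\}$ are the sets of 3D landmarks and 2D landmarks.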
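The scaled orthographic projection and the landmark loss can be sketched as below. The Euler-angle convention (roll about z applied after yaw about y after pitch about x) is an assumption, since the text does not state one, and the function names are hypothetical.

```python
import numpy as np

def euler_to_rotation(pitch, yaw, roll):
    """Rotation matrix from Euler angles; assumed order R = Rz(roll) Ry(yaw) Rx(pitch)."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def project(P, s, R, t):
    """Scaled orthographic projection of (m, 3) points onto the xy-plane."""
    Pi = s * np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    return P @ (Pi @ R).T + t

def landmark_loss(P, Q, landmark_idx, s, R, t):
    """Sum of squared distances between projected 3D landmarks and 2D landmarks Q."""
    diff = project(P[landmark_idx], s, R, t) - Q
    return float(np.sum(diff ** 2))

P = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 1.0], [-1.0, 0.5, 2.0]])
R = euler_to_rotation(0.0, 0.0, 0.0)
t = np.array([1.0, -1.0])
landmark_idx = [0, 2]
Q = project(P[landmark_idx], 2.0, R, t) + np.array([1.0, 0.0])  # 2D targets, offset by 1 px

print(project(P, 2.0, R, t))
print(landmark_loss(P, Q, landmark_idx, 2.0, R, t))
```

The depth coordinate only influences the result through the rotation, which is why landmark constraints alone leave depth under-determined and a shape prior such as the deformation energy is needed.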

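To generate a 3D caricature model that looks like a human face and meanwhile matches the 2D landmarks for the effects of exaggeration, we utilize our deformation representation and the projection relationship. The problem is formulated as an optimization problem:

$$\min_{\mathbf{P}', \mathbf{w}, \Pi, \mathbf{R}, \mathbf{t}} \; E_{def}(\mathbf{P}', \mathbf{w}) + \lambda E_{lan}(\Pi, \mathbf{R}, \mathbf{t}, \mathbf{P}'), \tag{14}$$

where $E_{def}$ and $E_{lan}$ are defined in Eq. 6 and Eq. 13, and $\lambda$ is the tradeoff factor controlling the relative importance of the two terms.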

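To solve the above optimization problem, we initialize $\mathbf{w}$ by simply letting all weights be zero and then alternately solve for $\mathbf{P}'$ and $\mathbf{w}$ using the following $\mathbf{P}'$-step and $\mathbf{w}$-step. The process continues until convergence or until the maximum number of iterations is reached. The whole algorithm is outlined in Algorithm 1.

$\mathbf{P}'$-step: We use an approach similar to the one described in Sec. 3.3.1 to obtain $\mathbf{P}'$. Let $E = E_{def} + \lambda E_{lan}$ be the overall energy to be optimized. We set $\partial E / \partial \mathbf{p}'_i = 0$, which gives the following equations:

$$\begin{aligned} 2 \sum_{j \in N_i} c_{ij} \mathbf{e}'_{ij} + \lambda \mathbf{M}^{T} (\mathbf{M} \mathbf{p}'_i + \mathbf{t} - \mathbf{q}_i) &= \sum_{j \in N_i} c_{ij} \big(\mathbf{T}_i(\mathbf{w}) + \mathbf{T}_j(\mathbf{w})\big) \mathbf{e}_{ij}, \\ 2 \sum_{j \in N_i} c_{ij} \mathbf{e}'_{ij} &= \sum_{j \in N_i} c_{ij} \big(\mathbf{T}_i(\mathbf{w}) + \mathbf{T}_j(\mathbf{w})\big) \mathbf{e}_{ij}, \end{aligned} \tag{15}$$

where $\mathbf{M} = \Pi \mathbf{R}$ is the $2 \times 3$ matrix that maps a vertex to the image plane before translation by $\mathbf{t}$, and $\mathbf{q}_i$ is the 2D landmark corresponding to $v_i$. The former equation applies to landmark vertices ($v_i \in L$) and the latter equation applies to non-landmark vertices ($v_i \notin L$).

$\mathbf{w}$-step: In this step, we fix $\mathbf{P}'$ and optimize $\mathbf{w}$. Since $E_{lan}$ is independent of $\mathbf{w}$, we use the *Levenberg-Marquardt* algorithm described in Sec. 3.3.2 to solve the non-linear least squares problem. After solving for $\mathbf{w}$, we first update the projection parameters and then return to the $\mathbf{P}'$-step. When the loop exits, we obtain the generated $\mathbf{P}'$.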

## 5 Experiments

### 5.1 Implementation Details

Our algorithm is implemented in C++, using the Eigen library [15] for all linear algebra operations. All the examples are run on a desktop PC with 4GB of RAM and a hexa-core CPU at 1.6 GHz. We set in Eq. 14 to be 0.01 and the initial value of is set to be a zero vector. For all the test models, we run our solver for 4 iterations in Alg. 1 and , which are sufficient to get satisfying results. The number of vertices is 11510. The average time for manually adjusting the landmarks is about min. To solve the optimization problem, each iteration takes about s including -step and -step where -step takes most of the time. Overall, it takes less than 40s to produce the result with our unoptimized implementation.

To construct the deformation representations , we choose models from FaceWareHouse dataset [9], which contains 150 identities and 47 expressions for each identity. The construction pipeline is shown in Fig. 3. We first compute the average shape of each expression and then select expressions with large difference. Meanwhile we choose neutral expression of each identity and select models with large difference to the mean neutral face. Merging these two parts together, we obtain our dataset, which includes face models. The mean neutral face is set as the reference model and the others are set as deformed models.

In our method, 68 facial landmarks are used to constrain the 3D face shape. To detect the facial landmarks on caricature images, we use the Dlib library [23]. However, as Dlib is trained with *standard* face images, some of the detected landmarks on caricature images might not be accurate. We therefore design an interactive system that allows the user to adjust the landmarks, as shown in Fig. 4.

### 5.2 Baselines

We compare the proposed method with the reconstruction methods by 3DMM [31, 42] and FaceWareHouse dataset [9, 7]. For all the methods, the 3D face model is reconstructed by minimizing the residuals between the projected 3D landmarks and the corresponding 2D landmarks. In the following, we introduce the implementation details of these baseline methods.

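3DMM with different regularization weights: In [42], the 3DMM representation is adopted to represent any 3D face model:

$$\mathbf{S} = \bar{\mathbf{S}} + \mathbf{A}_{id} \boldsymbol{\alpha}_{id} + \mathbf{A}_{exp} \boldsymbol{\alpha}_{exp}, \tag{16}$$

where $\mathbf{S}$ is a 3D face, $\bar{\mathbf{S}}$ is the mean face, $\mathbf{A}_{id}$ and $\mathbf{A}_{exp}$ are the principal axes on identities and expressions, and $\boldsymbol{\alpha}_{id}$ and $\boldsymbol{\alpha}_{exp}$ are the corresponding coefficients. Since $\mathbf{A}_{id}$ and $\mathbf{A}_{exp}$ are generated by principal component analysis, the fitted model should satisfy the distribution of the face space, and thus a regularization term is added:

$$E_{reg} = \sum_{k} \left( \frac{\alpha_{id,k}}{\sigma_{id,k}} \right)^2 + \sum_{k} \left( \frac{\alpha_{exp,k}}{\sigma_{exp,k}} \right)^2, \tag{17}$$

where $\sigma_{id,k}$ and $\sigma_{exp,k}$ are the standard deviations of each principal component for identity and expression, respectively. By representing the projected 3D landmarks with $\hat{\mathbf{q}}_i$, the objective energy functional in their method is:

$$E = \sum_{v_i \in L} \left\| \hat{\mathbf{q}}_i - \mathbf{q}_i \right\|^2 + \gamma E_{reg}, \tag{18}$$

where $\gamma$ denotes the regularization weight.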

When gets larger, the reconstructed model would get towards mean face. We test with two values and , and denote the corresponding reconstruction results as 3DMM, 3DMM(-) respectively.
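The effect of the regularization weight can be illustrated with a small closed-form fit. The sketch assumes that, for a fixed pose, the projected landmarks depend linearly on the model coefficients, so the problem reduces to a weighted ridge regression; the matrix `M`, the target `y` and the function name are hypothetical stand-ins, not the baseline's actual data.

```python
import numpy as np

def fit_regularized(M, y, sigma, weight):
    """Minimize ||M a - y||^2 + weight * sum_k (a_k / sigma_k)^2 in closed form."""
    reg = weight * np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)
    return np.linalg.solve(M.T @ M + reg, M.T @ y)

# Hypothetical linearized landmark system: y = landmarks minus projected mean face,
# columns of M = projected principal axes evaluated at the landmarks.
M = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = M @ np.array([2.0, -1.0])
sigma = [1.0, 1.0]

for weight in (0.0, 1.0, 1e6):
    coeffs = fit_regularized(M, y, sigma, weight)
    print(weight, np.round(coeffs, 4), round(float(np.linalg.norm(M @ coeffs - y)), 4))
```

As the weight grows, the coefficients shrink toward zero, i.e., the fitted shape moves toward the mean face while the landmark residual increases. This is the tradeoff between the two baseline settings discussed above.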

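FaceWareHouse: In [8, 7, 9], the bilinear model represents a 3D face as:

$$\mathbf{S} = \mathcal{C}_r \times_2 \mathbf{w}_{id} \times_3 \mathbf{w}_{exp}, \tag{19}$$

where $\mathcal{C}_r$ is the core tensor, and $\mathbf{w}_{id}$ and $\mathbf{w}_{exp}$ are the coefficients of identity and expression. By minimizing the residuals between the projected landmarks and the 2D landmarks, we can obtain the optimal $\mathbf{w}_{id}$ and $\mathbf{w}_{exp}$ and thus the reconstructed face model.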

Caricatured 3DMM: In [17], the FaceWareHouse dataset is expanded by adding more exaggerated face models to enhance the representation capability of the original dataset, where the exaggerated face models are generated via the method in [34]. However, the expanded FaceWareHouse dataset is not publicly available. Thus, we follow the same method [34] to expand *Basel Face Model* [31] dataset, and some examples are shown in Fig. 5. After constructing expanded 3DMM, we apply principal component analysis to generate the parametric model, similar to the method in [17] to produce the bilinear model.

### 5.3 Results

Fig. 6 shows some visual results of the 3D caricatures generated by the different methods. It can be observed that the face shape produced by 3DMM is too regular, and thus not exaggerated enough to match the input face images. Although the 3D models produced by 3DMM(-) are exaggerated, their shapes are distorted and no longer resemble regular faces. Since a linear model is used in FaceWareHouse, its exaggeration capability is also limited. As for caricatured 3DMM, although some caricatured face models generated using [34] are included in the database, it contains only limited styles of exaggeration. If the exaggerated face lies beyond the expanded database, it cannot be properly expressed. In contrast, the reconstruction results of our method are quite close to the shapes in the input images, and the model quality is better.

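To quantitatively compare different methods, we define the average fitting error as the root-mean-square fitting error over the landmarks:

$$E_{rms} = \sqrt{\frac{1}{|L|} \sum_{v_i \in L} \left\| \hat{\mathbf{q}}_i - \mathbf{q}_i \right\|^2}, \tag{20}$$

where $L$ is the set of landmarks, $\hat{\mathbf{q}}_i$ is the projection of the $i$-th 3D landmark onto the image plane, and $\mathbf{q}_i$ is the corresponding 2D landmark.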
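A root-mean-square landmark error can be computed as in the short sketch below. It assumes the error is the square root of the mean squared 2D distance over landmarks; whether the paper additionally normalizes by image size or averages over images differently is not stated here, and the function name is hypothetical.

```python
import numpy as np

def rms_fitting_error(projected, target):
    """Root-mean-square 2D distance between projected and annotated landmarks."""
    projected = np.asarray(projected, dtype=float)
    target = np.asarray(target, dtype=float)
    squared_dist = np.sum((projected - target) ** 2, axis=1)   # per-landmark squared error
    return float(np.sqrt(np.mean(squared_dist)))

projected = [[0.0, 0.0], [3.0, 4.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
print(rms_fitting_error(projected, target))
```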

We compared 3DMM, 3DMM(-), FaceWareHouse, Caricatured 3DMM and our method over all test data including 50 annotated caricature images, and the statistics values are shown in Tab. 1. It can be observed that existing linear 3D face representation methods can not fit the landmarks of the caricature images well even by setting , while our method can achieve quite small fitting error.

3DMM | 3DMM(-) | FWH | C-3DMM | Ours |
---|---|---|---|---|

To further satisfy the landmark constraints while preserving a reasonable face shape, we apply as-rigid-as-possible (ARAP) [37] deformation to the reconstruction results of existing methods. With the help of ARAP deformation, the fitting error of the other methods is reduced to 0.01. Although ARAP deformation helps fit the landmarks, it still preserves the shape structure of the original reconstruction results, which cannot fit the input caricature image well, as shown in Fig. 7.

As the landmark fitting error is not sufficient for evaluation, we conduct a user study with 15 participants from different backgrounds. Each participant is given 10 randomly selected images and their corresponding reconstruction results by each method (our method and ARAP deformation on 3DMM, Caricatured 3DMM and FaceWareHouse) in random order. Each participant sorts the reconstructed models from best to worst according to their similarity to the input caricature image. The statistical result indicates that our models get voting for top-1 and for top-2. We also ask each participant for their subjective impression after revealing which method produced each model. According to this follow-up, all participants agree that our method captures the shape of the caricature much better than the other three methods. However, in around experiments, the output meshes by our method are too smooth compared with those of the other methods and seem to lack the detail of the caricature. Our reconstructed mesh may contain self-intersections if the expression is exaggerated too much, while the meshes reconstructed by the other methods show the same problem even for caricature images with only slightly exaggerated expressions.

More results by our method are shown in Fig. 8, where the caricature images are selected from the database [18] and the Internet. All the tested caricature images, their corresponding landmarks and the reconstructed meshes are available at https://github.com/QianyiWu/Caricature-Data.

## 6 Conclusion

We have presented an efficient algorithm for generating a 3D caricature model from a 2D caricature image. As the experiments show, our algorithm can generate appealing 3D caricatures with an exaggeration style similar to that conveyed in the 2D caricature images. One of the key techniques of our approach is a new deformation representation that is able to model caricature faces in a non-linear and more natural way. As a result, compared to previous work, our approach has the unique advantage that it uses only normal face models to create caricatures. This makes our approach more suitable for various applications in real scenarios.

Acknowledgement We thank Luo Jiang, Boyi Jiang, Yudong Guo, Wanquan Feng, Yue Peng, Liyang Liu and Kaiyu Guo for their helpful discussions during this work. Thanks to Thomas Vetter et al. and Kun Zhou et al. for allowing the use of their 3D face datasets. Thanks to all the artists who created the caricatures used in this work. Special thanks to all participants in the user study. This work was supported by the National Key R&D Program of China (No. 2016YFC0800501), the National Natural Science Foundation of China (No. 61672481), the NTU CoE Grant and MOE Tier-2 Grants (2016-T2-2-065, 2017-T2-1-076).

### References

- E. Akleman. Making caricatures with morphing. In ACM SIGGRAPH 97 Visual Proceedings: The art and interdisciplinary programs of SIGGRAPH’97, page 145. ACM, 1997.
- M. Alexa. Linear combination of transformations. In ACM Transactions on Graphics (TOG), volume 21, pages 380–387. ACM, 2002.
- I. Baran, D. Vlasic, E. Grinspun, and J. Popović. Semantic deformation transfer. In ACM Transactions on Graphics (TOG), volume 28, page 36. ACM, 2009.
- V. Blanz and T. Vetter. A morphable model for the synthesis of 3d faces. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques, pages 187–194. ACM Press/Addison-Wesley Publishing Co., 1999.
- M. Botsch and O. Sorkine. On linear variational surface deformation methods. IEEE transactions on visualization and computer graphics, 14(1):213–230, 2008.
- S. Brennan. The dynamic exaggeration of faces by computer. Leonardo, 18(3):170 – 178, 1985.
- C. Cao, Q. Hou, and K. Zhou. Displaced dynamic expression regression for real-time facial tracking and animation. ACM Transactions on graphics (TOG), 33(4):43, 2014.
- C. Cao, Y. Weng, S. Lin, and K. Zhou. 3d shape regression for real-time facial animation. ACM Transactions on Graphics (TOG), 32(4):41, 2013.
- C. Cao, Y. Weng, S. Zhou, Y. Tong, and K. Zhou. Facewarehouse: A 3d facial expression database for visual computing. IEEE Transactions on Visualization and Computer Graphics, 20(3):413–425, 2014.
- L. Clarke, M. Chen, and B. Mora. Automatic generation of 3d caricatures based on artistic deformation styles. IEEE transactions on visualization and computer graphics, 17(6):808–821, 2011.
- J. Diebel. Representing attitude: Euler angles, unit quaternions, and rotation vectors. Matrix, 58(15-16):1–35, 2006.
- L. Gao, Y.-K. Lai, D. Liang, S.-Y. Chen, and S. Xia. Efficient and flexible deformation representation for data-driven surface modeling. ACM Transactions on Graphics (TOG), 35(5):158, 2016.
- L. Gao, Y.-K. Lai, J. Yang, L.-X. Zhang, L. Kobbelt, and S. Xia. Sparse data driven mesh deformation. arXiv preprint arXiv:1709.01250, 2017.
- D. B. Graham and N. M. Allinson. Norm-based face recognition. In 1999 Conference on Computer Vision and Pattern Recognition (CVPR ’99), 23-25 June 1999, Ft. Collins, CO, USA, pages 1586–1591, 1999.
- G. Guennebaud, B. Jacob, et al. Eigen v3. http://eigen.tuxfamily.org, 2010.
- Y. Guo, J. Zhang, J. Cai, B. Jiang, and J. Zheng. Cnn-based real-time dense face reconstruction with inverse-rendered photo-realistic face images. arXiv preprint arXiv:1708.00980, 2017.
- X. Han, C. Gao, and Y. Yu. Deepsketch2face: a deep learning based sketching system for 3d face and caricature modeling. ACM Trans. Graph., 36(4):126:1–126:12, 2017.
- J. Huo, W. Li, Y. Shi, Y. Gao, and H. Yin. Webcaricature: a benchmark for caricature face recognition. arXiv preprint arXiv:1703.03230, 2017.
- A. S. Jackson, A. Bulat, V. Argyriou, and G. Tzimiropoulos. Large pose 3d face reconstruction from a single image via direct volumetric cnn regression. International Conference on Computer Vision, 2017.
- L. Jiang, J. Zhang, B. Deng, H. Li, and L. Liu. 3d face reconstruction with geometry details from a single image. arXiv preprint arXiv:1702.05619, 2017.
- I. Kemelmacher-Shlizerman and R. Basri. 3d face reconstruction from a single image using a single reference face shape. IEEE transactions on pattern analysis and machine intelligence, 33(2):394–405, 2011.
- I. Kemelmacher-Shlizerman, R. Basri, and B. Nadler. 3d shape reconstruction of mooney faces. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1–8. IEEE, 2008.
- D. E. King. Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research, 10(Jul):1755–1758, 2009.
- L. Liang, H. Chen, Y. Xu, and H. Shum. Example-based caricature generation with exaggeration. In 10th Pacific Conference on Computer Graphics and Applications, pages 386–393, 2002.
- P.-Y. Chiang, W.-H. Liao, and T.-Y. Li. Automatic caricature generation by analyzing facial features. In Proceeding of 2004 Asia Conference on Computer Vision (ACCV2004), Korea, volume 2, 2004.
- J. Liu, Y. Chen, C. Miao, J. Xie, C. X. Ling, X. Gao, and W. Gao. Semi-supervised learning in reconstructed manifold space for 3d caricature generation. In Computer Graphics Forum, volume 28, pages 2104–2116. Wiley Online Library, 2009.
- Z. Liu, H. Chen, and H. Shum. An efficient approach to learning inhomogeneous gibbs model. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 425–431, 2003.
- J. J. Moré. The levenberg-marquardt algorithm: implementation and theory. In Numerical analysis, pages 105–116. Springer, 1978.
- A. J. O’Toole, T. Price, T. Vetter, J. C. Bartlett, and V. Blanz. 3d shape and 2d surface textures of human faces: The role of “averages” in attractiveness and age. Image and Vision Computing, 18(1):9–19, 1999.
- A. J. O’Toole, T. Vetter, H. Volz, and E. M. Salter. Three-dimensional caricatures of human heads: Distinctiveness and the perception of facial age. Perception, 26(6):719–732, 1997.
- P. Paysan, R. Knothe, B. Amberg, S. Romdhani, and T. Vetter. A 3d face model for pose and illumination invariant face recognition. In Advanced video and signal based surveillance, 2009. AVSS’09. Sixth IEEE International Conference on, pages 296–301. IEEE, 2009.
- G. Rhodes, S. Brennan, and S. Carey. Identification and ratings of caricatures: Implications for mental representations of faces. Cognitive Psychology, 19(4):473 – 497, 1987.
- S. B. Sadimon, M. S. Sunar, D. B. Mohamad, and H. Haron. Computer generated caricature: A survey. In International Conference on CyberWorlds, pages 383–390, 2010.
- M. Sela, Y. Aflalo, and R. Kimmel. Computational caricaturization of surfaces. Computer Vision and Image Understanding, 141:1–17, 2015.
- M. A. Shackleton and W. J. Welsh. Classification of facial features for recognition. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 573–579, 1991.
- O. Sorkine. Laplacian mesh processing. In Eurographics (STARs), pages 53–70, 2005.
- O. Sorkine and M. Alexa. As-rigid-as-possible surface modeling. In Proceedings of the Fifth Eurographics Symposium on Geometry Processing, pages 109–116, 2007.
- R. W. Sumner and J. Popović. Deformation transfer for triangle meshes. ACM Transactions on Graphics (TOG), 23(3):399–405, 2004.
- D. Vlasic, M. Brand, H. Pfister, and J. Popović. Face transfer with multilinear models. In ACM transactions on graphics (TOG), volume 24, pages 426–433. ACM, 2005.
- S. Wang and S. Lai. Manifold-based 3d face caricature generation with individualized facial feature extraction. Computer Graphics Forum, 29(7):2161–2168, 2010.
- Y. Yu, K. Zhou, D. Xu, X. Shi, H. Bao, B. Guo, and H.-Y. Shum. Mesh editing with poisson-based gradient field manipulation. In ACM Transactions on Graphics (TOG), volume 23, pages 644–651. ACM, 2004.
- X. Zhu, Z. Lei, J. Yan, D. Yi, and S. Z. Li. High-fidelity pose and expression normalization for face recognition in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 787–796, 2015.