SynSig2Vec: Learning Representations from Synthetic Dynamic Signatures for Real-world Verification
Abstract
An open research problem in automatic signature verification is the defense against skilled forgery attacks. However, skilled forgeries are very difficult to acquire for representation learning. To tackle this issue, this paper proposes to learn dynamic signature representations through ranking synthesized signatures. First, a neuromotor-inspired signature synthesis method is proposed to synthesize signatures with different distortion levels for any template signature. Then, given the templates, we construct a lightweight one-dimensional convolutional network to learn to rank the synthesized samples, and directly optimize the average precision of the ranking to exploit relative and fine-grained signature similarities. Finally, after training, fixed-length representations can be extracted from dynamic signatures of variable lengths for verification. One highlight of our method is that it requires neither skilled nor random forgeries for training, yet it surpasses the state-of-the-art by a large margin on two public benchmarks.
Introduction
Handwritten signatures are among the most socially and legally accepted means of personal authentication. They are generally regarded as a formal and legal means to verify a person's identity in administrative, commercial and financial applications, for example, when signing credit card receipts. Over the last forty years, research interest in automatic signature verification (ASV) has grown steadily, and a number of comprehensive surveys have summarized the state of the art in the field up to 2018 [Plamondon and Lorette1989, Plamondon and Srihari2000, Impedovo and Pirlo2008, Diaz et al.2019]. Nowadays, building an ASV system that separates genuine signatures from random forgeries (produced by a forger who has no knowledge of the authentic author's name or signature) can be considered a solved task, while separating genuine signatures from skilled forgeries (produced by a forger after unrestricted practice) remains an open research problem.
Across the literature, great research effort has been devoted to obtaining good representations for signatures by developing new features and feature selection techniques. With the popularity of deep learning in recent years, several studies have also applied deep learning models to learn representations for dynamic signature sequences or static signature images, and have achieved certain improvements in reducing the verification error against skilled forgeries. However, these methods have several limitations. First of all, they require skilled forgeries as training samples to achieve good performance. As a biometric trait and a special kind of private data, handwritten signatures are nontrivial to collect; skilled forgeries, which require the forgers to practice again and again, are even more difficult to acquire. Therefore, these methods can hardly perform equally well when only genuine signatures are available for training. Second, these methods generally lack an appropriate data augmentation method, which is fundamental in training deep learning models. The main reason lies in the following questions: Which type of data augmentation can essentially capture the variance of the underlying signing process? To what extent can an augmented genuine signature maintain its "genuineness"? Third, existing loss functions do not consider fine-grained signature similarities and tend to overfit. For example, in Siamese networks, positive and negative signature pairs are always labeled with 1s and 0s, respectively, regardless of the actual visual similarities.
In this paper, we focus on dynamic signature verification and propose a novel ASV system without the above-mentioned limitations. A basis of our method is the kinematic theory of rapid human movements and its Sigma Lognormal ($\Sigma\Lambda$) model [Plamondon1995]. The $\Sigma\Lambda$ model hypothesizes that the velocity of a neuromuscular system can be modeled by a vector summation of a number of lognormal functions, each of which is described by six parameters. Rooted in this model, we extract the underlying neuromuscular parameters of genuine signatures and synthesize new signatures by introducing perturbations to the parameters. The level of parameter perturbation controls the level of signature distortion; based on one genuine signature, we can synthesize signatures with various distortion levels, as shown in Fig. 1. Thereafter, many representation learning methodologies can be considered, such as metric learning. In this study, we propose to learn dynamic signature representations by optimizing the average precision (AP) of signature similarity ranking, based on the direct loss minimization framework proposed by [Song et al.2016]. This learning strategy has two benefits. First, as a listwise ranking method, AP optimization can preserve and exploit fine-grained signature similarities in the ranking list. Second, it is expected to improve the performance, since AP is closely related to verification accuracy. Signature similarities are computed as cosine similarities of representations extracted by a one-dimensional convolutional neural network (CNN).
The main contributions of this paper are threefold. First, the application of the $\Sigma\Lambda$ model to signature synthesis not only eliminates the need for skilled forgeries, but also serves as a data augmentation technique. Second, we introduce AP optimization and demonstrate its effectiveness for dynamic signature representation learning. Third, we design a simple yet state-of-the-art CNN structure to extract fixed-length representations from dynamic signatures of variable lengths.
Related Work
The application of deep learning to dynamic signature verification has not been extensively explored, due to the difficulty of collecting large datasets. Existing studies in the field can be roughly divided into two categories. The first category learns local representations [Lai and Jin2019, Wu et al.2019]. It maintains the temporal information of the input and applies dynamic time warping (DTW) to the learned feature sequence; to this end, specific techniques may be needed during training, such as a modified gated recurrent unit [Lai and Jin2019] and signature pre-warping [Wu et al.2019]. The second category learns fixed-length global representations [Tolosana et al.2018, Ahrabian and Babaali2018, Park, Kim, and Choi2019]. For example, [Park, Kim, and Choi2019] used a CNN and time-interval embedding to extract features from dynamic signature strokes, and then utilized recurrent neural networks to aggregate over the strokes. As mentioned in the introduction, the above methods require training with skilled forgeries to enhance their performance when verifying this type of sample, which may be impractical in many situations. Our method only requires genuine signatures, thanks to the introduced synthesis method, and learns global representations using a lightweight CNN.
Two studies relevant to ours also use $\Sigma\Lambda$-based synthetic signatures to train dynamic ASV systems. [Diaz et al.2018] synthesized auxiliary template signatures to enhance several non-deep-learning systems, including DTW-based, HMM-based and Manhattan-based systems; whether synthetic signatures are effective in deep learning was not validated in their study. [Ahrabian and Babaali2018] trained and tested on fully synthetic signatures using a recurrent autoencoder and a Siamese network; whether synthetic data helps to verify real-world signatures was not investigated. Also, different from these two studies, our method learns to rank signatures synthesized at different distortion levels, which is a novel idea in the field of ASV.
Methodology
Table 1: Admissible ranges and distortion levels of the $\Sigma\Lambda$ parameters.

Parameter | Admissible range | Distortion level for $S_1$ | Distortion level for $S_2$
$t_0$$^{*}$ | | 0.1000 | 0.1000
$\mu$ | | 0.0825 | 0.0850
$\sigma$ | | 0.3950 | 0.3775
$D$ | | 0.3250 | 0.2875
$\theta_s$ | Not used | |
$\theta_e$ | Not used | |

$^{*}$[Bhattacharya et al.2017] did not work on $t_0$'s admissible ranges. We decide this range based on our visual tests.
$\Sigma\Lambda$-based Signature Synthesis
The kinematic theory of rapid human movements, from which the $\Sigma\Lambda$ model was developed, suggests that human handwriting consists in controlling the pen-tip velocity with overlapping lognormal impulse responses, called strokes, as illustrated in Fig. 2. The magnitude and direction of the velocity profile of the $j$-th stroke are described as:
$v_j(t) = \frac{D_j}{\sigma_j \sqrt{2\pi}\,(t - t_{0j})} \exp\!\left(-\frac{(\ln(t - t_{0j}) - \mu_j)^2}{2\sigma_j^2}\right)$  (1)
$\phi_j(t) = \theta_{sj} + \frac{\theta_{ej} - \theta_{sj}}{2}\left[1 + \mathrm{erf}\!\left(\frac{\ln(t - t_{0j}) - \mu_j}{\sigma_j \sqrt{2}}\right)\right]$  (2)
where $D_j$ is the amplitude of the stroke, $t_{0j}$ the time occurrence, $\mu_j$ the log time delay, $\sigma_j$ the log response time, and $\theta_{sj}$ and $\theta_{ej}$ respectively the starting and ending angles of the stroke. The velocity of the complete handwriting movement is considered as the vector summation of the individual stroke velocities:
$\vec{v}(t) = \sum_{j=1}^{N} \vec{v}_j(t), \quad \vec{v}_j(t) = v_j(t)\,[\cos\phi_j(t),\ \sin\phi_j(t)]^{T}$  (3)
$N$ being the number of strokes. In short, each stroke is defined by six parameters $(D_j, t_{0j}, \mu_j, \sigma_j, \theta_{sj}, \theta_{ej})$, and a complete handwriting component is defined by the parameter set $P = \{(D_j, t_{0j}, \mu_j, \sigma_j, \theta_{sj}, \theta_{ej})\}_{j=1}^{N}$.$^{1}$ In this study, one complete "component" refers to the trajectory of a pen-down movement, and each component, i.e. pen-down, in the signature is analyzed individually. Although the entire signature can be viewed as a single component by considering the pen-up movements, this practice is less preferred for two reasons. First, it complicates the parameter extraction process. Second, many current devices and datasets do not record pen-ups.
$^{1}$The parameters are extracted based on our implementation of [O'Reilly and Plamondon2009].
Based on the extracted parameters in $P$, the component can be reconstructed as follows:
$x(t) = x_r(t) + \sum_{j=1}^{N} \int_{t_{0j}}^{t} v_j(\tau) \cos\phi_j(\tau)\, d\tau$  (4)
$y(t) = y_r(t) + \sum_{j=1}^{N} \int_{t_{0j}}^{t} v_j(\tau) \sin\phi_j(\tau)\, d\tau$  (5)
where $x_r(t)$ and $y_r(t)$ are the residual trajectories that are considered (by the parameter extraction algorithm) to contain no valid strokes. By introducing perturbations to the parameters in $P$, new synthetic components can be generated. A new signature can thus be generated by synthesizing every component of the template signature.
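As a concrete illustration of Eqs. 1-5, the following NumPy sketch reconstructs a pen-down component by summing lognormal stroke velocities and integrating them into a trajectory. The residual trajectories are taken as zero for simplicity, and the parameter values in the demonstration are arbitrary, not taken from any real signature.

```python
import math
import numpy as np

def stroke_velocity(t, D, t0, mu, sigma):
    """Lognormal speed profile of a single stroke (Eq. 1); zero before onset t0."""
    v = np.zeros_like(t)
    m = t > t0
    dt = t[m] - t0
    v[m] = D / (sigma * np.sqrt(2 * np.pi) * dt) * \
        np.exp(-(np.log(dt) - mu) ** 2 / (2 * sigma ** 2))
    return v

def stroke_angle(t, t0, mu, sigma, th_s, th_e):
    """Direction profile (Eq. 2): sweeps from theta_s to theta_e along the stroke."""
    phi = np.full_like(t, th_s)
    m = t > t0
    z = (np.log(t[m] - t0) - mu) / (sigma * np.sqrt(2))
    phi[m] = th_s + 0.5 * (th_e - th_s) * (1 + np.array([math.erf(v) for v in z]))
    return phi

def reconstruct_component(P, fs=200.0, duration=1.5):
    """Sum the stroke velocity vectors (Eq. 3) and integrate them into x/y
    trajectories (Eqs. 4-5), with zero residual trajectories."""
    t = np.arange(0.0, duration, 1.0 / fs)
    vx = np.zeros_like(t)
    vy = np.zeros_like(t)
    for (D, t0, mu, sigma, th_s, th_e) in P:
        v = stroke_velocity(t, D, t0, mu, sigma)
        phi = stroke_angle(t, t0, mu, sigma, th_s, th_e)
        vx += v * np.cos(phi)
        vy += v * np.sin(phi)
    return np.cumsum(vx) / fs, np.cumsum(vy) / fs

# Two overlapping strokes with arbitrary, purely illustrative parameters
P = [(1.0, 0.10, -1.6, 0.25, 0.0, 0.6),
     (0.8, 0.25, -1.5, 0.30, 0.6, -0.2)]
x, y = reconstruct_component(P)
```

Perturbing the entries of `P` before calling `reconstruct_component` yields distorted variants of the same component, which is the basis of the synthesis described next.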
$\mu_j$, $\sigma_j$, $D_j$ and $t_{0j}$ are distorted as follows:
$\hat{\mu}_j = \mu_j (1 + u_\mu)$  (6)
$\hat{\sigma}_j = \sigma_j (1 + u_\sigma)$  (7)
$\hat{D}_j = D_j (1 + u_D)$  (8)
$\hat{t}_{0j} = t_{0j} (1 + u_{t_0})$  (9)
while $\theta_{sj}$, $\theta_{ej}$ are distorted as follows:
$\hat{\theta}_{sj} = \theta_{sj} + u_{\theta_s}$  (10)
$\hat{\theta}_{ej} = \theta_{ej} + u_{\theta_e}$  (11)
$u_\mu$, $u_\sigma$, $u_D$, $u_{t_0}$, $u_{\theta_s}$, $u_{\theta_e}$ are uniform random variables that decide the signature distortion level, and are fixed for all strokes within a component. A previous study [Bhattacharya et al.2017] carried out visual Turing tests on synthetic characters and worked out the admissible ranges (in percentage terms) of parameter variation: varying parameters outside these ranges makes the character unrecognizable. We borrow their results for $\mu$, $\sigma$ and $D$; as for $t_0$, $\theta_s$ and $\theta_e$, they are empirically restricted to a small range. On this basis, for each genuine signature, two groups of signatures, denoted below as $S_1$ and $S_2$ respectively, are generated with two different distortion levels, as shown in Table 1. Signatures in $S_1$ have lower distortion levels and should rank higher, according to their similarity to the template signature, than those in $S_2$. In this context, one can regard $S_1$ as augmented genuine signatures, and $S_2$ as synthetic skilled forgeries. These two distortion levels correspond to the low and median distortion levels illustrated in Fig. 1, and are determined through a simple coarse grid search, leaving room for future improvement.
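The two-level perturbation scheme can be sketched as follows. The bound values and the exact multiplicative/additive forms below are simplified illustrative assumptions, not the paper's Table 1 values; what matters is that one uniform variable per parameter type is drawn and shared by all strokes of a component.

```python
import numpy as np

# Illustrative per-parameter perturbation bounds for the two distortion
# levels; these numbers are placeholders, not the values from Table 1.
BOUNDS = {
    "S1": {"D": 0.05, "t0": 0.05, "mu": 0.05, "sigma": 0.05, "theta": 0.05},
    "S2": {"D": 0.15, "t0": 0.10, "mu": 0.10, "sigma": 0.15, "theta": 0.10},
}

def perturb_component(P, level, rng):
    """Draw one uniform variable per parameter type, shared by all strokes of
    the component, and apply relative perturbations to (D, t0, mu, sigma) and
    additive ones to the angles (a simplified stand-in for Eqs. 6-11)."""
    b = BOUNDS[level]
    u = {k: rng.uniform(-v, v) for k, v in b.items()}
    return [(D * (1 + u["D"]), t0 * (1 + u["t0"]), mu * (1 + u["mu"]),
             sigma * (1 + u["sigma"]), th_s + u["theta"], th_e + u["theta"])
            for (D, t0, mu, sigma, th_s, th_e) in P]

rng = np.random.default_rng(0)
P = [(1.0, 0.10, -1.6, 0.25, 0.0, 0.6)]
low = perturb_component(P, "S1", rng)    # mild distortion -> group S1
high = perturb_component(P, "S2", rng)   # stronger distortion -> group S2
```

Sampling repeatedly at the "S1" and "S2" levels produces the two ranked groups of synthetic signatures used for training.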
To preserve and exploit fine-grained signature similarities, we construct one-dimensional CNNs to learn to rank these synthesized signatures, and optimize the AP of the ranking, as described in the following section.
Average Precision Optimization
Given one genuine signature and its synthetic samples, we compute and rank their similarities and incorporate the AP of the ranking into the loss function for optimization. Because AP is non-differentiable with respect to the CNN's outputs, we resort to the General Loss Gradient Theorem proposed by [Song et al.2016], which provides the weight update rule for the optimization of AP. Specifically, a neural network can be viewed as a composite scoring function $F(x, y, w)$, which depends on the input $x$, the output $y$, and some parameters $w$. The theorem (cited here for completeness) states that:
Theorem 1
When given a finite set $\mathcal{Y}$, a scoring function $F(x, y, w)$, a data distribution, as well as a task loss $L(y, \hat{y})$, then, under some mild regularity conditions, the direct loss gradient has the following form:
$\nabla_w L(y, y_w) = \pm \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ \nabla_w F(x, y_{direct}, w) - \nabla_w F(x, y_w, w) \right]$  (12)
with
$y_w = \arg\max_{\hat{y} \in \mathcal{Y}} F(x, \hat{y}, w)$  (13)
$y_{direct} = \arg\max_{\hat{y} \in \mathcal{Y}} \left[ F(x, \hat{y}, w) \pm \epsilon L(y, \hat{y}) \right]$  (14)
In Eq. 12, two directions ($\pm$) of optimization can be used. The positive direction steps away from worse parameters, while the negative direction moves toward better ones.
In the context of this paper, $x$ refers to the concatenation of the two synthetic signature groups $S_1$ and $S_2$ generated for one given genuine signature $x_g$, and $y = \{y_{ij}\}$ is the collection of all pairwise comparisons, where $y_{ij} = 1$ if $x_i$ is ranked higher than $x_j$, and $y_{ij} = -1$ otherwise. We define the scoring function as follows:
$F(x, y, w) = \frac{1}{|S_1||S_2|} \sum_{i \in S_1} \sum_{j \in S_2} y_{ij} \left[ \phi(x_i, x_g, w) - \phi(x_j, x_g, w) \right]$  (15)
where
$\phi(x_i, x_g, w) = \frac{f(x_i; w) \cdot f(x_g; w)}{\|f(x_i; w)\| \, \|f(x_g; w)\|}$  (16)
and $f$ is the embedding function parameterized by a CNN with learnable parameters $w$. The scoring function is inherited from [Yue et al.2007], and $\phi$ measures the cosine similarity of the representations of samples $x_i$ and $x_g$. We notice that similar definitions to Eqs. 15 and 16 have been used for few-shot learning [Triantafillou, Zemel, and Urtasun2017].
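Under these definitions, the scoring function of Eq. 15 can be computed directly from the cosine similarities; a minimal sketch, with small hand-picked similarity values in place of real CNN outputs:

```python
import numpy as np

def cosine_sim(a, b):
    """phi of Eq. 16: cosine similarity of two representation vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def scoring_function(y, phi_s1, phi_s2):
    """F of Eq. 15: y[i][j] weighs the similarity gap between the i-th S1
    sample and the j-th S2 sample; normalized by |S1||S2|."""
    total = sum(y[i][j] * (phi_s1[i] - phi_s2[j])
                for i in range(len(phi_s1)) for j in range(len(phi_s2)))
    return total / (len(phi_s1) * len(phi_s2))

# A correct ranking (every S1 sample above every S2 sample) maximizes F
phi_s1 = [0.9, 0.8]   # similarities of S1 samples to the genuine signature
phi_s2 = [0.3]        # similarities of S2 samples
y_correct = [[1], [1]]
y_wrong = [[-1], [-1]]
```

As expected, `scoring_function` is larger for the ranking that places all $S_1$ samples above all $S_2$ samples.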
Further, let $p \in \{0, 1\}^{|S_1| + |S_2|}$ be a vector constructed by sorting the data points according to the ranking defined by $y$, such that $p_j = 1$ if the $j$-th data point belongs to $S_1$ and $p_j = 0$ otherwise. Then, given ground-truth and predicted configurations $p$ and $\hat{p}$, the AP loss is
$L_{AP}(p, \hat{p}) = 1 - \frac{1}{|S_1|} \sum_{j: \hat{p}_j = 1} \mathrm{Prec@}j$  (17)
where $\mathrm{Prec@}j$ is the proportion of the top $j$ ranked samples that belong to $S_1$. To compute the AP loss gradient, we need to infer $y_w$ and $y_{direct}$ in Eqs. 13 and 14. For $y_w$, the solution is simple:
$(y_w)_{ij} = \mathrm{sign}\left[ \phi(x_i, x_g, w) - \phi(x_j, x_g, w) \right]$  (18)
$y_{direct}$, in contrast, can be inferred via a dynamic programming algorithm [Song et al.2016]. To prevent over-confidence of the scoring function, we add a regularization term to the AP loss, and obtain the following gradient:
$\nabla_w \tilde{L} = \pm \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left[ \nabla_w F(x, y_{direct}, w) - \nabla_w F(x, y_w, w) \right] + \lambda \nabla_w \sum_{i} \phi(x_i, x_g, w)^2$  (19)
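The AP loss of Eq. 17 for a given ranking can be computed in a few lines. This sketch evaluates the loss only; the loss-augmented inference of $y_{direct}$ needed for the gradient is the dynamic program of [Song et al.2016] and is omitted here.

```python
import numpy as np

def ap_loss(similarities, is_s1):
    """1 - AP of the ranking induced by the similarities (Eq. 17).
    is_s1[i] = 1 if sample i belongs to S1 (should rank high), else 0."""
    order = np.argsort(-np.asarray(similarities))   # descending similarity
    p_hat = np.asarray(is_s1)[order]                # predicted configuration
    hits = np.cumsum(p_hat)                         # S1 samples seen so far
    prec_at = hits / (np.arange(len(p_hat)) + 1)    # Prec@j for every position j
    ap = float((prec_at * p_hat).sum() / p_hat.sum())
    return 1.0 - ap

# A perfect ranking (all S1 samples above all S2 samples) gives zero loss
loss_perfect = ap_loss([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
loss_reversed = ap_loss([0.2, 0.3, 0.8, 0.9], [1, 1, 0, 0])
```

Unlike a pairwise loss, this quantity depends on the whole ordered list, which is what lets the training signal reflect fine-grained relative similarities.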
Besides the AP loss, we also employ the standard cross-entropy loss for signature classification. Intuitively, these two losses protect the ASV system from skilled-forgery and random-forgery attacks, respectively. The overall optimization process is given in Algorithm 1.
Framework of the SynSig2Vec ASV Method
In this part, we describe our overall framework, including signature preprocessing, CNN structure, and how we construct the verifier based on CNN representations.
Signature synthesis and preprocessing. For signature synthesis, each signature component is resampled at 200 Hz, which is the suggested sampling rate for $\Sigma\Lambda$ parameter extraction [O'Reilly and Plamondon2009]. Then, a Butterworth low-pass filter with a cutoff frequency of 10 Hz is applied to the resampled trajectory to enhance the signal. The filtered component is then used for parameter extraction; based on the extracted parameters, a synthetic component is generated and resampled at 100 Hz, which is the sampling rate of most existing dynamic signature datasets. The real handwritten signatures used in this study (all collected at 100 Hz) are also filtered with the Butterworth low-pass filter, to be consistent with the synthetic ones.
As we have omitted the pen-up components, we use a straight line to connect the end of a pen-down component and the beginning of the next one. These lines can be viewed as virtual pen-ups and are assigned a constant speed equal to the average speed of the pen-downs. Because the essential difference between genuine signatures and synthetic ones lies in their velocity profiles, we extract three feature sequences as follows:
$\Delta x_t = x_{t+1} - x_t$  (20)
$\Delta y_t = y_{t+1} - y_t$  (21)
$v_t = \sqrt{\Delta x_t^2 + \Delta y_t^2}$  (22)
where $x$ and $y$ are the coordinate sequences. These three feature sequences are normalized to have zero mean and unit variance, and are then used as inputs to the CNN.
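A direct implementation of Eqs. 20-22 with per-signature normalization, applied here to a synthetic circular trajectory for illustration:

```python
import numpy as np

def signature_features(x, y):
    """Build the three input channels of Eqs. 20-22 (first-order coordinate
    differences and pointwise speed), z-normalized per signature."""
    dx = np.diff(x)
    dy = np.diff(y)
    v = np.sqrt(dx ** 2 + dy ** 2)
    feats = np.stack([dx, dy, v])                    # shape (3, T-1)
    mean = feats.mean(axis=1, keepdims=True)
    std = feats.std(axis=1, keepdims=True) + 1e-8    # guard against zero variance
    return (feats - mean) / std

t = np.linspace(0.0, 1.0, 101)
feats = signature_features(np.cos(6 * t), np.sin(6 * t))
```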
Network structure. A one-dimensional CNN with six convolution layers and scaled exponential linear units (SELUs) [Klambauer et al.2017], as shown in Table 2, is employed to learn fixed-length representations from signature sequences. Batch normalization is not applied, because during training each batch consists of only one genuine signature and its synthesized samples, which are non-i.i.d. Nevertheless, SELU provides an alternative normalization effect, and is found to work surprisingly well in our study. Because the signature length may vary after synthesis, we pad all signatures inside a batch with zeros to the maximal length, and a corresponding mask is generated to perform masked average pooling on the feature sequences coming out of the sixth convolutional layer.
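The zero-padding and masked average pooling described above can be sketched as follows, here as a NumPy stand-in for what the network does after its last convolution:

```python
import numpy as np

def masked_average_pool(batch_feats, lengths):
    """Average each feature sequence over its true length only, so that the
    zero padding added for batching does not dilute the representation.
    batch_feats: (B, C, T_max) padded feature maps; lengths: valid lengths."""
    B, C, T_max = batch_feats.shape
    lengths = np.asarray(lengths)
    mask = np.arange(T_max)[None, :] < lengths[:, None]     # (B, T_max)
    summed = (batch_feats * mask[:, None, :]).sum(axis=2)   # (B, C)
    return summed / lengths[:, None]

# Two sequences of lengths 3 and 5, padded to T_max = 5
feats = np.zeros((2, 1, 5))
feats[0, 0, :3] = [1.0, 2.0, 3.0]   # padding at positions 3-4 must be ignored
feats[1, 0, :] = [1.0, 1.0, 1.0, 1.0, 1.0]
pooled = masked_average_pool(feats, [3, 5])
```

Without the mask, the first sequence would be averaged over five positions instead of three, biasing its representation toward zero.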
The receptive field of the sixth convolutional layer is 54. For dynamic signatures sampled at 100 Hz, such a receptive field covers a time interval of about 0.5 seconds and captures several lognormal strokes. We experimented with deeper networks and larger receptive fields, but found no significant further improvements. We also explored residual connections, but found that they degraded the performance.
A 256-dimensional feature vector is obtained from the masked average pooling layer and then fed into two branches. The first branch is a fully connected layer with softmax activation, on top of which the cross-entropy loss is computed. The second branch is a fully connected layer with 512 neurons, on top of which the AP loss is computed. The two losses are used together to optimize the network parameters, as described in Algorithm 1. After training, only the second branch is kept; for any given dynamic signature, a 512-dimensional feature vector can then be extracted as its representation.
Table 2: Network configuration.

Layer | Configuration
1 | 1D convolution, 64, k7, s1, p3
2 | 1D max pooling, k2, s2
3 | 1D convolution, 64, k3, s1, p1
4 | 1D convolution, 128, k3, s1, p1
5 | 1D max pooling, k2, s2
6 | 1D convolution, 128, k3, s1, p1
7 | 1D convolution, 256, k3, s1, p1
8 | 1D max pooling, k2, s2
9 | 1D convolution, 256, k3, s1, p1
10 | Masked 1D average pooling
11 | Branch 1: FC with softmax activation (#classes), cross-entropy loss | Branch 2: FC, 512, AP loss
Verifier. We use a distance-based verifier with the same normalization technique as in [Lai and Jin2019]. Specifically, given two signatures $s_i$ and $s_j$, we compute the Euclidean distance of their normalized representation vectors:
$d(s_i, s_j) = \left\| \frac{f(s_i)}{\|f(s_i)\|} - \frac{f(s_j)}{\|f(s_j)\|} \right\|_2$  (23)
where $f(\cdot)$ is the 512-dimensional feature vector extracted by the CNN. Given $T$ template signatures $\{s_i^c\}_{i=1}^{T}$ from client $c$, we compute the average of their pairwise distances as $\bar{d}_c$ ($\bar{d}_c = 1$ if $T = 1$). Then, for a test signature $s$ claimed to be from client $c$, we compute the following scores:
$score_i = \frac{d(s, s_i^c)}{\bar{d}_c}, \quad i = 1, \ldots, T$  (24)
From these scores, the average score and the minimum score are computed, leading to a 2D scatter plot on which we fit a line and make the decision. In practice, we find that a simple combination of the two scores already works well. By varying the decision threshold, we can obtain the equal error rates (EERs) to assess the system performance. Unless mentioned otherwise, a global threshold for all individuals is used.
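A sketch of the verifier described above: Eq. 23's normalized distance, the template-based score normalization, and the average/minimum statistics used for the decision. The random 512-dimensional vectors stand in for CNN representations; the decision threshold itself is tuned on the resulting score distribution and is not shown.

```python
import numpy as np
from itertools import combinations

def normalized_distance(f1, f2):
    """Eq. 23: Euclidean distance between L2-normalized representations."""
    u = f1 / np.linalg.norm(f1)
    v = f2 / np.linalg.norm(f2)
    return float(np.linalg.norm(u - v))

def verification_scores(test_feat, template_feats):
    """Distances from the test signature to each template, normalized by the
    average pairwise template distance (taken as 1.0 for a single template)."""
    T = len(template_feats)
    if T > 1:
        d_bar = float(np.mean([normalized_distance(a, b)
                               for a, b in combinations(template_feats, 2)]))
    else:
        d_bar = 1.0
    scores = [normalized_distance(test_feat, t) / d_bar for t in template_feats]
    return float(np.mean(scores)), float(np.min(scores))

rng = np.random.default_rng(0)
templates = [rng.normal(size=512) for _ in range(5)]
avg_s, min_s = verification_scores(templates[0], templates)
```

Normalizing by the average pairwise template distance adapts the score scale to each client's natural intra-class variability, which is what makes a single global threshold viable.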
Experiments
Datasets and protocols
Two benchmark dynamic signature datasets were used in this study, namely MCYT-100 [OrtegaGarcia et al.2003] and SVC-Task2 [Yeung et al.2004]. The MCYT-100 dataset consists of 100 individuals, with 25 genuine signatures and 25 skilled forgeries per individual. The SVC-Task2 dataset contains 40 individuals, with 20 genuine and 20 forged signatures per individual.
For MCYT-100, we used 10-fold cross-validation: the $i$-th fold comprised the $i$-th group of ten individuals, and we trained the models in a round-robin fashion on all but one of the folds. Skilled forgeries were not included in the training set. In the testing stage, we considered two scenarios, namely T5 and T1. In scenario T5, five genuine signatures were randomly selected as templates for each individual in the test fold, while the remaining 20 genuine and 25 forged signatures were used for testing; EERs were computed and averaged over 50 trials. In scenario T1, each genuine signature was considered in turn as the single template and tested against the remaining signatures. Finally, for both scenarios, the EERs were averaged over the 10 test folds.
For SVC-Task2, we similarly used 10-fold cross-validation and considered scenarios T5 and T1. The training set for each fold contained only the genuine signatures of 36 individuals; therefore, both the network and the learning algorithm have to be highly data-efficient.
Implementation details
For signature synthesis, to accelerate training, synthetic signatures were first generated offline to create two data pools, one for $S_1$ and one for $S_2$, for each genuine signature. Then, during training, $S_1$ and $S_2$ were drawn from the corresponding pools. Counting the genuine signature itself, the batch size was $1 + |S_1| + |S_2|$.
For AP optimization, we chose the positive direction, and $\epsilon$ was set to 5 for MCYT-100 and 10 for SVC-Task2; a larger value of $\epsilon$ for SVC-Task2 led to better generalization because of the small dataset scale. The models were optimized using stochastic gradient descent. The learning rate, momentum and weight decay were set to 0.001, 0.9 and 0.001, respectively. We trained for a number of batches proportional to the number of classes in the training set (90 for MCYT-100 and 36 for SVC-Task2). For the proposed method, the final models were evaluated to report the EERs; for the compared methods, the best models were evaluated to see their capacities.
Results
First, based on synthetic signatures, we compared the AP loss with the triplet loss and binary cross-entropy (BCE) to examine its properties. For the triplet loss, the distance metric in Eq. 23 was used. The eight hardest triplets were mined from the $|S_1| \cdot |S_2|$ triplets in each batch, and the margin was set to 0.25; pairwise distances within $S_1$ were also added to the loss to minimize the intra-class variance. For BCE, the cosine similarity in Eq. 16 was computed, rescaled, and activated by the sigmoid function. Within a batch, there were $|S_1|$ positive and $|S_2|$ negative pairs, labeled as (0.9, 0.1) and (0.5, 0.5), respectively; the loss of the positive pairs was doubled for balance.
Table 3: EERs (%) of different loss functions.

Losses | Dataset | T5 | T1
BCE | MCYT-100 | 2.85 | 7.90
Triplet | MCYT-100 | 2.83 | 8.51
AP loss | MCYT-100 | 1.71 | 5.50
BCE | SVC-Task2 | 6.26 | 14.01
Triplet | SVC-Task2 | 6.93 | 15.37
AP loss | SVC-Task2 | 4.65 | 11.96
The EER curves (using a global threshold), as functions of the number of trained batches, are shown in Fig. 3. First, we can see that the AP loss consistently outperformed the other two losses. Second, the AP loss was more robust against overfitting, and continued to decrease the EERs in late training iterations. Third, somewhat surprisingly, BCE and the triplet loss exhibited different behaviors on the two datasets. A possible reason is that the triplet loss involves considerable complexity in hard-sample mining and in choosing a proper margin, both of which must be treated carefully for each dataset. Detailed EERs are given in Table 3. On the SVC-Task2 dataset, the previous best EERs are 7.80% in scenario T5 and 18.25% in scenario T1, and our method reduces the EERs by 40.4% and 34.5%, respectively. On the MCYT-100 dataset, our method reduces the EERs by 5.6% and 59.4% in scenarios T5 and T1, respectively. This large performance improvement demonstrates that our model can extract intrinsic and robust representations from real-world signatures by learning from synthesized ones.
We further compared synthetic signatures with real handwritten ones. Specifically, we compared the following cases:
1) Signatures in $S_1$ and $S_2$ were replaced with genuine signatures and skilled forgeries, respectively;
2) Signatures in $S_2$ were replaced with skilled forgeries;
3) Signatures in $S_1$ were replaced with genuine signatures;
4) Signatures in both $S_1$ and $S_2$ were synthetic.
All models were trained in exactly the same way using the AP loss, and the EER curves are shown in Fig. 4. There are two important observations. First, synthesized signatures were even more effective than real handwritten signatures for training, because they are constructed from perturbed parameters and therefore tightly bound the template signatures. Second, case 3 converged the fastest, but led to slight overfitting on SVC-Task2 and severe overfitting on MCYT-100. Therefore, when using case 3, a representative validation set is necessary; when using case 4, we can simply train for a large number of batches and use the final models, which are generally also the best-performing ones. Detailed EERs are given in Table 4.
Table 4: EERs (%) when synthetic signatures are replaced with real ones (cases 1-4).

$S_1$ | $S_2$ | Dataset | T5 | T1
real | real | MCYT-100 | 1.99 | 6.21
synthetic | real | MCYT-100 | 1.92 | 6.37
real | synthetic | MCYT-100 | 2.16 | 6.07
synthetic | synthetic | MCYT-100 | 1.71 | 5.50
real | real | SVC-Task2 | 5.04 | 13.62
synthetic | real | SVC-Task2 | 4.98 | 13.83
real | synthetic | SVC-Task2 | 5.05 | 12.58
synthetic | synthetic | SVC-Task2 | 4.65 | 11.96
Comparison with the state of the art
Table 5: Comparison with state-of-the-art methods in terms of EER (%).

Dataset | Method | #Templates | Global threshold | User-specific threshold
MCYT-100 | SRSS based on $\Sigma\Lambda$ model [Diaz et al.2018] | 1 | 13.56 | –
MCYT-100 | SynSig2Vec (Ours) | 1 | 5.50 | 2.15
MCYT-100 | Symbolic representation [Guru et al.2017] | 5 | 5.70 | 2.20
MCYT-100 | DTW warping path score [Sharma and Sundaram2018] | 5 | 2.76 | 1.15
MCYT-100 | DTW with SCC [Xia et al.2017] | 5 | – | 2.15
MCYT-100 | Recurrent adaptation networks [Lai and Jin2019] | 5 | 1.81 | –
MCYT-100 | SynSig2Vec (Ours) | 5 | 1.71 | 0.93
SVC-Task2 | SRSS based on $\Sigma\Lambda$ model [Diaz et al.2018] | 1 | 18.25 | –
SVC-Task2 | SynSig2Vec (Ours) | 1 | 11.96 | 7.34
SVC-Task2 | DTW warping path score [Sharma and Sundaram2018] | 5 | 7.80 | 2.53
SVC-Task2 | DTW with SCC [Xia et al.2017] | 5 | – | 2.63
SVC-Task2 | SynSig2Vec (Ours) | 5 | 4.65 | 2.63
In Table 5, we compare our method with state-of-the-art methods on the MCYT-100 and SVC-Task2 datasets. Our method achieves substantial improvements over previous methods, especially in scenario T1, where only one template is available: it reduces the EERs by 59.4% (=(13.56−5.50)/13.56×100%) on MCYT-100 and by 34.5% (=(18.25−11.96)/18.25×100%) on SVC-Task2.
Limitations and future work
Several issues in SynSig2Vec need further study, such as the effects of the signature distortion levels and of the number of synthetic signatures. Besides, SynSig2Vec uses only the pen-down components; further improvements are expected in future work by also considering the pen-ups, when available.
Conclusion
In this paper, we propose to learn dynamic signature representations through ranking synthesized signatures. The $\Sigma\Lambda$ model is introduced to synthesize two groups of signatures for each given genuine signature; signatures in the first group have lower distortion levels and should rank higher, according to their similarity to the template signature, than those in the second group. We construct a lightweight one-dimensional CNN to learn to rank these synthesized samples, and incorporate the AP of the ranking into the loss function for optimization. Our method requires only genuine signatures for training, yet substantially improves the state-of-the-art performance on two public benchmarks. In particular, when only one template signature is available to the verifier, our method surpasses the state of the art by an absolute 8.06% EER on the MCYT-100 benchmark and by 6.29% on the SVC-Task2 benchmark, showing its effectiveness and great potential.
References
 [Ahrabian and Babaali2018] Ahrabian, K., and Babaali, B. 2018. Usage of autoencoders and Siamese networks for online handwritten signature verification. Neural Computing and Applications 1–14.
 [Bhattacharya et al.2017] Bhattacharya, U.; Plamondon, R.; Chowdhury, S. D.; Goyal, P.; and Parui, S. K. 2017. A sigma-lognormal model-based approach to generating large synthetic online handwriting sample databases. International Journal on Document Analysis and Recognition (IJDAR) 20(3):155–171.
 [Diaz et al.2018] Diaz, M.; Fischer, A.; Ferrer, M. A.; and Plamondon, R. 2018. Dynamic signature verification system based on one real signature. IEEE Transactions on Cybernetics 48(1):228–239.
 [Diaz et al.2019] Diaz, M.; Ferrer, M. A.; Impedovo, D.; Malik, M. I.; Pirlo, G.; and Plamondon, R. 2019. A perspective analysis of handwritten signature technology. ACM Computing Surveys (CSUR) 51(6):117.
 [Guru et al.2017] Guru, D.; Manjunatha, K.; Manjunath, S.; and Somashekara, M. 2017. Interval valued symbolic representation of writer dependent features for online signature verification. Expert Systems with Applications 80:232–243.
 [Impedovo and Pirlo2008] Impedovo, D., and Pirlo, G. 2008. Automatic signature verification: The state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 38(5):609–635.
 [Klambauer et al.2017] Klambauer, G.; Unterthiner, T.; Mayr, A.; and Hochreiter, S. 2017. Selfnormalizing neural networks. In Advances in Neural Information Processing Systems, 971–980.
 [Lai and Jin2019] Lai, S., and Jin, L. 2019. Recurrent adaptation networks for online signature verification. IEEE Transactions on Information Forensics and Security 14(6):1624–1637.
 [OrtegaGarcia et al.2003] Ortega-Garcia, J.; Fierrez-Aguilar, J.; Simon, D.; Gonzalez, J.; Faundez-Zanuy, M.; Espinosa, V.; Satue, A.; Hernaez, I.; Igarza, J.-J.; Vivaracho, C.; et al. 2003. MCYT baseline corpus: A bimodal biometric database. IEE Proceedings - Vision, Image and Signal Processing 150(6):395–401.
 [O’Reilly and Plamondon2009] O’Reilly, C., and Plamondon, R. 2009. Development of a Sigma–Lognormal representation for online signatures. Pattern Recognition 42(12):3324–3337.
 [Park, Kim, and Choi2019] Park, C.Y.; Kim, H.G.; and Choi, H.J. 2019. Robust online signature verification using longterm recurrent convolutional network. In IEEE International Conference on Consumer Electronics (ICCE), 1–6. IEEE.
 [Plamondon and Lorette1989] Plamondon, R., and Lorette, G. 1989. Automatic signature verification and writer identification—the state of the art. Pattern Recognition 22(2):107–131.
 [Plamondon and Srihari2000] Plamondon, R., and Srihari, S. N. 2000. Online and offline handwriting recognition: A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(1):63–84.
 [Plamondon1995] Plamondon, R. 1995. A kinematic theory of rapid human movements. Biological Cybernetics 72(4):295–307.
 [Sharma and Sundaram2018] Sharma, A., and Sundaram, S. 2018. On the exploration of information from the DTW cost matrix for online signature verification. IEEE Transactions on Cybernetics 48(2):611–624.
 [Song et al.2016] Song, Y.; Schwing, A.; Urtasun, R.; et al. 2016. Training deep neural networks via direct loss minimization. In International Conference on Machine Learning, 2169–2177.
 [Tolosana et al.2018] Tolosana, R.; VeraRodriguez, R.; Fierrez, J.; and OrtegaGarcia, J. 2018. Exploring recurrent neural networks for online handwritten signature biometrics. IEEE Access 6:5128–5138.
 [Triantafillou, Zemel, and Urtasun2017] Triantafillou, E.; Zemel, R.; and Urtasun, R. 2017. Fewshot learning through an information retrieval lens. In Advances in Neural Information Processing Systems, 2255–2265.
 [Wu et al.2019] Wu, X.; Kimura, A.; Uchida, S.; and Kashino, K. 2019. Prewarping Siamese network: Learning local representations for online signature verification. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2467–2471. IEEE.
 [Xia et al.2017] Xia, X.; Chen, Z.; Luan, F.; and Song, X. 2017. Signature alignment based on GMM for online signature verification. Pattern Recognition 65:188–196.
 [Yeung et al.2004] Yeung, D.Y.; Chang, H.; Xiong, Y.; George, S.; Kashi, R.; Matsumoto, T.; and Rigoll, G. 2004. SVC2004: First international signature verification competition. In International Conference on Biometric Authentication, 16–22. Springer.
 [Yue et al.2007] Yue, Y.; Finley, T.; Radlinski, F.; and Joachims, T. 2007. A support vector method for optimizing average precision. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 271–278. ACM.