A New Classification Approach for Robotic Surgical Tasks Recognition
Automatic recognition and classification of tasks in robotic surgery is an important stepping stone toward automated surgery and surgical training. Recent technical breakthroughs in data gathering make data-driven model development possible. In this paper, we propose a framework for high-level robotic surgery task recognition using motion data. We present a novel classification technique that classifies three important surgical tasks through quantitative analysis of motion: knot tying, needle passing and suturing. The proposed technique integrates state-of-the-art data mining and time series analysis methods. The first step of this framework is to develop a time series distance-based similarity measure using derivative dynamic time warping (DDTW). The distance-weighted $k$-nearest neighbor algorithm is then used to classify task instances. The framework was validated using an extensive dataset. Our results demonstrate the strength of the proposed framework in recognizing fundamental robotic surgery tasks.
Mehrdad J. Bani and Shoele Jamali
Manuscript received June 26, 2017; revised June 26, 2017. M. J. Bani and S. Jamali are with the Department of Computer Science, Shahid Beheshti University, Tehran, Iran (e-mail: email@example.com).
Classification, Derivative dynamic time warping (DDTW), $k$-nearest neighbor, Robotic-assisted surgery, Task recognition.
The hospital operating room is a challenging work environment. Recently, some of these challenges have been addressed by introducing technological innovations such as robotic surgery [1, 2], which promises to improve patient treatment by enabling shorter hospital stays, shortening recovery time and reducing the risk of infection. Current implementations operate in a tele-operation mode where the robotic surgery system relies exclusively on direct surgeon input.
Future advances will automate more aspects of robotic surgery procedures [3, 4]. It is, however, quite clear that developing such autonomous systems requires a more rigorous model of surgical procedures. Surgical motions need to be modeled and quantified to make them amenable to further study. Goal-oriented human motion and human language are analogous, as both consist of low-level elements that, when combined in meaningful sequences, result in an emergent meaning or higher-level task. Hence, techniques that have been applied effectively in the analysis of human speech and language are natural candidates for surgical motion modeling. Consequently, the “Language of Surgery” has been defined as a systematic description of surgical activities and rules for decomposition. More specifically, the language of surgical motion describes the particular activities that surgeons perform with their instruments or hands to accomplish a planned surgical objective. Current systems like the da Vinci (Intuitive Surgical, Sunnyvale, CA) record motion and video data, enabling the development of computational models that recognize and analyze surgical performance through data-driven approaches. Recent advances in data mining research for uncovering concealed patterns in large datasets, such as kinematic and video data, offer the possibility of better understanding surgical procedures from a system point of view. Thus, the key step for advancing research in surgical task recognition is to develop techniques that are capable of accurately recognizing fundamental surgical tasks such as suturing, knot tying and needle passing.
In this paper we extend previous work by presenting a new framework to classify robotic-assisted surgical tasks based on Derivative Dynamic Time Warping (DDTW) combined with the well-known distance-weighted $k$-nearest neighbor ($k$NN) classification method.
II Related Work
In recent years, recognizing and understanding surgical procedures at different levels of granularity has been a focus of research [8, 9]. Surgical procedures can generally be broken down into four main levels, from higher to lower: phases, steps, tasks and motions. At the higher level of surgical process modeling, statistical models have been proposed that use recorded force and motion data [10, 11], surgical tool usage and video data to classify surgery phases. Most existing work has addressed the recognition of activities using techniques such as neural networks and Hidden Markov Models (HMM) [14, 15]. At the lower level, effort has been applied to detecting surgical motion [16, 17] or modeling surgical gestures and classifying them using methods such as HMM and Linear Discriminant Analysis (LDA). A common drawback of these methods is that they are time consuming and require significant human interaction and pre-processing.
While many studies in the literature have focused on detecting surgical motion at the more granular level [19, 20], quantitative classification techniques that can serve as a framework for differentiating important tasks during surgical procedures still need to be investigated. Here, a task is defined as a sequence of activities used to achieve a surgical objective. This work focuses on three fundamental tasks in robotic-assisted minimally invasive surgery: suturing, knot tying and needle passing. These tasks are commonly part of a surgical skills training curriculum. With the advent of robotic surgery devices, a large amount of data, including temporal kinematic signals, can be captured during surgeries. Our work seeks to take this information and build a framework to recognize three main robotic surgery tasks by measuring similarities between their temporal data and underlying signatures.
Dynamic Time Warping (DTW) is a well-known technique for time series classification. In the surgical domain, it has been used to classify surgical processes and surgical gestures. While DTW has been used successfully in many domains, it may fail to find the obvious natural alignment between two sequences when their signals differ significantly over time. Derivative Dynamic Time Warping (DDTW) was therefore proposed, and it has been shown to provide promising results in addressing this issue. The similarity derived from DTW or DDTW can be used as an input to the $k$-Nearest Neighbors algorithm ($k$NN), a popular classification method, to classify new data based on its similarity to other sample data [25, 26].
The main focus of our work is to investigate the feasibility of task classification during robotic-assisted surgery. This is in contrast to most work in this domain, which has used video data or observation-based methods. We develop a distance-weighted $k$NN classification method using similarity measures derived from DTW and DDTW. Our work differs from previous studies in that we use only the Cartesian position data of the right and left hand tools, with minimal pre-processing, which results in a simple, straightforward and accurate framework.
The aim of our work is to recognize robotic surgery tasks. As noted before, this work focuses on three important fundamental robotic surgery tasks: knot tying, needle passing and suturing. These tasks are part of a fundamental laparoscopic surgery (FLS) skills training program [28, 29]. The classification framework developed in this study consists of three key components. The first component is the quantitative measurement of the different tasks: we analyze motion data from a robotic surgery device to extract multivariate time series datasets that represent the tasks. After preprocessing and normalization of the data, the second component measures the similarity between surgical tasks; in this study we employ DDTW to measure similarity between multidimensional time series. The third component is the classification algorithm, which is based on the distance-weighted $k$-nearest neighbor approach. The combination of these three steps results in a novel task classification framework for robotic surgery data. Figure 1 shows a summary of our proposed framework. In the following sections, each step in the framework is discussed in detail.
III-A Quantification of Robotic Surgery Tasks
In this study, we implement our model using the “JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS)”, in which data were gathered using a da Vinci surgical system. This surgical activity data includes different kinds of robotic-assisted surgery features, such as surgeon kinematic and video data captured during surgical procedures through an Application Programming Interface (API). Using a da Vinci, a surgeon operates passive master tool manipulators (MTMs), directing the resultant teleoperated movement of active patient-side manipulators (PSMs). Time series data for each of the robot arms (MTMs and PSMs) were gathered for three fundamental tasks: knot tying, needle passing and suturing.
III-B Similarity Measures
The choice of method for measuring (dis)similarity is a critical step in achieving valid classification results. One of the primary issues in measuring the similarity between two time series with a distance measure such as the Euclidean distance is that the outcome can, in some cases, be highly unintuitive because of sensitivity to distortion in the time axis (Fig. 2). If, for instance, two time series are identical but slightly out of phase with one another, then a distance measure such as the Euclidean distance will give an extremely poor similarity measure. Dynamic Time Warping (DTW) was developed to overcome this problem. In this work, we propose a novel implementation of DTW and a related method, DDTW, for time series data of robotic surgery tasks.
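The sensitivity of a lock-step distance to phase shifts can be seen in a small numerical sketch. The signals below are synthetic and purely illustrative (they are not taken from the surgical dataset), and the helper name `euclidean_lockstep` is hypothetical: a sine wave is compared with a copy of itself shifted by a quarter period and with a flat line.

```python
import numpy as np

def euclidean_lockstep(a, b):
    """Lock-step Euclidean distance between two equal-length 1-D series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("lock-step distance needs equal-length series")
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Synthetic signals, chosen only to illustrate the effect.
t = np.linspace(0.0, 2.0 * np.pi, 200)
base = np.sin(t)
shifted = np.sin(t - np.pi / 2.0)  # same shape, a quarter period out of phase
flat = np.zeros_like(t)            # no oscillation at all

d_shifted = euclidean_lockstep(base, shifted)
d_flat = euclidean_lockstep(base, flat)
print(d_shifted > d_flat)  # the shifted copy looks "farther" than a flat line
```

Under the lock-step measure the shifted copy ends up farther from the original than a signal with no structure at all, which is the kind of unintuitive outcome that motivates warping the time axis.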
Dynamic time warping is a common approach to measuring the dissimilarity between two time series, even if their lengths do not match. DTW finds an optimal alignment between two time-dependent sequences under specific constraints. Essentially, the sequences are warped in a nonlinear fashion to match each other. Given two $p$-dimensional time series $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_m)$, where $X$ and $Y$ have dimensions $n \times p$ and $m \times p$ respectively, the two sequences can be arranged as an $n \times m$ matrix, like the sides of a grid (Fig. 3), in which the distance between every possible combination of time instances $x_i$ and $y_j$ is stored. Both sequences start at the bottom left of the grid. For multidimensional DTW, we use the well-known Euclidean distance to measure the distance between two $p$-dimensional time instances:
$$\delta(x_i, y_j) = \sqrt{\sum_{l=1}^{p} \left(x_{i,l} - y_{j,l}\right)^2}$$
To find the best match between two sequences, a path through the grid that minimizes the overall distance between them is needed. In principle, computing the overall distance requires considering all possible routes through the grid: the overall distance is the minimum, over all paths, of the sum of the distances between the individual elements on the path divided by the sum of the weighting function. For long sequences, the number of possible paths through the grid is very large. Several constraints, such as monotonicity, continuity, boundary, slope and warping window constraints, limit the moves that can be made from any point in the path. Among these, the warping window, which can be defined as the subset of the matrix that the path is allowed to visit, must be provided as an input parameter to the model.
The power of the DTW algorithm is that, rather than exploring every conceivable path through the grid, it keeps track of the cost of the best path to each cell. Thus, the DTW distance can be formulated as a dynamic programming problem. In this formulation, each step of the warp path either advances by one unit along one axis while staying in place on the other, or advances by one unit along both. The cumulative distance $\gamma(i, j)$ can therefore be formulated as the recurrence:
$$\gamma(i, j) = \delta(x_i, y_j) + \min\{\gamma(i-1, j-1),\ \gamma(i-1, j),\ \gamma(i, j-1)\}$$
where $\delta(x_i, y_j)$ is the Euclidean distance between the time instances $x_i$ and $y_j$.
DDTW is a modification of DTW that considers higher-level features of a sequence's shape instead of the raw Y-axis values of its data points. In some applications, when a feature such as a peak or valley in one sequence is slightly higher or lower than the corresponding feature in another sequence, DTW may fail to discover the alignment (Fig. 2). Thus DTW may fail to find obvious natural alignments between the time series of two instances of the same sequence. To address this issue, the Derivative Dynamic Time Warping (DDTW) algorithm has been proposed. The proposed framework adapts DDTW by taking the first derivative of the time series for the different robotic surgery tasks. For simplicity and generality, the following estimate of the derivative at each point $x_i$ of a time series is used:
$$D_X[i] = \frac{(x_i - x_{i-1}) + (x_{i+1} - x_{i-1})/2}{2}, \qquad 1 < i < n$$
This estimate is the average of the slope of the line through the point $x_i$ and its left neighbor $x_{i-1}$, and the slope of the line through the left neighbor $x_{i-1}$ and the right neighbor $x_{i+1}$. As in DTW, an $n \times m$ matrix is constructed that contains the distance between $x_i$ and $y_j$, computed as the square of the difference between $D_X[i]$ and $D_Y[j]$, the estimated derivatives of $x_i$ and $y_j$.
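The dynamic programming recurrence and the derivative estimate described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the warping window is assumed to be a symmetric band around the diagonal, endpoint derivatives are assumed to copy their neighbors' estimates, and the normalization by the weighting function is omitted.

```python
import numpy as np

def dtw_distance(x, y, window=None, squared=False):
    """DTW distance between series of shape (n, p) and (m, p).

    window: assumed half-width of a symmetric band around the diagonal.
    squared: use the squared difference as local cost (as in DDTW).
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, m = len(x), len(y)
    # The band must be at least |n - m| wide, otherwise no path reaches (n, m).
    w = max(n, m) if window is None else max(int(window), abs(n - m))
    acc = np.full((n + 1, m + 1), np.inf)  # cumulative distance table
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            diff = x[i - 1] - y[j - 1]
            sq = float(diff @ diff)
            cost = sq if squared else float(np.sqrt(sq))
            # Best predecessor: diagonal step, or a step along one axis.
            acc[i, j] = cost + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    return float(acc[n, m])

def derivative_estimate(x):
    """Average of the slope to the left neighbor and the slope between
    the left and right neighbors, applied per dimension."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    if len(x) < 3:
        raise ValueError("need at least three samples")
    d = np.empty_like(x)
    d[1:-1] = ((x[1:-1] - x[:-2]) + (x[2:] - x[:-2]) / 2.0) / 2.0
    d[0], d[-1] = d[1], d[-2]  # assumption: endpoints reuse neighbor estimates
    return d

def ddtw_distance(x, y, window=None):
    """DTW applied to the estimated derivatives, with squared local cost."""
    return dtw_distance(derivative_estimate(x), derivative_estimate(y),
                        window=window, squared=True)
```

In this sketch two sequences that differ only by a time shift, such as `[0, 0, 1, 2, 1, 0]` and `[0, 1, 2, 1, 0, 0]`, have a DTW distance of zero, while two sequences that differ only by a constant vertical offset have a nonzero DTW distance but a zero DDTW distance.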
III-C Weighted $k$-Nearest Neighbor Classification
The $k$-Nearest Neighbors algorithm ($k$NN) is a non-parametric, instance-based method that classifies a new observation based on the majority label of its $k$ nearest neighbors in the training set. The most significant difference between instance-based classifiers and more sophisticated classification methods is that they do not require knowledge of the underlying patterns in the data. Intuitively, observations that are close together under some appropriate metric tend to have the same class label. The simplicity, effectiveness, intuitiveness and accuracy of $k$NN have led to its use in many areas. A refinement of this algorithm is distance-weighted $k$NN, in which the evidence of a neighbor close to an unclassified observation is weighted more heavily than that of neighbors at a greater distance from the query [30, 31]. Let $z_1, \ldots, z_k$ denote the $k$ nearest neighbors of a query $q$, ordered by increasing distance, and let $d_i$ be the distance between the $i$th nearest neighbor and $q$. Then the weight attributed to the $i$th nearest neighbor can be defined, following the distance-weighted rule of [30], as
$$w_i = \begin{cases} \dfrac{d_k - d_i}{d_k - d_1}, & d_k \neq d_1 \\ 1, & d_k = d_1 \end{cases}$$
Thus, the classification result of the query can be made as
$$\hat{\ell}(q) = \arg\max_{c} \sum_{i=1}^{k} w_i \, \mathbb{1}(\ell_i = c)$$
where $\ell_i$ is the class label of the $i$th nearest neighbor and $\mathbb{1}(\cdot)$ is the indicator function.
Under this weighting, a neighbor with a smaller distance receives more weight than one with a greater distance. The balance between simplicity on the one hand and accuracy on the other led us to choose this method for time series classification of robotic surgery tasks. The only parameter that needs to be provided is $k$. In general, a small value of $k$ means that noise in the data has a higher influence on the result, whereas a large value lets samples of other classes enter the neighborhood of the test instance, resulting in poor classification and high computational expense. To find the value of $k$ that maximizes classification performance, we train the model by examining the accuracy as a function of $k$.
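A distance-weighted vote of this kind can be sketched in a few lines. The sketch below assumes the weighting rule of Dudani [30], in which the nearest neighbor receives weight 1 and the $k$th neighbor weight 0; the function name and the toy distances are hypothetical, and in the framework the distances would come from DTW or DDTW.

```python
import numpy as np

def weighted_knn_predict(distances, labels, k):
    """Classify one query from its distances to all training instances.

    distances[i] is the distance from the query to training instance i,
    labels[i] is that instance's class. Weights follow Dudani's rule
    (an assumption about the exact weighting used).
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(distances, kind="stable")[:k]  # k nearest neighbors
    d = distances[order]
    d1, dk = d[0], d[-1]
    if dk == d1:
        weights = np.ones(len(d))          # all neighbors equally distant
    else:
        weights = (dk - d) / (dk - d1)     # 1 for nearest, 0 for k-th
    votes = {}
    for lab, w in zip(labels[order], weights):
        votes[lab] = votes.get(lab, 0.0) + w
    return max(votes, key=votes.get)

# Toy example: two very close neighbors outvote three distant ones.
toy_d = [0.1, 0.2, 0.9, 1.0, 1.1]
toy_l = ["suturing", "suturing", "knot tying", "knot tying", "knot tying"]
print(weighted_knn_predict(toy_d, toy_l, k=5))
```

In the toy example a plain majority vote over five neighbors would pick the class with three distant members, whereas the weighted vote favors the two close neighbors.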
IV Experimental Setup
In this section, we describe the dataset used to evaluate the proposed time series classification framework for robotic-assisted surgical tasks, along with details of the implementation and the performance evaluation.
IV-A Dataset Description
For each of the three tasks, we analyze kinematic data captured at 30 Hz using the API of the da Vinci. For each manipulator, the data include 19 kinematic variables covering Cartesian position, rotation matrix, linear velocities, angular velocities and a gripper angle. The left and right MTMs and the left and right PSMs together make up the 76-dimensional dataset. We build our model using 3D Cartesian position ($x$, $y$, $z$) data from both the right and left PSMs. JIGSAWS includes data from eight right-handed surgeons, each of whom repeated each surgical task five times (i.e., five trials).
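Selecting the tool-tip positions from the 76-dimensional kinematic record might look like the sketch below. The column layout is an assumption made for illustration only (four 19-variable blocks ordered left MTM, right MTM, left PSM, right PSM, each beginning with $x$, $y$, $z$) and should be checked against the dataset's own documentation; the per-dimension z-normalization is likewise an assumed choice, since the normalization step is not specified in detail.

```python
import numpy as np

# Hypothetical layout: four 19-variable blocks, each starting with x, y, z.
BLOCK = 19
LEFT_PSM_START = 2 * BLOCK    # assumed 0-based offset of the left PSM block
RIGHT_PSM_START = 3 * BLOCK   # assumed 0-based offset of the right PSM block

def psm_positions(kinematics):
    """Return a (T, 6) array: left PSM x, y, z followed by right PSM x, y, z."""
    kin = np.asarray(kinematics, dtype=float)
    if kin.ndim != 2 or kin.shape[1] != 4 * BLOCK:
        raise ValueError("expected an array of shape (T, 76)")
    left = kin[:, LEFT_PSM_START:LEFT_PSM_START + 3]
    right = kin[:, RIGHT_PSM_START:RIGHT_PSM_START + 3]
    return np.hstack([left, right])

def z_normalize(series):
    """Zero-mean, unit-variance scaling per dimension (assumed normalization)."""
    series = np.asarray(series, dtype=float)
    std = series.std(axis=0)
    std[std == 0] = 1.0  # leave constant dimensions unscaled
    return (series - series.mean(axis=0)) / std
```

Each trial then becomes a six-dimensional time series of variable length, which is the form the similarity measures operate on.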
IV-B Implementation Details
Separating data into training and testing sets is a vital step of any model evaluation. For classification methods, the training set is used to discover initial patterns in the data, while the testing set helps us evaluate whether or not the recognized patterns hold. One popular method in this regard is stratified $k$-fold cross validation, which keeps an equal proportion of classes in each fold to reduce the bias of training and test data. In each run, $k-1$ out of $k$ folds are used for training and the remaining fold is used for testing. We chose the widely accepted 10-fold cross validation method and, for the sake of comparison, also used Leave-One-Out (LOO) cross validation, which is the special case of $k$-fold cross validation with $k = N$, where $N$ is the total number of data points. In each LOO fold, all but one observation are used for training and the left-out observation is tested. One hundred replications were conducted for each method to obtain more robust results, and the average and standard deviation are reported in the results section. It should also be noted that, in preliminary analysis, the choice of warping window size did not affect the results significantly. Hence, we set the window size to 100 for all analyses, which resulted in minimal parameter tuning for the DTW method.
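The evaluation protocol can be sketched with a precomputed pairwise distance matrix. The code below is a simplified illustration, not the authors' implementation: the function names are hypothetical, the fold assignment is one straightforward way to keep class proportions roughly equal, and `classify` stands for any classifier that takes the distances from a query to the training instances together with their labels.

```python
import numpy as np

def stratified_folds(labels, n_folds, seed=0):
    """Test-index arrays with each class spread evenly over the folds."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        for pos, i in enumerate(idx):
            # Deal the shuffled members of this class round-robin over folds.
            folds[(pos + offset) % n_folds].append(int(i))
        offset += len(idx)
    return [np.array(sorted(f), dtype=int) for f in folds]

def cross_validate(dist, labels, n_folds, classify, seed=0):
    """Accuracy under stratified cross validation.

    dist: precomputed (N, N) matrix of pairwise distances.
    classify(train_distances, train_labels) -> predicted label.
    Setting n_folds = N gives leave-one-out.
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    all_idx = np.arange(len(labels))
    correct = 0
    for test_idx in stratified_folds(labels, n_folds, seed):
        train_idx = np.setdiff1d(all_idx, test_idx)
        for q in test_idx:
            pred = classify(dist[q, train_idx], labels[train_idx])
            correct += int(pred == labels[q])
    return correct / len(labels)
```

Repeating `cross_validate` with different `seed` values and averaging the accuracies corresponds to the replications described above; for leave-one-out the split is deterministic, so the seed has no effect on the result.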
IV-C Performance Evaluation
The first step in performance evaluation is to tabulate the results of all classifications in a confusion matrix (Table I). For a given class, true positives (TPs) are the instances of that class that are correctly assigned to it, and true negatives (TNs) are the instances of other classes that are correctly not assigned to it. An instance that is incorrectly assigned to the class is a false positive (FP), and an instance of the class that is not assigned to it is a false negative (FN).
Table I: Confusion matrix over the three classes (Task 1, Task 2, Task 3).
Based on the values in the confusion matrix, different classification performance measures are widely used, such as accuracy, sensitivity and specificity. Accuracy measures the fraction of correctly classified data. Sensitivity measures the proportion of positive instances that are classified as positive, and specificity measures the proportion of negative instances that are classified as negative. In this paper we evaluate our classification framework for three classes of tasks. One important factor to consider is the number of data points in each class. Since we do not have an equal number of cases of each task, we modified the measures for multi-class classification by adding a weight $w_i = n_i / N$, where $n_i$ is the number of instances in class $i$ and $N$ is the total number of instances in the dataset:
$$\text{Accuracy} = \sum_{i=1}^{C} w_i \, \frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i}$$
$$\text{Sensitivity} = \sum_{i=1}^{C} w_i \, \frac{TP_i}{TP_i + FN_i}$$
$$\text{Specificity} = \sum_{i=1}^{C} w_i \, \frac{TN_i}{TN_i + FP_i}$$
where, in the above equations, $C$ is the number of classes.
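As an illustration of class-frequency-weighted measures of this kind, the sketch below computes them from a confusion matrix. It assumes that rows hold the true class and columns the predicted class, that every class has at least one instance, and that each per-class measure is weighted by that class's share of the data; the function name is hypothetical.

```python
import numpy as np

def weighted_metrics(confusion):
    """Class-frequency-weighted accuracy, sensitivity and specificity.

    confusion[i, j] = number of class-i instances predicted as class j
    (assumed orientation). Every class must have at least one instance.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                 # correctly assigned to each class
    fn = cm.sum(axis=1) - tp         # class members assigned elsewhere
    fp = cm.sum(axis=0) - tp         # other classes assigned to this class
    tn = total - tp - fn - fp        # everything else
    w = cm.sum(axis=1) / total       # class share n_i / N
    accuracy = float(np.sum(w * (tp + tn) / total))
    sensitivity = float(np.sum(w * tp / (tp + fn)))
    specificity = float(np.sum(w * tn / (tn + fp)))
    return accuracy, sensitivity, specificity
```

For a two-class matrix `[[2, 0], [1, 1]]` each class has weight 0.5; the per-class accuracies are both 3/4, the sensitivities are 1 and 1/2, and the specificities are 1/2 and 1, so all three weighted measures equal 0.75.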
V Results and Discussion
In this section we provide the experimental results of applying the proposed classification framework to three robotic surgery tasks. Two similarity methods, DTW and DDTW, were used to measure the pairwise distance between tool-tip paths during surgical tasks. Before that, we need to tune our classifier to find the best value of $k$ for the $k$-nearest neighbors.
Figure 5 shows the accuracy as a function of $k$ for the different similarity measures and validation techniques. As mentioned earlier, 10-fold and LOO cross validation are employed in this study. We observe that $k$NN classification using the DTW similarity measure achieves its best performance when $k=6$ for both 10-fold and LOO cross validation. DDTW, however, achieves its best performance when $k=3$. It can also be observed that DDTW is more sensitive to the value of $k$ than DTW. This may be due to the smoothing properties of the derivative, which may mask unique features in the data that would be required to distinguish a task among a larger number of potential classes.
We use the accuracy as a function of $k$ to identify the best classification scheme on the robotic surgery task dataset. The accuracy of the best scheme for three scenarios (using only right hand data, only left hand data, or both) with 10-fold and LOO cross validation is listed in Table II. For two-handed Cartesian data, DTW-$k$NN achieved a top accuracy of 99.4%, while for DDTW-$k$NN the highest accuracy was 93.6%.
As Table II shows, DTW consistently outperforms DDTW. This implies that the DTW method is capable of capturing specific patterns in the time series of the surgical tool tip path. Thus, despite the promising results of DDTW in other domains, it might not give a higher accuracy than DTW for robotic surgery data. This leads us to conclude that the local differences in the position of the surgical tool tip over time are very important for each task. Knowing that DDTW is designed to be less sensitive than DTW to sudden peaks and valleys, we can conclude that these peaks and valleys are a meaningful feature of robotic surgery tasks. Removing those features results in the loss of some information required for proper classification.
It is worth noting that the $k$NN classification method is sensitive to training set size. The size of the training set increases with the number of folds in $k$-fold cross validation. Consequently, we would expect $k$NN to perform better with LOO than with 10-fold cross validation, in terms of both higher accuracy and lower standard deviation (Table II).
Figure 6 compares DTW and DDTW for each surgical task using data from both tool tips. It clearly shows that DTW gives the best performance for all tasks. All suturing and needle passing instances are correctly classified, while only one knot tying instance is misclassified as needle passing. Also, knot tying and suturing have the best specificity, which means that fewer instances were misclassified as suturing or knot tying. One can conclude that each of these tasks has unique features that make it recognizable across surgeons with different levels of expertise.
In this study we pursued the open question of the classifiability of fundamental surgical tasks in robotic-assisted minimally invasive surgery (RMIS). We proposed a three-step classification framework for RMIS task recognition. Our method analyzes motion trajectory data obtained from the API of a da Vinci robotic surgery device. We developed a distance-weighted $k$-nearest neighbor classification approach that uses similarity measures obtained from DTW and DDTW for each task. The performance of the proposed framework in the experiments is encouraging, with 99.4% accuracy. This result establishes the feasibility of applying time series classification methods to RMIS tool tip position data to recognize the three fundamental tasks during robotic minimally invasive surgery (i.e., suturing, knot tying and needle passing). A key advantage of our approach is its simplicity: it uses only the 3D Cartesian movement paths of the right and left hand tool tips. Despite the high accuracy achieved in this study, DTW has a time complexity that is polynomial in the number of samples in the data and the length of the time series. Thus, the proposed method might not be an efficient option when quick task classification is desired. Future work should therefore investigate more computationally efficient methods for measuring similarity between motion paths.
Furthermore, reliable classification is possible because the time series features of these three tasks are distinguishable from each other. This approach can be applied in a straightforward manner to the development of an online gesture recognition system for robotic-assisted surgery. It can also facilitate robotic surgical skill assessment and training curricula. Perhaps most excitingly, this framework can lay the groundwork for the development of semi-autonomous robot behaviors, such as automatic camera control during robotic-assisted surgery based on detecting the task being performed. However, a prior step is to test the performance of our model in a real surgical environment in the presence of other tasks or possible noise. Thus, there may be utility in extending our work by adding noise or other tasks (besides those in the training set) to the data in order to build a more robust task recognition method.
[1] F. Lalys and P. Jannin, “Surgical process modelling: a review,” International Journal of Computer Assisted Radiology and Surgery, vol. 9, no. 3, pp. 495–511, 2014.
[2] M. J. Fard, S. Ameri, R. B. Chinnam, and R. D. Ellis, “Soft boundary approach for unsupervised gesture segmentation in robotic-assisted surgery,” IEEE Robotics and Automation Letters, vol. 2, no. 1, pp. 171–178, Jan. 2017.
[3] A. Pandya, L. A. Reisner, B. King, N. Lucas, A. Composto, M. Klein, and R. D. Ellis, “A review of camera viewpoint automation in robotic and laparoscopic surgery,” Robotics, vol. 3, no. 3, pp. 310–329, 2014.
[4] M. J. Fard, S. Ameri, and R. D. Ellis, “Toward personalized training and skill assessment in robotic minimally invasive surgery,” in Proceedings of the World Congress on Engineering and Computer Science, vol. 2, 2016.
[5] Y. Gao, S. S. Vedula, C. E. Reiley, N. Ahmidi, B. Varadarajan, H. C. Lin, L. Tao, L. Zappella, B. Béjar, D. D. Yuh et al., “JHU-ISI gesture and skill assessment working set (JIGSAWS): A surgical activity dataset for human motion modeling,” in Modeling and Monitoring of Computer Assisted Interventions (M2CAI), MICCAI Workshop, 2014.
[6] G. Guthart and J. K. Salisbury Jr, “The Intuitive™ telesurgery system: Overview and application,” in ICRA, 2000, pp. 618–621.
[7] M. J. Fard, A. K. Pandya, R. B. Chinnam, M. D. Klein, and R. D. Ellis, “Distance-based time series classification approach for task recognition with application in surgical robot autonomy,” The International Journal of Medical Robotics and Computer Assisted Surgery, pp. n/a–n/a, 2016, rCS-16-0026.R2. [Online]. Available: http://dx.doi.org/10.1002/rcs.1766
[8] M. Jahanbani Fard, “Computational modeling approaches for task analysis in robotic-assisted surgery,” 2016.
[9] C. E. Reiley, H. C. Lin, D. D. Yuh, and G. D. Hager, “Review of methods for objective surgical skill evaluation,” Surgical Endoscopy, vol. 25, no. 2, pp. 356–366, 2011.
[10] J. Rosen, B. Hannaford, C. G. Richards, and M. N. Sinanan, “Markov modeling of minimally invasive surgery based on tool/tissue interaction and force/torque signatures for evaluating surgical skills,” IEEE Transactions on Biomedical Engineering, vol. 48, no. 5, pp. 579–591, 2001.
[11] J. Rosen, M. Solazzo, B. Hannaford, and M. Sinanan, “Task decomposition of laparoscopic surgery for objective evaluation of surgical residents’ learning curve using hidden Markov model,” Computer Aided Surgery, vol. 7, no. 1, pp. 49–61, 2002.
[12] T. Blum, N. Padoy, H. Feussner, and N. Navab, “Modeling and online recognition of surgical phases using Hidden Markov Models,” Medical Image Computing and Computer-Assisted Intervention: MICCAI, vol. 11, no. Pt 2, pp. 627–635, Jan. 2008.
[13] F. Lalys, L. Riffaud, D. Bouget, and P. Jannin, “A framework for the recognition of high-level surgical tasks from video images for cataract surgeries,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 4, pp. 966–976, Apr. 2012.
[14] D. Sanchez, M. Tentori, and J. Favela, “Activity recognition for the smart hospital,” IEEE Intelligent Systems, vol. 23, no. 2, pp. 50–57, 2008.
[15] M. J. Fard, S. Ameri, S. R. Hejazi, and A. Z. Hamadani, “One-unit repairable systems with active and standby redundancy and fuzzy parameters: Parametric programming approach,” International Journal of Quality & Reliability Management, vol. 34, no. 3, pp. 446–458, 2017. [Online]. Available: https://doi.org/10.1108/IJQRM-05-2011-0075
[16] H. C. Lin, I. Shafran, D. Yuh, and G. D. Hager, “Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions,” Computer Aided Surgery, vol. 11, no. 5, pp. 220–230, 2006.
[17] C. E. Reiley, H. C. Lin, B. Varadarajan, B. Vagvolgyi, S. Khudanpur, D. D. Yuh, and G. D. Hager, “Automatic recognition of surgical motions using statistical modeling for capturing variability,” Studies in Health Technology and Informatics, vol. 132, no. 1, pp. 396–401, Jan. 2008.
[18] C. E. Reiley and G. D. Hager, “Task versus subtask surgical skill evaluation of robotic minimally invasive surgery,” in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009. Springer, 2009, pp. 435–442.
[19] N. Padoy, T. Blum, S.-A. Ahmadi, H. Feussner, M.-O. Berger, and N. Navab, “Statistical modeling and recognition of surgical workflow,” Medical Image Analysis, vol. 16, no. 3, pp. 632–641, 2012.
[20] L. Zappella, B. Béjar, G. Hager, and R. Vidal, “Surgical gesture classification from video and kinematic data,” Medical Image Analysis, vol. 17, no. 7, pp. 732–745, Oct. 2013.
[21] D. J. Berndt and J. Clifford, “Finding patterns in time series: a dynamic programming approach,” Advances in Knowledge Discovery and Data Mining, 1996.
[22] T.-c. Fu, “A review on time series data mining,” Engineering Applications of Artificial Intelligence, vol. 24, no. 1, pp. 164–181, 2011.
[23] G. Forestier, F. Lalys, L. Riffaud, B. Trelhu, and P. Jannin, “Classification of surgical processes using dynamic time warping,” Journal of Biomedical Informatics, vol. 45, no. 2, pp. 255–264, 2012.
[24] E. J. Keogh and M. J. Pazzani, “Derivative Dynamic Time Warping,” Proceedings of the 1st SIAM International Conference on Data Mining, pp. 1–11, 2001.
[25] M. J. Fard, P. Wang, S. Chawla, and C. K. Reddy, “A Bayesian perspective on early stage event prediction in longitudinal data,” IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 12, pp. 3126–3139, Dec. 2016.
[26] N. Bhatia et al., “Survey of nearest neighbor techniques,” International Journal of Computer Science and Information Security, vol. 8, no. 2, pp. 302–305, 2010.
[27] M. J. Fard, S. Ameri, R. B. Chinnam, A. K. Pandya, M. D. Klein, and R. D. Ellis, “Machine learning approach for skill evaluation in robotic-assisted surgery,” in Proceedings of the World Congress on Engineering and Computer Science, vol. 1, 2016.
[28] G. M. Fried, L. S. Feldman, M. C. Vassiliou, S. A. Fraser, D. Stanbridge, G. Ghitulescu, and C. G. Andrew, “Proving the value of simulation in laparoscopic surgery,” Annals of Surgery, vol. 240, no. 3, p. 518, 2004.
[29] M. J. Fard, S. Chawla, and C. K. Reddy, “Early-stage event prediction for longitudinal data,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, 2016, pp. 139–151.
[30] S. A. Dudani, “The distance-weighted k-nearest-neighbor rule,” IEEE Transactions on Systems, Man and Cybernetics, no. 4, pp. 325–327, 1976.
[31] M. J. Fard, S. Ameri, and A. Zeinal Hamadani, “Bayesian approach for early stage reliability prediction of evolutionary products,” in Proceedings of the International Conference on Operations Excellence and Service Engineering. Orlando, Florida, USA, 2015, pp. 361–371.
[32] R. Kohavi et al., “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in IJCAI, vol. 14, no. 2, 1995, pp. 1137–1145.
[33] C. A. Ratanamahatana and E. Keogh, “Three myths about dynamic time warping data mining,” in Proceedings of SIAM International Conference on Data Mining (SDM'05). SIAM, 2005, pp. 506–510.
[34] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.
[35] M. J. Fard, S. Ameri, R. D. Ellis, R. Chinnam, A. K. Pandya, B., and M. D. Klein, “Automated robot-assisted surgical skill evaluation: Predictive analytics approach,” The International Journal of Medical Robotics and Computer Assisted Surgery, pp. n/a–n/a, 2017. [Online]. Available: http://dx.doi.org/10.1002/rcs.1850