Dynamic classifier chains for multilabel learning
Abstract
In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multilabel classification. To this end, we propose two classifier chain algorithms that are able to change the label order of the chain without rebuilding the entire model. Such models allow anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and the nearest neighbour approach as base single-label classifiers. To take advantage of the proposed algorithms, we developed a simple heuristic that allows the system to find a relatively good label order. The heuristic sorts labels according to the label-specific classification quality measured during the validation phase and thereby tries to minimise the phenomenon of error propagation in the chain. The experimental results show that the proposed model based on the Naive Bayes classifier combined with the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.
Keywords:
multilabel, classifier chains, naive bayes, dynamic chains, nearest neighbour
1 Introduction
Under the well-known single-label classification framework, an object is assigned to only one class, which provides a full description of the object. However, many real-world datasets contain objects that are assigned to several categories at the same time. All of these categories together constitute a full description of the object, and omitting any one of them induces a loss of information. The classification process involving such data is called multilabel classification Gibaja2014 (). A typical example of a multilabel dataset is a gallery of tagged photos: each photo may be described using tags such as mountains, sea, forest, beach, sunset, etc. Multilabel classification is a relatively new idea that has been explored extensively over the last two decades. As a consequence, it has been employed in a wide range of practical applications including text classification Jiang2012 (), multimedia classification Sanden2011 () and bioinformatics Wu2014 (), to name a few.
Multilabel classification algorithms can be broadly partitioned into two main groups, i.e. dataset transformation algorithms and algorithm adaptation approaches Gibaja2014 ().
Methods belonging to the group of algorithm adaptation approaches provide a generalisation of an existing multiclass algorithm. The generalised algorithm is able to solve the multilabel classification problem directly. Among others, the best-known approaches from this group are the multilabel kNN algorithm Jiang2012 (), the structured SVM approach Diez2014 () and deep-learning-based algorithms Wei2015 ().
In this paper, we investigate only dataset transformation algorithms, which decompose a multilabel problem into a set of single-label classification tasks. During the inference phase, the outputs of the underlying single-label classifiers are combined in order to reconstruct a multilabel prediction.
Let us focus on one of the simplest decomposition methods, namely the binary relevance (BR) approach, which decomposes a multilabel classification task into a set of one-vs-rest binary classification problems AlvaresCherman2010 (). This approach assumes that labels are conditionally independent. Although this assumption does not hold in most real-life recognition problems, the BR framework is one of the most widespread multilabel classification methods Tsoumakas_Katakis_Vlahavas_2008 (). This is due to its excellent scalability and acceptable classification quality Luaces2012 ().
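The BR decomposition described above can be sketched in a few lines of code. This is a minimal illustration, not the paper's implementation; the names `br_train`, `br_predict` and the toy `PriorClassifier` base learner are ours, chosen only to keep the example self-contained.

```python
# Sketch of binary relevance (BR): one independent binary problem per label.
# The base learner below is a trivial prior-probability classifier used
# purely for illustration; any binary learner could be plugged in.
class PriorClassifier:
    def fit(self, X, y):
        # Remember the majority value of the binary target (ties -> 1).
        self.positive = sum(y) * 2 >= len(y)
        return self

    def predict(self, x):
        return 1 if self.positive else 0


def br_train(X, Y, base_factory):
    """Train one binary classifier per label (Y is a list of 0/1 vectors)."""
    n_labels = len(Y[0])
    return [base_factory().fit(X, [row[i] for row in Y])
            for i in range(n_labels)]


def br_predict(models, x):
    """Each label is predicted independently of all the others."""
    return [m.predict(x) for m in models]


X = [[0.1], [0.9], [0.5]]
Y = [[1, 0], [1, 1], [1, 0]]
models = br_train(X, Y, PriorClassifier)
print(br_predict(models, [0.3]))  # label 0 is always relevant here -> [1, 0]
```

Note that no classifier ever sees another label's value, which is exactly the conditional-independence assumption the chain model relaxes.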
To preserve the scalability of BR systems while providing a model of inter-label relations, Read et al. Read2009 (); Read2011 () introduced the Classifier Chain model (CC), which establishes a linked chain of modified one-vs-rest binary classifiers. The modification consists of an extension of the input space of the single-label classifiers along the chain sequence. More precisely, for a given label sequence, the feature space of each classifier along the chain is extended with a set of binary variables corresponding to the labels that precede the given one. During the training phase, the input space of a given classifier is extended using the ground-truth labels extracted from the training set. During the inference step, due to the lack of ground-truth labels, we employ the binary labels predicted by the preceding classifiers. The inference is done in a greedy way that makes the best decision for each of the considered labels separately. Passing information along the chain allows CC to take inter-label relations into account, at the cost of allowing label prediction errors to propagate along the chain Read2011 (). This greedy inference strategy is the major drawback of the CC system: as a consequence, the performance of a chain classifier strongly depends on the chain configuration Senge2013 (). To overcome these effects, the authors suggested generating an ensemble of chain classifiers (ECC). The ensemble consists of classifiers trained using different label sequences Read2009 ().
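The chaining mechanism itself can be sketched as follows. This is a minimal sketch assuming a generic binary base learner; the helper names and the deliberately simple `CopyLastFeature` stub are ours and exist only to make error propagation visible.

```python
# Sketch of the classifier chain (CC): the i-th model sees the original
# features extended with the values of all preceding chain labels
# (ground truth at training time, predictions at inference time).
class CopyLastFeature:
    """Toy binary learner: predicts the last feature value rounded to {0,1}."""
    def fit(self, X, y):
        return self

    def predict(self, x):
        return 1 if x[-1] >= 0.5 else 0


def cc_train(X, Y, order, base_factory):
    models = []
    for pos in range(len(order)):
        # Extend inputs with ground-truth values of preceding chain labels.
        X_ext = [X[j] + [Y[j][order[k]] for k in range(pos)]
                 for j in range(len(X))]
        target = [Y[j][order[pos]] for j in range(len(X))]
        models.append(base_factory().fit(X_ext, target))
    return models


def cc_predict(models, x, order):
    # Greedy inference: each prediction is appended to the feature vector
    # of the next classifier, so early errors can propagate along the chain.
    y_hat = [0] * len(order)
    extended = list(x)
    for pos, model in enumerate(models):
        y_hat[order[pos]] = model.predict(extended)
        extended = extended + [y_hat[order[pos]]]
    return y_hat


X = [[0.1], [0.9]]
Y = [[0, 0], [1, 1]]
order = [0, 1]
models = cc_train(X, Y, order, CopyLastFeature)
print(cc_predict(models, [0.7], order))  # -> [1, 1]
```

With this toy base learner the second label simply copies the first prediction, which makes it easy to see how a mistake at position one would be carried to position two.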
The originally proposed ECC ensemble uses randomly generated label orders. Additionally, each chain classifier is built using a resampled dataset, which introduces additional diversity into the ensemble. This simple yet effective approach improves the classification quality significantly in comparison to a single chain classifier. However, intuition suggests that there is still room for improvement with a more data-driven approach.
Indeed, later research shows that the members of the ensemble may be chosen in a way that provides a further improvement of classification quality. Read et al. proposed a strategy which uses Monte Carlo sampling to explore the label sequence space in order to find a classifier chain that offers the highest classification quality Read2014 (). Another approach was proposed by Liu et al. Liu2010 (), who introduced a method that builds a model of inter-label relations as a directed acyclic graph (DAG). The weights of the graph are calculated using confusion and support for each pair of labels, and the ensemble is then generated using topological sorting of the graph. Chen et al. Chen2016 () proposed a method that clusters the labels. For each cluster of labels, an undirected graph describing inter-label relations is built, a minimum spanning tree of this graph is created, and a breadth-first search then determines the sequences for a cluster-specific ensemble of CC classifiers. A similar approach was proposed by Huang et al. Huang2015 (), who build the clusters using a meta-space that mixes the input space and the label space. The inter-label relations are then modelled using correlation, and the model is expressed as a DAG structure. Finally, a CC classifier is built for each cluster. The chain structure may also be induced using a Bayesian network approach Zhang2014 ().
The chain sequence can also be found using a metaheuristic approach. Goncalves et al. developed a strategy that utilises a genetic algorithm (GA) to find a good chain structure for the entire dataset Goncalves2013 (); Gonalves2015 (). The proposed method follows a wrapper-based approach: each chromosome encodes a different chain order, and to evaluate those label orders each corresponding classifier must be built and evaluated using a validation set. A similar approach was also used by Trajdos and Kurzynski, who proposed a multi-objective genetic algorithm to optimise classification quality and chain diversity simultaneously Trajdos2017 (). Although those methods are rather time-consuming, they provide a significant improvement in terms of classification quality.
Another way of dealing with error propagation is to build a classifier that combines the CC algorithm with a BR-based approach Montaes2014 (). The authors proposed a stacking-based architecture to combine the above-mentioned classifiers. The first layer is a simple BR classifier that predicts each label separately. The attribute set of each classifier in the second layer is extended using all labels except the predicted one. During the training phase, both layers are trained separately. During the prediction phase, on the other hand, classifiers from the second layer mix the outcomes of the BR classifier with the outcomes provided by the preceding chain classifiers. That is, the first classifier of the chain structure has its attribute space extended by the outcomes of the BR classifier. The second one uses the prediction of the first one, and the remaining attributes are taken from the prediction of the BR classifier. Finally, the last classifier along the chain has its attribute space extended using only the labels predicted by the preceding classifiers. Another way of combining the CC classifier with the BR classifier is described in Madjarov2012 ().
The previously cited methods build the ensemble structure during the training procedure. Consequently, throughout this paper, this kind of methods will be called static methods. Dynamic chain classifiers, on the other hand, determine the best label order at the prediction phase daSilva2014 (). The above-mentioned classifier produces a set of randomly generated label sequences and then validates the corresponding chain classifiers. During the validation phase, each point from the validation set is assigned the label order that produces the most accurate output vector for this point. As the experimental research shows, dynamic methods of building a label order may achieve better classification quality daSilva2014 ().
We observed that, during the building of a dynamic chain classifier, multiple chain classifiers must be learned. These classifiers are built using the same training set and differ only in chain order. As a consequence, the computational burden of the algorithm may be reduced if there exists a classifier that is trained once and whose label sequence can be changed without rebuilding the model. To address this issue, we built two models, based on the Naive Bayes Hand2001 () approach and the nearest neighbour approach Cover1967 (), that meet the above-mentioned properties.
Additionally, we proposed a dynamic method of determining the chain order based on classification quality for each label separately.
A part of this paper was previously published in Trajdos2017b (). This paper is an extended version of the previously published work. The main elements that have been changed or extended are:

The literature review has been extended.

We have added the results of the experimental comparison of the BR and CC versions of different base classifiers.

We have proposed a new model of the dynamic chain classifier. That is, we introduce the CC model based on the nearest neighbour approach.

New experimental results have been provided.
The rest of the paper is organised as follows. Section 2 provides a formal description of the multilabel classification problem and describes the developed algorithms. Section 3 contains a description of the conducted experiments. The results are presented and discussed in Section 4. Finally, Section 5 concludes the paper.
2 Proposed Methods
In this section, we introduce a formal notation of multilabel classification problem and provide a description of the proposed method.
2.1 Preliminaries
Under the multilabel (ML) formalism, an object $x \in \mathcal{X}$ is assigned to a set of labels indicated by a binary vector of length $L$: $y = (y_1, y_2, \ldots, y_L) \in \{0,1\}^L$, where $L$ denotes the number of labels.
In this paper, we follow a statistical classification framework. As a consequence, it is assumed that the object $x$ and its set of labels $y$ are realizations of the corresponding random vectors $\mathbf{X}$ and $\mathbf{Y}$, and that the joint probability distribution $P(\mathbf{X}, \mathbf{Y})$ on $\mathcal{X} \times \{0,1\}^L$ is known.
Because the above-mentioned assumption is never met in the real world, in this study we assume that the multilabel classifier $h: \mathcal{X} \to \{0,1\}^L$, which maps the feature space to the set of binary label vectors, is built in a supervised learning procedure using the training set $\mathcal{T}$ containing $N$ pairs of feature vectors $x^{(j)}$ and corresponding class labels $y^{(j)}$:

$\mathcal{T} = \left\{ (x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(N)}, y^{(N)}) \right\}$ (1)
2.2 Naive Bayes Classifier for Dynamic Classifier Chains
In this paper, we consider ML classifiers built according to the chain rule. That is, the classifier is an ensemble of single-label classifiers that constitutes a linked chain built according to a permutation $\pi$ of the label sequence $(1, 2, \ldots, L)$. As mentioned earlier, in this paper we follow the statistical classification framework. Consequently, each single-label classifier along the chain makes its decision according to the following rule:

$\hat{y}_{\pi(i)} = \operatorname{argmax}_{v \in \{0,1\}} P\left(Y_{\pi(i)} = v \mid C_{\pi,i}\right)$ (2)

where $C_{\pi,i}$ is a random event defined below:

$C_{\pi,1} = \{\mathbf{X} = x\}$ (3)

and for $i > 1$:

$C_{\pi,i} = \left\{\mathbf{X} = x,\; Y_{\pi(1)} = \hat{y}_{\pi(1)},\; \ldots,\; Y_{\pi(i-1)} = \hat{y}_{\pi(i-1)}\right\}$ (4)
Conditioning on the random event $C_{\pi,i}$ instead of $\{\mathbf{X} = x\}$ alone allows the chain to take inter-label dependencies into account. The above-mentioned classification rule is a greedy rule that calculates the probability (2) using the predictions of the preceding classifiers. The optimal prediction under the chaining rule may be found using the PCC approach cheng2010bayes (); however, this approach requires a number of calculations that grows exponentially with the number of labels.
The probability defined in (2) is then computed using the Bayes rule:

$P\left(Y_{\pi(i)} = v \mid C_{\pi,i}\right) = \dfrac{P\left(C_{\pi,i} \mid Y_{\pi(i)} = v\right) P\left(Y_{\pi(i)} = v\right)}{P\left(C_{\pi,i}\right)}$ (5)

The term $P(C_{\pi,i})$ does not depend on the value $v$. Consequently, the decision rule (2) is rewritten:

$\hat{y}_{\pi(i)} = \operatorname{argmax}_{v \in \{0,1\}} P\left(C_{\pi,i} \mid Y_{\pi(i)} = v\right) P\left(Y_{\pi(i)} = v\right)$ (6)
Now, to improve the readability, we simplify the notation, with $x = (x_1, \ldots, x_d)$ denoting the components of the feature vector:

$A_{\pi,i}(v) = \prod_{k=1}^{d} P\left(X_k = x_k \mid Y_{\pi(i)} = v\right), \qquad B_{\pi,i}(v) = \prod_{j=1}^{i-1} P\left(Y_{\pi(j)} = \hat{y}_{\pi(j)} \mid Y_{\pi(i)} = v\right)$ (7)

Then, following the Naive Bayes rule, we assume that all random variables that constitute $C_{\pi,i}$ are conditionally independent given $Y_{\pi(i)}$. Consequently, $P(C_{\pi,i} \mid Y_{\pi(i)} = v)$ is defined using the following formula:

$P\left(C_{\pi,i} \mid Y_{\pi(i)} = v\right) = A_{\pi,i}(v)\, B_{\pi,i}(v)$ (8)
Now, it is easy to see that the label-conditional product, contrary to the feature-conditional product, depends on the chain structure. Furthermore, all probability distributions used in the above-mentioned terms can be estimated during the training phase, when the chain structure is still unknown.
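The order-agnostic training idea can be sketched as follows. This is a minimal sketch under our own simplifying assumptions (binary features, Laplace smoothing, plain count-based estimators); the function names are ours. The key property matches the text: every estimator, including all ordered label-pair conditionals, is computed once, so any permutation can be served at prediction time without retraining.

```python
import math

def train_pairwise_nb(X, Y):
    """Count-based estimators for priors P(y_i), binary-feature likelihoods
    P(x_k=1 | y_i=v) and ALL ordered label pairs P(y_j=u | y_i=v).
    None of them depends on the chain order."""
    n, L, d = len(X), len(Y[0]), len(X[0])
    cnt_y = [[0, 0] for _ in range(L)]                 # count(y_i = v)
    cnt_xy = [[[0] * d, [0] * d] for _ in range(L)]    # count(x_k=1, y_i=v)
    cnt_yy = [[[[0, 0], [0, 0]] for _ in range(L)]
              for _ in range(L)]                       # count(y_i=v, y_j=u)
    for x, y in zip(X, Y):
        for i in range(L):
            cnt_y[i][y[i]] += 1
            for k in range(d):
                cnt_xy[i][y[i]][k] += x[k]
            for j in range(L):
                if j != i:
                    cnt_yy[i][j][y[i]][y[j]] += 1
    return n, L, d, cnt_y, cnt_xy, cnt_yy


def chain_predict(model, x, order):
    """Greedy chain inference for an ARBITRARY permutation `order`,
    evaluated with the estimators trained once above (Laplace-smoothed)."""
    n, L, d, cnt_y, cnt_xy, cnt_yy = model
    y_hat = [0] * L
    for pos, i in enumerate(order):
        score = []
        for v in (0, 1):
            s = math.log((cnt_y[i][v] + 1) / (n + 2))          # prior
            for k in range(d):                                  # features
                p1 = (cnt_xy[i][v][k] + 1) / (cnt_y[i][v] + 2)
                s += math.log(p1 if x[k] == 1 else 1 - p1)
            for j in order[:pos]:                               # preceding labels
                pj = (cnt_yy[i][j][v][y_hat[j]] + 1) / (cnt_y[i][v] + 2)
                s += math.log(pj)
            score.append(s)
        y_hat[i] = 1 if score[1] > score[0] else 0
    return y_hat


X = [[1], [1], [0], [0]]
Y = [[1, 1], [1, 1], [0, 0], [0, 0]]
model = train_pairwise_nb(X, Y)
print(chain_predict(model, [1], [0, 1]))  # -> [1, 1]
print(chain_predict(model, [0], [1, 0]))  # -> [0, 0]
```

Note that the two calls above use different label orders against the very same trained model, which is the property the proposed dynamic chain exploits.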
The training and inference phases are described in detail using pseudocode shown in Algorithms 1 and 2.
Input data:  
 training set;  
BEGIN 

Split into and so that:  
and ,  
;  
Using build estimators of  
the following distributions:  
END 
Input data:  
 input instance;  
 validation set;  
BEGIN 

#Query the BR models  
FOR :  
;  
;  
END FOR;  
Determine label permutation using and ;  
SET ;  
DO:  
FOR :  
END FOR;  
WHILE();  
RETURN ;  
END 
2.2.1 Computational complexity
In this section, we assess the increase in computational complexity that the proposed algorithm causes.
First of all, it is easy to see that, for both the original and the proposed algorithm, the number of estimators of the feature-conditional distributions that must be built is proportional to $dL$, where $d$ denotes the dimensionality of the input space.
The number of estimators of the label priors that must be built is also the same for both classifiers: L.
The key difference lies in the number of estimators of the label-conditional distributions that must be built. The original CC classifier, which fixes the chain order, builds $L(L-1)/2$ such estimators. Our method, on the other hand, builds all $L(L-1)$ ordered-pair estimators, which allows it to follow any label permutation.
At the inference phase, the only additional calculations are performed to determine the permutation of labels. Since the validation set is involved in this process, the number of calculations is proportional to the size of the validation set.
2.3 KNN Classifier for Dynamic Classifier Chains
In this section, we define a dynamic classifier chain algorithm based on the nearest neighbours approach. The nearest neighbour algorithm is an instance-based classifier that does not build an explicit model of the mapping between the feature space and the label space. Instead, the classifier performs classification in a lazy manner. That is, it finds the R nearest instances and then predicts the class using the labels of those neighbouring instances.
Let us begin with the definition of a distance function that depends on the label permutation and the position along the chain. The distance function is defined in the extended feature space that combines the input space and the label space. For the first position, the distance is a simple Euclidean distance in the input space:

$d_{\pi,1}\left(x, x^{(j)}\right) = \sqrt{\sum_{k=1}^{d} \left(x_k - x^{(j)}_k\right)^2}$ (9)

For the other positions, the distance function uses both the input space and the label space:

$d_{\pi,i}\left(x, x^{(j)}\right) = \sqrt{\sum_{k=1}^{d} \left(x_k - x^{(j)}_k\right)^2 + \sum_{m=1}^{i-1} \left(\hat{y}_{\pi(m)} - y^{(j)}_{\pi(m)}\right)^2}$ (10)
The distance function defined in this way allows us to make predictions using the chaining rule, since the distance is modified in order to fit the chain structure. During the inference phase, the extended distance is calculated using the labels predicted at the preceding steps of the procedure. The above-defined distance function is used to build the neighbourhood of a given point in the extended feature space. The neighbourhood contains the R closest instances selected from the training set according to the distance function.
Given the neighbourhood $\mathcal{N}_{\pi,i}(x)$, the probability is estimated as follows:

$\hat{P}\left(Y_{\pi(i)} = 1 \mid C_{\pi,i}\right) = \dfrac{1}{R} \sum_{x^{(j)} \in \mathcal{N}_{\pi,i}(x)} y^{(j)}_{\pi(i)}$ (11)
The label is also predicted using rule (2).
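The kNN chain described above can be sketched compactly. This is a minimal sketch of the idea only; the helper names are ours, and the tie-breaking and voting details are our simplifying choices rather than the paper's exact procedure.

```python
import math

def chain_distance(x, xj, yj, predicted, preceding):
    """Euclidean distance in the feature space extended with the labels
    already predicted along the chain; `preceding` lists the label
    indices predicted so far."""
    s = sum((a - b) ** 2 for a, b in zip(x, xj))
    s += sum((predicted[i] - yj[i]) ** 2 for i in preceding)
    return math.sqrt(s)


def knn_chain_predict(X, Y, x, order, R):
    """Greedy chain inference: for each label in `order`, rebuild the
    R-neighbourhood with the extended distance and take a neighbour vote."""
    L = len(Y[0])
    y_hat = [0] * L
    preceding = []
    for i in order:
        dists = [(chain_distance(x, X[j], Y[j], y_hat, preceding), j)
                 for j in range(len(X))]
        dists.sort()
        votes = sum(Y[j][i] for _, j in dists[:R])
        # Relevant if at least half of the neighbours carry the label.
        y_hat[i] = 1 if 2 * votes >= R else 0
        preceding.append(i)
    return y_hat


X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(knn_chain_predict(X, Y, [1, 1], [0, 1], 1))  # -> [1, 1]
```

Because nothing is trained up front, changing `order` between queries costs nothing, which is the property the dynamic chain needs.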
Input data:  
 training set;  
BEGIN 

Split into and so that:  
and ,  
;  
Save the training set  
END 
Input data:  
 input instance;  
 validation set;  
 Training set;  
BEGIN 

SET: ;  
Determine label permutation using and ;  
For  
BEGIN  
END  
RETURN ;  
END 
The training procedure is described in Algorithm 3. The procedure is very simple: it splits the original training set into the actual training set and the validation set .
The inference procedure begins with assigning undefined values to the prediction vector . Then the predictions are updated sequentially according to the permutation . The procedure is shown in Algorithm 4.
2.4 Dynamic Chain order
In this subsection, we define a local measure of classification quality. To do so, we employed a modified version of the well-known $F_1$ measure.
First of all, we define a fuzzy neighbourhood in the input space. The neighbourhood of an instance $x$ is defined using the following fuzzy set Zadeh1965 ():

$\mathcal{N}(x) = \left\{ \left(x^{(j)}, y^{(j)}, \mu\left(x, x^{(j)}\right)\right) \right\}$ (12)

where each triplet defines an element of the fuzzy set with the membership coefficient $\mu(x, x^{(j)})$. The membership function is defined using the Gaussian potential function:

$\mu\left(x, x^{(j)}\right) = \exp\left(-\beta\, \delta\left(x, x^{(j)}\right)^2\right)$ (13)

The distance function $\delta$ is the simple Euclidean distance, and the coefficient $\beta$ is tuned during the experiments.
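A minimal sketch of this membership function follows; the names `membership`, `fuzzy_cardinality` and the parameter `beta` (standing in for the tuned coefficient) are ours.

```python
import math

def membership(x, xj, beta=1.0):
    # Gaussian potential function: close points get weight near 1,
    # distant points decay smoothly towards 0.
    dist2 = sum((a - b) ** 2 for a, b in zip(x, xj))
    return math.exp(-beta * dist2)


def fuzzy_cardinality(weights):
    # The cardinality of a fuzzy set is the sum of its membership degrees.
    return sum(weights)


print(membership([0.0, 0.0], [0.0, 0.0]))  # identical point -> 1.0
```

Larger `beta` values shrink the effective neighbourhood, so the local quality estimate becomes more sensitive to the immediate surroundings of the query point.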
Then, for each label $i$, we define the set of validation points that belong to the given label and the set of points that are classified as the given label:

$G_i = \left\{ x^{(j)} \in \mathcal{V} : y^{(j)}_i = 1 \right\}$ (14)

$H_i = \left\{ x^{(j)} \in \mathcal{V} : \bar{h}_i\left(x^{(j)}\right) = 1 \right\}$ (15)

The above-mentioned classifier responses are related to the binary relevance classifier $\bar{h}$, which can be built without knowing the order of the chain. The classifier is defined using the following classification rule:

$\bar{h}_i(x) = \left[ \hat{P}\left(Y_i = 1 \mid \mathbf{X} = x\right) > \tfrac{1}{2} \right]$ (16)

Since the neighbourhood of a given instance is defined as a fuzzy set, the above-mentioned sets are, consistently, also defined as fuzzy sets; however, they are fuzzy singletons. A visualisation of the aforementioned sets is provided in Figure 1.
Using the above-mentioned sets, we define the local true positive, false positive and false negative rates, respectively:

$\mathrm{TP}_i(x) = \left| G_i \cap H_i \cap \mathcal{N}(x) \right|$ (17)

$\mathrm{FP}_i(x) = \left| \overline{G_i} \cap H_i \cap \mathcal{N}(x) \right|$ (18)

$\mathrm{FN}_i(x) = \left| G_i \cap \overline{H_i} \cap \mathcal{N}(x) \right|$ (19)

where $|\cdot|$ denotes the cardinality of a fuzzy set Dhar2013 (). Then, we define the local measure of classification quality:

$q_i(x) = \dfrac{2\,\mathrm{TP}_i(x)}{2\,\mathrm{TP}_i(x) + \mathrm{FP}_i(x) + \mathrm{FN}_i(x)}$ (20)
Finally, the label order $\pi$ is chosen so that the following inequalities are met:

$q_{\pi(1)}(x) \ge q_{\pi(2)}(x) \ge \cdots \ge q_{\pi(L)}(x)$ (21)
That is, labels for which the classification quality is higher precede the other labels in the chain structure. In other words, this simple heuristic aims to counteract error propagation in the chain structure by placing the most accurate models at the beginning of the chain.
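The heuristic above can be sketched in a few lines. This is a minimal sketch, assuming unit-weight membership for simplicity; the names `local_f1` and `dynamic_order` are ours.

```python
def local_f1(y_true, y_pred, weights, label):
    """Fuzzy-weighted F1 for one label: each validation point contributes
    with its membership degree in the neighbourhood of the query point."""
    tp = sum(w for yt, yp, w in zip(y_true, y_pred, weights)
             if yt[label] == 1 and yp[label] == 1)
    fp = sum(w for yt, yp, w in zip(y_true, y_pred, weights)
             if yt[label] == 0 and yp[label] == 1)
    fn = sum(w for yt, yp, w in zip(y_true, y_pred, weights)
             if yt[label] == 1 and yp[label] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def dynamic_order(y_true, y_pred, weights):
    """Sort labels by decreasing local quality, so the most reliable
    models sit at the beginning of the chain."""
    L = len(y_true[0])
    quality = [local_f1(y_true, y_pred, weights, i) for i in range(L)]
    return sorted(range(L), key=lambda i: -quality[i])


# Label 1 is predicted perfectly, label 0 is not -> label 1 goes first.
order = dynamic_order([[1, 1], [0, 1], [1, 0]],
                      [[0, 1], [0, 1], [0, 0]],
                      [1.0, 1.0, 1.0])
print(order)  # -> [1, 0]
```

With instance-specific membership weights instead of the unit weights used here, the returned order becomes local to the query point, which is exactly what makes the chain dynamic.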
2.5 The Ensemble Classifier
Now, let us define an ML classifier ensemble. The ensemble is built using the classifier chain algorithms defined in the previous sections. Each ensemble member is built using a subset of the original dataset whose size is a fixed fraction of the original training set.
The BR transformation may produce imbalanced single-label datasets. To prevent the classifier from learning from a highly imbalanced dataset, we applied the random undersampling technique Garca2012 (). The majority class is undersampled when the imbalance ratio is higher than 20. The goal of undersampling is to keep the imbalance ratio at a level of 20.
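A sketch of this preprocessing step follows; the function name and the deterministic seed are ours, while the threshold of 20 follows the text.

```python
import random

def cap_imbalance(X, y, max_ratio=20, seed=0):
    """Randomly undersample the majority class of one binary task so that
    the imbalance ratio (majority/minority) does not exceed `max_ratio`."""
    pos = [i for i, v in enumerate(y) if v == 1]
    neg = [i for i, v in enumerate(y) if v == 0]
    major, minor = (neg, pos) if len(neg) >= len(pos) else (pos, neg)
    if minor and len(major) > max_ratio * len(minor):
        rng = random.Random(seed)
        major = rng.sample(major, max_ratio * len(minor))
    keep = sorted(major + minor)
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[i] for i in range(102)]
y = [1, 1] + [0] * 100
X2, y2 = cap_imbalance(X, y)
print(len(y2))  # 2 positives kept, 40 negatives sampled -> 42
```

The minority class is always kept intact; only the majority side is thinned, so the labelled information that is hardest to come by is never discarded.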
Research on the application of the Naive Bayes algorithm under the CC framework shows that, when the number of features in the input space is significantly higher than the number of labels, the Naive Bayes classifier may not perform well daSilva2014 (). To prevent the proposed system from being affected by this phenomenon, we applied a feature selection procedure for each single-label task separately. That is, the attributes are selected in order to improve the classification quality for the given label. The feature selection removes only attributes related to the original input space; features related to labels are passed through the chain without selection. We employed a correlation-based selection procedure. In other words, we select attributes that are highly correlated with the predicted label while their inter-correlations are low Hall1999 (). Additionally, if the number of selected features is higher than 300, we select 300 random features from the set of previously selected features.
The final prediction vector of the ensemble is obtained via simple averaging of the response vectors of the base classifiers, followed by a thresholding procedure:

$\hat{y}_i = \left[ \dfrac{1}{K} \sum_{m=1}^{K} \hat{y}^{(m)}_i > \tfrac{1}{2} \right]$ (22)

where $[\cdot]$ is the Iverson bracket and $K$ is the ensemble size.
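A minimal sketch of this combination step, assuming a 0.5 threshold (our reading of the averaging-plus-Iverson-bracket rule); the function name is ours.

```python
def ensemble_predict(member_outputs, threshold=0.5):
    """Average the binary response vectors of the base chain classifiers
    and apply Iverson-bracket thresholding label by label."""
    K = len(member_outputs)
    L = len(member_outputs[0])
    avg = [sum(out[i] for out in member_outputs) / K for i in range(L)]
    return [1 if a > threshold else 0 for a in avg]


votes = [[1, 0, 1], [1, 1, 0], [1, 0, 0]]
print(ensemble_predict(votes))  # majority of three members -> [1, 0, 0]
```

With binary member outputs and a 0.5 threshold this reduces to a per-label majority vote.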
3 Experimental Setup
The experimental study is divided into three main sections. The first one assesses the impact of employing the chaining approach. In this section, we compare binary relevance and classifier chain algorithms built using the following base classifiers:

J48 Classifier (C4.5 algorithm implemented in Weka) Quinlan1993 ().

SVM algorithm with radial based kernel Cortes1995 (); CC01a ().

Naive Bayes Classifier Hand2001 ().

Nearest Neighbour classifier Cover1967 ().
In this section, we compare BR and CC ensembles built using a genetic algorithm tailored to optimise the macroaveraged loss. For each ensemble, the size of the committee was set to . For the algorithm based on the genetic algorithm, the initial size of the committee was set to . Each numeric attribute in the training and validation datasets was also standardised. After the standardisation, the mean value of the attribute is 0 and its standard deviation is 1.
During the experimental study, the parameters of the SVM classifier were tuned using grid search and 3-fold cross-validation. The number of nearest neighbours was also tuned using 3-fold cross-validation, with the number of neighbours chosen from a predefined set of candidate values.
In the two remaining sections, the conducted experimental study provides an empirical evaluation of the classification quality of the proposed methods and compares it to reference methods. Namely, we conducted our experiments using the following algorithms for building a CC ensemble:

The proposed approach (Section 2.4).

Static ensemble generated using a genetic algorithm Trajdos2017 (). The ensemble is tuned to optimise the macro-averaged measure.

ECC ensemble with randomly generated chain orders Read2009 ().

OOCC, the dynamic method proposed by da Silva et al. daSilva2014 (). The ensemble is tuned to optimise the example-based measure. Additionally, the reference method uses a single split into training and validation sets.
The above-mentioned methods of building CC systems were evaluated using the Naive Bayes and the nearest neighbour algorithms as base classifiers. Systems built using different base classifiers are investigated in two separate sections, in which we refer to the investigated algorithms using the numbers given above.
The reference algorithm also uses Naive Bayes/nearest neighbour algorithm with data preprocessing procedures described in Section 2.5.
The extraction of training and test datasets was performed using cross-validation. For each ensemble, the proportion of the training set was fixed as a fraction of the original training set (see Algorithm 1). The committee sizes and the attribute standardisation procedure are the same as described in the previous section.
The membership-function coefficient was tuned during the training procedure using a 3-fold CV approach; the best value among a predefined set of candidates is chosen.
Single-label classifiers were implemented using the WEKA software Hall2009 (). Multilabel classifiers were implemented using the Mulan software mulan_software ().
The experiments were conducted using 30 multilabel benchmark sets. The main characteristics of the datasets are summarised in Table 1. We used datasets from the sources abbreviated as follows: A – Charte2015 (); B – meka (); M – Tsoumakas2011_mulan (); W – Wu2014 (); X – Xu2013 (); Z – Zhou2012 (); T – Tomas2014 (); O – thorsten1998 (). Some of the employed sets needed preprocessing. That is, we used multi-instance multilabel Zhou2012 () sets (sources Z and W), which were transformed into single-instance multilabel datasets according to the suggestion made by Zhou et al. Zhou2012 (). Multi-target regression sets (No 9, 30) were binarised using a simple thresholding strategy: if the response is greater than 0, the resulting label is set to relevant. Two of the used datasets are synthetic (source T) and were generated using the algorithm described in Tomas2014 (). To reduce the computational burden, we used only a subset of the original Tmc2007 and IMDB sets. Additionally, the number of labels in the Stackex datasets was reduced to 15.
The algorithms were compared in terms of 11 different quality criteria coming from three groups Luaces2012 (): instance-based measures (Hamming loss, zero-one loss, $F_1$, False Discovery Rate, False Negative Rate); macro-averaged label-based measures (False Discovery Rate (FDR, 1 − precision), False Negative Rate (FNR, 1 − recall), $F_1$); and micro-averaged versions of the same label-based criteria.
Statistical evaluation of the results was performed using the Wilcoxon signedrank test demsar2006 (); wilcoxon1945 () and the familywise error rates were controlled using the Holm procedure demsar2006 (); holm1979 (). For all statistical tests, the significance level was set to . Additionally, we also applied the Friedman Friedman1940 () test followed by the Nemenyi posthoc procedure demsar2006 ().
No  Name  Src  #Inst  #Attr  #Lab  LC  LD  avIR  

1  Arts1  M  7484  1733  26  1.654  .064  94.74 
2  Azotobacter  W  407  20  13  1.469  .113  2.225 
3  Birds  M  645  260  19  1.014  .053  5.407 
4  Caenorhabditis  W  2512  20  21  2.419  .115  2.347 
5  Drosophila  W  2605  20  22  2.656  .121  1.744 
6  Emotions  M  593  72  6  1.868  .311  1.478 
7  Enron  M  1702  1001  53  3.378  .064  73.95 
8  Flags  X  194  43  7  3.392  .485  2.255 
9  Flare2  M  1066  27  3  0.209  .070  14.15 
10  Genbase  M  662  1186  27  1.252  .046  37.32 
11  Geobacter  W  379  20  11  1.264  .115  2.750 
12  Haloarcula  W  304  20  13  1.602  .123  2.419 
13  Human  X  3106  440  14  1.185  .085  15.29 
14  Image  M  2000  294  5  1.236  .247  1.193 
15  IMDB  M  3042  1001  28  1.987  .071  24.61 
16  LLOG  B  1460  1004  75  1.180  .016  39.27 
17  Medical  M  978  1449  45  1.245  .028  89.50 
18  MimlImg  Z  2000  135  5  1.236  .247  1.193 
19  Ohsumed  O  13929  1002  23  1.663  .072  7.869 
20  Plant  X  978  440  12  1.079  .090  6.690 
21  Pyrococcus  W  425  20  18  2.136  .119  2.421 
22  Saccharomyces  W  3509  20  27  2.275  .084  2.077 
23  Scene  X  2407  294  6  1.074  .179  1.254 
24  SimpleHC  T  3000  30  10  1.900  .190  1.138 
25  SimpleHS  T  3000  30  10  2.307  .231  2.622 
26  SLASHDOT  B  3782  1079  22  1.181  .054  17.69 
27  Stackex_chess  A  1675  585  15  1.137  .076  4.744 
28  Tmc2007500  M  2857  500  22  2.222  .101  17.15 
29  waterquality  M  1060  16  14  5.073  .362  1.767 
30  yeast  M  2417  103  14  4.237  .303  7.197 
4 Results and Discussion
4.1 Assessing the impact of chaining approach
In this section, we assess the consequences of employing chaining approach. That is, we compare binary relevance ensembles with classifier chain ensembles built using the same base classifier. The results are shown in Figure 2 and Table 2. Full results are presented in the appendix in Tables 5, 6 and 7. The compared algorithms are numbered as follows:

BR ensemble built using J48 algorithm.

CC ensemble built using J48 algorithm.

BR ensemble built using SVM algorithm.

CC ensemble built using SVM algorithm.

BR ensemble built using NB algorithm.

CC ensemble built using NB algorithm.

BR ensemble built using KNN algorithm.

CC ensemble built using KNN algorithm.
The analysis of the results clearly shows that there is a noticeable difference between the two groups of measures. That is, the differences between BR-based and CC-based algorithms are greater in terms of example-based criteria, whereas the differences in mean ranks are lower for label-based measures.
For the example-based measures, the average ranks achieved by CC-based algorithms are lower than those of BR-based algorithms. However, only for the algorithms based on the J48 classifier are the differences significant for the example-based FDR, FNR and $F_1$ measures. A similar trend is observed for the zero-one loss; in this case, only the differences for the nearest neighbour classifier are insignificant. This means that CC-based classifiers obtain a higher number of 'perfect match' results.
On the other hand, for the labelbased measures and Hamming loss, almost no significant differences are observed. However, the average ranks suggest that for this group of measures, the classification quality may deteriorate.
The results clearly show that although the label-specific quality measures do not change in a significant way, the prediction of the entire label vector improves. This is an expected result, since the CC-based approach incorporates the inter-label relations. This is a well-known effect that has been reported by authors who have previously compared both approaches Madjarov2012 ().
The results also show that there are almost no significant differences between the J48-, NB- and KNN-based algorithms. In contrast, the SVM algorithm tends to outperform the remaining ones in terms of example-based criteria, Hamming loss and zero-one loss. This means that although the J48 algorithm takes the biggest advantage of employing the chain rule, NB- and KNN-based classifiers are comparable to J48-based ensembles.
Columns correspond to algorithms 1-8. Rnk: average ranks; Wp k: p-values of the pairwise post-hoc tests of algorithm k against algorithms k+1, ..., 8, in that order.

Hamming loss
  Rnk: 4.94 5.03 2.81 3.03 4.72 4.63 5.38 5.47
  Wp1: 1.00 .002 .148 1.00 1.00 1.00 .432
  Wp2: .000 .025 1.00 1.00 1.00 1.00
  Wp3: 1.00 .014 .022 .000 .000
  Wp4: .552 .552 .017 .002
  Wp5: 1.00 1.00 .591
  Wp6: 1.00 1.00
  Wp7: 1.00

Zero-one loss
  Rnk: 5.50 5.00 3.30 2.25 5.17 4.16 5.88 4.75
  Wp1: .032 .001 .000 1.00 .478 1.00 1.00
  Wp2: 1.00 .001 1.00 1.00 .044 1.00
  Wp3: .032 .005 .627 .001 .597
  Wp4: .000 .004 .000 .002
  Wp5: .041 1.00 1.00
  Wp6: .478 1.00
  Wp7: .220

EX FDR
  Rnk: 5.64 4.44 3.13 2.66 5.00 4.66 5.61 4.88
  Wp1: .070 .001 .001 1.00 .650 1.00 1.00
  Wp2: .845 .001 1.00 1.00 .056 1.00
  Wp3: 1.00 .002 .097 .001 .573
  Wp4: .002 .006 .000 .002
  Wp5: 1.00 .964 1.00
  Wp6: .666 1.00
  Wp7: 1.00

EX FNR
  Rnk: 5.47 4.66 4.22 3.53 3.97 3.91 5.19 5.06
  Wp1: .011 1.00 .009 .205 .054 1.00 1.00
  Wp2: 1.00 .223 1.00 1.00 1.00 1.00
  Wp3: .937 1.00 1.00 1.00 1.00
  Wp4: 1.00 1.00 .169 .113
  Wp5: 1.00 .221 1.00
  Wp6: .090 .266
  Wp7: 1.00

EX
  Rnk: 5.72 4.47 4.06 2.53 4.53 4.25 5.63 4.81
  Wp1: .005 .041 .000 .200 .072 1.00 1.00
  Wp2: 1.00 .001 1.00 1.00 .189 1.00
  Wp3: .272 1.00 1.00 .089 1.00
  Wp4: .014 .007 .000 .005
  Wp5: 1.00 .200 1.00
  Wp6: .090 1.00
  Wp7: 1.00

Macro FDR
  Rnk: 4.25 5.50 3.00 4.91 3.97 5.25 4.00 5.13
  Wp1: .019 .552 1.00 1.00 .390 1.00 .479
  Wp2: .003 1.00 .697 1.00 .152 1.00
  Wp3: .066 .552 .023 .349 .003
  Wp4: 1.00 1.00 1.00 1.00
  Wp5: .208 1.00 1.00
  Wp6: .101 1.00
  Wp7: .552

Macro FNR
  Rnk: 4.63 4.45 5.20 4.97 4.03 4.78 3.41 4.53
  Wp1: 1.00 1.00 1.00 1.00 1.00 .022 1.00
  Wp2: 1.00 1.00 1.00 1.00 .441 1.00
  Wp3: 1.00 .077 1.00 .058 1.00
  Wp4: .419 1.00 .501 1.00
  Wp5: .998 1.00 1.00
  Wp6: 1.00 1.00
  Wp7: .143

Macro
  Rnk: 4.81 5.09 4.25 4.84 3.94 5.16 3.41 4.50
  Wp1: 1.00 1.00 1.00 1.00 1.00 .000 1.00
  Wp2: 1.00 1.00 1.00 1.00 .031 1.00
  Wp3: 1.00 1.00 1.00 1.00 1.00
  Wp4: 1.00 1.00 .824 1.00
  Wp5: .355 1.00 1.00
  Wp6: .278 1.00
  Wp7: .203

Micro FDR
  Rnk: 4.81 4.84 2.72 3.13 4.81 4.78 5.19 5.72
  Wp1: 1.00 .001 .141 1.00 1.00 1.00 .045
  Wp2: .000 .023 1.00 1.00 1.00 .564
  Wp3: 1.00 .001 .001 .000 .001
  Wp4: .091 .079 .015 .007
  Wp5: 1.00 1.00 1.00
  Wp6: 1.00 1.00
  Wp7: 1.00

Micro FNR
  Rnk: 4.66 4.44 5.31 4.66 4.31 4.66 3.78 4.19
  Wp1: 1.00 .706 1.00 1.00 1.00 .136 1.00
  Wp2: 1.00 1.00 1.00 1.00 1.00 1.00
  Wp3: 1.00 .233 1.00 .215 .132
  Wp4: 1.00 1.00 1.00 1.00
  Wp5: 1.00 1.00 1.00
  Wp6: 1.00 1.00
  Wp7: 1.00

Micro
  Rnk: 4.56 4.69 4.41 3.75 4.69 5.03 4.25 4.63
  Wp1: 1.00 1.00 1.00 1.00 1.00 1.00 1.00
  Wp2: 1.00 1.00 1.00 1.00 1.00 1.00
  Wp3: 1.00 1.00 1.00 1.00 1.00
  Wp4: 1.00 1.00 1.00 1.00
  Wp5: 1.00 1.00 1.00
  Wp6: 1.00 1.00
  Wp7: 1.00
4.2 Naive Bayes Classifier
The results of the experimental study are presented in Table 3 and Figure (a). Tables 6, 8 and 10 show the full results of the experiment. Table 3 provides the results of the statistical evaluation, and Figure (a) visualises the average ranks and gives a view of the Nemenyi post-hoc procedure.
First, let us analyse the differences between the proposed heuristic and the simple ECC ensemble. The proposed method is tailored to optimise the macro-averaged loss, so we begin with the macro-averaged measures. Both methods are comparable in terms of recall, but the proposed one is significantly better in terms of precision, i.e. it makes significantly fewer false positive predictions. Consequently, under the macro-averaged loss, the proposed method outperforms the ECC ensemble. The same pattern is also present in the micro-averaged results, although there the difference is not significant. In contrast, under the example-based measures, except for the Hamming loss, there are no significant differences between the investigated methods.
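To make the discussion of the averaged criteria concrete, the sketch below computes macro- and micro-averaged FDR and FNR on a toy label matrix; all numbers are hypothetical and serve only to illustrate how the two families of measures are formed:

```python
import numpy as np

# Toy ground truth and predictions: 4 instances x 3 labels (hypothetical).
Y_true = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1]])
Y_pred = np.array([[1, 0, 1], [1, 1, 0], [1, 1, 0], [1, 0, 0]])

tp = ((Y_pred == 1) & (Y_true == 1)).sum(axis=0)  # per-label true positives
fp = ((Y_pred == 1) & (Y_true == 0)).sum(axis=0)  # per-label false positives
fn = ((Y_pred == 0) & (Y_true == 1)).sum(axis=0)  # per-label false negatives

# Macro averaging: compute the ratio per label, then average;
# every label carries the same weight.
macro_fdr = np.mean(fp / np.maximum(tp + fp, 1))
macro_fnr = np.mean(fn / np.maximum(tp + fn, 1))

# Micro averaging: pool the counts over all labels first;
# frequent labels dominate the result.
micro_fdr = fp.sum() / max((tp + fp).sum(), 1)
micro_fnr = fn.sum() / max((tp + fn).sum(), 1)
```

A method that performs poorly on a single rare label is punished heavily by the macro average but is barely visible in the micro average, which is why the two families of criteria can disagree.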
The results show that the proposed heuristic provides an effective way of improving the classification quality of a classifier chain ensemble. Moving the best-performing label-specific models to the beginning of the chain reduces the error that propagates along the chain. Moreover, the experimental study showed that the Naive Bayes classifier, combined with proper data preprocessing, may be effectively employed in classifier chain ensembles.
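The ordering step described above reduces to a single sort. A minimal sketch, assuming a hypothetical vector of per-label validation scores (e.g. per-label quality measured on the validation set; the numbers are made up):

```python
def order_labels(per_label_quality):
    """Return a chain order with the best-validated labels first,
    so that early (more reliable) predictions feed later classifiers
    and less error propagates along the chain."""
    return sorted(range(len(per_label_quality)),
                  key=lambda j: per_label_quality[j],
                  reverse=True)

# Hypothetical validation scores for a 4-label problem.
quality = [0.62, 0.91, 0.40, 0.77]
chain_order = order_labels(quality)  # -> [1, 3, 0, 2]
```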
Now, let us compare the proposed method to the other algorithm based on the dynamic chain approach. For the example-based criteria, the OOCC algorithm outperforms the proposed one in terms of FDR and the Hamming loss. Combined with the results achieved for the macro- and micro-averaged measures, this shows that the OOCC method seems to be too conservative. That is, it tends to make many false negative predictions in comparison to the other methods. Its outstanding results for the Hamming loss are a consequence of the imbalanced nature of multi-label data. Since the presence of labels is relatively rare, a prediction that contains many false negatives may achieve inadequately high performance under the Hamming loss Luaces2012 ().
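The toy computation below (hypothetical data, not taken from the benchmark sets) illustrates this effect: with a label density of 0.2, the trivial all-irrelevant prediction already reaches a Hamming loss of only 0.2 despite recovering no relevant label at all.

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label cells predicted incorrectly."""
    cells = len(Y_true) * len(Y_true[0])
    errors = sum(t != p
                 for row_t, row_p in zip(Y_true, Y_pred)
                 for t, p in zip(row_t, row_p))
    return errors / cells

# 3 instances x 5 labels, only 3 relevant cells (sparse, as is typical).
Y_true = [[1, 0, 0, 0, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 0, 0, 1]]
all_negative = [[0] * 5 for _ in Y_true]
loss = hamming_loss(Y_true, all_negative)  # 3 missed labels / 15 cells = 0.2
```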
On the other hand, the average ranks clearly show that the method based on the genetic algorithm achieves the best results among the investigated methods. The main reason is that the GA-based approach optimises the entire ensemble structure, whereas the investigated dynamic chain methods choose the best label order for a single classifier chain and only then combine the locally chosen chains into an ensemble. This gives us an important clue: an algorithm for dynamic chain order selection should consider not only a single chain but also the global structure of the entire ensemble.
Columns correspond to algorithms 1-4. Rnk: average ranks; Wp k: p-values of the pairwise post-hoc tests of algorithm k against algorithms k+1, ..., 4, in that order.

Hamming loss:   Rnk: 2.45 2.67 3.03 1.84;  Wp1: .702 .025 .063;  Wp2: .319 .319;  Wp3: .001
Zero-one loss:  Rnk: 2.52 2.14 2.97 2.38;  Wp1: .358 .199 .761;  Wp2: .006 .368;  Wp3: .097
EX FDR:         Rnk: 2.84 1.97 2.91 2.28;  Wp1: .019 .295 .134;  Wp2: .005 .239;  Wp3: .040
EX FNR:         Rnk: 2.56 2.03 2.53 2.88;  Wp1: .053 .700 .700;  Wp2: .136 .015;  Wp3: .136
EX:             Rnk: 2.69 2.03 2.84 2.44;  Wp1: .066 .821 .821;  Wp2: .017 .112;  Wp3: .590
Macro FDR:      Rnk: 2.31 2.22 3.16 2.31;  Wp1: 1.00 .012 1.00;  Wp2: .031 1.00;  Wp3: .012
Macro FNR:      Rnk: 2.56 1.94 2.36 3.14;  Wp1: .035 .919 .022;  Wp2: .174 .002;  Wp3: .105
Macro:          Rnk: 2.47 1.81 2.84 2.88;  Wp1: .156 .096 .248;  Wp2: .003 .022;  Wp3: .733
Micro FDR:      Rnk: 2.47 2.66 3.22 1.66;  Wp1: 1.00 .028 .000;  Wp2: 1.00 .008;  Wp3: .000
Micro FNR:      Rnk: 2.69 1.72 2.34 3.25;  Wp1: .005 .254 .239;  Wp2: .052 .000;  Wp3: .014
Micro:          Rnk: 2.75 1.78 2.81 2.66;  Wp1: .044 .610 1.00;  Wp2: .002 .044;  Wp3: 1.00
4.3 Nearest Neighbour Classifier
The results of the experimental study are presented in Table 4 and Figure (b). Tables 11, 12 and 13 show the full results of the experiment. Table 4 provides the results of the statistical evaluation, and Figure (b) visualises the average ranks and gives a view of the Nemenyi post-hoc procedure.
The results show that, for the group of example-based measures and the zero-one loss, there are no significant differences in classification quality between the investigated algorithms.
For the macro- and micro-averaged measures, the best-performing algorithm is the ensemble optimised using the genetic algorithm. The proposed nearest-neighbour-based classifier does not differ significantly from the ECC and OOCC algorithms. However, it tends to be more conservative, because it achieves a lower FDR and a higher FNR. In other words, the classifier tends to decrease the false positive rate at the cost of decreasing the true positive rate. This phenomenon also explains its highest classification quality in terms of the Hamming loss: for a multi-label set with low label density, high classification quality under the Hamming loss is easy to obtain simply by marking all labels as irrelevant.
The results confirm the findings described in Section 4.1. The nearest-neighbour-based CC algorithm is unable to take full advantage of the chaining approach. On the other hand, the method is still comparable to chains built using other base classifiers. The main goal of this paper was to propose a model that can change the chain structure without retraining, and this goal was achieved.
Columns correspond to algorithms 1-4. Rnk: average ranks; Wp k: p-values of the pairwise post-hoc tests of algorithm k against algorithms k+1, ..., 4, in that order.

Hamming loss:   Rnk: 1.72 2.83 2.80 2.65;  Wp1: .004 .000 .006;  Wp2: .441 .173;  Wp3: .441
Zero-one loss:  Rnk: 2.22 2.73 2.42 2.63;  Wp1: 1.00 1.00 1.00;  Wp2: 1.00 1.00;  Wp3: 1.00
EX FDR:         Rnk: 2.17 2.57 2.57 2.70;  Wp1: 1.00 1.00 .948;  Wp2: 1.00 1.00;  Wp3: 1.00
EX FNR:         Rnk: 2.60 2.47 2.40 2.53;  Wp1: 1.00 .985 1.00;  Wp2: 1.00 1.00;  Wp3: 1.00
EX:             Rnk: 2.43 2.57 2.47 2.53;  Wp1: 1.00 1.00 1.00;  Wp2: 1.00 1.00;  Wp3: 1.00
Macro FDR:      Rnk: 2.27 1.67 3.10 2.97;  Wp1: .017 .010 .012;  Wp2: .000 .003;  Wp3: .271
Macro FNR:      Rnk: 2.97 1.70 2.78 2.55;  Wp1: .000 .617 .617;  Wp2: .000 .004;  Wp3: .561
Macro:          Rnk: 2.60 1.53 3.03 2.83;  Wp1: .000 .276 .579;  Wp2: .000 .000;  Wp3: .579
Micro FDR:      Rnk: 1.67 2.97 2.77 2.60;  Wp1: .009 .000 .028;  Wp2: .382 .070;  Wp3: .715
Micro FNR:      Rnk: 2.97 1.70 2.77 2.57;  Wp1: .000 .926 .926;  Wp2: .002 .019;  Wp3: .926
Micro:          Rnk: 2.63 1.63 3.00 2.73;  Wp1: .002 .532 .657;  Wp2: .001 .006;  Wp3: .657
5 Conclusions
The main goal of this research was to provide an effective chain classifier that allows changing the label order at a relatively low computational cost. We achieved this using a classifier based on the Naive Bayes approach. To prove that the proposed method can handle inter-label relations in an efficient way, we proposed a simple heuristic that determines a label order intended to minimise the label propagation error. Indeed, the experimental results showed that the proposed method is able to produce a good chain structure at a low computational cost. However, the proposed method of building a dynamic ensemble does not outperform the static system that optimises the entire ensemble structure. The obtained results are nevertheless very promising, and we believe that there is still room for improvement. In our opinion, the performance of the system may be improved by a better heuristic that optimises the entire ensemble in a dynamic way. The proposed dynamic classifier is a first step in the process of investigating dynamic classifier chain ensembles.
Another direction for improvement is to develop other base classifiers that are able to change the chain structure without retraining the entire model.
Acknowledgment
The work was supported by the statutory funds of the Department of Systems and Computer Networks, Wroclaw University of Science and Technology, under agreement 0401/0159/16.
References
 (1) Alvares Cherman, E., Metz, J., Monard, M.C.: A simple approach to incorporate label dependency in multilabel classification. In: Advances in Soft Computing, pp. 33–43. Springer Berlin Heidelberg (2010). DOI 10.1007/978-3-642-16773-7_3
 (2) Chang, C.C., Lin, C.J.: LIBSVM. ACM Transactions on Intelligent Systems and Technology 2(3), 1–27 (2011). DOI 10.1145/1961189.1961199
 (3) Charte, F., Rivera, A., del Jesus, M.J., Herrera, F.: Concurrence among imbalanced labels and its influence on multilabel resampling algorithms. In: Lecture Notes in Computer Science, pp. 110–121. Springer International Publishing (2014). DOI 10.1007/978-3-319-07617-1_10
 (4) Charte, F., Rivera, A.J., del Jesus, M.J., Herrera, F.: Quinta: A question tagging assistant to improve the answering ratio in electronic forums. In: IEEE EUROCON 2015  International Conference on Computer as a Tool (EUROCON). IEEE (2015). DOI 10.1109/eurocon.2015.7313677
 (5) Chen, B., Li, W., Zhang, Y., Hu, J.: Enhancing multilabel classification based on local label constraints and classifier chains. In: 2016 International Joint Conference on Neural Networks (IJCNN). IEEE (2016). DOI 10.1109/ijcnn.2016.7727370
 (6) Cheng, W., Hüllermeier, E., Dembczynski, K.J.: Bayes optimal multilabel classification via probabilistic classifier chains. In: Proceedings of the 27th international conference on machine learning (ICML10), pp. 279–286 (2010)
 (7) Cortes, C., Vapnik, V.: Supportvector networks. Machine Learning 20(3), 273–297 (1995). DOI 10.1007/bf00994018
 (8) Cover, T., Hart, P.: Nearest neighbor pattern classification. IEEE Transactions on Information Theory 13(1), 21–27 (1967). DOI 10.1109/tit.1967.1053964
 (9) Díez, J., Luaces, O., del Coz, J.J., Bahamonde, A.: Optimizing different loss functions in multilabel classifications. Progress in Artificial Intelligence 3(2), 107–118 (2014). DOI 10.1007/s1374801400607
 (10) Demšar, J.: Statistical comparisons of classifiers over multiple data sets. The Journal of Machine Learning Research 7, 1–30 (2006)
 (11) Dhar, M.: On cardinality of fuzzy sets. International Journal of Intelligent Systems and Applications 5(6), 47–52 (2013). DOI 10.5815/ijisa.2013.06.06
 (12) Friedman, M.: A comparison of alternative tests of significance for the problem of rankings. The Annals of Mathematical Statistics 11(1), 86–92 (1940). DOI 10.1214/aoms/1177731944
 (13) García, V., Sánchez, J., Mollineda, R.: On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. KnowledgeBased Systems 25(1), 13–21 (2012). DOI 10.1016/j.knosys.2011.06.013
 (14) Gibaja, E., Ventura, S.: Multilabel learning: a review of the state of the art and ongoing research. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 4(6), 411–444 (2014). DOI 10.1002/widm.1139
 (15) Gonçalves, E.C., Plastino, A., Freitas, A.A.: Simpler is better. In: Proceedings of the 2015 on Genetic and Evolutionary Computation Conference  GECCO ’15. ACM Press (2015). DOI 10.1145/2739480.2754650
 (16) Goncalves, E.C., Plastino, A., Freitas, A.A.: A genetic algorithm for optimizing the label ordering in multilabel classifier chains. In: 2013 IEEE 25th International Conference on Tools with Artificial Intelligence. IEEE (2013). DOI 10.1109/ictai.2013.76
 (17) Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software. ACM SIGKDD Explorations Newsletter 11(1), 10 (2009). DOI 10.1145/1656274.1656278
 (18) Hall, M.A.: Correlationbased feature selection for machine learning. Ph.D. thesis, The University of Waikato (1999)
 (19) Hand, D.J., Yu, K.: Idiot’s bayes: Not so stupid after all? International Statistical Review / Revue Internationale de Statistique 69(3), 385 (2001). DOI 10.2307/1403452
 (20) Holm, S.: A Simple Sequentially Rejective Multiple Test Procedure. Scandinavian Journal of Statistics 6(2), 65–70 (1979). DOI 10.2307/4615733
 (21) Huang, J., Li, G., Wang, S., Zhang, W., Huang, Q.: Group sensitive classifier chains for multilabel classification. In: 2015 IEEE International Conference on Multimedia and Expo (ICME). IEEE (2015). DOI 10.1109/icme.2015.7177400
 (22) Jiang, J.Y., Tsai, S.C., Lee, S.J.: Fsknn: Multilabel text categorization based on fuzzy similarity and k nearest neighbors. Expert Systems with Applications 39(3), 2813–2821 (2012). DOI 10.1016/j.eswa.2011.08.141
 (23) Joachims, T.: Text categorization with suport vector machines: Learning with many relevant features. In: Proc. 10th European Conference on Machine Learning, pp. 137–142 (1998)
 (24) Liu, X., Shi, Z., Li, Z., Wang, X., Shi, Z.: Sorted label classifier chains for learning images with multilabel. In: Proceedings of the international conference on Multimedia  MM ’10. ACM Press (2010). DOI 10.1145/1873951.1874121
 (25) Luaces, O., Díez, J., Barranquero, J., del Coz, J.J., Bahamonde, A.: Binary relevance efficacy for multilabel classification. Progress in Artificial Intelligence 1(4), 303–313 (2012). DOI 10.1007/s137480120030x
 (26) Madjarov, G., Kocev, D., Gjorgjevikj, D., Džeroski, S.: An extensive experimental comparison of methods for multilabel learning. Pattern Recognition 45(9), 3084–3104 (2012). DOI 10.1016/j.patcog.2012.03.004
 (27) Montañes, E., Senge, R., Barranquero, J., Ramón Quevedo, J., José del Coz, J., Hüllermeier, E.: Dependent binary relevance models for multilabel classification. Pattern Recognition 47(3), 1494–1508 (2014). DOI 10.1016/j.patcog.2013.09.029
 (28) Quinlan, J.R.: C4.5 : Programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1993)
 (29) Read, J., Martino, L., Luengo, D.: Efficient monte carlo methods for multidimensional learning with classifier chains. Pattern Recognition 47(3), 1535–1546 (2014). DOI 10.1016/j.patcog.2013.10.006
 (30) Read, J., Reutemann, P.: MEKA (2017). URL http://meka.sourceforge.net/
 (31) Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multilabel classification. In: Machine Learning and Knowledge Discovery in Databases, pp. 254–269. Springer Berlin Heidelberg (2009). DOI 10.1007/978-3-642-04174-7_17
 (32) Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multilabel classification. Machine Learning 85(3), 333–359 (2011). DOI 10.1007/s1099401152565
 (33) Sanden, C., Zhang, J.Z.: Enhancing multilabel music genre classification through ensemble techniques. In: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information  SIGIR ’11. ACM Press (2011). DOI 10.1145/2009916.2010011
 (34) Senge, R., del Coz, J.J., Hüllermeier, E.: On the problem of error propagation in classifier chains for multilabel classification. In: Studies in Classification, Data Analysis, and Knowledge Organization, pp. 163–170. Springer International Publishing (2013). DOI 10.1007/978-3-319-01595-8_18
 (35) da Silva, P.N., Gonçalves, E.C., Plastino, A., Freitas, A.A.: Distinct chains for different instances: An effective strategy for multilabel classifier chains. In: Machine Learning and Knowledge Discovery in Databases, pp. 453–468. Springer Berlin Heidelberg (2014). DOI 10.1007/978-3-662-44851-9_29
 (36) SpyromitrosXioufis, E., Tsoumakas, G., Groves, W., Vlahavas, I.: Multitarget regression via input space expansion: treating targets as inputs. Machine Learning 104(1), 55–98 (2016). DOI 10.1007/s109940165546z
 (38) Tomás, J.T., Spolaôr, N., Cherman, E.A., Monard, M.C.: A framework to generate synthetic multilabel datasets. Electronic Notes in Theoretical Computer Science 302, 155–176 (2014). DOI 10.1016/j.entcs.2014.01.025
 (39) Trajdos, P., Kurzynski, M.: Naive bayes classifier for dynamic chaining approach in multilabel learning. International Journal of Education and Learning Systems 2, 133–142 (2017). URL http://www.iaras.org/iaras/filedownloads/ijels/2017/0020019(2017).pdf
 (40) Trajdos, P., Kurzynski, M.: Permutationbased diversity measure for classifierchain approach. In: Advances in Intelligent Systems and Computing, pp. 412–422. Springer International Publishing (2017). DOI 10.1007/978-3-319-59162-9_43
 (41) Tsoumakas, G., Katakis, I., Vlahavas, I.: Effective and efficient multilabel classification in domains with large number of labels, p. 30–44 (2008)
 (42) Wei, Y., Xia, W., Lin, M., Huang, J., Ni, B., Dong, J., Zhao, Y., Yan, S.: Hcp: A flexible cnn framework for multilabel image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(9), 1901–1907 (2016). DOI 10.1109/tpami.2015.2491929
 (43) Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics Bulletin 1(6), 80 (1945). DOI 10.2307/3001968
 (44) Wu, J.S., Huang, S.J., Zhou, Z.H.: Genomewide protein function prediction through multiinstance multilabel learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 11(5), 891–902 (2014). DOI 10.1109/tcbb.2014.2323058
 (45) Xu, J.: Fast multilabel core vector machine. Pattern Recognition 46(3), 885–898 (2013). DOI 10.1016/j.patcog.2012.09.003
 (46) Zadeh, L.: Fuzzy sets. Information and Control 8(3), 338–353 (1965). DOI 10.1016/s00199958(65)90241x
 (47) Zhang, P., Yang, Y., Zhu, X.: Approaching multidimensional classification by using bayesian network chain classifiers. In: 2014 Sixth International Conference on Intelligent HumanMachine Systems and Cybernetics. IEEE (2014). DOI 10.1109/ihmsc.2014.129
 (48) Zhou, Z.H., Zhang, M.L., Huang, S.J., Li, Y.F.: Multiinstance multilabel learning. Artificial Intelligence 176(1), 2291–2320 (2012). DOI 10.1016/j.artint.2011.10.002
Appendix A Full results
Per-dataset results. Each row gives one dataset (No.); for each measure the columns correspond to algorithms 1-8.

No.  Macro FDR (Alg. 1-8)  Macro FNR (Alg. 1-8)  Macro (Alg. 1-8)
1  .668  .622  .758  .645  .789  .780  .666  .590  .682  .642  .775  .604  .802  .782  .692  .622  .694  .651  .781  .649  .810  .796  .697  .625 
2  .639  .608  .507  .527  .568  .565  .722  .664  .638  .593  .529  .530  .564  .561  .678  .623  .663  .620  .528  .534  .575  .570  .729  .678 
3  .386  .402  .384  .418  .367  .389  .394  .402  .408  .429  .411  .423  .400  .413  .422  .438  .410  .432  .411  .433  .398  .414  .424  .433 
4  .526  .449  .409  .447  .449  .440  .534  .678  .556  .486  .428  .471  .464  .474  .567  .655  .557  .481  .425  .468  .467  .470  .572  .693 
5  .582  .476  .411  .453  .453  .482  .547  .623  .605  .487  .434  .469  .459  .496  .565  .608  .621  .486  .438  .467  .470  .494  .586  .639 
6  .351  .365  .330  .333  .368  .373  .342  .343  .335  .337  .322  .310  .264  .265  .324  .302  .376  .382  .359  .350  .353  .352  .365  .354 
7  .466  .510  .442  .498  .493  .517  .481  .515  .408  .398  .397  .414  .418  .419  .404  .399  .472  .485  .450  .483  .495  .506  .476  .492 
8  .293  .280  .280  .283  .276  .271  .284  .294  .208  .216  .231  .215  .248  .213  .194  .221  .262  .258  .271  .263  .276  .255  .251  .267 
9  .210  .216  .196  .198  .194  .204  .195  .190  .210  .216  .198  .200  .196  .207  .198  .192  .211  .217  .198  .200  .196  .207  .197  .192 
10  .007  .006  .008  .007  .009  .008  .007  .005  .051  .052  .052  .049  .052  .053  .051  .052  .037  .037  .038  .035  .038  .038  .037  .037 
11  .617  .543  .532  .528  .552  .556  .693  .598  .628  .550  .549  .548  .565  .581  .677  .601  .640  .561  .553  .551  .568  .577  .714  .609 
12  .507  .538  .527  .512  .574  .524  .668  .626  .554  .546  .575  .527  .597  .571  .662  .628  .551  .562  .565  .532  .597  .565  .691  .646 
13  .730  .703  .700  .640  .703  .699  .724  .739  .667  .633  .615  .577  .590  .575  .628  .594  .719  .691  .682  .628  .676  .668  .703  .699 
14  .416  .405  .380  .336  .530  .524  .411  .393  .427  .431  .400  .378  .239  .264  .411  .425  .439  .434  .410  .374  .453  .454  .430  .424 
15  .838  .785  .836  .746  .812  .716  .834  .751  .794  .690  .794  .683  .793  .731  .785  .680  .841  .769  .839  .739  .827  .756  .833  .743 
16  .713  .728  .716  .699  .692  .704  .715  .699  .689  .680  .720  .711  .647  .672  .682  .652  .715  .723  .727  .712  .689  .706  .714  .692 
17  .258  .244  .259  .228  .266  .262  .256  .232  .213  .227  .211  .235  .220  .239  .211  .233  .255  .249  .254  .243  .262  .266  .252  .245 
18  .522  .498  .446  .411  .469  .488  .472  .470  .516  .511  .447  .440  .387  .388  .472  .489  .537  .521  .465  .443  .459  .471  .491  .494 
19  .525  .471  .516  .452  .542  .524  .526  .459  .516  .488  .488  .474  .540  .522  .511  .493  .554  .513  .538  .499  .573  .556  .553  .511 
20  .800  .761  .783  .727  .774  .751  .806  .757  .693  .604  .706  .605  .714  .652  .707  .638  .772  .717  .762  .688  .760  .724  .779  .721 
21  .703  .664  .606  .610  .648  .627  .742  .749  .701  .686  .630  .644  .652  .671  .642  .712  .725  .687  .630  .639  .666  .663  .748  .762 
22  .541  .486  .455  .457  .489  .476  .826  .720  .560  .502  .467  .470  .497  .486  .785  .689  .564  .501  .468  .469  .501  .487  .841  .728 
23  .366  .335  .346  .289  .380  .366  .323  .277  .354  .342  .332  .303  .191  .210  .273  .281  .369  .345  .348  .302  .324  .320  .313  .287 
24  .261  .232  .165  .117  .200  .198  .190  .166  .485  .471  .433  .414  .404  .398  .438  .440  .428  .408  .361  .331  .361  .358  .373  .365 
25  .568  .636  .543  .580  .551  .561  .698  .706  .673  .596  .765  .679  .713  .699  .486  .500  .664  .652  .710  .669  .682  .676  .651  .657 
26  .694  .572  .677  .546  .682  .561  .701  .559  .687  .551  .667  .551  .656  .524  .681  .576  .701  .577  .684  .560  .684  .559  .704  .577 
27  .578  .553  .562  .555  .591  .577  .562  .570  .575  .559  .575  .581  .590  .570  .575  .582  .596  .576  .586  .585  .609  .594  .586  .595 
28  .345  .348  .334  .333  .350  .343  .349  .355  .357  .342  .344  .342  .295  .291  .357  .348  .396  .391  .384  .379  .369  .364  .397  .396 
29  .509  .506  .419  .510  .531  .529  .502  .484  .335  .319  .458  .334  .240  .256  .351  .360  .460  .457  .470  .468  .445  .447  .464  .460 
30  .456  .389  .392  .398  .422  .405  .436  .515  .281  .340  .299  .310  .288  .303  .309  .313  .406  .390  .375  .383  .389  .384  .409  .460 
Per-dataset results (continued). Each row gives one dataset (No.); for each measure the columns correspond to algorithms 1-8.

No.  Macro FDR (Alg. 1-8)  Macro FNR (Alg. 1-8)  Macro (Alg. 1-8)  Hamming (Alg. 1-8)
1  . 569  .613  .563  .648  .583  .635  .564  .647  .760  .767  .811  .754  .798  .798  .769  .771  .715  .732  .768  .755  .769  .778  .718  .744  .060  .064  .061  .086  .062  .065  .058  .068 
2  . 797  .879  .804  .909  .835  .901  .807  .856  .804  .857  .894  .919  .896  .926  .729  .817  .822  .883  .875  .918  .880  .924  .784  .853  .168  .168  .124  .113  .135  .124  .214  .199 
3  . 717  .739  .708  .738  .713  .729  .697  .738  .777  .783  .777  .809  .789  .790  .767  .825  .768  .779  .761  .793  .768  .779  .752  .803  .054  .055  .053  .058  .051  .054  .053  .053 
4  . 606  .917  .282  .941  .502  .866  .565  .870  .774  .970  .807  .982  .792  .974  .726  .775  .726  .962  .720  .976  .723  .959  .675  .838  .127  .118  .099  .115  .111  .117  .126  .258 
5  . 724  .981  .520  .975  .584  .980  .682  .856  .764  .985  .812  .981  .796  .983  .701  .829  .758  .984  .757  .979  .753  .983  .697  .852  .156  .122  .117  .121  .130  .123  .155  .228 
6  . 335  .345  .310  .327  .391  .394  .329  .341  .360  .353  .341  .320  .275  .279  .340  .319  .358  .356  .338  .332  .344  .349  .345  .338  .207  .212  .195  .200  .224  .223  .202  .207 
7  . 763  .770  .751  .767  .759  .773  .759  .766  .729  .731  .739  .750  .716  .733  .728  .735  .758  .762  .757  .768  .752  .766  .755  .761  .066  .070  .060  .066  .070  .076  .066  .071 
8  . 320  .342  .342  .357  .338  .308  .326  .347  .280  .295  .312  .327  .339  .300  .264  .307  .324  .332  .345  .362  .358  .325  .312  .344  .248  .244  .255  .250  .248  .239  .240  .253 
9  . 618  .625  .547  .618  .554  .573  .566  .563  .693  .690  .692  .721  .647  .645  .692  .682  .672  .673  .654  .691  .621  .627  .656  .644  .081  .084  .075  .078  .075  .079  .075  .073 
10  . 250  .245  .255  .238  .242  .245  .246  .240  .246  .242  .250  .235  .239  .245  .242  .241  .248  .244  .252  .237  .241  .246  .244  .241  .005  .005  .005  .005  .005  .005  .005  .005 
11  . 847  .873  .831  .859  .844  .864  .829  .821  .818  .844  .853  .857  .870  .894  .745  .885  .847  .872  .852  .865  .864  .892  .805  .868  .152  .137  .126  .120  .126  .122  .206  .145 
12  . 820  .843  .859  .844  .806  .883  .819  .834  .839  .807  .881  .831  .852  .880  .811  .837  .844  .841  .877  .849  .841  .889  .829  .852  .123  .146  .123  .128  .127  .127  .176  .171 
13  . 837  .832  .800  .789  .795  .802  .817  .862  .846  .808  .806  .790  .789  .796  .803  .782  .847  .831  .812  .803  .812  .822  .832  .843  .111  .123  .112  .110  .108  .114  .124  .156 
14  . 291  .304  .275  .268  .502  .488  .323  .287  .447  .455  .425  .404  .269  .290  .439  .449  .383  .393  .361  .346  .445  .443  .389  .382  .169  .175  .162  .158  .305  .293  .178  .170 
15  . 911  .903  .906  .907  .907  .899  .908  .914  .899  .866  .890  .895  .902  .896  .893  .891  .914  .902  .912  .912  .907  .903  .909  .911  .124  .136  .129  .114  .117  .110  .127  .118 
16  . 653  .662  .668  .669  .651  .650  .658  .657  .658  .651  .672  .671  .646  .653  .658  .650  .662  .662  .674  .673  .653  .656  .662  .659  .020  .025  .018  .018  .022  .022  .022  .026 
17  . 382  .380  .381  .380  .383  .388  .383  .384  .360  .366  .358  .375  .361  .372  .360  .377  .377  .378  .375  .382  .378  .384  .377  .385  .012  .012  .012  .011  .013  .012  .012  .011 
18  . 407  .417  .342  .357  .477  .504  .398  .381  .536  .529  .469  .463  .408  .404  .490  .503  .483  .484  .417  .420  .453  .468  .454  .455  .213  .219  .190  .196  .240  .256  .208  .205 
19  . 513  .505  .536  .513  .505  .507  .524  .504  .591  .581  .567  .580  .610  .595  .581  .594  .578  .564  .574  .566  .585  .574  .577  .569  .064  .062  .065  .063  .064  .063  .065  .062 
20  . 849  .848  .814  .805  .835  .819  .816  .849  .784  .749  .811  .759  .834  .773  .806  .782  .837  .819  .821  .802  .843  .812  .832  .834  .151  .172  .128  .151  .122  .138  .145  .161 
21  . 809  .891  .857  .945  .853  .904  .830  .887  .810  .933  .919  .947  .898  .940  .692  .851  .826  .929  .911  .948  .888  .934  .789  .878  .168  .135  .127  .121  .141  .123  .240  .209 
22  . 825  .952  .826  .950  .828  .934  .882  .894  .927  .970  .964  .973  .945  .969  .756  .809  .909  .970  .945  .968  .923  .965  .843  .866  .098  .087  .084  .085  .091  .087  .215  .206 
23  . 185  .182  .172  .173  .346  .332  .249  .207  .356  .343  .335  .311  .202  .220  .280  .289  .286  .275  .268  .253  .290  .287  .272  .256  .092  .090  .088  .085  .129  .122  .100  .090 
24  . 151  .131  .106  .091  .202  .211  .141  .111  .583  .572  .543  .527  .516  .513  .542  .546  .446  .432  .400  .382  .405  .405  .407  .402  .126  .122  .114  .110  .124  .125  .118  .115 
25  . 680  .721  .733  .762  .679  .657  .744  .750  .787  .695  .882  .788  .828  .809  .563  .576  .774  .726  .881  .803  .809  .787  .678  .687  .249  .308  .221  .274  .234  .240  .392  .396 
26  . 516  .505  .522  .469  .535  .517  .523  .502  .633  .588  .620  .594  .609  .566  .621  .619  .598  .570  .591  .558  .589  .557  .596  .585  .048  .051  .049  .046  .052  .051  .052  .048 
27  . 598  .597  .586  .608  .574  .605  .571  .608  .644  .647  .681  .674  .701  .697  .673  .682  .658  .661  .676  .678  .671  .676  .669  .679  .083  .080  .077  .079  .080  .081  .077  .084 
28  . 449  .466  .434  .455  .438  .454  .423  .459  .545  .521  .506  .549  .464  .485  .550  .558  .528  .521  .498  .528  .474  .489  .520  .538  .074  .075  .073  .073  .073  .073  .075  .076 
29  . 532  .533  .491  .533  .534  .526  .534  .522  .374  .356 