Trepan Reloaded: A Knowledge-driven Approach to Explaining Black-box Models


Roberto Confalonieri, Tillman Weyde, Tarek R. Besold, Fermín Moscoso del Prado Martín

Affiliations: Telefónica Innovación Alpha, email: {roberto.confalonieri,tarek.besold,fermin.moscoso}@telefonica.com; Dept. of Computer Science, City, University of London.

Explainability in Artificial Intelligence has been revived as a topic of active research by the need of conveying safety and trust to users in the ‘how’ and ‘why’ of automated decision-making. Whilst a plethora of approaches have been developed for post-hoc explainability, only a few focus on how to use domain knowledge, and how this influences the understandability of global explanations from the users’ perspective. In this paper, we show how ontologies help the understandability of global post-hoc explanations, presented in the form of symbolic models. In particular, we build on Trepan, an algorithm that explains artificial neural networks by means of decision trees, and we extend it to include ontologies modeling domain knowledge in the process of generating explanations. We present the results of a user study that measures the understandability of decision trees using a syntactic complexity measure, and through time and accuracy of responses as well as reported user confidence and understandability. The user study considers domains where explanations are critical, namely, in finance and medicine. The results show that decision trees generated with our algorithm, taking into account domain knowledge, are more understandable than those generated by standard Trepan without the use of ontologies.


1 Introduction

In recent years, explainability has been identified as a key factor for the adoption of AI systems in a wide range of contexts [Confalonieriatal2019cogsci, HoffmanMKL18, Doshi-Velez2017, Lipton2018, Ribeiro2016, Miller2017]. The emergence of intelligent systems in self-driving cars, medical diagnosis, insurance and financial services among others has shown that when decisions are taken or suggested by automated systems it is essential for practical, social, and increasingly legal reasons that an explanation can be provided to users, developers or regulators. As a case in point, the European Union’s General Data Protection Regulation (GDPR) stipulates a right to “meaningful information about the logic involved” (commonly interpreted as a ‘right to an explanation’) for consumers affected by an automatic decision [GDPR2016]. (Footnote 1: Regulation (EU) 2016/679 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) [2016] OJ L119/1.)

The reasons for equipping intelligent systems with explanation capabilities are not limited to user rights and acceptance. Explainability is also needed for designers and developers to enhance system robustness and enable diagnostics to prevent bias, unfairness and discrimination [mehrabi2019survey], as well as to increase trust by all users in why and how decisions are made. Against that backdrop, increasing efforts are directed towards studying and provisioning explainable intelligent systems, both in industry and academia, sparked by initiatives like the DARPA Explainable Artificial Intelligence Program (XAI), and carried by a growing number of scientific conferences and workshops dedicated to explainability.

While interest in XAI had subsided together with that in expert systems after the mid-1980s [Buchanan1984, Wick1992], recent successes in machine learning technology have brought explainability back into the focus. This has led to a plethora of new approaches for local and global post-hoc explanations of black-box models [Guidotti2018], for both autonomous and human-in-the-loop systems, aiming to achieve explainability without sacrificing system performance. Only a few of these approaches, however, focus on how to integrate and use domain knowledge to drive the explanation process (e.g., [Towell1993, Renard2019ConceptTree]) or to measure the understandability of explanations of black-box models (e.g., [Ribeiro2018]). For that reason an important foundational aspect of explainable AI remains hitherto mostly unexplored: Can the integration of domain knowledge as, e.g., modeled by means of ontologies, help the understandability of interpretable machine learning models?

To tackle this research question, we propose a neural-symbolic learning approach based on Trepan [Craven1995], an algorithm devised in order to explain trained artificial neural networks by means of decision trees, and we extend it to take into account ontologies in the explanation generation process. In particular, we modify the logic of the algorithm when choosing split nodes, to prefer features associated with more general concepts in a domain ontology. Having explanations bounded to structured knowledge, in the form of ontologies, conveys two advantages. First, it enriches explanations (or the elements therein) with semantic information, and facilitates effective knowledge transmission to users. Second, it supports the customisation of the levels of specificity and generality of explanations to specific user profiles [Hind2019exXAI].

In this paper, we focus on the first advantage, and on measuring the impact of the ontology on the perceived understandability of surrogate decision trees. To evaluate our approach, we designed and conducted an experiment to measure the understandability of decision trees in domains where explanations are critical, namely the financial and medical domains. Our study shows that decision trees generated by our modified Trepan algorithm taking domain knowledge into account are more understandable than those generated without the use of domain knowledge. Crucially, this enhanced understandability of the resulting trees is achieved with little compromise on the accuracy with which the resulting trees replicate the behaviour of the original neural network model.

The remainder of the paper is organised as follows. After introducing Trepan, and the notion of ontologies (Section 2), we present our revised version of the algorithm that takes into account ontologies in the decision tree extraction (Section 3). In Section 4, we propose how to measure understandability of decision trees from a technical and a user perspective. Section 5 reports and analyses the results of our experiment. After discussing our approach (Section 6), Section 7 situates our results in the context of related contributions in XAI. Finally, Section 8 concludes the paper and outlines possible lines of future work.

2 Preliminaries

In this section, we present the main foundations of our approach, namely, the Trepan algorithm and ontologies.

2.1 The Trepan algorithm

Trepan is a tree induction algorithm that recursively extracts decision trees from oracles, in particular from feed-forward neural networks [Craven1995]. The original motivation behind the development of Trepan was to approximate a neural network by means of a symbolic structure that is more interpretable than a neural network classification model. This was in the context of a wider interest in knowledge extraction from neural networks (see [Towell1993, AvilaGarcez2001] for an overview).

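Priority queue $Q \leftarrow \emptyset$
use $Oracle$ to label examples in $Training$
enqueue root node into $Q$
while $Q \neq \emptyset$ and the tree size limit is not reached do
     pop node $n$ from $Q$
     generate examples for $n$
     use $Features$ to build set of candidate splits
     use examples and $Oracle$ to determine best split
     add $n$ to tree $T$
     for each outcome $c$ of the best split do
         add $c$ as child of $n$
         if $c$ is not a leaf according to the $Oracle$ then
              enqueue node $c$ into $Q$ with negative
              information gain as priority
         end if
     end for
end while
Algorithm 1: Trepan($Oracle$, $Training$, $Features$).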

The pseudo-code for Trepan is shown in Algorithm 1. Trepan differs from conventional inductive learning algorithms in that it uses an oracle to classify examples during the learning process. It generates new examples by sampling from distributions over the given examples and constraints, so that the amount of training data used to select splitting tests and to label leaves does not decrease with the depth of the tree. It expands the tree in a best-first manner, using a priority queue ordered by entropy that prioritises nodes with greater potential for improvement. Further details of the algorithm can be found in [Craven1995].

Trepan stops the tree extraction process when either of two criteria is met: no node needs to be expanded further because the entropy of every remaining node is low (each contains almost exclusively instances of a single class), or a predefined limit on the tree size (the number of nodes) is reached. Whilst Trepan was designed to explain neural networks as the oracle, it is a model-agnostic algorithm and can be used to explain any other classification model.
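The best-first expansion and the two stopping criteria can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes numeric features and simple threshold splits (no m-of-n tests), it omits Trepan's sampling of new examples, and all names (`extract_tree`, `best_split`, `predict`, `max_internal`, `min_entropy`) are hypothetical.

```python
import heapq
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def best_split(xs, ys, n_features):
    """Return (gain, feature, threshold) of the best binary split, or None."""
    base, best = entropy(ys), None
    for f in range(n_features):
        for t in sorted({x[f] for x in xs}):
            left = [y for x, y in zip(xs, ys) if x[f] <= t]
            right = [y for x, y in zip(xs, ys) if x[f] > t]
            if not left or not right:
                continue
            weighted = (len(left) * entropy(left)
                        + len(right) * entropy(right)) / len(ys)
            if best is None or base - weighted > best[0]:
                best = (base - weighted, f, t)
    return best


def extract_tree(oracle, xs, n_features, max_internal=5, min_entropy=0.1):
    """Best-first surrogate tree; labels come from the oracle, not the data."""
    root = {"xs": list(xs), "ys": [oracle(x) for x in xs]}
    queue, tie_breaker, internal = [(0.0, 0, root)], 0, 0
    while queue and internal < max_internal:      # size-limit criterion
        _, _, node = heapq.heappop(queue)
        split = best_split(node["xs"], node["ys"], n_features)
        if split is None:
            continue                              # node stays a leaf
        gain, f, t = split
        node["feature"], node["threshold"] = f, t
        internal += 1
        for name, keep in (("left", lambda v: v <= t),
                           ("right", lambda v: v > t)):
            pairs = [(x, y) for x, y in zip(node["xs"], node["ys"])
                     if keep(x[f])]
            child = {"xs": [p[0] for p in pairs], "ys": [p[1] for p in pairs]}
            node[name] = child
            if entropy(child["ys"]) > min_entropy:  # low-entropy criterion
                tie_breaker += 1
                # negative information gain as priority (min-heap)
                heapq.heappush(queue, (-gain, tie_breaker, child))
    return root


def predict(node, x):
    """Follow the splits to a leaf and return its majority class."""
    while "feature" in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return Counter(node["ys"]).most_common(1)[0][0]


# Toy black box on a regular grid (purely illustrative data).
grid = [(i / 10, j / 10) for i in range(10) for j in range(10)]
black_box = lambda x: int(x[0] > 0.45 and x[1] > 0.25)
tree = extract_tree(black_box, grid, n_features=2)
agreement = sum(predict(tree, x) == black_box(x) for x in grid) / len(grid)
print(f"agreement with the black box on the grid: {agreement:.2f}")
```

The priority queue holds only nodes that are still worth expanding, so the loop ends either when the queue is empty (every remaining node is nearly pure) or when the size limit is hit.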

In this paper, our objective is to improve the understandability of the decision trees extracted by Trepan. To this end, we extend the algorithm to take into account an information content measure, that is derived using ontologies, and computed using the idea of concept refinement, as detailed below. In order to evaluate the performance of both the original and extended Trepan algorithms, we measure the accuracy and the fidelity of the resulting decision trees. Accuracy is defined as the percentage of test-set examples that are correctly classified. In contrast, fidelity is defined as the percentage of test-set examples on which the classification made by a tree agrees with that provided by its neural-network counterpart. Notice that the crucial measure for assessing the quality of the reconstructed tree is the fidelity, as this is the direct measure of how well the tree’s behaviour mimics the original neural network.
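The difference between the two measures is easy to see in code. The following sketch uses made-up label vectors (not data from the experiments) and hypothetical function names; accuracy compares the tree with the ground truth, whereas fidelity compares it with the network.

```python
def percentage_agreement(predictions, reference):
    """Percentage of positions on which two label sequences agree."""
    if len(predictions) != len(reference):
        raise ValueError("label sequences must have the same length")
    matches = sum(p == r for p, r in zip(predictions, reference))
    return 100.0 * matches / len(reference)


def accuracy(tree_predictions, true_labels):
    """Agreement of the tree with the test-set ground truth."""
    return percentage_agreement(tree_predictions, true_labels)


def fidelity(tree_predictions, network_predictions):
    """Agreement of the tree with the neural network it explains."""
    return percentage_agreement(tree_predictions, network_predictions)


# Illustrative labels only.
true_labels = [1, 0, 1, 1, 0]
network_out = [1, 0, 1, 0, 0]
tree_out = [1, 0, 0, 0, 0]
print(accuracy(tree_out, true_labels), fidelity(tree_out, network_out))
```

A tree can therefore have high fidelity and modest accuracy at the same time, namely when it faithfully reproduces the network's own mistakes.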

Figure 1: An ontology excerpt for the loan domain.

2.2 Ontologies

An ontology is a set of formulae in an appropriate logical language with the purpose of describing a particular domain of interest, such as finance or medicine. The precise logic used is not crucial for our approach as the techniques introduced here apply to a variety of logics. For the sake of clarity we use description logics (DLs) as well-known ontology languages. We briefly introduce the DL $\mathcal{EL}_\bot$, a DL allowing only conjunctions, existential restrictions, and the empty concept $\bot$. For full details, see [BaaderDLH03, DBLP:conf/ijcai/BaaderBL05]. $\mathcal{EL}_\bot$ is widely used in biomedical ontologies for describing large terminologies and it is the base of the OWL 2 EL profile.

Syntactically, $\mathcal{EL}_\bot$ is based on two disjoint sets $N_C$ and $N_R$ of concept names and role names, respectively. The set of $\mathcal{EL}_\bot$ concepts is generated by the grammar

$$C ::= \top \mid \bot \mid A \mid C \sqcap C \mid \exists r.C$$

where $A \in N_C$ and $r \in N_R$. A TBox $\mathcal{T}$ is a finite set of general concept inclusions (GCIs) of the form $C \sqsubseteq D$ where $C$ and $D$ are concepts. It stores the terminological knowledge regarding the relationships between concepts. An ABox is a finite set of assertions $C(a)$ and $r(a,b)$, which express knowledge about objects in the knowledge domain. An ontology is composed of a TBox and an ABox. In this paper, we focus on the TBox only, thus we will use the terms ontology and TBox interchangeably.

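The semantics of $\mathcal{EL}_\bot$ is based on interpretations of the form $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where $\Delta^{\mathcal{I}}$ is a non-empty domain, and $\cdot^{\mathcal{I}}$ is a function mapping every individual name to an element of $\Delta^{\mathcal{I}}$, each concept name to a subset of the domain, and each role name to a binary relation on the domain. $\mathcal{I}$ satisfies $C \sqsubseteq D$ iff $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$, and $\mathcal{I}$ satisfies an assertion $C(a)$ ($r(a,b)$) iff $a^{\mathcal{I}} \in C^{\mathcal{I}}$ ($(a^{\mathcal{I}}, b^{\mathcal{I}}) \in r^{\mathcal{I}}$). The interpretation $\mathcal{I}$ is a model of the TBox $\mathcal{T}$ if it satisfies all the GCIs and all the assertions in $\mathcal{T}$. $\mathcal{T}$ is consistent if it has a model. Given two concepts $C$ and $D$, $C$ is subsumed by $D$ w.r.t. $\mathcal{T}$ ($C \sqsubseteq_{\mathcal{T}} D$) if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ for every model $\mathcal{I}$ of $\mathcal{T}$. We write $C \equiv_{\mathcal{T}} D$ when $C \sqsubseteq_{\mathcal{T}} D$ and $D \sqsubseteq_{\mathcal{T}} C$. $C$ is strictly subsumed by $D$ w.r.t. $\mathcal{T}$ ($C \sqsubset_{\mathcal{T}} D$) if $C \sqsubseteq_{\mathcal{T}} D$ and $C \not\equiv_{\mathcal{T}} D$.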
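To make the model-theoretic definitions concrete, the following sketch evaluates concepts over one small finite interpretation and checks whether that interpretation satisfies a GCI. The encoding of concepts as nested tuples, the function names, and the toy individuals and concept names are all assumptions made for illustration; they are not taken from the ontologies used in the paper.

```python
TOP, BOTTOM = "TOP", "BOTTOM"


def extension(concept, interp):
    """Set of domain elements that belong to `concept` under `interp`.

    Concepts are encoded as: TOP, BOTTOM, a concept name (str),
    ("and", C, D) for conjunction, or ("exists", r, C) for an
    existential restriction.
    """
    if concept == TOP:
        return set(interp["domain"])
    if concept == BOTTOM:
        return set()
    if isinstance(concept, str):
        return set(interp["concepts"].get(concept, ()))
    if concept[0] == "and":
        return extension(concept[1], interp) & extension(concept[2], interp)
    if concept[0] == "exists":
        _, role, filler = concept
        filler_ext = extension(filler, interp)
        return {a for (a, b) in interp["roles"].get(role, ())
                if b in filler_ext}
    raise ValueError(f"unknown concept constructor: {concept!r}")


def satisfies_gci(sub, sup, interp):
    """True iff the interpretation satisfies the inclusion sub ⊑ sup."""
    return extension(sub, interp) <= extension(sup, interp)


# A hypothetical interpretation with three individuals.
toy = {
    "domain": {"ann", "bob", "loan1"},
    "concepts": {"Person": {"ann", "bob"},
                 "LoanApplicant": {"ann"},
                 "Loan": {"loan1"}},
    "roles": {"appliesFor": {("ann", "loan1")}},
}
print(satisfies_gci("LoanApplicant", "Person", toy))
print(satisfies_gci("LoanApplicant", ("exists", "appliesFor", "Loan"), toy))
print(satisfies_gci("Person", "LoanApplicant", toy))
```

Note that subsumption w.r.t. a TBox quantifies over all models, so checking a single interpretation as above can refute a subsumption but cannot establish it; in practice a DL reasoner is used for that.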

Figure 1 shows an ontology excerpt modeling concepts and relations relevant to the loan domain. The precise formalisation of the domain is not crucial at this point; different formalisations may exist, with different levels of granularity. The ontology structures the domain knowledge from the most general concept (e.g., ) to more specific concepts (e.g., , , etc.). The subsumption relation () induces a partial order among the concepts that can be built from a TBox . For instance, the concept is more general than the concept, and it is more specific than the concept.

We will capture the degree of generality (resp. specificity) of a concept in terms of an information content measure that is based on concept refinement. The measure is defined in detail in Section 3 and serves as the basis for the subsequent extension of the Trepan algorithm.

2.3 Concept refinement

The idea behind concept refinement is to make a concept more general or more specific by means of refinement operators. Refinement operators are well-known in Inductive Logic Programming, where they are used to learn concepts from examples [vanderLaag98]. Refinement operators for description logics were introduced in [Lehmann2010], and further developed in [Confalonieri2018, aaai2018]. In this setting, two types of refinement operators exist: specialisation refinement operators and generalisation refinement operators. While the former construct specialisations of hypotheses, the latter construct generalisations.

In this paper we focus on specialisation operators. A specialisation operator takes a concept $C$ as input and returns a set of descriptions that are more specific than $C$ by taking an ontology $\mathcal{T}$ into account. The proposal laid out in this paper can make use of any such operators (see e.g., [Confalonieri2018, aaai2018, ijcai2018]). When a specific refinement operator is needed, as in the examples and in the experiments, we use the following definition of specialisation operator based on the downcover set of a concept $C$.

Definition 2.1.

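Given a TBox $\mathcal{T}$ and a concept description $C$, the specialisation operator $\rho_{\mathcal{T}}$ is defined as follows:

$$\rho_{\mathcal{T}}(C) \subseteq \mathsf{DownCov}_{\mathcal{T}}(C)$$

where $\mathsf{DownCov}_{\mathcal{T}}(C)$ is the set of concepts that are more specific (or less general) than $C$:

$$\mathsf{DownCov}_{\mathcal{T}}(C) := \{ D \in \mathsf{sub}(\mathcal{T}) \mid D \sqsubset_{\mathcal{T}} C \text{ and there is no } D' \in \mathsf{sub}(\mathcal{T}) \text{ such that } D \sqsubset_{\mathcal{T}} D' \sqsubset_{\mathcal{T}} C \}$$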

In the above definition, denotes the union of all the subconcepts in the axioms in , plus . For any given axiom in , the set of its subconcepts is ; also, . Notice that is a finite set.

A concept $C$ is specialised by any of its most general specialisations that belong to $\mathsf{sub}(\mathcal{T})$. Every concept can be specialised into $\bot$ in a finite number of steps.

Definition 2.2.

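The unbounded finite iteration of the refinement operator $\rho_{\mathcal{T}}$ is defined as:

$$\rho^{*}_{\mathcal{T}}(C) = \bigcup_{i \geq 0} \rho^{i}_{\mathcal{T}}(C)$$

where $\rho^{i}_{\mathcal{T}}(C)$ is inductively defined as:

$$\rho^{0}_{\mathcal{T}}(C) = \{C\}, \qquad \rho^{j+1}_{\mathcal{T}}(C) = \rho^{j}_{\mathcal{T}}(C) \cup \bigcup_{C' \in \rho^{j}_{\mathcal{T}}(C)} \rho_{\mathcal{T}}(C'), \quad j \geq 0$$

Thus $\rho^{*}_{\mathcal{T}}(C)$ is the set of subconcepts of $C$ w.r.t. $\mathcal{T}$. We will denote this set by $\mathsf{subConcepts}(C)$. Since $\mathsf{sub}(\mathcal{T})$ is a finite set, the operator $\rho_{\mathcal{T}}$ is finite, and it terminates. For a detailed analysis of properties of refinement operators in DLs we refer to [Lehmann2010, Confalonieri2018].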
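The iteration of a specialisation operator up to its fixpoint can be sketched as follows. The sketch makes strong simplifying assumptions: concepts are atomic names, the direct (told) subconcept relation is given explicitly and stands in for the downcover set, and the operator returns the whole downcover. A real implementation would obtain the downcover from a DL reasoner. The hierarchy and all names below are hypothetical and are not the content of Figure 1.

```python
def down_cover(concept, direct_subconcepts):
    """Most general strict specialisations of `concept` (assumed given)."""
    return set(direct_subconcepts.get(concept, ()))


def rho_star(concept, direct_subconcepts):
    """Iterate the specialisation operator until nothing new is reached."""
    reached = {concept}                      # step 0 contains only C itself
    while True:
        step = reached | {d for c in reached
                          for d in down_cover(c, direct_subconcepts)}
        if step == reached:                  # fixpoint: finite set, so it terminates
            return reached
        reached = step


# Hypothetical toy hierarchy; "BOTTOM" plays the role of the empty concept.
hierarchy = {
    "Entity": ["Person", "Loan"],
    "Person": ["LoanApplicant"],
    "LoanApplicant": ["BOTTOM"],
    "Loan": ["BOTTOM"],
}
print(sorted(rho_star("Person", hierarchy)))
print(len(rho_star("Entity", hierarchy)))
```

Because each step can only add concepts from a finite set, the loop must stop, which mirrors the termination argument given above.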

Example 2.3.

Let us consider the concepts , and defined in the ontology in Figure 1. Then: , ; ; = .

3 Trepan Reloaded

Our aim is to create decision trees that are more understandable for humans by determining which features are more understandable for a user, and assigning priority in the tree generation process according to increased understandability. Our hypothesis, which we will validate in this paper, is that features are more understandable if they are associated to more general concepts present in an ontology.

To measure the degree of semantic generality or specificity of a concept, we consider its information content [Sanchez2011] as typically adopted in computational linguistics [Resnik1995]. There it quantifies the information provided by a concept when appearing in a context. Classical information theoretic approaches compute the information content of a concept as the inverse of its appearance probability in a corpus, so that infrequent terms are considered more informative than frequent ones.

In ontologies, the information content can be computed either extrinsically from the concept occurrences (e.g., [Resnik1995]), or intrinsically, according to the number of subsumed concepts modeled in the ontology. Here, we adopt the latter approach. We use this degree of generality to prioritise features that are more general (thus presenting less information content), as our assumption is that the decision tree becomes more understandable when it uses more general concepts. From a cognitive perspective this appears reasonable, since more general concepts have been found to be easier to understand and learn [Eleanor1976BasicOI], and we test this assumption empirically below.

Definition 3.1.

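Given an ontology $\mathcal{T}$, the information content of a feature $X_i$ is defined as:

$$IC(X_i) := \begin{cases} 1 - \dfrac{\log(|\mathsf{subConcepts}(X_i)|)}{\log(|\mathsf{sub}(\mathcal{T})|)} & \text{if } X_i \in \mathsf{sub}(\mathcal{T}) \\ 0 & \text{otherwise} \end{cases}$$

where $\mathsf{subConcepts}(X_i)$ is the set of specialisations for $X_i$, and $\mathsf{sub}(\mathcal{T})$ is the set of subconcepts that can be built from the axioms in the TBox $\mathcal{T}$ of the ontology (see Section 2.2).

It can readily be seen that the values of $IC$ are smaller for features associated to more general concepts, and larger for those associated to more specific concepts instead.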
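The monotonic behaviour just described can be checked with a short sketch. It assumes that the intrinsic information content takes the form $1 - \log(|\mathsf{subConcepts}(X_i)|) / \log(|\mathsf{sub}(\mathcal{T})|)$ for features in the ontology and $0$ otherwise; the function name and the total of 20 concepts used in the demonstration are hypothetical.

```python
import math


def information_content(n_subconcepts, n_all_concepts):
    """Intrinsic information content of a feature (assumed log-ratio form).

    n_subconcepts: size of the set of specialisations of the concept the
        feature is associated with, or None if the feature is not in the
        ontology.
    n_all_concepts: size of the set of all subconcepts of the TBox (> 1).
    """
    if n_subconcepts is None:
        return 0.0
    return 1.0 - math.log(n_subconcepts) / math.log(n_all_concepts)


# A concept with 12 specialisations versus one with 2, in a hypothetical
# ontology with 20 subconcepts overall.
general = information_content(12, 20)
specific = information_content(2, 20)
print(f"general concept:  IC = {general:.2f}")
print(f"specific concept: IC = {specific:.2f}")
```

Under this assumed form, a concept whose specialisations cover the whole ontology has information content 0, and a concept with a single element in its specialisation set has information content 1.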

Example.

Let us consider the concepts , and defined in the ontology in Figure 1 and the refinements in Example 2.3. The cardinality of is . The cardinality of and is 12 and 2 respectively. Then: , and .

Having a way to compute the information content of a feature $X_i$, we now propose to update the information gain used by Trepan to give preference to features with a lower information content.

Definition 3.2.

The information gain given the information content $IC$ of a feature $X_i$, over a set of examples $S$, is defined as:

$$IG'(X_i, S \mid IC) := \begin{cases} (1 - IC(X_i)) \, IG(X_i, S) & \text{if } 0 < IC(X_i) < 1 \\ 0 & \text{otherwise} \end{cases}$$

where $IG$ is the information gain as usually defined in the decision tree literature.

According to the above equation, $IG'$ of a feature is decreased by a certain proportion that varies depending on its information content, and is set to $0$ either when the feature is not present in the ontology or when its information content is maximal.
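A minimal sketch of this adjustment follows. It assumes the rescaling is multiplicative, i.e. the usual gain times one minus the information content, and that the gain is zeroed whenever the information content is not strictly between 0 and 1; the function name and the numbers are illustrative.

```python
def adjusted_information_gain(gain, information_content):
    """Information gain rescaled by the feature's information content.

    Assumed form: (1 - IC) * IG when 0 < IC < 1, and 0 otherwise
    (feature absent from the ontology, or maximal information content).
    """
    if 0.0 < information_content < 1.0:
        return (1.0 - information_content) * gain
    return 0.0


# Two candidate split features with the same raw gain: the one attached
# to the more general concept (lower IC) keeps more of its gain.
raw_gain = 0.8
print(adjusted_information_gain(raw_gain, 0.25))
print(adjusted_information_gain(raw_gain, 0.75))
print(adjusted_information_gain(raw_gain, 1.0))
```

With equal raw gain, the split selection therefore favours the feature associated with the more general concept.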

Our assumption that using features associated with more general concepts in the creation of split nodes can enhance the understandability of the tree, is based on users being more familiar with more general concepts rather than more specialised ones. To validate this hypothesis we ran a survey-based online study with human participants. Before proceeding to the details of the study and the results, as a prerequisite we introduce two measures for the understandability of a decision tree—an objective, syntax-based and a subjective, performance-based one—in the following section.

4 Understandability of Decision Trees

Understandability depends not only on the characteristics of the tree itself, but also on the cognitive load experienced by users in using the decision model to classify instances, and in understanding the features in the model itself. However, for practical processing, understandability of decision trees needs to be approximated by an objective measure. We compare here two characterisations of the understandability of decision trees, approaching the topic from these two different perspectives:

  • Understandability based on the syntactic complexity of a decision tree.

  • Understandability based on users’ performances, reflecting the cognitive load in carrying out tasks using a decision tree.

On the one hand, it is desirable to provide a technical characterisation of understandability that can give a certain control over the process of generating explanations. For instance, in Trepan, experts might want to stop the extraction of decision trees that do not exceed a given tree size limit and have a stable accuracy/fidelity, but whose syntactic complexity keeps increasing.

Previous work attempting to measure the understandability of symbolic decision models (e.g., [Huysmans2011]), and decision trees in particular [Piltaver2016], proposed syntactic complexity measures based on the tree structure. The syntactic complexity of a decision tree can be measured, for instance, by counting the number of internal nodes in the tree or leaves, the number of symbols used in the splits (relevant especially for m-of-n splits), or the number of branches that decision nodes have.

For the sake of simplicity, we focus on the combination of two syntactic measures: the number of leaves $n$ in a decision tree, and the number of branches $b$ on the paths from the root of the tree to all the leaves in the decision tree. Based on the results in [Piltaver2016], we define the syntactic complexity of a decision tree as:

$$\alpha \frac{n}{k^2} + (1 - \alpha) \frac{b}{k} \tag{1}$$

with $\alpha$ being a tuning factor that adjusts the weight of $n$ and $b$, and $k$ being the coefficient of the linear regression built using the results in [Piltaver2016].
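The measure can be computed directly from a tree structure, as in the sketch below. Several things are assumptions here: the complexity is taken to be $\alpha \, n / k^2 + (1 - \alpha) \, b / k$, the number of branches $b$ is interpreted as the sum of the root-to-leaf path lengths, and the default values `alpha=0.5` and `k=5.0` are placeholders rather than the values used in the experiments. The nested-dict tree encoding is likewise hypothetical.

```python
def leaves_and_branches(tree, depth=0):
    """Return (number of leaves, summed root-to-leaf path lengths)."""
    children = tree.get("children", [])
    if not children:
        return 1, depth
    n_leaves = n_branches = 0
    for child in children:
        child_leaves, child_branches = leaves_and_branches(child, depth + 1)
        n_leaves += child_leaves
        n_branches += child_branches
    return n_leaves, n_branches


def syntactic_complexity(n_leaves, n_branches, alpha=0.5, k=5.0):
    """Weighted combination of leaf count and branch count (assumed form)."""
    return alpha * n_leaves / k ** 2 + (1.0 - alpha) * n_branches / k


# A toy tree: the root has one leaf child and one internal child with two leaves.
toy_tree = {"children": [{}, {"children": [{}, {}]}]}
n, b = leaves_and_branches(toy_tree)
print(n, b, syntactic_complexity(n, b))
```

Setting `alpha` closer to 1 makes the measure depend mostly on the number of leaves, while values closer to 0 emphasise the depth of the paths.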

On the other hand, the syntactic complexity of decision trees does not necessarily capture the ease with which actual people can use the resulting trees. A direct measure of user understandability is how accurately a user can employ a given decision tree to perform a decision. An often more precise measure of cognitive difficulty in mental processing is the reaction time (RT) or response latency [Donders1969OnTS]. RT is a standard measure used by cognitive psychologists and has even become a staple measure of complexity in the domain of design and user interfaces [principles03]. In the following section we describe an experiment measuring the cost of processing in terms of accuracy, and RT (among other variables) for different types of decision trees.

An additional factor that has to be taken into account is the tree size. It seems very likely that trees of different sizes, irrespective of any actual complexity, present more difficulties for human understanding that are not necessarily linearly related to the increase in tree size. Therefore, properly understanding the effects on actual understandability requires explicitly controlling the tree sizes. For our experiments, we define three categories of tree sizes based on the number of internal nodes: small (the number of internal nodes is between and ), medium (the number of internal nodes is between and ), and large (the number of internal nodes is between and ).

5 Experimental Evaluation

5.1 Methods

Figure 4: Decision trees of size ‘small’ in the loan domain, extracted without (left) and with (right) a domain ontology. As can be seen, the features used in the conditions of the split nodes differ.


We used datasets from two different domains to evaluate our approach: finance and medicine. We used the Cleveland Heart Disease Data Set from the UCI archive, and a loan dataset from Kaggle. For each of them, we developed an ontology defining the main concepts and relevant relations (the heart ontology contained 29 classes and 66 logical axioms, and the loan ontology 28 classes and 65 logical axioms). To extract decision trees using the Trepan and Trepan Reloaded algorithms, we trained two artificial neural networks implemented in PyTorch. The neural networks we use in our experiments have a single layer of hidden units. The number of hidden units used for each network is chosen using cross-validation on the network’s training set, and we use a validation set to decide when to stop training networks. The accuracy of the trained neural networks was 85.98% and 94.65% for the loan and heart dataset respectively (see Table 3). In total, for each of the neural networks, we constructed six decision trees, varying their size (measured in number of nodes; i.e., small, medium, large), and whether or not an ontology had been used in generating them. In this manner, we obtained a total of twelve distinct decision trees (2 domains × 3 sizes × 2 ontology presence values). Figure 4 shows two examples of distilled decision trees. The (avg.) fidelity of the extracted trees was 92.73% (Trepan) and 92.63% (Trepan Reloaded) for the loan dataset, and 89.23% (Trepan) and 88.17% (Trepan Reloaded) for the heart dataset (see also Table 3). Notice that since the trees are post-hoc explanations of the artificial neural network, the fidelities of the distilled trees, rather than their accuracies, are the crucial measure.


The experiment used two online questionnaires on the usage of decision trees. The questionnaires contained an introductory and an experimental phase.

In the introductory phase, subjects were shown a short video about decision trees, and how they are used for classification. In this phase, participants were asked to provide information on their age, gender, education, and on their familiarity with decision trees.

The experiment phase was subdivided into two tasks: classification, and inspection. Each task starts with an instruction page describing the task to be performed. In these tasks the participants were presented with the six trees corresponding to one of the two domains. In the classification task, subjects were asked to use a decision tree to assign one of two classes to a given case whose features are reported in a table (e.g., Will the bank grant a loan to a male person, with 2 children, and a yearly income greater than €50.000,00?). In the inspection task, participants had to decide on the truth value of a particular statement (e.g., You are a male; your level of education affects your eligibility for a loan.). The main difference between the two types of questions used in the two tasks is that the former provides all details necessary for performing the decision, whereas the latter only specifies whether a subset of the features influence the decision. In these two tasks, for each tree, we recorded:

  • Correctness of the response.

  • Confidence in the response, as provided on a scale from 1 to 5 (‘Totally not confident’=1, …, ‘Very confident’=5).

  • Response time measured from the moment the tree was presented.

  • Perceived tree understandability as provided on a scale from 1 to 5 (‘Very difficult to understand’=1, …, ‘Very easily understandable’=5).


63 participants (46 females, 17 males) volunteered to take part in the experiment via an online survey. (Footnote 4: The participants were recruited among friends and acquaintances of the authors.) Of these, 34 were exposed to trees from the finance domain, and 29 to those in the medical domain. The average age of the participants was 33 (± 12.23) years (range: 19–67). In terms of educational level, the highest qualification was a Ph.D. for 28 of them, a Master's degree for 9, a Bachelor's degree for 12, and a high school diploma for 14. 47 of the respondents reported some familiarity with the notion of decision trees, while 16 reported no such familiarity.

5.2 Results

We fitted a mixed-effects logistic regression model [BAAYEN2008390] predicting the correctness of the responses in the classification and inspection tasks. The fixed-effect predictors were the syntactic complexity of the tree, the presence or absence of an ontology in the tree generation, the task identity (classification vs. inspection), and the domain (financial vs. medical), together with all possible interactions between them. The identity of the participant was included as a random effect.

Figure 7: Estimated main effects of ontology presence on accuracies (top) and time of response (bottom).

A backwards elimination of factors revealed significant main effects of the task identity, indicating that responses were more accurate in the classification task than they were in the inspection task (), of the syntactic complexity (), by which more complex trees produced less accurate responses, and of the presence of the ontology (), indicating that trees generated using the ontology indeed produced more accurate responses (Figure 7a). We did not observe any significant interactions or effect of the domain identity.

We analysed the response times (on the correct responses) using a linear mixed-effect regression model [BAAYEN2008390], with the log response time as the dependent variable. As before, we included as possible fixed effects the task identity (classification vs inspection), the domain (medical vs financial), the syntactic complexity of the tree, and the presence or absence of ontology in the trees’ generation, as well as all possible interactions between them. In addition, we also included the identity of the participant as a random effect. A step-wise elimination of factors revealed main effects of task identity (), syntactic complexity (), ontology presence (), as well as significant interactions between task identity and syntactic complexity (), and task identity and domain (). In line with what we observed in the accuracy analysis, we find that those trees that were generated using an ontology were processed faster than those that were generated without one (see Figure 7b).

We analysed the user confidence ratings using a linear mixed-effect regression model, with the confidence rating as the dependent variable. We included as possible fixed effects the task identity (classification vs inspection), the domain (medical vs financial), the size of the tree, and the presence or absence of ontology in the trees’ generation, as well as all possible interactions between them. In addition, we also included the identity of the participant as a random effect. A stepwise elimination of factors revealed a main effect of ontology presence (), as well as significant interactions between task identity and syntactic complexity (), and task identity and domain (). These results are almost identical to what was observed in the response time analysis: users show more confidence in judgments performed on trees that involved an ontology, the effect of syntactic complexity is most marked in the inspection task, and the difference between domains only affects the classification task.

Finally, we also analysed the user understandability ratings using a linear mixed-effect regression model, with the understandability rating as the dependent variable. We included as possible fixed effects the task identity (classification vs inspection), the domain (medical vs financial), the syntactic complexity of the tree, and the presence or absence of ontology in the trees’ generation, as well as all possible interactions between them, and a random effect of the identity of the participant. A stepwise elimination of factors revealed significant main effects of task (), syntactic complexity (), and of the presence of an ontology (). These results are in all relevant aspects almost identical to what was observed in the accuracy analysis: the inspection task is harder, more syntactically complex trees are less understandable than less complex ones, and trees originating from an ontology are perceived as more understandable.

6 Discussion

Our hypothesis was that the use of ontologies to select features for conditions in split nodes, as described above, leads to decision trees that are easier to understand. This ease of understanding was measured theoretically using a syntactic complexity measure, and cognitively through time and accuracy of responses as well as reported user confidence and understandability.

First of all, the syntactic complexity (Eq. 1) of the trees distilled with Trepan Reloaded is slightly smaller than that of the trees generated with Trepan (see Table 1). Such a small reduction in syntactic complexity might or might not reflect differences in the actual understandability of the distilled trees by people. However, in our experiments, all online implicit measures (accuracy and response time), and off-line explicit measures (user confidence and understandability ratings) indicate that trees generated using an ontology are understood significantly more accurately and more easily by people than trees generated without an ontology. The analyses of the four measures are remarkably consistent in this crucial aspect (see Table 2).

         C4.5    Trepan    Trepan Reloaded
heart    5.64    3.56      3.46
loan     5.9     2.89      2.63
Table 1: Syntactic complexity (Eq. 1) for trees inferred using C4.5, and distilled using Trepan, and Trepan Reloaded respectively.
Task      Measure        Trepan           Trepan Reloaded
Class.    %C. Answers    0.87 (0.32)      0.94 (0.18)
          Time (sec)     43.25 (61.16)    24.29 (15.67)
          Confidence     4.38 (0.86)      4.56 (0.80)
          Understd.      4.06 (0.97)      4.50 (0.48)
Insp.     %C. Answers    0.78 (0.41)      0.90 (0.27)
          Time (sec)     35.90 (24.80)    26.55 (35.74)
          Confidence     4.10 (0.96)      4.36 (0.78)
          Understd.      3.83 (1.01)      4.20 (0.87)
Table 2: Mean values of correct answers, time of response, user confidence, and user understandability for trees distilled using Trepan and Trepan Reloaded in the classification (Class.) and inspection (Insp.) tasks (standard deviations are reported in parentheses). The difference in results is statistically significant w.r.t. Mann-Whitney and Wilcoxon tests for all measures.
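| Dataset | Accuracy: C4.5 | Accuracy: NN | Accuracy: Trepan | Accuracy: Trepan Rld. | Fidelity: Trepan | Fidelity: Trepan Rld. |
|---|---|---|---|---|---|---|
| heart | 81.97% | 94.65% | 82.43% | 80.87% | 89.23% | 88.17% |
| loan | 80.48% | 85.98% | 86.03% | 82.80% | 92.73% | 92.63% |

Table 3: Test-set accuracies and fidelities for trees distilled using Trepan and Trepan Reloaded.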

As we anticipated, coercing the outputs of Trepan onto a pre-determined ontology (as in Trepan Reloaded) impacts the fidelity (and accuracy) of the resulting trees (see Table 3). Crucially, however, the very small compromise in the fidelity of the neural network reconstruction (a drop of about one percentage point on the heart dataset and about 0.1 percentage points on the loan dataset) is more than compensated for by the substantial improvement in the ease with which actual people can understand the resulting trees. When the goal is providing model explanations that are actually understandable by people, such a small compromise in fidelity is well worth it. Notice that, if we were not willing to compromise on fidelity at all, it would not make any sense to deviate in any amount from the original neural network’s performance (i.e., any fidelity below 100% would be unacceptable). In such a case, however, one would retain the lack of user understandability of the models.

At this point, one might wonder why we should bother to create surrogate decision trees from black-box models, rather than inferring them directly from data. As already noticed in the original Trepan work [Craven1995], distilling trees from networks can actually result in better trees than those one would obtain by building the decision trees directly. To demonstrate this point, we also trained decision trees directly from the datasets using the classical C4.5 algorithm. Table 3 shows that the trees inferred by the Trepan variants are as accurate as, if not more accurate than, those inferred directly. Moreover, the trees built directly had syntactic complexities roughly 1.6 to 2.2 times those of the trees distilled using either Trepan variant (see Table 1). This indicates that constructing trees directly from the data results in trees that are substantially more complex than those distilled by the Trepan variants, yet do not outperform them in the task.

The compromise in the accuracy of the decision trees is similarly small (see Table 3). As we discussed above, in this approach, the accuracy of the resulting trees (i.e., their ability to predict the labels of the test sets) is less relevant than their fidelity (i.e., their ability to replicate the behaviour of the model we intend to explain). Nevertheless, our Trepan Reloaded method improves the understandability of the trees w.r.t. the original Trepan, while compromising little on accuracy.
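
The distinction between the two measures can be made concrete with a minimal sketch, assuming both are computed as simple agreement rates over a test set. The function names and the toy labels below are hypothetical and are not data from the study.

```python
from typing import Sequence


def agreement(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of positions at which the two label sequences agree."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    if len(predicted) == 0:
        raise ValueError("label sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


def accuracy(tree_labels: Sequence[int], true_labels: Sequence[int]) -> float:
    """Agreement between the tree's predictions and the ground-truth labels."""
    return agreement(tree_labels, true_labels)


def fidelity(tree_labels: Sequence[int], network_labels: Sequence[int]) -> float:
    """Agreement between the tree's predictions and the black-box model's predictions."""
    return agreement(tree_labels, network_labels)


# Toy labels, made up for illustration only (not data from the study).
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
network_labels = [1, 0, 1, 0, 0, 0, 1, 1]
tree_labels = [1, 0, 1, 0, 0, 1, 1, 1]

print("accuracy:", accuracy(tree_labels, true_labels))
print("fidelity:", fidelity(tree_labels, network_labels))
```

In this toy case the tree agrees with the network more often than with the ground truth, which is the situation in which a surrogate is a faithful explanation of the model even where the model itself is wrong.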

Apart from improving the understandability of (distilled) decision trees, ontologies also pave the way towards changing the level of abstraction of explanations to match different user profiles or audiences. For instance, the level of technicality used in an explanation for a medical doctor should not be the same as that used for lay users. One wants to adapt explanations without changing the underlying explanation procedure. Ontologies are amenable to automated abstractions to improve understandability [Keet07]. The idea of concept refinement adopted here can be extended to change the definitions of concepts, making them more general or more specific by means of refinement operators [Confalonieri2018, aaai2018, ijcai2018]. We consider this line of work a natural continuation of the current study.

In its current form, Trepan Reloaded requires a predefined ontology onto which the features used by our algorithm should be mapped. Where such an ontology is available, which is common in many domains (e.g., medical, pharmaceutical, legal, biological, etc.), one can directly apply Trepan Reloaded to improve the quality of the explanations. Additional work, beyond the scope of the current study, would be to automatically construct the most appropriate ontology to be mapped onto. Such a process could be achieved by automatically mapping sets of features into pre-existing general domain ontologies (e.g., MS Concept Graph [MSGraph12], DBpedia [dbpedia-swj]). The provision of some form of explicit knowledge, rather than being particular to our method, resides at the core of any attempt at human-interpretable explanations. Whether such knowledge is in the form of a domain-specific ontology (as in this study), or a domain-general one to be adapted ad hoc, will depend on the particulars of specific applications.

7 Related Works

Most approaches to interpretable machine learning focus either on building directly interpretable models, or on reconstructing post-hoc local explanations. Our approach belongs to the category of post-hoc global explanation methods.

In this latter category, there are a few approaches that closely relate to ours. For instance, the work in [Renard2019ConceptTree] uses concepts to group features (either using expert knowledge or correlations), and embeds them into surrogate models in order to constrain their training. In particular, the authors propose a revised version of Trepan that considers only features belonging to concepts in the extraction of a decision tree. Whilst their results show that surrogate trees preserve accuracy and fidelity compared with original versions, the improvement in human-readability is not explicitly tested with users. The approach in [Dhurandhar2019ibm] uses a complex neural network model to improve the accuracy of a simpler model, e.g., a simpler neural network, or a decision tree. This approach assumes white-box access to (some of) the layers of the complex network model, whereas in our approach we treat the black-box as an oracle. The authors in [BastaniKB17] describe a method for extracting decision tree explanations that actively samples new training points to avoid overfitting. Our approach is similar since Trepan also uses newly sampled data during the extraction of the decision tree.
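
The oracle view and the use of sampled data can be illustrated with a minimal sketch. It is not the Trepan implementation: it assumes that synthetic instances are drawn feature by feature from the empirical values in the training data, it omits any restriction of the samples to the region covered by a given tree node, and all names (including the toy oracle standing in for a trained network) are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Instance = List[float]


def sample_from_marginals(
    data: Sequence[Instance], n: int, rng: random.Random
) -> List[Instance]:
    """Draw n synthetic instances, sampling each feature independently
    from its observed values in `data` (a simplifying assumption)."""
    n_features = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_features)]
    return [[rng.choice(column) for column in columns] for _ in range(n)]


def label_with_oracle(
    instances: Sequence[Instance], oracle: Callable[[Instance], int]
) -> List[Tuple[Instance, int]]:
    """Query the black-box model for a label; only its outputs are used."""
    return [(x, oracle(x)) for x in instances]


def toy_oracle(x: Instance) -> int:
    """Hypothetical stand-in for a trained network."""
    return int(x[0] + x[1] > 1.0)


rng = random.Random(0)
training_data = [[0.2, 0.1], [0.9, 0.8], [0.4, 0.7], [0.6, 0.3]]
synthetic = sample_from_marginals(training_data, 5, rng)
labelled = label_with_oracle(synthetic, toy_oracle)

for instance, label in labelled:
    print(instance, "->", label)
```

Because the surrogate only ever sees the oracle's answers, the same procedure applies to any model that can be queried for predictions, without access to its internal layers.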

Other works focus on building terminological decision trees from and using ontologies (e.g., [RIZZO20171, ZhangSH02]). These approaches perform a classification task while building a tree, rather than building a decision tree from a classification process computed by a black-box.

8 Conclusion and Future Works

In this paper, we proposed an extension of Trepan, an algorithm that extracts global post-hoc explanations of black-box models in the form of decision trees. Our algorithm, Trepan Reloaded, takes into account ontologies in the distillation of decision trees.

We showed that the use of ontologies eases the understanding of the distilled trees by actual users. We measured this ease of understanding through a rigorous experimental evaluation: theoretically, using a syntactic complexity measure, and, cognitively, through time and accuracy of responses as well as reported user confidence and understandability. All our measures indicated that people understand trees distilled by Trepan Reloaded significantly more accurately and more easily than trees generated by Trepan, with only a small compromise in accuracy and fidelity (see Section 6).

The results obtained are very promising, and they open several directions for future research. On the one hand, we plan to extend this work to support the automatic generation of explanations that can accommodate different user profiles. On the other hand, we aim to investigate applying our approach to explain CNNs in image classification (e.g., [Zhang2019CNNdtrees]). We also believe that this approach can be useful in bias identification, to understand, for instance, whether any undesirable discriminatory features are affecting a black-box model.

