Generating Plausible Counterfactual Explanations for Deep Transformers in Financial Text Classification


Corporate mergers and acquisitions (M&A) account for billions of dollars of investment globally every year, and offer an interesting and challenging domain for artificial intelligence. However, in such highly sensitive domains, it is crucial not only to have a robust and accurate model, but also to be able to generate useful explanations that garner a user’s trust in the automated system. Regrettably, eXplainable AI (XAI) for financial text classification has received little to no attention, and many current methods for generating text-based explanations produce highly implausible explanations, which damage a user’s trust in the system. To address these issues, this paper proposes a novel methodology for producing plausible counterfactual explanations, whilst exploring the regularization benefits of adversarial training on language models in the domain of FinTech. Exhaustive quantitative experiments demonstrate that this approach not only improves model accuracy when compared to the current state-of-the-art and human performance, but also generates counterfactual explanations which are significantly more plausible based on human trials.


1 Introduction and Related Work

In recent years, large-scale, pre-trained transformer models have led to massive improvements on a wide range of natural language processing (NLP) tasks [6, 21], including financial technology applications [7, 35, 32, 34]. However, this impressive ability coincides with an inherent lack of robustness and transparency, which undermines human trust in the prediction outcome. In the highly sensitive (and financially lucrative) area of FinTech, explainable financial text classification remains an open and highly alluring question. To tackle this problem, this paper advances a novel approach which first applies robust transformer models (by leveraging adversarial training) to a real-world, up-to-date, self-collected mergers and acquisitions (M&A) dataset, and then generates plausible, post-hoc, counterfactual explanations. In the remainder of this section, we describe relevant work in both of these areas before detailing our contributions.

1.1 Artificial Intelligence in Mergers and Acquisitions

M&As have reshaped the global business landscape for generations, and are having an accelerating impact on the world’s economy as new technologies such as the internet, big data, and artificial intelligence disrupt many business sectors [33]. To appreciate this, consider a recent economic study which provided strong evidence that M&A deal rumours can influence the share price volatility of rumour target firms [22]. In particular, the study showed that, on average, M&A rumours have a positive short-term impact and a negative long-term impact on the cumulative abnormal returns of the potential acquirers and targets. The existing AI literature typically focuses on predicting likely M&A acquirers and targets [33], and on forecasting the likely success of M&A [5] in order to develop high-risk/high-reward investment strategies based on M&A speculation [12]. In this work we address a distinct but related task: namely, whether a merger and acquisition rumour is likely to prove correct.

1.2 Visualization-based Explanations

To interpret a model’s prediction, prior efforts have focused on either incorporating pre-hoc analysis into the experimental design [2], or developing post-hoc analysis algorithms to select or modify particular instances of the dataset to explain the behavior of models [15, 16]. Recent research [11] shows that transformer models cannot be perfectly explained from their intrinsic architecture, and a further work [2] provides strong evidence that self-attention distributions are not directly interpretable. For this reason, model-agnostic, post-hoc explanation methods have come to the fore for explaining text classification models, as they are easy to understand and do not require access to the data or the model [15].

Towards post-hoc explanation in NLP tasks, [25] proposes a popular method named contextual decomposition (CD), which quantifies the importance of each individual word/phrase by computing the change in the model prediction when solely that word/phrase is removed. Its hierarchical extensions [27, 13] continue to refine the explanation algorithms that calculate and further visualize the importance of individual phrases. However, although these visualization-based methods [25, 27, 13] have achieved good results on a popular sentiment analysis dataset (namely the Stanford Sentiment Treebank-2 [SST-2] dataset, where humans create the ground truth with their subjective judgement), how to generate explanations in more complex scenarios where human performance is worse than that of a model has not been well studied. As a result, the prior lines of visualization-based work cannot provide humans with a clear boundary between positive and negative instances, whereas counterfactuals can provide “human-like” logic by showing a modification to the input that makes a difference to the output classification [3]. Hence, post-hoc, example-based explanation methods have received more and more attention in recent years [15].

1.3 Counterfactual Text Explanations

Counterfactual explanations are renowned for their explanatory ability in AI systems [30]; specifically, they offer the ability to explain models (such as transformers) without having to “open the black-box” [10], by conveying causal information about what contributed to a given classification. To understand counterfactuals in the context of text classification, consider a sentiment classification task where a black-box model may classify “John loved the film” with a positive sentiment, and explain the prediction counterfactually by presenting “John hated the film”. Glossed, this latter text is the AI explaining the prediction by saying “If the word love was replaced with the word hate, I would have thought it was a negative sentiment”. This allows us to understand the main reasoning process behind the classifier in question, thus explaining the prediction causally. To understand the issue of counterfactual plausibility, consider that the previous explanation method may also generate a counterfactual which reads “John not the film”. This text may “flip” the classification to the counterfactual class, but it is grammatically implausible, and (arguably) very difficult to contextualize. This is important because humans avoid creating counterfactuals which are far from a “possible world” [30] and, by extension, wildly implausible [3, 17]. In response to this, our work attempts to guarantee more grammatically plausible explanations, does not rely on attention weights, and is not constrained to a specific text domain.

Contributions and Paper Outline

  • We present a novel dataset for the interesting and challenging problem of artificial intelligence in M&A prediction.

  • To the best of our knowledge, the present work is the first general approach to generate grammatically plausible counterfactual explanations for unstructured text classification.

  • The primary technical contribution of this work is to generate grammatically plausible counterfactuals by replacing the most important words with their antonyms (REP-SCD) based on pre-trained language models. Furthermore, two additional variants (removing/inserting words at the most important position, namely RM-SCD and INS-SCD) are proposed to guarantee counterfactual generation, albeit with counterfactuals which are less plausible.

The remainder of this paper is organized as follows. Section 2 details our novel dataset and the pre-processing steps involved. Section 3 describes our adversarial training approach, together with the sensitivity-based method for counterfactual explanation generation. Exhaustive experiments (both quantitative and human-based) show clear improvements of our method over the current state-of-the-art, both with regard to classification accuracy and explanation quality (see Sections 4 and 5). Finally, the implications of this work for XAI and future research are discussed.

2 The Novel Mergers and Acquisitions Dataset

Description Number
#Processed deal news total (2007-2019) 4,098
#Train (2007-2014) 3,120
#Validation (2015 - 2016) 478
#Test (2017 - Aug 2019) 500
#Unique companies and institutions 1,406
Table 1: The description of our dataset

For this study we adopted a large-scale, up-to-date M&A dataset collected from Zephyr, a comprehensive database of deal data from the “real world”. The dataset contains 14,539 news articles or tweets on M&A events between January 1st 2007 and August 12th 2019. Each instance corresponds to a specific editorial M&A article which describes a possible deal between an acquirer and a target company (also including a few IPO rumours). Additionally, each datapoint includes the deal outcome (see below) and the deal announcement data, if relevant. In this work, the deal outcome corresponds to the target class, and the raw dataset contains the following outcome types: complete: a deal between the acquirer and target companies concluded successfully; rumour: no deal materialized between the acquirer and target company; pending: a desired deal between the acquirer and target company has been confirmed, and at the time of data collection was deemed to be in progress, but not yet complete; cancelled: a past potential deal between the acquirer and target companies has been confirmed, but it did not complete, and is no longer being pursued.

In order to prepare the raw dataset for use in this study, a number of pre-processing steps were carried out:

  1. In this work we chose to focus on a binary classification task and, as such, removed instances with outcome types of cancelled and pending, leaving only those instances that correspond to completed deals (the positive class) and rumours (the negative class).

  2. We eliminated instances where both acquiring and target companies were non-US, due to a tendency towards low-quality data; in other words, all of the instances in our dataset include a US Listed Company as either the acquirer or the target or both.

  3. Articles published within one day of, or after, the deal announcement date were also removed, because our interest is in developing a prediction model that is capable of generating accurate predictions at least one day in advance of any deal outcome.

  4. Finally, the remaining instances are randomly over-sampled to ensure an even split between positive (completed) and negative (rumours) instances for each year.

The result is a dataset of 4,098 instances (news articles and meta-data) which we split into training, validation, and testing sets on a year-by-year basis (see Table 1).
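The four pre-processing steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used to build the dataset: the record layout (the keys `outcome`, `acquirer_us`, `target_us`, `published`, `announced`) and the function name `preprocess` are hypothetical, and the one-day rule is read as "keep articles published at least one full day before the announcement".

```python
import random
from collections import defaultdict
from datetime import timedelta


def preprocess(records, seed=0):
    """Filter raw deal records and balance the two classes within each year.

    Each record is assumed (hypothetically) to be a dict with the keys
    'outcome', 'acquirer_us', 'target_us', 'published' and 'announced'.
    """
    kept = []
    for r in records:
        # Step 1: keep only completed deals (positive) and rumours (negative).
        if r["outcome"] not in ("complete", "rumour"):
            continue
        # Step 2: drop instances where neither party is a US listed company.
        if not (r["acquirer_us"] or r["target_us"]):
            continue
        # Step 3: drop articles published less than one day before the
        # announcement (or after it); records without a date are kept.
        announced = r.get("announced")
        if announced is not None and announced - r["published"] < timedelta(days=1):
            continue
        kept.append(r)

    # Step 4: random over-sampling of the minority class, year by year.
    by_year = defaultdict(lambda: {"complete": [], "rumour": []})
    for r in kept:
        by_year[r["published"].year][r["outcome"]].append(r)

    rng = random.Random(seed)
    balanced = []
    for year in sorted(by_year):
        groups = by_year[year]
        target = max(len(g) for g in groups.values())
        for g in groups.values():
            # An empty class cannot be over-sampled, so it is left empty.
            extra = [rng.choice(g) for _ in range(target - len(g))] if g else []
            balanced.extend(g + extra)
    return balanced
```

Over-sampling is done per year here so that a later year-by-year split into training, validation, and test sets stays balanced within each split.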

3 Methodology

The pipeline of our method is shown in Fig. 1. First, as a prerequisite, a transformer variant is fine-tuned on the M&A prediction task, alongside adversarial training (which as we shall see is shown to be promising in this domain). Second, important words in the test instances are identified using a sampled contextual decomposition technique after the prediction. Third, a counterfactual explanation is generated by replacing these words with grammatically plausible substitutes. As we shall see, although this method does not always guarantee a plausible counterfactual will be found, we propose two alternative methods which will, albeit with the possible trade-off of plausibility. These steps are detailed next.

3.1 Step 1: Robust Transformer Classification Models

As alluded to earlier, M&A prediction is a highly sensitive domain and, despite adversarial training having shown promise previously [9, 28], it has never been tested in this domain. Hence, to try to ensure a robust model which can simultaneously generate intelligible explanations, we explore its usage here compared to other popular approaches. Given a news article, we adopt the classical transformer architecture proposed by [29]. The original multi-head self-attention is applied to the $i$-th document $D_i$, and is calculated as follows:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O}, \qquad \mathrm{head}_j = \mathrm{Attention}(Q W_j^{Q}, K W_j^{K}, V W_j^{V})$$

where $W_j^{Q}$, $W_j^{K}$, and $W_j^{V}$ are weight matrices, and the attention is computed as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$

for input query, key and value matrices $Q$, $K$, and $V$, where $d_k$ is the key dimension. The outputs from the attention calculations are concatenated and transformed using an output weight matrix $W^{O}$.
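As a small illustration of the attention computation, the following pure-Python sketch implements single-head scaled dot-product attention as defined in [29]. It is a toy version under simplifying assumptions: matrices are plain lists of row vectors, the learned projection matrices and the multi-head concatenation are omitted, and the function names are ours.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Scaled dot product between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

A query that is equally similar to all keys simply receives the mean of the value vectors, which is a quick sanity check for the implementation.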

Figure 1: The pipeline of our methods, namely REP-SCD, RM-SCD, and INS-SCD. We show real examples of generating diverse counterfactual instances that flip the prediction result from completed to rumour. The original input has been changed by iteratively modifying words in order of their importance until the prediction matches the counterfactual class. The outputs (logits) of the predictions are represented in green, and orange points, respectively.

Additionally, the adversarial noise, treated as a form of regularization, is generated by the Fast Gradient Method (FGM) [24] and Projected Gradient Descent (PGD) [23]. The idea of using adversarial perturbation is derived from the usage of adversarial attacks [4] to evaluate the robustness of neural networks, while recent advances in adversarial training for NLP models [20] inspire us to use it as a form of regularization. For each embedded word $w$ in the $i$-th news article $D_i$, the FGM computes its perturbation as follows:

$$r_{adv} = \epsilon \, \frac{g}{\lVert g \rVert_2}, \qquad g = \nabla_{w} L(w; \hat{\theta})$$

where $r_{adv}$ is the perturbation of $w$, $\epsilon$ bounds the norm of the perturbation, $\hat{\theta}$ denotes the current values of the parameters of the classifier, and $L$ denotes the loss function (cross entropy) associated with the classifier. The perturbation can be easily computed using back-propagation. The projected gradient descent, which can be considered as a multi-step variant of the FGM, computes the perturbation of $w$ iteratively:

$$r_{t+1} = \Pi_{S}\!\left(r_t + \alpha \, \frac{g(r_t)}{\lVert g(r_t) \rVert_2}\right), \qquad g(r_t) = \nabla_{r} L(w + r_t; \hat{\theta})$$

where $S = \{r : \lVert r \rVert_2 \le \epsilon\}$ is the constraint space of the perturbation, $\Pi_{S}$ denotes the projection of a vector onto the feasible set $S$, and $\alpha$ is the step size. We use the Adam optimizer with learning rate decay to train our model until convergence.
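The two perturbation schemes can be sketched as follows. This is a minimal illustration under stated assumptions rather than the training code: embeddings and gradients are plain Python lists, the constraint space is assumed to be an L2 ball of radius `epsilon`, and `grad_fn` is a hypothetical callback standing in for back-propagation through the classifier.

```python
import math


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


def fgm_perturbation(grad, epsilon):
    """Single-step FGM: scale the normalised gradient to length epsilon."""
    norm = l2_norm(grad)
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


def project_l2_ball(r, epsilon):
    """Project r onto the set {r : ||r||_2 <= epsilon}."""
    norm = l2_norm(r)
    if norm <= epsilon:
        return list(r)
    return [epsilon * x / norm for x in r]


def pgd_perturbation(grad_fn, dim, epsilon, alpha, steps):
    """Multi-step PGD starting from a zero perturbation.

    grad_fn(r) is assumed to return the loss gradient with respect to the
    embedding when the current perturbation r has been added to it.
    """
    r = [0.0] * dim
    for _ in range(steps):
        g = grad_fn(r)
        norm = l2_norm(g)
        if norm == 0.0:
            break
        # Ascent step of size alpha along the normalised gradient ...
        r = [ri + alpha * gi / norm for ri, gi in zip(r, g)]
        # ... followed by projection back onto the feasible set.
        r = project_l2_ball(r, epsilon)
    return r
```

In adversarial training the resulting perturbation would be added to the word embedding before a second forward and backward pass, so that the loss on the perturbed input acts as a regularizer.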

Input: Testing document example (a sequence of words), the corresponding ground truth label Y, pre-trained Masked Language Model MLM, negative pronouns list NP, fine-tuned transformer classifier C.
Output: Positive Word Dictionaries POS, Negative Word Dictionaries NEG, plausible counterfactual example(s).

for each word in the document do
     if the previous word is in NP then
         Create the whole phrase by contextual decomposition
         Compute the importance score via Eq.(7)
     else
         Compute the importance score via Eq.(7)
     end if
end for
Create the dictionaries POS and NEG with words, alongside the word positions sorted in descending order of their importance scores.
for each word position in the sorted positions do
     Mask the word at this position and obtain the MLM outputs for it
     if Y == POS then
         Set the replacement and insertion candidates to the intersections of the MLM outputs and NEG
     else
         Set the replacement and insertion candidates to the intersections of the MLM outputs and POS
     end if
end for
for each pair of candidate words in the zipped candidate lists do
     Insert the insertion candidate into the document
     Replace the original word with the replacement candidate
     if the prediction of C on the revised document differs from Y then
         Add the revised document to the set of counterfactual examples
     end if
end for
Algorithm 1 Plausible Counterfactual Instances Generation

3.2 Step 2: Context-Independent Word Importance

To calculate the context-independent importance at the level of individual words, we adopt the sensitivity of contextual decomposition technique from [23], which removes part of the input from the text sequence to evaluate a model’s sensitivity to it, thereby allowing for the identification of important features. In its hierarchical extension, Sampling and Contextual Decomposition (SCD), [13] mask out the phrase from the input while the maximum sequence length is set to 40. However, the average input length in our data is much larger than 40. We therefore propose a phrase-level removal method that is applied only if the phrase starts with a negative pronoun or limitation; otherwise, only a single word is removed. For example, in the sentence “the deal is not closing currently”, the attribution of “closing” should be positive while the attribution of “not closing” should be negative. In this situation, we remove the whole phrase “not closing” to calculate the influence in terms of the change in logits in the output layer of the transformer, and then assign the negative score to the word “closing”.

Given a phrase $p$ starting with a negative limitation in the $i$-th document $D_i$, we sample the documents which contain the same phrase to alleviate the influence of chance when there are multiple shreds of evidence saturating the prediction. For example, in the source “JPMorgan is closing in on a deal, sources close to the situation are optimistic for deal completion”, if we only remove the word “closing”, the prediction would not change much. With this sampling, the proposed context-independent importance of words and phrases is more robust to saturation. The formula for calculating the importance can be written as:

$$\phi(p, D_i) = \mathbb{E}_{\hat{D}_{\delta}}\left[ s(D_{i,-\delta}; \hat{D}_{\delta}) - s(D_{i,-\delta} \backslash p; \hat{D}_{\delta}) \right] \tag{7}$$

where $D_{i,-\delta}$ denotes the resulting document after masking out a single token, or a phrase starting with a negative pronoun, in the context of length $N$ surrounding the phrase $p$. We use $s(D_{i,-\delta}; \hat{D}_{\delta})$ to represent the model prediction logits after replacing the masked-out context with $\hat{D}_{\delta}$. $D \backslash p$ indicates the operation of masking out the phrase $p$ in an input document $D$ sampled from the testing set $D_{test}$.
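The masking idea behind the importance score can be illustrated with a simplified sketch. It is not the full SCD procedure: instead of sampling replacement contexts, it averages the change in the classifier score over the documents that contain the word, and it masks the preceding negator together with the word when one is present. The callback `score_fn`, the negator list, and the mask token are all hypothetical stand-ins.

```python
def word_importance(word, documents, score_fn, negators=("not", "no"), mask="[MASK]"):
    """Average change in score when `word` (or its negated phrase) is masked.

    documents: list of token lists that may contain `word`.
    score_fn:  hypothetical callback mapping a token list to a scalar logit.
    """
    deltas = []
    for tokens in documents:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            # Mask the whole phrase when the word is preceded by a negator,
            # so that "not closing" is attributed as one unit.
            start = i - 1 if i > 0 and tokens[i - 1] in negators else i
            masked = tokens[:start] + [mask] * (i - start + 1) + tokens[i + 1:]
            deltas.append(score_fn(tokens) - score_fn(masked))
    # Averaging over occurrences reduces the effect of any single context.
    return sum(deltas) / len(deltas) if deltas else 0.0
```

With a classifier whose score rises for "closing" and falls for "not closing", the function returns a positive importance in the first context and a negative one in the second, matching the example in the text.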

As an aside, the resulting top 15 most influential words are shown in Table 2. In total, there are 123 positive words and 155 negative words in the dictionaries. The average influence score of positive words (0.637) is higher than that of negative words (0.385), which may suggest that positive words usually contain more powerful clues for predicting the outcome of an M&A deal. It would be interesting to see which kinds of words in the sources indicate that a deal is more likely to be completed in the future, and which kinds of words are likely to kill the deal.

Positive Words Sensitivity Negative Words Sensitivity
announced 5.841 talks 4.674
line 5.715 could 2.484
announcement 4.469 flag 2.236
agreement 3.378 diligence 1.363
acquiring 3.342 considering 1.196
completion 2.727 time 1.186
agreed 2.429 may 1.085
closing 2.125 looking 0.983
consideration 1.994 this 0.972
prevailed 1.639 when 0.914
acquire 1.520 potentially 0.870
paid 1.461 if 0.847
disclosed 1.403 intention 0.836
selling 1.385 year 0.812
could 1.360 takeover 0.790
Table 2: Top 15 most influential words for M&A prediction. The influence score for each word is calculated and summed using Sampling and Contextual Decomposition (SCD) on the testing set.

3.3 Step 3: Counterfactual Instance Generation

As shown in Algorithm 1, we summarize three different counterfactual generation methods: the primary technique, which generates grammatically plausible counterfactuals (REP-SCD), and two further variants that guarantee counterfactual generation (RM-SCD and INS-SCD). We combine these three methods to alleviate a major issue in counterfactual explanation, namely that there is no guarantee that a counterfactual instance will be found for a given example. Our main technique identifies the most important word(s) in a test instance using SCD and replaces them with words from the intersection of grammatically plausible substitutes [obtained using a masked language model (MLM)] and words in the reverse emotional dictionary. The raw document content itself is taken as input, and the MLM outputs candidate words for each masked position. After all masked positions are infilled, we obtain the reconstructed document.

We iteratively repeat this operation at the most important word positions ranked by SCD until the reconstructed document ultimately moves the model’s classification towards the opposing class. Notably, there may be more than one counterfactual explanation corresponding to the original text instance.
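The replacement loop of the main technique can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `mlm_candidates` and `classify` are hypothetical callbacks standing in for the masked language model and the fine-tuned classifier, and only the first admissible candidate is tried at each position.

```python
def rep_scd(tokens, label, ranked_positions, mlm_candidates, opposite_words, classify):
    """Replace important words one by one until the prediction flips.

    tokens:           list of words in the original document.
    label:            class currently predicted for the document.
    ranked_positions: word positions sorted by descending importance.
    mlm_candidates:   hypothetical callback (tokens, pos) -> list of fillers.
    opposite_words:   dictionary (set) of words for the opposing class.
    classify:         hypothetical callback tokens -> predicted class.
    Returns the revised token list, or None if no counterfactual is found.
    """
    revised = list(tokens)
    for pos in ranked_positions:
        # Keep fillers that are both grammatically plausible (proposed by the
        # language model) and present in the opposite-class dictionary.
        candidates = [w for w in mlm_candidates(revised, pos) if w in opposite_words]
        if not candidates:
            continue
        revised[pos] = candidates[0]
        if classify(revised) != label:
            return revised
    return None
```

Returning `None` makes the failure case explicit, which is where fallback variants based on removal or insertion would take over.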

4 Experiment 1: Financial Text Classification with Robust Transformers

In this section we describe the results of a comprehensive evaluation of classification accuracy, comparing a variety of different classification baselines (including a human baseline) to our adversarial transformer approach.

4.1 Methods Used

The baselines used can be grouped into several distinct categories: human evaluations; traditional machine learning approaches (SVM); classical deep learning approaches (CNN [18], BiGRU [1], and HAN [36]); and various transformer approaches with/without pruning strategies. These transformer-based models are generally considered to provide the current state-of-the-art in text classification. We reproduce these baselines based on the Transformers library.

Acquiring a human baseline

As a baseline, we asked 26 participants who were experts in economics and finance to predict M&A events by completing 50 M&A evaluation questionnaires. The participants consisted of Ph.D. students and academics from the fields of economics/finance. All participants were either native English speakers or had a high degree of English competence. Each questionnaire provided information on ten M&A cases/instances, sampled randomly without replacement from the test set. In addition, the news articles available in the dataset that were published before the deal announcement were also provided. The questionnaire asked the participant to predict the outcome of the deal (complete or rumour), and to state their confidence in this prediction.

4.2 Classification Results

In line with best practice, model hyper-parameters are tuned using the validation set. In particular, the maximum sequence length is set to 256, and the large variant of each transformer is used. All experiments are evaluated using the conventional Matthews Correlation Coefficient (MCC), accuracy, and F1 metrics. The classification results are summarized in Table 3, with Random Guess used to provide a lower baseline based on chance. While the human evaluators performed better than chance, their ability to predict deal outcomes is limited when compared to the more sophisticated machine models that follow. These results are particularly compelling as the human evaluators had considerable domain expertise.
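For reference, the three reported metrics can be computed from binary predictions as in the sketch below. It uses the standard definitions of MCC, accuracy, and F1 for labels encoded as 0 (rumour) and 1 (completed); the encoding and the function name are our assumptions.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return MCC, accuracy and F1 for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # MCC is defined as 0 when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"mcc": mcc, "accuracy": accuracy, "f1": f1}
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so a classifier that always predicts one class scores 0 even on a skewed sample.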

Each of the machine learning approaches offers substantial improvements over the human evaluators, and a clear separation can be seen between traditional machine learning (with MCC scores in the low 0.7 range/F1 scores in the low 0.8 range), classical deep learners (with MCC scores in the range 0.73-0.74/F1 scores in the range 0.84-0.85), and recent transformer-based models (MCC ≥ 0.75/F1 ≥ 0.87).

We further evaluate the relative influence of the adversarial perturbation to test the robustness of the models. We find that all variants of the transformer [19, 26] benefit from adversarial perturbation during the training process in terms of their prediction results. To explore why the optimal transformer classifier outperforms the human evaluators by such a large margin (39%), we take the best-performing model, RoBERTa [21] with adversarial training, as our optimal classifier in the following experiments on generating plausible counterfactual explanations.

Evaluation            MCC    Accuracy  F1
Baselines
  Random Guess        0.013  0.510     0.462
  Human Evaluation    0.307  0.640     0.672
Traditional ML
  SVM(TF-IDF)         0.701  0.816     0.816
Classical DL
  CNN-Text            0.729  0.848     0.847
  BiGRU               0.734  0.836     0.849
  HAN                 0.742  0.848     0.853
Transformers
  ALBERT              0.768  0.882     0.879
    +Ad.              0.780  0.890     0.888
  DistilBERT          0.750  0.874     0.877
    +Ad.              0.784  0.890     0.891
  BERT-WWM            0.751  0.874     0.879
    +Ad.              0.788  0.894     0.894
  RoBERTa             0.780  0.892     0.888
    +Ad.              0.788  0.894     0.895
Table 3: Evaluations performed by human, machine learning, deep learning, and transformer-based models, alongside the ablation study for adversarial training (indicated as +Ad.). The scores in bold and italics indicate the best performance across all approaches.

5 Experiment 2: Generating Plausible Counterfactual Explanations

Interpretability is an increasingly important property for many deep learning techniques, including computer vision and natural language processing [16], especially in critical tasks such as financial text classification; high-value investment decisions demand a reasonable level of interpretability if investors are to trust the predictions that come from a system such as the one described in this work. In this section, we describe the qualitative analysis for each of our methods. Subsequently, we present the results of user studies comparing our approach to existing example-based explanation methods.

5.1 Qualitative Analysis for the Resulting Counterfactual Instances

In our qualitative analysis, we identified five typical patterns among the generated counterfactual instances, as shown in Table 4, where the revised (Rev) text differs from the original (Ori) in the changed parts. Across the 500 testing examples, we guarantee that there is at least one counterfactual instance corresponding to each original input. We gain insight into which aspects are causally relevant by comparing the original context to the revised context which flips the classifier’s prediction.

Types of Algorithms

REP-SCD: Replacing with the certainty word
  Ori: Professional vacation services provider ILG is considering a merger with Diamond Resorts International...
  Rev: Professional vacation services provider ILG is announcing a merger with Diamond Resorts International...

REP-SCD: Changing the deal value
  Ori: Vivendi is in early discussions to sell a 10.0 per cent stake in Universal Music Group (UMG) to Tencent for roughly EUR 3.00 billion...
  Rev: Vivendi is in early discussions to sell a 10.0 per cent stake in Universal Music Group (UMG) to Tencent for roughly EUR 3.00 million

INS-SCD: Recasting as
  Ori: Stryker is buying US-based spinal implant technology company K2M Group Holdings for USD 1.40 billion in cash
  Rev: Stryker is potentially buying US-based spinal implant technology company K2M Group Holdings for USD 1.40 billion in cash

INS-SCD: Inserting the negative word
  Ori: WPP has confirmed the recent speculation that it has entered into exclusive negotiations with private equity firm Bain Capital...
  Rev: WPP has not confirmed the recent speculation that it has entered into exclusive negotiations with private equity firm Bain Capital...

RM-SCD: Removing the negative limitation(s)
  Ori: This suitor is the Namdar and Washington Prime consortium, the insiders noted, adding that there can be no certainty a deal will complete...
  Rev: This suitor is the Namdar and Washington Prime consortium, the insiders noted, adding that there can be certainty a deal will complete...

Table 4: Most prominent categories of counterfactual explanations generated by our algorithms, namely RM-SCD, REP-SCD, and INS-SCD, for M&A predictions. Ori and Rev are short for original and revised instances, respectively.

5.2 Human Evaluation for the Explanation

We implement interpretation experiments on the optimal fine-tuned transformer classifier. While training an explainable model with supervised learning is a common way to interpret the results of text classification [31], self-supervised explainable frameworks are scarce. Meanwhile, the work in [14] considers similar types of edits to generate counterfactually-revised data; however, all of the instances are generated by humans, which greatly limits the scalability of the method. To comprehensively evaluate the performance of our method, we consider a state-of-the-art example-based explanation framework for comparison, namely HotFlip [8], which uses gradients to identify important words and then flips them to the adversarial words which cause the maximum change in gradients.

For the user evaluation, we ask domain experts in finance to rate our explanations on two aspects: (1) how plausible each explanation is (mainly in terms of grammar and comprehension), and (2) how reasonable it is (i.e., does the explanation make sense). We compare our method to HotFlip, the state-of-the-art framework for counterfactual explanation at the time of writing. Each score is measured on a scale of 1-5, where 5 is the best and 1 is the worst. We randomly sample 100 examples from the testing set for 5 participants to answer (20 examples per person). By combining REP-SCD, RM-SCD, and INS-SCD, our method achieves significantly higher rating scores than HotFlip: an improvement of 2.35 points (4.35 vs. 2.00) in plausibility and of 0.85 points (4.00 vs. 3.15) in reasonableness, with p-values less than 0.001 and 0.05, respectively. Hence, there is compelling evidence that our method can generate counterfactual explanations which are more plausible and reasonable.

6 Conclusion and Future Work

In this work, we pursued a new research problem in M&A prediction. Our transformer-based classifier leveraged the regularization benefits of adversarial training to enhance model robustness. More importantly, we built upon previous techniques to quantify the importance of words and to help guarantee the generation of plausible counterfactual explanations with a masked language model in financial text classification. The results demonstrate superior accuracy and explanatory performance compared to state-of-the-art techniques. An obvious extension would be to include cancelled deals in the classifier, or to predict novel M&A events based on market descriptions of companies (e.g., scale, finances, and target markets). Moreover, additional financial events (e.g., misstatement detection and earnings call analysis) are further related tasks to be considered in future research.


Acknowledgements

We would like to thank Tianhao Fu, Yimeng Li, Yang Xu and Prof. Mark Keane for their helpful advice and discussion during this work. Also, we would like to thank the anonymous reviewers for their insightful comments and suggestions to help improve the paper. This research was supported by Science Foundation Ireland (SFI) under Grant Number _.




  1. D. Bahdanau, K. Cho and Y. Bengio (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. Cited by: §4.1.
  2. G. Brunner, Y. Liu, D. Pascual, O. Richter, M. Ciaramita and R. Wattenhofer (2020) On identifiability in transformers. In Proceedings of the International Conference on Learning Representations (ICLR), External Links: Link Cited by: §1.2.
  3. R. M. Byrne (2019) Counterfactuals in explainable artificial intelligence (xai): evidence from human reasoning. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pp. 6276–6282. Cited by: §1.2, §1.3.
  4. N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 ieee symposium on security and privacy (sp), pp. 39–57. Cited by: §3.1.
  5. J. Danbolt, A. Siganos and A. Tunyi (2016) Abnormal returns from takeover prediction modelling: challenges and suggested investment strategies. Journal of Business Finance & Accounting 43 (1-2), pp. 66–97. Cited by: §1.1.
  6. J. Devlin, M. Chang, K. Lee and K. Toutanova (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. Cited by: §1.
  7. J. Duan, Y. Zhang, X. Ding, C. Y. Chang and T. Liu (2018) Learning target-specific representations of financial news documents for cumulative abnormal return prediction. In Proceedings of the 27th International Conference on Computational Linguistics (COLING-18), pp. 2823–2833. Cited by: §1.
  8. J. Ebrahimi, A. Rao, D. Lowd and D. Dou (2017) Hotflip: white-box adversarial examples for text classification. arXiv preprint arXiv:1712.06751. Cited by: §5.2.
  9. I. J. Goodfellow, J. Shlens and C. Szegedy (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572. Cited by: §3.1.
  10. R. M. Grath, L. Costabello, C. L. Van, P. Sweeney, F. Kamiab, Z. Shen and F. Lecue (2018) Interpretable credit application predictions with counterfactual explanations. arXiv preprint arXiv:1811.05245. Cited by: §1.3.
  11. C. Grimsley, E. Mayfield and J. R. Bursten (2020) Why attention is not explanation: surgical intervention and causal reasoning about neural models. In Proceedings of The 12th Language Resources and Evaluation Conference, pp. 1780–1790. Cited by: §1.2.
  12. X. Ji and G. Jetley (2009-03) The shrinking merger arbitrage spread: reasons and implications. Financial Analysts Journal 66. Cited by: §1.1.
  13. X. Jin, Z. Wei, J. Du, X. Xue and X. Ren (2020) Towards hierarchical importance attribution: explaining compositional semantics for neural sequence models. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §1.2, §3.2.
  14. D. Kaushik, E. Hovy and Z. C. Lipton (2020) Learning the difference that makes a difference with counterfactually-augmented data. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §5.2.
  15. M. T. Keane and B. Smyth (2020) Good counterfactuals and where to find them: a case-based technique for generating counterfactuals for explainable AI (XAI). In International Conference on Case-Based Reasoning (ICCBR), Cited by: §1.2.
  16. E. M. Kenny and M. T. Keane (2019) Twin-systems to explain artificial neural networks using case-based reasoning: comparative tests of feature-weighting methods in ANN-CBR twins for XAI. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 2708–2715. Cited by: §1.2, §5.
  17. E. M. Kenny and M. T. Keane (2020) On generating plausible counterfactual and semi-factual explanations for deep learning. arXiv preprint arXiv:2009.06399. Cited by: §1.3.
  18. Y. Kim (2014) Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751. Cited by: §4.1.
  19. Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma and R. Soricut (2019) ALBERT: a lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942. Cited by: §4.2.
  20. X. Liu, H. Cheng, P. He, W. Chen, Y. Wang, H. Poon and J. Gao (2020) Adversarial training for large neural language models. arXiv preprint arXiv:2004.08994. Cited by: §3.1.
  21. Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer and V. Stoyanov (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692. Cited by: §1, §4.2.
  22. M. Ma and F. Zhang (2016) Investor reaction to merger and acquisition rumors. SSRN 2813401. Cited by: §1.1.
  23. A. Madry, A. Makelov, L. Schmidt, D. Tsipras and A. Vladu (2018) Towards deep learning models resistant to adversarial attacks. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §3.1, §3.2.
  24. T. Miyato, A. M. Dai and I. Goodfellow (2017) Adversarial training methods for semi-supervised text classification. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §3.1.
  25. W. J. Murdoch, P. J. Liu and B. Yu (2018) Beyond word importance: contextual decomposition to extract interactions from LSTMs. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §1.2.
  26. V. Sanh, L. Debut, J. Chaumond and T. Wolf (2019) DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. Cited by: §4.2.
  27. C. Singh, W. J. Murdoch and B. Yu (2019) Hierarchical interpretations for neural network predictions. In Proceedings of the International Conference on Learning Representations (ICLR), Cited by: §1.2.
  28. D. Tsipras, S. Santurkar, L. Engstrom, A. Turner and A. Madry (2018) Robustness may be at odds with accuracy. In International Conference on Learning Representations, Cited by: §3.1.
  29. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser and I. Polosukhin (2017) Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998–6008. Cited by: §3.1.
  30. S. Wachter, B. Mittelstadt and C. Russell (2017) Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv. JL & Tech. 31, pp. 841. Cited by: §1.3.
  31. E. Wallace, J. Tuyls, J. Wang, S. Subramanian, M. Gardner and S. Singh (2019) AllenNLP Interpret: a framework for explaining predictions of NLP models. arXiv preprint arXiv:1909.09251. Cited by: §5.2.
  32. F. Z. Xing, E. Cambria and Y. Zhang (2019) Sentiment-aware volatility forecasting. Knowledge-Based Systems 176, pp. 68–76. Cited by: §1.
  33. J. Yan, S. Xiao, C. Li, B. Jin, X. Wang, B. Ke, X. Yang and H. Zha (2016) Modeling contagious merger and acquisition via point processes with a profile regression prior. In International Joint Conference on Artificial Intelligence, IJCAI-16, pp. 2690–2696. Cited by: §1.1.
  34. L. Yang, T. L. J. Ng, B. Smyth and R. Dong (2020) HTML: hierarchical transformer-based multi-task learning for volatility prediction. In Proceedings of The Web Conference 2020, WWW ’20, pp. 441–451. Cited by: §1.
  35. L. Yang, Z. Zhang, S. Xiong, L. Wei, J. Ng, L. Xu and R. Dong (2018) Explainable text-driven neural network for stock prediction. In 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), pp. 441–445. Cited by: §1.
  36. Z. Yang, D. Yang, C. Dyer, X. He, A. Smola and E. Hovy (2016) Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-16, pp. 1480–1489. Cited by: §4.1.