End-to-end Emotion-Cause Pair Extraction via Learning to Link


Emotion-cause pair extraction (ECPE), an emerging natural language processing task, aims at jointly investigating emotions and their underlying causes in documents. It extends the previous emotion cause extraction (ECE) task, yet without requiring a set of pre-given emotion clauses as in ECE. Existing approaches to ECPE generally adopt a two-stage method, i.e., (1) emotion and cause detection, and then (2) pairing the detected emotions and causes. Such a pipeline method, while intuitive, suffers from two critical issues: error propagation across stages, which may hinder effectiveness, and high computational cost, which limits the practical application of the method. To tackle these issues, we propose a multi-task learning model that can extract emotions, causes and emotion-cause pairs simultaneously in an end-to-end manner. Specifically, our model regards pair extraction as a link prediction task, and learns to link from emotion clauses to cause clauses, i.e., the links are directional. Emotion extraction and cause extraction are incorporated into the model as auxiliary tasks, which further boost the pair extraction. Experiments are conducted on an ECPE benchmark dataset. The results show that our proposed model outperforms a range of state-of-the-art approaches in terms of both effectiveness and efficiency.

1 Introduction

Emotion cause extraction (ECE) [15] aims to extract possible causes for a given emotion clause. While ECE has attracted increasing attention due to its theoretical and practical significance, it requires that the emotion signals be given. In practice, emotion annotation is rather labor intensive, which limits the applicability of ECE in practical settings. To address this limitation, the emotion-cause pair extraction (ECPE) task was recently proposed in [20]. Unlike ECE [15], ECPE aims to extract emotions and causes without any given emotion signals, and thus better aligns with real-world applications. An illustrative example is given in Figure 1(a), showing that clause 4 serves as the emotion and clauses 2 and 3 are the corresponding causes. Typically, ECPE is formulated as extracting emotion-cause pairs, e.g., (clause 4, clause 2) and (clause 4, clause 3), directly from provided documents.

(a) An example document and extracted emotion-cause pairs
(b) Directional graph of the document
Figure 1: Extracting emotion-cause pairs of the given document via learning to link.

ECPE is a challenging task as it requires the extraction of emotions, causes and emotion-cause pairs. Existing work in ECPE mainly focuses on how to collaboratively extract emotions and causes and combine them in an appropriate way. Thus, a two-stage method [20] is typically adopted, which divides pair extraction into two steps: first detecting emotions and causes, and then pairing them based on the likelihood of each element in their Cartesian product. Such a pipeline approach is intuitive and straightforward. However, two critical issues arise. One is the error propagation from the first step to the second. The other is that, owing to the step-by-step structure, the two-stage models are often computationally expensive.

To tackle the aforementioned issues, we propose an end-to-end multi-task learning model for predicting emotion-cause pairs, namely E2EECPE, which connects emotions and causes within one single stage. More specifically, benefitting from its end-to-end architecture, E2EECPE resolves the issue of error propagation while substantially reducing the computation time compared with the state-of-the-art models. In particular, our model views emotion-cause pair extraction as a link prediction problem, as shown in Figure 1(b), and predicts whether there exists a directional link from the emotion to the cause. Furthermore, we incorporate into the model two auxiliary tasks, namely emotion extraction and cause extraction, which are intended to further enhance the expressiveness of the intermediate emotion representation and cause representation.

Extensive experiments are carried out on a benchmark ECPE dataset. The experimental results demonstrate the effectiveness and efficiency of the proposed E2EECPE model in comparison with a variety of state-of-the-art baselines. Moreover, a further ablation study indicates that the auxiliary tasks are beneficial.

2 Related Work

First of all, our work is related to extracting causes based on emotions explicitly presented in documents, i.e., ECE [15]. Earlier work views ECE as a word-level sequence tagging problem and tries to solve it with corresponding tagging techniques. Therefore, primary efforts have been made on discovering refined linguistic features [4, 14], yielding improved performance. In line with other tagging related tasks such as named entity recognition (NER), support vector machines (SVMs) [9] and conditional random fields (CRFs) [13] have been used for ECE. More recently, instead of concentrating on word-level cause detection, clause-level extraction [8] is put forward in that the impact of individual words in a cause can span over the whole sequence in the clause.

With the emergence and development of deep representation learning, neural models have also been utilized in ECE. [5] leverages long short-term memory networks (LSTMs) [11] to promote the context awareness of clause modelling. [7] views the information extraction problem as the retrieval task in question answering (QA) and examines the effect of memory networks [19] for extraction. Likewise, taking advantage of the attention mechanism [1], [16] employs a co-attention based model and achieves state-of-the-art performance.

In light of recent advances in multi-task learning, joint extraction of emotions and causes is investigated in [3] to exploit the mutual information between two correlated tasks. However, these works do not explicitly combine the two tasks into one. Thereafter, [20] argues that, while co-extraction of emotions and causes is important, emotion-cause pair extraction (ECPE) is a more challenging problem that is worth putting more emphasis on. Nevertheless, [20] adopts a two-stage approach, which performs emotion and cause extraction first and then pairs the extracted emotions and causes. As discussed in the previous section, such a two-stage approach suffers from error propagation and high computational cost.

Our work aims to tackle these challenges in ECPE. Rather than processing emotion-cause pair extraction as a two-stage task (as in the existing work), we consolidate the two stages into a unified multi-task learning framework, and further treat it as a link prediction task that can be solved in an end-to-end manner.

3 Methodology

To solve ECPE in an end-to-end fashion, we take inspiration from the link prediction problem in graph learning, which aims at predicting potential edges between unconnected vertices in a graph. Essentially, if we consider emotion-cause pairs in a document as triplets in a graph, then the extraction of such pairs is a form of link prediction on a graph that initially has vertices but no edges. To realize this procedure, we borrow the idea of learning a graph-based dependency parser [6] and adapt it to our target task. Coupling link prediction with the auxiliary emotion extraction and cause extraction tasks, our model is capable of jointly, and more effectively, extracting emotions, causes, and emotion-cause pairs.

Figure 2: An overview of E2EECPE. EE and CE are short for emotion extraction and cause extraction, respectively.

3.1 Problem Formulation

Generally, any provided document $d$ can be split into a sequence of clauses based on punctuation, i.e., $d = \{c_1, c_2, \dots, c_n\}$, where each clause $c_i$ can be further decomposed into words, i.e., $c_i = \{w^i_1, w^i_2, \dots, w^i_{l_i}\}$. Here, $n$ is the number of clauses in the document and $l_i$ is the number of words in the $i$-th clause. ECPE aims to extract a set of emotion-cause pairs $P = \{\dots, (c^e_j, c^c_j), \dots\}$ from the document $d$, where $c^e_j$ and $c^c_j$ represent the emotion clause and the cause clause in the $j$-th pair, respectively.
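
To make the formulation concrete, the following minimal sketch represents a document as a list of clauses and its emotion-cause pairs as (emotion index, cause index) tuples, and then converts the pairs into a binary matrix whose rows index emotions and whose columns index causes. The placeholder clause strings, the 1-based indexing, and the helper name `pairs_to_matrix` are illustrative assumptions rather than part of the dataset or the authors' code.

```python
from typing import List, Set, Tuple

# (emotion clause index, cause clause index), both 1-based as in Figure 1
Pair = Tuple[int, int]


def pairs_to_matrix(n_clauses: int, pairs: Set[Pair]) -> List[List[int]]:
    """Build an n x n 0/1 matrix: entry [i][j] is 1 iff clause i+1 is an
    emotion clause and clause j+1 is one of its cause clauses."""
    matrix = [[0] * n_clauses for _ in range(n_clauses)]
    for emotion, cause in pairs:
        matrix[emotion - 1][cause - 1] = 1
    return matrix


# Hypothetical four-clause document mirroring the Figure 1 example:
# clause 4 is the emotion, clauses 2 and 3 are its causes.
clauses = ["clause 1", "clause 2", "clause 3", "clause 4"]
gold_pairs = {(4, 2), (4, 3)}
gold_matrix = pairs_to_matrix(len(clauses), gold_pairs)

for row in gold_matrix:
    print(row)
```

Only the last row of the matrix is non-zero here, and the matrix is not symmetric: the link runs from clause 4 to clauses 2 and 3, not the other way round.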

3.2 Overall Architecture

An overview of the proposed E2EECPE approach is shown in Figure 2. The bottom part comprises a clause encoder (Section 3.3) and a document modelling layer (Section 3.4), which transform the word embeddings into contextualized clause representations. The middle part consists of the auxiliary tasks (Section 3.7), i.e., emotion extraction and cause extraction. The topmost part is a biaffine attention layer (Section 3.5), which first encodes the interaction between the emotion representation and the cause representation, and then outputs a position-weighted pair matrix for pair extraction.

3.3 Embedding & Clause Encoder

With the purpose of integrating words into clause-level neural models, we embed each word in a clause into a low-dimensional vector [2], so that the $j$-th word of the $i$-th clause is represented by its vector representation¹ $\mathbf{w}^i_j \in \mathbb{R}^{d_e}$, where $d_e$ is the dimensionality of the embedding.

After that, we need to attain contextualized representations of clauses. Owing to the recognized performance and local context awareness of the convolutional neural networks (CNNs) on text classification benchmarks [12], we adopt CNNs as the backbone of our clause encoder.

For an embedded clause , we apply one-dimensional convolution operations with kernels of different sizes over the word sequence:


where denotes the -th convolution operation and is the output of the operation. is the number of filters employed in one convolution operation and used here is actually .

Then max-pooling is used to distill the features for concatenation. Hence, we finally get context-aware features for the clause:


where is the convoluted feature and means vector concatenation. is the total number of convolution operations.
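
The clause encoder described above can be sketched in plain Python as follows: each kernel size slides its filters over windows of consecutive word vectors, max-pooling keeps the strongest response of every filter, and the pooled features of all kernel sizes are concatenated. This is only an illustration under stated assumptions: no padding, no bias or non-linearity, and tiny hand-written filters. The function names `conv1d`, `max_pool`, and `encode_clause` are hypothetical.

```python
from typing import List

Vector = List[float]


def conv1d(words: List[Vector], kernel: List[List[Vector]]) -> List[Vector]:
    """Slide every filter of one kernel over windows of consecutive words.

    kernel[f] is one filter: a list of `size` weight vectors, one per window
    position. Returns one feature vector (one value per filter) per window.
    Assumes the clause has at least `size` words (no padding is applied).
    """
    size = len(kernel[0])
    outputs = []
    for start in range(len(words) - size + 1):
        window = words[start:start + size]
        outputs.append([
            sum(w * x for wv, xv in zip(filt, window) for w, x in zip(wv, xv))
            for filt in kernel
        ])
    return outputs


def max_pool(features: List[Vector]) -> Vector:
    """Max-over-time pooling: keep the largest response of each filter."""
    return [max(column) for column in zip(*features)]


def encode_clause(words: List[Vector], kernels: List[List[List[Vector]]]) -> Vector:
    """Concatenate the pooled features of all kernel sizes."""
    pooled: Vector = []
    for kernel in kernels:
        pooled.extend(max_pool(conv1d(words, kernel)))
    return pooled


# Toy clause with three 2-dimensional word vectors and two kernels
# (sizes 2 and 3), each with a single hand-written filter.
words = [[1, 0], [0, 1], [1, 1]]
kernel_size_2 = [[[1, 1], [1, 1]]]
kernel_size_3 = [[[1, 0], [0, 0], [0, -1]]]
clause_vector = encode_clause(words, [kernel_size_2, kernel_size_3])
print(clause_vector)
```

With the settings reported in Section 4.2 (kernel sizes {2, 3, 4, 5} and 50 filters per kernel size), the same concatenation would yield a 200-dimensional clause vector.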

3.4 Document Modelling

Since we have sequential clauses in a document, the influences brought by document-level structures become a crucial part that we should fit into our model. A straightforward idea is to leverage temporal relations among clauses with LSTMs.

Specifically, provided with the encoded clause representations , we employ a bidirectional LSTM to update clause-level features and get :


where and denote the forward and backward unidirectional LSTMs, respectively. is the dimensionality of hidden states for a unidirectional LSTM.

3.5 Biaffine Attention

Motivated by advances in link prediction [18], we can directly compute similarity scores among vertex representations (clause representations in our task), e.g., for the representations of any two vertices, to make predictions. However, such predictions only cover the undirected case, since the score is symmetric in the two vertices, which is not adequate for emotion-cause pair extraction. To solve this problem, we utilize a biaffine transform to fill in the adjacency matrices, which are called pair matrices in our work. This idea is similar to dependency parsing [6], which is also directional.

Emotion & Cause Representation

According to the biaffine attention mechanism, each vertex in the graph should have two independent representations, i.e., one for links pointing out of the vertex and the other for links pointing into it. In doing so, the pair matrix output by the transformation is asymmetric and direction-aware.

The emotion representation and cause representation are separately offered as below:


where , and , are two sets of trainable weights and biases, respectively for the emotion and cause representations.

Biaffine Transform

Then, we apply the biaffine transform to the collected emotion and cause representations. In other words, with the purpose of merging these two kinds of representations into the target pair matrices, we fold emotion-cause dynamics into two components. On the one hand, we need to perform a bilinear-like operation on each possible pair of emotion and cause. On the other hand, we believe a bilinear transform alone is not enough to deal with such complicated interactions, and thus we augment it with a bias term.

More specifically, we calculate each entry in the expected pair matrix as follows:


where and are learnable parameters of affine transform, while indicates an entry of the pair matrix in the -th row, -th column.

Constrained by the inherent property of an adjacency matrix, we further activate the pair matrix with the sigmoid function :
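
As a rough illustration of the biaffine scoring and sigmoid activation described in this section, the sketch below scores every (emotion, cause) combination with a bilinear term plus a bias term and squashes the result into (0, 1). The exact parameterisation used by the authors is not recoverable from this text, so the form sigmoid(e_i^T W c_j + b^T c_j), which follows the arc scorer of [6], is an assumption, as are the function names and the toy values.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def biaffine_pair_matrix(emotions: List[Vector], causes: List[Vector],
                         weight: Matrix, bias: Vector) -> Matrix:
    """Entry [i][j] = sigmoid(e_i^T W c_j + b^T c_j) (assumed form)."""
    scores = []
    for e in emotions:
        # e^T W: project the emotion vector through the bilinear weight
        projected = [sum(e[r] * weight[r][k] for r in range(len(e)))
                     for k in range(len(weight[0]))]
        row = []
        for c in causes:
            bilinear = sum(p * x for p, x in zip(projected, c))
            linear = sum(b * x for b, x in zip(bias, c))
            row.append(sigmoid(bilinear + linear))
        scores.append(row)
    return scores


# Two clauses, each with its own emotion and cause representation.
emotion_reps = [[1.0, 0.0], [0.0, 1.0]]
cause_reps = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 2.0], [-2.0, 0.0]]  # deliberately non-symmetric toy weights
b = [0.0, 0.0]

pair_matrix = biaffine_pair_matrix(emotion_reps, cause_reps, W, b)
print(pair_matrix[0][1] > pair_matrix[1][0])  # the score depends on direction
```

Because W is not symmetric and each clause has separate emotion and cause representations, the score for linking clause 1 to clause 2 differs from the score for linking clause 2 to clause 1, which is the directional behaviour the text asks for.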


3.6 Asymmetric Position Weight Matrix

A simple observation on the co-occurrence patterns of emotions and causes is that the emotion and the cause in a pair tend to appear near each other in terms of their absolute positions in the document. For this reason, position embeddings have been introduced to directly encode positions into vectors [20]. Unlike these existing approaches, our work is based on graph learning and thus cannot be aided by manipulating embeddings. Instead, we apply proximity weights on features as in [21], but extend them to matrices.

Moreover, we notice that in practice people are more likely to state the causes before expressing emotions, which will be verified with dataset statistics, implying that we should assign asymmetric position weight matrices, instead of symmetric ones, to the original matrix representations.

Specifically, the entries in the lower triangle of the position weight matrix for the given document are given more weight:


where is a small number for smoothing.

Finally, we obtain the features for indicating pair links as follows:


where denotes element-wise multiplication.
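
The weighting step can be illustrated with the sketch below. The exact formula of the position weight matrix is not recoverable from this text, so the form used here is purely an assumption chosen to match the description: weights decay with the distance between the emotion (row) and the cause (column), and entries in the lower triangle (cause at or before the emotion) receive larger weights than their mirrored counterparts. The `penalty` parameter and both function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def position_weights(n: int, eps: float = 1.0, penalty: float = 1.0) -> Matrix:
    """Assumed asymmetric proximity weights for an n-clause document.

    Rows index emotion clauses and columns index cause clauses. Entries with
    the cause at or before the emotion (lower triangle) are weighted more.
    """
    weights = []
    for i in range(n):          # i: emotion clause (row)
        row = []
        for j in range(n):      # j: cause clause (column)
            distance = abs(i - j)
            if j <= i:
                row.append(1.0 / (distance + eps))
            else:
                row.append(1.0 / (distance + eps + penalty))
        weights.append(row)
    return weights


def apply_weights(pair_matrix: Matrix, weights: Matrix) -> Matrix:
    """Element-wise multiplication of the pair matrix with the weights."""
    return [[p * w for p, w in zip(p_row, w_row)]
            for p_row, w_row in zip(pair_matrix, weights)]


weights = position_weights(3)
uniform_scores = [[0.5] * 3 for _ in range(3)]
weighted = apply_weights(uniform_scores, weights)
print(weighted[1][0], weighted[0][1])  # cause-before-emotion scores higher
```

Since every weight is at most 1, the weighted scores can only shrink relative to the sigmoid outputs, which is relevant when choosing the inference threshold.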

3.7 Multi-task Setting

As mentioned above, emotion extraction and cause extraction can be viewed as auxiliary tasks that augment the emotion and cause representations in order to construct a more expressive pair matrix. Hence, we develop a multi-task paradigm in which the main task shares the fundamental part of the network structure with the auxiliary tasks. To achieve this goal, we first acquire features dedicated to classification with the following procedure:


where , and , are again two sets of trainable weights and biases, respectively.

Subsequently, predictions are produced by two fully connected layers followed by softmax normalization layers:


where , and , are weights and biases for learning.

3.8 Training Objective

Eventually, the whole structure can be trained by standard gradient descent. Accordingly, the objective function is a combination of cross entropy with $\ell_2$-norm regularization, formulated as below:


where and , serve as enumerators over all elements. , , and are correspondingly the ground truth.

Furthermore, we add two coefficients to balance the influences of the two objective functions above. The ultimate training objective then becomes:


where the first coefficient adjusts the influence brought by multi-task learning and is set according to a pilot study. The regularization term covers all parameters that need to be optimized, and the second coefficient weights the $\ell_2$-norm regularization.
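
One plausible reading of this objective is sketched below: a binary cross entropy over the entries of the pair matrix, a cross entropy for each auxiliary task, and an L2 penalty, combined with a trade-off coefficient and a regularization coefficient. Whether the authors average or sum the individual terms is not stated in this text, so the averaging, the function names, and the argument names are assumptions. The defaults 1.0 and 1e-5 are the values reported in Section 4.2.

```python
import math
from typing import List, Sequence


def binary_cross_entropy(predicted: List[List[float]],
                         gold: List[List[int]]) -> float:
    """Mean binary cross entropy over all pair-matrix entries.

    Predicted values must lie strictly between 0 and 1.
    """
    total, count = 0.0, 0
    for p_row, g_row in zip(predicted, gold):
        for p, g in zip(p_row, g_row):
            total -= g * math.log(p) + (1 - g) * math.log(1 - p)
            count += 1
    return total / count


def cross_entropy(probabilities: List[List[float]], labels: List[int]) -> float:
    """Mean cross entropy; probabilities[i] is the class distribution of clause i."""
    return -sum(math.log(dist[label])
                for dist, label in zip(probabilities, labels)) / len(labels)


def total_loss(pair_loss: float, emotion_loss: float, cause_loss: float,
               parameters: Sequence[float],
               trade_off: float = 1.0, l2: float = 1e-5) -> float:
    """Main loss + trade_off * auxiliary losses + l2 * squared parameter norm."""
    penalty = sum(p * p for p in parameters)
    return pair_loss + trade_off * (emotion_loss + cause_loss) + l2 * penalty


# Toy values: one uncertain pair prediction and one uncertain clause label.
pair_loss = binary_cross_entropy([[0.5, 0.5]], [[1, 0]])
emotion_loss = cross_entropy([[0.5, 0.5]], [1])
print(round(pair_loss, 4), round(emotion_loss, 4))
```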

3.9 Inference

With a well-trained model, we can infer emotion-cause pairs by comparing each entry of the position-weighted pair matrix with a predefined threshold:


where is the inference result matrix with binary (1-0) indicators.
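
The thresholding step can be sketched as follows. The default of 0.3 is the value reported in Section 4.2; the use of a strict inequality, the 1-based clause indices in the output, and the function name `extract_pairs` are assumptions made for illustration, and the score matrix is made up.

```python
from typing import List, Set, Tuple


def extract_pairs(weighted_matrix: List[List[float]],
                  threshold: float = 0.3) -> Set[Tuple[int, int]]:
    """Return (emotion, cause) clause indices (1-based) whose score
    exceeds the threshold."""
    return {(i + 1, j + 1)
            for i, row in enumerate(weighted_matrix)
            for j, score in enumerate(row)
            if score > threshold}


# Made-up scores for a four-clause document: only row 4 has strong links.
scores = [[0.05] * 4 for _ in range(4)]
scores[3][1] = 0.62
scores[3][2] = 0.41
scores[3][3] = 0.20

predicted_pairs = extract_pairs(scores)
print(sorted(predicted_pairs))
```

Raising the threshold to 0.5 in this toy example would keep only the pair (4, 2), which mirrors the precision and recall trade-off studied in Section 5.4.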

4 Experimental Setting

4.1 Dataset

We carry out experiments on a publicly available dataset for emotion-cause pair extraction, released by [20]. Consisting of news crawled from the web, the dataset is referred to as News in the rest of the paper. It is randomly split into ten folds. In our experiments, to evaluate our trained model, each fold is further divided into a training set and a test set, which take 90% and 10% of the data, respectively. Table 1 shows some basic statistics of the dataset. A key observation is that most documents contain only one emotion-cause pair, implying the sparsity of the pair matrix; the resulting label imbalance is examined in Section 5.5. Moreover, a large number of emotion-cause pairs have the emotion and the cause within a relative offset of 1, suggesting the necessity of proximity constraints (exactly what the position weight matrix provides) in the predicted pair matrix.

Statistic                                      Value
# of documents                                 2105
avg. # of clauses per document                 14
# of EC pairs                                  2167
# of documents with 1 EC pair                  1746
# of documents with over 1 EC pairs            421
# of EC pairs with 0 relative offset           511
# of EC pairs with 1 relative offset           1342
# of EC pairs with 2 relative offset           224
# of EC pairs with over 2 relative offset      90
maximum EC pair offset                         12
avg. offset of EC pairs                        0.9977
Table 1: Statistics of the dataset. EC stands for emotion-cause, and relative offset indicates the absolute distance between the emotion and the cause of a pair in the document.
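
The two observations drawn from Table 1 can be checked with a few lines of arithmetic. The sketch below uses only the counts listed in the table. The sparsity figure is a rough estimate that treats every document as having the average of 14 clauses, which is a simplifying assumption; the variable names are illustrative.

```python
# Counts copied from Table 1.
offset_counts = {"0": 511, "1": 1342, "2": 224, ">2": 90}
total_pairs = 2167
total_documents = 2105
avg_clauses = 14

# The offset buckets account for every emotion-cause pair.
buckets_sum = sum(offset_counts.values())

# Share of pairs whose emotion and cause are at most one clause apart.
within_one = (offset_counts["0"] + offset_counts["1"]) / total_pairs

# Rough share of positive entries in a 14 x 14 pair matrix.
pairs_per_document = total_pairs / total_documents
positive_rate = pairs_per_document / (avg_clauses * avg_clauses)

print(buckets_sum == total_pairs)
print(f"pairs within offset 1: {within_one:.1%}")
print(f"approx. positive entries per pair matrix: {positive_rate:.2%}")
```

Roughly 85.5% of the pairs lie within a relative offset of 1, and on the order of 0.5% of the entries of a typical pair matrix are positive, which quantifies both the proximity pattern and the sparsity mentioned in the text.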

4.2 Parameter Settings

For all our experiments, word vectors pre-trained on Weibo (a Chinese microblogging website) using Word2Vec [17] are used to initialize the word embeddings. The dimensionality of embeddings (i.e., ) is set to 200. We use 4 convolutional layers (i.e., ) whose kernel sizes are {2,3,4,5} for the clause encoder, and the number of filters for all the convolutional layers (i.e., ) is 50, for capturing gram-level features. In order to avoid overfitting, we apply dropout with a probability of 0.5 to the embeddings and the outputs of the clause encoder. The dimensionality for hidden states of a unidirectional LSTM (i.e., ) is 300. The dimensionalities for all fully connected layers in the main task and auxiliary tasks (i.e., ) are 100. Moreover, the batch size and learning rate are determined through grid search and are 32 and $10^{-3}$, respectively. The coefficient for $\ell_2$-norm regularization (i.e., ) is $10^{-5}$. Based on a pilot study, we find that the best value for the threshold (i.e., ) in the inference stage is 0.3, as detailed in Section 5.4. The coefficient for the trade-off in the objective function (i.e., ) is 1. In addition, the smoothing term in the calculation of the position weight matrix (i.e., ) is 1. Furthermore, Adam is used as the optimizer and all trainable parameters are randomly initialized with a uniform distribution [10].

4.3 Baselines & Evaluation Metrics

Our approach² is compared with a range of strong baselines, which are the state-of-the-art methods proposed by [20] for emotion-cause pair extraction. These baselines are either one-stage or two-stage models.

The two-stage models are listed below. They first extract emotions and causes with multi-task architectures, independently or interactively, and then classify the elements of the Cartesian product of the emotions and causes extracted in the first stage into pairs or non-pairs.

  • Indep first treats emotion extraction and cause extraction as independent tasks and extracts emotions and causes with multi-task learning, then pairs the extracted emotions and causes with a classifier.

  • Inter-CE and Inter-EC follow the procedure of Indep, but assist emotion extraction and cause extraction with directed interaction modelling.

The one-stage models omit the second stage of the two-stage models and take the Cartesian product as the prediction. They are listed as follows.

  • Indep w/o filter removes the classifier of Indep.

  • Inter-CE w/o filter removes the classifier of Inter-CE.

  • Inter-EC w/o filter removes the classifier of Inter-EC.

Precision, recall, and macro F1 measures are adopted as effectiveness metrics in our experiments. Meanwhile, runtime is used as a measure of efficiency. The final results are obtained by averaging the results over the ten folds.
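
For reference, precision, recall, and F1 over extracted pairs can be computed as in the sketch below, which compares a set of predicted (emotion, cause) pairs with the gold set. It is a simplified illustration: the authors' exact aggregation across documents and folds is not reproduced, and the function name and the example pairs are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]


def precision_recall_f1(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 for emotion-cause pairs."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: two correct pairs plus one spurious prediction.
gold = {(4, 2), (4, 3)}
predicted = {(4, 2), (4, 3), (1, 1)}
p, r, f1 = precision_recall_f1(predicted, gold)
print(round(p, 4), round(r, 4), round(f1, 4))
```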

5 Result & Analysis

5.1 Results in Effectiveness

We perform a comparison of E2EECPE with one-stage and two-stage baseline models to quantitatively understand in what ways E2EECPE is more effective than the baselines.

Models                 Emotion extraction        Cause extraction          Emotion-cause pair extraction
                       P       R       F1        P       R       F1        P       R       F1
Indep w/o filter       0.8483  0.7961  0.8208    0.6898  0.5648  0.6198    0.5932  0.5094  0.5470
Inter-CE w/o filter    0.8458  0.8035  0.8263    0.6838  0.5754  0.6231    0.5827  0.5300  0.5531
Inter-EC w/o filter    0.8406  0.8097  0.8242    0.6989  0.5991  0.6426    0.5975  0.5538  0.5716
Indep                  0.8483  0.7961  0.8208    0.6898  0.5648  0.6198    0.6943  0.5047  0.5833
Inter-CE               0.8458  0.8035  0.8263    0.6838  0.5754  0.6231    0.6780  0.5254  0.5896
Inter-EC               0.8406  0.8097  0.8242    0.6989  0.5991  0.6426    0.6691  0.5503  0.6013
E2EECPE                0.8595  0.7915  0.8238    0.7062  0.6030  0.6503    0.6478  0.6105  0.6280
Table 2: Comparison results of emotion extraction, cause extraction, and emotion-cause pair extraction with precision, recall, and F1-measure as metrics. The results in bold are the best performing ones under each column. The results of emotion extraction and cause extraction for one-stage and two-stage models are exactly the same because one-stage models are ablated ones of two-stage models. indicates results that are significantly better than best performing baseline Inter-EC with paired t-test (p is smaller than 0.05).

Table 2 gives the results in terms of precision, recall and macro-F1 measures. The comparison results demonstrate that our model E2EECPE consistently outperforms the baselines on the main task (emotion-cause pair extraction) with regard to recall and F1, indicating the representation power and the effectiveness of our model. Nevertheless, we also observe that our model achieves lower precision than the two-stage baseline models. Given the additional observation that the baseline models perform poorly on recall, we conjecture that the existing models predict only a few test instances as pairs.

Furthermore, E2EECPE is superior on cause extraction across all three metrics and remains competitive on emotion extraction, where it attains the best precision while its F1 (0.8238) is slightly below that of Inter-CE (0.8263). We attribute the improvement to the multi-task structure of our model, which combines the auxiliary tasks and the main task. Apart from that, the one-stage models yield lower results than E2EECPE on cause extraction, suggesting that error propagation is a comparably severe issue in the existing models but is alleviated in our model.

5.2 Runtime Analysis

To confirm that our end-to-end model is more efficient than the two-stage models, we perform a runtime analysis of the two-stage baseline models and ours. Concretely, we report the average running time that each model takes for one epoch across the different folds.

Figure 3: Runtime analysis (s).

The results in Figure 3 suggest that our model is 6-7 times faster than the two-stage models, indicating the efficiency of our end-to-end model.

5.3 Ablation Study

To understand the efficacy of auxiliary tasks and position weight matrix, we conduct an ablation study on E2EECPE. Specifically, we separately ablate auxiliary tasks and position weight matrix from E2EECPE, and call them E2EECPE w/o auxiliary and E2EECPE w/o position, respectively.

Models                   P       R       F1
E2EECPE                  0.6478  0.6105  0.6280
E2EECPE w/o auxiliary    0.5982  0.5340  0.5635
E2EECPE w/o position     0.6421  0.6158  0.6275
Table 3: Ablation study results. The results in bold are the best performing ones under each column.

The results in Table 3 show a substantial performance drop for E2EECPE w/o auxiliary (F1 falls from 0.6280 to 0.5635), verifying the benefit of the multi-task learning scheme. By contrast, E2EECPE w/o position differs only slightly from E2EECPE (F1 of 0.6275 versus 0.6280, with marginally higher recall), indicating that the position weight matrix contributes a small gain in precision and F1 on this dataset.

5.4 Effect of Threshold in Inference Stage

Inference based on pair matrix is powerful, yet we do not exactly know what threshold (i.e., ) is the most suitable one for its expressiveness. It is therefore helpful to explore the effect of the threshold by altering it and examining the results.

Threshold    P       R       F1
0.2          0.5743  0.6456  0.6071
0.3          0.6478  0.6105  0.6280
0.4          0.6757  0.5849  0.6265
0.5          0.7185  0.5543  0.6255
0.6          0.7326  0.5385  0.6201
Table 4: Effect of threshold. The results in bold are the best performing ones under each column.

From Table 4, we conclude that 0.3 is the most appropriate threshold for our studied task. As the threshold increases beyond 0.3, precision rises but recall and F1 drop, implying a loss of extracted pairs. In addition, we speculate that the reason why the best value is not around 0.5 (the expectation of a random variable distributed uniformly between 0 and 1) is that the element-wise multiplication of the position weight matrix with the sigmoid-activated pair matrix produces a smaller expectation (as the upper bound decreases).

5.5 Issue of Label Imbalance

In order to measure the impact of label imbalance, which typically takes the form of pair matrix sparsity, we remove the examples containing more than one pair from the test set of each fold to form a Hard dataset, and then record the mean results across the ten folds.
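
The construction of the Hard test set amounts to a simple filter, sketched below. The document representation (a dictionary with a `pairs` entry), the field names, and the function name are hypothetical and only serve to illustrate the filtering rule stated above.

```python
from typing import Dict, List


def build_hard_subset(test_documents: List[Dict]) -> List[Dict]:
    """Drop every test document that contains more than one emotion-cause pair."""
    return [doc for doc in test_documents if len(doc["pairs"]) <= 1]


# Hypothetical test fold with three documents.
test_fold = [
    {"id": "doc-a", "pairs": [(4, 2)]},
    {"id": "doc-b", "pairs": [(4, 2), (4, 3)]},
    {"id": "doc-c", "pairs": [(7, 6)]},
]
hard_fold = build_hard_subset(test_fold)
print([doc["id"] for doc in hard_fold])
```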

Dataset    P       R       F1
Full       0.6478  0.6105  0.6280
Hard       0.6002  0.6479  0.6226
Table 5: The results for verifying the issue of label imbalance.

We can observe in Table 5 that our model performs worse on the Hard dataset in precision (0.6002 versus 0.6478) and, marginally, in F1 (0.6226 versus 0.6280), although recall increases, suggesting that further investigation is needed to solve this problem.

6 Conclusion and Future Work

The emotion-cause pair extraction task is a new and more realistic task that seeks to identify emotion-cause pairs in documents. However, previous models are inherently limited by solving this task in two stages. To this end, we propose an end-to-end model that regards the problem as predicting directional links between emotions and causes via biaffine attention. Additionally, we aid the model with auxiliary tasks and a position weight matrix. Experimental results demonstrate the superiority of our model over the baselines.

Based on the work in this paper, we believe there are some promising directions yet to be explored. On the one hand, more sophisticated models such as graph neural networks could be developed to incorporate learned position information instead of manually refined position information. On the other hand, the label imbalance issue should be addressed with task-specific tactics.


  1. If not specified, we use notations in bold as the vector representations of their original concepts.
  2. Code will be available soon after the anonymity period.


References

  1. D. Bahdanau, K. Cho and Y. Bengio (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  2. Y. Bengio, R. Ducharme, P. Vincent and C. Jauvin (2003) A neural probabilistic language model. Journal of Machine Learning Research 3 (Feb), pp. 1137–1155.
  3. Y. Chen, W. Hou, X. Cheng and S. Li (2018) Joint learning for emotion classification and emotion cause detection. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 646–651.
  4. Y. Chen, S. Y. M. Lee, S. Li and C. Huang (2010) Emotion cause detection with linguistic constructions. In Proceedings of the 23rd International Conference on Computational Linguistics, pp. 179–187.
  5. X. Cheng, Y. Chen, B. Cheng, S. Li and G. Zhou (2017) An emotion cause corpus for Chinese microblogs with multiple-user structures. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 17 (1), pp. 6.
  6. T. Dozat and C. D. Manning (2016) Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734.
  7. L. Gui, J. Hu, Y. He, R. Xu, L. Qin and J. Du (2017) A question answering approach for emotion cause extraction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1593–1602.
  8. L. Gui, D. Wu, R. Xu, Q. Lu and Y. Zhou (2016) Event-driven emotion cause extraction with corpus construction. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1639–1649.
  9. L. Gui, L. Yuan, R. Xu, B. Liu, Q. Lu and Y. Zhou (2014) Emotion cause detection with linguistic construction in Chinese Weibo text. In Natural Language Processing and Chinese Computing, pp. 457–464.
  10. K. He, X. Zhang, S. Ren and J. Sun (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034.
  11. S. Hochreiter and J. Schmidhuber (1997) Long short-term memory. Neural Computation 9 (8), pp. 1735–1780.
  12. Y. Kim (2014) Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751.
  13. J. D. Lafferty, A. McCallum and F. C. Pereira (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289.
  14. S. Y. M. Lee, Y. Chen, C. Huang and S. Li (2013) Detecting emotion causes with a linguistic rule-based approach. Computational Intelligence 29 (3), pp. 390–416.
  15. S. Y. M. Lee, Y. Chen and C. Huang (2010) A text-driven rule-based system for emotion cause detection. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pp. 45–53.
  16. X. Li, K. Song, S. Feng, D. Wang and Y. Zhang (2018) A co-attention neural network model for emotion cause analysis with emotional context awareness. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4752–4757.
  17. T. Mikolov, K. Chen, G. Corrado and J. Dean (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  18. M. Schlichtkrull, T. N. Kipf, P. Bloem, R. Van Den Berg, I. Titov and M. Welling (2018) Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pp. 593–607.
  19. S. Sukhbaatar, J. Weston and R. Fergus (2015) End-to-end memory networks. In Advances in Neural Information Processing Systems, pp. 2440–2448.
  20. R. Xia and Z. Ding (2019) Emotion-cause pair extraction: a new task to emotion analysis in texts. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp. 1003–1012.
  21. C. Zhang, Q. Li and D. Song (2019) Syntax-aware aspect-level sentiment classification with proximity-weighted convolution network. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1145–1148.