Towards Supervised Extractive Text Summarization via RNN-based Sequence Classification
This article briefly explains our submitted approach to the DocEng’19 competition on extractive summarization. We implemented a recurrent neural network based model that learns to classify whether an article’s sentence belongs to the corresponding extractive summary or not. We bypass the lack of large annotated news corpora for extractive summarization by generating extractive summaries from abstractive ones, which are available from the CNN corpus.
Keywords: neural networks, extractive text summarization
The DocEng '19 competition focused on automatic extractive text summarization. Participants were provided with a corpus of 50 news articles from the CNN-corpus, together with corresponding extractive and abstractive summaries, intended for training and testing a summarization system. The gold-standard summaries contained around 10% of the original text, with a minimum of 3 sentences. After submission, the methods were tested on a larger test set of 1,000 articles randomly chosen from the CNN-corpus. The limited amount of training data was one of the major challenges of this competition: it prevented any deep learning approach from being successful unless an external corpus was incorporated into the training set.
Our work is based on the SummaRuNNer model. It consists of a two-layer bi-directional Gated Recurrent Unit (GRU) Recurrent Neural Network (RNN) that treats summarization as a binary sequence classification problem: each sentence is classified sequentially as to whether or not it should be included in the summary. However, we introduced two modifications to the original SummaRuNNer architecture, leading to better results while reducing complexity:
Our model operates directly at the sentence level (instead of at the word level within each sentence). We compute sentence vector representations by means of the Flair library (https://github.com/zalandoresearch/flair). These sentence embeddings substitute the bottom layer of the SummaRuNNer architecture.
We do not consider the position of each sentence (absolute or relative) in the logistic layer.
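The two modifications above can be sketched as a small PyTorch model: a two-layer bi-directional GRU running over a sequence of precomputed sentence embeddings (e.g. from Flair), followed by a position-free logistic layer. Class name, embedding dimension, and hidden size below are illustrative assumptions, not the exact configuration used in our submission.

```python
# Hedged sketch of the modified SummaRuNNer-style classifier: the input is
# a sequence of precomputed sentence embeddings, the output is one inclusion
# probability per sentence. EMB_DIM and HIDDEN are illustrative sizes.
import torch
import torch.nn as nn

EMB_DIM, HIDDEN = 128, 64  # assumed dimensions for illustration


class SentenceTagger(nn.Module):
    def __init__(self, emb_dim=EMB_DIM, hidden=HIDDEN):
        super().__init__()
        # two-layer bi-directional GRU over the sequence of sentence vectors
        self.gru = nn.GRU(emb_dim, hidden, num_layers=2,
                          bidirectional=True, batch_first=True)
        # logistic layer; no absolute/relative position features are used
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, sent_embs):        # (batch, n_sents, emb_dim)
        states, _ = self.gru(sent_embs)  # (batch, n_sents, 2 * hidden)
        # per-sentence probability of belonging to the extractive summary
        return torch.sigmoid(self.out(states)).squeeze(-1)


model = SentenceTagger()
probs = model(torch.randn(1, 10, EMB_DIM))  # one article with 10 sentences
```

At inference time, a summary is obtained by selecting the sentences with the highest predicted probabilities up to the target summary length.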
The resulting architecture is displayed in Figure 1. Our code to generate extractive summaries according to the instructions established for the competition is publicly available at https://jira.iais.fraunhofer.de/stash/users/dbiesner/repos/doceng2019_fraunhofer.
In contrast to the original SummaRuNNer training setup, we trained our model only on the CNN articles from the CNN/Daily Mail corpus. Due to the limited number of provided news articles, we automatically annotated a large corpus of CNN articles for which an abstractive summary was available. Following a similar approach, we calculated the ROUGE-1 F1 score between each sentence and its article's abstractive summary. Finally, for each article, we sorted the sentences by ROUGE-1 F1 score and picked the top-scoring sentences as the extractive summary.
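The automatic labeling step can be sketched in a few lines of Python: score every article sentence by ROUGE-1 F1 against the abstractive summary and mark the top-scoring sentences as summary sentences. Whitespace tokenization, the 10% ratio, and the 3-sentence minimum below mirror the competition's summary-length convention but are simplifying assumptions on our side.

```python
# Sketch of generating extractive labels from an abstractive summary.
from collections import Counter


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1 with naive whitespace tokenization (a simplification)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def label_sentences(sentences, abstractive_summary, ratio=0.1, min_sents=3):
    """Return one 0/1 label per sentence: 1 = include in extractive summary."""
    scores = [rouge1_f1(s, abstractive_summary) for s in sentences]
    k = max(min_sents, round(ratio * len(sentences)))
    top = set(sorted(range(len(sentences)),
                     key=lambda i: scores[i], reverse=True)[:k])
    return [1 if i in top else 0 for i in range(len(sentences))]


sents = ["The cat sat on the mat.",
         "Stocks fell sharply on Monday.",
         "Markets declined as stocks fell.",
         "Totally unrelated filler text here.",
         "Monday saw a sharp market decline."]
labels = label_sentences(sents, "Stocks fell and markets declined on Monday.")
```

The resulting 0/1 labels serve as training targets for the binary sequence classifier.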
| Sentence matching gold standard | 0.375 | 0.357 | 0.358 |
We evaluated our model on the provided labeled CNN news articles with three different metrics: the proportion of sentences from the generated summary matching the gold-standard summary, ROUGE-1, and ROUGE-2. The scores achieved by our trained model after 20 epochs are displayed in Table 1.
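The sentence-matching metric can be sketched as precision, recall, and F1 over exact sentence overlap between the generated and the gold extractive summary. This is a hedged reading of the metric; the competition's official matching procedure may differ in detail.

```python
# Sketch of a sentence-matching evaluation: precision/recall/F1 over exact
# sentence overlap between a generated and a gold extractive summary.
def sentence_match_scores(generated, gold):
    gen, ref = set(generated), set(gold)
    hits = len(gen & ref)  # sentences shared by both summaries
    precision = hits / len(gen) if gen else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1


# Toy example: 2 of 4 generated sentences appear in the 3-sentence gold summary.
p, r, f1 = sentence_match_scores(["s1", "s2", "s3", "s4"], ["s1", "s2", "s5"])
```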
Our approach achieved the second-best performance among the compared methods in the competition, although the F1-score difference between the two top approaches is not statistically significant. Additionally, the performance of these approaches is hardly better than that of some of the "traditional algorithms" presented as baselines, which are much simpler than ours. Moreover, the real value of the different approaches across the various use cases of automatic text summarization cannot be captured by the current evaluation, since the valuable properties of a summary vary depending on the use case. For instance, coherence is important if the summary will be read by an end user, but not if the summary is "just" a preprocessing step within an indexing pipeline. Therefore, it would be interesting to assess the different techniques on several downstream tasks to obtain a better overview of which algorithms are most suitable for which setting.
- (2018) Contextual string embeddings for sequence labeling. In COLING 2018, 27th International Conference on Computational Linguistics, pp. 1638–1649.
- (2015) Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems (NIPS).
- (2019) DocEng'19 competition on extractive text summarization. In Proceedings of the ACM Symposium on Document Engineering 2019 (DocEng '19), New York, NY, USA, pp. 4:1–4:2.
- (2019) The CNN-corpus: a large textual corpus for single-document extractive summarization. In Proceedings of the ACM Symposium on Document Engineering 2019 (DocEng '19), New York, NY, USA, pp. 16:1–16:10.
- (2017) SummaRuNNer: a recurrent neural network based sequence model for extractive summarization of documents. In Thirty-First AAAI Conference on Artificial Intelligence.