KnowBias: Detecting Political Polarity in Long Text Content
We introduce a classification scheme for detecting political bias in long text content such as newspaper opinion articles. Obtaining long text data and annotations at sufficient scale for training is difficult, but it is relatively easy to extract political polarity from tweets through their authorship; as such, we train on tweets and perform inference on articles. Universal sentence encoders and other existing methods that aim to address this domain-adaptation scenario deliver inaccurate and inconsistent predictions on articles, which we show is due to a difference in opinion concentration between tweets and articles. We propose a two-step classification scheme that utilizes a neutral detector trained on tweets to remove neutral sentences from articles in order to align opinion concentration and therefore improve accuracy on that domain. Our implementation is available for public use at https://knowbias.ml.
Introduction
Rising bias in news media, along with the formation of filter bubbles on social media, where content with the same political slant is repeatedly shared, has contributed to severe partisanship in the American political environment in recent years [6, 3]. We aim to increase awareness of this heightened polarization by alerting users to the political bias in the content they consume.
In this work, we discuss an NLP-based approach that predicts political bias on long text such as news articles independent of metadata such as content origin or authorship. Annotating polarity on long documents at sufficient scale for training is infeasible since doing so requires that humans read each article and manually determine polarity. On the other hand, tweets can be easily gathered in high volume and can be annotated based on authorship.
We envision an approach that transfers knowledge learned from tweets to long text at test time. While previous work has analyzed tweets for political sentiment [2], there is no research on domain adaptation from short to long documents in this context. There has also been research on filtering text in order to derive justifiable predictions [4], but not as a means of domain adaptation for our target problem. Universal sentence encoders [1] provide good text representations regardless of target task, so one would expect a classifier trained on these representations to perform well on all text; however, this approach delivers inaccurate and inconsistent predictions on articles.
We show that this poor performance is due to the presence of neutral, apolitical sentences in articles, which dilute opinion concentration relative to tweets. Our proposed method alleviates this issue by using a neutral detector trained on tweets to remove neutral sentences before predicting bias, improving prediction accuracy and consistency. This paper summarizes the work in arXiv:1905.00724 (2019).
Predicting Polarity in Text Content
Data collection. We train on political tweets because, as noted above, they are easy to collect and annotate at scale, and we aim to transfer this knowledge to longer articles. Our polarity data consisted of roughly 150,000 tweets from the verified Twitter accounts of 28 politicians or media personalities across the political spectrum; 80% of these samples were used for training and 20% for testing. We also sampled roughly 80,000 neutral tweets from the general Twitter stream in order to train the neutral detector.
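The authorship-based annotation and the 80/20 split described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the collection code used for the paper: the author handles, the `left`/`right` label names, and the helper names `label_by_authorship` and `train_test_split` are all hypothetical.

```python
import random

# Hypothetical author -> polarity mapping; handles and label names are illustrative only.
AUTHOR_POLARITY = {"author_a": "left", "author_b": "right"}


def label_by_authorship(tweets, author_polarity):
    """Give each (author, text) tweet the polarity label of its author.

    Tweets from authors missing in the mapping are dropped.
    """
    return [
        (text, author_polarity[author])
        for author, text in tweets
        if author in author_polarity
    ]


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and hold out `test_fraction` of the samples for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    tweets = [("author_a", f"tweet {i}") for i in range(5)]
    tweets += [("author_b", f"tweet {i}") for i in range(5, 10)]
    labeled = label_by_authorship(tweets, AUTHOR_POLARITY)
    train, test = train_test_split(labeled)
    print(len(train), len(test))
```

Labeling by author requires no manual reading of individual tweets, which is what makes annotation at this scale practical.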
Baseline approach. We use a sentence embedding model to convert tweets into high-dimensional vectors that preserve semantic meaning. We specifically use the Google Universal Sentence Encoder [1], as it offers good semantic representations regardless of the target task. We trained a deep neural network with two hidden layers on these sentence embeddings. While the classifier reached 83% accuracy on the Twitter test set, it performed poorly on long-form articles, producing inconsistent and inaccurate predictions.
Opinion concentration. We note that a primary stylistic difference between tweets and long-form articles is the presence of neutral, apolitical sentences in the latter. These sentences help an article's flow and cohesion, but they also dilute the concentration of opinion relative to tweets. We hypothesize that this difference in opinion concentration is responsible for the poor performance on long-form articles. We test this hypothesis by sampling neutral, apolitical sentences from the general Twitter stream and mixing them into the political test data. As demonstrated in Figure 2, accuracy decreases noticeably as more neutral sentences are added.
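The dilution experiment can be illustrated with a short sketch. Here we assume, purely for illustration, that opinion concentration is measured as the fraction of sentences in a text that carry opinion; the paper does not give a formal definition, and the function names `dilute` and `opinion_concentration` as well as the toy `is_opinion` predicate are hypothetical.

```python
import random


def dilute(political, neutral_pool, n_neutral, seed=0):
    """Insert `n_neutral` sentences drawn from `neutral_pool` at random positions.

    The relative order of the political sentences is preserved.
    """
    rng = random.Random(seed)
    mixed = list(political)
    for sentence in rng.sample(neutral_pool, n_neutral):
        mixed.insert(rng.randint(0, len(mixed)), sentence)
    return mixed


def opinion_concentration(sentences, is_opinion):
    """Fraction of sentences flagged as opinionated (an assumed, simplified measure)."""
    if not sentences:
        return 0.0
    return sum(1 for s in sentences if is_opinion(s)) / len(sentences)


if __name__ == "__main__":
    political = ["POLITICAL: sentence one.", "POLITICAL: sentence two."]
    neutral_pool = [f"Neutral filler sentence {i}." for i in range(6)]

    def is_opinion(sentence):
        # Toy predicate standing in for a real opinion detector.
        return sentence.startswith("POLITICAL")

    for n in (0, 2, 6):
        mixed = dilute(political, neutral_pool, n)
        print(n, opinion_concentration(mixed, is_opinion))
```

Under this measure, a text made only of political sentences has concentration 1.0, and each added neutral sentence lowers it, which mirrors the difference between tweets and articles.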
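Table 1: Accuracy and Spearman ρ of the one-step baseline and the two-step method.
|Dataset - Metric||One-step baseline||Two-step|
|Twitter Political - Acc.||82.27%||82.42%|
|Twitter Crowdsourced - Acc.||86.00%||86.00%|
|Twitter Crowdsourced - ρ||0.65||0.65|
|Articles Crowdsourced - Acc.||66.67%||75.00%|
|Articles Crowdsourced - ρ||0.52||0.69|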
Neutral detector. Having identified the dilution of opinion concentration as the cause of the accuracy degradation on long-form articles, we add a classifier that detects and removes neutral sentences. We train a second deep neural network on the sentence embeddings of the 80,000 tweets sampled from the general Twitter stream together with the political samples, obtaining 95.63% accuracy.
Two-step classification scheme. We propose a two-step classification scheme, illustrated in Figure 1, to improve prediction quality on long-form articles. Any text passed to the system for inference is first split into individual sentences. The neutral detector is then applied to each sentence, and all sentences it marks as neutral are removed. Finally, we fuse the remaining sentences back together, which aligns their opinion concentration with that of tweets, and use the baseline classifier to predict polarity.
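The control flow of the scheme can be sketched in a few lines. The sketch below is not the released implementation: the regex sentence splitter is a naive stand-in for a proper tokenizer, `is_neutral` and `predict_polarity` are hypothetical callables standing in for the two trained networks, and returning `None` when every sentence is neutral is our own assumption.

```python
import re
from typing import Callable, List, Optional


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def two_step_predict(
    text: str,
    is_neutral: Callable[[str], bool],
    predict_polarity: Callable[[str], str],
) -> Optional[str]:
    """Step 1: drop sentences flagged neutral. Step 2: classify the fused remainder."""
    sentences = split_sentences(text)
    opinionated = [s for s in sentences if not is_neutral(s)]
    if not opinionated:
        # Assumed behavior: no political content was found, so no polarity is reported.
        return None
    fused = " ".join(opinionated)
    return predict_polarity(fused)


if __name__ == "__main__":
    # Toy keyword rules standing in for the neutral detector and the polarity classifier.
    def toy_is_neutral(sentence: str) -> bool:
        return "tax" not in sentence.lower()

    def toy_predict_polarity(fused: str) -> str:
        return "right" if "cut" in fused.lower() else "left"

    article = (
        "The weather was mild on Tuesday. We must cut taxes now! "
        "The meeting began at noon."
    )
    print(two_step_predict(article, toy_is_neutral, toy_predict_polarity))
```

Passing the two models in as callables keeps the filtering logic independent of any particular embedding model or network architecture.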
Experiments
Datasets. We tested our approach on several datasets. The first, Twitter Political, is the held-out 20% split of the political tweet data, consisting of 20,000 samples labeled by authorship. We also selected a separate set of 50 tweets and 24 articles, for which we collected crowdsourced annotations from 79 respondents.
Accuracy. On the long-form articles, the two-step method increased accuracy from 66.67% to 75.00%. We did not expect accuracy on the Twitter datasets to change substantially with the two-step method, since tweets contain few neutral sentences and their opinion concentration therefore stays the same. The results confirm this, showing no significant change in accuracy.
Spearman-rho. To verify prediction consistency, we computed the Spearman rank correlation coefficient ρ [5] between the predictions and the crowd opinions. Table 1 shows that on articles the proposed system (ρ = 0.69) agrees far more closely with the crowdsourced annotations than the baseline one-step method (ρ = 0.52).
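For reference, Spearman's ρ is the Pearson correlation of the two rank vectors. The sketch below is a minimal pure-Python version that assigns average ranks to ties; the function names are ours, and it is not the evaluation code used to produce the reported values.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    model_scores = [0.9, 0.1, 0.4, 0.7, 0.3]  # hypothetical model outputs
    crowd_scores = [0.8, 0.2, 0.3, 0.9, 0.1]  # hypothetical crowd averages
    print(spearman_rho(model_scores, crowd_scores))
```

Because only ranks enter the computation, ρ measures whether the system orders texts the same way the crowd does, regardless of the scale of the raw scores.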
Conclusions & Future Work
We introduced a two-step classification method to detect polarity in text content without the use of metadata or user details. By equalizing opinion concentration with a neutral detector that removes apolitical sentences, our method performs well on both tweets and long-form articles. Future work may explore the problem of time shift, where predictions fail to reflect positions on new issues because the training data is stale. This underscores the need for continuous model updates.
- [1] (2018-03) Universal Sentence Encoder. arXiv e-prints, arXiv:1803.11175.
- [2] (2019-04) Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings. arXiv e-prints, arXiv:1904.01596.
- [3] (2018) This is what filter bubbles actually look like. MIT Technology Review.
- [4] (2016) Rationalizing neural predictions. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 107–117.
- [5] (2015) Spearman rank correlation. Handbook of Biological Statistics.
- [6] (2010-03) Political bias in the news media. Southeast Missouri State University.
Acknowledgement The author thanks Prof. Kai-Wei Chang (UCLA) for comments and suggestions in writing this paper.