Prediction Uncertainty Estimation for Hate Speech Classification

Kristian Miok¹, Dong Nguyen-Doan¹, Blaž Škrlj², Daniela Zaharie¹, and Marko Robnik-Šikonja³

¹ Computer Science Department, West University of Timisoara, Bulevardul Vasile Pârvan 4, 300223 Timișoara, Romania
{kristian.miok,dong.nguyen10,daniela.zaharie}@e-uvt.ro
² Jožef Stefan Institute and Jožef Stefan International Postgraduate School, Jamova 39, 1000 Ljubljana, Slovenia
blaz.skrlj@ijs.si
³ Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000 Ljubljana, Slovenia
marko.robnik@fri.uni-lj.si
Abstract

As a result of the increasing popularity of social networks, the hate speech phenomenon has grown significantly in recent years. Due to its harmful effect on minority groups as well as on large communities, there is a pressing need for hate speech detection and filtering. However, automatic approaches must not jeopardize free speech, so they should accompany their decisions with explanations and an assessment of uncertainty. Thus, there is a need for predictive machine learning models that not only detect hate speech but also help users understand when texts cross the line and become unacceptable.

The reliability of predictions is usually not addressed in text classification. We fill this gap by proposing an adaptation of deep neural networks that can efficiently estimate prediction uncertainty. To reliably detect hate speech, we use Monte Carlo dropout regularization, which mimics Bayesian inference within neural networks. We evaluate our approach using different text embedding methods and visualize the results with a novel technique that aids in understanding classification reliability and errors.

Keywords:
prediction uncertainty estimation, hate speech classification, Monte Carlo dropout method, visualization of classification errors

1 Introduction

Hate speech is written or oral communication that in any way discredits a person or a group based on characteristics such as race, color, ethnicity, gender, sexual orientation, nationality, or religion [35]. Hate speech targets disadvantaged social groups and harms them both directly and indirectly [33]. Social networks like Twitter and Facebook, where hate speech frequently occurs, receive much criticism for not doing enough to deal with it. As the connection between hate speech and actual hate crimes is strong [4], the importance of detecting and managing hate speech is unquestionable. Early identification of users who promote such communication can prevent an escalation from speech to action. However, automatic hate speech detection is difficult, especially when the text does not contain explicit hate speech keywords. Lexical detection methods tend to have low precision because they do not take into account the contextual information the messages carry [11]. Recently introduced contextual word and sentence embedding methods capture semantic and syntactic relations among words and improve prediction accuracy.

Recent work on combining probabilistic Bayesian inference and neural network methodology has attracted much attention in the scientific community [23]. The main reason is the ability of probabilistic neural networks to quantify the trustworthiness of their predictions. This information can be important, especially in tasks where decision making plays an important role [22]. Text classification tasks that trigger specific actions can significantly benefit from prediction uncertainty estimation. Hate speech detection is an example of a task where reliable results are needed to remove harmful contents and possibly ban malicious users without curtailing the freedom of speech. To assess the uncertainty of the predicted values, neural networks require a Bayesian framework. On the other hand, Srivastava et al. [32] proposed a regularization approach, called dropout, which has a considerable impact on the generalization ability of neural networks. The approach drops randomly selected nodes from the neural network during the training process. Dropout increases the robustness of networks and prevents overfitting. Different variants of dropout have improved classification results in various areas [1]. Gal and Ghahramani [15] exploited the interpretation of dropout as a Bayesian approximation and proposed the Monte Carlo dropout (MCD) approach to estimate prediction uncertainty. In this paper, we analyze the applicability of Monte Carlo dropout for assessing predictive uncertainty.

Our main goal is to accurately and reliably classify different forms of text as hate or non-hate speech, giving a probabilistic assessment of the prediction uncertainty in a comprehensible visual form. We also investigate the ability of deep neural network methods to provide good prediction accuracy on small textual data sets. The outline of the proposed methodology is presented in Figure 1.

Figure 1: The diagram of the proposed methodology.

Our main contributions are:

  • investigation of prediction uncertainty assessment in the area of text classification,

  • implementation of hate speech detection with reliability output,

  • evaluation of different contextual embedding approaches in the area of hate speech,

  • a novel visualization of prediction uncertainty and errors of classification models.

The paper consists of six sections. In Section 2, we present related works on hate speech detection, prediction uncertainty assessment in text classification context, and visualization of uncertainty. In Section 3, we propose the methodology for uncertainty assessment using dropout within neural network models, as well as our novel visualization of prediction uncertainty. Section 4 presents the data sets and the experimental scenario. We discuss the obtained results in Section 5 and present conclusions and ideas for further work in Section 6.

2 Related Work

We briefly present the related work in the three areas which constitute the core of our approach: hate speech detection, recurrent neural networks with Monte Carlo dropout for the assessment of prediction uncertainty in text classification, and visualization of predictive uncertainty.

2.1 Hate Speech Detection

Techniques used for hate speech detection are mostly based on supervised learning. The most frequently used classifier is the Support Vector Machine (SVM) [30]. Recently, deep neural networks, especially recurrent neural network language models [20], have become very popular. Recent studies compare (deep) neural networks [28, 9, 12] with classical machine learning methods.

Our experiments investigate embeddings and neural network architectures that can achieve superior predictive performance to SVM or logistic regression models. More specifically, our interest is to explore the performance of MCD neural networks applied to the hate speech detection task.

2.2 Prediction Uncertainty in Text Classification

Recurrent neural networks (RNNs) are a popular choice in text mining. The dropout technique was first introduced to RNNs in 2013 [34], but further research revealed a negative impact of dropout in RNNs, especially within language modeling. For example, dropout in RNNs employed on a handwriting recognition task disrupted the ability of recurrent layers to effectively model sequences [25]. Dropout was successfully applied to language modeling by [36], who applied it only to fully connected layers. Their then state-of-the-art results were explained by the fact that, with dropout, much deeper neural networks can be constructed without the danger of overfitting. Gal and Ghahramani [14] implemented a variational inference based dropout which can also regularize recurrent layers, and additionally provided a solution for dropout within word embeddings. The method mimics Bayesian inference by combining probabilistic parameter interpretation and deep RNNs. The authors introduced the idea of augmenting probabilistic RNN models with prediction uncertainty estimation. Recent works further investigate how to estimate prediction uncertainty within different data frameworks using RNNs [37]. Some of the first investigations of the probabilistic properties of SVM predictions are described in the work of Platt [26], and the application of the Bayes by Backprop (BBB) method to RNNs was studied by [13].

Our work combines the existing MCD methodology with the latest contextual embedding techniques and applies them to the hate speech classification task. The aim is to obtain high quality predictions coupled with reliability scores as a means to understand the circumstances of hate speech.

2.3 Prediction Uncertainty Visualization in Text Classification

Visualizations help humans make decisions, e.g., select a driving route, evacuate before a hurricane strikes, or identify optimal methods for allocating business resources. One of the first attempts to obtain and visualize the latent space of predicted outcomes was the work of Berger et al. [2]. Prediction values were also visualized in geo-spatial research on hurricane tracks [10, 29]. The importance of visualization for prediction uncertainty estimation in the context of decision making was discussed in [18, 17].

We are not aware of any work on prediction uncertainty visualization for text classification or hate speech detection. We present a visualization of tweets in a two-dimensional latent space that can reveal relationships between the analyzed texts.

3 Deep Learning with Uncertainty Assessment

Deep learning has received significant attention in NLP and other machine learning applications. However, standard deep neural networks do not provide information on the reliability of their predictions. Bayesian neural network (BNN) methodology can overcome this issue through a probabilistic interpretation of model parameters. Apart from prediction uncertainty estimation, BNNs offer robustness to overfitting and can be efficiently trained on small data sets [16]. However, neural networks that apply Bayesian inference can be computationally expensive, especially ones with complex, deep architectures. Our work is based on the Monte Carlo dropout (MCD) method proposed by [15]. The idea of this approach is to capture prediction uncertainty using dropout as a regularization technique.

In contrast to classical RNNs, Long Short-Term Memory (LSTM) neural networks introduce additional gates within the neural units. For a specific instance, two sources of information flow through all the gates: the input values $x_t$ and the recurrent values $h_{t-1}$ coming from the previous time step. Initial attempts to introduce dropout within the recurrent connections were not successful, reporting that dropout breaks the correlation among the input values. Gal and Ghahramani [14] solve this issue using a predefined dropout mask which is the same at each time step. This makes it possible to perform dropout during each forward pass through the LSTM network and to estimate the whole distribution for each of the parameters $\omega$. The parameters' posterior distribution approximated with such a network structure, $q(\omega) \approx p(\omega \mid X, Y)$, is used in constructing the posterior predictive distribution of a new instance $x^*$:

$$p(y^* \mid x^*, X, Y) = \int p(y^* \mid x^*, \omega)\, q(\omega)\, d\omega \qquad (1)$$

where $p(y^* \mid x^*, \omega)$ denotes the likelihood function. In regression tasks, this probability is summarized by reporting the means and standard deviations, while for classification tasks the mean probability of class $c$ is calculated as:

$$p(y^* = c \mid x^*, X, Y) \approx \frac{1}{T} \sum_{t=1}^{T} \mathrm{softmax}\left(f^{\hat{\omega}_t}(x^*)\right) \qquad (2)$$

where $\hat{\omega}_t \sim q(\omega)$ are the parameter samples produced by $T$ stochastic dropout passes through the network. Thus, the information collected in dropout passes through the network during the training phase is used in the testing phase to generate (sample) predicted values for each test instance. The benefit of such results is not only more accurate prediction estimates but also the possibility to visualize the test instances within the generated outcome space.
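To make the sampling procedure concrete, here is a minimal sketch of MCD inference in Keras/TensorFlow 2 (our illustration, not the authors' released code): dropout is kept active at prediction time by calling the model with training=True, and Eq. (2) is approximated by averaging T stochastic forward passes. The architecture and input dimensions are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Hypothetical binary classifier with a dropout layer; any Keras model with
# Dropout (or an LSTM with recurrent_dropout) can be sampled the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

def mc_dropout_predict(model, x, T=100):
    """Perform T stochastic forward passes with dropout active
    (training=True) and return all samples plus their mean and std."""
    samples = np.stack([model(x, training=True).numpy().ravel()
                        for _ in range(T)])
    return samples, samples.mean(axis=0), samples.std(axis=0)

# x_test: illustrative stand-in for embedded test documents.
x_test = np.random.rand(5, 512).astype("float32")
samples, mean_p, std_p = mc_dropout_predict(model, x_test)
print(mean_p, std_p)  # mean approximates Eq. (2); std quantifies uncertainty
```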

3.1 Prediction Uncertainty Visualization

For each test instance, the neural network outputs a vector of probability estimates corresponding to the samples generated through Monte Carlo dropout. This creates an opportunity to visualize the variability of individual predictions. With the proposed visualization, we show the correctness and reliability of individual predictions, including false positive results that can be just as informative as correctly predicted ones. The creation of visualizations consists of the following five steps, elaborated below.

  1. Projection of the vector of probability estimates into a two dimensional vector space.

  2. Point coloring according to the mean probabilities computed by the network.

  3. Determining point shapes based on correctness of individual predictions (four possible shapes).

  4. Labeling points with respect to individual documents.

  5. Kernel density estimation of the projected space — this step attempts to summarize the instance-level samples obtained by the MCD neural network.

As the MCD neural network produces hundreds of probability samples for each target instance, it is not feasible to directly visualize such a multi-dimensional space. To solve this, we leverage the recently introduced UMAP algorithm [19], which projects the high-dimensional input data into a d-dimensional representation (in our case d = 2) using computational insights from manifold theory. The result of this step is a matrix with two columns, each representing a latent dimension into which the input samples were projected, and with one row per text document.

In the next step, we overlay the obtained representation with other relevant information, obtained during sampling. Individual points (documents) are assigned the mean probabilities of samples, thus representing the reliability of individual predictions. We discretize the probability interval into four bins of equal size for readability purposes. Next, we shape individual points according to the correctness of predictions. We take into account four possible outcomes (TP - true positives, FP - false positives, TN - true negatives, FN - false negatives).

As the obtained two dimensional projection represents an approximation of the initial sample space, we compute the kernel density estimation in this subspace and thereby outline the main neural network’s predictions. We use two dimensional Gaussian kernels for this task.

The obtained estimations are plotted alongside individual predictions and represent densities of the neural network’s focus, which can be inspected from the point of view of correctness and reliability.
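As an illustration, the following sketch implements the five steps under stated assumptions: the samples matrix of MCD probabilities, the outcomes labels, and all parameter choices are hypothetical stand-ins rather than the exact implementation used for Figure 2.

```python
import numpy as np
import matplotlib.pyplot as plt
import umap                          # pip install umap-learn
from scipy.stats import gaussian_kde

# Hypothetical inputs: 'samples' holds T = 100 MCD probability samples for
# each of 40 documents; 'outcomes' marks each prediction as TP/FP/TN/FN.
rng = np.random.default_rng(0)
samples = rng.beta(2, 5, size=(40, 100))
outcomes = rng.choice(["TP", "FP", "TN", "FN"], size=40)

# Step 1: project the T-dimensional sample space into two dimensions.
xy = umap.UMAP(n_components=2, random_state=0).fit_transform(samples)

# Step 5: Gaussian kernel density estimate over the projected space.
gx, gy = np.mgrid[xy[:, 0].min():xy[:, 0].max():100j,
                  xy[:, 1].min():xy[:, 1].max():100j]
density = gaussian_kde(xy.T)(np.vstack([gx.ravel(), gy.ravel()]))
plt.contour(gx, gy, density.reshape(gx.shape), colors="gray")

# Steps 2-4: color by mean probability, shape by outcome, label by index.
markers = {"TP": "o", "TN": "x", "FP": "s", "FN": "+"}
for outcome, marker in markers.items():
    sel = outcomes == outcome
    plt.scatter(xy[sel, 0], xy[sel, 1], c=samples[sel].mean(axis=1),
                marker=marker, cmap="coolwarm", vmin=0, vmax=1, label=outcome)
for i, (px, py) in enumerate(xy):
    plt.annotate(str(i), (px, py), fontsize=7)
plt.colorbar(label="mean P(hate speech)")
plt.legend()
plt.show()
```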

4 Experimental Setting

We first present the data sets used for the evaluation of the proposed approach, followed by the experimental scenario. The results are presented in Section 5.

4.1 Hate Speech Data Sets

We use three data sets related to hate speech.

4.1.1 Data Set 1 - HatEval

This data set is taken from the SemEval task "Multilingual detection of hate speech against immigrants and women in Twitter (hatEval)" (https://competitions.codalab.org/competitions/19935). The competition was organized for two languages, Spanish and English; we only processed the English data set. The data set consists of 100 tweets, labeled as 1 (hate speech) or 0 (not hate speech).

4.1.2 Data Set 2 - YouToxic

This data set is manually labeled text toxicity data, originally containing 1,000 comments crawled from YouTube videos about the Ferguson unrest in 2014 (https://zenodo.org/record/2586669#.XJiS8ChKi70). Apart from the main label describing whether the comment is hate speech, there are several other labels characterizing each comment, e.g., whether it is a threat, provocative, racist, sexist, etc. (not used in our study). There are 138 comments labeled as hate speech and 862 as non-hate speech. We produced a data set of 300 comments using all 138 hate speech comments and 162 randomly sampled non-hate speech comments.

4.1.3 Data Set 3 - OffensiveTweets

This data set (https://github.com/t-davidson/hate-speech-and-offensive-language) originates in a study of hate speech detection and the problem of offensive language [11]. Our data set consists of 3,100 tweets: we took the 1,430 tweets labeled as hate speech and randomly sampled 1,670 tweets from the remaining collection.

4.1.4 Data Preprocessing

Social media texts use specific language and contain syntactic and grammatical errors. Hence, in order to get correct and clean text data, we applied several preprocessing techniques, without removing any text documents based on their length. The pipeline for cleaning the data was as follows (a minimal code sketch is given after the list):

  • Noise removal: user names, e-mail addresses, multiple dots, and hyperlinks are considered irrelevant and are removed.

  • Common typos are corrected and typical contractions and hash-tags are expanded.

  • Stop words are removed and the words are lemmatized.
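A minimal sketch of such a pipeline, assuming NLTK resources are available (our illustration; the regular expressions and the contraction list are hypothetical simplifications, not the exact rules used):

```python
import re
from nltk.corpus import stopwords        # requires nltk.download("stopwords")
from nltk.stem import WordNetLemmatizer  # requires nltk.download("wordnet")

CONTRACTIONS = {"don't": "do not", "can't": "cannot"}  # illustrative subset
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(text: str) -> str:
    # Noise removal: user names, e-mail addresses, hyperlinks, multiple dots.
    text = re.sub(r"@\w+|\S+@\S+|https?://\S+|\.{2,}", " ", text.lower())
    # Expand typical contractions and hashtags (hypothetical rules).
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"#(\w+)", r"\1", text)  # '#refugees' -> 'refugees'
    # Remove stop words and lemmatize the remaining tokens.
    tokens = [lemmatizer.lemmatize(tok) for tok in re.findall(r"[a-z']+", text)
              if tok not in stop_words]
    return " ".join(tokens)

print(preprocess("@user Refugees don't belong here... #refugees http://x.y"))
```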

4.2 Experimental Scenario

We use logistic regression (LR) and Support Vector Machines (SVM) from the scikit-learn library [5] as the baseline classification models. As a baseline RNN, the LSTM network from the Keras library was applied [8]. Both the LSTM and MCD LSTM networks consist of an embedding layer, an LSTM layer, and a fully connected layer when used with the word2vec and ELMo embeddings. The embedding layer was not used with the TF-IDF and Universal Sentence Encoder representations.

To tune the parameters of LR (i.e., the liblinear and lbfgs solver functions and a range of values for the number of components) and SVM (i.e., the rbf kernel function and ranges of values for the number of components and the gamma parameter), we utilized the random search approach [3] implemented in scikit-learn. To obtain the best architectures for the LSTM and MCD LSTM models, the number of units, batch size, dropout rates, and other hyper-parameters were fine-tuned.
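A minimal sketch of this tuning with scikit-learn's RandomizedSearchCV follows (our illustration; the concrete search ranges below are assumptions, since the exact values were lost in the source text, and the synthetic data stands in for the document representations):

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the document representations and labels.
X, y = make_classification(n_samples=300, n_features=100, random_state=0)

# Illustrative search spaces; the paper's exact ranges are not reproduced.
lr_search = RandomizedSearchCV(
    LogisticRegression(max_iter=1000),
    {"solver": ["liblinear", "lbfgs"], "C": loguniform(1e-3, 1e3)},
    n_iter=20, cv=5, random_state=0)
svm_search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    {"C": loguniform(1e-3, 1e3), "gamma": loguniform(1e-4, 1e1)},
    n_iter=20, cv=5, random_state=0)

lr_search.fit(X, y)
svm_search.fit(X, y)
print(lr_search.best_params_, svm_search.best_params_)
```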

5 Evaluation and Results

We first describe experiments comparing different word representations, followed by sentence embeddings, and finally the visualization of predictive uncertainty.

5.1 Word Embedding

In the first set of experiments, we represented the text with word embeddings (sparse TF-IDF [31], dense word2vec [21], and contextual ELMo [24]). We use the gensim library [27] for the word2vec model, scikit-learn for TF-IDF, and the pretrained ELMo model from TensorFlow Hub (https://tfhub.dev/google/elmo/2). We compared different classification models using these word embeddings. The results are presented in Table 1.

The architecture of the LSTM and MCD LSTM neural networks contains an embedding layer, an LSTM layer, and a fully connected (dense) layer for the word2vec and ELMo word embeddings. In the LSTM layer, the recurrent dropout is applied to the units in the linear transformation of the recurrent state, and the classical dropout is used for the units in the linear transformation of the inputs. The number of units, the recurrent dropout, and the dropout probabilities for the LSTM layer were obtained by fine-tuning, with different values selected for word2vec and TF-IDF than for ELMo in the MCD LSTM experiments. The search ranges for hyper-parameter tuning are described in Table 2.
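A minimal Keras sketch of such an architecture follows (the vocabulary size, sequence length, and dropout values are illustrative assumptions, not the tuned values). For the MCD LSTM variant, the same network is used, but predictions are obtained with dropout kept active, as in the sampling sketch in Section 3.

```python
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 10000, 100, 50  # illustrative values

# Embedding layer, LSTM layer with classical dropout on the input
# transformation and recurrent dropout on the recurrent state, and a
# fully connected output layer, mirroring the described architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.LSTM(128, dropout=0.3, recurrent_dropout=0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```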

HatEval
Model               | TF-IDF       W2V          ELMo
Logistic Regression | 68.0 [2.4]   54.0 [13.6]  62.0 [6.8]
SVM                 | 63.0 [5.1]   66.0 [3.7]   62.0 [12.9]
LSTM                | 69.0 [7.3]   67.0 [6.8]   66.0 [12.4]
MCD LSTM            | 67.0 [10.8]  69.0 [6.6]   67.0 [9.8]

YouToxic
Model               | TF-IDF       W2V          ELMo
Logistic Regression | 69.3 [3.0]   54.0 [3.0]   76.6 [6.1]
SVM                 | 70.6 [4.2]   55.0 [3.4]   73.3 [5.5]
LSTM                | 66.6 [2.3]   59.3 [4.6]   74.3 [2.7]
MCD LSTM            | 66.0 [3.7]   59.3 [3.8]   75.3 [5.5]

OffensiveTweets
Model               | TF-IDF       W2V          ELMo
Logistic Regression | 77.2 [1.1]   68.0 [2.4]   75.6 [1.2]
SVM                 | 77.0 [0.7]   59.6 [1.5]   73.0 [1.9]
LSTM                | 73.4 [0.8]   75.0 [1.7]   74.7 [1.9]
MCD LSTM            | 71.1 [1.6]   72.0 [1.6]   75.2 [0.9]

Table 1: Comparison of classification accuracy (with standard deviation in brackets) for word embeddings, computed using 5-fold cross-validation. All results are expressed in percentages and the best ones for each data set are in bold.
Name                | Parameter type | Values
Optimizers          | Categorical    | Adam, rmsprop
Batch size          | Discrete       | 4 to 128, step = 4
Activation function | Categorical    | tanh, relu, linear
Number of epochs    | Discrete       | 10 to 100, step = 5
Number of units     | Discrete       | 128, 256, 512, or 1024
Dropout rate        | Float          | 0.1 to 0.8, step = 0.05
Table 2: Hyper-parameters for the LSTM and MCD LSTM models.

The classification accuracy for the HatEval data set is reported in the top part of Table 1. The difference between logistic regression and the two LSTM models indicates an accuracy improvement once the recurrent layers are introduced. On the other hand, as the ELMo embedding already uses LSTM layers to take into account the semantic relationships among words, no notable difference between the logistic regression and LSTM models can be observed with this embedding.

Results for the YouToxic and OffensiveTweets data sets are presented in the middle and bottom parts of Table 1, respectively. Similarly to the HatEval data set, there is a difference between the logistic regression and the two LSTM models using the word2vec embeddings. For all data sets, the results with ELMo embeddings are similar across the four classifiers.

5.2 Sentence Embedding

In the second set of experiments, we compared different classifiers using sentence embeddings [6] as the representation. The top part of Table 3 displays results for HatEval. We can notice improvements in classification accuracy for all classifiers compared to the word embedding representations in Table 1. The best model for this small data set is MCD LSTM. For the larger YouToxic and OffensiveTweets data sets, all the models perform comparably. Apart from the prediction accuracy, the four models were compared using precision, recall, and F1 score [7].
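For completeness, these metrics can be computed with scikit-learn as in the following sketch (the labels and predictions shown are illustrative, not taken from our experiments):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Illustrative gold labels and predictions for one cross-validation fold.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")
print(f"accuracy={accuracy_score(y_true, y_pred):.2f} "
      f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```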

We use the Universal Sentence Encoder module (https://tfhub.dev/google/universal-sentence-encoder-large/3) to encode the data. The architecture of the LSTM and MCD LSTM models contains an LSTM layer and a dense layer. In the MCD LSTM experiments, the number of neurons, the recurrent dropout, and the dropout value for the LSTM layer were selected by fine-tuning; the dense layer has the same number of units as the LSTM layer, with its own tuned dropout rate. The hyper-parameter search ranges used to tune the LSTM and MCD LSTM models are presented in Table 2.
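A sketch of the sentence encoding step (an assumption of one possible implementation: we show a TF2-compatible module version 5 loaded with hub.load, whereas the module cited above is the TF1 version 3):

```python
import tensorflow_hub as hub

# Load a TF2-compatible Universal Sentence Encoder; the version differs from
# the TF1 module cited in the text and is used here only for illustration.
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder-large/5")

sentences = ["burundian refugees should go home!",
             "seriously, amy and cindy are bffs, i know that for sure."]
embeddings = encoder(sentences)   # tensor of shape (2, 512)
print(embeddings.shape)
```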

HatEval
Model    | Accuracy     Precision    Recall       F1
LR       | 66.0 [12.4]  67.3 [15.3]  65.2 [15.9]  65.2 [13.1]
SVM      | 67.0 [12.1]  68.2 [15.2]  65.0 [15.8]  65.8 [13.3]
LSTM     | 70.0 [8.4]   70.8 [11.0]  63.1 [17.5]  66.2 [14.4]
MCD LSTM | 74.0 [10.7]  73.4 [12.7]  78.4 [13.6]  74.9 [10.0]

YouToxic
Model    | Accuracy     Precision    Recall       F1
LR       | 77.3 [4.1]   74.3 [7.3]   77.3 [3.6]   75.7 [5.3]
SVM      | 77.3 [6.2]   72.6 [8.6]   80.7 [7.4]   76.3 [7.6]
LSTM     | 76.6 [8.6]   73.4 [11.2]  79.2 [8.0]   75.8 [8.6]
MCD LSTM | 78.7 [5.8]   74.7 [9.2]   80.9 [6.5]   77.5 [7.4]

OffensiveTweets
Model    | Accuracy     Precision    Recall       F1
LR       | 80.8 [1.0]   79.6 [1.9]   84.9 [1.2]   82.2 [1.1]
SVM      | 80.7 [1.3]   78.6 [2.0]   86.7 [1.0]   82.4 [1.2]
LSTM     | 80.7 [1.6]   82.8 [2.1]   79.7 [2.3]   81.1 [1.5]
MCD LSTM | 81.0 [1.2]   81.5 [1.8]   82.5 [2.7]   81.9 [1.3]

Table 3: Comparison of predictive models using sentence embeddings. We present average classification accuracy, precision, recall, and F1 score (with standard deviations in brackets), computed using 5-fold cross-validation. All results are expressed in percentages and the best accuracies are in bold.

5.3 Visualizing Predictive Uncertainty

In Figure 2 we present a new way of visualizing dependencies among the test tweets. The relations are the result of applying the MCD LSTM network to the HatEval data set. This allows further inspection of the results as well as interpretation of correct and incorrect predictions. To improve the comprehensibility of predictions and errors, each point in the visualization is labeled with a unique identifier, making the point traceable to the original document, given in Table 4.

Figure 2: Visualization of individual predictions along with their probability ranges. The numeric labels beside each point correspond to individual tweets, summarized in Table 4. The size of the points corresponds to the neural network's mean probability for a given prediction. True positives are marked with circles, true negatives with crosses, false positives with squares, and false negatives with pluses.
Index (see Figure 2) — Corresponding text
0 — ok if you fu***** said leave block me. but dm me first i am gonna kick your ass. shut the f*** up you
1 — wow mo c*** got hard. want to pull your panties down, push you against these door and let you feel my c*** deep in your wet p****
2 — no i said hope. i hope you women learn your place! sitdowninthekitchen
3 — so this b**** blocked me but check her pinned tweet out, she is accusing men of rape on tumblr with …
4 — this is not rape, or anything like it. this a crude pass. this is not ’rape culture.’ it is w**** culture.
5 — rape is bad but alota h*** be lying bout that s*** women are crazy dawg & will f*** ya life up be careful
6 — lol. they did not applaud a rape threat .how long are you going to ride that s***?
7 — you had a great chance to show everyone that youre not a c*** on tv tonight and you f***** it up.
8 — lmao!!! stuff i say. some stuff be so damn funny.especially when it comes to illegal immigrants who are full fledged criminals coming to america …
9 — the german government pays for 3 week vacation for refugees to go home -
10 — burundian refugees should go home!
11 — no benefits to any immigrants !! come here & make it on your own, or go back home !!
12 — illegal aliens waiting for first class msc container-shippping in antwerpen stop immigration
13 — sick barstewards! this is what happens when we put up the refugees welcome signs! they not only rape our wives or girlfriends
14 — i am not forcing anyone. i am just putting out my view. people believe it or not is their prerogative.even if a single person agree with me
15 — tajikistani charged with beating new york university professor to death another animal./
16 — child prostitution: feds break up mexico-to-queens sex trafficking ring via …
17 — home office guilty of a serious breach of the duty of candour and cooperation regarding children entitled to enter uk. where did these children go? …
18 — p.s why do you not pay unemployed people who do endless hours of voluntary work they do that to give something to the community
19 — seriously, amy and cindy are bffs, i know that for sure. hmm, mmm.
Table 4: Test documents (one fold) used for the visualization in Figure 2.

As Figure 2 shows, the tweets are grouped into two clusters. According to the kernel density contour lines, two centers are identified: the tweets assigned a lower probability of being hate speech and the tweets with a higher probability of being hate speech. Let us focus on the wrongly classified tweets and their positions in the graph (tweets 8, 16, and 18). While for tweets 8 and 18 the classifier was not certain and a mistake seems possible according to the plot, tweet 16 was predicted to be hate speech with high probability. Analyzing the words that form this tweet, we notice that not only do most of them often appear in hate speech, but their combination is also very characteristic of offensive language.

This short demonstration shows the utility of the proposed visualization, which can identify different types of errors and help explain weaknesses of the classifier or wrongly labeled data.

6 Conclusions

We present the first successful approach to the assessment of prediction uncertainty in hate speech classification. Our approach uses an LSTM model with Monte Carlo dropout and shows performance comparable to the best competing approaches using word embeddings, and superior performance using sentence embeddings. We demonstrate that the reliability of predictions and the errors of the models can be comprehensively visualized. Further, our study shows that pretrained sentence embeddings outperform even state-of-the-art contextual word embeddings and can be recommended as a suitable representation for this task. The full Python code is publicly available (https://github.com/KristianMiok/Hate-Speech-Prediction-Uncertainty).

As persons spreading hate speech might be banned, penalized, or monitored so that they do not put their threats into action, prediction uncertainty is an important component of decision making and can help human observers avoid false positives and false negatives. Visualization of prediction uncertainty can provide a better understanding of the textual context within which hate speech appears. Plotting the incorrectly classified tweets and inspecting them can identify the words that trigger wrong classifications.

Prediction uncertainty estimation is rarely implemented for text classification and other NLP tasks, hence our future work will go in this direction. The recent emergence of cross-lingual embeddings opens new opportunities to share data sets and models between languages. As evaluation in less-resourced languages is difficult, the assessment of predictive reliability might serve as an auxiliary evaluation approach for such problems. In this context, we also plan to investigate convolutional neural networks with a probabilistic interpretation.

Acknowledgments.

The work was partially supported by the Slovenian Research Agency (ARRS) core research programme P6-0411. This project has also received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825153 (EMBEDDIA).

References

  • [1] P. Baldi and P. J. Sadowski (2013) Understanding dropout. In Advances in Neural Information Processing Systems, pp. 2814–2822.
  • [2] W. Berger, H. Piringer, P. Filzmoser, and E. Gröller (2011) Uncertainty-aware exploration of continuous parameter spaces using multivariate prediction. In Computer Graphics Forum, pp. 911–920.
  • [3] J. Bergstra and Y. Bengio (2012) Random search for hyper-parameter optimization. Journal of Machine Learning Research 13 (Feb), pp. 281–305.
  • [4] E. Bleich (2011) The rise of hate speech and hate crime laws in liberal democracies. Journal of Ethnic and Migration Studies 37 (6), pp. 917–934.
  • [5] L. Buitinck, G. Louppe, M. Blondel, F. Pedregosa, A. Mueller, O. Grisel, V. Niculae, P. Prettenhofer, A. Gramfort, J. Grobler, R. Layton, J. VanderPlas, A. Joly, B. Holt, and G. Varoquaux (2013) API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pp. 108–122.
  • [6] D. Cer, Y. Yang, S. Kong, N. Hua, N. Limtiaco, R. S. John, N. Constant, M. Guajardo-Cespedes, S. Yuan, C. Tar, et al. (2018) Universal sentence encoder. arXiv preprint arXiv:1803.11175.
  • [7] N. Chinchor (1992) MUC-4 evaluation metrics. In Proceedings of the Fourth Message Understanding Conference, pp. 22–29.
  • [8] F. Chollet et al. (2015) Keras. https://keras.io
  • [9] M. Corazza, S. Menini, P. Arslan, R. Sprugnoli, E. Cabrio, S. Tonelli, and S. Villata (2018) Comparing different supervised approaches to hate speech detection. In EVALITA 2018.
  • [10] J. Cox and M. Lindell (2013) Visualizing uncertainty in predicted hurricane tracks. International Journal for Uncertainty Quantification 3 (2).
  • [11] T. Davidson, D. Warmsley, M. Macy, and I. Weber (2017) Automated hate speech detection and the problem of offensive language. In Eleventh International AAAI Conference on Web and Social Media.
  • [12] F. Del Vigna, A. Cimino, F. Dell'Orletta, M. Petrocchi, and M. Tesconi (2017) Hate me, hate me not: hate speech detection on Facebook. In Proceedings of the First Italian Conference on Cybersecurity (ITASEC17).
  • [13] M. Fortunato, C. Blundell, and O. Vinyals (2017) Bayesian recurrent neural networks. arXiv preprint arXiv:1704.02798.
  • [14] Y. Gal and Z. Ghahramani (2016) A theoretically grounded application of dropout in recurrent neural networks. In Advances in Neural Information Processing Systems, pp. 1019–1027.
  • [15] Y. Gal and Z. Ghahramani (2016) Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In International Conference on Machine Learning, pp. 1050–1059.
  • [16] A. Kucukelbir, D. Tran, R. Ranganath, A. Gelman, and D. M. Blei (2017) Automatic differentiation variational inference. The Journal of Machine Learning Research 18 (1), pp. 430–474.
  • [17] L. Liu, A. P. Boone, I. T. Ruginski, L. Padilla, M. Hegarty, S. H. Creem-Regehr, W. B. Thompson, C. Yuksel, and D. H. House (2016) Uncertainty visualization by representative sampling from prediction ensembles. IEEE Transactions on Visualization and Computer Graphics 23 (9), pp. 2165–2178.
  • [18] L. Liu, L. Padilla, S. H. Creem-Regehr, and D. H. House (2019) Visualizing uncertain tropical cyclone predictions using representative samples from ensembles of forecast tracks. IEEE Transactions on Visualization and Computer Graphics 25 (1), pp. 882–891.
  • [19] L. McInnes, J. Healy, N. Saul, and L. Grossberger (2018) UMAP: uniform manifold approximation and projection. The Journal of Open Source Software 3 (29), pp. 861.
  • [20] Y. Mehdad and J. Tetreault (2016) Do characters abuse more than words? In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 299–303.
  • [21] T. Mikolov, K. Chen, G. Corrado, and J. Dean (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  • [22] K. Miok (2018) Estimation of prediction intervals in neural network-based regression models. In 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 463–468.
  • [23] P. Myshkov and S. Julier (2016) Posterior distribution analysis for Bayesian inference in neural networks. In Workshop on Bayesian Deep Learning, NIPS.
  • [24] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer (2018) Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
  • [25] V. Pham, T. Bluche, C. Kermorvant, and J. Louradour (2014) Dropout improves recurrent neural networks for handwriting recognition. In 14th International Conference on Frontiers in Handwriting Recognition, pp. 285–290.
  • [26] J. C. Platt (1999) Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers, pp. 61–74.
  • [27] R. Rehurek and P. Sojka (2010) Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, Valletta, Malta, pp. 45–50.
  • [28] K. Rother, M. Allee, and A. Rettberg (2018) ULMFiT at GermEval-2018: a deep neural language model for the classification of hate speech in German tweets. In 14th Conference on Natural Language Processing KONVENS 2018, pp. 113.
  • [29] I. T. Ruginski, A. P. Boone, L. M. Padilla, L. Liu, N. Heydari, H. S. Kramer, M. Hegarty, W. B. Thompson, D. H. House, and S. H. Creem-Regehr (2016) Non-expert interpretations of hurricane forecast uncertainty visualizations. Spatial Cognition & Computation 16 (2), pp. 154–172.
  • [30] A. Schmidt and M. Wiegand (2017) A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pp. 1–10.
  • [31] K. Sparck Jones (1972) A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation 28 (1), pp. 11–21.
  • [32] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014) Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15 (1), pp. 1929–1958.
  • [33] J. Waldron (2012) The harm in hate speech. Harvard University Press.
  • [34] S. Wang and C. Manning (2013) Fast dropout training. In International Conference on Machine Learning, pp. 118–126.
  • [35] W. Warner and J. Hirschberg (2012) Detecting hate speech on the world wide web. In Proceedings of the Second Workshop on Language in Social Media, pp. 19–26.
  • [36] W. Zaremba, I. Sutskever, and O. Vinyals (2014) Recurrent neural network regularization. arXiv preprint arXiv:1409.2329.
  • [37] L. Zhu and N. Laptev (2017) Deep and confident prediction for time series at Uber. In 2017 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 103–110.