Transgender Community Sentiment Analysis from Social Media Data: A Natural Language Processing Approach
The transgender community experiences a large disparity in mental health conditions compared with the general population. Interpreting the social media data posted by transgender people may help us better understand the sentiments of this sexual minority group and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people as negative, positive, or neutral in sentiment. Five machine learning algorithms and two deep neural networks are used to build sentiment analysis classifiers from the annotated data. Results show that our annotations are reliable, with a Cohen’s Kappa score over 0.8 across all three classes. The LSTM model yields the best overall performance, with an accuracy over 0.85 and an AUC of 0.876. Our next step will focus on applying more advanced natural language processing algorithms to a larger annotated dataset.
A transgender person is someone whose gender self-identity differs from their birth sex. Many transgender people experience serious gender dysphoria, a depressive feeling caused by the mismatch between their self-identity and assigned sex. Therefore, medical treatments such as hormone replacement therapy or psychotherapy are commonly sought by transgender people. According to research by Amnesty International, about 0.3% of the EU’s population is transgender, accounting for 1.5 million people. In the United States, an estimated 0.5 to 0.6% of the population identifies as transgender. In other countries or regions, statistics on transgender people are less often reported because of cultural differences, religious reasons, or other practical obstacles.
According to research by the National Center for Transgender Equality (NCTE), the transgender community experiences large disparities in healthcare access, employment, and the criminal justice system. In detail, 14% of transgender people are unemployed, which is double the rate in the general population; 19% of transgender people have been refused health care; and approximately one-fifth of the community report having been harassed by police. Most of these disparities are caused by discrimination from people with other sexual orientations. In addition, transgender people experience serious mental health challenges. Being transgender is not itself a mental disorder. However, gender dysphoria can cause depression and distressing feelings. Moreover, confusion about self-identification and misunderstanding from others can intensify this distress. These mental health problems can lead to adverse health outcomes, including suicide. According to the NCTE study, 41% of transgender individuals have attempted suicide.
Therefore, it is essential to understand the sentiment of the transgender community and to offer appropriate intervention or prevention to those experiencing gender dysphoria or other mental illness. However, transgender people, being a sexual minority, are under-represented and less likely to speak out in public. Because social media allows anonymity, people are more willing to express their opinions publicly there. Social media data is therefore a well-suited resource for research on sentiment analysis of the transgender community. In this work, we conduct sentiment analysis using a natural language processing approach on social media data posted by the transgender community. In detail, our contribution is two-fold: 1) we give positive, negative, or neutral annotations to 300 comments extracted from Reddit data posted by the transgender community (subreddit /r/asktransgender); 2) we implement current natural language processing methods on the annotated data to build an automatic sentiment classifier.
2 Related Work
With the fast development of artificial intelligence (AI) in the last decade, these techniques have been widely applied to research in different fields. Particularly in healthcare, AI is used to improve the predictive accuracy of diseases [14, 20], to interpret medical images automatically [13, 16], to explore RNA sequence patterns between co-morbidities, and so on. For sentiment analysis of social media data, natural language processing is heavily used. Kanakaraj et al. apply ensemble classifiers built on semantic features to improve classification performance compared with traditional bag-of-words models. Yadav et al. build a convolutional neural network (CNN)-based model to implement a patient-assisted system from patient.info data. Broek-Altenburg et al. analyze consumers’ sentiments about purchasing health insurance during enrollment season. These techniques are also applied to sentiment analysis of sexual minority groups using social media data. Fitri et al. conduct sentiment analysis of Twitter data using machine learning algorithms, with a focus on an anti-LGBT campaign in Indonesia. Khatua et al. explore tweeting attitudes in support of the LGBT community in India with a deep learning approach. In this previous research, we find that the transgender group is less discussed, even though it may have more serious mental health concerns. To address this limitation, we apply natural language processing techniques to analyze transgender people’s sentiments on social media.
We use the dataset published at https://github.com/mjtat/Trans-NLP-Project. This dataset contains more than 20 thousand comments crawled from the subreddit /r/asktransgender. We cannot annotate the sentiment of every comment, as this would require tremendous labor. Thus, we manually annotate 300 of the extracted comments and apply natural language processing to automatically annotate the remaining comments. Each comment is categorized as one of three sentiments: negative, positive, or neutral. Two annotators are involved in our annotation process. They first discuss the definition of each label and reach a consensus. The Cohen’s kappa score of each label is calculated to evaluate the inter-rater reliability. For comments on which the two annotators cannot reach a consensus, a third annotator is involved to give the final result. The definition, statistics, and Cohen’s kappa score of each label are shown in Table 1.
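As an illustration of how a per-label Cohen’s kappa can be computed, the following is a minimal sketch in plain Python. It assumes a one-vs-rest reading of "kappa score of each label" (each label is binarized before scoring); the paper does not state how the per-label scores were obtained, and the toy annotations below are hypothetical, not the study’s data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[l] * freq_b[l] for l in set(freq_a) | set(freq_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


def per_label_kappa(rater_a, rater_b, labels=("negative", "positive", "neutral")):
    """One-vs-rest kappa per label (an assumed reading of 'kappa of each label')."""
    return {
        label: cohen_kappa([x == label for x in rater_a],
                           [x == label for x in rater_b])
        for label in labels
    }


# Hypothetical toy annotations from two annotators.
annotator_1 = ["negative", "positive", "neutral", "neutral", "positive", "negative"]
annotator_2 = ["negative", "positive", "neutral", "positive", "positive", "negative"]
print(cohen_kappa(annotator_1, annotator_2))
print(per_label_kappa(annotator_1, annotator_2))
```

Items on which the two lists disagree are the ones that would be passed to the third annotator for a final decision.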
3.2 Machine Learning Classifiers
We apply the traditional Bag-of-Words model to tokenize each Reddit comment. After tokenizing the comments into a sparse matrix, term frequency–inverse document frequency (TF-IDF) is used to adjust the weight of each word in order to highlight the keywords of each comment. We then apply conventional machine learning algorithms to build a multi-class (3-class) classifier. The algorithms include naive Bayes, random forest, support vector machine with a linear kernel, logistic regression, and k-nearest neighbors. These conventional algorithms are compared with the deep learning models described in this section.
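To make the TF-IDF weighting concrete, here is a small sketch in plain Python using the textbook formula tf × log(N / df). This is an assumption for illustration: the paper does not give its exact weighting, and scikit-learn’s `TfidfVectorizer` uses a smoothed IDF and normalization by default, so its values would differ. The whitespace tokenizer and the example sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical example comments.
docs = [
    "i feel hopeless about my dysphoria",
    "i feel great after starting hormones",
    "i have a question about hormones",
]
weights = tfidf(docs)
for w in weights:
    # Show the three highest-weighted terms of each comment.
    print(sorted(w.items(), key=lambda kv: -kv[1])[:3])
```

Terms that occur in every comment (here "i") receive zero weight, while terms unique to one comment are weighted most heavily, which is the keyword-highlighting effect described above.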
3.3 Deep Learning Models
We adopt convolutional neural networks (CNN) and recurrent neural networks (RNN) to train a text classifier. Before being fed into the deep neural networks, each comment is represented by a word2vec embedding that is pre-trained on large-scale social media data. The detailed configuration of the CNN is as follows: the network stacks 3 convolutional layers with 128, 64, and 32 7-dimensional filters; each convolutional layer uses a rectifier (ReLU) activation function and is followed by global max pooling and a dropout layer (dropout rate 0.2); a dense layer then maps the convolutional features to the 3 classes. For the RNN, we stack 3 long short-term memory (LSTM) layers with 128, 64, and 32 neurons. Each layer is followed by a sigmoid activation function. The same embedding scheme as for the CNN is applied. For both deep neural networks, we employ the Adam optimizer and a binary cross-entropy loss function. The batch size is set to 32. We train each model for up to 100 epochs, with early stopping conditioned on the loss on the development set.
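The early-stopping rule can be sketched in a few lines of plain Python. The `patience` and `min_delta` parameters are assumptions introduced for illustration; the paper does not report how many non-improving epochs were tolerated. In Keras, comparable behaviour is available through the `EarlyStopping` callback monitoring the validation loss.

```python
def early_stopping_epoch(dev_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once the development loss has not improved by more than
    `min_delta` for `patience` consecutive epochs (hypothetical settings).
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(dev_losses):
        if loss < best_loss - min_delta:
            # New best development loss: remember it and keep training.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return epoch, best_epoch
    # Ran through all available epochs without triggering the rule.
    return len(dev_losses) - 1, best_epoch


# Hypothetical development-set losses, one per epoch.
losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.70]
print(early_stopping_epoch(losses, patience=3))
```

With these toy losses and a patience of 3, training halts at epoch 5 and the weights from epoch 2 would be the ones to keep.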
We leave 20% of the entire dataset as the held-out test set. Accuracy and the AUC averaged across all 3 labels are used to evaluate the performance of our models. For conventional classifiers, 5-fold cross-validation is used to select the best configurations. For deep neural networks, 10% of the training set is used as the development set to tune hyperparameters.
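The two evaluation metrics can be sketched as follows in plain Python. The averaging scheme is an assumption: the sketch takes the unweighted mean of one-vs-rest AUCs, since the paper only says the AUC is averaged across the 3 labels. The labels and predicted probabilities are hypothetical.

```python
def binary_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y]
    neg = [s for y, s in zip(y_true, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels, probabilities, classes):
    """Unweighted mean of one-vs-rest AUCs; probabilities[i][k] scores classes[k]."""
    aucs = [
        binary_auc([y == c for y in labels], [p[k] for p in probabilities])
        for k, c in enumerate(classes)
    ]
    return sum(aucs) / len(aucs)


def accuracy(labels, probabilities, classes):
    """Fraction of items whose highest-scoring class matches the true label."""
    predicted = [classes[max(range(len(classes)), key=p.__getitem__)]
                 for p in probabilities]
    return sum(y == y_hat for y, y_hat in zip(labels, predicted)) / len(labels)


# Hypothetical held-out labels and model outputs.
classes = ["negative", "positive", "neutral"]
labels = ["negative", "positive", "neutral", "neutral"]
probs = [
    [0.70, 0.20, 0.10],
    [0.05, 0.50, 0.45],
    [0.20, 0.30, 0.50],
    [0.50, 0.10, 0.40],
]
print(accuracy(labels, probs, classes))
print(macro_ovr_auc(labels, probs, classes))
```

Accuracy depends only on the top-scoring class, whereas AUC uses the full ranking of scores, which is why the two metrics can rank models differently.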
All experiments are implemented using Python 3.6. Keras is used to design the deep neural networks. Conventional machine learning classifiers are built with the scikit-learn package. We use a single GPU to expedite the training of the deep neural networks.
The annotation results are shown in Table 1. Among the 300 Reddit comments, 72 are categorized as negative and 85 are annotated as positive. The remaining 48% are considered neutral. The Cohen’s Kappa score is over 0.8 across all classes, which indicates reliable annotation with high inter-rater agreement.
The performance of our NLP methods is shown in Table 2. The naive Bayes method serves as the baseline. All other models except k-nearest neighbors outperform the baseline by more than 5 percent in accuracy. The convolutional neural network yields the best accuracy of 0.861, followed by Long Short-Term Memory, which has an accuracy of 0.852. These 2 deep neural networks achieve better accuracy than all conventional machine learning classifiers. LSTM also obtains the best AUC of 0.876, followed by random forest, which has an AUC of 0.868. In short, LSTM yields the best overall performance when accuracy and AUC are considered together: it has the highest AUC and the second-highest accuracy.
| Model | Accuracy | AUC |
|---|---|---|
| Support Vector Machine | 0.832 | 0.842 |
| Convolutional Neural Network (CNN) | 0.861 | 0.834 |
| Long Short-Term Memory (LSTM) | 0.852 | 0.876 |
5 Discussion and Conclusion
In Fig. 1, we visualize the weights of individual words in the logistic regression model to discuss which words contribute most to classification. The top 20 words are shown as a word cloud. Naturally, trans and transgender are the two words of greatest importance. In addition, dysphoria is heavily mentioned in the comments. The words question, wondering, anything, and anyone are also very important to the classification. These words may be related to the depression and helplessness felt in the transgender community.
In terms of real-world use, our classifier can identify negative sentiment when transgender people post on social media. This may help health providers better understand the mental condition of the transgender community. It can also be used to detect extremely negative sentiment from this community and to send alerts to friends, doctors, or the social media users themselves so that early intervention can be applied.
Our work also has the following limitations. First, we annotate only 300 Reddit comments, which may not be large enough for our NLP algorithms to build a solid automatic model. Second, we do not apply the most recent NLP algorithms, such as graph convolutional networks and the transformer-based BERT model. As a next step, we plan to employ more powerful algorithms on a larger annotated dataset.
6 Author Contribution Statement
The research idea was discussed by all authors. M.L. and Y.W. annotated all 300 Reddit comments. For annotations on which M.L. and Y.W. did not agree, Y.Z. gave the final decision. M.L., Y.W., and Y.Z. completed all experiments, the error analysis, and the manuscript draft. Z.L. revised the manuscript to a higher academic standard.
- (2015) Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology 33 (8), pp. 831–838.
- (1988) A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educational and Psychological Measurement 48 (4), pp. 921–933.
- (1987) Heterosexual and homosexual gender dysphoria. Archives of Sexual Behavior 16 (2), pp. 139–152.
- (2020) An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit. Information Processing & Management 57 (2), pp. 102034.
- (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- (2019) Sentiment analysis of social media Twitter with case of anti-LGBT campaign in Indonesia using naïve Bayes, decision tree, and random forest algorithm. Procedia Computer Science 161, pp. 765–772.
- (1999) Learning to forget: continual prediction with LSTM.
- (2008) A dual coordinate descent method for large-scale linear SVM. In Proceedings of the 25th International Conference on Machine Learning, pp. 408–415.
- (2015) Transgender stigma and health: a critical review of stigma determinants, mechanisms, and interventions. Social Science & Medicine 147, pp. 222–231.
- (2015) Performance analysis of ensemble methods on Twitter sentiment analysis using NLP techniques. In Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015), pp. 169–170.
- (2019) Tweeting in support of LGBT? A deep learning approach. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pp. 342–345.
- (2019) Transgender-inclusive care. CMAJ 191 (3), pp. E79–E79.
- (2020) A comparison of pre-trained vision-and-language models for multimodal representation learning across medical images and reports. arXiv preprint arXiv:2009.01523.
- (2018) Early prediction of acute kidney injury in critical care setting using clinical notes. In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 683–686.
- (2004) Lesbian, gay, bisexual, and transgender people receiving services in the public mental health system: raising issues. Journal of Gay & Lesbian Psychotherapy 8 (3-4), pp. 25–42.
- (2017) CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning. arXiv preprint arXiv:1711.05225.
- (2009) Doing gender, doing heteronormativity: “gender normals,” transgender people, and the social maintenance of heterosexuality. Gender & Society 23 (4), pp. 440–464.
- (2013) Disapproval of homosexuality: comparative research on individual and national determinants of disapproval of homosexuality in 20 European countries. International Journal of Public Opinion Research 25 (1), pp. 64–86.
- (2019) Using social media to identify consumers’ sentiments towards attributes of health insurance during enrollment season. Applied Sciences 9 (10), pp. 2035.
- (2019) Using machine learning to integrate socio-behavioral factors in predicting cardiovascular-related mortality risk. In MedInfo, pp. 433–437.
- (2020) Sentiment analysis using deep learning architectures: a review. Artificial Intelligence Review 53 (6), pp. 4335–4385.
- (2019) Graph convolutional networks for text classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 7370–7377.
- (2010) Understanding bag-of-words model: a statistical framework. International Journal of Machine Learning and Cybernetics 1 (1-4), pp. 43–52.