Transgender Community Sentiment Analysis from Social Media Data: A Natural Language Processing Approach


The transgender community experiences a large disparity in mental health conditions compared with the general population. Interpreting the social media data posted by transgender people may help us better understand the sentiments of this sexual minority group and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people into negative, positive, and neutral sentiments. Five machine learning algorithms and two deep neural networks are adopted to build sentiment analysis classifiers based on the annotated data. Results show that our annotations are reliable, with a Cohen’s Kappa score over 0.8 across all three classes. The LSTM model yields the best AUC of 0.876 together with an accuracy over 0.85. Our next step will focus on applying advanced natural language processing algorithms to a larger annotated dataset.

1 Introduction

A transgender person is defined as a person whose gender self-identity differs from their birth sex [17]. Many transgender people experience serious gender dysphoria [3], a depressive feeling caused by the mismatch between their self-identity and assigned sex. Therefore, medical treatments such as hormone replacement therapy or psychotherapy are commonly sought by transgender people. According to research by Amnesty International [18], about 0.3% of the EU’s population is transgender, accounting for 1.5 million people. In the United States, an estimated 0.5 to 0.6% of the population identifies as transgender. In other countries or regions, statistics on transgender people are less often reported, due to cultural differences, religious reasons, or other practical obstacles.

According to research by the National Center for Transgender Equality (NCTE) [9], the transgender community experiences large disparities in healthcare access, employment, and the criminal justice system. In detail, 14% of transgender people are unemployed, which is double the rate in the general population; 19% of transgender people have been refused health care; and approximately one-fifth of the community have reported being harassed by police. Most of these disparities are caused by discrimination from other people. Besides this, transgender people also experience serious mental health challenges. Being transgender is not itself a mental disorder. However, gender dysphoria can cause depression and distress. Moreover, confusion about self-identification and misunderstanding from others can worsen this distress. These mental health problems can lead to adverse health outcomes, even suicide [12]. The NCTE study found that 41% of transgender individuals have attempted suicide.

Therefore, it is essential to understand the sentiment of the transgender community and to provide appropriate intervention or prevention to those experiencing gender dysphoria or other mental illness. However, transgender people, being a sexual minority, are less represented and less often heard in public [15]. Because social media platforms allow anonymity, people are more willing to express their opinions publicly there. To this end, social media data is a well-suited resource for sentiment analysis of the transgender community. We therefore conduct sentiment analysis using a natural language processing approach on social media data posted by the transgender community. In detail, our contribution is two-fold: 1) we give positive, negative, or neutral annotations to 300 comments extracted from Reddit data posted by the transgender community (subreddit: /r/asktransgender); 2) we implement popular natural language processing methods on the annotated data to build an automatic sentiment classifier.

2 Related Work

With the fast development of artificial intelligence (AI) in the last decade, these techniques have been widely applied to research in different fields. Particularly in healthcare, AI is used to improve the predictive accuracy of diseases [14, 20], to interpret medical images automatically [13, 16], to explore RNA sequence patterns between co-morbidities [1], and so on. In sentiment analysis of social media data, natural language processing is heavily used. Kanakaraj et al. [10] apply ensemble classifiers to semantic features to improve classification performance compared with traditional bag-of-words models. Yadav et al. [21] build a convolutional neural network (CNN)-based model to implement a patient-assisted system from data. Broek-Altenburg et al. [19] analyze consumers’ sentiments about purchasing health insurance during enrollment season. These techniques are also applied to sentiment analysis of sexual minority groups using social media data. Fitri et al. [6] conduct sentiment analysis of Twitter data using machine learning algorithms, with a focus on an anti-LGBT campaign in Indonesia. Khatua et al. [11] explore tweeting attitudes in support of the LGBT community in India with a deep learning approach. Compared with the groups studied in previous research, we find that the transgender group is less discussed and may have more serious mental health concerns. To address this gap, we apply natural language processing techniques to analyze transgender people’s sentiments on social media.

3 Methods

3.1 Dataset

We use a published dataset in which more than 20 thousand comments were crawled from the subreddit /r/asktransgender. We cannot annotate the sentiment of every comment, as this would require tremendous labor. Thus, we manually annotate 300 of the extracted comments and apply natural language processing to annotate the other comments automatically. Each comment is categorized as one of three sentiments: negative, positive, or neutral. Two annotators are involved in our annotation process. They first discuss the definition of each label and reach a consensus on it. The Cohen’s kappa [2] score of each label is calculated to evaluate the inter-rater reliability. For comments on which the two annotators cannot reach a consensus, a third annotator is involved to give a final decision. The definition, statistics, and Cohen’s kappa score of each label are shown in Table 1.

Sentiment: Negative
Definition: Feeling uncomfortable, depressed, or unconfident about being a transgender person; experiencing dysphoria; being scared of receiving medical treatments; and all other negative sentiments caused by being transgender.
Count: 72
Prevalence: 24.00%
Cohen’s Kappa: 0.825
Example: I’ve really struggled trying to find a way to express how I felt about my own gender issues. I’ve had them for years and years now, but at the same time I’ve rarely showed outward evidence of femininity in my daily life …

Sentiment: Positive
Definition: Feeling comfortable, confident, and proud of being a transgender person; positively receiving medical treatment; helping other people in the community; and all other positive feelings associated with being transgender.
Count: 85
Prevalence: 28.33%
Cohen’s Kappa: 0.877
Example: Reading a lot about gender dysphoria and online tests re: transgender identity. I realize that while I (AMAB) have recently begun intermittently fantasizing about being a woman, I don’t have any negative feelings about my life prior to these…

Sentiment: Neutral
Definition: Posts that cannot be categorized as negative or positive sentiment, or that can be categorized as both.
Count: 143
Prevalence: 47.67%
Cohen’s Kappa: 0.902
Example: So, I’ve got an appointment coming up with my primary care physician, basically to attempt to suss out whether I could go thru the process of getting hormones thru them. and I think I’ve got a basic idea of what I should be asking, but I’m not sure what to really ask or to expect.

Table 1: The definition, count, prevalence, Cohen’s Kappa score, and an example of each sentiment among 300 Reddit comments posted by the transgender community
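
As an illustration of how a per-label agreement score such as those in Table 1 could be computed, the following minimal sketch implements Cohen’s kappa for two annotators and applies it to each label in a one-vs-rest fashion. The one-vs-rest reading of "kappa of each label" is an assumption, the function names are hypothetical, and the toy ratings are invented for illustration; none of this is taken from the authors’ code.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def per_label_kappa(ratings_a, ratings_b, labels):
    """One-vs-rest kappa per label (an assumed reading of 'kappa of each label')."""
    return {
        label: cohens_kappa(
            [r == label for r in ratings_a],
            [r == label for r in ratings_b],
        )
        for label in labels
    }


# Toy ratings, invented for illustration only (not the study's annotations).
annotator_1 = ["negative", "positive", "neutral", "neutral", "positive", "negative"]
annotator_2 = ["negative", "positive", "neutral", "positive", "positive", "neutral"]
scores = per_label_kappa(annotator_1, annotator_2, ["negative", "positive", "neutral"])
print(scores)
```

Binarizing each label before computing kappa gives one score per sentiment, which matches the shape of the values reported in Table 1; a single multi-class kappa over all three labels would instead give one overall number.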

3.2 Machine Learning Classifiers

We apply the traditional Bag-of-Words [23] model to tokenize each Reddit comment. After converting the tokens to a sparse matrix, term frequency–inverse document frequency (TF-IDF) is used to adjust the weight of each word in order to highlight the keywords of each comment. We apply conventional machine learning algorithms to build a multi-class (3-class) classifier. The algorithms include naive Bayes, random forest, support vector machine with a linear kernel [8], logistic regression, and k-nearest neighbors. These conventional algorithms are compared with the deep learning models described in Section 3.3.
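
To make the Bag-of-Words and TF-IDF weighting step concrete, here is a minimal pure-Python sketch. It assumes the textbook weighting tf × log(N / df) and a simple lowercase word tokenizer; the paper does not state the exact variant used, and library implementations such as scikit-learn’s `TfidfVectorizer` apply smoothing and normalization by default, so their values differ. The function names and toy comments are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {word: weight} dict per document using tf * log(N / df)."""
    bags = [Counter(tokenize(doc)) for doc in documents]  # bag-of-words counts
    n_docs = len(bags)
    # Document frequency: number of documents in which each word occurs.
    doc_freq = Counter(word for bag in bags for word in bag)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in bag.items()
        })
    return vectors


# Toy comments, invented for illustration only.
comments = [
    "I feel anxious about my appointment",
    "my appointment went really well",
    "I feel proud and happy",
]
for vector in tfidf_vectors(comments):
    # Show the three highest-weighted words of each comment.
    print(sorted(vector, key=vector.get, reverse=True)[:3])
```

Words that occur in every comment receive a weight of zero under this formula, which is how the weighting pushes comment-specific keywords to the front.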

3.3 Deep Learning Models

We adopt convolutional neural networks (CNN) and recurrent neural networks (RNN) to train text classifiers. Before being fed into the deep neural networks, each comment is represented by a word2vec embedding [4] that is pre-trained on large-scale social media data. The detailed configuration of the CNN is as follows: the network stacks 3 convolutional layers with 128, 64, and 32 7-dimensional filters; each convolutional layer is followed by a rectifier (ReLU) activation function, a global max pooling, and a dropout layer (dropout rate 0.2); a dense layer maps the convolutional features to the 3 classes. For the RNN, we stack 3 long short-term memory (LSTM) [7] layers with 128, 64, and 32 neurons. Each layer is followed by a sigmoid activation function. The same embedding scheme as for the CNN is applied. For both deep neural networks, we employ the Adam optimizer and the binary cross-entropy loss function. The batch size is set to 32. We train each model for up to 100 epochs with early stopping conditioned on the loss on the development set.
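
The early-stopping rule mentioned above can be sketched independently of any deep learning framework. The following is a minimal illustration, not the authors’ training loop: `run_epoch` is a hypothetical callback that trains for one epoch and returns the development-set loss, and the `patience` value is an assumption because the paper does not report one.

```python
def train_with_early_stopping(run_epoch, max_epochs=100, patience=5):
    """Stop once the development loss has not improved for `patience` epochs.

    run_epoch(epoch) is a hypothetical callback: it trains one epoch and
    returns the loss on the development set.
    Returns (best_epoch, best_loss, epochs_run).
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch in range(max_epochs):
        dev_loss = run_epoch(epoch)
        epochs_run += 1
        if dev_loss < best_loss:
            # New best model: remember it and reset the patience counter.
            best_loss, best_epoch = dev_loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss, epochs_run


# Simulated development losses, invented for illustration only.
simulated_losses = [0.90, 0.70, 0.60, 0.65, 0.62, 0.61, 0.63, 0.64]
result = train_with_early_stopping(
    lambda epoch: simulated_losses[epoch],
    max_epochs=len(simulated_losses),
    patience=3,
)
print(result)
```

In a real run the weights from the best epoch would typically be restored after stopping; that bookkeeping is omitted here to keep the sketch short.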

3.4 Evaluation

We hold out 20% of the entire dataset as the test set. Accuracy and the AUC averaged across all 3 labels are used to evaluate the performance of our models. For conventional classifiers, 5-fold cross-validation is used to select the best configurations. For deep neural networks, 10% of the training set is used as the development set to tune hyperparameters.
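
The two evaluation metrics can be sketched as follows. This is a minimal illustration under stated assumptions: the averaged AUC is taken to be the unweighted mean of one-vs-rest AUCs over the three labels (the paper does not specify the averaging scheme), each binary AUC is computed as the probability that a positive example is scored above a negative one, and all function names and toy values are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(is_positive, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def averaged_auc(y_true, class_scores, labels):
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme).

    class_scores[i][label] is the model's score for `label` on example i.
    """
    aucs = [
        binary_auc(
            [t == label for t in y_true],
            [row[label] for row in class_scores],
        )
        for label in labels
    ]
    return sum(aucs) / len(aucs)


# Toy predictions, invented for illustration only.
labels = ["negative", "positive", "neutral"]
y_true = ["negative", "positive", "neutral", "positive"]
class_scores = [
    {"negative": 0.7, "positive": 0.2, "neutral": 0.1},
    {"negative": 0.1, "positive": 0.6, "neutral": 0.3},
    {"negative": 0.2, "positive": 0.3, "neutral": 0.5},
    {"negative": 0.4, "positive": 0.1, "neutral": 0.5},
]
y_pred = [max(row, key=row.get) for row in class_scores]
print(accuracy(y_true, y_pred), averaged_auc(y_true, class_scores, labels))
```

The pairwise AUC computation is quadratic in the number of examples, which is acceptable for a held-out set of this size; a rank-based formulation would be preferable for large datasets.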

3.5 Implementation

All experiments are implemented using Python 3.6. Keras is used to design the deep neural networks. Conventional machine learning classifiers are built with the scikit-learn package. We use a single GPU to expedite the training of the deep neural networks.

4 Results

The annotation results are shown in Table 1. Among the 300 Reddit comments, 72 are categorized as negative, while 85 are annotated as positive. The remaining 143 (about 48%) are considered neutral. The Cohen’s Kappa score is over 0.8 across all classes, which indicates reliable annotation with high inter-rater agreement.

The performance of our NLP methods is shown in Table 2. The naive Bayes method serves as the baseline. All other models except K-nearest neighbour outperform the baseline by more than 4 percentage points in accuracy. The convolutional neural network yields the best accuracy of 0.861, followed by the long short-term memory network with an accuracy of 0.852. These 2 deep neural networks achieve better accuracy than all conventional machine learning classifiers. LSTM also obtains the best AUC of 0.876, followed by random forest with an AUC of 0.868. In short, LSTM offers the best overall performance, ranking first on AUC and second on accuracy.

Algorithm                             Accuracy   Averaged AUC
Naive Bayes                           0.777      0.720
Random Forest                         0.821      0.868
Support Vector Machine                0.832      0.842
Logistic Regression                   0.829      0.851
K-Nearest Neighbour                   0.764      0.767
Convolutional Neural Network (CNN)    0.861      0.834
Long Short-Term Memory (LSTM)         0.852      0.876
Table 2: Performance of 5 machine learning classifiers and 2 deep neural networks on transgender sentiment analysis

5 Discussion and Conclusion

Figure 1: A word cloud of the top 20 words that contribute most to classification

We visualize the weights of individual words in logistic regression to discuss the important words that contribute to classification (Fig. 1). The top 20 words are visualized as a word cloud. Naturally, trans and transgender are the two words of greatest importance. Besides these, dysphoria is also heavily mentioned in the comments. The words question, wondering, anything, and anyone are also very important to the classification. These words may be related to the depression and helplessness of the transgender community.

In terms of real-world use, our classifier can identify negative sentiment from transgender people when they post on social media. This may help health providers better understand the mental condition of the transgender community. It can also be used to detect extremely negative sentiments from this community and send alerts to friends, doctors, or the social media users themselves so that early intervention can be applied.

Our work also has the following limitations. First, we only annotate 300 Reddit comments, which may not be large enough for our NLP algorithms to build a robust automatic model. Second, we do not apply the most recent NLP algorithms, such as graph convolutional networks [22] and the transformer-based BERT model [5]. As a next step, we plan to employ more powerful algorithms on a larger annotated dataset.

6 Author Contribution Statement

The research idea is discussed by all authors. M.L. and Y.W. annotate all 300 Reddit comments. For annotations on which M.L. and Y.W. disagree, Y.Z. is involved to give a final decision. M.L., Y.W., and Y.Z. complete all experiments, the error analysis, and the manuscript draft. Z.L. revises the manuscript to a higher academic standard.


References

  1. B. Alipanahi, A. Delong, M. T. Weirauch and B. J. Frey (2015) Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology 33 (8), pp. 831–838.
  2. K. J. Berry and P. W. Mielke Jr (1988) A generalization of Cohen’s kappa agreement measure to interval measurement and multiple raters. Educational and Psychological Measurement 48 (4), pp. 921–933.
  3. R. Blanchard, L. H. Clemmensen and B. W. Steiner (1987) Heterosexual and homosexual gender dysphoria. Archives of Sexual Behavior 16 (2), pp. 139–152.
  4. S. A. Curiskis, B. Drake, T. R. Osborn and P. J. Kennedy (2020) An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit. Information Processing & Management 57 (2), pp. 102034.
  5. J. Devlin, M. Chang, K. Lee and K. Toutanova (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  6. V. A. Fitri, R. Andreswari and M. A. Hasibuan (2019) Sentiment analysis of social media Twitter with case of anti-LGBT campaign in Indonesia using naïve Bayes, decision tree, and random forest algorithm. Procedia Computer Science 161, pp. 765–772.
  7. F. A. Gers, J. Schmidhuber and F. Cummins (1999) Learning to forget: continual prediction with LSTM.
  8. C. Hsieh, K. Chang, C. Lin, S. S. Keerthi and S. Sundararajan (2008) A dual coordinate descent method for large-scale linear SVM. In Proceedings of the 25th International Conference on Machine Learning, pp. 408–415.
  9. J. M. W. Hughto, S. L. Reisner and J. E. Pachankis (2015) Transgender stigma and health: a critical review of stigma determinants, mechanisms, and interventions. Social Science & Medicine 147, pp. 222–231.
  10. M. Kanakaraj and R. M. R. Guddeti (2015) Performance analysis of ensemble methods on Twitter sentiment analysis using NLP techniques. In Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015), pp. 169–170.
  11. A. Khatua, E. Cambria, K. Ghosh, N. Chaki and A. Khatua (2019) Tweeting in support of LGBT? A deep learning approach. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pp. 342–345.
  12. J. S. H. Lam and A. Abramovich (2019) Transgender-inclusive care. CMAJ 191 (3), pp. E79–E79.
  13. Y. Li, H. Wang and Y. Luo (2020) A comparison of pre-trained vision-and-language models for multimodal representation learning across medical images and reports. arXiv preprint arXiv:2009.01523.
  14. Y. Li, L. Yao, C. Mao, A. Srivastava, X. Jiang and Y. Luo (2018) Early prediction of acute kidney injury in critical care setting using clinical notes. In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 683–686.
  15. A. Lucksted (2004) Lesbian, gay, bisexual, and transgender people receiving services in the public mental health system: raising issues. Journal of Gay & Lesbian Psychotherapy 8 (3-4), pp. 25–42.
  16. P. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz and K. Shpanskaya (2017) CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning. arXiv preprint arXiv:1711.05225.
  17. K. Schilt and L. Westbrook (2009) Doing gender, doing heteronormativity: “gender normals,” transgender people, and the social maintenance of heterosexuality. Gender & Society 23 (4), pp. 440–464.
  18. H. Van den Akker, R. Van der Ploeg and P. Scheepers (2013) Disapproval of homosexuality: comparative research on individual and national determinants of disapproval of homosexuality in 20 European countries. International Journal of Public Opinion Research 25 (1), pp. 64–86.
  19. E. M. van den Broek-Altenburg and A. J. Atherly (2019) Using social media to identify consumers’ sentiments towards attributes of health insurance during enrollment season. Applied Sciences 9 (10), pp. 2035.
  20. H. Wang, Y. Li, H. Ning, J. Wilkins, D. Lloyd-Jones and Y. Luo (2019) Using machine learning to integrate socio-behavioral factors in predicting cardiovascular-related mortality risk. In MedInfo, pp. 433–437.
  21. A. Yadav and D. K. Vishwakarma (2020) Sentiment analysis using deep learning architectures: a review. Artificial Intelligence Review 53 (6), pp. 4335–4385.
  22. L. Yao, C. Mao and Y. Luo (2019) Graph convolutional networks for text classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 7370–7377.
  23. Y. Zhang, R. Jin and Z. Zhou (2010) Understanding bag-of-words model: a statistical framework. International Journal of Machine Learning and Cybernetics 1 (1-4), pp. 43–52.