SentiHood: Targeted Aspect Based Sentiment Analysis Dataset for Urban Neighbourhoods
In this paper, we introduce the task of targeted aspect-based sentiment analysis. The goal is to extract fine-grained information with respect to entities mentioned in user comments. This work extends both aspect-based sentiment analysis, which assumes a single entity per document, and targeted sentiment analysis, which assumes a single sentiment towards a target entity. In particular, we identify the sentiment towards each aspect of one or more entities. As a testbed for this task, we introduce the SentiHood dataset, extracted from a question answering (QA) platform where urban neighbourhoods are discussed by users. In this context, units of text often mention several aspects of one or more neighbourhoods. This is the first time that a generic social media platform, in this case a QA platform, is used for fine-grained opinion mining. Text coming from QA platforms is far less constrained than text from review-specific platforms, on which current datasets are based. We develop several strong baselines, relying on logistic regression and state-of-the-art recurrent neural networks.
Marzieh Saeidi University College London firstname.lastname@example.org Guillaume Bouchard Bloomsbury AI email@example.com
Maria Liakata University of Warwick firstname.lastname@example.org Sebastian Riedel University College London email@example.com
This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/
Sentiment analysis is an important task in natural language processing. It has received a lot of interest not only in academia but also in industry, in particular for identifying customer satisfaction with products and services. Early research in the field of sentiment analysis [Das and Chen, 2001, Morinaga et al., 2002] focused only on identifying the overall sentiment or polarity of a given text. The underlying assumption of this work was that there is one overall polarity in the whole text.
Aspect-based sentiment analysis (ABSA) [Jo and Oh, 2011, Pontiki et al., 2015, Pontiki et al., 2016] relates to the task of extracting fine-grained information by identifying the polarity towards different aspects of an entity in the same unit of text, and recognizing the polarity associated with each aspect separately. The datasets for this task were mostly based on specialized review platforms such as Yelp where it is assumed that only one entity is discussed in one review snippet, but the opinion on multiple aspects can be expressed. This task is particularly useful because a user can assess the aggregated sentiment for each individual aspect of a given product or service and get a more fine-grained understanding of its quality.
Another line of research in this field is targeted (a.k.a. target-dependent) sentiment analysis [Jiang et al., 2011, Vo and Zhang, 2015]. Targeted sentiment analysis investigates the classification of opinion polarities towards certain target entity mentions in given sentences (often a tweet). For instance, in the sentence “People everywhere love Windows & vista. Bill Gates”, the polarity towards Bill Gates is “Neutral”, but the positive sentiment towards Windows & vista will interfere with identifying it if the usual methods for the sentiment analysis task are employed. However, this task assumes only the overall sentiment for each entity. Moreover, the existing corpora for this task have so far contained only a single target entity per unit of text.
Both settings are limited, and there exist many scenarios in which sentiments towards different aspects of several entities are discussed in the same unit of text. As a running example, we use urban areas: choosing which area to live in or to visit is an important task when moving to or visiting a new city. Currently there are no dedicated platforms for reviewing and rating aspects of the neighbourhoods of a city. However, we can find many discussions and threads on blogs and question answering platforms that discuss aspects of areas in many cities around the world. In general, these conversations are very comprehensive: they often contain specific information about several aspects of several neighbourhoods. One example is the following (area names are highlighted in bold and aspect related terms are underlined):
“Other places to look at in South London are Streatham (good range of shops and restaurants, maybe a bit far out of central London but you get more for your money) Brixton (good transport links, trendy, can be a bit edgy) Clapham (good transport, good restaurants/pubs, can feel a bit dull, expensive) …”
The example above does not perfectly fit into the existing tasks in sentiment analysis mentioned earlier. In this work, we introduce a new task that not only subsumes the existing sub-fields of targeted and aspect-based sentiment analysis but also makes fewer assumptions about the number of entities that can be discussed in the unit of text.
To compare with the existing aspect-based sentiment analysis task, take the following example from the restaurant dataset used by SemEval shared ABSA [Pontiki et al., 2016] task. “The design of the space is good but the service is horrid!”. The ABSA task aims to identify that a positive sentiment towards the ambiance aspect is expressed (opinion target expression is “space”). Moreover, a negative sentiment is expressed towards the service aspect (opinion target expression is “service”). In this example, it is assumed that both of these opinions are expressed about a single restaurant which is not mentioned explicitly. However, take the following synthetic example that ABSA is not addressing:
“The design of the space is good in Boqueria but the service is horrid, on the other hand, the staff in Gremio are very friendly and the food is always delicious.”
In this example, more than one restaurant is discussed, and the restaurants for which opinions are expressed are explicitly mentioned. We call these target entities. The current ABSA task can only recognise that positive and negative opinions towards the aspect “service” are expressed, but it cannot identify the target entity for each of these opinions (i.e. Gremio and Boqueria respectively). Targeted aspect-based sentiment analysis handles extracting the target entities as well as the different aspects and their relevant sentiments.
In the following, we argue that this task is both very relevant in practice and raises interesting modelling questions. To facilitate research on this task we introduce the SentiHood dataset. SentiHood is based on text from a QA platform in the domain of neighbourhoods of a city. Table 1 shows examples of input sentences and the annotations provided.
|Sentence||Labels|
|The cheap parts of London are Edmonton and Tottenham and they are all poor, crime ridden and crowded with immigrants||(Edmonton,price,Positive), (Tottenham,price,Positive)|
|Hampstead area, more expensive but a better quality of living than in Tufnell Park||(Hampstead,price,Negative), (Hampstead,live,Positive)|
Our contributions in this paper can be summarised as follows:
We introduce the task of targeted aspect-based sentiment analysis as a further step towards extracting more fine-grained information from more complex text in the field of sentiment analysis.
We use the text from social media platforms, in particular QA, for fine-grained opinion mining. So far, all datasets in this field have utilised text from review specific platforms where certain assumptions can be made and data is more constrained and less noisy.
We propose SentiHood, a benchmark dataset that is annotated for the task of targeted aspect-based sentiment analysis in the domain of urban neighbourhoods.
We show that despite the fact that the texts in QA were not written with the goal of writing a review in mind, question answering platforms and online forums are in general rich in information.
We provide strong baselines for the task using both logistic regression and Long Short Term Memory (LSTM) networks and analysis of the results.
SentiHood is a dataset for the task of targeted aspect-based sentiment analysis. It is based on text taken from the question answering platform Yahoo! Answers, filtered for questions relating to neighbourhoods of the city of London. In this section we explain the data collection and annotation process and summarise the properties of the dataset.
2.1 Data Collection Process
Entities in the dataset are locations or neighbourhoods. Yahoo! Answers was queried using the name of each neighbourhood of the city of London. Location (entity) names were taken from the gazetteer GeoNames (http://www.geonames.org/) and restricted to those within the boundaries of London. This list includes names of areas and boroughs and therefore entities are not always geographically exclusive (a borough contains several areas or neighbourhoods). The content of each question-answer pair was aggregated and split into sentences. We keep only sentences that mention a location entity name and discard other sentences.
The number of location mentions in a single sentence in our dataset varies from one to over . To simplify the task, we only annotate sentences that contain one or two location mentions. These sentences were divided into two groups: sentences containing one location mention (Single) and sentences containing two location mentions (Multi). This allows us to observe how difficult each group is for human annotators and for the models.
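The filtering and grouping step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy gazetteer and the whole-word, case-insensitive matching rule are all assumptions, and the sketch counts distinct gazetteer names rather than repeated mentions of the same name.

```python
import re

def find_locations(sentence, gazetteer):
    """Return the gazetteer names that occur in the sentence.

    Assumption: a name counts as mentioned when it appears as a whole
    word (or phrase), ignoring case.
    """
    found = []
    for name in gazetteer:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(name)
    return found

def group_sentences(sentences, gazetteer):
    """Keep sentences with one (Single) or two (Multi) location names."""
    groups = {"Single": [], "Multi": []}
    for sentence in sentences:
        n_locations = len(find_locations(sentence, gazetteer))
        if n_locations == 1:
            groups["Single"].append(sentence)
        elif n_locations == 2:
            groups["Multi"].append(sentence)
        # Sentences with zero or more than two locations are discarded.
    return groups

# Hypothetical gazetteer entries and sentences, for illustration only.
gazetteer = ["Camden Town", "Brixton", "Clapham", "Streatham"]
sentences = [
    "I love Camden Town",
    "Brixton is edgier than Clapham",
    "Streatham, Brixton and Clapham are all in South London",
    "The weather is nice",
]
groups = group_sentences(sentences, gazetteer)
```

A real gazetteer would also need a policy for overlapping names (for example an area whose name is a prefix of another), which this sketch does not address.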
Like existing work in the aspect-based sentiment analysis task [Brychcín et al., 2014], a pre-defined list of aspects is provided for annotators to choose from. These aspects are: live, safety, price, quiet, dining, nightlife, transit-location, touristy, shopping, green-culture and multicultural. Adding a further aspect, misc, was considered. However, in the initial round of annotations we realised that it had a negative effect on the decisiveness of annotators and led to a lower overall agreement. The aspect general refers to a generic opinion about a location, e.g. “I love Camden Town”.
For each selected aspect, annotators were required to select a polarity or sentiment. Most work in this area considers three sentiment categories: “Positive”, “Negative” and “Neutral”. In our annotation, however, we only provided “Positive” and “Negative” sentiment labels. This is because in our data we rarely come across cases where aspects are discussed without a polarity.
2.5 Target Entity
A target entity is a location entity for which an opinion (aspect and sentiment) is expressed. We also refer to the target entity as the target location.
2.6 Out of scope
For sentences that do not comply with our schema, we define the following two special labels. Sentences marked with one of these labels are removed from the dataset.
Irrelevant: When the identified name does not refer to a location entity: for example in the sentence “Notting Hill (1999) stars Julia Roberts and Hugh Grant use the characteristic features of the area as a backdrop to the action”, “Notting Hill” refers to the movie and not the area.
Uncertain: When two contradicting sentiments are expressed for the same location and aspect, e.g. “Like any other area, Camden Town has good and bad parts”. Moreover, when the opinion is expressed for an area without a direct mention in the sentences, e.g. “It’s a very trendy area and not too far from King’s Cross”.
We use the BRAT annotation tool [Stenetorp et al., 2012] to simplify the annotation task. Three annotators were initially selected for the task. None of the annotators are experts in linguistics. Annotators began by reading the guidelines and examples. Each annotator was then required to annotate a small subset of the data. After each round of annotation, agreements between annotators were calculated and discussed, and this procedure continued until they reached a reasonable agreement. 10% of the whole dataset was then randomly selected and annotated by all three annotators. The annotator with the highest inter-annotator agreement was selected to annotate the whole dataset.
Cohen’s Kappa coefficient ($\kappa$) [Cohen, 1960] is often used for measuring the pairwise agreement between two annotators for the task of aspect-based sentiment analysis [Gamon et al., 2005, Ganu et al., 2009] and other tasks [Liakata et al., 2010]. The Kappa coefficient is calculated over aspect-sentiment pairs for each location. Pairwise inter-annotator agreement for aspect categories measured using Cohen’s Kappa is , and , which is deemed of sufficient quality. It is worth mentioning that agreements on different aspect categories varied, with some aspects having a higher agreement rate. Agreements for aspect expressions are , , . These agreements indicate reasonably high inter-annotator agreements [Pavlopoulos, 2014].
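For reference, Cohen's Kappa compares the observed agreement $p_o$ with the agreement $p_e$ expected by chance: $\kappa = (p_o - p_e) / (1 - p_e)$. The sketch below computes it for two annotators; the function name and the toy label sequences are illustrative assumptions and are not taken from the SentiHood annotations.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the two annotators' label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Toy labels for six (location, aspect) pairs, made up for illustration.
annotator_1 = ["Positive", "Positive", "Negative", "None", "None", "None"]
annotator_2 = ["Positive", "Negative", "Negative", "None", "None", "Positive"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on four of six items ($p_o = 2/3$) and chance agreement is $p_e = 1/3$, so $\kappa = 0.5$.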
The main disagreements between annotators occurred in detecting the aspect rather than in detecting the sentiment, aspect expression or target location. For instance, some annotators associated the expression “residential area” with a “Positive” sentiment for aspect “quiet” or “live”, while others did not agree that “residential” implies quietness or desirability for living. In the case of disagreements, the majority vote was taken as the correct annotation.
Some ambiguity was also observed with respect to detecting the target location. This occurred mainly when a location is contained in another location. For instance, the sentence “Angel in Inslington has many great restaurants for eating out” expresses a “Positive” sentiment for the aspect “dining” of the area Angel, which is within the borough of Islington. Some annotators suggested that the sentence also implies the same opinion for Islington. However, in the end all annotators agreed that in such cases no implicit assumptions should be made and only the contained area should be labeled.
SentiHood currently contains annotated sentences containing one or two location entity mentions (SentiHood data can be obtained at http://annotate-neighborhood.com/download/download.html). SentiHood contains sentences with sentences containing a single location and sentences containing multiple (two) locations. Figure 1 shows the number of sentences that are labeled with each aspect, broken down by sentiment (“Positive” or “Negative”). “Positive” sentiment is dominant for aspects such as dining and shopping. This shows that for some aspects, people usually talk about areas that are good for them as opposed to areas that are not. The general aspect is the most frequent aspect with over sentences, while the aspect touristy occurs in less than sentences. Notice that since each sentence can contain one or more opinions, the total number of opinions () in the dataset is higher than the number of sentences.
Location entity names are masked by location1 and location2 in the whole dataset, so the task does not involve identification and segmentation of the named entities. We also provide the dataset with the original location entity names.
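The masking step can be illustrated with a short sketch. The helper name `mask_locations`, the whole-word matching rule and the example names are assumptions made for illustration; the released dataset already ships with masked sentences.

```python
import re

def mask_locations(sentence, location_names):
    """Replace each location name with location1, location2, ...

    Assumption: location_names lists the names in the order in which
    they should be numbered, and each name matches as a whole word.
    Returns the masked sentence and a mapping back to the original names.
    """
    masked = sentence
    mapping = {}
    for i, name in enumerate(location_names, start=1):
        token = f"location{i}"
        masked = re.sub(r"\b" + re.escape(name) + r"\b", token, masked)
        mapping[token] = name
    return masked, mapping

# Hypothetical sentence, for illustration only.
masked, mapping = mask_locations(
    "Angel is very safe and Islington is too far",
    ["Angel", "Islington"],
)
```

Keeping the mapping makes it possible to restore the original names, in the spirit of the unmasked version of the dataset mentioned above.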
We define the task of targeted aspect-based sentiment analysis as follows: given a unit of text $s$ (for example, a sentence), provide a list of tuples (labels) $\{(l, a, p)\}_{t=1}^{T}$, where $p$ is the polarity expressed for the aspect $a$ of entity $l$. Each sentence can have zero to $T$ labels associated with it.
Within the current aspect-based sentiment analysis work, three tasks are defined [Brychcín et al., 2014]: detecting the aspect, detecting the opinion target expression and detecting the sentiment, with detecting the opinion target expression being an intermediary task for identifying the sentiment of the aspect.
Here we focus on identifying only the aspect and sentiment for each entity. We identify each aspect, its relevant sentiment and the target location entity jointly by introducing a new polarity class called “None”. “None” indicates that a sentence does not contain an opinion for the aspect $a$ of location $l$. Therefore the overall task can be defined as a three-class classification task for each $(l, a)$ pair with labels “Positive”, “Negative”, “None”. Table 2 shows an example of the input sentence and output labels.
|location1 is very safe and location2 is too far||(location1,safety,Positive)|
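To make the three-class formulation concrete, the sketch below expands a sentence's annotated tuples into one label per (location, aspect) pair, filling every unannotated pair with “None”. The function name is an assumption, and the aspect list is restricted to the four aspects used in the experiments purely to keep the example small.

```python
ASPECTS = ["general", "price", "safety", "transit-location"]

def expand_labels(locations, opinions, aspects=ASPECTS):
    """Map every (location, aspect) pair to Positive, Negative or None.

    opinions is a list of (location, aspect, polarity) tuples; pairs
    without an annotated opinion receive the extra class "None".
    """
    labels = {(loc, asp): "None" for loc in locations for asp in aspects}
    for loc, asp, polarity in opinions:
        if (loc, asp) not in labels:
            raise ValueError(f"unknown location/aspect pair: {(loc, asp)}")
        labels[(loc, asp)] = polarity
    return labels

# The annotated tuple from the example sentence above.
labels = expand_labels(
    locations=["location1", "location2"],
    opinions=[("location1", "safety", "Positive")],
)
```

With two locations and four aspects, the sentence yields eight classification instances, seven of which carry the label “None”. This illustrates why the “None” class dominates the data.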
Most existing work in the aspect-based sentiment analysis field reports the $F_1$ measure for the aspect detection task, and accuracy for sentiment classification. The scores can be calculated over 2-class or 3-class sentiments [Pontiki et al., 2015]. In our results, the $F_1$ score is calculated with a threshold that is optimized on the validation set.
We also propose the AUC (area under the ROC curve) metric for both aspect and sentiment detection tasks. AUC captures the quality of the ranking of output scores and does not rely on a threshold.
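The difference between a threshold-dependent and a threshold-free metric can be seen in a small sketch. It computes AUC as the probability that a randomly chosen positive instance is scored above a randomly chosen negative one, and picks the decision threshold that maximises $F_1$ on a validation set. The function names and the toy scores are assumptions for illustration, not the evaluation code used for the reported results.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def f1_at_threshold(scores, labels, threshold):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores, labels):
    """Threshold (among the observed scores) with the highest F1."""
    return max(sorted(set(scores)), key=lambda t: f1_at_threshold(scores, labels, t))

# Toy validation scores for "aspect present" (1) versus "absent" (0).
dev_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
dev_labels = [1, 0, 1, 0, 0]
auc = roc_auc(dev_scores, dev_labels)
threshold = best_threshold(dev_scores, dev_labels)
```

The pairwise AUC computation is quadratic in the number of instances, which is acceptable for a sketch; a rank-based formulation would be preferable on larger data.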
Here we propose baselines for the task. In all our methods, we treat the task as a three-class classification for each aspect and use a softmax function as follows:

$$p(y_{l,a} = k \mid e_l) = \frac{\exp(w_k \cdot e_l + b_k)}{\sum_{k'} \exp(w_{k'} \cdot e_l + b_{k'})} \qquad (1)$$

where $y_{l,a}$ is the sentiment label of aspect $a$ for location $l$. $w_k$ and $b_k$ are the weights and the bias specific to each sentiment class $k$, respectively. $e_l$ is a representation of location $l$. This representation can be a BoW or a distributional representation. Each method that we propose here defines its own specific representation for $e_l$.
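As a worked illustration of the softmax classifier described above, the sketch below scores a location representation against class-specific weights and biases and normalises the scores into probabilities over “Positive”, “Negative” and “None”. The function name, the three-dimensional representation and all numeric values are made up for the example.

```python
import math

def softmax_probs(representation, weights, biases):
    """Class probabilities from per-class linear scores.

    representation: feature vector for one location (list of floats).
    weights / biases: dicts keyed by sentiment class.
    """
    scores = {
        k: sum(w * x for w, x in zip(weights[k], representation)) + biases[k]
        for k in weights
    }
    # Subtract the maximum score for numerical stability.
    top = max(scores.values())
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

# Hypothetical parameters for a single aspect classifier.
representation = [1.0, 0.0, 2.0]
weights = {
    "Positive": [0.5, 0.1, 0.2],
    "Negative": [-0.3, 0.4, 0.0],
    "None": [0.0, 0.0, 0.0],
}
biases = {"Positive": 0.0, "Negative": 0.0, "None": 0.1}
probs = softmax_probs(representation, weights, biases)
predicted = max(probs, key=probs.get)
```

Here the linear scores are 0.9, -0.3 and 0.1 for the three classes, so “Positive” receives the largest probability.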
5.1 Logistic Regression
Many existing works in the aspect-based sentiment analysis task (including participants of SemEval ABSA tasks) use a classifier, such as logistic regression or SVM, based on linguistic features such as n-grams, POS information or more hand-engineered features. We can think of these features as a sparse representation that enters the softmax in equation 1. More concretely, we define the following sparse representations of locations:
Mask target entity n-grams:
For each location, we define an n-gram representation over the sentence and mask the target location using a special token. This can help to differentiate between representations of two locations present in the same sentence.
Left right n-grams:
We create an n-gram representation for both the right and the left context around each location mention. We then concatenate these two representations to obtain one single feature vector.
Left right pooling:
Previously, embedding representations over the left and right context have been used for automatic feature detection in the targeted sentiment analysis task [Vo and Zhang, 2015]. Inspired by this approach, we obtain max, min, average and standard deviation pooling over all the word embeddings for the left and right context separately. We then combine the pooled embeddings of the left and right context to obtain a single feature vector. Word embeddings are obtained by running the word2vec tool on a combination of our Yahoo! Answers corpus and a substantially big corpus from the web (http://ebiquity.umbc.edu/redirect/to/resource/id/351/UMBC-webbase-corpus).
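The three location representations described above can be sketched in a few lines. Everything here is an illustrative assumption rather than the original feature extractor: the function names, the `TARGET` mask token, the `L:`/`R:` feature prefixes, whitespace tokenisation and the toy two-dimensional embeddings.

```python
from collections import Counter
import math

def ngrams(tokens, n_max=2):
    """Unigrams up to n_max-grams, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def mask_target_ngrams(tokens, target, mask="TARGET"):
    """N-gram counts over the sentence with the target location masked."""
    masked = [mask if tok == target else tok for tok in tokens]
    return Counter(ngrams(masked))

def left_right_ngrams(tokens, target):
    """Separate n-gram counts for the left and right context of the target."""
    idx = tokens.index(target)
    left = Counter("L:" + g for g in ngrams(tokens[:idx]))
    right = Counter("R:" + g for g in ngrams(tokens[idx + 1:]))
    return left + right

def pool(vectors):
    """Concatenate max, min, mean and standard deviation per dimension."""
    columns = list(zip(*vectors))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
            for c, m in zip(columns, means)]
    return [max(c) for c in columns] + [min(c) for c in columns] + means + stds

def left_right_pooling(tokens, target, embeddings, dim):
    """Pooled embeddings of the left context followed by the right context."""
    idx = tokens.index(target)

    def context_vector(words):
        vecs = [embeddings[w] for w in words if w in embeddings]
        # Assumption: an empty context is represented by zeros.
        return pool(vecs) if vecs else [0.0] * (4 * dim)

    return context_vector(tokens[:idx]) + context_vector(tokens[idx + 1:])

tokens = "location1 is very safe and location2 is too far".split()
masked_feats = mask_target_ngrams(tokens, "location1")
lr_feats = left_right_ngrams(tokens, "location2")

# Toy 2-dimensional embeddings, made up for illustration.
toy_embeddings = {"is": [1.0, 0.0], "very": [0.0, 2.0], "safe": [2.0, 2.0],
                  "too": [0.5, 0.5], "far": [-1.0, 1.0]}
pooled = left_right_pooling(tokens, "location2", toy_embeddings, dim=2)
```

Masking only the target means that the other location keeps its own token, so the same sentence produces different feature vectors for `location1` and `location2`.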
5.2 Long Short-Term Memory (LSTM)
Inspired by the recent success of applying deep neural networks to language tasks, we use a bidirectional LSTM [Hochreiter and Schmidhuber, 1997] to learn a classifier for each of the aspects. Representations for a location ($e_l$) are obtained using one of the following two approaches:
Final output state (LSTM - Final):
$e_l$ is the output embedding of the bidirectional LSTM.
Location output state (LSTM - Location):
$e_l$ is the output representation at the index corresponding to the location entity as illustrated in Figure 2.
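The difference between the two variants lies only in which output of the bidirectional LSTM is passed to the softmax. The sketch below assumes that the per-token outputs of the forward and backward passes are already available as lists aligned with token positions, and that the two directions are combined by concatenation. Both are assumptions made for illustration, as are the function names and the toy numbers.

```python
def lstm_final(forward_outputs, backward_outputs):
    """Representation from the final states of both directions.

    The forward pass ends at the last token; the backward pass reads the
    sentence right to left, so its final output sits at token index 0.
    """
    return forward_outputs[-1] + backward_outputs[0]

def lstm_location(forward_outputs, backward_outputs, location_index):
    """Representation taken at the position of the location token."""
    return forward_outputs[location_index] + backward_outputs[location_index]

# Toy one-dimensional outputs for a three-token sentence (made-up values).
forward_outputs = [[0.1], [0.2], [0.3]]
backward_outputs = [[0.9], [0.8], [0.7]]

final_repr = lstm_final(forward_outputs, backward_outputs)
location_repr = lstm_location(forward_outputs, backward_outputs, location_index=1)
```

With the location-indexed variant, a sentence with two locations yields two different representations from a single pass over the sentence.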
In this paper, we select the four most frequent aspects from the dataset which are: “price”, “safety”, “transit-location” and “general” but the same approach can be applied to the remaining aspects. We divide each collection of single and multiple location mentions into train, dev and test set, with each having , and of data respectively. We choose the best model with respect to the dev set.
In the case of the LSTM, we evaluate the loss on both training set and dev set after each iteration. We save the best model which has the lowest loss on the dev set over all the iterations. We then run this model on the test set and report the results. We report results separately on both categories of single location sentences and sentences with two locations and over all the data in the test set. Results on single location sentences mainly show the ability of the model to detect the correct sentiment for an aspect. On the other hand, results on two location sentences demonstrate the ability of the system not only on detecting the relevant sentiment of an aspect but also on recognising the target entity of the opinion.
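The model selection procedure described above amounts to keeping a copy of the model from the iteration with the lowest dev loss. A minimal, framework-agnostic sketch follows; the function name, the callable arguments and the toy "model" are hypothetical stand-ins for the actual training code.

```python
import copy

def train_with_dev_selection(model, train_step, dev_loss_fn, num_iterations):
    """Train for a fixed number of iterations and keep the best dev model.

    train_step(model) updates the model in place; dev_loss_fn(model)
    returns the loss on the dev set. Both are supplied by the caller.
    """
    best_loss = float("inf")
    best_model = None
    best_iteration = -1
    for iteration in range(num_iterations):
        train_step(model)
        loss = dev_loss_fn(model)
        if loss < best_loss:
            # Snapshot the model so that later updates do not overwrite it.
            best_loss = loss
            best_model = copy.deepcopy(model)
            best_iteration = iteration
    return best_model, best_iteration, best_loss

# Toy setup: a single parameter that overshoots its optimum at w = 3.
toy_model = {"w": 0.0}

def toy_train_step(model):
    model["w"] += 1.0

def toy_dev_loss(model):
    return (model["w"] - 3.0) ** 2

best_model, best_iteration, best_loss = train_with_dev_selection(
    toy_model, toy_train_step, toy_dev_loss, num_iterations=6
)
```

The selected snapshot, not the model from the last iteration, is the one that would then be run on the test set.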
We implement our LSTM models using TensorFlow [ten, 2015]. To tackle the problem of having an unbalanced dataset (i.e. too many “None” instances), we train the LSTM model in batches with every batch having the same number of sentences selected randomly from each sentiment class. We tune the hyper parameters of the model on the dev set. The best model uses hidden units of size and batch sizes of size . The Adam optimizer is used for optimization with a starting learning rate of which is tuned to be the best performing on the dev set. Dropout is used both on initial word embeddings and on LSTM cells with the probability of .
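The class-balanced batching can be sketched as follows. The function name and the toy data are assumptions, and so is the choice to sample with replacement when a class has fewer examples than the per-class quota, since the text does not say how small classes are handled.

```python
import random
from collections import Counter

def balanced_batch(examples_by_class, per_class, rng):
    """Draw the same number of examples from every sentiment class."""
    batch = []
    for label, examples in examples_by_class.items():
        if len(examples) >= per_class:
            chosen = rng.sample(examples, per_class)
        else:
            # Assumption: small classes are sampled with replacement.
            chosen = rng.choices(examples, k=per_class)
        batch.extend((example, label) for example in chosen)
    rng.shuffle(batch)
    return batch

# Toy example ids; "None" heavily outnumbers the other two classes.
examples_by_class = {
    "None": list(range(100)),
    "Positive": list(range(100, 110)),
    "Negative": [200, 201, 202],
}
rng = random.Random(0)
batch = balanced_batch(examples_by_class, per_class=4, rng=rng)
label_counts = Counter(label for _, label in batch)
```

Each batch then contains an equal share of “Positive”, “Negative” and “None” instances regardless of how skewed the full training set is.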
Training Logistic Regression
Logistic regression models were based on implementations from scikit-learn (http://scikit-learn.org/). Since we have an unbalanced dataset, we use a weighted logistic regression. To obtain the best weights, we cross-validate them on the development set. Weights inversely proportional to the size of each class result in the best performance.
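Weights inversely proportional to class size can be computed as in the sketch below. The normalisation used here, $n / (K \cdot n_k)$ for $n$ instances, $K$ classes and $n_k$ instances of class $k$, is the heuristic behind scikit-learn's `class_weight="balanced"` option; whether the experiments used exactly this normalisation is not stated, so treat it as an assumption. The function name and toy labels are illustrative.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Class weights inversely proportional to class frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Toy label distribution dominated by "None" (made-up counts).
toy_labels = ["None"] * 8 + ["Positive"] * 3 + ["Negative"] * 1
class_weights = inverse_frequency_weights(toy_labels)
```

A dictionary of this form can be passed as the `class_weight` argument of scikit-learn's `LogisticRegression`, so that errors on the rare “Positive” and “Negative” classes cost more than errors on “None”.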
Table 3 shows the results (averaged over all selected aspects) in terms of both $F_1$/accuracy and AUCs. It also shows the results of logistic regression based models versus LSTM models.
As we can see, the n-gram representation with location masking achieves slightly better results over the left-right context. N-grams include unigrams and bigrams. Also, by adding POS information, we gain an increase in the performance. We also experimented with adding tri-grams but it did not have a positive effect on the overall scores. Separating the left and the right context (LR-Left-Right) for BoW representation, does not improve the performance. Left-right pooling of dense embeddings performed weakly in comparison with other representations and therefore their results were omitted.
Amongst the two variations of LSTM, the model with final state embeddings does slightly better than the model where we use the embeddings at the location index; however, they are not significantly different (with a value less than ). It is interesting to note that the best LSTM model is not superior to the logistic regression model, especially in terms of AUC. This can be due to the fact that the amount of training data is not sufficient for the LSTM to perform well. Moreover, while we provide some grammar information to the logistic regression model through POS tags, such information is not incorporated into the LSTM models. Another interesting observation is that the $F_1$ measure for the logistic regression model with n-grams and POS information is very low while this model’s performance is superior to other models in terms of AUC. This is because in general, it is easier to rank prediction scores than to assign predicted labels to instances by choosing a hard threshold.
|Model||Aspect ($F_1$)||Sentiment (Accuracy)||Aspect (AUC)||Sentiment (AUC)|
Table 4 shows the average AUC (over aspect and sentiment classification tasks) for two categories of data: Single (sentences that contain one location entity) and Multi (sentences that contain two location entities). While logistic regression performs slightly better on Single location sentences, LSTM performs slightly better on Multi location sentences.
|LR - Mask (n-gram + POS)|
|LSTM - Final|
Table 5 shows the breakdown of average AUC scores for each aspect. We can see that aspects such as “safety” can be predicted with a better AUC score than the aspect “general”.
|LR - Mask (n-gram + POS)|
|LSTM - Final|
Table 6 shows examples of correct and incorrect predictions using the best logistic regression model. The top part of the table contains examples that each contain a single location entity. At the bottom of the table, a sentence with two location entities is provided. The system correctly identifies that a “Positive” sentiment is expressed for the general aspect about location2. However, no sentiment is expressed for this aspect for location1.
|location1 is not a nice cheap residential area to live trust me i was born and raised there||Price||Positive||Negative|
|I think you’d find it tough to find something affordable||Price||Positive||Negative|
|I can’t recommend location1 for affordability||Price||Negative||Negative|
|I only know about location1, most people prefer location2||General||None||None|
|I only know about location1, most people prefer location2||General||Positive||Positive|
8 Related Work
The term sentiment analysis was first used in [Yi et al., 2003]. Since then, the field has received much attention from both research and industry. Sentiment analysis has applications in almost every domain and it has raised many interesting research questions. Furthermore, the availability of a huge volume of opinionated data on social media platforms has accelerated development in this area.
In the beginning, work on sentiment analysis mainly focused on identifying the overall sentiment of a unit of text. The unit of text varied from documents [Pang et al., 2002, Turney, 2002] to paragraphs or sentences [Hu and Liu, 2004]. However, considering only the overall sentiment fails to capture the sentiments over the aspects on which an entity can be reviewed, or the sentiment expressed toward different entities. To remedy this, two new tasks have been introduced: aspect-based sentiment analysis and targeted sentiment analysis.
Aspect-based sentiment analysis assumes a single entity per unit of analysis and tries to identify sentiments towards different aspects of the entity [Lu et al., 2011, Lakkaraju et al., 2014, Alghunaim, 2015, Bagheri et al., 2013, Somprasertsri and Lalitrojwong, 2008, Titov and McDonald, 2008, Brody and Elhadad, 2010]. This task, however, does not consider more than one entity in the given text.
Targeted (target-dependent) sentiment analysis is another task that identifies polarity towards a target entity (as opposed to over the entire unit of text) [Mitchell et al., 2013, Jiang et al., 2011, Dong et al., 2014, Vo and Zhang, 2015, Zhang et al., 2016]. [Jiang et al., 2011] were the first to propose targeted sentiment analysis on Twitter and demonstrated the importance of targets by showing that 40% of sentiment errors are due to not considering them in classification. However, this task only identifies the overall sentiment, and the existing corpora for the task consist only of text with one single entity per unit of analysis.
The task of targeted aspect-based sentiment analysis caters for more generic text by making fewer assumptions while extracting fine-grained information.
In this paper, we introduced the task of targeted aspect-based sentiment analysis and a new dataset. We also provided two strong baselines using logistic regression and LSTM. Ways to improve the baselines could involve using parse trees for identifying the context of each location. Data augmentation can be used to make the models, and especially the LSTM, more robust to variations in the data. We would also like to provide a more detailed analysis of what each system can achieve.
- [Alghunaim, 2015] Abdulaziz Alghunaim. 2015. A Vector Space Approach for Aspect-Based Sentiment Analysis. Ph.D. thesis, Massachusetts Institute of Technology.
- [Bagheri et al., 2013] Ayoub Bagheri, Mohamad Saraee, and Franciska De Jong. 2013. Care more about customers: unsupervised domain-independent aspect detection for sentiment analysis of customer reviews. Knowledge-Based Systems, 52:201–213.
- [Brody and Elhadad, 2010] Samuel Brody and Noemie Elhadad. 2010. An unsupervised aspect-sentiment model for online reviews. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 804–812. Association for Computational Linguistics.
- [Brychcín et al., 2014] Tomáš Brychcín, Michal Konkol, and Josef Steinberger. 2014. UWB: Machine learning approach to aspect-based sentiment analysis. SemEval 2014, page 817.
- [Cohen, 1960] Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1):37–46.
- [Das and Chen, 2001] Sanjiv Das and Mike Chen. 2001. Yahoo! for amazon: Extracting market sentiment from stock message boards. In Proceedings of the Asia Pacific finance association annual conference (APFA), volume 35, page 43. Bangkok, Thailand.
- [Dong et al., 2014] Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. 2014. Adaptive recursive neural network for target-dependent twitter sentiment classification. In ACL (2), pages 49–54.
- [Yi et al., 2003] Jeonghee Yi et al. 2003. Extracting sentiments about a given topic using natural language processing techniques. In Proceedings of the Third IEEE International Conference on Data Mining (ICDM’03), volume 17.
- [Gamon et al., 2005] Michael Gamon, Anthony Aue, Simon Corston-Oliver, and Eric Ringger. 2005. Pulse: Mining customer opinions from free text. In Advances in Intelligent Data Analysis VI, pages 121–132. Springer.
- [Ganu et al., 2009] Gayatree Ganu, Noemie Elhadad, and Amélie Marian. 2009. Beyond the stars: Improving rating predictions using review text content. In WebDB, volume 9, pages 1–6. Citeseer.
- [Hochreiter and Schmidhuber, 1997] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation, 9(8):1735–1780.
- [Hu and Liu, 2004] Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 168–177. ACM.
- [Jiang et al., 2011] Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. 2011. Target-dependent twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, pages 151–160. Association for Computational Linguistics.
- [Jo and Oh, 2011] Yohan Jo and Alice H Oh. 2011. Aspect and sentiment unification model for online review analysis. In Proceedings of the fourth ACM international conference on Web search and data mining, pages 815–824. ACM.
- [Lakkaraju et al., 2014] Himabindu Lakkaraju, Richard Socher, and Chris Manning. 2014. Aspect specific sentiment analysis using hierarchical deep learning.
- [Liakata et al., 2010] Maria Liakata, Simone Teufel, Advaith Siddharthan, Colin R Batchelor, et al. 2010. Corpora for the conceptualisation and zoning of scientific papers. In LREC. Citeseer.
- [Lu et al., 2011] Bin Lu, Myle Ott, Claire Cardie, and Benjamin K Tsou. 2011. Multi-aspect sentiment analysis with topic models. In Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference on, pages 81–88. IEEE.
- [Mitchell et al., 2013] Margaret Mitchell, Jacqueline Aguilar, Theresa Wilson, and Benjamin Van Durme. 2013. Open domain targeted sentiment.
- [Morinaga et al., 2002] Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, and Toshikazu Fukushima. 2002. Mining product reputations on the web. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 341–349. ACM.
- [Pang et al., 2002] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10, pages 79–86. Association for Computational Linguistics.
- [Pavlopoulos, 2014] Ioannis Pavlopoulos. 2014. Aspect based sentiment analysis. Athens University of Economics and Business.
- [Pontiki et al., 2015] Maria Pontiki, Dimitrios Galanis, Haris Papageorgiou, Suresh Manandhar, and Ion Androutsopoulos. 2015. Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, Denver, Colorado, pages 486–495.
- [Pontiki et al., 2016] Maria Pontiki, Dimitrios Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeny Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, and Gülşen Eryiğit. 2016. SemEval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval ’16, San Diego, California, June. Association for Computational Linguistics.
- [Somprasertsri and Lalitrojwong, 2008] Gamgarn Somprasertsri and Pattarachai Lalitrojwong. 2008. Automatic product feature extraction from online product reviews using maximum entropy with lexical and syntactic features. In Information Reuse and Integration, 2008. IRI 2008. IEEE International Conference on, pages 250–255. IEEE.
- [Stenetorp et al., 2012] Pontus Stenetorp, Sampo Pyysalo, Goran Topić, Tomoko Ohta, Sophia Ananiadou, and Jun’ichi Tsujii. 2012. Brat: a web-based tool for nlp-assisted text annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 102–107. Association for Computational Linguistics.
- [ten, 2015] 2015. TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
- [Titov and McDonald, 2008] Ivan Titov and Ryan McDonald. 2008. Modeling online reviews with multi-grain topic models. In Proceedings of the 17th international conference on World Wide Web, pages 111–120. ACM.
- [Turney, 2002] Peter D Turney. 2002. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th annual meeting on association for computational linguistics, pages 417–424. Association for Computational Linguistics.
- [Vo and Zhang, 2015] Duy-Tin Vo and Yue Zhang. 2015. Target-dependent twitter sentiment classification with rich automatic features. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), pages 1347–1353.
- [Zhang et al., 2016] Meishan Zhang, Yue Zhang, and Duy-Tin Vo. 2016. Gated neural networks for targeted sentiment analysis. In Thirtieth AAAI Conference on Artificial Intelligence.