Open Domain Suggestion Mining

This research was funded by Science Foundation Ireland under grant no. SFI/12/RC/2289 (Insight Centre for Data Analytics), and the European Union funded project MixedEmotions (H2020-644632). It was also supported by Ahold Delhaize, Amsterdam Data Science, the Bloomberg Research Grant program, the China Scholarship Council, the Criteo Faculty Research Award program, Elsevier, the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement nr 312827 (VOX-Pol), the Google Faculty Research Awards program, the Microsoft Research Ph.D. program, the Netherlands Institute for Sound and Vision, the Netherlands Organisation for Scientific Research (NWO) under project nrs. CI-14-25, 652.002.001, 612.001.551, 652.001.003, and Yandex. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.
We propose a formal definition for the task of suggestion mining in the context of a wide range of open domain applications. Human perception of the term suggestion is subjective, and this affects the preparation of hand-labeled datasets for the task. Existing work either lacks a formal problem definition and annotation procedure, or provides domain- and application-specific definitions. Moreover, many previously used manually labeled datasets remain proprietary. We first present an annotation study and, based on our observations, propose a formal task definition and annotation procedure for creating benchmark datasets for suggestion mining. With this study, we also provide publicly available labeled datasets for suggestion mining in multiple domains.
Keywords: Suggestion Mining · Opinion Mining · Text Classification · Datasets
Suggestion mining can be defined as the extraction of sentences that contain suggestions from unstructured text. Collecting suggestions is an integral step of any decision-making process. A suggestion mining system could extract the exact suggestion sentences from a retrieved document, which would enable the user to collect suggestions from a much larger number of pages than they could read manually in a short span of time.
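To make the task framing concrete, the following is a minimal sketch of suggestion mining as sentence-level binary classification: a text is split into sentences, and each sentence is either kept as a suggestion or discarded. The cue patterns, the naive sentence splitter, and the function names (`split_sentences`, `is_suggestion`, `mine_suggestions`) are illustrative assumptions and are not taken from this paper. A keyword baseline of this kind is only a stand-in for a trained classifier, and it ignores the subjectivity of the notion of suggestion discussed above.

```python
import re

# Hypothetical cue patterns, chosen for illustration only.
SUGGESTION_CUES = [
    r"\bshould\b",
    r"\bi (?:would )?(?:suggest|recommend)\b",
    r"\bmake sure\b",
    r"\btry\b",
    r"\bit would be (?:nice|great|better)\b",
]
_CUE_RE = re.compile("|".join(SUGGESTION_CUES), re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def is_suggestion(sentence):
    """Binary decision for one sentence, based on the cue patterns above."""
    return _CUE_RE.search(sentence) is not None


def mine_suggestions(text):
    """Return the sentences of `text` that the baseline labels as suggestions."""
    return [s for s in split_sentences(text) if is_suggestion(s)]


if __name__ == "__main__":
    review = (
        "The room was clean. You should ask for a room on the top floor. "
        "Breakfast was average. Try the rooftop bar at sunset!"
    )
    for sentence in mine_suggestions(review):
        print(sentence)
```

The input and output types are the point of the sketch: the system consumes unstructured text and returns the exact sentences judged to contain suggestions, so any sentence classifier can replace `is_suggestion` without changing the rest of the pipeline.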
Apart from suggestions that relate to general topics, industrial and other organizational decision makers seek suggestions to improve their brand or organization [jijkoun-mining-2010]. In this case, consumers or other stakeholders are explicitly asked to provide suggestions. Opinions towards persons, brands, social debates, etc. are generally expressed through online reviews, blogs, discussion forums, or social media platforms, and tend to contain expressions of advice, tips, warnings, recommendations, etc. [amigo-overview-2014]. For example, online reviews may contain suggestions for improvements to the product or service (Table 1), and recommendation platforms often ask their users for specific tips, which are then offered to other users; see Figure LABEL:roomtips for an example from the travel site TripAdvisor (https://www.tripadvisor.com).