Of course we share! Testing Assumptions about Social Tagging Systems
Social tagging systems have established themselves as an important part of today’s web and have attracted the interest of our research community in a variety of investigations. Over time, several assumptions about social tagging systems have emerged, on which our community also builds its work. Yet, testing such assumptions has been difficult due to the absence of suitable usage data in the past. In this work, we thoroughly investigate and evaluate four assumptions about tagging systems – covering social, retrieval, equality, and popularity issues – by examining live server log data gathered from the real-world, public social tagging system BibSonomy. Our empirical results indicate that while some of these assumptions hold to a certain extent, others need to be viewed very critically. Our observations have implications for the understanding of social tagging systems and the way they are used in open environments.
A Study of User Behavior in BibSonomy
|University of Kassel|
|University of Würzburg|
|Technical University Graz|
|University of Würzburg|
|University of Würzburg|
|GESIS & U. of Koblenz|
H.3.4 [Information Storage and Retrieval]: Systems and Software—Information Networks
social tagging; assumptions; social sharing; folksonomy; bookmarking; tagging; behavior
Social tagging systems such as Delicious, BibSonomy or Flickr have attracted the interest of our research community for almost a decade [?, ?]. Significant advances have been made with regard to our understanding about the emergent, individual and collective processes that can be observed in such systems [?]. Useful algorithms for retrieval [?] and classification [?] have been developed that exploit the rich fabric of links between users, resources, and tags in social tagging systems for facilitating information organization, search and navigation. Other work has focused on the extraction or stabilization of emergent semantics [?, ?].
While this line of research has significantly increased our ability to describe, model, and utilize social tagging systems, our community has also built its work on certain assumptions about how these systems are used, which have emerged over time. For such assumptions, arguments and evidence have been discussed in the literature. Yet, due to a lack of appropriate data and other issues, these assumptions have gone largely unchallenged and it is unclear to which degree they hold in actual tagging systems. Some of these assumptions are controversial and researchers in our community have argued for and against them, which provides all the more reason to evaluate them on real-world usage data. Only a few studies have analyzed user behavior in social tagging systems to better understand such assumptions, either by (i) conducting user surveys (e.g., by Heckner et al. [?]) or by (ii) tapping into the rich corpus of tagging data that is available on the web (i.e., the posts) (e.g., by Cattuto et al. [?]). However, such studies come with certain limitations such as self-reporting biases or the lack of detailed usage data – i.e., how users actually request information. In this paper we overcome these drawbacks by presenting and thoroughly investigating a detailed usage log of a popular social tagging system. This allows us to test and challenge a series of assumptions from related work using usage data of the real-world, open social tagging system BibSonomy (Footnote 1: http://www.bibsonomy.org/).
Research questions. We discuss and evaluate the following four controversial assumptions about tagging: (i) The social assumption establishes that tagging systems are supposed to be used collaboratively to tag and share resources. We investigate to which degree such sharing actually happens and discuss evidence for the interest of users in the content of others. (ii) The retrieval assumption states that users tag resources for later retrieval. (iii) The equality assumption claims, that the three sets of entities in a tagging system – users, tags and resources – are equally important. This assumption is inherent in the folksonomy model (e.g., [?]) that is a popular basis for recommendations and ranking algorithms. (iv) The popularity assumption suggests that the popularity of users, tags, and resources in posts is matched by their popularity in retrieval. In tagging systems, popularity is used for example in tag clouds where frequent tags have large font sizes to gain the users’ attention and to be easily accessible by a mouse click.
Findings. In our analysis of the social tagging system BibSonomy, we find evidence both for and against the social assumption. While some user actions indeed indicate social sharing purposes, others are evidence for individual purposes, suggesting that social tagging systems provide utility because they can support both kinds of modes in a flexible manner. We also find that while users post a large number of resources and tags to BibSonomy, they only retrieve a rather small fraction of them later, which provides first evidence that the retrieval assumption might not hold for systems such as BibSonomy. Next, we find a strong inequality between the use of users, tags and resources within BibSonomy. User pages are visited much more often than corresponding resource or tag pages, providing clear evidence that the equality assumption in BibSonomy is wrong. Finally, while we observe common usage patterns in post and request behavior on an aggregate level, the patterns are less pronounced on an individual level, suggesting that the popularity assumption only holds to certain extents.
Contributions. The paper makes contributions on three levels. (i) Methodical: We identify a series of assumptions and illuminate a way towards testing them with log data. While our findings are limited to a single system (BibSonomy), our method of testing these assumptions is general. The approach can well be applied to other social tagging systems to test the extent to which these assumptions hold in different contexts. (ii) Empirical: We challenge a number of assumptions by testing them with actual log data and report their validity for the popular social tagging system BibSonomy. (iii) Data driven: We make an anonymized BibSonomy server log dataset available to other researchers (see the section describing the datasets). This will enable our community to investigate similar or different questions on a unique dataset that has not been available yet.
Structure of this paper. After the discussion of related work, we describe the BibSonomy datasets. We then turn our attention to studying and evaluating the aforementioned four assumptions on social tagging. Afterwards, we discuss differences between BibSonomy and other tagging systems. Finally, we conclude the paper.
Overall, our findings are relevant for researchers interested in user behavior and modeling in the context of social tagging systems and their adoption as well as to system engineers interested in improving the utility and usefulness of social tagging systems on the web.
In this section we discuss related literature on the investigation of tagging systems and log file analysis in general. Further related work that is specifically relevant to individual assumptions will be discussed in greater detail later, in the context of the respective assumption.
Inception. Work on social tagging and emerging folksonomies began in late 2004, when the term folksonomy was first mentioned by Vander Wal (Footnote 2: http://vanderwal.net/random/category.php?cat=153), and continued in 2005 in various blog posts and papers. One of the first reviews about social tagging systems was provided by Mathes in [?]. He noted that social tagging systems allow a much greater variability in organizing content than formal classification can provide. Mathes was also among the first to hypothesize that tag distributions may converge to power law distributions, which can characterize the semantic stabilization of such systems (see also [?]). Furthermore, Mathes identified some potentials and uses of tagging systems, such as serendipitous browsing.
User Surveys and Post Analysis. Abrams et al. [?] already discussed the management of website bookmarking long before the rise of social tagging on the Web using a user survey and bookmark files from participants. Their results showed that users are motivated to share bookmarks (still via email back then) as well as to retrieve them later. Heckner et al. [?] conducted a survey of tagging systems (namely Flickr, YouTube, Delicious and Connotea) with 142 users regarding their motivations. The results showed that there are mainly two motivations for users to post content: sharing resources with others and storing them primarily for personal retrieval later on. The strength of these motivations varies from system to system.
Using the post data of tagging systems, several studies analyzed aspects of posting behavior, e.g., the distributions of users, resources, and tags in posts [?], or the identification of different types of users – categorizers and describers – regarding their choice of tags [?]. However, these studies did not use log data for their analysis to explore the actual retrieval behavior. A review of social tagging regarding a variety of diverse aspects of such systems – including vocabulary, structure, visualization, motivation, or search and ranking – was created by Trant [?].
Web Log Mining. Predominantly, web logs have been used to investigate the query behavior in search engines or the usage of digital libraries in order to better understand a system’s users. This can help webmasters to tailor their systems more specifically to the users’ needs. A survey on such works about search engines is given by Agosti et al. [?]. Examples for the analysis of digital libraries can be found in the works of Nicholas et al. (e.g., [?]). Tagging systems exhibit aspects of both search engines and libraries. They are collections of resources with descriptions and categories. Unlike a digital library, however, they are not professionally organized: users organize them in their individual fashion by assigning tags and entering meta data. Nonetheless, the data is clearly more structured than data on the Web in general, as posts are constructed according to a specified template.
Carman et al. [?] combine tagging data with log data from search engines and compare the distribution of tags to that of query terms in search. They find a large overlap in the systems’ vocabularies and correlations between the frequency distributions of queries and tags to the same URLs. However, they also provide evidence that both tag and query term samples do not come from the same distribution.
While there exists a large amount of literature on tagging systems, to the best of our knowledge, the only works utilizing and analyzing log data from a tagging system are those by Millen and Feinberg [?], Millen et al. [?], and Damianos et al. [?]. Millen and Feinberg investigated user logs of the social tagging system Dogear (internally used at IBM) with a focus on social navigation in the system. They found strong evidence that social navigation – i.e., users who are regularly looking at bookmark collections of other people – is a fundamental part of the social tagging system. They also found a positive correlation between the assignment frequency of a tag in posts and the frequency of it being used for browsing. These findings have been highly relevant for the understanding of tagging behavior as they provide actual evidence of how users make use of a tagging system’s content. Millen et al. [?] combined log analysis and user interviews to investigate the way users retrieve resources. They observe diverse behavior patterns for different users and find that heavy users tend to spend more time with their own collections than users with only few bookmarks. Damianos et al. [?] introduced a tagging prototype called onomi to the organization MITRE. They use log data to determine how well the system was accepted and present several usage statistics from a ten-month test period. They found that their users can be categorized into information providers and information consumers depending on their individual ratio of browsing and bookmarking activities.
We compare findings in our experiments to the above-mentioned analyses where possible. However, all three works focus on local social tagging systems located inside the network of a particular company. They therefore represent private systems where users only tag resources inside the company’s field of interest, and hence the results are hard to compare with a real-world tagging system. Millen et al. [?] already note that these kinds of services require their users to use corporate identities instead of pseudonyms, which is typically not the case in public systems. In contrast, in this work we focus on the publicly available system BibSonomy to overcome this limitation. This leads to some interesting deviating insights that are discussed below for the social and the popularity assumption. We not only extend the analyses in [?, ?, ?] by investigating a series of assumptions about social tagging systems, but also benefit from long-time log data, which gives us a clearer overview of actual user behavior in an already established social tagging system.
The datasets used in this paper are based on web server logs and database contents of the social tagging system BibSonomy [?]. BibSonomy allows users to store, tag, and share links to websites and (scientific) publications and offers, for example, the following options to query for posts (Footnote 3: for details on the BibSonomy URL schema see http://www.bibsonomy.org/help_en/URLSchemeSemantics): A user can request to see all posts with one or several tags, or posts from a specific user or group, or use a combination of user and tag restrictions. For each resource, BibSonomy has a page that lists its tags and users from all posts. Publication posts have a details page that shows the meta data of the publication (as entered by the user who created the post) and offers export options. Posts of bookmarked websites can also contain meta data (like a description of the website), but requests to a bookmark are usually conducted by just clicking on a post’s title to reach the website. Such requests are not recorded in BibSonomy’s server logs and therefore, we must restrict some experiments exclusively to publication requests.
In BibSonomy, users can form groups or declare friendship to other users. Both friendships and groups are used in the visibility concept of posts. BibSonomy offers many further features like discussion forums, or a full text search, that exceed the usual tagging system functionality. Therefore, such features have been excluded from our experiments.
Due to its high rank in search engines, BibSonomy is a popular target for spammers. Spammers are users who store links to advertisement sites to increase their visibility on the web. BibSonomy uses a learning classifier [?] as well as manual classification by the system’s administrators to detect spam. In all experiments, we only used data of users that have been classified as non-spammers.
We restricted the datasets to data that had been created between the start of BibSonomy in 2006 and the end of 2011, since early in 2012 the login mechanism was modified, which introduced significant changes to the logging infrastructure. With this paper, we make anonymized datasets of logs and posts available to researchers (Footnote 4: http://www.kde.cs.uni-kassel.de/bibsonomy/dumps/).
User and Content Dataset. We use tagging data from BibSonomy’s database, i.e., the users with their posts, containing resources and tags, as well as all data about groups and friendships. In the considered time frame, 852 172 people registered a user account of which 17 932 were classified as non-spammers. They created 551 606 bookmark posts and 2 391 721 publication posts using 250 344 tags.
In this section, we present our results. For each assumption, we (i) make the assumption explicit, (ii) provide evidence for the assumption in the literature, (iii) present the results of our research and (iv) discuss our findings.
Assumption. The social assumption states that users of social tagging systems use the system to (re)use resources that have been shared and tagged by others, either by viewing them or by copying them into their own collection.
Evidence. The social aspect of tagging has been subject to controversial discussion in the past. It has been praised and disputed already early in the history of tagging systems. Mathes [?] stated that folksonomies could “lower the barriers to cooperation” and Weinberger [?] named it as one of two aspects that “make tagging highly useful”. Marlow et al. [?] presented an early model for social tagging systems where they argued that social relations between users are a critical element. The authors point out that social interaction connects bookmarking activities of individuals with a rich network of shared tags, resources and users. Furthermore, Millen and Feinberg [?] found out that around of all page requests in Dogear – an internal social tagging service at IBM – refer to content that was bookmarked by other users. In contrast to that, Damianos et al. [?] noticed that users are looking more at their own (70%) than at other users’ collection in their system onomi. Yet, it is not self-evident that similar observations can be made for public tagging systems, where users use the system without direct company guidance that might influence their behavior. Contrary, users may choose to use such systems for individual purposes only, e.g., to create their own collections, and thus ignore the resources of other users. Vander Wal [?] already pointed out that personal information management may be one of the main reasons why people use social tagging systems which was also emphasized by Terdiman [?]. Porter [?] claimed that “Personal value precedes network value: Selfish use comes before shared use”. A user survey by Heckner et al. [?] found that about of all users store resources in tagging systems comparable to BibSonomy mainly to retrieve them themselves, not particularly to share them (in contrast to systems where images or videos are shared). 
However, it is also noted that “even users of systems who claim that personal information management is very important for them, state that sharing is also part of their motivation of using the systems” [?]. While this survey takes the perspective of motivation for posting we will rather take the viewpoint of the usage of posts. We conducted a first evaluation of social behavior in [?], which we extend and detail in the following.
Results. Visiting Content. First, we analyze the ownership of visited (retrieved) content. The table of content visits shows the number of requests to pages for different ownership categories (Footnote 5: The BibSonomy landing page was considered separately because, although it lists recent posts of any users, many users just visit it to start retrieval by using the provided input fields on the page, and thus ignore the displayed resources). We can observe that more than two thirds of all requests of logged-in users target their own pages. Users visit other pages in about 32% of the requests to look at either general pages, i.e., pages containing posts of several users (about 16%), or content of individual other users or groups (about 16%). Requests to group and friend pages are rather infrequent (about 3% combined), indicating that these particularly social features (in BibSonomy they are used to control the visibility of posts) play only a minor role. Further, the share of visits to content of others is below the share reported for a company-internal tagging system by Millen and Feinberg [?], but similar to the share reported by Damianos et al. [?]. In summary, we see that the larger share of interactions in BibSonomy happens with the personal collection. However, the interest in other users’ content accounts for a significant part – almost one third of all retrieval requests – of the interaction with the system.
| | # requests | % requests |
| user’s own | 1 018 089 | 68.37 |
| groups and friends | 44 875 | 3.01 |
| other users | 190 394 | 12.79 |
| general | 235 808 | 15.83 |
| landing page | 296 788 | - |
Table: Content visits. Request counts to the (logged-in user’s) own content, to content from other users, or to general (non-user-specific) pages. Requests to the landing page (see Footnote 5) are not considered in the percentage calculation.
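The percentages in the table can be recomputed from the request counts. The following is a minimal sketch for checking the arithmetic; the helper name `shares` is an assumption and not part of the original study.

```python
# Request counts copied from the content-visits table.
# The landing page (296 788 requests) is deliberately left out of the total.
counts = {
    "user's own": 1_018_089,
    "groups and friends": 44_875,
    "other users": 190_394,
    "general": 235_808,
}

def shares(counts):
    """Return each category's percentage of the summed counts (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}

pct = shares(counts)
# Everything that is not the user's own content.
others = round(100 - pct["user's own"], 2)
print(pct, others)
```

The value of `others` corresponds to the "almost one third" of retrieval requests that target content outside the user's own collection.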
Copying Resources. When users added new posts to their collections, in 11% of all cases a bookmark or a publication was copied from another user (Footnote 6: We ignore imports of bookmark or publication lists (e.g., browser bookmark or BibTeX files) because during such transfers of own collections to BibSonomy, it would not be meaningful to look for resources in other users’ collections). Users copied publications (18%) more often than bookmarks (3%). One reason for this difference might be the fact that users leave the system when they follow a bookmarked link, while they stay within BibSonomy when they check out details of a publication. Thus, using, e.g., a bookmarklet provided by BibSonomy is the easiest way to post a website, whereas clicking the copy button on the page is more likely an option for a publication. We note that the share of 3% of copied bookmarks is close to the share reported by Millen and Feinberg [?] for the IBM-internal system Dogear, while the share for publications (18%) exceeds that value by a factor of eight.
Since a resource could only be copied if another user had already posted that resource in BibSonomy, we have to take into account whether posted resources were already present in the system when a user posted them. Among those posts that were not created by copying from another user, only about 17% had the corresponding resource already in the system and thus could have been copied. Of all posts that could have been created through copying at the time of posting, a share of roughly 42% has indeed been copied. This can be regarded as a relatively large share, since looking up publications or websites in BibSonomy is only one out of many possible ways to find interesting bookmarks and publications on the web or elsewhere.
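The roughly 42% follows from the two shares given in the text. A small worked check, assuming the rounded values of 11% and 17% (so the result is only approximate); the variable names are illustrative:

```python
copied = 0.11            # share of new posts created by copying (from the text)
not_copied = 1 - copied  # remaining posts
already_present = 0.17   # share of non-copied posts whose resource already existed

# Posts that could have been copied: those that were copied, plus those that
# were not copied although the resource was already in the system.
copyable = copied + not_copied * already_present
copy_rate = copied / copyable
print(round(copyable, 4), round(copy_rate, 2))
```

In other words, about a quarter of all new posts had a copyable counterpart in the system, and about 42% of those were in fact created by copying.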
Copying Tags. Finally, we study whether not only resources, but also tags are copied. To that purpose we counted how often users who copied a resource used tags from their own vocabulary or tags of the original post to describe their new post. In 87% of all copy requests at least one tag from the own vocabulary was used. In 42% of all copies at least one of the original post’s tags was adopted. In the other copy events, 44% of the original posts had only special tags like “imported” that are probably not meaningful for the user copying the post. Similarly to the copying of resources we thus find that users copy tags in a large number of cases; although in the majority of cases own tags were used.
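The check per copy event can be sketched as two set intersections. This is an illustration only: the function `classify_copy` and its inputs are hypothetical, and "own vocabulary" is assumed to mean the tags the copying user had used before the copy.

```python
def classify_copy(new_tags, own_vocabulary, original_tags):
    """Classify one copy event by the origin of the tags on the new post.

    new_tags: tags the copying user assigned to the new post.
    own_vocabulary: tags that user had used before (assumed definition).
    original_tags: tags of the post that was copied.
    """
    new_tags = set(new_tags)
    return {
        "uses_own_tag": bool(new_tags & set(own_vocabulary)),
        "adopts_original_tag": bool(new_tags & set(original_tags)),
    }

# Made-up example: one tag comes from the own vocabulary, one from the original post.
event = classify_copy(
    new_tags=["folksonomy", "toread"],
    own_vocabulary=["toread", "web"],
    original_tags=["folksonomy", "imported"],
)
print(event)
```

Both flags can be true for the same copy event, which is why the reported shares of 87% and 42% do not need to sum to 100%.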
Discussion. We found evidence for both personal information management and social interaction. In general, the findings fit well with the result from [?] that the motivation for posting websites and publications is not predominantly social, as illustrated, e.g., by the low share of visits to groups and friends. However, while users might not contribute content particularly to share it (like in social networks), we could still observe evidence that they do profit from the availability of other users’ content. The shares of visited posts and copied resources and tags are evidence of social interaction and demonstrate that the collaborative aspect of the tagging system is recognized and used. For webmasters of such systems, our results confirm that it is reasonable to assist users in discovering content of others, e.g., through search functions or through recommendations.
Figure: Revisitation behavior of users. All four panels are plotted on a log-log scale. Panel (a) illustrates the number of times users revisit their own publications and panel (b) the number of days elapsed between the posting of a publication and its first retrieval by its owner. Panels (c) and (d) display the revisit count and elapsed days for tags accordingly.
Assumption. With the retrieval assumption, we refer to the notion that tagging systems are used to manage personal collections of resources for their retrieval later on.
Evidence. In a study on the usage of browser bookmarks by Abrams et al. [?] it was found that users revisit about 96% of their own bookmarks within one year. This gives rise to the assumption that in tagging systems posts also serve an archival purpose. It was hypothesized already at the very beginning of social tagging research that personal information management may be one of the main reasons why people use social tagging systems, e.g., by Vander Wal [?]. Since the idea of social bookmarking can be seen as an extension of the classic browser based bookmarking and since we found in the previous section that users spend more requests on their own collection than on other users’ posts, it seems plausible to assume that users of tagging systems frequently revisit their own posts and tags. As already mentioned in the foregoing section, the user survey by Heckner et al. [?] identified personal management as the main motivation to post resources to systems like BibSonomy.
Results. We present statistics about revisiting patterns obtained for both publication posts and assigned tags (Footnote 7: Requests to bookmarks could not be analyzed since they target pages outside BibSonomy and therefore requests for such pages are not recorded in the logs; see the description of the datasets). More precisely, we investigate how many times users revisit their own posts and tags, and also the time difference between the posting of a resource or tag and its first retrieval, counted in days. In order to give users a reasonable amount of time for revisits, we capture all posts until the end of 2010 and all requests until 2011. This means that each user had at least a whole year to revisit their posted resources and tags.
The results are shown in the figure on revisitation behavior. Around 49% of all publications were revisited by their owner at least once. If a publication has been revisited at all, it mostly was revisited only once (see Figure (a)). Furthermore, we can observe in Figure (b) that most of the first revisits to a page took place shortly after the resource had been posted, often on the same day. These visits could well be control visits to check the created post; however, it could also mean that users posted a publication immediately before they used it (e.g., in a citation). The revisit investigations of tags show a more drastic picture. Only around 17% of tags are used in queries at least once by a user who previously assigned them to some post. In Figures (c) and (d) we can observe similar patterns as for publications. If revisited, tags have mostly been revisited only once and often shortly after the assignment.
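The two quantities described above, the revisit count and the number of days until the first retrieval, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and the input format (one posting date plus the dates of the owner's requests to that post) are assumptions.

```python
from datetime import date

def revisit_stats(posted_on, request_dates):
    """Return (revisit count, days until the first revisit or None).

    posted_on: date on which the owner created the post (hypothetical input).
    request_dates: dates of the owner's requests to that post (hypothetical input).
    """
    # Only requests on or after the posting day can be revisits.
    revisits = sorted(d for d in request_dates if d >= posted_on)
    if not revisits:
        return 0, None
    return len(revisits), (revisits[0] - posted_on).days

def share_revisited(stats):
    """Fraction of posts that were revisited at least once."""
    return sum(1 for count, _ in stats if count > 0) / len(stats)

# Two made-up posts: one revisited on the posting day and once later, one never.
stats = [
    revisit_stats(date(2010, 3, 1), [date(2010, 3, 1), date(2010, 6, 9)]),
    revisit_stats(date(2010, 5, 20), []),
]
print(stats, share_revisited(stats))
```

A same-day revisit yields zero elapsed days, which matters for a log-log plot because zero cannot be shown on a logarithmic axis without an offset or a separate bin.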
Discussion. In the previous section we saw that interactions with the personal collection account for the dominant share of users’ retrieval requests. Although, according to Heckner et al. [?], users use the system for later retrieval, we now find that only about half of all publications are revisited. Particularly interesting about this observation is that it does not agree with the work by Abrams et al. [?] on browser bookmarks, where 96% of all bookmarked resources were revisited within one year. The difference might result from several factors: First of all, using a publication is different from revisiting a website – many websites renew their content frequently and are easier to consume than scientific publications. Moreover, the user survey reported the difficulty of creating and organizing the bookmarked resources, whereas tagging systems aim to simplify the process of creating and ordering bookmarks as much as possible. This could imply that users tend to store more, simply because the effort is low. Another reason for the lower retrieval rate is that the retrieval of single posts is only one way to make use of one’s own collection. Another reasonable way to use stored publications for citing is to export them in bulk (e.g., all or many publications in the collection) into a suitable citation format and to select the actually used publications offline. More surprising is the small share of own tags used for retrieval. An explanation for this observation might be that it is reasonable to use many tags for a resource to increase the chance of successful retrieval later on. Furthermore, we will show in the next section that using tags is not the dominant way to query the system anyway. For tag recommendation, for example, the results indicate that the relevance rankings of recommendations should not only take the quantity of posts into account (e.g., like all algorithms in [?]) but also the visits to them.
Assumption. The equality assumption states that the three entity sets in a tagging system – the sets of users, resources, and tags – are equally important, e.g., for navigation or retrieval.
Evidence. A folksonomy – the structure underlying tagging systems – has been defined as the sets of users $U$, resources $R$, and tags $T$ together with the tag-assignment relation $Y \subseteq U \times T \times R$ (compare [?]). In that model, users, resources, and tags are treated equally and in fact even symmetrically. The folksonomy model has been widely accepted and many algorithms build on it, e.g., the FolkRank by Hotho et al. [?] or the tensor factorization method by Rendle et al. [?]. Since tag assignments link entities of all three sets together, the idea of the typical folksonomy navigation is that these entities can be navigated following these links (e.g., clicking a tag to request all posts to which that tag is assigned).
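To make the symmetry of the model concrete, tag assignments can be represented as plain (user, tag, resource) triples, and each kind of navigation becomes a filter on one component. The sketch below is illustrative only; the data and function names are made up.

```python
# Tag assignments as (user, tag, resource) triples; all names are made up.
assignments = {
    ("alice", "web", "paper1"),
    ("alice", "tagging", "paper1"),
    ("bob", "web", "paper2"),
}

def posts_with_tag(assignments, tag):
    """All (user, resource) pairs to which `tag` is assigned."""
    return {(u, r) for (u, t, r) in assignments if t == tag}

def posts_of_user(assignments, user):
    """All (tag, resource) pairs assigned by `user`."""
    return {(t, r) for (u, t, r) in assignments if u == user}

print(posts_with_tag(assignments, "web"))
print(posts_of_user(assignments, "alice"))
```

In this representation no component is privileged: filtering by user, tag, or resource is the same operation on a different position of the triple, which is the structural reading of the equality assumption.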
| | user | resource | tag |
| # requests (per entity) | 37.97 | 0.14 | 1.09 |
| # requests (total) | 680 815 | 327 703 | 272 566 |
| % requests (total) | 53.14 | 25.58 | 21.28 |
| # requests (to other) | 113 674 | 100 800 | 76 851 |
| % requests (to other) | 39.02 | 34.60 | 26.38 |
Table: Entity request shares in BibSonomy. We report for each set of folksonomic entities the average number of requests per entity in that set – e.g., dividing the total number of requests to tags by the total number of tags – as well as the total number and relative share of requests to entities of that set – among all requests (total) and among requests targeting content outside the own collection (to other), e.g., a request to the user Y by user X.
Results. Request shares. As for the previous assumptions, we analyze retrieval requests. We split them into requests querying specifically for users, tags, or resources (Footnote 8: Note that requests to resources are generally underrepresented due to the lack of recorded requests to bookmarks; see the description of the datasets). Queries with more than one queried entity have been assigned to the set of the entity that dominates the request. For example, the post’s details page belongs to the post’s owner, but the target is clearly the resource rather than the user. A request containing a user and a tag has been counted as a tag request. Requests that are not specific to some entity (like the landing page) have been ignored.
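The assignment rule can be sketched as a simple precedence order. The text only states two cases explicitly (user plus resource counts as resource, user plus tag counts as tag); the precedence of resource over tag used below is an assumption made for illustration, as is the function name.

```python
def dominant_entity(has_user=False, has_tag=False, has_resource=False):
    """Assign a request to one entity set, or None if it is not entity-specific.

    Stated in the text: a details page (user + resource) counts as a resource
    request, and a user + tag request counts as a tag request.
    Assumed here: resource also takes precedence over tag.
    """
    if has_resource:
        return "resource"
    if has_tag:
        return "tag"
    if has_user:
        return "user"
    return None  # e.g. the landing page; such requests are ignored

print(dominant_entity(has_user=True, has_resource=True))  # details page of a post
print(dominant_entity(has_user=True, has_tag=True))       # user page restricted by tag
print(dominant_entity())                                   # not entity-specific
```

Under this rule a request is attributed to the user set only when no tag or resource is involved, which is why the approach tends to underestimate the role of users.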
For each set of entities (users, resources, and tags), the table of entity request shares shows the average number of requests to an entity of that set (first row), e.g., the number of requests to any tag divided by the total number of tags, and the total number of requests to entities of that set together with their relative shares of all requests (second and third row). For comparison, we also report the requests to entities and their shares when only looking at requests where users have accessed content of other users. We can clearly see that the requests are not equally distributed. There are about 2.5 times more requests to specific users than to specific tags, and the share of resources is slightly higher than that of tags. From the average requests per entity we can deduce that this strong imbalance is not simply caused by a similar imbalance in the size of the sets. Despite the fact that BibSonomy has far more tags and resources than it has users, a user page is on average queried much more often than a resource or a tag page.
As the use of a tagging system consists of work with one's own posts as well as with posts from other users, we analyze the same request shares for the latter case separately in rows four and five of the table of entity request shares. The share of user requests drops, because we excluded requests to users' own pages, which account for the larger share of requests in BibSonomy (see the section on the social assumption). Nevertheless, the queries for users still outnumber those for tags, albeit to a lesser extent. It is also interesting to note that the ratio of resource requests to tag requests is approximately the same: 1.2 (total requests) vs. 1.3 (to others). This indicates comparable user behavior within one's own collection and within the content of other users.
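The shares and ratios discussed above follow directly from the request counts in the table. The following is a minimal sketch that recomputes them; the counts are taken from the table, while the helper names `shares` and `ratio` are illustrative and not part of the original analysis.

```python
# Request counts taken from the table of entity request shares.
total_requests = {"user": 680_815, "resource": 327_703, "tag": 272_566}
other_requests = {"user": 113_674, "resource": 100_800, "tag": 76_851}


def shares(counts):
    """Return each entity set's percentage share of all counted requests."""
    overall = sum(counts.values())
    return {entity: 100.0 * n / overall for entity, n in counts.items()}


def ratio(counts, numerator, denominator):
    """Return the number of requests to `numerator` per request to `denominator`."""
    return counts[numerator] / counts[denominator]


total_shares = shares(total_requests)
other_shares = shares(other_requests)

for label, counts, pct in (("total", total_requests, total_shares),
                           ("to others", other_requests, other_shares)):
    print(label,
          {entity: round(value, 2) for entity, value in pct.items()},
          "user/tag:", round(ratio(counts, "user", "tag"), 2),
          "resource/tag:", round(ratio(counts, "resource", "tag"), 2))
```

Keeping the two request sets in separate dictionaries makes it easy to see that the user share drops once requests to one's own pages are excluded, while the resource-to-tag ratio stays close to constant.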
By assigning each request to one dominating entity as described above, we chose a rather conservative approach that tends to underestimate the relevance of requests to users. In a similar experiment we directly counted the requested entities. Thus, a request to a post with a requested resource and a requested user was counted for both user and resource. The result (omitted here due to space limitations) shows an even stronger imbalance towards users, i.e., about 65% of the requested entities were users.
Transition Probabilities. Next, we look at navigational transition probabilities from one entity set to another (determined from the requests' referer attribute using first-order Markov chain probabilities; see the corresponding figure). We can observe that self-transitions are dominant, suggesting that users tend to stay with the same type of entity in their navigational paths through BibSonomy. Aside from that, there are many transitions from user pages to resource pages and tag pages. This is not surprising, as user pages consist of listings of a user's resources, which can be reached with a single click. This also explains the transitions back to user pages and reflects the "browsing" in the system. A notable exception is that there are few transitions from resource pages to tag pages, meaning that users seem only rarely interested in resources with the same tags as the resource at hand.
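To illustrate how such first-order transition probabilities can be estimated, the following minimal sketch counts transitions between entity types and normalizes each row. The list of (referer entity type, requested entity type) pairs is hypothetical toy data, and the function name is an assumption for illustration; the actual log processing is not shown in this paper.

```python
from collections import Counter, defaultdict

# Hypothetical (referer entity type, requested entity type) pairs; in a real
# analysis these would be derived from the referer attribute of each request.
transitions = [
    ("user", "user"), ("user", "resource"), ("user", "tag"),
    ("user", "user"), ("resource", "resource"), ("resource", "user"),
    ("tag", "tag"), ("tag", "resource"), ("tag", "tag"),
    ("resource", "resource"),
]


def transition_probabilities(pairs):
    """Estimate first-order Markov transition probabilities from row-normalized counts."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return {
        source: {target: n / sum(targets.values()) for target, n in targets.items()}
        for source, targets in counts.items()
    }


probabilities = transition_probabilities(transitions)
for source, row in probabilities.items():
    print(source, {target: round(p, 2) for target, p in row.items()})
```

Each row sums to one, so a dominant diagonal entry corresponds to the self-transitions discussed above.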
Discussion. We can observe a strong inequality between the use of the three folksonomy entities of users, tags and resources. While the numbers of requests to tags and to individual resources are similar, they are dominated by the requests to user pages. This is surprising, as there are fewer user pages than tag or resource pages available in BibSonomy. When discussing navigation within folksonomies, resources are usually regarded as targets of queries. As navigational means to find or retrieve these resources, tags often receive more interest than users, as they can function as resource descriptors. In BibSonomy, however, it seems that the user pages are the main means of navigation. From the transition probabilities, we find that especially navigation from resources to tags (and thus to potential further resources with the same tag or topic) is rather rare. This observation is again surprising. It means that tag-based navigation is less prominent, and that algorithms like FolkRank, which model the transitions between entities, need to be revisited. In FolkRank, transitions between users, tags, and resources are modeled with equal probabilities, which, as we found, does not reflect actual user behavior properly.
Figure: Frequency distributions in requests and posts. In log-log scale, displayed are (a) the frequency distributions in requests and in posts for tags, (b) fits of the respective complementary cumulative probability distributions to different standard cumulative probability distributions (the vertical lines indicate the corresponding minimum values), and (c) the frequency distributions in requests and in posts for resources.
Assumption. The popularity assumption captures the idea that the popularity of folksonomic entities – the number of posts a user, a resource, or a tag occurs in or its frequency distributions – matches similar properties in requests.
Evidence. In tagging systems, the notion of popularity is exploited in several ways: (i) special "popular" pages summarize the most frequently posted resources or tags, (ii) next to a resource, the number of posts it occurs in is shown, (iii) users' profile pages often show the number of their posts, and (iv) several algorithms for the recommendation of tags [?] and resources [?] suggest the most frequently used entities. Perhaps most prominently, tag frequency is exploited in tag clouds, where the frequency of a tag corresponds to its font size and particularly rare tags are sometimes not displayed at all. Brooks and Montanez [?] point out that it is taken for granted that the tags a user assigns are the same as those a reader would select. Hence, the authors identified the relationship between the task of article tagging and information retrieval as an open question to investigate. In the user study by Sinclair and Cardew-Hall [?] it was found that tag clouds are perceived as visual summaries of resources and that clicking in tag clouds requires less cognitive effort than entering search queries. This indicates that the size of a tag is indeed relevant for users in their query behavior. However, to the best of our knowledge, the correlation between tag usage in posts and in requests has not yet been investigated in a large-scale scenario other than for the company-internal system Dogear [?], for which a correlation between the frequencies of a tag in posts and in requests is reported. Regarding the overall behavior, Cattuto et al. [?] found that frequencies of entities in posts follow a heavy-tailed distribution, mostly with clean power law fits. Power law functions are known to exhibit scale-invariance and are mostly explained by the Yule process, which is also known as preferential attachment.
Results. Tags. Since tag clouds are one of the most popular applications of popularity, we begin the investigation with tags and their distributions of frequencies in the request logs and in the posts. (Footnote 9: Here, we ignore posts from two users who are known to only automatically create posts from publication catalogues to provide more content in the system.) More precisely, the request frequency distribution counts, for each frequency, how many tags have been requested exactly that many times, and the post frequency distribution counts, for each frequency, how many tags have been assigned to exactly that many posts (and thus constitutes the usual node degree distribution described in [?]).
Both distributions are shown in Figure (a). (Footnote 10: A close investigation of the notable peak in the post frequency distribution at frequency 8 reveals that this anomaly is due to the activities of one single user, who used 28 989 tags exactly 8 times. We therefore ignore the peak in the following discussion.)
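The two frequency distributions are "counts of counts": first one counts requests (or posts) per tag, then one counts how many tags share each count. A minimal sketch on hypothetical toy data follows; the variable names and the sample tags are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical toy data: one entry per tag request, and one tag list per post.
requested_tags = ["web", "web", "web", "python", "python", "folksonomy"]
posts = [["web", "python"], ["web"], ["tagging", "web"], ["python", "tagging"]]

# Per-tag counts: how often each tag was requested / in how many posts it occurs.
requests_per_tag = Counter(requested_tags)
posts_per_tag = Counter(tag for post in posts for tag in set(post))

# Frequency distributions: for each frequency, the number of tags with exactly that count.
request_frequency_distribution = Counter(requests_per_tag.values())
post_frequency_distribution = Counter(posts_per_tag.values())

print(sorted(request_frequency_distribution.items()))
print(sorted(post_frequency_distribution.items()))
```

An anomaly such as the peak at frequency 8 would show up here as an unusually large value for a single key of the post frequency distribution.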
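| Compared distributions (requests vs. posts) | Pearson's r | Spearman's ρ | JSD |
|---|---|---|---|
| Tags: frequency distributions | 0.968 | 0.620 | 0.051 |
| Tags: individual tags | 0.420 | 0.059 | 0.441 |
| Tags: active tags only | 0.415 | 0.515 | 0.272 |
| Users: frequency distributions | 0.942 | 0.210 | 0.194 |
| Users: individual users | 0.092 | 0.572 | 0.490 |
| Users: active users only | 0.081 | 0.718 | 0.471 |
| Publications: frequency distributions | 0.804 | 0.778 | 0.344 |
| Publications: individual publications | 0.553 | 0.030 | 0.707 |
| Publications: active publications only | 0.608 | 0.248 | 0.151 |

Table: Correlation and divergence of request and post distributions. Pearson's correlation coefficient r, Spearman's rank correlation coefficient ρ, and the Jensen-Shannon divergence (JSD) for pairs of distributions. In each row, a distribution of requests to entities (tags, users, or resources), or of their frequencies, is compared to the corresponding distribution of posts, or of their frequencies.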
The first observation is that the post frequency distribution dominates the request frequency distribution, meaning that in total there are more tag assignments than requests for tags. Since tag frequency distributions in posts are known to be heavy-tailed [?], mostly following a power law, it was to be expected that the distribution of tag frequencies in the request logs has similar properties. To confirm this, we first fitted the power law function ($p(x) \propto x^{-\alpha}$ where $x \geq x_{\min}$) to the empirical data using the methods of Clauset et al. [?]. Next, we compared the corresponding fit to the exponential function as a lower barrier for heavy-tailed distributions as well as to other heavy-tailed probability distributions, namely the lognormal function and the power law function with an exponential cutoff (which means that for large values the function deviates from the typical power law function). We visualize the empirical distributions, the best power law $x_{\min}$ values (vertical lines) and the corresponding fits in Figure (b) for both distributions. (Footnote 11: For better visibility we omitted the (weak) exponential fit.) For the fits of the power law function we obtained an exponent $\alpha$ and a lower bound $x_{\min}$ for each of the two distributions. The distributions are similar with regard to their slopes $\alpha$. Noteworthy is the higher $x_{\min}$ for the post frequency distribution (in contrast to the small value for the request frequency distribution), indicating that the power law fit only holds for a smaller portion of the distribution (the tail). Visual inspection suggests that there are slightly fewer tags with low frequencies than one would expect in a power law distribution. While an in-depth analysis of this phenomenon is beyond the scope of this work, we can speculate that it might be a consequence of the use of tag recommenders that typically suggest tags that are already frequently used, leading users to neglect low-frequency tags.
A comparison between the fits to the different distributions showed that the power law function is a statistically significantly better fit to the data than the exponential function. Both the lognormal function and the power law function with an exponential cutoff are also good fits to the data, confirming our assumption about heavy-tailed distributions, and they are even slightly better fits than the pure power law function, as one can see in Figure (b). This can be explained by the slight decay in the distributions, visible where the line of the empirical distribution falls below the straight line of the respective power law fit. Similar to the explanations by Mossa et al. [?], which are also discussed by Cha et al. [?], this may be due to information filtering, which might hinder preferential attachment. However, we need to keep in mind that only a slight decay is visible. Nevertheless, detailed investigations regarding this cutoff are necessary for a better understanding of this behavior. By and large, we can observe similar processes in how users post tags and how they request them, i.e., processes yielding heavy-tailed distributions.
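The comparison of a power law fit against an exponential fit can be sketched with plain maximum-likelihood estimates. The following is a simplified illustration under stated assumptions, not the procedure used in this study: it uses the continuous power law estimator for a fixed, given lower bound `x_min`, synthetic data, and a raw log-likelihood comparison, whereas the methods of Clauset et al. additionally select the lower bound and test the significance of the likelihood ratio. All function names are hypothetical.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law via inverse transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law(values, x_min):
    """Return the maximum-likelihood exponent and log-likelihood for the tail x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    alpha = 1.0 + len(tail) / log_sum
    log_likelihood = len(tail) * math.log((alpha - 1.0) / x_min) - alpha * log_sum
    return alpha, log_likelihood


def fit_exponential(values, x_min):
    """Return the maximum-likelihood rate and log-likelihood of an exponential tail shifted to x_min."""
    tail = [x for x in values if x >= x_min]
    excess = sum(x - x_min for x in tail)
    rate = len(tail) / excess
    log_likelihood = len(tail) * math.log(rate) - rate * excess
    return rate, log_likelihood


# Synthetic heavy-tailed sample (assumed exponent 2.5, lower bound 1.0).
rng = random.Random(42)
data = sample_power_law(20_000, alpha=2.5, x_min=1.0, rng=rng)

alpha_hat, ll_power_law = fit_power_law(data, x_min=1.0)
rate_hat, ll_exponential = fit_exponential(data, x_min=1.0)

print("estimated exponent:", round(alpha_hat, 3))
print("log-likelihood difference (power law minus exponential):",
      round(ll_power_law - ll_exponential, 1))
```

A positive log-likelihood difference favors the power law over the exponential on this sample; for real request or post frequencies, which are discrete, a discrete estimator and a proper significance test would be the more appropriate choice.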
Further, we directly compare the two frequency distributions with each other using Pearson's correlation coefficient and Spearman's rank correlation coefficient. (Footnote 12: Note that all correlation results in this section are statistically significant, which is why we do not report the p-value explicitly for each calculation.) From the first row of the table of correlations and divergences we can observe that Pearson's and Spearman's correlations are high (0.968 and 0.620). An explanation for the smaller Spearman value is the fluctuation in the distributions (see Figure (a)), where the number of tags no longer decreases monotonically with increasing frequency. Finally, a comparison of the distributions using the Jensen-Shannon divergence (0.051) confirms the similarity.
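For readers who want to reproduce this kind of comparison, the three measures can be sketched in a few lines. This is a minimal illustration with assumptions made explicit: Spearman's coefficient is computed as Pearson's coefficient on average ranks (ties receive the mean of their positions), and the Jensen-Shannon divergence uses base-2 logarithms so that it lies in [0, 1]; the paper does not state which logarithm base or tie handling was used. The input vectors are hypothetical toy counts.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's coefficient on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def jensen_shannon_divergence(p_counts, q_counts):
    """Base-2 Jensen-Shannon divergence of two aligned count vectors after normalization."""
    p_total, q_total = sum(p_counts), sum(q_counts)
    p = [c / p_total for c in p_counts]
    q = [c / q_total for c in q_counts]
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical aligned counts (same entity order in both vectors).
request_counts = [120, 40, 3, 0, 7]
post_counts = [300, 10, 25, 5, 7]

print("Pearson:", round(pearson(request_counts, post_counts), 3))
print("Spearman:", round(spearman(request_counts, post_counts), 3))
print("JSD:", round(jensen_shannon_divergence(request_counts, post_counts), 3))
```

Because Spearman's coefficient only depends on ranks, many tied counts can lower it even when Pearson's coefficient is high, which is one way such a gap between the two measures can arise.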
In the tag frequency distributions, we found similarity in the way users use and request tags. As a next step, we analyze tag popularity on the level of individual tags, to see whether there are similarities regarding which tags users assign and request. In particular, we look at two distributions over tags: the number of requests to each tag (e.g., how often the tag "web" has been requested) and the number of posts that each tag occurs in.
The corresponding scatterplot shows these two tag distributions, where each point in the diagram denotes one tag with its number of requests and its number of posts as coordinates. We can immediately see that despite the similarity in the behavior of tag frequencies, there are enormous differences on the level of individual tags. Only for very frequent tags (more than 100 requests) could one presume a correlation between both frequency counts. To quantify the effect, the second row of the table of correlations and divergences shows the correlation coefficients and the Jensen-Shannon divergence for the two distributions. In contrast to the frequency distributions, we observe rather low correlation and a much higher divergence. This means, contrary to the popularity assumption, that the number of posts a tag is assigned to and the number of times a tag is queried are only mildly correlated. The correlation we found is also lower than the one reported for the company system Dogear.
A closer look at the log data revealed that many tags which have been used in posts were never queried at all, and that several tags have been queried but were never assigned to any post. Therefore, we look at the same kind of distributions as before, but we specifically ignore tags that occur in only one of the two tag distributions. This restriction significantly reduces the number of considered tags. The correlations and divergence of the restricted distributions can be found in the third row of the table of correlations and divergences. We can observe that the limitation to such "active" tags yields higher Spearman correlation and less divergence, as the active tags' rankings exhibit far fewer ties than those of the full set of tags.
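The restriction to "active" tags amounts to keeping only those tags that occur in both the request counts and the post counts. A minimal sketch on hypothetical per-tag counts follows; the tag names and numbers are invented for illustration.

```python
# Hypothetical per-tag counts; a tag missing from a dict has count zero there.
requests_per_tag = {"web": 120, "python": 40, "misc": 3, "searchonly": 9}
posts_per_tag = {"web": 300, "python": 10, "misc": 25, "neverqueried": 5}

all_tags = sorted(set(requests_per_tag) | set(posts_per_tag))
active_tags = sorted(set(requests_per_tag) & set(posts_per_tag))

# Aligned count vectors over the full tag set (zeros where a tag is absent).
full_requests = [requests_per_tag.get(tag, 0) for tag in all_tags]
full_posts = [posts_per_tag.get(tag, 0) for tag in all_tags]

# Aligned count vectors over the active tags only.
active_requests = [requests_per_tag[tag] for tag in active_tags]
active_posts = [posts_per_tag[tag] for tag in active_tags]

print(len(all_tags), "tags in total,", len(active_tags), "active")
print(full_requests, full_posts)
print(active_requests, active_posts)
```

In the full vectors, every tag that was never requested (or never posted) contributes a zero, and these zeros form large groups of tied ranks; the active vectors contain no such zeros.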
Users and Publications. As with tags, we investigated similar distributions for both users and resources, i.e., counting the requests to specific users, counting a user's posts, counting the requests to a particular publication, and counting the posts containing a publication. Similarly, we have the corresponding frequency distributions and the distributions restricted to active entities, ignoring those that occur either only in posts or only in requests. Here again, we restrict resources to publications (and thus omit bookmarks), as visits of bookmarks are not recorded in the log files. The correlation results are listed in rows four through nine of the table of correlations and divergences, and for publications the frequency distributions are illustrated in Figure (c) (further figures have been omitted due to space limitations).
The distributions of user (publication) frequencies in requests and in posts are similar and yield relatively high correlation according to Pearson's coefficient (rows four and seven of the table of correlations and divergences). Their Jensen-Shannon divergences are higher than for tags, but the distributions are still relatively similar. Since the publication frequency distributions are for the most part monotonically decreasing (Figure (c)), their rank correlation is high (unlike for the frequencies of users). Notable in both cases (users and publications) is that the distributions of frequencies in posts and requests are no longer "parallel" as they were in the case of tags (compare Figures (a) and (c)).
Power law fits for the publication frequency distributions of both posts and requests are decent fits with relatively low lower bounds (the smallest frequency from which the power law holds). Not surprisingly, the fits of the power law function are statistically significantly better than those of the exponential function. However, it is extremely difficult to distinguish the fits of the lognormal function and the power law function with exponential cutoff from the power law fit, which is a strong indicator for the presence of heavy-tailed distributions. For user frequencies, our results also indicate a good power law fit for both requests and posts. Similar to our investigations on tag frequencies, we obtain a higher lower bound for the frequencies in posts than for those in requests. For one of the two user frequency distributions, all candidate functions are better fits than the exponential function; both the lognormal and the power law function with exponential cutoff are better fits to the data than the pure power law function, and the power law with cutoff is even better than the lognormal. For the other, the power law fit is better than the exponential function and is difficult to distinguish from the other candidate distributions.
Regarding individual entities, we again measure correlations between the respective distributions in the table of correlations and divergences (for users in rows five and six and for publications in rows eight and nine). For publications, we obtain similar results as previously for tags: Pearson's correlation is moderate, the divergence is even higher than for tags, there is almost no rank correlation, and removing "inactive" publications (occurring either only in posts or only in requests) yields higher rank correlation and lower divergence. The elimination of such publications leaves only about 12% of the original set of publications. By and large, we find only moderate correlation even among the actively posted and requested publications. A possible explanation for the correlation results is the large number of publications that are posted and requested only infrequently. Slight changes in the post or request counts (e.g., once vs. twice) change Pearson's correlation only slightly, but have a large influence on Spearman's correlation. For users we find different behavior: almost no correlation according to Pearson's coefficient, moderate rank correlation (higher than for tags and publications), and a Jensen-Shannon divergence of 0.490. This indicates that users with many posts indeed tend to be requested more, but not proportionally more.
Discussion. The obtained results do not clearly support the initial assumption. The overall behavior of tag frequencies (and to a smaller degree of user and resource frequencies) is similar in requests and posts, and the distributions are heavy-tailed as expected. In all examples we find a good power law fit. However, in some cases the distribution deviates from the straight power law function, which indicates the presence of other heavy-tailed distributions that might be based on distinct generating processes. This warrants further detailed investigation in the future.
On the level of individual entities, we observe weaker correlations, and only among the more actively used entities. It is surprising that, despite the fact that tag clouds are displayed in BibSonomy and users can click tags to find the corresponding resources, the choice of tags in requests is not more strongly correlated with their popularity in posts. Also, we noted a strong difference to the company-internal system Dogear, where much stronger correlations could be observed for tags. For operators of a tagging system, the results indicate that it is reasonable to exclude rarely requested tags completely from tag clouds, or to use request frequencies instead of or in addition to post frequencies in tag clouds. These could even be personalized to a user's query behavior.
In the previous section we have shown evidence indicating to what degree the four discussed assumptions do or do not hold in BibSonomy. While our findings in this paper are limited to BibSonomy, our approach is directly applicable to other tagging systems, and we briefly discuss some aspects of such a transfer here. As shown in the user study by Heckner et al. [?], different tagging systems exhibit different characteristics (in their case regarding the users' tagging motivation). We can thus assume that the four assumptions discussed in this paper will similarly hold to different degrees in other tagging systems, and we can speculate about possible influences.
We have already mentioned the influence of the degree of openness. In contrast to public, openly available systems, company-internal systems can impose certain requirements on their users, like the use of real names instead of pseudonyms or boundaries for the tags and resources in the system. For example, knowing whose resources one browses could strongly influence the social behavior of sharing and visiting. Indeed, we have found similarities but also pronounced differences between the usage behavior in BibSonomy and that in Dogear [?], both in our investigation of the social assumption and in that of the popularity assumption.
Another influence is surely the type of resources that are bookmarked. Heckner et al. [?] have shown that motivations for tagging (sharing or personal information management) were different in the systems Youtube (resources are videos) and Flickr (images) compared to Delicious (web links) and Connotea (publication references). A major difference between those two pairs of systems is that resources like links and publication references are taken from other available sources, whereas images and videos are often published in the respective system for the first time. We thus expect that with regard to the social assumption we would find similar results on Delicious and Connotea, because BibSonomy allows users to tag web links (like Delicious) and references to publications (like Connotea). On the other hand, we can speculate that systems like Youtube and Flickr would show different results. The age of the system is another aspect. All three previous log file analyses [?, ?, ?] report results from periods of eight, ten, and twelve months respectively, shortly after the systems' creation in 2005. In contrast to that, our log dataset covers a period of six years. Finally, the navigation concept and the graphical user interface can play a role. BibSonomy offers the typical folksonomy navigation by always presenting users, resources, and tags as linked entities. However, different tagging systems may make different design choices, e.g., regarding the visibility and accessibility of individual entities.
To investigate these questions further, one would have to conduct experiments on logs of other systems as well. However, the bottleneck is the availability of such datasets. Our study is therefore a first step towards analyzing user behavior using log files. We encourage other researchers and webmasters of tagging systems to conduct similar studies on their tagging systems, using the methods presented here, and to compare their results to ours.
In this work we have tested and challenged a number of prominent assumptions about social tagging systems using a web server log dataset from the system BibSonomy containing posting and requesting data. We have thus supplemented previous work – that has tapped into surveys and post data to tackle these issues – by also reflecting actual user behavior leveraging request data. Our findings paint a rather mixed picture about the four assumptions studied in this paper: While we find evidence both for and against the social assumption, we also find that the retrieval assumption might not hold for systems such as BibSonomy. Our results suggest that the equality assumption is wrong for BibSonomy, and the popularity assumption only holds partially.
Overall, our work contributes (i) a stepping stone for further studies on social tagging systems that require request log data and (ii) a basis for comparative studies, e.g., exploring the extent to which these different assumptions hold in different tagging systems. It is reasonable to assume that different tagging systems (such as Flickr, Delicious, BibSonomy and others) exhibit unique characteristics and dynamics that make them amenable to different uses and purposes. Further studies of request log data in other tagging systems would be helpful in uncovering these differences. In addition, we provide (iii) new insights about the relative importance of users, tags and resources in social tagging systems. Finding that the equality assumption does not hold generally has important implications for the layout of tagging systems and for the design and implementation of algorithms that address search and retrieval. For example, the FolkRank [?] algorithm might profit from the inclusion of weights reflecting popularity or transition probability in requests.
We hope our work triggers a new line of research on social tagging systems that utilizes traces of actual user behavior to test and challenge our existing body of knowledge about these systems gained from other methods of inquiry, such as surveys or post data.
This work is in part funded by the FWF Austrian Science Fund Grant I677 as well as by the DFG through the PoSTS project.
-  D. Abrams, R. Baecker, and M. Chignell. Information archiving with bookmarks: personal web space construction and organization. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’98, pages 41–48, New York, NY, USA, 1998. ACM Press/Addison-Wesley Publishing Co.
-  M. Agosti, F. Crivellari, and G. Di Nunzio. Web log analysis: a review of a decade of studies about information acquisition, inspection and interpretation of user interaction. Data Mining and Knowledge Discovery, 24(3):663–696, 2012.
-  D. Benz, A. Hotho, R. Jäschke, B. Krause, F. Mitzlaff, C. Schmitz, and G. Stumme. The social bookmark and publication management system BibSonomy. The VLDB Journal, 19(6):849–875, Dec. 2010.
-  T. Bogers. Recommender Systems for Social Bookmarking. PhD thesis, Tilburg University, Tilburg, The Netherlands, Dec. 2009.
-  C. H. Brooks and N. Montanez. Improved annotation of the blogosphere via autotagging and hierarchical clustering. In Proceedings of the 15th international conference on World Wide Web, WWW ’06, pages 625–632, New York, NY, USA, 2006. ACM.
-  M. J. Carman, M. Baillie, R. Gwadera, and F. Crestani. A statistical comparison of tag and query logs. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’09, pages 123–130, New York, NY, USA, 2009. ACM.
-  C. Cattuto, C. Schmitz, A. Baldassarri, V. D. P. Servedio, V. Loreto, A. Hotho, M. Grahl, and G. Stumme. Network properties of folksonomies. AI Communications Journal, Special Issue on “Network Analysis in Natural Sciences and Engineering”, 20(4):245–262, 2007.
-  M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. Moon. Analyzing the video popularity characteristics of large-scale user generated content systems. IEEE/ACM Transactions on Networking (TON), 17(5):1357–1370, 2009.
-  A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM Rev., 51(4):661–703, Nov. 2009.
-  L. E. Damianos, D. Cuomo, J. Griffith, D. M. Hirst, and J. Smallwood. Exploring the adoption, utility, and social influences of social bookmarking in a corporate environment. In Proceedings of the 40th Annual Hawaii International Conference on System Sciences, HICSS ’07, pages 86–95, Washington, DC, USA, 2007. IEEE Computer Society.
-  S. Doerfel, D. Zoller, P. Singer, T. Niebler, M. Strohmaier, and A. Hotho. How social is social tagging? In Proceedings of the 23rd International World Wide Web Conference, WWW 2014, New York, NY, USA, 2014. ACM.
-  S. A. Golder and B. A. Huberman. Usage patterns of collaborative tagging systems. Journal of information science, 32(2):198–208, April 2006.
-  M. Heckner, M. Heilemann, and C. Wolff. Personal information management vs. resource sharing: Towards a model of information behaviour in social tagging systems. In Proceedings of the 3rd International Conference on Weblogs and Social Media, ICWSM ’09, San Jose, CA, USA, May 2009.
-  A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information retrieval in folksonomies: Search and ranking. In Proceedings of the 3rd European Semantic Web Conference, volume 4011 of LNCS, pages 411–426, Budva, Montenegro, June 2006. Springer.
-  R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag recommendations in social bookmarking systems. AI Communications, 21(4):231–247, 2008.
-  B. Krause, C. Schmitz, A. Hotho, and G. Stumme. The anti-social tagger - detecting spam in social bookmarking systems. In Proceedings of the 4th Int. Workshop on Adversarial Information Retrieval on the Web, 2008.
-  C. Marlow, M. Naaman, D. Boyd, and M. Davis. HT06, tagging paper, taxonomy, Flickr, academic article, to read. In Proceedings of the 17th conference on Hypertext and hypermedia, HYPERTEXT ’06, pages 31–40, New York, NY, USA, 2006. ACM.
-  A. Mathes. Folksonomies: Cooperative classification and communication through shared metadata. http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html, June 2004. Accessed: 2013-07-11.
-  D. Millen, M. Yang, S. Whittaker, and J. Feinberg. Social bookmarking and exploratory search. In L. Bannon, I. Wagner, C. Gutwin, R. Harper, and K. Schmidt, editors, ECSCW 2007, pages 21–40. Springer London, 2007.
-  D. R. Millen and J. Feinberg. Using social tagging to improve social navigation. In Workshop on the Social Navigation and Community based Adaptation Technologies, 2006.
-  S. Mossa, M. Barthélémy, H. E. Stanley, and L. A. N. Amaral. Truncation of power law behavior in “scale-free” network models due to information filtering. Physical Review Letters, 88(13):138701, 2002.
-  D. Nicholas, P. Huntington, and A. Watkinson. Scholarly journal usage: the results of deep log analysis. Journal of Documentation, 61(2):248–280, 2005.
-  J. Porter. Learning More about Structured Blogging. http://bokardo.com/archives/learning-more-about-structured-blogging/, 2005. Accessed: 2013-08-12.
-  S. Rendle, L. Balby Marinho, A. Nanopoulos, and L. Schmidt-Thieme. Learning optimal ranking with tensor factorization for tag recommendation. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 727–736. ACM, 2009.
-  J. Sinclair and M. Cardew-Hall. The folksonomy tag cloud: when is it useful? Journal of Information Science, 34(1):15–29, 2008.
-  M. Strohmaier, C. Körner, and R. Kern. Why do users tag? Detecting users’ motivation for tagging in social tagging systems. In Proceedings of 4th International Conference on Weblogs and Social Media, ICWSM ’10, 2010.
-  D. Terdiman. Folksonomies tap people power. http://www.wired.com/science/discoveries/news/2005/02/66456?currentPage=all, Jan. 2005. Accessed: 2013-08-12.
-  J. Trant. Studying social tagging and folksonomy: A review and framework. 2009.
-  T. Vander Wal. Tagging for fun and finding. http://okcancel.com/archives/article/2005/07/tagging-for-fun-and-finding.html, July 2005. Accessed: 2013-08-12.
-  D. Weinberger. Tagging and why it matters. SSRN eLibrary, 2005.
-  A. Zubiaga, C. Körner, and M. Strohmaier. Tags vs shelves: from social tagging to social classification. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia, HYPERTEXT ’11, pages 93–102, New York, NY, USA, 2011. ACM.