PhD Dissertation



International Doctorate School in Information and

Communication Technologies

DISI - University of Trento


An effective end-user development approach through domain-specific mashups for Research Impact Evaluation


Muhammad Imran

Advisor:
Prof. Maurizio Marchese
Università degli Studi di Trento

Co-Advisor:
Prof. Fabio Casati
Università degli Studi di Trento


March

Abstract

Over the last decade, there has been growing interest in the assessment of the performance of researchers, research groups, universities and even countries. The assessment of productivity is an instrument to select and promote personnel, assign research grants and measure the results of research projects. One particular assessment approach is bibliometrics, i.e., the quantitative analysis of scientific publications through citation and content analysis. However, there is little consensus today on how research evaluation should be performed, and it is commonly acknowledged that the quantitative metrics available today are largely unsatisfactory. The process is very often highly subjective, and there are no universally accepted criteria.

A number of scientific data sources are available on the Web (e.g., DBLP, Microsoft Academic Search, Google Scholar) and are used for such analysis purposes. Taking data from these diverse sources, performing the analysis and visualizing the results in different ways is not a trivial or straightforward task. Moreover, the data taken from these sources cannot be used as is because of the name disambiguation problem: many researchers share identical names, and an author’s name may appear in several variations in the data. We believe that the personalization of the evaluation processes is a key element for the appropriate use and practical success of these research impact evaluation tasks. Moreover, people involved in such evaluation processes are not always IT experts and hence are not able to crawl data sources, merge them and compute the needed evaluation procedures.

The recent emergence of mashup tools has refueled research on end-user development, i.e., on enabling end-users without programming skills to produce their own applications. Yet, similar to what happened with analogous promises in web service composition and business process management, research has mostly focused on technology and, as a consequence, has failed its objective. Plain technology (e.g., SOAP/WSDL web services) or simple modeling languages (e.g., Yahoo! Pipes) do not convey enough meaning to non-programmers. We believe that the heart of the problem is that it is impractical to design tools that are generic enough to cover a wide range of application domains, powerful enough to enable the specification of non-trivial logic, and simple enough to be actually accessible to non-programmers. At some point, we need to give up something. In our view, this something is generality since reducing expressive power would mean supporting only the development of toy applications, which is useless, while simplicity is our major aim.

This thesis presents a novel approach to effective end-user development, specifically for non-programmers. That is, we introduce a domain-specific approach to mashups that “speaks the language of users”, i.e., that is aware of the terminology, concepts, rules, and conventions (the domain) the user is comfortable with. We show what developing a domain-specific mashup platform means, which role the mashup meta-model and the domain model play and how these can be merged into a domain-specific mashup meta-model. We illustrate the approach by implementing a generic mashup platform, whose capabilities are based on our proposed mashup meta-model. Moreover, the thesis proposes an architectural design for mashup platforms; specifically, it presents a novel approach for data-intensive mashup-based web applications, which proved to be a substantial contribution. The proposed approach is suitable for applications that deal with large amounts of data traveling between client and server.

Keywords: End-user development, Domain-specific mashups, Research evaluation

Acknowledgements

This thesis would not have been possible without the support of many people, whom I want to acknowledge in this section. First of all, I thank God for giving me the amazing opportunity of coming to Trento to pursue my PhD degree.

I would like to express my sincere gratitude to my supervisors Prof. Maurizio Marchese and Prof. Fabio Casati for their valuable guidance, support and constructive comments throughout the journey toward my PhD.

I would also like to express my sincere gratitude to Dr. Florian Daniel for his immeasurable and attentive guidance, valuable insights and technical advice throughout my PhD. Thank you Florian, this thesis would not have been possible without your support. I thank my fellows (Soudip Roy Chowdhury, Stefano Soi) and friends (Zeeshan Munir, Musawar Saeed, Talha Rehman), who with their kindness have contributed to this work directly or indirectly.

This PhD is also the result of much love, encouragement and prayers from my parents and family. Especially my dearest dad, who has been a great source of support and encouragement for me throughout my life. He is truly a great father and a kind person. Dad & mom, I owe you everything I have. Finally, I want to thank my partner in life, my dear wife. Her constant support and love got me through this process. Thank you all!!

Muhammad Imran

 

Contents
List of Figures

Chapter 1 Introduction

The concepts of scientometrics (i.e., the science of measuring and analyzing science) and informetrics (i.e., the study of the quantitative aspects of information in any form) [1][2] are increasingly popular. More specifically, among the fields that informetrics encompasses, bibliometrics, which deals with the quantitative analysis of disseminated information in all its forms, has received considerable interest over the last few years. The quantitative analysis of scientific and technological information in bibliometrics typically uses citation and content analysis techniques. The ultimate goal of such an analysis is to determine the impact of a research work, which in turn contributes to the productivity and impact of the researchers who conducted it. Bibliometrics has changed research assessment practices, and bibliometric methods are now widely used to evaluate individual researchers, research groups, departments, universities and more.

However, evaluating the quality of someone’s research output is a notoriously challenging problem which, so far, has no well-accepted solution. Research is a competitive field. Throughout their careers, researchers are evaluated on the basis of their research work, especially their disseminated work, which can take different forms. To name a few examples, traditional quantitative indicators include the number of journal publications or of top-tier conference publications, while citation-based methods include the journal impact factor, the h-index and the g-index. (A more detailed presentation and discussion of such indicators is given in Chapter 2.) Often, the choice of an evaluation criterion depends on the purpose of the evaluation.

Over the last few years, research impact evaluation has received substantial attention, as the volume of contributions to science is increasing heavily and competition is becoming tougher among researchers and, to a large extent, among research groups, departments, universities and research institutions. As the research landscape evolves, assessing the impact of researchers and their disseminated research outputs is in high demand for a variety of reasons, such as the self-assessment of researchers, evaluation of faculties or universities, faculty recruitment and promotion, funding, awards [3] as well as to support the search for attractive content within an ocean of scientific knowledge. An evaluation task, which determines the impact and the productivity of researchers, requires the selection of one or more information sources, appropriate evaluation indicators, and an uncontroversial evaluation procedure. To this end, a vast collection of evaluation indicators, information sources and procedures is becoming available, which makes the evaluation exercise more subjective. In the next section, we present the diversity along all of the above-mentioned dimensions.

1.1 Research Evaluation: A Multi-dimensional Field

Research productivity evaluation is a broad endeavor. Among its goals, the fundamental one is to assess the return on investment in scientific research in the form of quality output. As scientific research is heavily funded by funding bodies, governments and institutions around the world, establishing a consensus about the success or failure of a research project requires evaluation procedures based on rich indicators that can monitor both the productivity of public money and the quality/impact of research, in order to establish policies for future investments.

Mostly, the evaluators (i.e., university management, funding organizations, etc.) produce a new evaluation procedure or alter an existing one or its sub-elements (e.g., h-index, g-index, etc.). The alteration takes the form of a customization of an indicator that tailors it to the demands at hand. Moreover, when it comes to the selection of a data source, one may want to use a private data source, or to consider blog posts, keynotes and the like as performance indicators besides the traditional dissemination activities. Mainly, we observed that an evaluation procedure comprises three basic but diverse elements. These are as follows:

  • The selection of one or more appropriate information sources. These are the sources which fulfill data requirements (e.g., digital libraries, scholarly search engines). Recently, the presence of a large number of such information sources has provided an opportunity to choose one source over the others.

  • Second, the selection of a set of indicators. These are the smallest units in an evaluation procedure, which hold the logic to determine one particular impact factor. For instance, h-index is a citation based metric.

  • Finally, the formation of an overall procedure, which comprises both the information sources and the metrics that collectively determine the research impact of researchers. A procedure may also include a customized version of a metric or a private data source.

In the following sub-sections, we elaborate each of these aspects in more detail.

1.1.1 Diverse Information Sources

An important dimension in the research impact evaluation domain lies in the exponential growth of freely available scientific/scholarly digital content. Bibliographic information sources (aka digital libraries) maintain and provide bibliographic information. The information sources (e.g., Web of Science (WoS), Scopus, DBLP, Google Scholar, etc.; each of these is described in detail in Chapter 2) as well as information production sources (e.g., authors, journals, books, articles, etc.) are growing day by day. Moreover, universities and research institutes also maintain local repositories, which are then used by researchers to keep record of their dissemination activities.

Information integration, i.e., collecting data from different sources and applying merging techniques, is an important aspect of research impact evaluation. For example, data can be combined in several ways: (1) taking the information about an author’s papers from one source and the citation information from another, (2) comparing two authors using data coming from different sources, or (3) using one’s own private data source in comparison with other sources.
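As an illustration of case (1), the following minimal Python sketch joins a publication list from one source with citation counts from another. It is not the implementation used in this thesis: the record layout (dictionaries with `title` and `citations` keys) and the function names `normalize_title` and `merge_sources` are assumptions made for the example, and matching on normalized titles is only one simple, error-prone way of aligning records.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_sources(publications, citation_records):
    """Attach citation counts from one source to publications from another.

    Both inputs are lists of dicts (a hypothetical layout). Publications
    without a matching citation record get citations=None.
    """
    citations_by_title = {
        normalize_title(record["title"]): record["citations"]
        for record in citation_records
    }
    merged = []
    for pub in publications:
        key = normalize_title(pub["title"])
        merged.append({**pub, "citations": citations_by_title.get(key)})
    return merged


# Example: one list of papers, one list of citation counts.
papers = [
    {"title": "Domain-Specific Mashups", "year": 2012},
    {"title": "An Unmatched Paper", "year": 2011},
]
cites = [{"title": "domain specific mashups.", "citations": 7}]
merged = merge_sources(papers, cites)
```

In practice, title matching alone is not sufficient; the name disambiguation and data cleaning issues discussed in this chapter still apply to the merged result.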

Today, the presence of so many digital data sources overcomes the problem of data availability. On the one hand, the abundance of data and data sources is a constructive development; on the other hand, it becomes more challenging to select one data source over the others. For instance, it is commonly accepted that DBLP is a good choice for the computer science field in terms of completeness. It provides the list of published articles of a researcher, but it does not provide citation data, which forces evaluators to include other citation sources.

1.1.2 Diverse Evaluation Indicators

In parallel with the growth of scholarly information sources and scholarly literature, people have established richer assessment indicators and metrics than before. These metrics not only incorporate traditional quantitative factors such as publication count or citation count, but also consider various other aspects such as the researcher’s academic age, the researcher’s position, and normalization. Well-established and well-known bibliographic research quality indicators include, to name a few, the h-index, g-index, citation count, and ar-Index.
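To make two of these indicators concrete, the sketch below computes the h-index and the g-index from a list of per-paper citation counts, following their commonly used definitions: the h-index is the largest h such that h papers have at least h citations each, and the g-index is the largest g such that the g most cited papers together have at least g² citations. The function names are chosen for this example, and the g-index is capped here at the number of papers, which is one common convention.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have >= g*g citations.

    This variant never exceeds the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


citation_counts = [10, 8, 5, 4, 3]
print(h_index(citation_counts), g_index(citation_counts))
```

For the citation counts above, four papers have at least four citations each, while all five papers together have 30 citations, which is at least 5² = 25.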

Over the years, these indicators have been tremendously successful, even though different communities prefer to use customized versions of them. These customizations often vary from community to community and are often based on a community’s trends, normalizations and many other factors. The point here is that, after so many efforts from different communities, it is still not guaranteed that a single metric can reflect the in-house demands of an evaluation committee. We also believe that, with so many rapid developments in evaluation indicators, it would be extremely helpful to provide a way of performing research impact evaluation that offers evaluators flexibility and customization support as well as freedom of expression.

1.1.3 Diverse Evaluation Procedures

As the research landscape evolves, universities and research institutions have started developing their own research assessment procedures to meet specific local requirements. Today, the availability of a variety of information sources and assessment indicators on the one hand gives evaluators more freedom to choose among several options, but on the other hand makes overall evaluation procedures more subjective. These evaluation procedures often differ from traditional ones. For example, factors such as the customization of the definitions of traditional metrics (e.g., from the h-index to the contemporary h-index), the inclusion of public as well as local private data sources, and strict data filtering checks collectively make an evaluation procedure tailored yet complex. Indeed, software developers cannot anticipate these customizations and are therefore not able to provide a widely accepted solution.

We have gathered a number of such evaluation procedures, which we describe in more detail in Chapter 4. These specific, customized evaluation procedures demand expertise and skills in various ICT-related technical areas that assessors often lack. For example, a typical set of tasks required by these procedures includes fetching a list of publications from a source, applying a cleaning process (i.e., excluding publications that do not belong to the queried researcher), sending the filtered list to a metric computation and, in the end, visualizing the results. In the following sections, we describe in detail the problems and challenges in this area and state our objectives.

1.2 Problems, Challenges and Objectives

Although researchers must be evaluated on the basis of their research work, there is little consensus today on how an evaluation procedure should be designed and performed, and it is commonly acknowledged that the quantitative metrics available today are largely unsatisfactory. Indeed, today people judge research contributions mainly through publication in venues of interest and through citation-based metrics (such as the h-index), which attempt to measure research impact. However, there are different opinions on how citation statistics should be used, and they have well-known flaws. For instance, [4] pointed out shortcomings, biases, and limitations of citation analysis. In another work [5], the authors criticize the use of the journal impact factor for evaluating research.

Furthermore, current metrics are limited to papers as the unit of disseminated scientific knowledge, while today there are many other artifacts that do contribute to science, such as blogs, datasets, experiments, or even reviews, but that are not considered in research evaluation. Besides the flaws of current metrics, the fact remains that people have - and we believe will always have - different opinions on which criteria are more effective than others, also depending on the task at hand (that is, the reason why they are conducting the evaluation). For example, in our department, the evaluation criteria for researchers are defined in a detailed document of 10 pages full of formulas and are mostly based on publications in venues that are considered important in the particular community and are normalized following particular agreed criteria. Other institutions, instead, use citation counts normalized by the community to which the authors belong and then grouped by research programs to evaluate each research group, not individuals. Examples are numerous and, much like in the soccer world cup, everybody has an opinion on how it should be done.

Not only may individuals choose different metrics, but also different sources (e.g., Google Scholar vs. Scopus), different normalization criteria (e.g., normalizing the value of metrics with respect to averages in a given community), different ways to measure individual contributions (e.g., dividing metrics by the number of authors), or different ways to compare (e.g., compare a candidate with the group that wants to hire them to determine the autonomy and diversity of the candidate from the group), with different aggregation functions (e.g., aggregated h-index of a scientist’s co-authors, aggregated citation count, etc.).
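Two of these customizations can be sketched in a few lines of Python. The example below is only illustrative: the record layout (`citations`, `num_authors`) and the function names are assumptions, and dividing by the community mean is just one possible normalization criterion among those an evaluator might choose.

```python
def fractional_citations(papers):
    """Citation count where each paper's citations are divided by its number of authors."""
    return sum(p["citations"] / p["num_authors"] for p in papers)


def normalize_by_community(value, community_values):
    """Express a metric value relative to the average value in a community."""
    if not community_values:
        raise ValueError("community_values must not be empty")
    mean = sum(community_values) / len(community_values)
    if mean == 0:
        raise ValueError("community average is zero; normalization undefined")
    return value / mean


papers = [
    {"citations": 10, "num_authors": 2},
    {"citations": 9, "num_authors": 3},
]
individual_share = fractional_citations(papers)        # 10/2 + 9/3
relative_score = normalize_by_community(12, [4, 8, 12])  # 12 / mean(4, 8, 12)
```

A value above 1 after normalization indicates a score above the community average, which makes researchers from communities with different citation habits more comparable.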

We believe that this kind of personalization of the assessment processes (as well as many other personalizations of the evaluation process, such as the need to normalize a traditional metric for a specific community) is a key element for the appropriate use and practical success of the various evaluation tasks. Moreover, people involved in such evaluation processes are most of the time not IT experts and are not capable of building appropriate software for crawling data sources, automatically parsing relevant information, merging data and computing the required personalized metrics. Therefore, in order to empower the interested end-users, we need to design an appropriate and possibly easy-to-use IT platform that makes life easier for domain experts who are not experts in IT. Indeed, supporting custom metrics for research evaluation is a non-trivial issue and requires addressing interesting research questions like:

  • What is the set of key features that may enable users to express their own evaluation metrics, i.e., what is the expressive power needed to do so? For instance, assessing the independence of a set of young researchers requires fetching all publications by the researchers, cleaning out papers that have been co-authored by the researchers’ PhD supervisor, computing their h-index metrics, and ranking them according to their h-index.

  • How to enable less technical end-users to perform both easy and more complex data integration tasks? We have seen that being able to access an evaluation body (e.g., a set of papers) that is as complete as possible is at least as important as expressing custom metrics over the evaluation body. For example, fetching all publications of the young researchers may imply fetching data from Google Scholar, DBLP, and Scopus as well as fusing the obtained data and cleaning it.

  • Which is the best paradigm or formalism that may allow users to model/express their custom evaluation metrics? A metric may, for example, be expressed in text form via a dedicated domain-specific language, or modeled visually by means of suitable graphical modeling constructs, composed with the help of a guided wizard, and so on.

  • What type of software support does the computation of custom evaluation metrics need? Depending on the logic needed, the actual computation of a metric may be achieved via generated code, a dedicated evaluation engine, a query engine, or similar.
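To give a feeling for the logic a non-programmer would have to express, the independence example from the first question above can be written as a short Python sketch. It assumes that the publications have already been fetched and merged into an in-memory dictionary; the data layout and the names `rank_by_independent_h_index` and `supervisor_of` are hypothetical and serve only the illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def rank_by_independent_h_index(publications_by_researcher, supervisor_of):
    """Rank researchers by the h-index of papers not co-authored with their supervisor."""
    scores = {}
    for researcher, pubs in publications_by_researcher.items():
        supervisor = supervisor_of.get(researcher)
        # Cleaning step: drop papers that list the PhD supervisor as an author.
        independent = [p for p in pubs if supervisor not in p["authors"]]
        scores[researcher] = h_index([p["citations"] for p in independent])
    # Ranking step: highest independent h-index first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


publications = {
    "alice": [
        {"authors": ["alice", "prof_a"], "citations": 50},
        {"authors": ["alice"], "citations": 3},
        {"authors": ["alice", "carol"], "citations": 2},
    ],
    "bob": [{"authors": ["bob"], "citations": 1}],
}
supervisors = {"alice": "prof_a", "bob": "prof_b"}
ranking = rank_by_independent_h_index(publications, supervisors)
```

Even this small procedure combines fetching, filtering, metric computation and ranking, which is exactly the kind of composition that the questions above ask how to make accessible without programming.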

One of the most important issues that needs special consideration while addressing the problem is the kind of target end-users. We consider non-IT experts (i.e., non-programmers) as our end-users, who will benefit from our research work. In the following, we introduce our proposed solution to the aforementioned problems. Throughout the different stages of our work, we refer to and give examples from the selected domain to aid understanding whenever needed. However, this does not mean that the proposed solution is only valid for the selected domain. Instead, we aim to keep the aspects that are purely based on the chosen domain separate from those of a generic type. In essence, we first propose a generic approach and a methodology, and then show how to adapt it to a particular domain, given a set of domain-specific aspects.

1.3 Solution Overview and Contributions

1.3.1 Overview

After about two decades of research in workflow management and more or less one decade of web service composition, two research streams whose initial ambitious goal was to enable non-technical users to design processes or compose services with little or no help from developers, we are still in a situation in which these forms of process modeling and execution technologies can only be mastered by specifically trained developers. One of the best examples of this situation is probably the recent standardization of Version 2.0 of the Business Process Model and Notation (BPMN) [6], which brings together the two worlds of BPM and service composition, but which has also become much more like a programming language and less like a modeling instrument targeted at non-programmers (as the size of the documentation also indicates). As a result, people who are not fully familiar with the modeling notation are reluctant to use it since they know that they will not be able to draw a correct and consistent process model.

While this is a concrete issue in business process modeling and service composition, it is even more so in a relatively new, yet highly-related area: web mashups. The recent emergence of mashup tools has refueled research on end-user development, i.e., on enabling end-users without programming skills to compose their own applications.

Mashups are typically simple web applications (most of the time consisting of just one single page) that, rather than being coded from scratch, are developed by integrating and reusing available data, functionalities, or pieces of user interfaces accessible over the Web. For instance, housingmaps.com integrates housing offers from Craigslist with a Google map, adding value to the two individual applications. Mashup tools, i.e., online development and runtime environments for mashups, ambitiously aim at enabling non-programmers (regular web users) to develop their own applications, sometimes even situational applications developed ad hoc for a specific immediate need [7].

However, we think that doing so is even harder than enabling non-programmers to model their own process or service composition, because developing full applications is simply complex. While the component-based reuse approach certainly lowers part of the complexity, developing one’s own application also means dealing with data integration, application logic, and content presentation issues, all aspects the common web user is not even aware of. Yet, similar to what happened in web service composition, the mashup platforms developed so far tend to expose too much functionality and too many technicalities, so that they are powerful and flexible but suitable only for programmers. Alternatively, they only allow compositions that are so simple as to be of little use for most practical applications.

For example, mashup tools typically come with SOAP services, RSS feeds, UI widgets, and the like. Non-programmers do not understand what they can do with these kinds of compositional elements [8; 9]. We experienced this with mashup tools in our own group, mashArt [10] and MarcoFlow [11], which we believe to be simpler and more usable than many composition tools, but that still failed in being suitable for non-programmers [12].

Yet, being amenable to non-programmers is increasingly important as the opportunity given by the wider and wider range of available online applications and the increased flexibility that is required in both businesses and personal life management raise the need for situational (one-use or short-lifespan) applications that cannot be developed or maintained with the traditional requirement elicitation and software development processes.

We believe that the heart of the problem is that it is impractical to design tools that are generic enough to cover a wide range of application domains, powerful enough to enable the specification of non-trivial logic, and simple enough to be actually accessible to non-programmers. At some point, we need to give up something. In our view, this something is generality, since reducing expressive power would mean supporting only the development of toy applications, which is useless, while simplicity is our major aim. Giving up generality in practice means narrowing the focus of a design tool to a well-defined domain and tailoring the tool’s development paradigm, models, language, and components to the specific needs of that domain only.

1.3.2 Contributions

This chapter presents an introduction to the reference domain and the problems and challenges faced by its users. A more detailed discussion and the domain-specific requirements are presented in Chapters 2 and 4. Moreover, the requirements related to the end-users (i.e., non-programmers) are presented in Chapter 3. In the following, we summarize the contributions of this thesis.

  1. First of all, we present the novel idea of domain-specific mashups and describe what they are composed of, how they can be developed, how they can be extended for the specificity of any particular application context, and how they can be used by non-programmers to develop complex mashup logics within the boundaries of one domain.

  2. We detail and exemplify all design artifacts that are necessary to implement a domain-specific mashup tool, in order to provide expert developers with tools they can reuse in their own developments.

  3. We show what developing a domain-specific mashup tool means, which roles the mashup meta-model, the domain concept model and the domain syntax model play, and how these can be merged into a domain-specific mashup meta-model.

  4. We describe a methodology for the development of domain-specific mashup tools, defining the necessary concepts and design artifacts. As we will see, one of the most challenging aspects is to determine what is a domain, how it can be described, and how it can both constrain a mashup tool (to the specific purpose of achieving simplicity of use) and ease development. The methodology targets expert developers, who implement mashup tools.

  5. We apply the methodology in the context of a mashup platform that supports the development of domain-specific mashup tools. To achieve this, we present a baseline platform, which is then used to develop and tailor a mashup tool to support a domain most scientists are acquainted with, i.e., research evaluation. This mashup platform targets domain experts (i.e., non-programmers).

  6. We also present an efficient approach for mashup-based web applications that exchange large amounts of data between client and server. The proposed approach prevents heavy data communication by using a suitable communication pattern (chosen among the four proposed patterns) and a server-side cache.

  7. To evaluate our work, we performed a twofold validation. First, we performed a usability and comparative evaluation to understand end-users’ preference between a generic and a domain-specific mashup tool and to learn the right balance a mashup tool should offer in terms of complexity, flexibility, and expressiveness. Second, we performed user studies in order to assess advanced usability aspects of the developed platform and the viability of the respective development methodology.

While we focus on mashups, the techniques and lessons learned in the thesis are general in nature and can easily be applied to other domains and to other composition or modeling environments, such as web service composition or business process modeling.

1.4 Structure of the thesis

Literature reviews and the aforementioned contributions of this thesis are presented in different chapters as described below:

  • Chapter 2 presents the state of the art related to the domain of research evaluation. We present different evaluation indicators, data sources and techniques, which are being used for different evaluation purposes by different communities. We also present the related tools that are currently available for performing research evaluation.

  • Chapter 3 presents the state of the art related to end-user development. We present different approaches on which end-user development is based. Various programming paradigms, especially those for end-users, are reported. Moreover, we present mashup approaches and see how this paradigm can be used for effective end-user development.

  • Chapter 4 describes a few real-life research evaluation procedures, which we have collected from different sources, to derive a set of concrete requirements. In the end, we present our analysis in terms of major design principles that facilitate end-users in their development tasks.

  • Chapter 5 states a set of methodological steps. We present the definitions of important concepts, various design artifacts, formalisms, and a detailed methodology for the development of domain-specific mashup tools. We show what role a domain-model, meta-model and a domain-specific meta-model play in the development of a domain-specific mashup tool.

  • Chapter 6 shows an implementation of a generic mashup tool, its design principles and architecture, and shows how and where domain knowledge can be injected to tailor it into a domain-specific mashup tool.

  • Chapter 7 presents ResEval Mash, a mashup tool that is tailored to the domain of research evaluation. We present how different domain-related artifacts are used in the development, following the methodological steps presented in Chapter 5.

  • Chapter 8 reports on a few user studies that we conducted to evaluate our approach, methodology and domain-specific mashup tool.

  • Chapter 9 concludes the thesis. We present future work and lessons learned, both specific to the selected domain and related to the development of mashup tools in general.

Chapter 2 Research Impact Evaluation: State of the Art

2.1 Overview

This chapter presents comprehensive insights into the research impact evaluation field. Exploring fundamental questions, such as what research impact evaluation is, why it is needed, how it is performed and who performs it, provides a consolidated base from which we can understand the various aspects of the field. In response to the how, we also present different evaluation indicators that have been developed over the years and are being used by different communities. Although these communities have adopted and tailored these indicators to meet their community-specific trends and requirements, understanding those specific details leads us to a solid understanding of the field. This chapter also reports on the impact evaluation tools that have been developed and used over the years, and we explain why these tools fail to support the current practices in the research evaluation field. In response to the who, we present the end-users who perform such evaluation tasks and their level of expertise with respect to this domain and to the technology.

2.2 Multiple Faces of Research Impact Evaluation

Impact evaluation, in terms of a project, program or policy, assesses the changes that can be attributed to a particular intervention. In essence, impact evaluation is a comparison between what happened and what would have happened without the intervention. In theory, the concept of impact evaluation differs slightly from “outcome monitoring”, which checks whether targets have been achieved. The field of research impact evaluation, in turn, deals with the growing concerns related to assessing the productivity of research work, sometimes in terms of both research inputs and outputs. Research assessment can take various forms, ranging from traditional ones (i.e., the peer review process, usually performed before dissemination as an early evaluation) to more sophisticated assessment methods (i.e., citation-based and content-based indicators, mainly applied after dissemination). Likewise, the evaluation can be an ongoing process that monitors the progress of work, or it can take place at certain stages (e.g., midterm evaluation, final-stage evaluation).

From the point of view of an early or pre-dissemination evaluation approach (i.e., peer review), the assessment is carried out by recognized experts in a particular field. In practice, peer review is usually performed by experts with general expertise in a specific field. This is a largely accepted practice; however, the scrutiny process is sometimes considered controversial since, according to some, the evaluation committee should be composed of specialists of the field rather than of members with general competence. On the other hand, the post-dissemination evaluation process, which is the main focus of our discussion, is much more controversial than pre-dissemination evaluation. Over the years, many approaches have been proposed that, to some extent, fulfill a general set of evaluation requirements. However, despite these efforts, different communities have developed new evaluation methods or tailored existing ones to their specific needs. In the last few years, it has been observed that research increasingly crosses boundaries: researchers are becoming more collaborative than ever, and research groups are formed of experts from different affiliations and different continents. In such a conducive environment for research to grow, the amount of disseminated research is rapidly increasing. In parallel to this increase, the assessment of research outputs has become a crucial issue for a wide range of stakeholders (e.g., funding bodies, universities, research institutions etc.). Research impact evaluation requires a number of aspects to be considered first. Among many others, the fundamental ones are:

  • For whom is the evaluation procedure taking place? A clear vision of the body to be evaluated (e.g., individuals, groups, universities etc.) is a core element before performing further steps.

  • What types of research artifacts are to be considered in the evaluation? After the selection of whom, the next step is to agree upon which research outputs of the selected unit will be considered in the evaluation.

  • What evaluation methods to adopt? This aspect addresses the most controversial part of the evaluation process, i.e., the evaluation approach, the method and the nature of the process.

The first and most fundamental aspect, which must be considered before investigating further details, is for whom the evaluation procedure will be performed, that is, the selection of the unit to be evaluated (i.e., whose research work is to be evaluated). The units of assessment include individuals, research groups, departments, universities, research fields and even countries. The complexity of an evaluation procedure grows with the size of the selected unit: determining the productivity of an individual researcher is far easier than determining the productivity of a university, where hundreds of researchers normally work. The second noteworthy aspect is the selection of the types of research outputs to be evaluated. Different disciplines prefer different types of research output; these usually include, to name a few, journals, conference and workshop proceedings, book chapters, books, prototypes etc. Among the other important aspects, the selection of an appropriate assessment indicator is highly important and, in some cases, highly controversial. Opinions on the right indicator for an assessable unit often differ, as everyone has his own view of which criteria or indicators should be used.

Based on these diversities, in 2010 a multi-dimensional research assessment matrix was published by the Expert Group on the Assessment of University-Based Research (AUBR) [13], operating under the European Commission. The matrix presents five basic units of assessment, although one can think of others. The matrix also presents a few purposes (i.e., why a particular research work is conducted) for each unit to be assessed. Moreover, it shows a very basic set of bibliometric indicators, as well as a few other emerging indicators, that can be applied to the various assessable units. In essence, the matrix gives a glimpse of the diversity of the field, which is clearly not restricted to these aspects: one can think of many other trivial as well as non-trivial ones.

In the field of research impact evaluation, the selected assessment indicators play the central role in an assessment procedure. Over the years, many different indicators have been proposed, both quantitative and qualitative. A report published by Scopus (http://www.researchtrends.com/wp-content/uploads/2011/06/Research_Trends_Issue23.pdf) in 2011 focused only on bibliometric indicators. According to the report, bibliometric indicators are divided into three generations, which we show in Table 2.1. The first generation corresponds to a basic set of indicators (e.g., publication counts, citation counts etc.), which are easily available and can be obtained from various sources. The second generation, which is more advanced than the first, includes indicators that are normalized with respect to a specific field in order to remove biases. The third generation comprises the most non-trivial indicators, including influence weights, SCImago Journal Rank and other more sophisticated indicators.

| Type (generation) | Description | Typical examples |
|---|---|---|
| First | Basic indicators; relatively easy to obtain from sources that have been available for decades | Number of publications; number of citations; journal impact metrics |
| Second | Relative or normalized indicators, correcting for particular biases (e.g., differences in citation practices between subject fields) | Relative or field-normalized citation rates |
| Third | Based on advanced network analysis using parameters such as network centrality | Influence weights; SCImago Journal Rank; “prestige” indicators |

Table 2.1: Generations of bibliometric indicators

Devising an evaluation procedure in practice requires decisions about which unit needs to be assessed, for what purposes, on which output dimensions, and using which assessment indicators (i.e., bibliometric or other emerging indicators). Clearly, there is no single answer to these questions: the answer depends, on the one hand, on the purpose of the evaluation and the selected unit to be assessed and, on the other hand, on the selection of appropriate indicators. In our opinion, the field of research impact evaluation is highly diverse, and the choice of one indicator over the others is highly subjective. Citation-based approaches alone can raise significant challenges, but properly used they can also provide a clear indication of someone’s performance. For example, according to [14], quantification of past performance through citation analysis can be used to predict future performance. Moreover, a similar study covering several related aspects of citation analysis is presented in [15], where the author gives a detailed analysis of the accuracy, theory and effective use of citation analysis, together with its strengths and weaknesses.

2.2.1 Quantitative and Qualitative Research Evaluation

By and large, impact evaluation approaches can be divided into two basic methods: 1) quantitative and 2) qualitative. The two methods can be distinguished by the type of evaluation conducted on the data produced by a research work. In general, quantitative methods deal with numbers, for instance the number of publications, the number of citations, and other indicators that rely on such counts in one way or another, like the h-index, the g-index etc. Qualitative methods, in contrast, are based on descriptive properties of the data, for example evaluation practices that involve aspects like reputation, peer ranking analysis through participatory studies, interviews, and other socially enhanced indicators. Quantitative approaches are typically used and are commonly considered the standard method, whereas qualitative approaches are less common. In this work we mostly focus on bibliometric methods, which are quantitative rather than qualitative in nature.

2.2.2 Bibliometrics, Scientometrics and Informetrics

The terms Bibliometrics, Scientometrics and Informetrics are often used interchangeably; they refer to methods that study various aspects of science and information (i.e., information present in any form). To some extent, there has been confusion about these closely related terms. Over time, people have defined them for the field they belong to, but the definitions still show considerable overlap.

In 1969, Pritchard introduced the term Bibliometrics in his paper [16] as “the application of mathematical and statistical methods to books and other media of communication”. He placed more stress on quantitative aspects, such as counts of articles, publications, citations and books and, in general, any statistically significant measure of recorded information. The term Scientometrics was introduced to denote a science for analyzing and measuring science through relationships and social structure, and also for checking the status of an individual within a group [17].

A field that encompasses both bibliometrics and scientometrics is Informetrics. In [18], the author defined it as the study of the quantitative aspects of information in any form, including its production, dissemination and use. In the following section, we mainly focus on bibliometrics-based approaches and indicators.

2.3 Research Evaluation Through Bibliometrics Approaches

Over the last few years, bibliometric indicators have come to be considered a standard and popular way to assess research impact. All significant indicators rely heavily on publication and citation statistics and on other, more sophisticated bibliometric techniques. In particular, the concept of citation [19; 20] became a widely used measure of the impact of scientific publications, although problems with citation analysis as a reliable method of assessment and evaluation have been acknowledged throughout the literature [4]. Indeed, a research work does not always receive citations because of its merits; it may also be cited for other reasons, such as flaws, drawbacks or mistakes. A number of other indicators have been proposed to balance the shortcomings of citation counts and to “tune” them so that they reflect the real impact of a research work more reliably. With the growth of scholarly literature, different communities have introduced new indicators for assessment. Although these indicators are largely based on citation analysis, they have gained popularity over simple indicators such as a plain publication or citation count.

Among the many well-known indicators, the h-index, proposed by Jorge Hirsch [21], is considered a more comprehensive indicator for assessing the scientific productivity and impact of an individual researcher. The h-index is among the most successful indicators of recent years because it is straightforward to compute from the citations of a researcher’s publications, and it takes into account both the quantity and the impact of the researcher’s contributions. That is why some of the most significant journals [22] have taken an interest in it. Hirsch’s original definition of the h-index is:

  • A scientist has index $h$ if $h$ of his or her $N_p$ papers have at least $h$ citations each and the other $(N_p - h)$ papers have at most $h$ citations each.
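The definition above translates directly into a short computation. The following Python sketch is only an illustration, not code from the cited work; the function name `h_index` and the sample citation counts are assumptions made for the example.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # citations only decrease from here on
    return h


# Hypothetical citation record of one researcher (one entry per paper).
example_record = [10, 8, 5, 4, 3]
print(h_index(example_record))
```

For the hypothetical record `[10, 8, 5, 4, 3]`, four papers have at least four citations each while the fifth has only three, so the sketch yields an h-index of 4.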

The h-index has been widely acknowledged because of its good properties. For example, the authors of [23] considered the index an objective indicator and, on this basis, stated that it can play a significant role when allocating funds, making decisions about personnel or awarding prizes. Another advantage of the h-index is highlighted in [24], where the author reported that the h-index is insensitive to low-cited papers, which makes it more viable than other indices: as the majority of confusions and errors tend to occur in the lower part of someone’s citation record, neglecting that part reduces possible errors.

However, some flaws and drawbacks of the h-index have been identified over time, and different authors have tried to address them by introducing new indicators or variations of the index. Hirsch himself mentioned in his paper [21] that differences in the productivity of different fields lead to differences in h-index values. Hence, comparing two researchers from different disciplines based on their h-index values is not appropriate. Another disadvantage of the h-index, claimed in [25], is that it is not suitable for comparing researchers at different stages of their careers, since the h-index depends on a scientist’s entire career and publications and citations increase over time.

To overcome the shortcomings of the h-index, a number of variations have recently been proposed. One proposal is presented in [26], whose authors consider the definition of the h-index to be quite arbitrary. From their point of view, Hirsch could equally have defined the h-index as: “a scientist has h-index $h$ if $h$ of his $n$ papers have at least $2h$ citations each and the other $n - h$ papers have at most $2h$ citations each”. On this basis they generalized the h-index to the $h_\alpha$-index, which is formally defined as:

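  • “A scientist has $h_\alpha$-index $h_\alpha$ if $h_\alpha$ of his $n$ papers have at least $\alpha \cdot h_\alpha$ citations each and the other $n - h_\alpha$ papers have fewer than $\alpha \cdot h_\alpha$ citations each.” Here $\alpha \in (0, \infty)$.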

In [27], the author proposed the A-index, which uses the average number of citations of the papers in the Hirsch core [28]. Formally, the A-index is defined as:

$$A = \frac{1}{h} \sum_{j=1}^{h} cit_j$$

In the above definition, $h$ is the h-index value and $cit_j$ is the total number of citations received by the $j$-th most cited paper. A further benefit of the A-index is that its value increases when the most cited papers receive more citations, whereas the h-index does not increase in that case. An indicator that is meant to reflect the quality of a researcher should arguably also consider the performance of the top-cited papers. To this end, the g-index was proposed by Egghe [29]. According to Egghe, the g-index is formally defined as follows:

  • A set of papers has a g-index $g$ if $g$ is the highest rank such that the top $g$ papers have, together, at least $g^2$ citations. This also means that the top $g + 1$ papers have fewer than $(g + 1)^2$ citations.
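As an illustration of this definition, the following Python sketch compares the cumulative citation count of the top-ranked papers with the square of the rank. The function name and the sample data are assumptions made for the example, and the sketch assumes that g cannot exceed the number of papers in the record.

```python
from itertools import accumulate


def g_index(citations):
    """Return the highest rank g whose top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    # accumulate() yields the running total of citations of the top-ranked papers.
    for rank, total in enumerate(accumulate(ranked), start=1):
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation record (one entry per paper).
example_record = [10, 8, 5, 4, 3]
print(g_index(example_record))
```

For this record the running totals are 10, 18, 23, 27 and 30, which stay at or above 1, 4, 9, 16 and 25 respectively, so the g-index is 5 while the h-index of the same record is 4.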

Egghe’s concern with the h-index was that, once the h-index has been computed, further citations to the highly cited papers are irrelevant, as new citations do not affect its value. As a consequence, highly cited researchers may have an h-index similar or equal to that of moderately cited researchers. However, the g-index also suffers from problems. For instance, a researcher who receives a very high number of citations for one paper but only average citations for the others may obtain a higher g-index than scientists with a higher average number of citations per paper, as reported in [30].

To overcome the limitations of both the h-index and the g-index, a new index was proposed in [30] with the aim of combining the good properties of both indices while minimizing their disadvantages. This index is known as the hg-index and is defined as $hg = \sqrt{h \cdot g}$, i.e., the geometric mean of the h-index and the g-index. It is easy to see that $h \leq hg \leq g$ and that $hg - h \leq g - hg$. The index is very simple to compute once the h-index and g-index values have been obtained. It also has more granularity, which makes it easier to compare researchers with similar h-index or g-index values.
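The A-index (average citations of the papers in the Hirsch core) and the hg-index (geometric mean of the h-index and the g-index) described above can be sketched in a few lines of Python. This is an illustrative sketch only; the function names and the sample record are assumptions, and the helper functions simply restate the h-index and g-index definitions.

```python
import math
from itertools import accumulate


def h_index(citations):
    ranked = sorted(citations, reverse=True)
    # Ranks satisfying cites >= rank form a prefix, so counting them gives h.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def g_index(citations):
    ranked = sorted(citations, reverse=True)
    totals = accumulate(ranked)
    # Ranks whose cumulative citations reach rank^2 also form a prefix.
    return sum(1 for rank, total in enumerate(totals, start=1) if total >= rank * rank)


def a_index(citations):
    """Average number of citations of the h most cited papers (the Hirsch core)."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    return sum(ranked[:h]) / h if h else 0.0


def hg_index(citations):
    """Geometric mean of the h-index and the g-index."""
    return math.sqrt(h_index(citations) * g_index(citations))


example_record = [10, 8, 5, 4, 3]   # hypothetical citation counts
print(a_index(example_record), hg_index(example_record))
```

For the hypothetical record, the Hirsch core consists of the four papers with 10, 8, 5 and 4 citations, so the A-index is 27 / 4 = 6.75, and with h = 4 and g = 5 the hg-index is the square root of 20, which lies between h and g.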

In [31], the authors proposed a new index known as the AR-index. This index takes into account not only the citations of a researcher but also the age of the publications. The performance of a researcher can increase or decrease over time, an aspect that was ignored by earlier indices. The AR-index is designed to reflect these changes and can therefore increase or decrease with time. It is formally defined as follows:

$$AR = \sqrt{\sum_{j=1}^{h} \frac{cit_j}{a_j}}$$

Here $h$ is the h-index value, $cit_j$ is the total number of citations of the $j$-th most cited paper, and $a_j$ is the number of years since the publication of the $j$-th paper. In another work [32], the authors proposed the idea of weighting citations. This variation of the h-index is known as the $h_w$-index and is defined as follows:

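$$h_w = \sqrt{\sum_{j=1}^{r_0} cit_j}$$

Here $cit_j$ is the number of citations of the $j$-th most cited paper, and $r_0$ is the largest row index $i$ such that $r_w(i) \leq cit_i$, where $r_w(i) = \frac{1}{h} \sum_{j=1}^{i} cit_j$.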
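The two indices just described can be illustrated with a small Python sketch. It rests on stated assumptions rather than on code from the cited works: the AR-index is taken to be the square root of the sum, over the Hirsch core, of each paper’s citations divided by its age in years, and the weighted index is taken to be the square root of the citations of the top r0 papers, where r0 is the last rank at which the running citation total divided by h does not exceed that paper’s citations. Function names and data are hypothetical.

```python
import math


def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def ar_index(papers):
    """papers: iterable of (citations, age_in_years) pairs, with age >= 1."""
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    h = h_index([cites for cites, _ in ranked])
    # Only the h most cited papers (the Hirsch core) contribute.
    return math.sqrt(sum(cites / age for cites, age in ranked[:h]))


def hw_index(citations):
    """Citation-weighted h-index under the assumptions stated in the lead-in."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    if h == 0:
        return 0.0
    running = 0          # citations of the top i papers
    weighted_total = 0   # citations of the top r0 papers
    for cites in ranked:
        running += cites
        if running / h <= cites:   # r_w(i) <= cit_i
            weighted_total = running
        else:
            break
    return math.sqrt(weighted_total)


# Hypothetical record: (citations, years since publication).
papers = [(10, 5), (8, 2), (5, 5), (4, 1), (3, 3)]
print(ar_index(papers), hw_index([cites for cites, _ in papers]))
```

In this hypothetical record the Hirsch core contributes 10/5 + 8/2 + 5/5 + 4/1 = 11, so the AR-index is the square root of 11; as the papers age without new citations, the ratios shrink and the AR-index decreases, which is the behaviour the text describes.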

In [33], the author presented the $h(2)$-index. This work proposed to give more weight to the most cited papers, an idea originally introduced with the g-index. Based on this idea, the $h(2)$-index is defined as: “A scientist’s $h(2)$-index is defined as the highest natural number such that his $h(2)$ most cited papers received each at least $[h(2)]^2$ citations”. This index is easier to compute because it only considers highly cited papers. It can therefore be used with data where some uncertainty exists, especially regarding low-cited papers. The index also suffers from a problem identified in [34], where the author emphasized that only a small set of papers is needed to compute the $h(2)$-index, which makes it unsuitable for comparing researchers with different numbers of publications and different citation rates. Thus, they proposed the normalized h-index, which is defined as $h^n = h / N_p$, where $h$ is the h-index and $N_p$ is the total number of publications of a researcher. This index is also considered more suitable for younger researchers, as they may be less productive at the beginning of their careers.
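The two indices in this paragraph can be sketched as follows. The sketch reads the quoted definition as requiring the k most cited papers to have at least k squared citations each, and takes the normalized h-index to be the h-index divided by the total number of publications; both readings, as well as the function names and sample data, are assumptions made for the illustration.

```python
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def h2_index(citations):
    """Largest k such that the k most cited papers have at least k*k citations each."""
    ranked = sorted(citations, reverse=True)
    # The qualifying ranks form a prefix, so counting them gives the index.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank ** 2)


def normalized_h_index(citations):
    """h-index divided by the total number of publications."""
    total_papers = len(citations)
    return h_index(citations) / total_papers if total_papers else 0.0


example_record = [10, 8, 5, 4, 3]   # hypothetical citation counts
print(h2_index(example_record), normalized_h_index(example_record))
```

For the hypothetical record, the second paper has 8 citations (at least 4) but the third has only 5 (fewer than 9), so the first index is 2; the normalized h-index is 4 / 5 = 0.8.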

In [35], the authors proposed an interesting index called the tapered h-index, which incorporates all citations of all papers of a researcher. One of the shortcomings of the h-index is that it ignores low-cited papers as well as new citations to highly cited papers. The tapered h-index, in contrast, considers the complete citation record of a researcher, regardless of whether a paper has few or many citations. It represents the citations of the papers in a Ferrers graph, where the columns represent the partition of the citations among the papers. The largest filled square in the Ferrers graph is called the Durfee square. In a similar approach [36], the authors presented the rational h-index ($h_{rat}$-index), which is defined as $h_{rat} = (h + 1) - \frac{n_c}{2h + 1}$, where $h$ is the h-index and $n_c$ is the number of citations needed to reach an h-index of $h + 1$. Intuitively, $h \leq h_{rat} < h + 1$.
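A minimal Python sketch of the rational h-index follows. It assumes the index is computed as (h + 1) minus the number of additional citations needed to reach an h-index of h + 1, divided by 2h + 1 (the largest number of citations that could be needed); this reading, the function names and the sample data are assumptions for illustration only.

```python
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def rational_h_index(citations):
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    target = h + 1
    # The target papers are the top h + 1; pad with hypothetical uncited
    # papers if the record has fewer than h + 1 entries.
    top = ranked[:target]
    top += [0] * (target - len(top))
    # Citations each of those papers still lacks to reach `target` citations.
    missing = sum(max(0, target - cites) for cites in top)
    return target - missing / (2 * h + 1)


example_record = [10, 8, 5, 4, 3]   # hypothetical citation counts
print(rational_h_index(example_record))
```

For the hypothetical record (h = 4), reaching h = 5 requires one more citation for the fourth paper and two more for the fifth, so the sketch computes 5 - 3/9, a value between 4 and 5 that shows how close the researcher is to the next h-index step.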

Some other factors may implicitly influence the interpretation of results obtained with a citation-based metric, so that the evaluation process may produce misleading results. One of these factors is the self-citation count. The controversial phenomenon of self-citation is generally believed to create problems for those who would attest to the reliability of citation analysis for evaluative purposes [37; 38]. Including self-citations in the calculation of citation statistics inflates the research impact of a given artifact; removing self-citations from the citation count therefore yields a more realistic quantification of research impact.
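Removing self-citations can be sketched as a simple filter. The data model below is hypothetical (a paper is described by its author names, and each citing paper by its own author names), and the sketch assumes that a citation counts as a self-citation when the citing and the cited paper share at least one author. It also assumes author names are already disambiguated.

```python
def citations_without_self(paper_authors, citing_author_lists):
    """Count citations whose citing paper shares no author with the cited paper."""
    cited = set(paper_authors)
    return sum(
        1 for citing_authors in citing_author_lists
        if cited.isdisjoint(citing_authors)
    )


# Hypothetical example: one paper and the author lists of three citing papers.
paper = ["A. Rossi", "B. Chen"]
citing = [
    ["A. Rossi", "C. Diaz"],   # self-citation (shares A. Rossi)
    ["D. Evans"],              # independent citation
    ["B. Chen"],               # self-citation (shares B. Chen)
]
print(citations_without_self(paper, citing))
```

In this hypothetical example the raw citation count is 3, but only one citation comes from authors unrelated to the paper, so the filtered count is 1.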

Michèle Lamont’s book [39] offers a complete analysis of how evaluation is performed by professors. In the book, she analyzes the intricate details of peer review and of 12 panels of experts in the humanities and social sciences, extrapolating the subjective criteria for decision-making in each discipline and giving an interesting overview of the features that can influence the reputation of researchers. The Altmetrics Initiative [40] goes one step further and aims at using social interactions to propose new indicators of research impact that are more closely related to the reputation of researchers.

We have presented a number of different metrics that have been proposed and used. The literature on research impact evaluation clearly shows that there are many different criteria, proposals and views on how to conduct an evaluation, and that opinions differ on which criteria are more effective than others (depending on the reason why the evaluation is conducted). We provide a more detailed critical analysis of these metrics in Section 2.6. In the next sections, we present a comprehensive review of the different information sources (i.e., bibliographic databases) and of the various tools developed to provide evaluation services.

2.4 Bibliographic Databases

Bibliographic databases, also known as digital libraries, maintain and provide bibliographic records of journals, conference proceedings, technical reports, books, patents etc. A bibliographic database can be multidisciplinary in terms of coverage (i.e., covering various disciplines like computer science, physics etc.) or discipline-specific (i.e., covering one discipline). Of the many bibliographic databases, some are proprietary and available under license, while others are freely available on the Internet. The freely available ones offer their services either as a scholarly search engine or as a digital library (i.e., a system that stores content in digital formats, accessible via computers through an API). In the following subsections, we present a few of these bibliographic databases and the services they provide. We also report on diversity, completeness and coverage issues related to these databases.

2.4.1 Web of Science

A decade ago, researchers had very few bibliographic data sources available. Among those, the Web of Science (http://scientific.thomson.com/products/wos/), an online academic citation index provided by Thomson Reuters, was very popular. Web of Science provides access to over 12,000 journals worldwide, along with 150,000 conference proceedings (as recorded on Jan 10, 2013). It covers nearly 256 disciplines, including science, social science, arts, humanities etc. Along with the bibliographic data, Web of Science also provides a small number of indicators that can be used for research impact evaluation. The commonly used indicators provided by WOS include: p-index (number of articles of an author), cc-index (number of citations excluding self-citations), cpp (average number of citations per article) and productivity (quantity of papers per time unit). To some extent, these indicators can be used to determine the impact of communities, journals and academic institutes using various aggregations. Another academic citation indexing and search service, known as Web of Knowledge, is also provided by Thomson Reuters. This wrapper service covers disciplines such as the sciences, social sciences, arts and humanities, and also includes a number of journals from the Web of Science. It provides tools to analyze bibliographic content over several databases.
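The basic indicators listed above are simple aggregations over a publication record. The following Python sketch shows how such values could be derived from a list of papers; it is not the Web of Science implementation. The record layout (publication year plus a citation count that already excludes self-citations), the use of calendar years as the time unit, and the use of the self-citation-free count for cpp are all assumptions.

```python
def basic_indicators(papers):
    """papers: list of (publication_year, citations_excluding_self) pairs."""
    p_index = len(papers)
    if p_index == 0:
        return {"p": 0, "cc": 0, "cpp": 0.0, "productivity": 0.0}
    cc_index = sum(cites for _, cites in papers)
    years = [year for year, _ in papers]
    # Number of calendar years between the first and the last publication.
    active_years = max(years) - min(years) + 1
    return {
        "p": p_index,                            # number of articles
        "cc": cc_index,                          # citations excluding self-citations
        "cpp": cc_index / p_index,               # average citations per article
        "productivity": p_index / active_years,  # papers per year
    }


# Hypothetical publication record of one author.
record = [(2008, 12), (2009, 3), (2009, 0), (2011, 5)]
print(basic_indicators(record))
```

For the hypothetical record, the sketch reports 4 articles, 20 citations, 5.0 citations per article and a productivity of 1.0 paper per year over the four-year span 2008 to 2011.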

Despite all the benefits that the Web of Science and Web of Knowledge provide, they still have limitations that become critical in some assessment tasks. Chief among these drawbacks is their limited coverage: as mentioned above, these services target only a few high-impact peer-reviewed journals, which represent only a fraction of the research work that is published. In various disciplines, internationally recognized high-impact journals are not the only way to disseminate research work, so these disciplines cannot take full advantage of the Thomson Reuters services. Moreover, the Web of Science does not provide free access to its data and tools, which can also be considered a drawback for this kind of bibliographic database.

2.4.2 SciVerse Scopus

Recently, many competitors of the Web of Science have emerged that also provide bibliographic data. One of these is Scopus (http://www.info.sciverse.com/scopus), which maintains bibliographic records including citations, abstracts and journal articles. As of today (Scopus database status published on their website on Jan 17, 2013), Scopus claims to have a bibliographic database containing more than 20,500 peer-reviewed titles from more than 5,000 international publishers. Scopus only indexes journals, book series and conference proceedings that have an ISSN assigned to them. It does not index articles whose author is not the person behind the presented material, such as obituaries or book reviews. Scopus provides various tools that work on its own database and provide value-added services. For instance, the citation tracker is a tool that can be used to find highly cited authors in a field or hot topics in a subject area.

Like the Web of Science, Scopus is a paid source of bibliographic information. Elsevier, which operates Scopus, also operates a free service called Scirus, a science-specific search engine that only works for the computer science field. One can search bibliographic records using this service; however, it does not provide any kind of free public API for taking advantage of the data it maintains.

2.4.3 Microsoft Academic Search

In contrast to both Web of Science and Scopus, Microsoft Academic Search (http://academic.research.microsoft.com/) is a free academic search engine. It is developed by Microsoft Research and has come into being in recent years. This multidisciplinary search engine covers more than 48 million publications and more than 20 million authors from various domains. The service is free and provides an easy-to-use interface for querying scholarly literature. Moreover, Microsoft Academic Search provides a few basic indicators (e.g., h-index, g-index etc.) for assessment, as well as a visual explorer in which one can visualize a researcher’s co-author graph or citation graph.

Another appealing and much-demanded feature provided by Microsoft Academic Search is the disambiguation of researchers’ names. This feature works to some extent, but we observed that it does not resolve many cases completely. To disambiguate a researcher, the service shows a list of authors who share the same name, along with their affiliations, and the user can select one of them based on the affiliation. However, the problem persists in more complex cases. In the beginning, the data service suffered from limited coverage. Until 2010 it only covered the computer science field, but quite recently the coverage has been extended to other disciplines like biology, chemistry, mathematics etc., which makes the service more useful.

2.4.4 Google Scholar

Like Microsoft Academic Search, Google offers a bibliographic search service, Google Scholar (http://scholar.google.com/), which was launched in 2004. Google Scholar provides a very simple interface for searching bibliographic content over a large set of disciplines and from many sources. It maintains its database by crawling data from a large number of sources. The types of bibliographic data that Google Scholar indexes include peer-reviewed online journals, conference proceedings, books, non-peer-reviewed journals, preprints, technical reports, theses etc. Moreover, Google Scholar maintains the citation records of scholarly literature.

Google Scholar does not guarantee that an indexed article is freely available, although requests made from universities and institutes that subscribe to the relevant services can access such articles freely. Google Scholar claims to be, and is apparently considered, a trusted bibliographic source in terms of its coverage. Moreover, Google Scholar seems to be the most up-to-date scholarly data provider, though it is not publicly known when and which journals it crawls. However, the data quality seems compromised in some cases. Google Scholar does not provide support for the name disambiguation problem, for example in the case where two or more authors share the same name [41].

2.4.5 DBLP

DBLP is a largely computer-science-specific bibliographic database hosted in Germany by the Universität Trier. As of November 2012, DBLP maintains 2.1 million bibliographic records. DBLP provides a browser-based user interface for searching the data, and it also allows the entire dataset to be downloaded in XML format. Moreover, DBLP offers an API that developers can use to query specific records. The service is free; a disadvantage, as of today, is that it only covers the computer science field. Moreover, DBLP does not maintain citation references. Despite these flaws, the DBLP service is considered a clean and reliable source of bibliographic data.
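To give an idea of what working with a DBLP-style XML dataset involves, the following Python sketch groups publication records by author. The inline sample is a hand-written, hypothetical fragment that only mimics the general record layout (record elements with `author`, `title` and `year` children); the names, keys and titles are invented. The real dataset is far larger and declares character entities in a DTD, neither of which this sketch handles.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, hand-written sample in a DBLP-like layout.
SAMPLE = """<dblp>
  <article key="journals/example/Doe12" mdate="2012-11-01">
    <author>Jane Doe</author>
    <author>John Roe</author>
    <title>An Example Journal Paper.</title>
    <year>2012</year>
    <journal>Example Journal</journal>
  </article>
  <inproceedings key="conf/example/Doe11" mdate="2011-06-15">
    <author>Jane Doe</author>
    <title>An Example Conference Paper.</title>
    <year>2011</year>
    <booktitle>EXAMPLE</booktitle>
  </inproceedings>
</dblp>"""


def publications_by_author(xml_text):
    """Map each author name to a chronologically sorted list of (year, title)."""
    index = defaultdict(list)
    root = ET.fromstring(xml_text)
    for record in root:  # one element per publication record
        title = record.findtext("title", default="")
        year = int(record.findtext("year", default="0"))
        for author in record.findall("author"):
            index[author.text].append((year, title))
    return {name: sorted(pubs) for name, pubs in index.items()}


print(publications_by_author(SAMPLE)["Jane Doe"])
```

Because author names are used as plain dictionary keys, two different researchers with the same name would be merged into one entry, and since no citation data is available in such records, a second source would be needed for citation-based metrics.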

The above-mentioned bibliographic services are just the tip of the iceberg. Over the years, a number of other bibliographic data sources have emerged. Among these, CiteSeerX (http://citeseerx.ist.psu.edu/), arXiv (http://arxiv.org/), the Association for Computing Machinery (ACM) (http://www.acm.org/), GoPubMed (http://www.gopubmed.org/), Science.gov (http://science.gov/) and SpringerLink (http://www.springer.com/) are the popular ones.

The proliferation of data sources makes it evident that scholarly data and data providers are numerous. However, the main problem for non-expert users is the lack of the technical expertise required to use these sources, e.g., to crawl them or call their APIs. Simple scenarios, for instance getting the list of publications of a researcher, are manageable and can be performed manually. However, tasks such as getting all the publications and citations of all the researchers of a university pose serious challenges and cannot be performed manually, as they require a huge human effort. Thus, easy-to-use, flexible and as far as possible automated software support is required to perform such complex tasks. Recently, a number of such tools have emerged. In the next section, we report on these tools, which provide research evaluation services based on the different data sources mentioned in this section.

2.5 Research Impact Evaluation Tools

2.5.1 Publish or Perish

Based on the existing bibliographic data sources, new tools are becoming available to support people in their research evaluation analysis. One such tool, named Publish or Perish, was developed by [42]. The tool is freely available for download on the Internet. It is a desktop application that crawls Google Scholar pages for a given query and then analyses the data to compute citation-based metrics. It provides a number of well-known metrics, like the h-index, the g-index, Zhang’s e-index and a few more. A user can filter the publications of interest from the list of publications that the tool crawls. To some extent, this approach is useful for someone who intends to analyze his own data, because it is easy for him to determine which publications belong to him. However, the approach does not work well when users want to look up other researchers, as it is hard to know someone else’s complete publication record. Other weaknesses of this tool include: (1) its reliance on only one information source, i.e., Google Scholar; (2) the need for manual cleaning of the obtained data (for example for author disambiguation and self-citations, among others); (3) the lack of an Application Programming Interface (API) through which other applications or web services could use its services; (4) the lack of a way to call third-party APIs, a feature that would be useful. Moreover, a user cannot customize an evaluation procedure or provide a new user-defined one.

      2.5.2 Scholarometer

      A different approach is provided by Scholarometer [43], a social tool used for citation analysis and for evaluating the impact of an author’s research work. It is a free browser add-on for Firefox and Chrome that provides a smart interface for fetching data from Google Scholar. However, the service requires users to tag their queries with one or more discipline names from a predefined list of disciplines. This generates annotations that go into a centralized database, which collects statistics about the various disciplines, such as the average number of citations per paper, the average number of papers per author, etc. The impact measures are then dynamically recalculated based on the user’s manipulations. Scholarometer has a server where information about the performed queries and their results is stored. However, it does not offer an API to retrieve or use this information. The tool also depends only on Google Scholar data, and no other data providers can be added to or linked with it. Moreover, the functionality to add new evaluation indicators or to customize existing ones is not provided, so it is not suitable for users who want to implement a very specific evaluation procedure. The use of predefined disciplines restricts the tool to the fields chosen by the tool provider; no provision is made for introducing new disciplines.

      2.5.3 ResEval

      Over time, information sources and evaluation tools are becoming available, but they still have many shortcomings. For example, they differ in data coverage and data quality, as is the case for Scholarometer. Moreover, these tools are data-source specific and cannot be extended to use other data sources. Finally, personalization of metrics, an important feature for the diverse field of research evaluation, is still missing.

      With the aim of overcoming the above-mentioned deficiencies of the existing solutions, we introduced our own tool for research evaluation as part of the LiquidPub project [44]. Building on the lessons learned from the existing experiences, in our tool ResEval [45] we focused on the computation of more informative citation-based measures. The tool focuses on providing an open and resource-oriented way to evaluate research impact and stresses the customization of existing evaluation procedures, such as the h-index and g-index measures. ResEval made it possible to introduce new customized evaluation procedures in the form of web services. Likewise, new data sources can be added with the help of web services that encapsulate the logic of calling a data source API or crawling data from its web pages. That data can then be used to compute the various metrics provided by the tool.

      By and large, the functionalities that ResEval provided mainly targeted experienced developers, since implementing new web services, crawling data from web pages, performing filtering, aggregating results, etc. are all tasks that only an experienced developer is capable of performing [46]. For this reason, the tool failed to achieve its objective: no support was provided for end-users (non-technical users), which is the main requirement of this field. The lessons learned from other tools and from our own motivated us to think about a solution that stays within the boundaries of an end-user’s expertise.

      2.5.4 Research Gate

      Research Gate is a new and different kind of entry in the list of existing tools. It is not built with the same goal as the other tools; rather, it aims at providing a social networking platform for scientists and researchers. It is oriented more towards finding collaborations, sharing papers, and asking and answering questions than towards performing research evaluation. Nevertheless, we believe that in the near future new and advanced research evaluation methods will be used instead of the traditional ones. These methods could be based on the social reputation that a researcher gains through his/her social interactions, such as valuable shares of scientific papers, datasets and experiments, answers to peers’ questions, and the like.

      To the best of our knowledge, there are not many other tools built for the purpose of research evaluation for a broader audience. There are efforts within different communities, but these only address the specific problems of a specific community. The lack of a general-purpose, flexible, yet end-user-oriented tool leaves a huge gap for the growing community of researchers, which is why complex research evaluation tasks still pose challenges for non-technical users that the existing solutions have not yet addressed.

      2.6 Analysis and Discussion

      This section presents a critical analysis of the aforementioned bibliographic indicators, data sources, and impact evaluation tools. We have presented different indicators that have been developed and used for assessment purposes over the years. We also noticed that these indicators evolved over time, and that scientific communities have adopted them in one way or another (e.g., as a customized version of an indicator). However, we have not found any consensus on a commonly accepted set of indicators, which suggests that research impact evaluation is a diverse field, where everyone has their own interpretation of what an evaluation procedure should be. To further support this observation, in the following we present studies that reach the same conclusion.

      In [23], the authors analyzed the relationship of the well-known h-index with other bibliometric indicators. Their analysis was based on a set of publications downloaded from the Web of Science (1994-2004) for Spanish CSIC scientists in Natural Resources, where the actual impact assessment was conducted through the h-index. They argued for giving more weight to those researchers who do not produce a high number of publications but who achieve a very significant impact. Although the h-index considers both the quantity and the impact of publications, a researcher’s h-index value cannot exceed his/her publication count. They emphasized the use of diverse indicators for a better productivity assessment instead of the h-index alone; moreover, they noted that the widespread use of a single index (e.g., the h-index) might influence researchers’ publication behavior. Several other bibliometric indicators have been analyzed to distinguish between researchers. For example, in [47], the author compared the h-index with other indicators using Bayesian statistics, in order to determine which indicator performs better with respect to publication data. They concluded that, in order to capture the long-term scientific productivity of a researcher, most indicators require a minimum of 50 publications as input.

      It is widely accepted that some indicators show a strong bias towards some scientific fields. For instance, the h-index tends to create problems when it is used to compare researchers from different fields, as also identified in a related study conducted by [48], where the authors analyzed the level of a researcher in relation to the academic reward system in the Netherlands. They compared the h-index with other bibliometric indicators in different fields and concluded that comparing scientists from different fields using the h-index is not appropriate. Another interesting analysis was conducted by [49] among different types of scientists, namely low producers, big producers, selective scientists (those researchers who do not produce a very high number of documents but who do attain a high impact), and top scientists, in the Natural Resources field at the Spanish CSIC. Their analysis was based on the g-index and the h-index. They found that these indicators clearly distinguish between low producers and top scientists. However, in the case of selective scientists and big producers, these indicators do not perform well. Their results show that the g-index is more sensitive than the h-index. Therefore, this work shows that the two indicators do not replace each other, and that both have their own advantages and disadvantages. A similar conclusion was reached in [50], whose authors analyzed 26 practical cases of physicists from the Institute of Physics at Chemnitz University of Technology.

      Some studies have addressed one of the most criticized aspects of these indicators, namely the possible influence of self-citations. Including self-citations in the computation of citation-based indicators inflates the apparent research impact of a scientist. In [51], the author presented results obtained on several bibliographic datasets. They showed that self-citations do have an impact on the h-index, particularly in the case of young researchers, and proposed to discern self-citations when assessing impact. Various scientific communities largely agree on excluding self-citations before performing research evaluation tasks.
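
      The effect of excluding self-citations can be illustrated with a small sketch. The data model below (a paper as a dictionary with an "authors" list and a "cited_by" list holding the author lists of the citing papers) is a hypothetical one chosen for illustration, and a citation is treated as a self-citation when the citing and cited papers share at least one author, which is only one of several possible definitions.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def citation_counts(papers, exclude_self=False):
    """Count citations per paper, optionally dropping self-citations.

    Assumption: a citation is a self-citation if the citing paper
    shares at least one author with the cited paper.
    """
    counts = []
    for paper in papers:
        authors = set(paper["authors"])
        citing = paper["cited_by"]
        if exclude_self:
            citing = [c for c in citing if authors.isdisjoint(c)]
        counts.append(len(citing))
    return counts


# Hypothetical data: three papers of researcher "A".
papers = [
    {"authors": ["A", "B"], "cited_by": [["A"], ["C"], ["D", "E"]]},
    {"authors": ["A"], "cited_by": [["A", "F"], ["A"]]},
    {"authors": ["A", "C"], "cited_by": [["G"], ["A"]]},
]

print(h_index(citation_counts(papers)))                     # 2
print(h_index(citation_counts(papers, exclude_self=True)))  # 1
```

      In this toy dataset the h-index drops from 2 to 1 once self-citations are removed, which mirrors the inflation effect discussed above.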

      The correct usage of the indicators has been the primary concern of many studies; even in his h-index proposal, Hirsch noted that applying the h-index to compare scientists from different communities is not appropriate. Factors such as normalization vary across fields; thus, the reference practices and traditions of the different fields should also be considered.

      A particularly interesting aspect of the computation of these indicators is the data sources used to fulfill the data requirements. Until a few years ago, only very few data sources (e.g., ISI Web of Science, Scopus, etc.) were available to compute the various indicators. However, this number has increased in recent years, and a number of alternatives have become available, as presented in Section 2.4. Some of these sources cover only a single discipline, like Chemical Abstracts produced by the American Chemical Society, MathSciNet by the American Mathematical Society, etc. On the other hand, a number of multidisciplinary data sources have emerged, like Google Scholar, Scopus, and CiteSeer. Compared to discipline-oriented sources, these have been used in many studies and also for scientific evaluation purposes.

      In [52], the author analyzed three main data sources (Google Scholar, Scopus and Web of Science). The study focused on the pros and cons of these three largest cited-reference-enhanced, multidisciplinary databases. The author argued that some aspects of determining the h-index need scrutiny, because the content of reference databases can influence h-index values due to problems such as the completeness, scope and coverage of the data source. In another study [53], the authors examined the citation counts, the ranking by citations and the h-index values of the top 22 researchers in the human-computer interaction (HCI) field. They used Scopus and Web of Science as data sources. Their results show that Scopus provides more coverage in this field than Web of Science. They found significant differences in the h-index values, with Scopus yielding values much closer to the actual ones.

      In our literature review, we observed that the usage of bibliometric indicators from different perspectives is highly subjective. A number of studies raised concerns about data source problems in terms of completeness, coverage and scope. A number of studies have also addressed the sensitive issue of choosing proper indicators. Moreover, we found that their usage varies widely across scientific communities. Research executives, institutes, and communities have different assessment requirements; hence it is hard to say that a single indicator would be truly effective. Some studies propose to use one indicator, while others propose to use a variation of it or recommend other indicators. Moreover, the issues related to the comparison of researchers, research groups and institutes that belong to different communities have not been addressed yet, and relying on a single indicator is not a recommended practice.

      We have also noticed that all the currently available tools lack, in our view, some key features, mainly: (1) completeness of data, (2) flexibility and personalization features, (3) languages to support user-defined evaluation procedures, queries and metrics, and (4) data processing features. The possibility to define customized metrics is essential for personalized access to the information; e.g., one might want to exclude self-citations from the h-index value of a researcher or see how an index changes when citations coming from the top co-authors are excluded [54]. In this thesis, we propose an approach to tackle these challenges, which we believe stem mainly from the high diversity of this field. Thus, providing ingredients to be used in research evaluation procedures will be more beneficial than restricting users to a fixed set of features. Moreover, the people responsible for performing these tasks often lack technical skills, which is also a major drawback of the current solutions, as they do not target these non-technical users. To this end, in the next chapter we explore techniques that could enable these users to be involved easily and effectively in such complex and technical tasks.

      Chapter 3 End-user Development & Mashups: State of the Art

      3.1 Overview

      By and large, in the current era, most people are familiar with the use of computers, at least with the basic functionalities and user experience that computers provide. These computer users include engineers, teachers [55], doctors, salesmen, scientists [56], managers, and children [57]. Based on a survey conducted by the U.S. Bureau of Labor Statistics, Boehm et al. [58] predicted that in 2005 there would be 55 million such end-users (i.e., computer users who use spreadsheets and databases and write formulas and queries for their daily work requirements). Another work [59], also based on a survey conducted in 2005 by the U.S. Bureau of Labor Statistics, reported that the end-user population had already increased to 80 million. Moreover, based on the rate of increase from 1995 to 2005, the same work predicted that this number would reach 90 million in 2012.

      The nature of the work in which many of these users are involved varies rapidly, on the scale of months or even days. Thus, the need for more intuitive, easy-to-use and flexible development environments has increased with the growth of the end-user population. Despite many efforts, it is still challenging for end-users to develop or modify applications that support and fulfill their goals, as this process requires considerable expertise in programming languages that these users lack. On the other hand, traditional requirement elicitation methods and computer programmers simply cannot anticipate and meet all of these requirements.

      End-user development (EUD) is a way to solve this problem. EUD empowers less skilled users so that they can be easily and effectively involved in development processes and can develop and tailor applications on their own. More specifically, EUD provides techniques, methods, and tools that allow users to cope easily with new requirements within the boundaries of their expertise [60]. Over time, different EUD techniques have emerged that target different classes of end-users with different levels of expertise [61], [62], [63], [64], [65].

      In this chapter, we present the state-of-the-art methods that have been proposed in the field of end-user development, together with an analysis of the major techniques, methods, and tools used for this purpose. We also discuss the major paradigms that are considered the foundation of EUD. Moreover, this chapter introduces the emerging field of mashups, especially in the context of EUD, along with the various developments that have been proposed in this field. Finally, we discuss how mashups can be a better choice for less skilled users.

      3.2 End-user Development

      The term end-user typically refers to a user of computer applications. In this context, the user is considered non-technical or less skilled, and a non-programmer. These users intend to use computer applications to fulfill their daily work requirements. The term end-user development, in turn, refers to the situation in which an end-user, who is not an expert in conventional computer programming languages, writes computer programs using either declarative or imperative programming techniques (more details on declarative and imperative techniques are presented in the next section). Thus, end-user development provides these users with enabling techniques, methods and tools that allow them to configure, tailor, modify or write new computer programs. Forms of end-user development include, to name a few, the use of spreadsheets, writing database queries, configuring software programs, visual programming, and the use of Wikis.

      Early efforts in the field of EUD focused on concepts like the customization and parameterization of software programs, and on tailoring and writing small scripts [66], [67]. These enabling techniques allow end-users, for example, to write scripts in the form of macros for MS Word using Visual Basic syntax, to perform complex computations or data processing with the help of spreadsheets (e.g., MS Excel), or to configure the settings of a software program using different parameters (e.g., various graphical settings). Over time, increasingly complex user requirements pushed some of these technologies (e.g., writing scripts or macros) beyond the technical expertise of non-programmers because of their higher technical demands, while others (e.g., spreadsheets) became simply inadequate for performing non-trivial tasks.

      However, new approaches emerged. Among those, programming by example, also known as programming by demonstration, reduces to some extent the effort a user needs to learn traditional programming abstractions [61]. In this approach, a computer program records the user’s actions and, after generalizing that set of actions, performs the same (though not necessarily identical) actions in other similar situations. Over time, the Internet, and especially the growth of Web 2.0 technologies, made it possible to provide a common platform for everyone to produce and consume resources at any time, in any form and from anywhere. Among these resources, open data, Web services, online APIs, and feeds (i.e., RSS/Atom feeds) are the most popular. Although the need for more intuitive development environments and design support for end-users clearly emerges from research on end-user development, for example for web services [8; 9], not many tools and frameworks are yet available to satisfy this need. From a conceptual point of view, there are currently two main approaches to enable less skilled users to develop programs: simplifying development practices and enabling reuse. That is, in general, development can be eased either by simplifying it (e.g., limiting the expressive power of a programming language) or by reusing knowledge (e.g., copying and pasting from existing algorithms).

      Among the simplification approaches, the workflow and Business Process Management (BPM) community was one of the first to propose that abstracting business processes into tasks and control flows would also allow less skilled users to define their own processes. Yet, in our opinion, this approach achieved little success, and modeling still requires training and knowledge. The advent of the Service-Oriented Architecture (SOA) substituted tasks with services, yet composition is still a challenging task even for expert developers [9] [8]. The reuse approach is implemented by program libraries, services, or templates (such as generics in Java or process templates in workflows). It provides either building blocks that can be composed to achieve a goal, or an entire composition (the algorithm, possibly made generic if templates are used), which may or may not suit a developer’s needs.

      In recent years, several research projects, such as Search Computing (http://www.search-computing.it/) [68], mashArt [69], FAST (http://fast-fp7project.morfeo-project.org) [70] and even our own earlier tool ResEval [71], have spent substantial effort on empowering end-users with tools and methods for software development (some of these projects refer to end-users as expert users, to distinguish them from generic, completely unskilled users). In the following, we look at this field from a different perspective and elaborate on which paradigms and ingredients best aid end-users in performing development tasks, most notably in formulating complex tasks. We also discuss various dimensions of end-user programming, including vertical versus horizontal language definition and declarative versus imperative approaches.

      3.3 Enabling Practices and Techniques

      Enabling end-users to develop their own applications, or to compose applications by combining the different pieces available online in the form of public web services, APIs or data in various forms, requires simplifying current end-user development practices. A variety of approaches may help simplify end-user development, a few of which were discussed in the previous section. In this section, we discuss the most important ones in detail, in order to use them in the next section to analyze the approaches that partly aim at supporting end-users in composing complex applications.

      3.3.1 Simple Programming Models

      The first issue is to understand which programming paradigms are best suited for end-user programming. A solution can take inspiration from existing experiences with orchestration and mashup languages, which target process automation and relatively inexperienced users, although they have not yet been very successful in reaching out to non-IT experts. The aim is to find programming abstractions that are simple enough to appeal to domain experts and at the same time expressive enough to implement enterprise procedures and Web application logic.

      For instance, some mashup approaches rely heavily on connections between components, as is the case for Yahoo! Pipes and IBM Damia [72], and are therefore inherently imperative. Other solutions completely disregard this aspect and focus only on the components and their pre- and post-conditions in order to match them automatically, following a declarative philosophy like the one adopted in choreographies, as stated for instance in the FAST European project [70].
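
      The difference between the two styles can be made concrete with a toy sketch. In the first variant the user wires the components explicitly; in the second only the pre-conditions ("needs") and post-conditions ("provides") of each component are declared, and a simple forward-chaining matcher derives the execution order. The component names, the dictionary layout and the matching strategy are all hypothetical and chosen for illustration; they are not taken from Yahoo! Pipes, IBM Damia or FAST.

```python
# Connection-based (imperative) style: the user draws every wire explicitly.
WIRES = [
    ("fetch_publications", "count_citations"),
    ("count_citations", "compute_h_index"),
]

# Declarative style: components only state what they need and what they provide.
COMPONENTS = {
    "compute_h_index": {"needs": {"citation_counts"}, "provides": {"h_index"}},
    "fetch_publications": {"needs": {"author_name"}, "provides": {"publications"}},
    "count_citations": {"needs": {"publications"}, "provides": {"citation_counts"}},
}


def plan(components, available, goal):
    """Derive an execution order by matching pre- and post-conditions.

    Repeatedly schedules every component whose needs are already satisfied
    until the goal item becomes available (simple forward chaining; it may
    schedule components that are not strictly required for the goal).
    """
    available = set(available)
    remaining = dict(components)
    order = []
    while goal not in available:
        ready = [name for name, c in remaining.items() if c["needs"] <= available]
        if not ready:
            raise ValueError(f"cannot produce {goal!r} from the given inputs")
        for name in sorted(ready):
            available |= remaining.pop(name)["provides"]
            order.append(name)
    return order


print(plan(COMPONENTS, {"author_name"}, "h_index"))
# ['fetch_publications', 'count_citations', 'compute_h_index']
```

      In the declarative variant the user never specifies the wires; the order listed in WIRES is recovered automatically from the declared conditions.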

      3.3.2 Domain-specific Modeling

      The idea of focusing on a particular domain and exploiting its specificities to create more effective and simpler development environments is supported by a large number of research works [73] [74] [75] [76]. These works are mainly related to Domain-Specific Modeling (DSM) and Domain-Specific Languages (DSLs).

      In DSM, domain concepts, rules, and semantics are represented by one or more models, which are then translated into executable code. Managing these models can be a complex task that is typically suited only to programmers, but it increases users’ productivity. This is possible thanks to domain-specific programming instruments that abstract from low-level programming details and to powerful code generators that “implement” on behalf of the modeler. Studies using different DSM tools (e.g., the commercial MetaEdit+ tool and the academic solution MIC [73]) have shown that developers’ productivity can be increased by up to an order of magnitude.

      3.3.3 Domain-specific Languages (DSLs)

      Simple programming models are not enough. Typically, end-users simply do not understand what they can do with a given development tool, a problem that is basically due to the fact that the development tool does not speak the language of the user and, hence, its programming constructs do not have any meaning to the user. Domain-specific languages aim at adding domain terminology to the programming model, in order to give the constructs a domain meaning.

      In the DSL context, although we can find solutions targeting end-users (e.g., Excel macros) and medium-skilled users (e.g., MATLAB), most current DSLs target expert developers (e.g., Swashup [77]). Here, too, the introduction of the “domain” raises the abstraction level, but the typically textual nature of these languages makes them less intuitive, harder to manage and less suitable for end-users compared to visual approaches. A number of benefits and limits of the DSM and DSL approaches are summarized in [76] and [75].

      In some fields, such as database design, domain-specific languages are a consolidated practice: declarative visual languages like the ER model are well accepted in the field. Other, more imperative approaches, like WebML [78], address developers who are willing to embrace conceptual modeling. Business people, on the other hand, are well aware of workflow modeling practices and are able to work with formalisms like BPMN, completely ignoring what happens behind the scenes in terms of both the technological platform and the transformations applied to get to a running application. Another example in this category is Taverna [79], a workflow management system well known in the biosciences field. As the DSL approach is the one most closely related to our proposed solution, we present a more precise classification of DSLs in Section 3.4.

      3.3.4 Web Service Composition

      BPEL (Business Process Execution Language) [80] is currently one of the most used solutions for web service composition, and it is supported by many commercial and free tools. BPEL provides powerful features for service composition and orchestration, but no support is provided for UI integration. This shortcoming is partly addressed by the BPEL4People [81] and WS-HumanTask [82] specifications, which aim at introducing human actors into service compositions. Yet, the specifications focus on the coordination logic only and do not support the design of the UIs for task execution. The MarcoFlow project [11] provides a solution that bridges the gap between service and UI integration, but the approach is still complex and only suited for expert programmers.

      3.3.5 Intuitive Interaction Paradigms

      The user interfaces of development tools may not be a complex theoretical issue, but acceptance of programming paradigms can be highly influenced by this aspect too. The user interface comprises, for instance, the selection of the right graphical or textual development metaphor so as to provide users with intelligible constructs and instruments. It is worth investigating and abstracting the different kinds of actions and interactions the user can have with a development environment (e.g., selecting a component, writing an instruction, connecting two components), to then identify the best mix of interactions that should be provided to the developer.

      3.3.6 Reuse of Development Knowledge

      Finally, even if a tool speaks the language of the user, it may still happen that the user does not speak the language of the tool, meaning that he/she still lacks the necessary basic development knowledge in order to use the tool profitably. Such a problem is typically solved by asking more expert users (e.g., colleagues or developers) for help – if such is available. The challenge is how to reuse or support the reuse of development knowledge from more expert users in an automated fashion inside a tool, e.g., via recommendations of knowledge [83].

      Recommendations can be provided based on several kinds of information, including components, program specifications, program execution data, test cases, simulation data, and possibly mockup versions of components and program fragments used for rapid prototyping. Information may or may not be tagged with semantic annotations. When present, the annotations can be used to provide better/more accurate measures of similarity and relevance. In a general sense, the approach we envision is an alternative to design patterns for exploiting the expertise of good developers, thus allowing reuse of significant designs.

      Programming, testing, and prototyping experiences of peers or of more experienced developers may support the entire development lifecycle. If knowledge is harvested and summarized from peers (e.g., by analyzing their mashup definitions), this opens the door to what we can call “implicit collaborative programming” or “crowd programming”, where users, while going through a software engineering lifecycle for implementing procedures of their own interest, create knowledge that can be shared and leveraged by other domain experts for their own work.

      3.4 Domain-Specific Languages: Discussion

      We have seen that Domain-Specific Languages (DSLs), i.e., design and/or development languages that are designed to address the needs of a specific application domain, are important to provide the end-user with familiar concepts, terminology and metaphors. That is, DSLs are particularly useful because they are tailored to the requirements of the domain, both in terms of semantics and expressive power (and thus do not force end-users to study more comprehensive general-purpose languages) and of notation and syntax (and thus provide appropriate abstractions and primitives based on the domain). In the following, we highlight a few possible classifications of these languages, which can be useful for EUD. In particular, we describe the dimensions of focus, style and notation.

      The focus of a DSL can be either vertical or horizontal. Vertical DSLs aim at a specific industry or field. Examples of vertical DSLs include configuration languages for home automation systems, modeling languages for biological experiments, analysis languages for financial applications, and so on. Horizontal DSLs, on the other hand, have a broader applicability, and their technical and broad nature allows for concepts that apply across a large group of applications. Examples of horizontal DSLs include SQL, Flex, WebML, and many others.

      The style of a DSL can be either declarative or imperative. Declarative DSLs adopt a specification paradigm that expresses the logic of a computation without describing its control flow. In other words, the language defines what the program should accomplish, rather than describing how to accomplish it. Imperative DSLs instead require defining an executable algorithm that states the steps and the control flow that need to be followed to successfully complete a job.
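
      The distinction can be illustrated with a toy sketch that selects the titles of publications from 2005 onwards, ordered by citations. The imperative version spells out the steps and control flow, while the declarative version only states the desired result as a small query specification that a generic interpreter executes. The query format, the interpreter and the sample data are hypothetical and serve only to illustrate the two styles.

```python
import operator

# Hypothetical publication records.
publications = [
    {"title": "P1", "year": 2003, "citations": 40},
    {"title": "P2", "year": 2008, "citations": 12},
    {"title": "P3", "year": 2010, "citations": 25},
]


def imperative_recent_by_citations(pubs):
    """Imperative style: the steps and the control flow are written out."""
    selected = []
    for pub in pubs:
        if pub["year"] >= 2005:
            selected.append(pub)
    selected.sort(key=lambda p: p["citations"], reverse=True)
    return [p["title"] for p in selected]


# Declarative style: the specification says what is wanted, not how to get it.
QUERY = {
    "select": "title",
    "where": ("year", ">=", 2005),
    "order_by": ("citations", "desc"),
}

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def run(query, pubs):
    """Tiny interpreter that executes a declarative query specification."""
    field, op, value = query["where"]
    rows = [p for p in pubs if OPS[op](p[field], value)]
    key, direction = query["order_by"]
    rows = sorted(rows, key=lambda p: p[key], reverse=(direction == "desc"))
    return [p[query["select"]] for p in rows]


print(imperative_recent_by_citations(publications))  # ['P3', 'P2']
print(run(QUERY, publications))                      # ['P3', 'P2']
```

      Both versions produce the same result; in the declarative one, the control flow lives in the interpreter rather than in the user's specification.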

      The notation of a DSL can be either graphical or textual. The graphical DSLs (also known as Domain Specific Modeling Languages, DSML) imply that the outcomes of the development are visual models and the development primitives are graphical items such as blocks, arrows and edges, containers, symbols, and so on. The textual DSLs comprise several categories, including XML-based notations, structured text notations, textual configuration files, and so on.

      Despite the various experiences in DSL design and application, there is no general assessment on the preferences of the developers for one or the other kind of language depending on the user profile. However, typically languages oriented to the end-users tend to be more visual and declarative, while the ones for developers are often textual and imperative.

      3.5 Mashups from an End-User Development Perspective

      3.5.1 Web 2.0 & Enabling Technologies

      During the last decade, Web 2.0 has proved to be a successful enabling environment that lets ordinary web users take part in both the creation and the consumption of Web resources of various types, such as blogs, wikis and social media. Whereas Web 1.0 was known as the “web as information source”, Web 2.0 is called the “web as participation platform”. From an EUD point of view, the key features of Web 2.0 include “rich user experience”, “user as a contributor”, and “user participation”.

      Among the relevant Web 2.0 technologies, the Service-Oriented Architecture (SOA) field emerged as a paradigm in software development. The emerging visions of an Internet of Services (IoS) and of Web Service Ecosystems [84] [85] supported SOA and have shown much potential in the field. However, the major focus of these technologies remained on the technical level of service-to-service interactions [86], with little attention to service-to-user (i.e., non-programmer) communication. Given the high technical complexity of the relevant standards (e.g., WSDL, SOAP, UDDI, REST), we think that building full applications on these standards is even harder for non-programmers than modeling their own process or service composition, because it requires a lot of programming knowledge to deal with data, application and presentation issues.

      In parallel to the evolution of Web 2.0 technologies, the Web mashup [87] phenomenon emerged. Mashups provide easier ways to glue services and data together [88] and claim to enable non-programmers as well to use and mash up pre-built components that abstract complex programming concepts. Before investigating mashups further, and to set the context, let us introduce the most commonly used terminology. Typically, the term mashup refers to web applications that, rather than being developed from scratch, are developed using data, functionalities or user interfaces available on the Web. Mashup tools, in turn, provide non-programmers with development and runtime environments for the composition and execution of mashup applications, enabling them to create their own situational applications [89].

      Based on the Web 2.0 philosophy, a new type of mashup-based approach emerged, known as Enterprise Mashups [90]. It has been adopted and has evolved mostly in large companies, where rapidly changing requirements call for employees to be supported by more sophisticated information technologies. Within an organization, the key components of Enterprise Mashups are “resources”, “widgets” and “mashups”, which deal with data (i.e., actual content), application logic (i.e., implementing actual business logic) and the mashup application (i.e., assembling together a collection of widgets), respectively. In [91] the authors introduce mashup concepts and present a mashup model for syntactically composing mashups. In this model, a mashup is defined as a network of mashlets. These mashlets are the main mashup components and consist of a set of relations, e.g., internal relations, I/O relations and web service relations. They can be GUI-based and can be organized hierarchically, i.e., complex mashlets can contain simpler ones. Mashlets are defined by means of rules that state what the input, the output and the possible service calls are. The authors also explain the necessity of allowing the user to query and update the data dynamically in the mashup, as well as to add, update or remove mashlets at run time.

      Mashups aim to bring together the benefits of both simplification and component reuse. We believe that, in order to move application development from programmer-centric to end-user-centric, we need to achieve simplicity at both ends (i.e., on the technology side as well as on the end-user side). While component-based reuse certainly removes part of the complexity, developing one's own application still means dealing with data integration, application logic, and content presentation issues, all aspects the common web user is not even aware of. In domain-specific mashup environments, as in our case, we aim to push simplification even further than generic mashup platforms do, by limiting the environment (and, hence, its expressive power) to the needs of a single, well-defined domain. Reuse is supported in the form of reusable domain activities, which can be mashed up.

      3.5.2 Tool-Assisted Mashup Development

      In this section we review a number of representative mashup tools and evaluate them against the aspects we consider fundamental for addressing real-life end-user needs. Among the main assessment aspects, support for the integration of data, services and user interfaces is fundamental. This functionality is known as universal integration. Moreover, we base our analysis on the requirements we gathered during our domain analysis that matter most for end-users, namely intuitiveness of the UI, modeling constructs, execution paradigm, and data mappings.

      Yahoo Pipes! (http://pipes.yahoo.com) is a well-known mashup tool by Yahoo. It provides a number of built-in components and a visual composition editor for designing data processing logic. The editor works in a drag-and-drop fashion: a user can drop, connect and configure components. Its composition environment is attractive and allows web users to build data-centric compositions. The components follow a data-flow approach, in which each component waits until data becomes available at its input port. The data-flow approach is more intuitive for an end-user than a control-flow approach, as long as it stays trivial. Yahoo Pipes, however, mainly offers a very technical set of components (i.e., modeling constructs) such as loops, regular expressions, URL builders and RSS feeds, which makes the composition task complex for non-technical users. Such users can hardly understand these components and are consequently unable to build compositions. Moreover, these programming-related components, which require basic knowledge of programming concepts, may have multiple input and configuration parameters through which the connection between two components takes place. In essence, complex data mappings have to be performed to compose a valid mashup. Finally, Yahoo Pipes does not support UI integration, and its support for service integration is poor and beyond an end-user's technical reach.

      Likewise, the components that Yahoo Pipes offers are generic rather than domain-specific and are understandable only by programmers. We therefore believe that a domain expert (i.e., a non-IT user) is not able to get fruitful results from this tool. One of the main reasons is that it does not offer the domain constructs and terminology that domain experts are familiar with; instead, it exposes programming notations.

      Microsoft Popfly (http://www.popfly.ms). Among the other popular mashup-based tools is Microsoft Popfly, introduced by Microsoft. This mashup tool targets universal integration (i.e., data, application and UI). Besides other tools (e.g., a game creator and a web creator), it offers a mashup creator that provides pre-built components and lets users mash them together to make applications. Also in this case, however, end-users were generally not able to develop real-life applications. Popfly was discontinued in August 2009.

      Intel Mash Maker [92] takes a different mashup approach that mainly focuses on the data (i.e., online content) a user is interested in. Intel Mash Maker is a Firefox extension that runs in the client-side browser and adds a toolbar with a set of buttons representing various functionalities. It monitors the user's behavior, i.e., which information the user visits or is interested in, and automatically builds a mashup application that could be of interest to the user, even when the user was not aiming at building one. The tool mainly extracts relevant data from Web pages but does not provide any data integration functionality. Moreover, it provides no UI or data presentation features, nor does it allow service composition. Proper use of the tool, especially of its advanced features, requires programming skills that non-technical end-users lack.

      mashArt. Among the academic projects, a noticeable example is the mashArt [93] project. The tool aims at universal integration and end-user-centric development. To this end, mashArt comes with models and languages able to accommodate all three types of components (i.e., data, services and UIs), and with a simple web-based editor and an integrated lightweight runtime environment (allowing for instantaneous previewing) targeted at skilled web users who are not IT experts. Although mashArt achieves universal integration, it is not able to effectively target end-users: the tool does not solve the problem of complex data mappings, and the components it provides are mostly generic.

      JackBe Presto. JackBe Presto (http://www.jackbe.com) is one of the popular commercial products. The Presto suite consists of several distinct tools. One of them, Presto Wires, is the composition development environment; it adopts a Pipes-like approach for mashing up data from enterprise-internal and external sources. The suite also allows a portal-like aggregation of UI widgets (so-called mashlets, developed through the Presto Mashlet tool) that visualize the output of such mashups on a dashboard. Each mashlet is independent of the others, and thus synchronization at the presentation level is limited. This enterprise solution focuses on integrating enterprise-internal or external data and on visualizing them in the form of widgets. The portal-like approach, in general, provides a satisfying level of usability for end-users. However, universal integration is not actually achieved.

      Taverna is a mashup-like application that allows the integration of existing data sources (i.e., molecular biology sources) available on the Web [94; 95]. The tool allows users to design, execute and share workflows built from web services in the domain of molecular biology and bioinformatics. Components can be added and connected visually in a drag-and-drop way, and different kinds of services can be added to the service panel of the tool. Because mashups are intended to integrate data from one or more sources, the previous version of Taverna [94] cannot be considered a mashup tool, since it only focused on the integration of services. The current version, Taverna 2 [95], supports data streaming through pipelining, so data-driven workflow computation can be performed. Although the tool focuses on a particular domain, it is still not suitable for non-technical users because of the complexity of web service usage and of the required data mappings.

      Similarly, the CRUISe project [96] specifically focuses on composability and context-aware presentation of UIs, but does not support the seamless integration of UI components with web services. The ServFace project (http://www.servface.eu), instead, aims to support normal web users in composing semantically annotated web services. The result is a simple, user-driven web service orchestration tool, but UI integration and process logic definitions are rather limited, and basic programming knowledge is still required.

      Although a number of other mashup-related tools and platforms exist (e.g., Deri Pipes (http://pipes.deri.org/) and Dapper (http://dapper.net/), to name a few), they all show limitations similar to those of the solutions presented above (i.e., lack of universal integration support and/or of simplicity of use for non-technical users, complex data mappings and so on). Our analysis of the current mashup initiatives highlights that none of the proposed solutions is able to successfully empower end-users to develop applications that actually support their daily activities. This is mainly due to the fact that, although through intuitive visual metaphors, most of them still expose programming concepts which, according to [97], have semantics that end-users do not understand and do not want to learn. In the following section we summarize our analysis of all of these directions.

      3.6 Analysis and Discussion

      End-user development comprises several alternative approaches, spanning from mashup development, to software configuration, to simple programming tasks. These approaches are often distinct, but sometimes they can be combined to exploit their respective strengths [98]. For instance, while users are getting more and more used to configuring applications, also thanks to the pervasiveness of mobile and gaming software, mashup platforms for the development of simple Web applications are also gaining popularity.

      Yet, mashups were actually born as a hacking phenomenon, where very expert developers build applications by integrating reusable content and functionality sourced from the Web (see, for instance, programmableweb, www.programmableweb.com), and, despite numerous attempts, mashup development is still for skilled programmers only. For instance, the very popular mashup tool Yahoo Pipes! (discussed in the previous section) provides a mashup environment with a variety of components. These components wrap very generic programming features, providing a set of high-level functionalities such as loops, if-conditions, parameter passing and web-service binding. These functionalities are neither understandable by end-users nor useful for their everyday application development needs.

      Actually, mashup tools initially targeting end-users slowly moved towards the expert user, then to the developer, and finally to the expert developer. In fact, both model-driven web engineering [99] and mashup development [69] have shown that there are basically only two user classes in the real world. The first class represents developers, who want to see the source code and to write imperative code on their own. These users do not trust model-driven approaches, because they feel these can reduce their freedom in application development. The second class represents non-developers, who want to ignore all the technical issues and have simple, possibly visual or parameter-based configuration environments for setting up their applications.

      A finer stratification of the “developer” class (e.g., expert users, entry-level developers, developers/designers) can be defined theoretically but does not actually exist in practice. Recognizing only two major user classes makes the goal of empowering non-developers more focused, yet it remains challenging. As presented in this chapter, many approaches have been proposed to help these users develop their own applications, and they have largely failed to do so. One of the reasons non-technical users find these enabling solutions difficult to use in practice is the language the solutions speak: the constructs, concepts and modeling paradigms they use are not understandable by the users. Our goal is therefore to provide non-developers, our target user class, with an end-user development platform whose main theme is to speak the language of the user. That is, we present a domain-specific approach that leverages the strengths of mashups to offer an intuitive, easy-to-use yet flexible end-user development platform that ultimately speaks the user's language by incorporating the domain concepts, terminology, rules, and syntax the user is familiar with.

      Chapter 4 Research Evaluation Example Scenarios and Requirements Understanding

      4.1 Overview

      A thorough analysis is needed to capture both the important conceptual aspects and the low-level details of a domain. We therefore first present a few real evaluation procedures from the domain of research evaluation. For our selected domain, we gathered evaluation procedures from domain experts working in different departments of our own and other universities. The domain experts who perform or were involved in these evaluation tasks include professors, postdocs, administrative personnel and PhD students. They were involved in research evaluation tasks ranging from simple to complex ones. The procedures obtained helped us to examine the domain thoroughly and to extract both domain and user requirements. In the following sections we describe these evaluation procedures and their relevant details, to better understand the problems, requirements, and important associated concepts. At the end we also present a set of general requirements extracted from the analysis of all the procedures.

      4.2 University of Trento Department Evaluation Procedure

      As an example of a domain-specific application scenario, let us describe the evaluation procedure used by the central administration of the University of Trento (UniTN) for checking the productivity of each researcher who belongs to a particular department. The evaluation is used to allocate resources and research funds to the university departments. In essence, the algorithm compares the quality of the scientific production of each researcher in a given department of UniTN with the average quality of researchers belonging to similar departments (i.e., departments in the same disciplinary sector) in all Italian universities. The impact measures of the individual researchers then collectively contribute to the evaluation of their department. The comparison uses the following procedure, based on one simple bibliometric indicator, i.e., a weighted publication count metric.

      1. A list of all researchers working in the selected department as well as in the Italian universities is retrieved from a national registry, and a reference sample of faculty members with similar statistical features (e.g., belonging to the same disciplinary sector) of the evaluated department is compiled.

      2. Publications for each researcher of the selected department and for all Italian researchers in the selected sample are extracted from an agreed-on data source (e.g., Microsoft Academic, Scopus, DBLP, etc.).

      3. The publication list obtained in the previous step is then weighted using a venue classification. That is, the publications are classified by an internal committee into three categories, which represent the quality of a particular venue, mainly based on the ISI Journal Impact Factor: A/1.0 (top), B/0.6 (average), C/0.3 (low). For each researcher a single weighted publication count parameter is thus obtained as a weighted sum of his/her publications.

      4. The statistical distribution – more specifically, a negative binomial distribution – of the weighted publication count metric is then computed out of the Italian researchers’ reference sample.

      5. Each researcher in the selected department is then ranked based on his/her weighted publication count by comparing this value with the statistical distribution. That is, for each researcher the respective percentile (e.g., top 10%) in the distribution of the researchers in the same disciplinary sector is computed.
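
      Steps 3 to 5 of the procedure can be sketched in a few lines. The sketch below is not the university's implementation: the function names and the sample data are hypothetical, and, as a labeled simplification, it ranks a researcher with an empirical percentile over the reference sample instead of fitting the negative binomial distribution mentioned in step 4.

```python
from bisect import bisect_right

# Venue class weights as given in step 3 of the procedure.
VENUE_WEIGHTS = {"A": 1.0, "B": 0.6, "C": 0.3}


def weighted_publication_count(venue_classes):
    """Sum the class weights of one researcher's publications (step 3)."""
    return sum(VENUE_WEIGHTS[c] for c in venue_classes)


def percentile_rank(value, reference_sample):
    """Percentage of the reference sample scoring at or below `value`.

    Assumption: an empirical percentile stands in for the fitted
    negative binomial distribution of steps 4 and 5.
    """
    ordered = sorted(reference_sample)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical usage: one researcher against an invented reference sample.
score = weighted_publication_count(["A", "A", "B", "C"])
reference = [0.3, 0.9, 1.6, 2.0, 3.5]
rank = percentile_rank(score, reference)
```

      Changing the weights in `VENUE_WEIGHTS` or the reference sample is enough to explore how sensitive the ranking is to these choices.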

      In Figure 4.1 we illustrate the steps a user has to perform to complete the described evaluation task. As shown in the figure (step 1), the process starts by fetching the researchers from the UniTN local repository and then fetching, from a national repository (i.e., a web site that provides data in Excel format), the list of all researchers of Italian universities who belong to the same discipline as the UniTN department. In step 2, the user has to retrieve the publications of both UniTN and Italian researchers from a data source, a task that is impractical to perform by hand. Next, these publications must be annotated with the venue classification defined by the university management; that is, each publication is assigned a weight depending on the venue it belongs to. The annotated publications are then used to compute the statistical distribution (i.e., negative binomial distribution), and the researchers are ranked by percentile accordingly. Finally, the results have to be presented in some visual format (e.g., charts, graphs, etc.).

      Figure 4.1: University of Trento department evaluation procedure, depicting steps a user performs manually

      The percentile of each researcher in the selected department is considered an estimate of the publishing profile of that researcher and is used for comparison with other researchers in the same department. As one can notice, plenty of effort is required to compute the performance of each researcher, and this is currently done mainly by hand. Fervid discussion on the suitability of the selected criteria often arises, as people would like to understand how the results would differ when changing the publication ranking, the source of the bibliometric information, or the criteria for the reference sample. Indeed, all these factors have a big impact on the final result and have been locally at the center of a heated debate. Many researchers would like to use different metrics, such as citation-based metrics (e.g., h-index). Yet, computing different metrics and variations thereof is a complex task that costs considerable human resources and time and is thus hardly feasible by hand.

      4.3 Italian Professorship Selection Scenario

      This section presents an evaluation procedure that was adopted by the National Agency for the Evaluation of Universities and Research Institutes (ANVUR) in 2012 for hiring and promoting professors. The original procedure is written in Italian; in the following we present an English translation.

      According to the original evaluation procedure document, the regulations for the national scientific qualification establish that some of the indicators/indexes, when used for candidates for the national scientific qualification, should be normalized according to the academic age (i.e., the number of years since the first publication of a researcher) of the candidate. The normalization criteria vary with the type of research output. In the following we describe the normalization procedure in detail.

      1. For the number of articles in magazines/journals present in the major international databases and published in the 10 consecutive years preceding the date of publication of the decree (i.e., another regulation), normalization must be performed only if the academic age is less than 10 years; it is performed by multiplying the number of articles by 10 and dividing by the academic age.

      2. For the total number of citations received by the whole scientific production, normalization is performed by dividing the number of citations by the academic age.

      3. For the number of books with ISBN published in the 10 consecutive years preceding the date of publication of the decree, normalization must be performed only if the academic age is less than 10 years; it is performed by multiplying the number of books by 10 and dividing by the academic age.

      4. For the articles in magazines/journals and the chapters in books with ISBN published in the 10 consecutive years preceding the date of publication of the decree, normalization must be performed only if the academic age is less than 10 years; it is performed by multiplying the number of articles and book chapters by 10 and dividing by the academic age.

      5. For the number of articles in magazines/journals that belong to “class A” and were published in the 10 consecutive years preceding the date of publication of the decree, normalization must be performed only if the academic age is less than 10 years; it is performed by multiplying the number of articles by 10 and dividing by the academic age.
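
      The normalization rules listed above can be sketched as two small functions. The function names are hypothetical, and the sketch assumes that the 10-year scaling is meant to apply only when the academic age is below 10 years (for an academic age of exactly 10 the factor 10/age equals 1, so the result is unchanged either way).

```python
def normalize_ten_year_count(count, academic_age):
    """Scale a 10-year output count (rules 1, 3, 4, 5) by 10 / academic age.

    Assumption: the scaling applies only for academic ages below 10 years;
    longer careers keep their raw count.
    """
    if academic_age <= 0:
        raise ValueError("academic age must be positive")
    if academic_age < 10:
        return count * 10 / academic_age
    return float(count)


def normalize_citations(total_citations, academic_age):
    """Divide the citations of the whole production by academic age (rule 2)."""
    if academic_age <= 0:
        raise ValueError("academic age must be positive")
    return total_citations / academic_age
```

      For instance, a candidate with 6 journal articles and an academic age of 4 years would be credited with 6 * 10 / 4 = 15 normalized articles under this reading.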

      In addition to the normalizations described above, the procedure also uses a customized version of the h-index, called the contemporary h-index (ch-index). The ch-index differs from the h-index in that it uses age-normalized citations of the papers selected for evaluation. The ch-index is defined using the following formula.

      $$S^c(i,t) = \frac{4}{t - t_i + 1}\, C(i,t) \quad \text{for } i = 1, \dots, N$$

      where $C(i,t)$ is the number of citations observed in the database at time $t$ for the i-th article and $t_i$ is the year of publication of the article. Thus $S^c(i,t)$ is the value of the citation indicator for the i-th article at time $t$. A candidate has ch-index $h$ if $h$ is the largest number such that $h$ of the candidate's $N$ articles have $S^c(i,t) \geq h$.
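
      The computation of the ch-index can be sketched as follows. The sketch assumes the commonly used form of the contemporary h-index, in which the citations of each article are scaled by gamma / (current_year - publication_year + 1)^delta with gamma = 4 and delta = 1; these parameter values, the function name and the sample data are assumptions, not taken from the original procedure document.

```python
def contemporary_h_index(articles, current_year, gamma=4.0, delta=1.0):
    """Largest h such that h articles have an age-weighted score >= h.

    `articles` is an iterable of (publication_year, citations) pairs.
    Assumption: publication years are not later than `current_year`.
    """
    scores = sorted(
        (
            gamma * citations / (current_year - year + 1) ** delta
            for year, citations in articles
        ),
        reverse=True,
    )
    h = 0
    for rank, score in enumerate(scores, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical usage with invented (year, citations) pairs.
sample = [(2012, 3), (2010, 9), (2005, 16), (2011, 1)]
ch = contemporary_h_index(sample, current_year=2012)
```

      In this example the age-weighted scores are 12, 12, 8 and 2, so three articles reach a score of at least 3 and the resulting value is 3. Recent citations weigh more than old ones, which is the intended difference from the plain h-index.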

      The normalized results of all the indicators are then compared against the threshold values that ANVUR has selected as the research quality thresholds for a specific research area. The intention is that a candidate performing well above the defined thresholds will be considered for hiring or promotion.

      4.4 Analysis and Domain-Specific Requirements

      If we look carefully at the described scenarios, we see that they are all domain-specific, i.e., entirely based on concepts that are typical of the research evaluation domain. For instance, the described evaluation procedures process domain objects (researchers, publications, metrics, and so on) and use domain-specific computation logic, specific data sources (i.e., localized venue classification), customized evaluation metrics (i.e., ch-index), and a set of domain-specific normalization rules. The many research evaluation approaches and tools built for evaluation purposes, as presented in chapter 2, could hardly anticipate these domain-specific requirements and consequently failed to support end-users.

      For example, the requirement we extract from these scenarios is that we need to empower the people involved in the evaluation process (i.e., non-programmers, such as the average faculty member or the administrative staff in charge of it) so that they are able to define and compose relatively complex evaluation processes, taking and processing data in various ways from different sources, and to visually analyze the results. A tool with such provisions should allow users to extract, combine, and process data and services from multiple sources, to integrate these ingredients in a user-defined way, and finally to represent the information in visual components. These are all characteristics that a mashup can have, especially if the mashup logic comes from the users.

      In order to enable the development of an application for the described evaluation procedures, there is no need for a composition or a mashup environment that supports as many composition technologies or options as possible. The intuition we elaborate is that, instead, a much more limited environment that supports exactly the basic tasks described in the scenarios (e.g., fetch the set of Italian researchers) and allows its users to mash them up in an as easy as possible way (e.g., without having to care about how to transfer data between components) will be more effective. However, to this end, the challenge lies in finding the right trade-off between flexibility and simplicity. The former, for example, pushes toward a large number of basic components, the latter towards a small number of components. As we will see, it is the nature of the specific domain that tells us where to stop.

      In the following sections, we therefore show how the development of a mashup tool that is capable of running these example scenarios can be aided by being domain-specific. Moreover, from the type of people who perform these evaluation tasks, we learned a number of requirements, which we present in the following. Turning the previous considerations into practice, the development of this tool will be driven by the following key principles:

      4.4.1 End-user centric requirements

      1. Intuitive user interface Enabling domain experts to develop their own research evaluation metrics, i.e., mashups, requires an intuitive and easy-to-use user interface (UI), both in terms of the tool's overall user experience and in terms of the modeling metaphors used for building mashup-based compositions. For example, from the very first step, that is, when users choose components, the components themselves should be visually understandable for the users.

      2. Intuitive modeling constructs Next to the look and feel of the platform, it is important that the functionalities provided through the platform (i.e., the building blocks in the composition design environment) resemble the common practice of the domain. For instance, we need to be able to compute metrics, to group people and publications, and so on.

      3. No data mapping Our experience with prior mashup platforms, i.e., mashArt [10] and MarcoFlow [11], has shown that data mappings are one of the least intuitive tasks in composition environments and that non-programmers are typically not able to correctly specify them. We therefore aim to develop a mashup platform that is able to work without the definition of data mappings.

      4. Intuitive execution paradigm When it comes to how mashup tools exchange data between components at run-time, end-users are usually left unaware of what is happening behind the scenes. We therefore aim to follow a data-flow paradigm that end-users are familiar with from their daily work. Moreover, we aim to reflect the execution states, so that users become aware of what is being processed and how.
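
      The last requirement can be made concrete with a minimal sketch of a data-flow execution in which a component runs as soon as data arrives at its input and reports its execution state. This is an illustration under stated assumptions, not the platform's actual runtime: the `Component` class, its state names and the example pipeline are all hypothetical.

```python
class Component:
    """A hypothetical mashup component with one input and one output."""

    def __init__(self, name, operation):
        self.name = name
        self.operation = operation
        self.state = "waiting"  # waiting -> running -> done
        self.successors = []

    def connect(self, other):
        """Wire this component's output to another component's input."""
        self.successors.append(other)
        return other

    def receive(self, data, log):
        """Run when data arrives, log state changes, push the result on."""
        self.state = "running"
        log.append((self.name, "running"))
        result = self.operation(data)
        self.state = "done"
        log.append((self.name, "done"))
        for successor in self.successors:
            successor.receive(result, log)


# Hypothetical pipeline: fetch -> rank -> show.
log, shown = [], []
fetch = Component("fetch", lambda _: [3, 1, 2])
rank = Component("rank", sorted)
show = Component("show", shown.append)
fetch.connect(rank).connect(show)
fetch.receive(None, log)
```

      No component needs an explicit data mapping here: each one simply consumes whatever its predecessor produced, and the `log` list is what a UI could use to display the execution state to the user.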

      The state-of-the-art analysis of end-user development and mashups presented in chapter 3 shows that service composition, business process management (BPM), and mashup tools fail to provide end-users with intelligible concepts and constructs. This is due to various reasons, such as a complex user experience, complex modeling constructs (i.e., components), complex data mappings and so on. Moreover, we will see that the naive approach of simply equipping a mashup tool with a set of domain-specific components is not enough. Obtaining a tool that can be called domain-specific and that is amenable to end-users requires a comprehensive analysis of the domain, along with a proper methodology, which we present in the next chapter.

      Chapter 5 End-User Oriented Mashup Platform Development Methodology

      5.1 Overview

      In the previous chapter, we presented a few real research evaluation scenarios, their analysis, and the requirements that must be addressed for the practical success of a mashup tool. In chapter 3, we presented in detail various aspects related to mashups and end-user development. We reported on the well-known approaches and found that, to a large extent, they failed to support end-users in their everyday development needs. We identified that the generic nature of these approaches prevented end-users from comfortably adopting them. The underlying reason is the interaction gap between the two sides (i.e., the end-user and the technology): an end-user (i.e., a domain expert) is most at home within his or her domain of expertise, whereas the tools demand increasingly technical interaction, which keeps the two sides apart.

      In this chapter we present our proposed methodology for the development of mashup-based tools that lower the barriers for end-users by providing them with a tool that speaks their language. Throughout this chapter we show how we have developed a mashup platform for our reference domain, in order to illustrate how its development can systematically tackle the challenges mentioned in the previous chapters. The development of the platform has allowed us to conceptualize the necessary tasks and ingredients and to structure them into a methodology for the development of domain-specific mashup platforms. The methodology encodes a top-down approach, which starts from the analysis of the target domain and ends with the implementation of the specifically tailored mashup platform. In the next section, we start with the essential concepts and definitions that are required before we proceed to the domain analysis step.

      5.2 Concepts & Definitions

      Before going into the details, we introduce the necessary concepts. First of all, building on the interpretation of web mashups in [87]:

      • A web mashup (or mashup) is a web application that integrates data, application logic, and/or user interfaces (UIs) sourced from the Web. Typically, a mashup integrates and orchestrates two or more elements.

      Most of the scenarios mentioned in chapter 4 require all three ingredients listed in the definition: we need to fetch researchers and publication information from various Web-accessible sources (the data); we need to compute indicators and rankings (the application logic); and we need to render the output to the user for inspection (the UI). We generically refer to the services or applications implementing these features as components. Components must be put into communication, in order to support the described evaluation algorithm.

      Simplifying this task by tailoring a mashup tool to the specific domain of research evaluation first of all requires understanding what a domain is. We define a domain and, then, a domain-specific mashup as follows:

      • A domain is a delimited sphere of concepts and processes; domain concepts consist of data and relationships; domain processes operate on domain concepts and are either atomic (activities) or composite (processes integrating multiple activities), defined according to domain rules.

      • A domain-specific mashup is a mashup that describes a composite domain process that manipulates domain concepts via domain activities and processes following domain rules. It is specified in a domain-specific, graphical modeling notation.

      A domain-specific mashup is therefore a web mashup specified with a domain-specific model. The domain defines the ”universe” in the context of which we can define domain-specific mashups. It defines the information that is processed by the mashup, both conceptually and in terms of concrete data types (e.g., XML schemas). It defines the classes of components that can be part of the process and how they can be combined, as well as a notation that carries meaning in the domain (such as specific graphical symbols for components of different classes).

      As we will see later in detail, every mashup can only use components that conform to the domain process model and that exchange data belonging to the conceptual model. This means that each component can send or receive data based on the entities or relationships of the conceptual model. Finally, the domain defines rules that represent invariants to be met by each mashup. A domain thus has a static part, which describes the concepts that are proper to the domain, and a dynamic part, which describes the modifications the concepts may be subject to. For instance, in our reference scenario, concepts include publications, researchers, metrics, etc. The process models define classes of components such as data extraction from digital libraries, metric computation, or filtering and aggregation components. A domain rule could, for instance, disallow the use of a specific information source for the computation of a given metric. These domain restrictions, together with the domain concepts exposed at the mashup modeling level, are what enable the simplification of the language and its usage.

      Generic mashup tools are neither aware of these concepts, nor of these operations. Given Definition 5.2 we can therefore say that our reference scenarios ask for a mashup that is specific to the domain of research evaluation, i.e., it asks for a domain-specific mashup. So following this we can define a domain-specific mashup tool as:

      • A domain-specific mashup tool (DMT) is a development and execution environment that enables domain experts, i.e., the actors operating in the domain, to develop and execute domain-specific mashups via a syntax that exposes all features of the domain.

      A DMT is initially ”empty”. It then gets populated with specific components that provide functionality needed to implement mashup behaviors. For example, software developers (not end-users) will define libraries of components for research evaluation, such as components to extract data from Google Scholar, or to compute the h-index, or to group researchers based on their institution, or to visualize results in different ways. Because all components fit in the classes and interact based on a common data model, it becomes easier to combine them and to define mashups, as the DMT knows what can be combined and can guide the user in matching components. The domain model can be arbitrarily extended, though the caveat here is that a domain model that is too rich can become difficult for software developers to follow.

      5.3 Challenges and problems

      Given these definitions, the problem we solve is that of providing the necessary concepts and a methodology for the development of domain-specific mashup models and DMTs. The problem is neither simple nor of immediate solution. While domain modeling is a common task in software engineering, its application to the development of mashup platforms is not trivial. For instance, we must precisely understand which domain properties are needed to exhaustively cover all those domain aspects that are necessary to tailor a mashup platform to a specific domain, which property comes into play in which step of the development of the platform, how domain aspects are materialized (e.g., visualized) in the mashup platform, and so on.

      The DMT idea is heavily grounded in a rich corpus of research in Human-Computer Interaction (HCI), demonstrating that consideration of user knowledge and prior experience is required to create truly usable and inclusive products, and is a key consideration in the performance of usability evaluations [100]. Prior experience of products is important to their usability, and the transfer of previous experience depends upon the nature of prior and subsequent experience of similar tasks [101]. Familiarity with the interface design, its interaction style, or the metaphor it conforms to (if it has one) is a key feature for successful and intuitive interaction [102].

      More familiar interfaces, or interface features, allow for easier information processing in terms of user capability, and the subsequent human responses can be performed at an automatic and subconscious level. The authors of [103] identified that the use of semantics could be an effective tool for enhancing product design and use, particularly for novice users, as semantics can indicate how the product or interface will behave and how interaction is likely to occur. Similarly, [104] stressed that, to be usable and accessible, interfaces need to be easily understood and learned and, in the process, must cause minimal cognitive load. Effective interaction consists of users understanding potential actions, the execution of a specific action, and the perception of the effects of that action.

      As we cannot exploit the users’ technical expertise, we propose here to exploit their knowledge of the task domain. In other words, we intend to transform mashups from technical tools built around a computing metaphor into true cognitive artifacts [105], capable of operating upon familiar information in order to “serve a representational function that affect human cognitive performance.”

      5.4 Methodology

      In order to develop a DMT, we have to look into the details of three incremental aspects, i.e., the domain concepts, the domain processes, and the implementation of the DMT. In the following, we state and define all the ingredients for developing a domain-specific mashup platform. Specifically, developing a domain-specific mashup platform requires:

      1. Definition of a domain concept model (DCM) to express domain data and relationships. The concepts are the core of each domain. They drive the implementation of the DMT and of its data types and components. It is therefore crucial to precisely delimit the concepts that characterize the domain, in order to instruct the tool how to use them and to develop components that understand them. The specification of domain concepts allows the mashup platform to understand what kind of data objects it must support. This is different from generic mashup platforms, which provide support for generic data formats, not specific objects.

      2. Identification of a generic mashup meta-model (MM) that suits the composition needs of the domain and the selected scenarios. A variety of different mashup approaches, i.e., meta-models, have emerged over the last years, ranging from data mashups and user interface mashups to process mashups. Before thinking about domain-specific features, it is important to identify a meta-model that is able to accommodate the domain processes to be mashed up. (Footnote 1: We use the term meta-model to describe the constructs, and the relationships among them, that rule the design of mashup models. With the term instance we refer to the actual mashup application that can be operated by the user.)

      3. Definition of a domain-specific mashup meta-model. Given a generic MM, the next step is understanding how to inject the domain into it so that all features of the domain can be communicated to the developer. We approach this by specifying and developing:

        1. A domain process model (PM) that expresses classes of domain activities and, possibly, ready processes. Domain activities and processes represent the dynamic aspect of the domain. They operate on and manipulate the domain concepts. Injecting the domain into the tool means introducing domain-specific extensions into the mashup meta-model, e.g., to take into account the nature of domain activities. The activities that can be composed in order to form new processes indicate which mashup components in terms of data, application logic, and UI components are needed to implement the domain-specific mashups. In the context of mashups, we can map activities and processes to reusable components of the platform.

        2. A domain rule model that may constrain the use of processes or activities, in order to guarantee the correct use of concepts and components in the tool.

        3. A domain syntax that provides each concept in the domain-specific mashup meta-model (the union of MM and PM) with its own symbol. The claim here is that just catering for domain-specific activities or processes is not enough if these are not accompanied by visual metaphors that the domain expert is acquainted with and that visually convey the respective functionalities.

        4. A set of instances of domain-specific components. This is the step in which the reusable domain-knowledge is encoded, in order to enable domain experts to mash it up into new applications.

      4. Implementation of the DMT as a tool whose expressive power is that of the domain-specific mashup meta-model and that is able to host and integrate the domain-specific activities and processes.

        1. DMT. The DMT must support all features that are specified in both the domain-specific mashup meta-model and the domain concept model. Specifically, the extended mashup meta-model determines the expressive power of the DMT.

        2. Components. The components instantiate the concepts in the domain-specific meta-model extension and implement the domain activities identified in step 3(d), i.e., the fourth sub-step of step 3.

      The above steps mostly focus on the design of a domain-specific mashup platform. Since domains typically evolve over time, however, in a concrete deployment it might be necessary to periodically update the domain models, the components, and the platform implementation (that is, to iterate over the above design steps), in order to take into account changing requirements or practices. The better the analysis and design of the domain in the first place, the fewer modifications will be required in the subsequent evolution steps, e.g., limiting evolution to the implementation of new components only.

      In the next subsections, we expand each of the above design steps starting from the domain concept model.

      5.5 The Domain Concept Model

      It is important to precisely delimit the concepts that characterize the domain, in order to instruct the tool how to use them and to develop components that understand them. We specify domain knowledge in the form of a domain concept model. The domain concept model is constructed by the IT experts via verbal interaction with the domain experts, or via behavioral observation of the experts performing their daily activities combined with a suitable task analysis. The heart of each domain is represented by the information items each expert of that domain knows and understands.

      The concept model represents the information experts know, understand, and use in their work. Modeling this kind of information requires understanding the fundamental information items and how they relate to each other, eventually producing a model that represents the knowledge base that is shared among the experts of the domain. In domain-specific mashups, the concept model has three kinds of stakeholders (and usages), and understanding this helps us to define how the domain should be represented.

      • The first stakeholders are the mashup modelers (domain experts), i.e., the end-users that will develop different mashups from existing components. For them it is important that the concept model is easy to understand, and an entity-relationship diagram (possibly with a description) is a commonly adopted technique to communicate conceptual models.

      • The second kind of stakeholders are the developers of components, which are programmers. They need to be aware of the data format in which entities and relationships can be represented, e.g., in terms of XML schemas, in order to implement components that can interoperate with other components of the domain.

      • The third stakeholder is the DMT itself, which enforces compliance of data exchanges with the concept model.

      Therefore:

      • The domain concept model (DCM) describes the conceptual entities and the relationships among them, which, together, constitute the domain knowledge.

      Instances of the DCM are the data that is used as input to or produced as output by a mashup component. Modeling the DCM is also an attempt to separate what does not vary much in a particular domain from what does. These first-class concept types are constrained by the domain rules.

      Figure 5.1: Domain concept model, covering main concepts required for the referenced research evaluation scenarios

      We express the domain model as a conventional entity-relationship diagram. It also includes a representation of the entities as XML schemas. For instance, in Figure 5.1 we put only the main concepts we could identify in our reference scenarios into a DCM, detailing entities, attributes, and relationships. The core element in the evaluation of scientific production and quality is the publication, which is typically published in the context of a specific venue, e.g., a conference or journal, by a publisher. It is written by one or more researchers belonging to an institution. With the growing importance of the Internet as an information source for research evaluation, the source (e.g., Scopus, the ACM digital library, or Microsoft Academic) from which publications are accessed is also gaining importance: each source typically provides only a partial view of the scientific production of a researcher and, hence, the choice of the source affects the evaluation result. The actual evaluation is represented in the model by the metric entity, which can be computed over any of the other entities.

      In order to develop a DMT, the ER (Entity-Relationship) model has to be generated through several interactions between the domain expert and the IT expert, who has knowledge of conceptual modeling. The IT expert also generates the XML schemas corresponding to the ER model, which are the actual artifacts processed by the DMT.

      In fact, although the ER model is part of the concept model, it is never processed itself by the DMT. It rather serves as a reference for any user of the platform to inform them on the concepts supported by it. In principle, other formalisms can be adopted (such as UML Class diagrams). We notice that each concept model implicitly includes the concept of grouping the entities in arbitrary ways, so groups are also an implicitly defined entity.
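As an illustration only, the entities named above can be sketched as plain data types. This is a minimal sketch and not the schema used by the platform: the attribute names (name, kind, title, value, and so on) and the sample values are hypothetical, and the actual artifacts processed by the DMT are the XML schemas derived from the ER model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical attribute names; only the entity names come from the DCM text.

@dataclass
class Institution:
    name: str

@dataclass
class Researcher:
    name: str
    institution: Optional[Institution] = None

@dataclass
class Venue:
    name: str
    kind: str                        # e.g., "conference" or "journal"
    publisher: Optional[str] = None

@dataclass
class Publication:
    title: str
    authors: List[Researcher]
    venue: Optional[Venue] = None
    source: Optional[str] = None     # e.g., "Scopus"

@dataclass
class Metric:
    name: str
    value: float
    target: object                   # a metric can refer to any other entity

@dataclass
class Group:
    """Implicit grouping concept: an arbitrary collection of entities."""
    label: str
    members: list = field(default_factory=list)

# Illustrative instances (made-up values).
institution = Institution("University of Trento")
researcher = Researcher("Sample Researcher", institution)
paper = Publication("A sample paper", [researcher],
                    Venue("Sample Conference", "conference"), source="Scopus")
h = Metric("h-index", 1.0, researcher)
group = Group("by institution", [researcher])
```

The point of the sketch is that every component reads and writes the same small set of types, which is what lets a tool check data exchanges against the concept model.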

      5.6 The Generic Mashup Meta-Model

      When discussing the domain concept model we made the implicit choice to start from generic (i.e., domain-independent) models like Entity-Relationship diagrams and XML, as these are well established data modeling and type specification languages amenable to humans and machines. For end-user development of mashups, the choice is less obvious since it is not easy to identify a modeling formalism that is amenable to defining end-user mashups (which is why we endeavor to define a domain-specific mashup approach). If we take existing mashup models and simply inject specific data types in the system, we are not likely to be successful in reducing the complexity level. However, the availability of the DCM makes it possible to derive a different kind of mashup modeling formalism, as discussed next.

      To define the type of mashups and, hence, the modeling formalism that is required, it is necessary to model which features (in terms of software capabilities) the mashups should be able to support. Mashups are particular types of web applications. They are component-based and may integrate a variety of services, data sources, and UIs. They may need their own layout for placing components, require control flows or data flows, ask for the synchronization of UIs and the orchestration of services, allow concurrent access or not, and so on. Which exact features a mashup type supports is described by its mashup meta-model.

      Besides specifying a type or class of mashups, the mashup meta-model (MM) specifies how to draw the actual mashup (process) models. In the following, we first define a generic mashup meta-model, which may fit a variety of different domains, then we show how to define the domain-specific mashup meta-model, which will allow us to draw domain-specific mashup models.

      • The generic mashup meta-model (MM) specifies a class of mashups and, thereby, the expressive power, i.e., the concepts and composition paradigms, the mashup platform must know in order to support the development of that class of mashups.

      The MM therefore implicitly specifies the expressive power of the mashup platform. Identifying the right features of the mashups that fit a given domain is therefore crucial. For instance, our research evaluation scenario asks for the capability to integrate data sources (to access publications and researchers via the Web), web services (to compute metrics and perform transformations), and UIs (to render the output of the assessment). We call this capability universal integration. Next, the scenario asks for data processing capabilities that are similar to what we know from Yahoo! Pipes, i.e., data flows. It requires dedicated software components that implement the basic activities in the scenario, e.g., compute the impact of a researcher (the sum of his/her publications weighted by the venue ranking), compute the percentile of the researcher inside the national sample (producing outputs like “top 10%”), or plot the department ranking in a bar chart. Figure 5.2 depicts our mashup meta-model, which supports universal integration and also enforces various rules that are either domain-specific or generic in nature. In the following, we describe the details of the proposed mashup meta-model.

      Figure 5.2: Mashup meta-model supporting domain-specific concepts, processes, rules, and universal integration
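The two metric activities mentioned above (impact and percentile) can be sketched as follows. This is a hedged illustration, not the platform's implementation: the function names, the treatment of unranked venues (weight 0), and the exact percentile convention (share of the sample with a value greater than or equal to the researcher's) are assumptions made here.

```python
from typing import Dict, List

def impact(publication_venues: List[str], venue_rank: Dict[str, float]) -> float:
    """Sum of a researcher's publications, each weighted by its venue ranking.

    Assumption: venues missing from the ranking contribute a weight of 0.
    """
    return sum(venue_rank.get(venue, 0.0) for venue in publication_venues)

def top_percent(value: float, sample: List[float]) -> float:
    """Share (in %) of the sample whose value is >= the given value.

    A result of 10.0 would be reported to the user as "top 10%".
    """
    at_least = sum(1 for other in sample if other >= value)
    return 100.0 * at_least / len(sample)

# Made-up venue weights and publication list, for illustration only.
ranking = {"Venue A": 3.0, "Venue B": 1.0}
researcher_impact = impact(["Venue A", "Venue A", "Venue B", "Venue C"], ranking)
national_sample = [float(i) for i in range(1, 11)]   # impacts 1.0 .. 10.0
position = top_percent(researcher_impact, national_sample)
```

With these made-up numbers the impact is 7.0 and the researcher falls in the top 40% of the sample.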

      5.6.1 The mashup meta-model

      We start from a very simple MM, both in terms of notation and execution semantics, which enables end-users to model their own mashups.

      1. As shown in Figure 5.2, a mashup $m = \langle C, P, R, VP, L \rangle$, defined according to the meta-model MM, consists of a set of components $C$, a set of connectors (i.e., data pipes) $P$, a set of rules $R$, a set of view ports $VP$ that can host and render components with their own UI, and a layout $L$ that specifies the graphical arrangement of components.

        A mashup complies with a set of rules. These rules can be of various types. For example, data type inclusion validates a new data type when it is introduced into the mashup; data type dependency checks the inheritance from data types that already exist; component compatibility validates that two components are compatible when they are connected. Even when two components appear compatible, their ordering (i.e., the position of a component in a mashup) can cause problems, so component order checks for the right ordering. Finally, there can be many domain-specific rules that a mashup must respect. The details of domain-specific rules are given later in this chapter.

      2. A component $c = \langle IPT, OPT, CPT, type, desc \rangle$, where $c \in C$, is like a task that performs some data, application, or UI action.

        Components have ports through which pipes are connected. Ports can be divided into input ports ($IPT$) and output ports ($OPT$), where input ports carry data into the component, while output ports carry data generated (or handed over) by the component. Each component must have at least one input port or one output port. Both IPTs and OPTs can have parameters of specific data types. The data types include both primitive and domain-specific types once the MM gets extended for a domain.

        Configuration ports ($CPT$) are used to configure the components. They are typically used to configure filters (defining the filter conditions) or to define the nature of a query on a data source. The configuration data can be a constant (e.g., a parameter defined by the end-user) or can arrive in a pipe from another component. Conceptually, constant configurations are as if they came from a component feeding a constant value.

        A component can be an information source, an information processor, or an information sink. Components with no input ports are called information sources and work as data sources by supplying data to other components. Components with no output ports are called information sinks. All UI components are information sinks: they do not perform business logic on the consumed data, but only visualize it for users. Components with both input and output ports are called information processors. These components take data, process it, and produce results.

        The type ($type$) of a component denotes whether it is a UI component, which displays data and can be rendered in the mashup’s layout, an application component, which fetches or processes information, or a data source component. The type information is mainly used by the platform’s internal logic, but it can also be used to arrange components for better presentation to end-users.

        Components can also have a description $desc$ at an arbitrary level of formalization, whose purpose is to inform the user about the data the components handle and produce.

      3. A pipe $p \in P$ (i.e., a connector) carries data (e.g., XML/JSON documents) between the ports of two components, implementing a data flow logic. So, $p \in OPT \times (IPT \cup CPT)$.

      4. A view port $vp \in VP$ identifies a placeholder, e.g., a DIV element or an IFRAME, inside the HTML template that gives the component its graphical identity. Typically, a template has multiple placeholders.

      5. Finally, the layout $L$ defines which component with its own UI is to be rendered in which view port of the template. Therefore, $L \subseteq C \times VP$.

      Each mashup following this MM must have at least a source and a sink, and all ports of all components must be attached to a pipe or manually filled with data (the configuration port).
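The structure just described, together with the two well-formedness conditions stated above (at least one source and one sink; every port attached or manually filled), can be sketched as follows. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, port names are assumed to be unique within a component, and rules, view ports, and layout are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass
class Component:
    name: str
    ipt: Set[str] = field(default_factory=set)   # input ports
    opt: Set[str] = field(default_factory=set)   # output ports
    cpt: Set[str] = field(default_factory=set)   # configuration ports
    ctype: str = "application"                   # "ui", "application", "data source"
    desc: str = ""

    def role(self) -> str:
        """Classify the component by its ports, as in the meta-model."""
        if not self.ipt and not self.opt:
            raise ValueError(f"{self.name} needs an input or an output port")
        if not self.ipt:
            return "source"
        if not self.opt:
            return "sink"
        return "processor"

# (source component, output port, target component, input or configuration port)
Pipe = Tuple[str, str, str, str]

@dataclass
class Mashup:
    components: List[Component]
    pipes: List[Pipe]
    # Configuration ports filled manually with a constant value.
    constants: Set[Tuple[str, str]] = field(default_factory=set)

    def validate(self) -> List[str]:
        errors: List[str] = []
        roles = {c.role() for c in self.components}
        for needed in ("source", "sink"):
            if needed not in roles:
                errors.append(f"mashup has no {needed}")
        attached = {(src, sp) for src, sp, _, _ in self.pipes}
        attached |= {(dst, dp) for _, _, dst, dp in self.pipes}
        for c in self.components:
            for port in sorted(c.ipt | c.opt):
                if (c.name, port) not in attached:
                    errors.append(f"{c.name}.{port} is not attached to a pipe")
            for port in sorted(c.cpt):
                if (c.name, port) not in attached | self.constants:
                    errors.append(f"{c.name}.{port} is neither piped nor filled")
        return errors

source = Component("MicrosoftAcademicPublications", opt={"publications"},
                   cpt={"SetResearchers"}, ctype="data source")
metric = Component("HIndex", ipt={"in"}, opt={"out"})
chart = Component("BarChart", ipt={"data"}, ctype="ui")
pipes = [("MicrosoftAcademicPublications", "publications", "HIndex", "in"),
         ("HIndex", "out", "BarChart", "data")]
mashup = Mashup([source, metric, chart], pipes,
                constants={("MicrosoftAcademicPublications", "SetResearchers")})
unconfigured = Mashup([source, metric, chart], pipes)
```

Here `mashup.validate()` returns an empty list, while `unconfigured.validate()` reports the configuration port that is neither piped nor filled.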

      This is all we need to define a mashup and as we will see, this is an executable specification. There is nothing else besides this picture. This is not that far from the complexity of specifying a flowchart, for example. It is very distant from what can be an (executable) BPMN specification or a BPEL process in terms of complexity.

      In the model above there are no variables and no data mappings. This is at the heart of enabling end-user development, as this is where much of the complexity resides. It is unrealistic to ask end-users to perform data mapping operations. Because there is a DCM, each component is required to be able to process any document that conforms to the model. This does not mean that a component must process every single XML element. For example, a component that computes the h-index will likely do so for researchers, not for publications, and probably not for publishers (though it is conceivable to have an h-index computed for publishers as well). So the component will “attach” a metric only to the researcher information that flows in. Anything else that flows in is just passed through without alterations. The component description helps users understand what the component operates on or generates, and this is why an informal description suffices. What this means is that each component in a domain-specific mashup must implement this pass-through semantics and must operate on or generate one or more (but not necessarily all) elements specified in the DCM. Therefore, our MM assumes that all components understand and comply with the DCM.
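The pass-through semantics can be illustrated with a small sketch of an h-index component. The representation of the feed as a list of dictionaries with an "entity" field is an assumption made for this example (the platform exchanges documents conforming to the DCM schemas), and the field names are hypothetical.

```python
from typing import Dict, List

def h_index(citations: List[int]) -> int:
    """Largest h such that h publications have at least h citations each."""
    h = 0
    for position, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= position:
            h = position
        else:
            break
    return h

def h_index_component(feed: List[Dict]) -> List[Dict]:
    """Attach an h-index to researcher items; pass everything else through."""
    output = []
    for item in feed:
        if item.get("entity") == "researcher":
            metrics = {**item.get("metrics", {}),
                       "h-index": h_index(item.get("citations", []))}
            item = {**item, "metrics": metrics}   # enriched copy
        output.append(item)                       # other items are untouched
    return output

feed = [
    {"entity": "researcher", "name": "R1", "citations": [10, 8, 5, 4, 3]},
    {"entity": "publisher", "name": "P1"},
]
result = h_index_component(feed)
```

The researcher item leaves the component with an h-index of 4 attached, while the publisher item is handed on unchanged.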

      Furthermore, the model has no gateways as in BPMN, although it is possible to have dedicated components that, for example, implement an if-then semantics and have two output ports for this purpose. In this case, one of the output ports will be populated with an empty feed. Complex routing semantics are virtually impossible for non-experts to understand (and in many cases for experts as well); for this reason, if they are needed, we delegate them to components, which are developed by programmers and are understood by end-users in the context of a domain.
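A dedicated if-then component of the kind mentioned above could look like the following sketch. The port names "then" and "else" and the shape of the condition (a predicate over the whole feed) are assumptions made for illustration.

```python
from typing import Callable, Dict, List

def if_then_component(feed: List, condition: Callable[[List], bool]) -> Dict[str, List]:
    """Route the whole feed to one output port; the other gets an empty feed."""
    if condition(feed):
        return {"then": feed, "else": []}
    return {"then": [], "else": feed}

# Hypothetical condition: the feed contains more than one item.
routed = if_then_component([1, 2], lambda items: len(items) > 1)
```

The routing decision stays inside the component, so the end-user only sees two output ports and never has to model a gateway.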

      5.6.2 Operational semantics

      The behavior of the components and the semantics of the MM are as follows:

      1. Executions of the mashups are initiated by the user. A user has to explicitly start the execution through some user interface means (e.g., a button click).

      2. Components that are ready for execution are identified. A component is ready when all its input and configuration ports are filled with data, that is, when it has all the data necessary to start processing.

      3. All ready components are then executed. They process the data in their input ports, consuming the respective data items from the input feed, and generate output on their output ports. The generated output fills the inputs of other components, making them ready for execution.

      4. The execution proceeds by identifying ready components and executing them (i.e., reiterating steps 2 and 3) until no components are left to be executed. If, during the execution, a component requires user interaction (e.g., an input) before it can proceed, the execution pauses and resumes after the user has acted as needed. This means that it is possible to interact with the mashup during runtime. Once the execution terminates, all components have been executed, and all the sinks have received and rendered information.
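The four steps above can be sketched as a simple loop. This is an illustration under stated assumptions rather than the platform's engine: components are modeled as plain functions from a dictionary of filled ports to a dictionary of outputs, all names are hypothetical, and pausing for user interaction is not modeled.

```python
from typing import Callable, Dict, List, Optional, Tuple

# name -> (ports that must be filled, function from filled ports to outputs)
Components = Dict[str, Tuple[List[str], Callable[[Dict], Dict]]]
# (source component, output port, target component, target port)
Pipe = Tuple[str, str, str, str]

def run(components: Components, pipes: List[Pipe],
        constants: Optional[Dict[Tuple[str, str], object]] = None) -> List[str]:
    """Execute ready components until none is left; return the execution order."""
    ports: Dict[str, Dict] = {name: {} for name in components}
    for (name, port), value in (constants or {}).items():
        ports[name][port] = value                 # manually filled configuration
    executed: List[str] = []
    while True:
        # Step 2: a component is ready when all required ports hold data.
        ready = [name for name, (needed, _) in components.items()
                 if name not in executed and all(p in ports[name] for p in needed)]
        if not ready:
            break                                 # Step 4: nothing left to run
        for name in ready:                        # Step 3: execute ready components
            outputs = components[name][1](ports[name])
            executed.append(name)
            for src, src_port, dst, dst_port in pipes:
                if src == name and src_port in outputs:
                    ports[dst][dst_port] = outputs[src_port]
    return executed

rendered: List = []
components: Components = {
    "source": ([], lambda _: {"out": [1, 2, 3]}),
    "double": (["in"], lambda p: {"out": [2 * x for x in p["in"]]}),
    "sink": (["in"], lambda p: rendered.append(p["in"]) or {}),
}
pipes = [("source", "out", "double", "in"), ("double", "out", "sink", "in")]
order = run(components, pipes)
```

In this toy mashup the source runs first because it has no ports to wait for, and the sink runs last, once the processor has filled its input.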

      5.6.3 Generic mashup syntax

      Developing mashups based on this meta-model, i.e., graphically composing a mashup in a mashup tool, requires defining a syntax for the concepts in the MM. In Figure 5.3 we map the above MM to a basic set of generic graphical symbols and composition rules. In the next section, we show where to configure domain-specific symbols.

      Figure 5.3: Basic syntax for the concepts in the mashup meta-model.

      5.7 The Domain-Specific Mashup Meta-Model

      The mashup meta-model (MM) described in the previous section allows the definition of a class of mashups that can fit in different domains. Thus, it is not yet tailored to a specific domain, e.g., research evaluation. Now we want to push the domain into the mashup meta-model constraining the class of the mashups that can be produced to that of our specific domain. Despite the relative simplicity, providing users with a DCM-restricted mashup meta-model is still not likely to be sufficient in terms of ease of use. The user will still be faced with a large number of possible components to be placed on a canvas.

      The next step is therefore understanding the dynamics of the concepts in the model, that is, the typical classes of processes and activities that are performed by domain experts in the domain, in order to transform or evolve concrete instances of the concepts in the DCM and to arrive at a structuring of components as well as to an intuitive graphical notation. What we obtain from this is a domain-specific mashup meta-model. Each domain-specific meta-model is a specialization of the mashup meta-model along four dimensions:

      1. Domain-specific activities and processes

      2. Domain-specific rules

      3. Domain-specific syntax

      4. Domain instances

      The domain-specific meta-model extension extends the MM with domain-specific sub-types of the component entity in the MM. Sub-types allow the injection of classes of domain processes or activities into the MM and, hence, the introduction of domain-specific terminology and syntax. The corresponding figure shows the domain-specific meta-model extension; we describe its details as follows.

      5.7.1 Domain process model

      • The domain process model (PM) describes the classes of processes or activities that the domain expert may want to mash up to implement composite, domain-specific processes.

      Operatively, the process model is again derived by specializing the generic meta-model based on interactions with domain experts, just like for the domain concept model. This time the interaction is aimed at defining classes of components, their interactions, and their notations. For simplicity, we discuss only the processes that are necessary to implement the reference scenarios. In the case of research evaluation, this led to the identification of the following classes of activities, i.e., classes of components:

      1. Source extraction activities. They are like queries over digital libraries such as DBLP or Google Scholar. They may have no input port, and have one output port (the extracted data). These components may have one or more configuration ports that specify, in essence, the “query”. For example, a source component may take as input a set of researchers and extract publications and citations for every researcher from Google Scholar.

      2. Metric computation activities, which can take as input institutions, venues, researchers, or publications and attach a metric to them. The corresponding components have at least one input and one output port. For example, a component determines the h-index for researchers, or determines the percentile of a metric based on a distribution.

      3. Aggregation activities, which define groups of items based on some parameter (e.g., affiliation).

      4. Filtering activities, which receive an input pipe and return a filtered version of the input, based on a criterion that arrives in a configuration port. For example, we can filter researchers based on their nationality or affiliation, or based on the value of a metric.

      5. UI widgets, corresponding to information sink components that plot or map information on researchers, venues, publications, and related metrics.
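To make two of the classes above concrete, the following sketch shows a filtering activity and an aggregation activity as plain functions. The function names, the dictionary-based items, and the sample data are assumptions made for illustration; in the platform the criterion would arrive through a configuration port.

```python
from collections import defaultdict
from typing import Callable, Dict, List

def filter_activity(feed: List[Dict], criterion: Callable[[Dict], bool]) -> List[Dict]:
    """Return only the items of the input feed that satisfy the criterion."""
    return [item for item in feed if criterion(item)]

def aggregate_activity(feed: List[Dict], parameter: str) -> Dict[str, List[Dict]]:
    """Group items by the value of a parameter, e.g., their affiliation."""
    groups: Dict[str, List[Dict]] = defaultdict(list)
    for item in feed:
        groups[item[parameter]].append(item)
    return dict(groups)

# Made-up researchers, for illustration only.
researchers = [
    {"name": "R1", "affiliation": "Dept. A", "h-index": 12},
    {"name": "R2", "affiliation": "Dept. B", "h-index": 4},
    {"name": "R3", "affiliation": "Dept. A", "h-index": 7},
]
selected = filter_activity(researchers, lambda r: r["h-index"] >= 5)
by_department = aggregate_activity(selected, "affiliation")
```

Chaining the two calls mirrors how a filter component would feed an aggregation component through a pipe.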

      5.7.2 Domain rules

      A domain comprises domain concepts, activities/processes, and rules (i.e., constraints and restrictions on concepts and activities) that prescribe and/or restrict the way in which domain experts use domain activities and processes to achieve their goals. Some domain rules can be defined like integrity constraints, e.g., derived from the cardinalities between concepts in the domain concept model. However, domain rules do not only cover integrity constraints; they usually also allow or restrict domain behaviors (i.e., domain activities/processes). For example, a rule could state that DBLP cannot be used for computing H-index metrics; this rule can be instantiated as a component compatibility (mashup) rule that disallows the use of these two components in the same mashup composition. Rule enforcement in the DMT assists and guides domain experts, improving usability and composition correctness and reducing development errors.

      Figure 5.4: Extension to the domain-specific rules.

      A well-known way to define these rules is the ECA structure (event, condition, activity), which means: if the event occurs and the conditions are met, then execute the activity [106]. Figure 5.4 depicts the extension of the MM with the rule model. Domain rules can be classified in many different ways. When analyzing rules in the context of domain processes, we identify the following two rule types. Activity rules are domain rules related to a particular activity or to a subset of an activity. Integrity rules are related to the domain objects and their relationships; for example, the value of the H-index metric cannot be negative.
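The ECA structure can be sketched as a small rule table, shown below with one activity rule and one integrity rule taken from the examples in the text. The event names (componentsConnected, metricComputed), the context fields, and the fire function are hypothetical and serve only to illustrate the "event, condition, activity" reading; they do not describe the DMT's actual rule engine.

```javascript
// Hypothetical sketch of ECA (event, condition, activity) rules.
const rules = [
  {
    // Activity rule: restricts which components may be combined.
    event: "componentsConnected",
    condition: (ctx) => ctx.source === "DBLP" && ctx.target === "HIndex",
    activity: () => "reject: DBLP cannot be used for computing the H-index",
  },
  {
    // Integrity rule: constrains the value of a domain object.
    event: "metricComputed",
    condition: (ctx) => ctx.metric === "HIndex" && ctx.value < 0,
    activity: () => "reject: the H-index cannot be negative",
  },
];

// If the event occurs and the condition is met, execute the activity.
function fire(event, ctx) {
  return rules
    .filter((rule) => rule.event === event && rule.condition(ctx))
    .map((rule) => rule.activity(ctx));
}

const violations = fire("componentsConnected", { source: "DBLP", target: "HIndex" });
const noViolations = fire("metricComputed", { metric: "HIndex", value: 7 });
console.log(violations, noViolations);
```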

      5.7.3 Domain syntax

      A possible domain-specific syntax for the classes in the PM (derived from the generic syntax presented in Figure 5.3) is shown in Figure 5.5. Its semantics is the one described by the MM in Section 5.6. In practice, defining a PM that fully represents a domain requires considering multiple scenarios for the given domain, with the aim of covering all possible classes of processes in the domain.

      Figure 5.5: Domain-specific syntax for the concepts in the domain-specific meta-model extension

      5.7.4 Domain instances

      Domain instances are fully functional domain-specific components that are ready to be used in mashup compositions. These components, rendered with the domain syntax (i.e., domain symbols), implement domain activities and processes, consuming and producing domain-specific concepts at their input and output ports. Figure 5.6 exemplifies the use of instances of domain-specific components. For example, the Microsoft Academic Publications component is an instance of a source extraction activity with a configuration port (SetResearchers) that allows the setup of the researchers for which publications are to be loaded from Microsoft Academic. The component's symbol is an instantiation of the parametric source component type in Figure 5.5 without a static query. Similarly, Italian Researchers and Venue Ranking are instances of source extraction activities, Impact and Impact Percentiles are instances of metric computation activities, and Bar Chart is an instance of a UI widget.

      Figure 5.6: An example of the use of instances of domain-specific components

      In summary, we limit the flexibility of a generic mashup tool to a specific class of mashups, but gain intuitiveness thanks to the strong focus on the specific needs and issues of the target domain. Given the models introduced so far, we can refine our earlier definition of a DMT as follows:

      • A domain-specific mashup tool (DMT) is a development and execution environment that (i) implements a domain-specific mashup meta-model, (ii) exposes a domain-specific modeling syntax, and (iii) includes an extensible set of domain-specific component instances.

      Once the domain models are ready, the IT expert can customize a mashup platform to meet the requirements that emerge from the domain model. The DMT will therefore expose not only a concept model but also a process model that specializes the MM and presents to the user a set of components grouped in a domain-meaningful way, with a graphical appearance that makes sense for the domain. Doing so implies, first, understanding which types of mashups the platform should support and, then, tailoring the mashup platform to the specific domain. To this end, the next chapter presents the implementation details of a generic mashup platform that follows the mashup meta-model.

      Chapter 6 Domain-Specific Mashup Platform Development

      6.1 Overview

      In the previous chapter we presented the methodology for the development of a domain-specific mashup platform. The methodology clearly separates domain-independent concerns (i.e., the mashup meta-model) from domain-specific ones (i.e., the domain-specific mashup meta-model and its extensions). A mashup platform whose development follows this methodology is initially empty in terms of domain-specific knowledge (i.e., terminology, concepts, rules, activities, etc.); this knowledge is then injected, tailoring the platform into a domain-specific mashup tool. This is how we enable a generic platform to be tailored to a specific domain. In this chapter we present how we developed the platform that is then tailored to our reference domain (presented in the next chapter). We focus specifically on the technical concerns and the technological design decisions that we have taken.

      The mashup meta-model proposed in the previous chapter describes the capabilities that a mashup platform can offer when its implementation follows the model. For example, universal integration is achieved by supporting components of type service, UI, or data; that is, the platform supports this capability by architectural design. Ease of use, that is, effectively enabling non-technical users to develop mashups, is achieved by avoiding complex mappings. Intuitiveness is achieved by introducing a domain-specific syntax for composition constructs. These are the fundamental characteristics that help our platform provide an effective end-user development environment. In this chapter we present the steps in the development of a mashup platform that should, on the one hand, reduce the complexity of creating mashups for non-technical users and, on the other hand, support developers in developing new components. Focusing primarily on domain-specific mashups greatly helps us achieve these objectives.

      To this end, we first present our baseline mashup engine, which comprises various modules that are explained later in this chapter. We aim for a lightweight yet powerful mashup engine that can easily run in web browsers, that is, on the client side, with no need to download any extra software. For this purpose, the engine is implemented in JavaScript. This choice is motivated by the fact that most Ajax-based Web 2.0 applications, whose major goal is to offer fast, interactive, and attractive user interfaces, use client-side languages like JavaScript.

      6.2 Components & Compositions Execution Insights

      Before describing the technical details of the mashup platform, we first present a few design aspects that must be considered in order to get the maximum benefit from our mashup meta-model. To clarify the terminology: a "composition" is a set of components connected together to make a mashup. Components are the fundamental units that contain presentation or application logic and perform certain operations. They usually take some input and generate some output (i.e., the result of their operation). Several components can be connected so that the output of one component serves as input for another, which forms a mashup composition (also referred to as a mashup or simply a composition). In the following sections, the terms "mashup" and "composition" are used interchangeably.

      6.2.1 Orchestration style

      Orchestration is a key aspect to consider prior to the development of a mashup platform. In general, where multiple computing units are involved, orchestration manages their coordination during execution. In our case, it specifies how to synchronize the execution of the components in a composition. There are three prominent approaches [107]:

      Flow-based approach: orchestration is defined as a sequencing of components, in a flowchart-like manner, where multiple units (e.g., components) are connected and executed according to a defined sequence.

      Event-based approach: orchestration follows a publish-subscribe model, which keeps the components synchronized. A component acting as a publisher sends messages to a queue, and these messages are then consumed by all interested components (subscribers).

      Layout-based approach: the components of a composition are placed into a common layout, and the behavior of each component is specified individually by accounting for the other components' reactions to user interactions.

      We use a combination of the flow-based and event-based approaches. That is, component execution generally follows the flow-based style; however, components' operations can subscribe to various data buses and receive the required data when an event is triggered.
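The combination of the two styles can be illustrated with a minimal publish-subscribe data bus, where operations subscribe to events and the overall order of execution still follows the flow from source to sink. The DataBus class, the metric_computed event, and the placeholder score are assumptions made for this sketch and do not reproduce the engine's actual classes.

```javascript
// Hypothetical sketch: operations subscribe to a data bus; events publish to it.
class DataBus {
  constructor() {
    this.subscribers = new Map(); // event name -> list of subscribed operations
  }
  subscribe(eventName, operation) {
    const list = this.subscribers.get(eventName) || [];
    list.push(operation);
    this.subscribers.set(eventName, list);
  }
  publish(eventName, data) {
    for (const operation of this.subscribers.get(eventName) || []) {
      operation(data);
    }
  }
}

const bus = new DataBus();
const log = [];

// Flow: source -> metric -> chart, wired through events.
bus.subscribe("researchers_loaded", (names) => {
  log.push("metric");
  // The score is a placeholder value, not a real metric.
  bus.publish("metric_computed", names.map((name) => ({ name, score: name.length })));
});
bus.subscribe("metric_computed", (rows) => {
  log.push("chart");
  log.push(rows.length);
});

// The source component triggers its event once its data is available.
bus.publish("researchers_loaded", ["Rossi", "Bianchi"]);
console.log(log);
```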

      6.2.2 Data-passing style

      The choice of the data-passing approach is another pivotal aspect; it describes how data flows among components. This property alone can effectively distinguish mashup tools from one another, especially when the target end-users are non-technical. Two main approaches have been followed in the past: data-flow and blackboard-based [107].

      In the data-flow approach, the actual data flows from one component to another. A component starts its execution upon receiving the data it is waiting for and, once the execution completes, sends its data to the next component in the flow. The data-flow approach is considered more intuitive for non-technical users because it follows the natural workflow of daily work. In the blackboard-based approach, on the other hand, data is written to variables, which serve as the source and target of operation invocations on components, much like in programming languages.

      In our case, we follow the data-flow approach. For example, a data source component produces data (after fetching it from a database, a web service, etc.) and hands it over to the next connected component, which then consumes it for further processing. To convey the execution status of a composition, we present to end-users the execution status of its individual components. Another aspect related to the data-flow approach, which we describe later in this chapter, is that in the background we sometimes pass control data instead of the actual data. This happens for components that are implemented as web services. So far we have presented the different types of components; how these components can be implemented (i.e., as a web service or as a client-side implementation) is presented later.

      6.2.3 Compositions execution

      A mashup composition, which comprises several connected components, executes to achieve its goal. Executing a composition means running its components in the order defined by the end-user. The general execution semantics of a composition/mashup is described in Section 5.6.2. In this section we look at whether the execution follows an instance-based or a continuous approach [107]. The instance-based model is the traditional service composition model, in which the arrival of a certain kind of message activates a new instance of the composition, and the system executes the instance within the same main thread and context (much like a program run). The continuous model, on the other hand, has one instance per component in a composition. Each component works as a thread, processing the input data feed and transforming or filtering it to generate the output. We follow the continuous model, that is, we allow the components to execute in their own threads and to communicate with each other when required.
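The continuous model can be approximated in a few lines with generator functions: each component exists once and processes the items of its input feed as they arrive, transforming or filtering them to produce its output feed. This is only a sketch under stated assumptions; generators are a single-threaded stand-in for the per-component threads described above, and the component names and sample publications are invented for the illustration.

```javascript
// Hypothetical sketch of the continuous model: one long-lived instance per
// component, each turning an input feed into an output feed.
function* source(items) {
  for (const item of items) yield item;
}

function* filterComponent(feed, predicate) {
  for (const item of feed) {
    if (predicate(item)) yield item;
  }
}

function* transformComponent(feed, fn) {
  for (const item of feed) yield fn(item);
}

// Invented sample data.
const publications = [
  { title: "P1", year: 2007, citations: 12 },
  { title: "P2", year: 2009, citations: 3 },
  { title: "P3", year: 2010, citations: 8 },
];

// Items flow through the pipeline one by one; no component waits for the
// complete data set of its predecessor.
const pipeline = transformComponent(
  filterComponent(source(publications), (p) => p.year >= 2008),
  (p) => `${p.title}:${p.citations}`
);

const output = [...pipeline];
console.log(output);
```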

      6.3 Components Definitions

      Given the above insights, we now describe the Component Definition Language (CDL), which is built on the component capabilities described in the mashup meta-model. The CDL is the technical counterpart of a component as defined by the mashup meta-model. Components are the main building blocks of a mashup: they consume data, perform certain actions or manipulations, and produce results. From a software development point of view, a component can be seen as a function or method. It can take one or more inputs and produce one or more outputs. The input might come from another component, from an external service, or from the component's own UI. Furthermore, a component might require direct user interaction and provide a corresponding UI (i.e., a configuration UI). Following this idea, Figure 6.1 shows the component definition language model and Figure 6.2 depicts the component communication mechanism. In the following we elaborate on both aspects.

      Figure 6.1: Model Representing Component Definition Language (CDL)

      6.3.1 Component Definition Language (CDL)

      From a technical point of view, the CDL comprises the following elements for building a component:

      • Operations: A component exposes the set of actions it can perform by means of operations. Operations correspond to the input ports (i.e., IPTs) defined in the mashup meta-model. Operations are invoked through events (described below); they can accept one or more input parameters and produce one or more output parameters. Each parameter can have a certain data type. The generic model does not constrain the types, although the mashup tool can restrict the possible types to those of a domain concept model (DCM), expressed as an XSD. To complete its computation, a component might need its operations to be called in a certain order; operations can therefore depend on each other. Ideally, an operation should expect only one input, to make its purpose more intelligible to the composition designer. This does not necessarily restrain the capabilities of components: a multi-input operation can be split up into several dependent operations.

      • Events: Events are the way to propagate the results of a component's action to other components. Events implement the output ports (i.e., OPTs) defined in the mashup meta-model. They are generated either programmatically, for example after an operation has completed, or through the user interacting with a component's UI. Like the input of operations, the output contained in the event data should conform to one or more data types. Creating a composition mainly consists of connecting events with operations that accept the same data types (i.e., domain concepts).

        These two concepts, operations and events, are general enough to cover any kind of interaction between components, as they are essentially a mixture of the Observer pattern [108] and the more general Publish/Subscribe pattern [109]. These are common patterns used in Model View Controller (MVC) architectures, providing a way to decouple the different parts of an application and make them easily exchangeable. From this point of view, applying such a concept is a logical step: the components are the different parts of a composition (the application), and they need to be highly exchangeable due to the dynamic nature of mashups.

        Figure 6.2: Component Communication
      • Requests: In a composition, components send data to each other through connections between events and operations, but this is not the only type of communication that takes place. Components often communicate with external services or APIs to fetch or process data. In the previous chapter we described that components can be of the information source type, acting as data sources, or of the information processor type, implementing business logic. For these kinds of components it is common to call external services to accomplish their task. External interactions can be triggered from an operation or from a component's UI (i.e., in response to a user interaction). Although these interactions could be regarded as internal calls with no relevance to the environment, we think that a formal specification of these interactions is useful for the platform provider and leads to a more comprehensive specification of the component interface. We call this characteristic a request. The model is deliberately kept universal in this regard and does not require a detailed specification of the type of service or of the request and response formats, though the current implementation expects requests to be executable by means of Ajax. Interpreting the response is the responsibility of the component implementation.

      • Configurations: Each component can have a set of configuration parameters, corresponding to the configuration ports (i.e., CPTs) of the mashup meta-model. Users set these parameters through the component's UI. Each parameter must have a data type, which can be a primitive type or a platform-specified type (e.g., a domain type). Each parameter value supplied by the user belongs to that specific instance of the component in that composition and is therefore not shared with other compositions.

      • Meta-data: Finally, a component can have an arbitrary set of meta-data associated with it, typically in the form of key-value pairs. The mashup platform can use this to store necessary, platform-specific information. For example, the description and the type attribute of a component can be defined using the meta-data feature.
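The five CDL elements can be pictured as plain objects, together with the type check that decides whether an event may be connected to an operation. This is a minimal sketch: the field names (operations, events, requests, configs, meta) mirror the elements listed above, but the object layout, the Researcher type name, the string config types, and the canConnect function are assumptions made for illustration rather than the engine's internal model.

```javascript
// Hypothetical in-memory form of two component definitions.
const italianResearchers = {
  id: "org.reseval.ItalianResearchers",
  operations: [],
  events: [
    { ref: "researchers_loaded", outputs: [{ name: "researchers", type: "Researcher" }] },
  ],
  requests: [{ ref: "get_researchers", triggers: "researchers_loaded" }],
  configs: [{ ref: "uniId", type: "string" }],
  meta: { name: "Italian Researchers" },
};

const microsoftAcademic = {
  id: "org.reseval.MAS",
  operations: [
    { ref: "set_researchers", inputs: [{ name: "researchers", type: "Researcher" }] },
  ],
  events: [],
  requests: [],
  configs: [{ ref: "startYear", type: "string" }, { ref: "endYear", type: "string" }],
  meta: { name: "MAS" },
};

// An event can be connected to an operation when the event provides every
// data type that the operation expects.
function canConnect(source, eventRef, target, operationRef) {
  const event = source.events.find((e) => e.ref === eventRef);
  const operation = target.operations.find((o) => o.ref === operationRef);
  if (!event || !operation) return false;
  const outputTypes = event.outputs.map((o) => o.type);
  return operation.inputs.every((input) => outputTypes.includes(input.type));
}

console.log(canConnect(italianResearchers, "researchers_loaded", microsoftAcademic, "set_researchers"));
```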

      6.3.2 Component Definition Language in Action

      The listing at the end of this section shows a simplified version of the model definition of a component that is responsible for retrieving a list of researchers based on its configuration. The component has a fully qualified name as its ID, a descriptive name (line 1), and a more detailed description (line 3). Furthermore, it has four configuration parameters (lines 5 to 25), sectionId, uniId, departmentId, and facultyId, defined using the config tag. These configuration parameters are examples of filters, which restrict the result set according to various filtering criteria. Each config parameter definition contains further information for displaying the configuration fields, such as the label "Sector" given in an option tag. Various types of options can be set, as shown for the sectionId configuration parameter (lines 6 to 13). These include:

      - label defines the label of the field through its value attribute.

      - renderer selects how the UI field is rendered. In this case a renderer of type jsm.ui.input.Autocomplete is used.

      - url specifies the URL of an external service, if the data has to be fetched from one.

      - search_parameter plays the role of the query-string parameter in web service calls.

      - value specifies which field is used as the value field in the UI.

      - display specifies the display field used to populate a UI field (e.g., a text field).

      We explain how these parameters work together in Section 6.5.8. Note that the CDL example expands only the first configuration parameter in full, for the sake of clarity.

      The event (lines 27 to 29) returns a collection of researchers, as denoted by its data type. It is triggered by the request (lines 31 to 39), which connects to a service at the given URL and sends the configuration parameters, denoted by the {name} syntax. Provided that the web service correctly returns a list of researchers, no further implementation has to be provided by the developer. With this definition, the engine generates a generic configuration interface and can manage the request to the web service automatically. Apart from simplifying the development process for data source components, this also completely hides the technical details of service calls from the composition designer.

      This example also shows the use of meta-data in the definition. The conceptual model does not require a name or a description, but they can still be provided through meta-data. This means that the implementation of the component does not process this data, but other routines can access it. For example, the user interface of the mashup tool can use this data to present more information about a component for a better user experience. In this example, we do not show how a service call can be configured; Section 6.5.7 provides in-depth details on this aspect.

      1<component id="org.reseval.ItalianResearchers" name="Italian Researchers">
      2
      3    <description>Gets a list of Italian researchers, optionally filtered</description>
      4
      5    <config ref="sectionId">
      6        <option name="label" value="Sector"/>
      7        <option name="renderer">
      8            <option name="type" value="jsm.ui.input.Autocomplete"/>
      9            <option name="url" value="http://example.com/italianSource/sector/name/autocomplete"/>
      10            <option name="search_parameter" value="input"/>
      11            <option name="value" value="{id}"/>
      12            <option name="display" value="{name}"/>
      13        </option>
      14    </config>
      15    <config ref="uniId">
      16        <option name="label" value="University"/>
      17  ...
      18    </config>
      19    <config ref="departmentId" dependsOn="uniId">
      20        <option name="label" value="Department"/>
      21  ...
      22    </config>
      23    <config ref="facultyId" dependsOn="uniId">
      24  ...
      25    </config>
      26   ...
      27    <event name="Researchers loaded" ref="researchers_loaded">
      28        <output name="researchers" type="Researcher[id][name][masID][dblpID]" collection="true"/>
      29    </event>
      30
      31    <request name="Get Researchers" ref="get_researchers" triggers="researchers_loaded">
      32        <url>http://example.com/italianSource/getResearchers</url>
      33        <parameters>
      34            <parameter name="uniID" value="{uniId}"/>
      35            <parameter name="facID" value="{facultyId}"/>
      36            <parameter name="depID" value="{departmentId}"/>
      37            <parameter name="secID" value="{sectionId}"/>
      38        </parameters>
      39    </request>
      40...
      41</component>
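The {name} placeholders in the request parameters of the listing above could be resolved against the configuration values as in the following minimal sketch. The functions resolveTemplate and buildRequestUrl are hypothetical, and sending the parameters as a query string is an assumption; the source only states that the engine sends the configuration parameters to the given URL by means of Ajax.

```javascript
// Hypothetical sketch: resolve "{name}" placeholders against configuration
// values and build the request URL. This is not the engine's actual code.
function resolveTemplate(template, config) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in config ? String(config[key]) : ""
  );
}

function buildRequestUrl(request, config) {
  const query = new URLSearchParams();
  for (const parameter of request.parameters) {
    query.set(parameter.name, resolveTemplate(parameter.value, config));
  }
  return `${request.url}?${query.toString()}`;
}

// Request definition taken from the listing; configuration values taken from
// the sample composition of Section 6.4.2.
const request = {
  url: "http://example.com/italianSource/getResearchers",
  parameters: [
    { name: "uniID", value: "{uniId}" },
    { name: "facID", value: "{facultyId}" },
    { name: "depID", value: "{departmentId}" },
    { name: "secID", value: "{sectionId}" },
  ],
};
const config = { uniId: 83, facultyId: "", departmentId: "82", sectionId: "" };

const requestUrl = buildRequestUrl(request, config);
console.log(requestUrl);
```

Unset filters are sent as empty values in this sketch; whether the service expects empty values or omitted parameters is not specified in the source.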

      6.4 Mashup Compositions Definitions

      Given the component definition language, we now present the Mashup Definition Language (MDL), the technical counterpart of a mashup as defined by the mashup meta-model. The MDL provides a way to define a mashup composition in terms of its components, the connections among them through pipes, their states (i.e., the parameter values they hold), their instance information, and the layout information. We pursue the same goal for the mashup model as for the component model: a minimal set of characteristics that is necessary to represent a functional composition.

      In essence, a mashup composition, which is formed from multiple components, is defined by connecting events (i.e., output ports) with operations (i.e., input ports). An event emits data, which passes through a pipe and is finally consumed by an operation. Components in a composition may hold user-defined configuration parameters. A composition also contains its layout information, which is used to render the components in their proper position. An MDL that can accommodate these pieces of information is therefore suitable for our purpose from a technical point of view.

      6.4.1 Mashup Definition Language (MDL)

      Technically, according to the MDL a mashup comprises three basic elements, described below:

      • Components: A composition can have multiple components connected together. The MDL defines connections among components in terms of source and target components. A source component is one whose event is connected to another component's operation; the latter is called the target component. Each component in a composition is assigned an instance ID along with its fully qualified name. Other information, such as a component's configuration details, is also preserved in the MDL.

      • Connections: When two components are connected, they form a connection. The MDL maintains the connection information in terms of source and target components, which is how the association between the events and operations of two components is recorded. The mashup engine uses this information in a publish-subscribe manner to hand the data emitted by the source event over to the target operation.

      • Meta-data: The MDL also permits the definition of arbitrary parameters in the form of meta-data (specifically, key-value pairs). These parameters can be used to define special cases such as the state of a mashup composition or its permissions. This also provides a way of extending the definition language with details that are not yet anticipated.

      6.4.2 Mashup Definition Language in Action

      Although the MDL is an internal document of the mashup platform, describing its details helps in case the platform needs to be extended. The listing below shows a short sample definition of a mashup composition that, for the sake of clarity, consists of only two components and the connection between them.

      1{
      2...
      3  "components": [
      4    {
      5      "instance_id": "2",
      6      "component_id": "org.reseval.ItalianResearchers",
      7      "config": {
      8        "facultyId": {
      9          "value": "",
      10          "display": "All"},
      11        "departmentId": {
      12          "value": "82",
      13          "display": "INGEGNERIA E SCIENZA DELL INFORMAZIONE- DISI"},
      14        "uniId": {
      15          "value": 83,
      16          "display": "TRENTO"},
      17        "sectionId": {
      18          "value": "",
      19          "display": ""}},
      20      "data": {
      21        "name": "DISI Researchers",
      22        "minimized": false,
      23        "position": [
      24          21,
      25          56]}},
      26    {
      27      "instance_id": "4",
      28      "component_id": "org.reseval.MAS",
      29      "config": {
      30        "endYear": {
      31          "value": "2010",
      32          "display": ""
      33        },
      34        "startYear": {
      35          "value": "2008",
      36          "display": ""}},
      37      "data": {
      38        "name": "MAS",
      39        "minimized": false,
      40        "position": [
      41          218,
      42          55]}}],
      43  "connections": [
      44    {
      45      "source": "2",
      46      "event": "researchers_loaded",
      47      "target": "4",
      48      "operation": "set_researchers"
      49    }
      50...
      51 "data": {
      52  "public":true,
      53  "name":"DISI-ItaliaEvaluation-MASBased",
      54  ...
      55  }

      As one can see, the presented MDL lists the details of the components used in the composition. As in a database table, each component with its attributes represents a tuple. These attributes include, to name a few, a component's configuration parameters and their values, the component's position in the overall mashup layout, and its UI status (e.g., minimized or not).

      The listing above shows a composition of two components in JSON (http://json.org/) format: the Italian Researchers component (lines 4 to 25) that we showed in the previous example, and the Microsoft Academic component (lines 26 to 42), which accepts a list of researchers and adds a list of publications to each researcher. The important parts of this mashup definition are the components, the connections, and further meta-data. For each component, the MDL stores the instance ID (line 5), the component ID (line 6), the config (i.e., the configuration parameters and their instance values, lines 7 to 19), the name of the component (line 21), and the layout position (lines 23 to 25). Instance IDs are unique within a composition, whereas a component ID is given by the component developer and is unique among all components in the platform.

      The connections of this composition are described on lines 43 to 49. For each connection, the MDL preserves the source component's instance ID (line 45) and event name (line 46), and the target component's instance ID (line 47) and operation name (line 48). Other information, such as the composition's name and its visibility status, is defined using the data tag (lines 51 to 53).
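The way the connection entries tie instance IDs to events and operations can be sketched as follows. The mdl object is a reduced copy of the sample definition (configuration and layout data omitted), and the wire function is a hypothetical helper that only validates and resolves the connections; it is not the mashup engine's implementation.

```javascript
// Reduced copy of the sample mashup definition.
const mdl = {
  components: [
    { instance_id: "2", component_id: "org.reseval.ItalianResearchers" },
    { instance_id: "4", component_id: "org.reseval.MAS" },
  ],
  connections: [
    { source: "2", event: "researchers_loaded", target: "4", operation: "set_researchers" },
  ],
};

// Hypothetical helper: resolve each connection to the components it links and
// reject connections that reference unknown instance IDs.
function wire(definition) {
  const byInstance = new Map(definition.components.map((c) => [c.instance_id, c]));
  return definition.connections.map((connection) => {
    const source = byInstance.get(connection.source);
    const target = byInstance.get(connection.target);
    if (!source || !target) {
      throw new Error(`Unknown instance in connection ${connection.source} -> ${connection.target}`);
    }
    return `${source.component_id}.${connection.event} -> ${target.component_id}.${connection.operation}`;
  });
}

const wired = wire(mdl);
console.log(wired);
```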

      6.5 The Mashup Engine

      Given the CDL, which defines components, and the MDL, which defines mashup compositions (or mashups), we now turn to a mashup engine that supports developing components following the CDL, composing mashups, and finally running those compositions following the MDL. The mashup engine must incorporate the aspects described above, such as the orchestration and data-passing styles. We adopt the data-flow paradigm as the general approach, which must also be conveyed to and understood by end-users; in some cases we use control signals to decrease the data-passing overhead and hence increase overall performance. How the mashup engine decides when data and when a control signal flows is described in the next chapter, where we first introduce the concepts necessary to understand it.

      The mashup engine is the core part of the platform; it manages the various modules and all communication among them. One of the main objectives driving the development of the mashup engine was to keep platform- or environment-specific requirements separate from those specific to a mashup tool (i.e., domain-specific ones) and, at the same time, to provide a consolidated platform that can easily be tailored to a specific domain. For this reason, throughout the description of the mashup engine in the following subsections, we mainly focus on generic aspects whose design and implementation are inspired by, but do not depend on, a domain. This allows us to use the engine for other domains with characteristics similar to those of our reference domain (i.e., research evaluation).

      Figure 6.3: Mashup Engine Internals: various modules inside mashup engine and their interactions

      6.5.1 Mashup Engine Architecture

      The mashup engine is designed and developed for client-side technologies (e.g., web browsers). It was developed in JavaScript using the Google Closure Library (https://code.google.com/closure/library/). In the following, we describe the main modules, their roles, and their relations to each other. Figure 6.3 depicts the overall architecture of the mashup engine. The modules presented in the architecture are the main building blocks of the engine. Later, we present how the UI of a mashup tool can interact with them to give information and control to users. Some modules only describe an abstract interface, for which the platform provider has to provide a concrete implementation adapted to the environment of the mashup tool.

      6.5.2 The Repository Module

      The repository module is responsible for performing the typical CRUD operations (i.e., create, read, update, and delete) for managing components and compositions, as well as other external calls that the engine needs to perform. For example, it is used to access the web services that a developer defines inside a component definition. The repository module does not fix the way components and compositions are persisted. Instead, the repository interface is designed to work with both synchronous and asynchronous storage facilities. For example, one can use the HTML5 local storage API, server-side storage accessed with Ajax, or any other means. Overall, this lets the platform provider decide how and where storage is done, based on their requirements and possibilities.
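One way to keep such an interface independent of the storage facility is to deliver every result through a callback, so that a synchronous store can answer immediately while an Ajax-backed store answers later. The following minimal sketch shows an in-memory back end under that assumption; the class name InMemoryRepository, its method names, and the callback convention are hypothetical and are not taken from the platform.

```javascript
// Hypothetical sketch of a storage-agnostic repository. Results are delivered
// through callbacks (error first, then result) so that synchronous and
// asynchronous back ends can share one interface.
class InMemoryRepository {
  constructor() {
    this.items = new Map();
  }
  create(id, item, done) {
    if (this.items.has(id)) return done(new Error(`Duplicate id: ${id}`));
    this.items.set(id, item);
    done(null, item);
  }
  read(id, done) {
    done(null, this.items.has(id) ? this.items.get(id) : null);
  }
  update(id, item, done) {
    if (!this.items.has(id)) return done(new Error(`Unknown id: ${id}`));
    this.items.set(id, item);
    done(null, item);
  }
  remove(id, done) {
    done(null, this.items.delete(id));
  }
}

const repository = new InMemoryRepository();
const results = [];
repository.create("org.reseval.MAS", { name: "MAS" }, (err, item) => results.push(item.name));
repository.update("org.reseval.MAS", { name: "Microsoft Academic" }, () => {});
repository.read("org.reseval.MAS", (err, item) => results.push(item.name));
repository.remove("org.reseval.MAS", (err, removed) => results.push(removed));
repository.read("org.reseval.MAS", (err, item) => results.push(item));
console.log(results);
```

A back end based on local storage or on Ajax calls could expose the same four methods and invoke the callback once the underlying operation has completed.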

      In addition, the repository module is independent of the representation used to describe components or compositions (e.g., in our case either XML or JSON). This task is delegated to the component and composition mappers (described below), which translate whatever representation is used into the component descriptor or composition objects that the component and composition classes understand.
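
      As an illustration of this design, the following minimal sketch shows a repository whose storage backend and serialization are both pluggable. All names (Repository, MemoryStorage, save, load) are hypothetical and not taken from the engine. The sketch assumes that every backend exposes promise-returning methods, so that a synchronous store (such as HTML5 local storage) and an asynchronous one (such as a server-side store accessed with Ajax) could be used interchangeably.

```javascript
// Hypothetical in-memory backend; a localStorage- or Ajax-based backend
// would expose the same two promise-returning methods.
class MemoryStorage {
  constructor() {
    this.items = new Map();
  }
  write(key, text) {
    this.items.set(key, text);
    return Promise.resolve();
  }
  read(key) {
    return Promise.resolve(this.items.has(key) ? this.items.get(key) : null);
  }
}

// The repository only knows the storage interface and a serializer, so
// neither the storage location nor the representation is fixed here.
class Repository {
  constructor(storage, serializer) {
    this.storage = storage;
    this.serializer = serializer;
  }
  save(id, composition) {
    return this.storage.write(id, this.serializer.stringify(composition));
  }
  load(id) {
    return this.storage
      .read(id)
      .then((text) => (text === null ? null : this.serializer.parse(text)));
  }
}

// The built-in JSON object already offers stringify/parse.
const repository = new Repository(new MemoryStorage(), JSON);
```

      Because callers always receive a promise, replacing MemoryStorage with another backend would not change the code that uses the repository.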

      6.5.3 Component- and Composition Mapper

      The mashup engine is not restricted to a specific representation format for components and compositions. We introduced this characteristic because many other formats can be used, well-known ones being W3C Widgets (http://www.w3.org/TR/widgets/) and OpenAjax Widgets (http://www.openajax.org/member/wiki/OpenAjax Metadata 1.0 Specification Widget Overview), and others may be developed in the future. For this reason, we introduced the concept of a mapper, which maps a particular format to the system’s internal one. In the previous sections we showed the component and composition representations that the mashup engine follows and with which the default mappers work.

      To avoid restricting the engine to a specific representation, the mappers convert a specific representation into our internal model and vice versa. It might not be possible to map every component description to our component model, but the idea provides a certain degree of independence and leaves room for extensions. Again, the choice of which representations to use and support is left to the platform provider.
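
      The mapper idea can be sketched as follows. Both the external format ("widget-json", with title, inputs and outputs) and the shape of the internal model (name, operations, events) are assumptions made for this example; they are not the formats used by the engine.

```javascript
// Hypothetical external format: { title, inputs: [...], outputs: [...] }.
// Hypothetical internal model: { name, operations: [...], events: [...] }.
const widgetMapper = {
  toInternal(widget) {
    return {
      name: widget.title,
      operations: widget.inputs.slice(),
      events: widget.outputs.slice(),
    };
  },
  fromInternal(descriptor) {
    return {
      title: descriptor.name,
      inputs: descriptor.operations.slice(),
      outputs: descriptor.events.slice(),
    };
  },
};

// Mappers are registered per format, so supporting a new format only
// requires adding one entry here.
const mappers = new Map([["widget-json", widgetMapper]]);

function importComponent(format, source) {
  const mapper = mappers.get(format);
  if (!mapper) {
    throw new Error("No mapper registered for format: " + format);
  }
  return mapper.toInternal(source);
}

const imported = importComponent("widget-json", {
  title: "Bar chart",
  inputs: ["setData"],
  outputs: ["rendered"],
});
```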

      6.5.4 Component Descriptor and Component

      As described earlier, a component, which is a basic building block, can be used in several compositions and can also occur more than once in a composition. It is therefore necessary to distinguish between the properties shared by all instances of a component, such as its operations and events, and the instance-specific properties, such as a composition-specific name, its id, and its configuration settings.

      The component descriptor and the component class keep track of these two aspects of a component. The component descriptor is a software artifact that represents a component’s model definition, which is stored in its CDL document. After the component’s CDL has been parsed, the descriptor is populated with the details of the component’s operations, events, requests and configuration parameters. An instance of the component class is created to represent this definition together with the component’s instance-specific information, such as the values of the configuration parameters and the instance id; it also includes the basic execution logic needed to run a component’s actions.

      The user interface of the mashup tool interacts directly with component descriptors and component instances to provide information about them and to let the user manipulate them through the configuration interface and the UI of a component.
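
      The split between shared and instance-specific properties can be sketched as two classes. The constructor signatures, the configure method and the example component are assumptions for illustration only; the engine’s actual classes may differ.

```javascript
// Shared definition: one descriptor per component type (hypothetical shape).
class ComponentDescriptor {
  constructor(name, operations, events, configParameters) {
    this.name = name;
    this.operations = operations;
    this.events = events;
    this.configParameters = configParameters;
  }
}

// Instance-specific state: one Component per occurrence in a composition.
class Component {
  constructor(descriptor, instanceId) {
    this.descriptor = descriptor;
    this.instanceId = instanceId;
    this.config = {};
  }
  configure(parameter, value) {
    if (!this.descriptor.configParameters.includes(parameter)) {
      throw new Error("Unknown configuration parameter: " + parameter);
    }
    this.config[parameter] = value;
  }
}

const sourceDescriptor = new ComponentDescriptor(
  "Researcher source",
  ["setFilter"],
  ["researchersLoaded"],
  ["uniId", "departmentId"]
);

// Two instances of the same component, configured independently.
const first = new Component(sourceDescriptor, "c1");
const second = new Component(sourceDescriptor, "c2");
first.configure("uniId", "42");
second.configure("uniId", "7");
```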

      6.5.5 Composition

      Compositions represent user-defined workflows that accomplish some task. A composition comprises multiple components that interact with each other through connections made between events and operations. The task of the composition module is to register those connections and to carry out the communication between components. The composition listens to each component’s events and notifies the connected components; it is also responsible for passing the data emitted by an event to the connected operation. Finally, the UI of a mashup tool uses the composition module to inform users about the state of a composition while it is running.
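
      A minimal sketch of this event-to-operation wiring is given below. The Composition class, its method names and the three example components are hypothetical; the sketch only assumes that a connection links one event of a source component to one operation of a target component.

```javascript
class Composition {
  constructor() {
    this.components = new Map(); // instance id -> component object
    this.connections = [];       // { source, event, target, operation }
  }
  add(id, component) {
    this.components.set(id, component);
  }
  connect(source, event, target, operation) {
    this.connections.push({ source, event, target, operation });
  }
  // Called when a component emits an event; forwards the emitted data to
  // every operation connected to that event.
  emit(source, event, data) {
    for (const c of this.connections) {
      if (c.source === source && c.event === event) {
        this.components.get(c.target)[c.operation](data, this);
      }
    }
  }
}

const received = [];
const composition = new Composition();
composition.add("source", {});
composition.add("metric", {
  count(publications, comp) {
    comp.emit("metric", "counted", publications.length);
  },
});
composition.add("viewer", {
  show(value) {
    received.push(value);
  },
});
composition.connect("source", "publicationsLoaded", "metric", "count");
composition.connect("metric", "counted", "viewer", "show");

// The source emits three publications; the viewer ends up with the count.
composition.emit("source", "publicationsLoaded", ["p1", "p2", "p3"]);
```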

      6.5.6 Data Mapper

      At run time, the connected components of a composition communicate by sending and receiving data. The actual communication takes place between an event and an operation: the event emits data and the operation consumes it. Data mappers are responsible for converting the data received from the event so that the operation can understand it. Data mappers are not needed between components that are built for a specific domain; in our case, for example, many components understand the DCM of our reference domain (e.g., as encoded in an XSD). Components such as bar charts, pie charts and other visualization components, however, understand a predefined data format that depends on the visualization API used (we use the Google Charts API). Without the conversion functionality, a new variant of the same visualization component would be needed for each component-specific data visualization requirement, which would result in too many new UI components.

      For this reason, we use data mappers to encapsulate the data conversion logic, so that data is converted into a format that the target component understands. The current implementation of the data mappers follows our current domain model, and hence the conversion is performed automatically for such components. A different domain simply requires a new data mapper.
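
      To make the role of a data mapper concrete, the sketch below converts domain-level metric values into a simple table of rows, a shape that chart components commonly consume. The field names of the domain objects and the function toChartRows are assumptions for this example; they are not taken from the DCM or from the engine.

```javascript
// Hypothetical domain data: metric values attached to researchers.
const metricValues = [
  { researcher: { name: "A. Rossi" }, metric: "h-index", value: 12 },
  { researcher: { name: "B. Bianchi" }, metric: "h-index", value: 7 },
];

// Data mapper: domain entities -> a header row plus one [label, number]
// row per entity.
function toChartRows(values) {
  const metricName = values.length > 0 ? values[0].metric : "value";
  const rows = [["Researcher", metricName]];
  for (const v of values) {
    rows.push([v.researcher.name, v.value]);
  }
  return rows;
}

const chartRows = toChartRows(metricValues);
```

      With such a mapper in place, a single bar chart component can serve every domain component that emits metric values, instead of one chart variant per producer.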

      6.5.7 Data Processor

      To perform its task, a component may need to call external services to fetch data or to perform computations. Also, at run time the mashup engine needs to inspect the data being transferred between components for various reasons, such as reading and writing meta-data, calling external services, or converting data. For this reason, data processors give the platform a way to intercept any communication that a component initiates, either to interact with other components or to call services. The data processors allow for pre- and post-processing of the data that components use. Figure 6.4 depicts how a data processor can be seen as a wrapper around a component. Data processors also benefit component developers, who can focus on component-related implementation concerns without having to worry about platform-specific requirements or conditions. This, in turn, increases the reusability of components that belong to the same domain and share similar characteristics.

      Figure 6.4: Message passing between component with payload and header information

      To further illustrate the potential of data processors, in the following we describe how components communicate with each other when a data processor is in place to intercept their calls. As explained already, a composition keeps the information about connected components in terms of references to events and operations. These events and operations exchange messages. A message passed from an event to an operation consists of two parts, a header and a body, analogous to the header and body of an HTTP message. The header contains meta-data, whereas the body contains the actual payload. In our case, the header contains arbitrary meta-data that is of no concern to a component’s implementation, whereas the body is populated with the data that an event emits. A component implementation can only access the body of a message, not the header, which is understandable only by the platform.

      It is then the data processor’s responsibility to intercept the communication and process the header accordingly. Data processors enable different components to synchronize with each other if they understand the header information. Figure 6.4 depicts the communication between two components, along with the stages a message passes through. In the figure, a white box labeled PL represents the payload (i.e., the actual data) and a black box represents the header. This is also the point where the mashup engine decides whether to switch from the data-flow approach (the default) to the control-flow approach. We explain the switching mechanism in the next chapter, in section 7.4, after introducing the concepts required to understand it.
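
      The wrapper view of a data processor and the header/body split can be sketched as follows. The functions createMessage and withProcessor, the before/after hooks and the hop-counting processor are hypothetical; the sketch only assumes that the processor sees the whole message while the component implementation receives the body alone.

```javascript
// A message has a platform-only header and a body holding the payload.
function createMessage(body, header) {
  return { header: header || {}, body: body };
}

// Wraps a component operation: the processor handles the whole message,
// the component implementation only ever receives the body.
function withProcessor(operation, processor) {
  return function (message) {
    const incoming = processor.before(message);
    const resultBody = operation(incoming.body);
    return processor.after(createMessage(resultBody, incoming.header));
  };
}

// Hypothetical processor that counts how many components a message passed.
const hopCounter = {
  before(message) {
    return message;
  },
  after(message) {
    const hops = (message.header.hops || 0) + 1;
    return createMessage(
      message.body,
      Object.assign({}, message.header, { hops: hops })
    );
  },
};

const seenByComponent = [];
const filterOperation = withProcessor(function (publications) {
  seenByComponent.push(publications);
  return publications.filter((p) => p.year >= 2010);
}, hopCounter);

const output = filterOperation(
  createMessage([{ year: 2008 }, { year: 2012 }], { key: "k1" })
);
```

      The component code never touches the header, so platform concerns such as keys or bookkeeping can change without affecting component implementations.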

      6.5.8 Configuration Interface

      A mashup platform whose target users are non-technical must offer a user interface that is intuitive, easy to use and consistent. Less-skilled users tend to prefer a highly intuitive interface with relatively few, but consistent, UI elements for configuration settings. We noticed that some mashup tools come with overly complex user interface options (e.g., Yahoo Pipes) and others with overly limited options (e.g., mashArt). A balance between the two extremes is preferable.

      To this end, one possibility is to let component developers decide entirely what type of configuration interface a component should provide. The problem with this approach is that it could make a component’s configuration interface arbitrarily complex, and it might result in major inconsistencies with other components. To solve these issues, we chose to provide a basic set of UI configuration elements, along with the parameters each input needs. Component developers can choose among the provided input elements or, if a new element is needed, add it themselves.

      In the listing below, we present an example of two configuration input elements. The first element (lines 2 to 11) is a simple text field connected to a web service that provides auto-completion when a user starts typing. The second input element (lines 13 to 25) generates a drop-down field; it is also connected to a service, from which it gets its data. The second input element depends on the first, which is declared with the dependsOn attribute (line 13). That is, the departmentId element is updated whenever the uniId element changes. The automatic update is achieved through the value and display options of the departmentId element, which have the values {id} and {name}, respectively (lines 18 and 19). Curly braces denote a variable that is accessible throughout a model. As both id and name are already defined by the uniId element, it is possible to use them anywhere in the component’s definition.

      Most of the remaining options have already been explained in earlier sections. The engine includes a few basic input fields, and new elements can be added as required. If a certain type is not available or none is specified, a simple text field is used instead.

      1
      2   <config ref="uniId">
      3     <option name="label" value="University"/>
      4     <option name="renderer">
      5       <option name="type" value="jsm.ui.input.Autocomplete"/>
      6       <option name="url" value="http:// ... /university/autocomplete"/>
      7       <option name="search_parameter" value="input"/>
      8       <option name="value" value="{id}"/>
      9       <option name="display" value="{name}"/>
      10    </option>
      11  </config>
      12  ...
      13  <config ref="departmentId" dependsOn="uniId">
      14    <option name="label" value="Department"/>
      15    <option name="renderer">
      16      <option name="type" value="jsm.ui.input.Dropdown"/>
      17      <option name="url" value="http://.../{uniId}/departments"/>
      18      <option name="value" value="{id}"/>
      19      <option name="display" value="{name}"/>
      20      <option name="default">
      21        <option name="value" value=""/>
      22        <option name="display" value="All"/>
      23      </option>
      24    </option>
      25  </config>
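
      The dependency between the two elements can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the functions resolveTemplate and urlsToRefresh and the in-memory configElements array are hypothetical, and the host example.org stands in for the part of the service URLs that is elided in the listing.

```javascript
// Replaces {name} placeholders with values from `variables`; unknown
// placeholders are left untouched.
function resolveTemplate(template, variables) {
  return template.replace(/\{(\w+)\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(variables, name)
      ? String(variables[name])
      : match;
  });
}

// Hypothetical in-memory view of the two <config> elements of the listing
// (example.org is a placeholder host, not the real service address).
const configElements = [
  { ref: "uniId", dependsOn: null, url: "http://example.org/university/autocomplete" },
  { ref: "departmentId", dependsOn: "uniId", url: "http://example.org/{uniId}/departments" },
];

// Returns the URLs that would have to be re-fetched when the element
// `changedRef` receives a new value.
function urlsToRefresh(changedRef, values) {
  return configElements
    .filter((element) => element.dependsOn === changedRef)
    .map((element) => resolveTemplate(element.url, values));
}

const refreshed = urlsToRefresh("uniId", { uniId: 42 });
```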

      Chapter 7 ResEval Mash: A Domain-Specific Mashup Tool

      7.1 Overview

      The domain-specific mashup platform described in the previous chapter provides a consolidated ground for the development of a domain-specific mashup tool. In this chapter, we present how we developed such a mashup tool for our reference domain. ResEval Mash (http://open.reseval.org) [89] [110] is a mashup tool tailored to the research evaluation field, i.e., to the assessment of the productivity or quality of researchers, teams, institutions, journals, and the like. The tool is specifically tailored to the need of sourcing data about scientific publications and researchers from the Web, aggregating them, computing metrics (also complex and ad hoc ones), and visualizing them. ResEval Mash is a hosted mashup platform [111] with a client-side editor and runtime engine, both running in a common web browser. It also supports the processing of large amounts of data, a feature achieved through the sensible distribution of the respective computation steps over client and server.

      In the following, we first present the important design principles that we learned from our past work and from our interactions with domain experts. We then show how ResEval Mash has been implemented, starting from the domain models introduced throughout the previous sections.

      7.2 Design Principles

      Starting from the considerations presented in section 5.2, the implementation of ResEval Mash is based on a set of design principles (described below), which we think are crucial for the success of a mashup platform like ResEval Mash. These design principles stem both from earlier work in this direction [9] [112] and from the requirements presented in section 4.4.1, which were gathered from both domain experts and end users. They are also based on our past experience with similar problems in the context of the LiquidPub European project (http://liquidpub.org/).

      7.2.1 Intuitive graphical user interface

      The user interfaces of development tools may not be a complex theoretical issue, but acceptance of programming paradigms can be highly influenced by this aspect too. The user interface comprises, for instance, the selection of the right graphical or textual development metaphor so as to provide users with intelligible constructs and instruments. It is worth investigating and abstracting the different kinds of actions and interactions the user can have with a development environment (e.g., selecting a component, writing an instruction, connecting two components), to then identify the best mix of interactions that should be provided to the developer. To this end, we built a very simple yet powerful interface for the tool, which applies the domain-syntax model to its various visual parts [113]. That is, the tool uses intuitive graphical symbols that are domain-specific in nature and easily understandable by domain experts.

      7.2.2 Hidden data mappings

      In order to spare users from defining data mappings, the mashup components used in the platform are all able to understand and manipulate the domain concepts expressed in the DCM, which defines the domain entities and their relations. That is, they accept as input and produce as output only domain entities (e.g., researchers, publications, metric values). Since all the components, hence, speak the same language, composition can do without explicit data mappings and it is enough to model which component feeds input to which other component.

      7.2.3 Data-intensive processes

      Although apparently simple, the chosen domain is peculiar in that it may require the processing of large amounts of data. For instance, we may need to extract and process all the publications of the Italian researchers, i.e., on average several dozens of publications by about sixty-one thousand researchers (as the scenario presented in section 4.2 demands). Loading these large amounts of data from remote services and processing them in the browser on the client side is infeasible due to bandwidth, resource, and time restrictions. Data processing should therefore be kept, especially for this kind of scenario, on the server side (we achieve this via dedicated RESTful web services running on the server).

      7.2.4 Platform-specific services

      As opposed to common web services, which are typically designed to be independent of the external world, the previous two principles instead demand services that are specifically designed and implemented to run efficiently in our domain-specific architecture. That is, they must be aware of the platform they run on. As we will see, this allows the services to access shared resources (e.g., the data passed between components) in a protected and speedy fashion.

      7.2.5 Runtime transparency

      Finally, research evaluation processes like our reference scenarios focus on the processing of data, which, from a mashup paradigm point of view, demands a data flow mashup paradigm. Although data flows are relatively intuitive at design time, they typically are not very intuitive at runtime, especially when processing a data flow logic takes several seconds (as could happen in our case). In order to convey to the user what is going on during execution, we therefore want to provide transparency in the state of a running mashup.

      We identify two key points where transparency is important in the mashup model: component state and processing state. At each instant of time during the execution of a mashup, the runtime environment should allow the user to inspect the data processed and produced by each component, and the environment should graphically communicate the processing progress by animating a graphical representation of the mashup model with suitable indications (i.e., in our case we use different colors to represent different states).

      These principles require ResEval Mash to specifically take into account the characteristics of the research evaluation domain. Doing so produces a platform that is fundamentally different from generic mashup platforms, such as Yahoo! Pipes (http://pipes.yahoo.com/pipes/).

      7.3 ResEval Mash Architecture

      7.3.1 Overview

      Figure 7.1 illustrates the internal architecture that takes into account the above principles and the domain-specific requirements introduced throughout the previous sections. Hidden data mappings are achieved by implementing mashup components that all comply with the domain conceptual model described in Figure 5.1. If all instances of domain activities understand this domain concept model and produce and consume data according to it, we can omit data mappings from the composition environment, in that the respective components simply know how to interpret inputs. The processing of large amounts of data is achieved at the server side by implementing platform-specific services that all operate on a shared memory, which allows the components to read and write back data and prevents them from having to pass data directly from one service to another. To provide users with a mashup environment that has an intuitive graphical UI, we first design a domain syntax as explained in section 5.7.3, which provides each object in the composition environment with a visual metaphor that the domain expert is acquainted with and that visually conveys the respective functionality. For instance, ResEval Mash uses a gauge for metrics and icons that resemble the chart types for graphical output components.

      The core of the platform is the set of functionalities exposed to the domain expert in the form of modeling constructs. These must address the specific domain needs and cover as many mashup scenarios as possible inside the chosen domain. To design these constructs, a thorough analysis of the domain is needed, so as to produce a domain process model as described in section 5.7.1, which specifies the classes of domain activities and, possibly, ready processes that are needed (e.g., data sources and metrics). The components and services implement the domain process model, i.e., all the typical domain activities that characterize the research evaluation domain. Runtime transparency is achieved by controlling data processing from the client and animating the mashup model in the Composition Editor accordingly. Doing so requires that each design-time modeling construct has an equivalent runtime component that is able to render its runtime state to the user. The modeling constructs are the ones of the domain-specific syntax illustrated in Figure 5.5, which can be used to compose mashups like the one in our reference scenario (see Figure 5.6). Given such a model, the Mashup Engine is able to run the mashup according to the meta-model introduced in Section 5.6. The roles of the individual modules in Figure 7.1 are described in the following.

      Figure 7.1: ResEval Mash Architecture presenting its core module both on client and server sides

      7.3.2 Mashup Engine

      The most important part of the platform is the Mashup Engine, which is developed for client-side processing; that is, we control data processing on the server from the client. We have already described the details of the mashup engine’s internal behavior in the previous chapter. Here, we highlight its interaction with the modules introduced in the ResEval Mash architecture. The engine is primarily responsible for running a mashup composition, triggering the components’ actions and managing the communication between client and server. As a component binds either to one or more services or to a JavaScript implementation, the engine is responsible for checking the respective binding and for executing the corresponding action. The engine is also responsible for the management of complex interactions among components. A detailed view of these possible interaction scenarios is given later in this chapter.

      7.3.3 Composition editor

      Figure 7.2 shows ResEval Mash’s composition editor. The composition editor provides the mashup canvas to the users. It shows a components list from which users can drag and drop components onto the canvas and connect them. The composition editor implements the domain-specific mashup meta-model and exposes it through the domain syntax. From the editor it is also possible to launch the execution of a composition through a run button and hand the mashup over to the mashup engine for execution.

      Figure 7.2: ResEval Mash’s composition editor and its various parts

      Each container with a symbol on the canvas represents one component with its name, a symbol, its configuration interface and, in the case of a UI component, its output interface. The left side of a component shows its input ports (i.e., where it accepts connections) and the right side shows its output ports (i.e., where it emits data to the next connected components). Components can be expanded to view their configuration interface and ports. To connect two components, a user clicks on the output port of a source component, drags, and releases on an input port of a target component. Figure 7.3 depicts how connections are made. When the user clicks on a component’s event (i.e., output port), the composition editor recognizes that the user intends to make a connection and highlights all compatible ports (i.e., it checks the compatibility of domain concepts) of all components present on the canvas.
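
      The compatibility check behind the highlighting can be sketched as a comparison of the domain concepts attached to ports. The canvas structure, the port and concept names, and the function compatiblePorts are assumptions for this example; in particular, the sketch treats two ports as compatible only when their concepts are identical.

```javascript
// Hypothetical canvas content: each port is typed with a domain concept.
const canvas = [
  {
    id: "source",
    inputs: [],
    outputs: [{ name: "researchersLoaded", concept: "Researcher" }],
  },
  {
    id: "hIndex",
    inputs: [{ name: "setResearchers", concept: "Researcher" }],
    outputs: [{ name: "computed", concept: "MetricValue" }],
  },
  {
    id: "barChart",
    inputs: [{ name: "setData", concept: "MetricValue" }],
    outputs: [],
  },
];

// Returns the input ports that may be highlighted when the user starts
// dragging from the given output port.
function compatiblePorts(components, sourceId, outputName) {
  const source = components.find((c) => c.id === sourceId);
  const output = source.outputs.find((o) => o.name === outputName);
  const result = [];
  for (const component of components) {
    if (component.id === sourceId) continue; // no self-connections in this sketch
    for (const input of component.inputs) {
      if (input.concept === output.concept) {
        result.push(component.id + "." + input.name);
      }
    }
  }
  return result;
}

const highlighted = compatiblePorts(canvas, "source", "researchersLoaded");
```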

      To provide run-time transparency, i.e., to convey to users the state of each component’s execution, the composition editor shows the components in different visual states. A component can change among four visual states, which correspond to its execution status. A component that is not running shows its label and border in black. A component that is running shows an extra label in yellow right below the component, giving the name of the operation being executed. The third state represents the successful execution of a component and shows both the component name and its border in green. The fourth and final state shows the component’s border and a notice in red, indicating that the component’s execution failed for some reason.

      Figure 7.3: ResEval Mash’s composition editor highlighting compatible ports upon making connections among components

      7.3.4 Component Registration Interface

      The tool also comes with a component registration interface for developers, which aids them in the setup and the addition of new components to the platform. The interface allows the developer to define components starting from ready templates. In order to develop a component, the developer has to provide two artifacts: (i) a component definition (Figure 7.4) and (ii) a component implementation. The implementation consists either of JavaScript code, for client-side components, or of a binding to a web service, for server-side components.

      To make the work of component developers easier, the editor provides a supportive interface that allows easy adjustments to the code directly in the browser. This matters especially with dynamic, untyped languages such as JavaScript, where testing and debugging can take much time since errors are often only discovered when the code is executed. Developing inside the browser is not very popular yet, but it is possible, especially for languages native to the browser environment, such as JavaScript. Developers do not need to upload a component’s code repeatedly after each change, as the editor automatically detects new changes and deploys the new version.

      Figure 7.4: ResEval Mash’s component registration interface showing a component’s definition

      The component editor consists of two separate editors, one for the component model definition and one for the implementation. Both editors provide syntax highlighting and rudimentary code completion support. For example, the component model editor offers code completion for the elements of our XML representation. This is realized with the help of the CodeMirror (http://codemirror.net/) library.

      7.3.5 Server-Side Services

      On the server side, we have a set of RESTful web services: the repository services, authentication services, and components services. Repository services enable CRUD operations for components and compositions; the mashup engine interacts with these services to perform them. Authentication services are used for user authentication and authorization. Components services manage and allow the invocation of those components whose business logic is implemented as a server-side web service. These web services, together with the client-side components, implement the domain process model. The idea behind these services is to move computation from the client side to the server side, improving performance by exploiting the server’s computational power and larger memory and thereby reducing the burden on the client. The details of the interaction between client-side components and their services are explained in section 7.5, and a detailed explanation of how to develop a service for a component is given in section 7.6.

      7.3.6 CDM Memory Manager, CDM Module & Shared Memory

      The common data model (CDM) memory manager enforces and supports the checking of data types in the system. To use the domain-specific data types (i.e., the DCM concepts) in the various modules on the server side, an interface is needed that can read a domain-specific model (in our case, an XML schema definition (XSD)), parse it, and generate implementation classes. These classes then expose their definitions to the other modules. To configure the CDM, the CDM memory manager generates the corresponding Java classes (in our case, POJOs annotated with JAXB annotations) from an XSD that encodes the domain concept model. With the required data types in place, the CDM memory manager is then responsible for inserting data into and retrieving data from the shared memory.

      On the server side, a shared memory is maintained, which the CDM memory manager uses to read and write data. The shared memory can store multiple states of the data, and also multiple instances of the same data. However, all data must comply with the data types provided by the common data model module. Insertion into and retrieval from the shared memory follow a key-value mechanism, where the value is the actual data and the key works as an identifier. The client-side mashup engine initially generates a key and passes it along with the data; the CDM memory manager then uses this key to store the data in the shared memory. Hence, all data processing services read from and write to this shared memory through the CDM memory manager. That is, the CDM interacts with the shared memory to provide a space for each mashup execution instance, if required. In our first prototype we use the server’s working memory (RAM) as shared memory, which allows for high performance. Clearly, this solution fits the purpose of our prototype, but it may not scale to production installations, which may need to deal with large numbers of users and amounts of data too large to be kept in RAM. In our future work, we aim to develop a persistent, database-backed shared memory.
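
      The key-value mechanism can be sketched as follows. The actual memory manager is implemented in Java on the server; JavaScript is used here only for consistency with the other examples. The class SharedMemoryManager, its methods, the type names and the key format are all assumptions, and the type check is reduced to a lookup of concept names.

```javascript
// Sketch of a key-value shared memory guarded by a simple type check.
class SharedMemoryManager {
  constructor(knownTypes) {
    this.knownTypes = new Set(knownTypes); // concept names of the domain model
    this.memory = new Map();               // key -> { type, data }
  }
  put(key, type, data) {
    if (!this.knownTypes.has(type)) {
      throw new Error("Type not defined in the domain model: " + type);
    }
    this.memory.set(key, { type: type, data: data });
  }
  get(key) {
    const entry = this.memory.get(key);
    return entry ? entry.data : undefined;
  }
}

const manager = new SharedMemoryManager(["Researcher", "Publication"]);

// A first service stores its result under the key received from the client.
manager.put("17-1363000000", "Researcher", [{ name: "A. Rossi" }]);

// A second service reads the same data by key instead of receiving it
// directly from the first service.
const researchers = manager.get("17-1363000000");
```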

      7.3.7 Local Database and the Web

      Both the database and the Web represent the data that is required and used by the component services. As the platform provider, we provide a database (which holds data that we have crawled and downloaded from various sources, for performance purposes) and a basic set of services on top of it. A third-party service can also be deployed, and it can use an external database anywhere on the Web. However, the development of a third-party service must comply with the specification presented later in section 7.6.

      7.4 Intelligent Switching between Data-flow and Control-flow

      As explained earlier, the business logic of a component can be implemented either by server-side web services or in JavaScript; we call the latter client-side components. Components whose business logic is implemented by server-side services need to send data from the client to the server for processing. Moreover, if a composition comprises more than one such component, then for each of them data must be sent from the client to the server and back. The situation gets worse when the data being exchanged are large, as in our case: following this strategy then poses serious challenges in terms of speed and decreases the overall performance of the platform. For data-intensive compositions, which deal with huge amounts of data, the traditional mechanism (i.e., data flowing back and forth between client and server) is therefore not an appropriate choice. For this reason, the mashup engine adds a control-flow layer, which provides a substantial increase in performance in such situations.

      Figure 7.5: Service Call Data Processor Flow Chart

      The platform switches intelligently from data flow to control flow, and back to data flow, with the help of the data processors presented in section 6.5.7. Data processors are designed to intercept any operation call, request or event. To call the web service that corresponds to a component, the platform must intercept every call to an operation of the component and send a request to the service. This is achieved by configuring a ServiceCall data processor, a specific type of data processor, as shown in Listing 1. The operation get_researchers (line 5) is configured with a service URL (line 6), with the passthrough parameter set to false and the overwrite parameter set to true, and with the request method of the service (line 9). The passthrough parameter states whether the data (i.e., the service response) is needed by the component implementation (i.e., the client-side implementation) to perform some actions before it is handed over to the next component. The overwrite parameter states whether the original data will be overwritten. With this configuration, whenever an operation of a component is called, the data processor checks for its ServiceCall configuration.

      1 ...
      2 <processor cls="org.reseval.processor.ServiceCall">
      3       <![CDATA[
      4        {
      5            "get_researchers": {
      6                "url": "http://example.com/getResearchers",
      7                "passthrough": false,
      8                "overwrite": true,
      9                "method": "POST"
      10           }
      11       }
      12       ]]>
      13   </processor>
      14 ...
      Listing 1: Data processor configuration
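
      As an illustration only, the following Python sketch shows how a configuration of the kind in Listing 1 could be read and queried for an operation. It is not the platform's implementation: the function names load_service_calls and lookup are hypothetical, and the fragment assumes that the processor body is a single JSON object wrapped in a CDATA section.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical configuration fragment modelled on Listing 1.
CONFIG_XML = """
<processor cls="org.reseval.processor.ServiceCall"><![CDATA[
{
    "get_researchers": {
        "url": "http://example.com/getResearchers",
        "passthrough": false,
        "overwrite": true,
        "method": "POST"
    }
}
]]></processor>
"""


def load_service_calls(xml_text):
    """Return the operation -> settings mapping stored in the processor body."""
    element = ET.fromstring(xml_text.strip())
    # The CDATA content is exposed as the element text.
    return json.loads(element.text)


def lookup(service_calls, operation):
    """Return the ServiceCall settings of an operation, or None if unconfigured."""
    return service_calls.get(operation)


service_calls = load_service_calls(CONFIG_XML)
settings = lookup(service_calls, "get_researchers")
```

      An operation without an entry yields None, which corresponds to the case in which no service call has to be made for it.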

      The overall procedure, which detects and decides when and where to send data, is represented in the flow chart in Figure 7.5 and works as follows. If a ServiceCall is configured, the data processor inspects the header of the message for a key (i.e., a numeric identifier composed of the composition id and time stamp information), which is dispatched to the service to be called. The key is used on the server side as an identifier for the data in the shared memory. If no key is found in the message, the data processor generates a new one; a missing key also means that the corresponding data does not yet exist on the server side.

      The data processor then checks whether the message body already contains data; if so, the data is dispatched to the server along with the Data Request parameter. The data request can be set to either “yes” or “no”, which states whether the response of the service has to be sent back to the client side. The choice depends on the next component in the composition, i.e., the one connected to the component being inspected. A component whose business logic is implemented as a JavaScript file requires the data on the client side for processing, whereas a component whose business logic is implemented by a service on the server side does not. This is how the mashup engine, using data processors, detects the situation and decides accordingly. The details of each interaction scenario between client-side and server-side components are described in Section 7.5.
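
      The key handling and the data request decision can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the platform's code: the function names are hypothetical, and the key layout (composition id followed by a millisecond time stamp) is only one possible reading of "composition id and time stamp information".

```python
import time

CLIENT = "CC"  # business logic in a client-side JavaScript file
SERVER = "SC"  # business logic in a server-side web service


def resolve_key(header, composition_id, now=None):
    """Reuse the key found in the message header, or generate a new one."""
    if "key" in header:
        return header["key"]
    timestamp = int(time.time() * 1000) if now is None else now
    # Assumed layout: composition id followed by a millisecond time stamp.
    return int(f"{composition_id}{timestamp}")


def data_request(next_component_kind):
    """Return 'yes' when the next component needs the data on the client side."""
    return "yes" if next_component_kind == CLIENT else "no"


def build_service_message(header, body, composition_id, next_component_kind, now=None):
    """Assemble what is dispatched to the service for one intercepted call."""
    message = {
        "key": resolve_key(header, composition_id, now),
        "dataRequest": data_request(next_component_kind),
    }
    if body is not None:
        # Data already present in the message body travels with the request.
        message["data"] = body
    return message


first_call = build_service_message({}, None, 7, SERVER, now=1000)
later_call = build_service_message({"key": 42}, [1, 2], 7, CLIENT)
```

      In the first call no key exists yet, so one is generated and no data is requested back because the next component runs on the server; in the second call the existing key is reused and the output is requested for the client.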

      Figure 7.6: Service Call Data Processor Flow Chart: Event

      Figure 7.6 depicts how the data processor intercepts a call upon an event trigger. First, it checks whether the triggered event is the result of an operation or a request. In both cases, it checks for the presence of control data. If control data is available, the key is set in the header and the response data is set in the body of the message. Figure 7.7 depicts the flow chart of the mechanism used to detect whether the target component is a client-side or a server-side component. The result is then used to set the dataRequest parameter, which decides whether data is fetched to the client side.

      Figure 7.7: Detecting client-side and server-side components

      7.5 Components Models and Data Passing Logic

      There are two component models in ResEval Mash, depending on whether the respective business logic resides on the client or the server side: server components (SC) are implemented as RESTful web services that run on the server side; client components (CC) are implemented as JavaScript files and run on the client side. Independently of the component model, each component has a client-side component front-end, which allows (i) the Mashup Engine to enact component operations and (ii) the user to inspect the state of the mashup at runtime. All communications between components are mediated by the Mashup Engine, which internally implements a dedicated event bus for shipping data via events. Server components require interactions with their server-side business logic and the shared memory; this interaction needs to be mediated by the Mashup Engine. Client components directly interact with their client-side business logic; this interaction does not require the intervention of the Mashup Engine.

      The components consume or produce different types of data: actual data (D), configuration parameters (CP), and control data such as the request status (RS), a flag that conveys whether actual data is required in output (DR), and a key (K) identifying data items in the shared memory. All components can consume and produce actual data; yet, as we will see, producing actual data in output is not always necessary. The configuration parameters enable the setup of the components. The request status enables rendering the processing status at runtime. The key is crucial to identify data items produced by one component and to be “passed” as input to another component. As explained earlier, instead of directly passing data from one service to another, for performance reasons we use a shared memory that all services can access, and we only pass a key, i.e., a reference to the actual data, from component to component.
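
      The idea of passing a key instead of the data can be illustrated with a small Python sketch. The SharedMemory class and the two services below are hypothetical stand-ins (an in-process dictionary and two toy functions), not the platform's server-side API.

```python
class SharedMemory:
    """In-process stand-in for the server-side shared memory."""

    def __init__(self):
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]


def publications_service(memory, key, publications_by_researcher):
    """Hypothetical service A: stores its output and returns only control data."""
    memory.put(key, publications_by_researcher)
    return {"key": key, "requestStatus": "success"}


def count_service(memory, key):
    """Hypothetical service B: reads A's output through the key, not the payload."""
    publications = memory.get(key)
    counts = {name: len(items) for name, items in publications.items()}
    memory.put(key, counts)  # assumes the original data may be overwritten
    return {"key": key, "requestStatus": "success"}


memory = SharedMemory()
response_a = publications_service(memory, 1, {"r1": ["p1", "p2"], "r2": []})
response_b = count_service(memory, response_a["key"])
```

      Neither response carries the publications themselves; only the key and the request status would cross the network in this sketch.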

      Based on the flow of components in the mashup model, we can have different data passing patterns. Given the two types of components, we can recognize four possible interaction patterns, which are illustrated in Figure 7.8. All of these interactions are mediated by the Mashup Engine; hence, neither a composition designer nor a component developer needs to think about these complexities during component or composition development. In particular, we may have two types of interaction: (i) the interaction among components that are connected in the designed composition and (ii) the interaction between a component and its server-side implementation (only in the case of components of type SC). Both types of interaction are managed by the Mashup Engine. In the first type, the Mashup Engine manages the event bus used to publish the components’ events (carrying associated data) and trigger the subscribed components’ operations. In the second type, the Mashup Engine acts as a proxy for the web service operation invocation with the help of data processors. In both cases, the Mashup Engine is responsible for including the correct control data in all events and service invocations, which is crucial for the platform to work properly. In the following, we elaborate on each interaction pattern separately.

      Figure 7.8: ResEval Mash’s internal data passing logic.
      1. SC-SC interaction: As shown in Figure 7.8, both components (SC A and SC B) are of type SC. Component A is connected with component B. Since component A is the first component in the composition and does not require any input, it can start its execution immediately. It is the responsibility of the Mashup Engine to trigger the operation of component A (step 1). At this point, component A calls its back-end web service through the Mashup Engine, passing only the configuration parameters (CP) to it (2). The Mashup Engine, analyzing the composition model, knows that the next component in the flow is also a server component (component B), so it extends component A’s request by adding a key as control information, which component A’s service can use to mark the data it produces in the shared memory. Hence, the Mashup Engine invokes service A (3). Service A receives the control data, executes its own logic, and stores its output in the Shared Memory (4). Once the execution ends, service A sends back the control data (i.e., key and request status) to the Mashup Engine (5), which forwards the request status to component A (6); the engine keeps track of the key. With this, component A has completed and the engine can enable the next component (7). In the SC-SC interaction, we do not need to ship any data from the server to the client.

      2. SC-CC interaction: Once activated, component B enacts its server-side logic (8, 9, 10). The Mashup Engine detects that the next component in the flow is a client component, so it adds the DR control data parameter in addition to the key and the configuration parameters, in order to instruct the web service B to send actual output data back to the client side after it has been stored in the Shared Memory. In this way, when service B finishes its execution, it returns the control data and the actual output data of the service (i.e., key, request status and output data) to the Mashup Engine (11), which then passes the request status to component B (12) and the actual data to the next component in the mashup, i.e., component C (13).

      3. CC-CC interaction: Client component to client component interactions do not require any interaction with the server-side services. Once component C’s operation is triggered in response to the termination of component B, it is ready to start its execution and to pass component B’s output data to the JavaScript function implementing its business logic. Once component C finishes its execution, it sends its output data back to the engine (14), which is then able to start component D (15) by passing C’s output data.

      4. CC-SC interaction: After the completion of component D (16), the Mashup Engine passes the respective data to component E as input (17). At this point, component E calls its corresponding service E, passing to it the actual data and possible configuration parameters (18), along with the key appended by the Mashup Engine (19). The Output Data Request flag could also be included in the control data but, as explained, this depends on the next component in the flow, which for presentation purposes is not further defined in Figure 7.8. Eventually, service E returns its response (i.e., key and request status, plus possible output data if the DR flag is present) to the Mashup Engine (21), which then delivers it to component E (22).
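
      The four patterns above can be summarized in a short Python sketch that walks a flow of components and reports, for each connection, the pattern and whether actual data crosses the client/server boundary. The function name plan_interactions and the transfer labels are illustrative assumptions; the flow mirrors components A to E of Figure 7.8.

```python
def plan_interactions(kinds):
    """Name the pattern of each connection and the direction of data transfer.

    kinds: list of "SC" or "CC" values in flow order.
    """
    plan = []
    for source, target in zip(kinds, kinds[1:]):
        if source == "SC" and target == "SC":
            transfer = "none"            # only the key travels
        elif source == "SC" and target == "CC":
            transfer = "server->client"  # DR flag set, output data returned
        elif source == "CC" and target == "CC":
            transfer = "none"            # data stays in the browser
        else:
            transfer = "client->server"  # data posted with the service call
        plan.append((f"{source}-{target}", transfer))
    return plan


flow = ["SC", "SC", "CC", "CC", "SC"]  # components A to E
plan = plan_interactions(flow)
```

      In this sketch only two of the four connections move actual data across the network, which is the saving the control-flow layer aims at.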

      While ResEval Mash fully supports these four data passing patterns and is able to understand whether data are to be processed at the client or the server side, it has to be noted that the actual decision of where data are to be processed is up to the developer of the respective mashup component. Client components by definition require data at the client side; server components on the server side. Therefore, if large amounts of data are to be processed, a sensible design of the respective components is paramount. As a rule of thumb, we can say that data should be processed on the server side whenever possible, and component developers should use client components only when really necessary. For instance, visualization components of course require client-side data processing. Yet, if they are used as sinks in the mashup model (which is usually the case), they will have to process only the final output of the actual data processing logic, which is typically of smaller size compared to the actual data sourced from the initial data sources (e.g., a table of h-indexes vs the lists of publications by the set of the respective researchers).

      7.6 The Domain-Specific Service Ecosystem

      An innovative aspect of our mashup platform is its approach based on the concept of domain-specific components. In Section 7.3 we described the role of the component services in the architecture of the system. These are not simply generic web services, but web services that constitute a domain-specific service ecosystem, i.e., a set of services that respect shared models and conventions and that are designed to work together, each providing a building block for solving more complex problems of the specific domain. Having such an ecosystem of compatible and compliant services brings several advantages that make our tool usable in practice and able to respond to the specific requirements of the domain we are dealing with.

      Given the important role domain-specific components and services play in our platform, next we describe how they are designed and illustrate some details of their implementation and their interactions with the other parts of the system.

      A ResEval Mash component requires the definition of two main artifacts: the component descriptor (i.e., following the component definition language specifications) and the component implementation.

      The component descriptor describes the main properties of a component, which are, in brief:

      1. Operations. Functions that are triggered as a consequence of an external event, take some input data, and perform a given business logic.

      2. Events. Messages produced by the component to inform the external world of its state changes, e.g., due to interactions with the user or an operation completion. Events may carry output data.

      3. Implementation binding. A binding defining how to reach the component implementation.

      4. Configuration parameters. Parameters that, as opposed to input data, are set up at composition design time by the designer to configure the component’s behavior.

      5. Meta-data. The component’s information, such as name and natural language description of the component itself.
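
      To make the five properties concrete, the following Python sketch models a descriptor as a plain data structure. All class and field names are assumptions made for illustration; they are not the element names of the platform's component definition language.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    input_type: str


@dataclass
class Event:
    name: str
    output_type: str = ""  # events may carry output data


@dataclass
class ComponentDescriptor:
    name: str         # meta-data
    description: str  # meta-data
    binding: str      # how to reach the component implementation
    operations: list = field(default_factory=list)
    events: list = field(default_factory=list)
    config_params: dict = field(default_factory=dict)

    def starts_composition(self):
        """A component without operations has no input to wait for."""
        return not self.operations


# Hypothetical source component with one event and no operations.
source = ComponentDescriptor(
    name="Italian Researchers",
    description="Retrieves researchers matching the configured criteria.",
    binding="http://example.com/researchers",  # placeholder URL
    events=[Event("Researchers Loaded", "ResearcherList")],
    config_params={"uniId": "text"},
)
```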

      In our platform, the component descriptors are implemented as XML files, which must comply with an XML Schema Definition (XSD). The XSD defines both the schema for the component descriptors and the admitted data types. By validating the descriptor against the data type definitions, we can enforce the adoption of the common domain concept model (DCM), which enables smooth composability and removes the need for data mapping in the Composition Editor, as discussed in Section 7.2.

      Figure 7.9: The descriptor of the Italian Researchers component along with its representation in the Composition Editor

      For example, an excerpt of the Italian Researchers component descriptor along with its representation in the Composition Editor is shown in Figure 7.9. The component is implemented through a server-side web service. Its descriptor does not present any operation and it has an event called Researchers Loaded, which is used to emit the list of researchers that are retrieved by the associated back-end service. The binding between the service and its client-side counterpart is set up in the descriptor through the <request> tag. As shown, this tag includes the information needed to invoke the service, i.e., its end-point URL and the configuration parameters that must be sent along with the request. In addition, the attribute triggers specifies the event to be raised upon service completion. The attribute runsOn, instead, specifies the component’s operation that must be invoked to start the service call. In this particular case, since the component has no operations and no inputs to wait for, when the mashup is started the Mashup Engine automatically invokes the back-end service associated with the component, causing the process execution to start. If we were dealing with a component implemented via client-side JavaScript, we would not need the <request> tag, and the implementation binding would be represented by the ref attribute of the component operation or event, whose value would be the name of the JavaScript function implementing the related business logic.

      The component in Figure 7.9 has different configuration parameters, which are used to define the search criteria to be applied to retrieve the researchers. One of them is the uniId parameter. Besides the name of the related label, we must specify the renderer to be used, that is, the way in which the parameter will be represented in the Composition Editor. In this case, we are using a text input field with auto-completion. The auto-completion feature is provided by a dedicated service operation that can be reached at the address specified in the url option. Finally, the configTemplate tag is used only to set the order in which the parameters are presented in the component representation in the Composition Editor.

      The other main artifact that constitutes a ResEval Mash component is its implementation. As already discussed above, a component can be implemented in two different ways: through client-side JavaScript code (client component) or through a server-side web service (server component). The choice between a client-side and a server-side implementation depends mainly on the type of component to be created, which may be a UI component (i.e., a component the user can interact with at runtime through a graphical interface) or a service component (i.e., a component that runs a specific business logic but does not have any UI). UI components (e.g., the Bar Chart of our scenario) are always implemented through client-side JavaScript files, since they must directly interact with the browser to create and manage the graphical user interface. Service components (e.g., the Microsoft Academic Publications of our scenario), instead, can be implemented in both ways, depending on their characteristics. In the research evaluation domain, since they typically deal with large amounts of data, service components are commonly implemented through server-side web services. In this way, they are not subject to the computational power constraints of the client side and, moreover, they can exploit the platform features offered on the server side, such as the Shared Memory mechanism, which makes it possible to deal efficiently with data-intensive processes. In other cases, where there are no particular computational requirements, a service component can be implemented via client-side JavaScript, which runs directly in the browser. The JavaScript implementation, for both UI and service components, must include the functions implementing the component’s business logic.

      For example, our Italian Researchers service component is implemented on the server side since it has to deal with large amounts of data (i.e., thousands of researchers), so it belongs to the server components category (introduced in Section 7.5). To work correctly within our domain-specific platform, components of this type must be implemented as Java RESTful web services following specific implementation guidelines. In particular, the service must be able to communicate properly with the other parts of the system and, thus, it must be aware of the data passing patterns discussed before and of the shared memory. Figure 7.10 shows the interaction protocol with the other components of the platform that the service must comply with.

      Figure 7.10: Platform-specific interaction protocol each service must comply with

      The service is invoked through an HTTP POST request by the client-side Mashup Engine, performed through an asynchronous Ajax invocation (the half arrowheads in the figure represent asynchronous calls). The need to expose all the operations through HTTP POST comes from the fact that in many cases it must be possible to send complex objects as parameters to the service, which would not be possible in general using a GET request. For instance, in our example, the operation is invoked through a POST request at the URL http://.../resevalmash-api/resources/italianSource/researchers and the component’s configuration parameters (e.g., selected university or department) are posted in the request body. Besides the parameters, the body also includes control data, that is the key and the OutputDataRequired flag.

      Once the request coming from the Mashup Engine is received by the service, the service code must process it following the sequence diagram shown in Figure 7.10. If the service is designed to accept input data, it first gets the data from the Shared Memory through the API provided by the Server-Side Engine, using the received key as parameter.

      Then, the service may need access to other data for executing its core business logic. The services developed and deployed by us (as platform owners) can use the system database to persistently store their data, as shown in the second optional box. This is, for instance, the case of our Italian Researchers component, which retrieves the researchers from the system database, where the whole Italian researchers data source has been pre-loaded for efficiency reasons. Third-party services, instead, do not have access to the system database, but they can use external data sources such as external databases or online services available on the Web. Clearly, the use of the system database guarantees higher performance and avoids possible network bottlenecks.

      Once the service has retrieved all the necessary data, it starts executing its core business logic (for our example component, this consists of filtering the researchers of interest based on the configuration parameters). The results of the business logic execution are then stored in the Shared Memory. Typically, all services produce some output data, although there could be exceptions such as a service that is only designed to send emails. Finally, the service must send a response back to the Mashup Engine. The response content depends on the OutputDataRequest flag value. If it is set to false, as shown in the upper part of the alternative box in the figure, the response will contain the Key and the RequestStatus of the service (success or error). If the flag is set to true, in addition to those control data, the response will also contain the actual OutputData produced by the service logic.
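
      The request handling steps described above can be sketched in Python as follows. The platform's services are Java RESTful web services; this fragment only mirrors the order of the steps, using a dictionary as shared memory and hypothetical names (handle_request, filter_researchers, the request field names, and the sample data).

```python
def handle_request(request, shared_memory, business_logic, accepts_input=True):
    """Process one service request and build the response for the Mashup Engine."""
    key = request["key"]
    try:
        # 1. Optionally read the input data referenced by the key.
        input_data = shared_memory.get(key) if accepts_input else None
        # 2. Run the core business logic.
        output = business_logic(input_data, request.get("params", {}))
        # 3. Store the result in the shared memory under the same key.
        shared_memory[key] = output
    except Exception:
        return {"key": key, "requestStatus": "error"}
    # 4. Return control data, plus the output only if it was requested.
    response = {"key": key, "requestStatus": "success"}
    if request.get("outputDataRequest", False):
        response["outputData"] = output
    return response


# Hypothetical stand-in for the pre-loaded researchers data source.
RESEARCHERS = [
    {"name": "A", "university": "UniTN"},
    {"name": "B", "university": "Other"},
]


def filter_researchers(_input_data, params):
    return [r for r in RESEARCHERS if r["university"] == params["university"]]


memory = {}
request = {"key": 1, "params": {"university": "UniTN"}, "outputDataRequest": False}
response = handle_request(request, memory, filter_researchers, accepts_input=False)
```

      With the flag set to false the response contains only the key and the request status, while the filtered list remains available in the shared memory for the next server component.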

      So far, all components and services for ResEval Mash have been implemented by ourselves, yet the idea is to open the platform also to external developers for the development of custom components. In order to ease component development, e.g., the setup of the connection with the Shared Memory and the processing of the individual control data items, we will provide a dedicated Java interface that can be extended with the custom logic. The description, registration, and deployment of custom components is then possible via the dedicated Component Registration Interface briefly described in Section 7.3.

      7.7 ResEval Mash in Action: Various Mashup Compositions

      This section presents various mashup compositions developed using the ResEval Mash tool. The first two compositions implement the scenarios described in Chapter 4.

      7.7.1 UniTN Department Evaluation Scenario

      The department evaluation procedure used by the University of Trento (UniTN) is described in Section 4.2. According to that description, we need to fetch the UniTN researchers and the Italian researchers who belong to the same discipline as the UniTN ones. For both groups, the publications have to be retrieved from a publication data source and are then ranked based on the UniTN venue ranking scheme. Finally, the ranked publications are used to compute the impact percentile using a negative binomial distribution.

      Figure 7.11 depicts the implemented version of the UniTN department evaluation scenario using the ResEval Mash tool. The figure shows in the center (in dotted border) the original mashup, along with the configuration panels of a few important components and the final output of the mashup. In total, ten components are used to compose this procedure: seven are distinct, and the other three are additional instances of some of them (e.g., DISI researchers, Microsoft Academic, Publication Impact). The composition starts with two parallel flows. One computes the weighted publication number (the impact metric in the specific scenario) for all Italian researchers in a selected discipline sector (e.g., Computer Science). The other computes the same “impact” metric for the researchers belonging to the UniTN computer science department. The former branch defines the distribution of the Italian researchers in the Computer Science discipline sector; the latter is used to compute the impact of UniTN’s researchers and to determine their individual percentiles, which are finally visualized in a bar chart and a pie chart.

      Figure 7.11: UniTN Dept. Evaluation Mashup Composition: showing components config panels and output (anonymized) with detail description

      Most of the components used in this composition are server-side components, that is, the actual computation is performed on the server; the two exceptions are the bar chart and the pie chart. As described earlier in this chapter, a server-side implementation (i.e., using a web service) is preferable for components that manipulate large amounts of data.

      7.7.2 Italian Professorship Selection Scenario

      In Section 4.4, we described the evaluation procedure used by the National Agency for the Evaluation of Universities and Research Institutes (ANVUR) for hiring and promoting professors. The procedure states that the metrics used for the evaluation (e.g., contemporary h-index, number of articles, number of citations) must be normalized before they are compared with the provided threshold values. These values have been fixed by ANVUR as a research quality threshold for a specific area.

      Figure 7.12: Italian Professorship Selection Mashup Composition: showing components and output with detail description

      Figure 7.12 depicts the implementation of the evaluation procedure, which has been developed using ResEval Mash. The mashup composition uses seven components in total. It starts from the Google Scholar Live component, which takes one or more researchers’ names as input and crawls the Google Scholar web site at runtime to get their publications. The retrieved publication list of each researcher is then given to three components (i.e., contemporary h-index, article normalizer, and citation normalizer), which compute the normalized metrics according to the procedure defined in the original evaluation document. The metric analyzer component, in turn, takes as input all required metrics and the thresholds and determines for each researcher whether he/she qualifies. The results of this component can be displayed in a visualization component; in our case, we show them in a stepped area chart component.

      Figure 7.13: Mashup composition showing H and G -index values of DISI researchers (anonymized names)

      7.7.3 Computing and Comparing H and G -Index Values of Researchers

      To show how ResEval Mash can be used to compute various metrics, we compose a mashup that calculates the H and G -Index values of researchers who belong to the University of Trento Computer Science Department. Figure 7.13 (top) depicts the mashup composition that is developed in ResEval Mash.

      The composition starts with the DISI Researchers component, which is configured to retrieve (from a local repository) all the researchers of the Computer Science department of UniTN. The list of researchers is then passed to the next component, which in this case is the “Microsoft Academic” component. This component takes the researchers list as input and retrieves the publications of each researcher. Next, the output of the Microsoft Academic component is consumed by two components, H-Index and G-Index. These two components compute the H and G values, which are then visualized in a bar chart as shown in Figure 7.13 (bottom).
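
      For reference, the two metrics can be computed as in the following Python sketch, which uses the standard definitions: the h-index is the largest h such that h publications have at least h citations each, and the g-index is the largest g such that the g most cited publications together have at least g² citations. The sketch assumes that g is bounded by the number of publications; the exact variant implemented by the ResEval Mash components is not specified here.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g publications total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    g, total = 0, 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Citation counts of one hypothetical researcher's publications.
counts = [10, 8, 5, 4, 3]
h_value = h_index(counts)
g_value = g_index(counts)
```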

      Figure 7.14: Mashup composition showing citation and self-citation comparison for a given list of researchers (names anonymized)

      7.7.4 Comparison of Citations and Self-Citations

      Figure 7.14 depicts a mashup composition that can be used to compare the citations and self-citations of one or more researchers. The task is achieved using three components in ResEval Mash. The first component (i.e., Microsoft Academic), given one or more researchers, retrieves their publications from the Microsoft Academic source. The next component, Citation & Self Citation, takes the publications as input and determines the self-citation count of each researcher. The self-citation count is determined by parsing all the publications of a researcher and checking whether the researcher under investigation appears among the authors of any publication that cites one of his/her publications. Finally, we use a bar chart component to show the results.
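
      The counting rule can be sketched in Python as follows. The data layout (each publication carrying a cited_by list whose entries have their own authors) and the function name are assumptions made for illustration, and the sketch presumes that author names have already been disambiguated.

```python
def citation_counts(researcher, publications):
    """Return (citations, self_citations) for one researcher.

    publications: list of dicts with a "cited_by" list; each citing
    publication is a dict with an "authors" list.
    """
    total = 0
    self_citations = 0
    for publication in publications:
        for citing in publication["cited_by"]:
            total += 1
            # A self-citation: the researcher co-authored the citing work.
            if researcher in citing["authors"]:
                self_citations += 1
    return total, self_citations


# Hypothetical input for a researcher named "R".
publications = [
    {"title": "P1", "cited_by": [{"authors": ["R", "X"]}, {"authors": ["Y"]}]},
    {"title": "P2", "cited_by": [{"authors": ["R"]}]},
    {"title": "P3", "cited_by": []},
]
total, self_total = citation_counts("R", publications)
```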

      This section presented a few example mashup compositions that are used for different evaluation tasks. The ResEval Mash tool (http://open.reseval.org) can do more; what it can support now depends on the availability of new components, which users can combine as they see fit. In the next chapter we present the user studies that we conducted to test whether the tool is really useful for non-technical users. The studies also investigate various other aspects of the usability of a mashup tool.

      Chapter 8 User Studies and Evaluation

      8.1 Overview

      We conducted a few user studies to evaluate different aspects of the work presented in this thesis: whether a domain-specific tool is preferable for end users and, if so, how expressive the tool should be, that is, what level of flexibility a mashup tool should offer so that the corresponding complexity stays within the reach of non-technical users. The first study focused mainly on the usability evaluation of our mashup tool and was also partially used for a comparative analysis between our tool and another mashup tool. To understand the users’ preference between domain-specific and generic mashup tools, we used Yahoo! Pipes as an example of a generic tool. Moreover, to determine what level of complexity a non-technical domain expert can deal with when a domain-specific mashup tool is preferred, we built four different prototypes. Each prototype has a different level of complexity, which depends on the flexibility and customization features that it provides.

      After the first user study, we addressed and incorporated the suggestions, feedback, and new requirements gathered during that study. The improvements and changes made our mashup tool more usable and useful. To validate the usability of the ResEval Mash tool again, we conducted a second user study that tested relatively advanced features specifically related to the usability of the tool. The next section elaborates on the first user study, and Section 8.3 presents the details of the second one. Finally, we conclude this chapter with analysis and discussion.

      8.2 Comparative & Usability Evaluation: User Study-1

      Generic and specific tools differ at their roots: the former cover a broad range of uses, whereas the latter stay focused and can be seen as a specialized form of what the former offer. For instance, asking someone to build a computer application is different from asking for an accounting application. Likewise, building a generic or a specific application depends on the constructs provided to the developer; we believe that, especially for non-technical users, these constructs are more valuable if they are aligned with the users’ level of domain expertise. Moreover, what a development environment conveys matters from an end-user point of view, for example whether the language that an application speaks is understandable by the end users. In order to assess all these aspects, we designed our first user evaluation experiment as described in the next section.

      8.2.1 Task Design

      To evaluate the different aspects related to the usability of our domain-specific tool as well as to determine users’ preferences, we performed contextual interviews with 28 users. Among them, 7 were professors, 5 administrative staff, 1 post-doc, 3 PhD students, and 12 master’s level students. The participants had different levels of technical skills, which were determined by asking the following questions:

      • What is the user skill level with tools such as MS Excel, MS Word etc?

      • Is the user aware of the meaning of web service?

      • Is the user able to draw/understand a (simple) process following a given graphical notation (e.g., flowchart)?

      In addition to the above questions, the participants were asked to state (in a text box) whether they program computer applications or are involved in programming tasks, and to give other details related to their technical and domain skills. From the recorded answers, we can say that all of the users were domain experts (i.e., they knew research evaluation and were involved in some kind of evaluation task), except the master’s-level students, who had lower domain expertise than the other participants. Among all the users, the 5 administrative staff were highly expert in the domain, and most of them were directly involved in the research evaluation task that we used during the study. Table 8.1 presents the details of all the participants with their technical and domain skills.

      Users  Position       Technical Skills                 Domain Skills
      7      professor      4 (good skilled), 3 (moderate)   very skilled
      5      administrator  4 (moderate), 1 (very skilled)   very skilled
      1      post-doc       good skilled                     very skilled
      3      phd            good skilled                     good skilled
      12     ms student     moderate                         moderate
      Table 8.1: User Details

      We used Yahoo! Pipes, one of the best-known mashup tools, as an example of a generic tool and our ResEval Mash tool as a domain-specific one. To better understand the appropriate level of expressive power, flexibility, and difficulty that our tool (i.e., ResEval Mash) offers, we developed four separate prototypes of the tool based on our selected scenario; each prototype offered a different level of flexibility and, consequently, complexity.

      Figure 8.1: Prototype-1: fixed components with fixed configuration options

      8.2.2 Evaluation Procedure

      First, the participants were asked about their technical and domain skills (to assign them to the appropriate user category, as mentioned in the previous section). We then instructed them to perform the following tasks in steps, providing help only upon explicit request:

      1. In the first step, we introduced prototype 1, as depicted in Figure 8.1. This prototype consists of an explanation of the scenario and a pictorial representation of it, in which different components are connected to form a composition. On the same page we provided a button to start the execution of the process. That is, the mashup was pre-built (by us, acting as the developer) and could only be executed by the participants. The participants were only allowed to click the start button to execute the process, and the final results were shown to them on the same page. The provided composition did not allow the participants to interact with the components in any way, restricting them to control over process execution (i.e., start or stop).

      2. In the next step, prototype 2 was presented, as depicted in Figure 8.2. This time the participants were presented with the same scenario but with configurable components; as can be seen in the figure, the components’ configuration panels are open. Participants could thus configure components through these panels, that is, set user-defined values for various parameters (e.g., filtering options, date ranges). Once the components were configured, the participants could start the process execution using a start button.

      3. In this step, prototype 3 was presented to the participants, as depicted in Figure 8.3. The participants were now allowed to change components in the mashup model by choosing among different implementations of the same component class, again through configuration panels. For example, a Microsoft Academic Search (MAS) data source can be replaced with a DBLP data source. The possibility to change the configuration parameters and to substitute components gives the participants more flexibility to tune the scenario according to their needs.

      4. During this step, a fully functional mashup composition environment (i.e., ResEval Mash) was presented to the participants. This tool makes it possible to drag and drop components onto a composition canvas, fill in their configuration parameters, connect them together and execute the composition, hence giving the participants maximum flexibility so they can be as expressive as they want. The participants were allowed to use all the features of the ResEval Mash tool, as they were asked to build a mashup composition based on the scenario they had experienced during the previous steps.

      5. Finally, we presented the Yahoo! Pipes tool, a popular generic mashup tool, to the participants. An example pipe, i.e., a composition, was shown to the participants as a short tutorial to introduce the tool. Then, participants were asked to imagine how they would implement our specific scenario in Pipes and to implement it to whatever level they could reach.
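
The first four steps expose progressively more capabilities to the participant. As a minimal illustrative sketch (not part of the ResEval Mash implementation), the four domain-specific prototypes can be modeled as cumulative capability sets. The names `Capability`, `PROTOTYPES` and `added_in` are hypothetical and introduced here only for illustration. Yahoo! Pipes (step 5) is left out because it is a generic tool rather than a further level of the same domain-specific tool.

```python
from enum import Enum, auto


class Capability(Enum):
    """Hypothetical labels for what a participant may do in a step."""
    EXECUTE = auto()          # start/stop the pre-built process
    CONFIGURE = auto()        # set parameters in the configuration panels
    SWAP_COMPONENTS = auto()  # e.g., replace a MAS source with a DBLP source
    COMPOSE = auto()          # drag-and-drop and connect components freely


# Capabilities offered by prototypes 1-4, as described in steps 1-4 above.
PROTOTYPES = {
    1: {Capability.EXECUTE},
    2: {Capability.EXECUTE, Capability.CONFIGURE},
    3: {Capability.EXECUTE, Capability.CONFIGURE, Capability.SWAP_COMPONENTS},
    4: set(Capability),
}


def added_in(step):
    """Return the capabilities a prototype adds over the previous one."""
    previous = PROTOTYPES.get(step - 1, set())
    return PROTOTYPES[step] - previous


for step in sorted(PROTOTYPES):
    names = sorted(c.name for c in added_in(step))
    print(f"prototype {step} adds: {', '.join(names)}")
```

Each prototype's set strictly contains the previous one, which mirrors the design of the study: every step adds exactly one kind of freedom, so changes in the participants' answers between consecutive steps can be attributed to that addition.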

      Figure 8.2: Prototype-2: a more customizable approach, where the user is allowed to configure the components

      8.2.3 Questionnaires

      After each step of the procedure described in the previous section, the participants were presented with a questionnaire. The questions covered various aspects, such as the difficulties they encountered, their level of understanding, and their suggestions for improvement. The questions are listed below; the input controls used for recording the answers are given in parentheses:

      1. What is your opinion about the difficulty level of this task? (a set of six radio-option buttons ranging from extremely difficult to extremely easy were presented)

      2. What are the main difficulties you encountered, and why? (a multi-line text box)

      3. What are the advantages of this step as compared to the manual/previous approach? (a multi-line text box)

      4. What do you think are the disadvantages of this step as compared to the manual/previous step approach? (a multi-line text box)

      5. Do you think the increased flexibility of the tool with respect to the previous step would be useful for you to adapt the process to your specific needs? (ten radio-option buttons ranging from extremely efficient to extremely inefficient)

      6. Do you understand how the process executed behind the scene? (ten radio-option buttons ranging from easily understandable to not understandable at all)

      7. Do you feel comfortable having such control over the process execution or you would like to have a clearer idea of what is going on? (ten radio-option buttons ranging from extremely comfortable to extremely uncomfortable)

      8. Based on your experience during this step, would you prefer to do this task by yourself using a similar tool or you would prefer explaining and asking a technician to implement it for you? (two options were presented: (i) I’d like to do by myself (ii) I’d like to ask technician)

      9. Do you have specific suggestions/requirements to improve usability/usefulness of this tool? (a multi-line text box)

      10. Are you happy about the flexibility given by the current tool or you would like to be able to change something to adapt the tool to be used for solving other similar problem? (a multi-line text box)

      11. In your opinion what is the difference between previous and this step? (a multi-line text box)

      The participants completed the above questionnaires by themselves after each step; help was provided if a participant asked for it. Finally, after the fifth step described in Section 8.2.2, in which we presented the Yahoo! Pipes tool, we asked the participants a set of general questions to obtain an overall comparison of all the tools with respect to the flexibility, usefulness and complexity they offer. These questions are presented below, with the answering options in parentheses. While answering, the participants were allowed to ask for assistance if they had difficulty understanding a question. The final questions were as follows:

      1. What do you think is the complexity of computing a research evaluation metric manually? (ten radio-option buttons ranging from extremely complex to extremely easy)

      2. Now, you have seen all the different tools (steps), how would you judge them? (for all the five steps, we presented three radio-option buttons with flexible, useful and complex as options.)

      3. Which step/tool would you consider closer to your needs considering both simplicity and flexibility among 5 different tools and why? (a multi-line text box)

      4. Would you use this tool in your real life? (yes/no)

      Figure 8.3: Prototype-3: showing a more flexible and customizable tool to the users

      8.2.4 Results

      The objective of the study was to collect feedback on two main questions. First, are the participants indeed more comfortable with domain-specific mashup tools than with general-purpose tools like Yahoo! Pipes? Second, if domain-specific tools are preferred, which is the right tradeoff between flexibility and complexity, i.e., which of the tools 1-4 is most effective?

      Figure 8.4: Results of user study-1, prototype-1

      Results of the first step

      In the first step we presented the prototype that only allows the participants to execute the process, leaving configuration and component modification options fixed (i.e., hard-coded). The results of the quantifiable questions (questions 1, 5, 6 and 7), whose answers can be presented in a chart, are depicted in Figure 8.4. The figure shows four charts. Chart-1 shows that most of the participants found the first step extremely easy to perform, as the prototype used in this step had the lowest complexity: the process is executed with a single button click. As chart-2 depicts, most of the participants considered the approach efficient when asked whether they felt it saves time compared to the other tools or to manual effort. However, chart-3 shows an even split of opinions when the participants were asked whether they understood how the process was executed behind the scenes. As no execution status was conveyed, no progress bar was shown, and the participants had no idea how data was flowing, many of them remained in the dark and reported that they did not understand it. Chart-4 shows that most of the participants felt uncomfortable with the level of control that the tool offers, which suggests that they need more control over the execution of the process.

      Regarding the remaining questions (i.e., 2, 3, 4, 8, 9, 10 and 11; as presented in Section 8.2.3), whose answers were collected in textual format, we show here some important responses. For the second question, the main responses were “not flexible, but simple”, “not difficult”, and “I can’t use for my daily job, as I need to deal with loads of variations”. For the third question, typical answers were “easier, fast, efficient”, “not difficult at all”, “much faster than the manual approach” etc. Regarding the fourth question, users responded with “you can not check whats going on behind the scene”, “unskilling people”, “all data are registered with more accuracy and speed” etc. For the eighth question, 90% of the participants said “I’d like to do by myself”, whereas 10% said “I’d like to ask technician”. In response to the ninth question, which asked for users’ suggestions, the main answers were “having an interface to change the parameters that you need to look for”, “need much more customization”, “I’d like to make more choices” etc. Similarly, in response to the tenth question, participants mainly answered “I’d like to change configurations”, “need more flexibility”, “there is no flexibility” etc.

      Results of the second step

      For the second step, described in Section 8.2.2, Figure 8.5 depicts the charts for questions 1, 5, 6 and 7. Chart-1 shows that the participants again found the step-2 task (i.e., prototype-2) easy. As shown in chart-2, most of the participants liked the increased flexibility of the tool. Chart-3 shows an almost equal division in how well the participants felt they understood the execution of the process. However, most of the participants were still not comfortable using this tool and demanded more control over the process, as depicted in chart-4. Regarding the questions with textual answers (i.e., 2, 3, 4, 8, 9, 10 and 11; as presented in Section 8.2.3), the main responses to the second question were “more flexible than the previous but not flexible to change components”, “understanding configuration parameters” etc. For the third question, responses were “parameteric approach is better than the previous one”, “its flexible”, “you can choose different parameters”, “auto-completion is perfect for me” etc. The fourth question received responses like “still no idea what’s behind the process”, “still not able to change the execution flow”, “not always reliable” etc.

      Figure 8.5: Results of user study-1, prototype-2

      In response to the eighth question, 90% said “I’d like to do by myself” and 10% said “I’d like to ask technician”. For the ninth question, participants gave suggestions such as “if you provide me right choice of component then I can do myself”, “it should provide the ability to check input parameters”, “details about what kind of data are being used”, “better than one button” etc. The tenth question received responses like “more flexibility would be good for me”, “I’d like to change the flow of execution”, “more options are needed” etc. In response to the eleventh question, which was not part of the first step, the participants responded with “more customization of the search”, “i have more flexibility”, “more flexibility” etc.

      Results of the third step

      For the third step, described in Section 8.2.2, Figure 8.6 depicts the results for the four quantifiable questions (i.e., 1, 5, 6 and 7). As shown in chart-1, the majority of the participants still felt that the difficulty of this step was manageable and hence easy to handle. Likewise, most of the participants still considered the increased flexibility an efficient approach, as shown in chart-2. However, chart-3, which conveys how well the participants understood the execution of the process, shows that the process execution was still hard to understand for many. Chart-4 shows that many of the participants remained uncomfortable, which means they still demanded more control over the process. As for the textual answers, the main responses to the second question were “no difficulties”, “more details about the sources would be good”, “more documentation”, “none” etc. For the third question, the main responses were “more clear, more accurate”, “more flexible than the previous one”, “presentation styles changing is good” etc.; for the fourth question, they were “provide more choices”, “its more error prone”, “none” etc. In response to the eighth question, 93% said “I’d like to do by myself” and 7% said “I’d like to ask technician”.

      Figure 8.6: Results of user study-1, prototype-3

      As for the ninth question, the participants made suggestions such as “having more visual charts would be more useful”, “having more sources would be good”, “check configuration parameter validity” etc. Regarding the tenth question, the main responses were “yes I’m happy, but if you provide me more flexibility I will be more happy”, “I’d like to alter the execution flow”, “I’d like to change flow”, “yes” etc. For the eleventh question, the participants distinguished this tool from the previous one with responses such as “more flexibility”, “this is better”, “you can choose other databases” etc.

      Results of the fourth step

      As stated in Section 8.2.2, in the fourth step we presented to the participants the fully functional tool (i.e., ResEval Mash). The responses to questions 1, 5, 6 and 7 are depicted in Figure 8.7. Chart-1 clearly shows that this prototype slightly increased the difficulty level, but the majority still rated it as easy to use. The increased flexibility of the tool was mainly perceived positively, and the majority considered it an efficient approach, as depicted in chart-2. Furthermore, the participants understood the process execution of this prototype much better than that of all the previous ones, as depicted in chart-3. Finally, as depicted in chart-4, the majority of the participants felt comfortable with the control that prototype-4 provided.

      Figure 8.7: Results of user study-1, prototype-4

      Regarding the textual questions (2, 3, 4, 8, 9, 10 and 11; as presented in Section 8.2.3), the main responses to the second question were “no”, “the more you are free in changing configuring component, the more you are risky”, “need to know components”, “documentation of the components required” etc. For the third question, the users mainly responded “Its all advantages and not disadvantages”, “the more you can personalize the more you feel comfortable”, “i can customize”, “I can better adopt my needs”, “I can reuse compositions, components” etc. For the fourth question, responses included “more degree of freedom so there could be more chances of error”, “more knowledge is required”, “takes time to understand” etc. In response to the eighth question, 95% of the participants said “I’d like to do by myself”, whereas 5% said “I’d like to ask technician”. The main responses to the ninth question were “some training course for the user would be good”, “add more components”, “give suggestion/assistance to the user during composition”, “check intermediate results” etc. The tenth question received responses such as “I am fine, not more than this”, “I think its enough”, “I don’t need more details than this because then we go into the programmers world”, “absolutely happy”, “I’m happy with the approach”, “yes, happy” etc. Finally, for the eleventh question, the participants’ responses included “interesting”, “more configurable and more useful”, “need more skills”, “more creative and more options”.

      Figure 8.8: Results of user study-1, general results

      Results of the fifth step

      As mentioned in Section 8.2.3, at the end of the study we asked the participants a set of general questions. Figure 8.8 depicts the results of the most important one, namely how users judge the different tools in terms of flexibility, usefulness and complexity. Prototype-1 was rated as only slightly flexible, more useful, and only slightly complex, whereas prototype-2 was rated as a bit more flexible, highly useful, and again only slightly complex. Prototype-3 was considered more flexible than useful and still only slightly complex. For prototype 4, the full mashup environment (i.e., ResEval Mash), the figure clearly shows that flexibility increased considerably, along with usefulness. The participants found it more useful and flexible because the tool gives them the freedom to realize their requirements as they want. Compared to the previous three prototypes, the complexity of prototype 4 increased, which is to be expected; indeed, most of the participants stated in their remarks that they would need introductory training to fully exploit its advantages. The same figure also shows the results for the Yahoo! Pipes tool (the last chart). As anticipated, it was rated as highly complex and of very low usefulness for the users; it is fairly flexible, as it offers more options to play with, but it is mainly suitable for programmers.

      8.2.5 Evaluation Analysis & Discussion

      In the previous section we presented the results, which directly reflect the participants’ responses. This section provides an analysis of the overall study, in which we analyze inter- as well as intra-step variability and patterns, focusing in particular on the participants’ skills (e.g., technical, non-technical). The technical expertise of the participants is reported in Table 8.1, which shows that 19 out of 28 participants have moderate/low technical skills and 9 have good technical skills. As mentioned earlier, users with good technical expertise are familiar with programming languages and were involved in some sort of programming. Users with low technical expertise, on the other hand, are not programmers. Based on this, we can divide the users into two groups: a technical group (i.e., those who know web services, programming, etc.) and a non-technical group (i.e., those who know MS Word, Excel etc. but do not know programming).
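
The 9/19 split can be re-derived from Table 8.1. The following is a minimal sketch under one stated assumption: participants whose technical skill is reported as "good skilled" or "very skilled" are counted as technical, and those reported as "moderate" as non-technical. The names `TABLE_8_1`, `TECHNICAL_LEVELS` and `split_groups` are hypothetical and serve only this illustration.

```python
# Rows taken from Table 8.1: (position, technical-skill level, participants).
TABLE_8_1 = [
    ("professor", "good skilled", 4),
    ("professor", "moderate", 3),
    ("administrator", "moderate", 4),
    ("administrator", "very skilled", 1),
    ("post-doc", "good skilled", 1),
    ("phd", "good skilled", 3),
    ("ms student", "moderate", 12),
]

# Assumption: these two levels define the technical group.
TECHNICAL_LEVELS = {"good skilled", "very skilled"}


def split_groups(rows):
    """Count participants in the technical and non-technical groups."""
    counts = {"technical": 0, "non-technical": 0}
    for _position, level, participants in rows:
        group = "technical" if level in TECHNICAL_LEVELS else "non-technical"
        counts[group] += participants
    return counts


groups = split_groups(TABLE_8_1)
print(groups)  # {'technical': 9, 'non-technical': 19}
```

Under this assumption the counts agree with the figures quoted above (9 technical and 19 non-technical participants, 28 in total).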

      Figure 8.9: For both tech and non-tech groups the difficulty level of steps (1-4)

      Regarding question 1, as presented in Section 8.2.3, across the four steps described in Section 8.2.2 the difficulty level increased slightly for the non-technical group as compared to the technical one. Figure 8.9 depicts the distribution of both groups across the various difficulty levels. The non-technical group faced difficulties as the process started providing more flexibility and customization (see the non-tech columns of steps 3 and 4). However, even in step 4, in which prototype-4 was presented, the majority of the non-technical participants still considered the tool easy to use, and only a few considered it “slightly difficult”. In response to the following question during step 4 (i.e., question 2, where participants were asked to describe in text the difficulties they faced), most of the participants asked for more training and tutorials prior to using the tool, in order to deal effectively with the difficulty.

      Figure 8.10: For both tech and non-tech groups, how increased flexibility perceived for all steps (1-4)

      Figure 8.10 depicts the distribution of both the technical and non-technical groups in terms of process adaptation with respect to increased flexibility, for all four steps. For all the steps the “moderately efficient” rating is consistently present, and it increases to some extent in the fourth step. The technical group considered prototype-4 more efficient than the non-technical group did. Even so, the majority of the non-technical participants voted for the efficient option, except for one participant who considered it slightly inefficient. Figure 8.11 depicts the level of understandability for both groups. A very low understandability level can be seen in steps 1 and 2, and a slight increase is visible in step 3, but the majority still could not easily understand how the process execution is performed. In step 4, however, both the non-technical and technical groups showed a good understanding of the process execution. Technical users naturally have an advantage over non-technical users owing to their technical skills, which is why the process execution in step 4 was more easily understandable for the former group.

      Figure 8.11: For both tech and non-tech groups, process execution understandability for all steps (1-4)

      The accumulated results of both groups (i.e., technical and non-technical) for question 7, which captures the participants’ comfort with the given control over the process, are depicted in Figure 8.12. Demand for more control over the process clearly persists up to step 3. In step 4, however, the majority of the participants of both groups felt comfortable with the given control; that is, they felt they could tailor the process as they wanted, up to the level of their expertise. The remaining demand for more control came mainly from technical participants, a small number of whom still wanted to go beyond the flexibility that prototype-4 provides. Non-technical participants, in contrast, largely considered prototype-4 a boundary line: many of them responded that beyond it the complexity would increase.

      Figure 8.12: For both tech and non-tech groups, control over process results for all steps (1-4)

      Overall, for non-technical participants the difficulty level increased with each richer prototype; however, most of the non-technical participants still found prototype-4 easy to use, as shown in chart (a) in figure