Predicting global usages of resources
endowed with local policies
Research supported by the Italian PRIN Project “SOFT”, FET Project “ASCENS” and Autonomous Region of Sardinia Project “TESLA”.
The effective usage of computational resources is a primary concern of modern distributed applications. In this paper, we present a methodology to reason about resource usages (acquisition, release, revision, …), which makes it possible to predict bad usages of resources. Keeping in mind the interplay between local and global information occurring in the application-resource interactions, we model resources as entities with local policies and global properties governing the overall interactions. Formally, our model takes the shape of an extension of the π-calculus with primitives to manage resources. We develop a Control Flow Analysis computing a static approximation of process behaviour and therefore of the resource usages.
Evolutionary programming paradigms for distributed systems changed the way computational resources are integrated into applications. Resources are usually geographically distributed and have their own states, costs and access mechanisms. Moreover, resources are neither created nor destroyed by applications, but directly acquired on-the-fly when needed from suitable resource rental services. Clearly, resource acquisition is subject to availability and requires an agreement between client requirements and service guarantees (Service Level Agreement – SLA). The dynamic acquisition of resources increases the complexity of software, since the capability of adapting behaviour strictly depends on resource availability. Ubiquitous computing and Cloud computing [9, 17, 3] provide illustrative examples of a new generation of applications where resource awareness has been a major concern.
The design of suitable mechanisms to control the distributed acquisition and ownership of computational resources is therefore a great challenge. Understanding the foundations of the distributed management of resources could support state-of-the-art advances of programming language constructs, algorithms and reasoning techniques for resource-aware programming. In the last few years, the problem of providing the mathematical basis for the mechanisms that support resource acquisition and usage has been tackled by several authors (see e.g. [4, 8, 14, 12, 16], to cite only a few).
Here we consider a programming model where processes and resources are distinguished entities. Resources are computational entities having their own life-cycle. Resources can range from computational infrastructures, storage and data services to special-purpose devices. Processes dynamically acquire the required resources when available, but they cannot create any resource. This simple programming model abstracts the features of several interesting distributed applications. As an example, let us consider a cloud system offering computing resources. The available resources are the CPU units of a given power, and processes can only acquire CPU time, when available, to run some specialised code. Similar considerations apply to storage services, where client processes can only acquire slots of the available storage. In our programming model, the deployed resources can be dynamically reconfigured to deal with resource upgrades, resource unavailability, security intrusions and failures. A distinguishing feature of our approach is that the reconfiguration steps updating the structure of the available resources are not under the control of client processes.
In this paper, we introduce the formal basis of our programming model. Specifically, we introduce a process calculus with explicit primitives for the distributed ownership of resources. In our calculus, resources are not statically granted to processes, but are dynamically acquired on-the-fly when they are needed.
We start from the π-calculus and extend it with primitives to represent resources and the operations to acquire and release resources on demand. Central to our approach is the identification of an abstract notion of resource. In our model, resources are stateful entities available in the network environment where processes live. Specifically, a resource is described through the declaration of its interaction endpoint (the resource name), its local state and its global properties. Global properties establish and enforce the SLA to be satisfied by any interaction the resource engages in with its client process. The global interaction properties can be expressed by means of a suitable resource-aware logic in the style of , or contract-based logic as in [11, 5]. The interplay between local and global information occurring in the process-resource interactions motivates the adjective G-Local given to our extension of the π-calculus.
Since we build over the π-calculus, name-passing is the basic communication mechanism among processes. Beyond exchanging channel names, processes can pass resource names as well. Resource acquisition is instead based on a different abstraction. In order to acquire the ownership of a certain resource, a process issues a suitable request. Such a request is routed in the network environment to the resource. The resource is granted only if it is available. In other words, the process-resource interaction paradigm adheres to the publish-subscribe model: resources act as publishers while processes act as subscribers. Notice that processes issue their requests without being aware of the availability of the resources. When they have completed their task on the acquired resource, they release it and make it available for new requests. The two-stage nature of the publish-subscribe paradigm relaxes the inter-dependencies among computational components, thus achieving a high degree of loose coupling among processes and resources. In this sense our model also resembles tuple-based systems . Consequently, our model seems to be particularly suitable to manage distributed systems where the set of published resources is subject to frequent changes and dynamic reconfigurations.
To summarise, our approach combines the basic features of the π-calculus (i.e. dynamic communication topology of processes via name passing) with the publish-subscribe paradigm for the distributed acquisition of resources. This is our first contribution. The interplay between local and global views is also one of the novel features of our proposal. A second contribution consists in the development of a Control Flow Analysis (CFA) for our calculus. The analysis computes a safe approximation of resource usages. Hence, it can be used to statically check whether or not the global properties of resource usages are respected by process interactions. In particular, it helps detect bad usages of resources due to policy violations. This indicates the sensitive points in the code that need dynamic checks in order to avoid policy violations.
Related Work. The primitives for resource management make it easy to specify a wide range of resource behaviours of distributed systems such as Cloud Computing and Ubiquitous Computing. We believe that our approach also leverages analysis techniques such as CFA and behavioural types. A simplified version of the G-Local π-calculus has been presented in . The work presented here differs in several ways from the previous one. The version of the calculus we consider in this paper is more expressive than the one presented in  since here processes can pass resource names around. This feature was not allowed in . Also, the management of resource acquisition and release is much more powerful.
In  an extension of the π-calculus is proposed to statically verify resource usages. Our notion of global usages is inspired by this work. The π-calculus dialect of  provides a general framework for checking resource usages in distributed systems. In this approach private names are extended to resources, i.e. names with a set of traces to define control over resources. Also, resource request and resource release are simulated through communicating private names and structural rules respectively. This gives a shared semantics of resources, i.e. several processes can have concurrent access to resources (by communicating private names). In our approach, when a process obtains a resource, it has exclusive access to it. Furthermore, resource entities can be dynamically reconfigured, while this is not the case in .
In , resources form a monoid and the evolution of processes and resources happens in an SCCS style. In our approach, resources are independent stateful entities equipped with their own global interaction usage policy. A dialect of the π-calculus, where resources are abstractly represented via names and can be allocated or de-allocated, has been introduced in . In this approach reconfiguration steps are internalized inside processes via the operations for allocating and de-allocating channels. A type system capturing safe reconfigurations over channels has been introduced. In our approach resources are more structured than channels and their reconfiguration steps are not under the control of processes. Finally, the work presented in [DBLP:conf/esop/BuscemiM07] mainly focuses on specifying SLA by describing resources as suitable constraints. Our approach can exploit constraints to express global resource usages as well.
2 The G-Local π-Calculus
We consider the monadic version of the π-calculus extended with suitable primitives to declare, access and dispose of resources. The syntax is displayed in Fig. 1. Here, is a set of channel names (ranged over by ), is a set of resource names (ranged over by ) and is a set of actions (ranged over by ) for running over resources. We assume that these sets are pairwise disjoint. From now on, for the sake of simplicity, we often omit the trailing 0.
The input prefix binds the name (either a channel or a resource) within the process , while the output prefix sends the name along channel and then continues as . Note that resource names can be communicated; however, they cannot be used as private names or as channels. As usual, input prefixes and restrictions act as binders. The meaning of the remaining operators is standard. The notions of names , free names , bound names and substitution are defined as expected.
Our extension introduces resource-aware constructs in the π-calculus. The access prefix models the invocation of the operation over the resource bound to the variable . Traces, denoted by , are finite sequences of events. A usage policy is a set of traces. The release prefix describes the operation of releasing the ownership of the resource . In our programming model, resources are viewed as stateful entities, equipped with policies constraining their usages. More precisely, a resource is a triple , where is a resource name, is the associated policy and is a state ( denotes the empty state). Policies specify the required properties on resource usages. Policies are usually defined by means of a resource-aware logic (see [4, 5, 10, 11]), while states keep track of the sequence of actions performed on resources, by means of (an abstraction of) execution traces.
For instance, in , the policies are expressed in terms of automata over an infinite alphabet, where automata steps correspond to actions on resources and final states indicate policy violations.
To cope with resource-awareness, we introduce two primitives managing resource boundaries: resource joint point and resource request point . Intuitively, process when plugged inside the resource boundary can fire actions acting over the resource . The state is updated at each action according to the required policy . A resource request point represents a process asking for the resource . Only if the request is fulfilled, i.e. the required resource is available, the process can enter the required resource boundary and can use the resource , provided that the policy is satisfied. Processes of the form represent available resources. These processes are idle: they cannot perform any operation. In other words, resources can only react to requests.
To illustrate the main features of the calculus, we consider a small example, which describes a workshop with two hammers and one mallet. Tools are modelled as resource entities: and , with the policy (, resp.) that one can only make hard hits (soft hits, resp.) when using (, resp.). We model workers as a replicated process, whose instantiations take a hammer or a mallet to do jobs, whose chain is described by . Job arrivals are modelled as sending/receiving and on the channels . Furthermore, we assume that there are two types of jobs, hard jobs on the channel and soft jobs on the channel , which get done by and actions respectively.
The initial configuration of the workshop is given below. Resources ( and ) have empty traces. Note that we have two resources of the same name , which corresponds to the number of available hammers in the workshop. Intuitively, it means that only two jobs, which use hammers, can be concurrently done. We have a sequence of four jobs described by the process .
The operational semantics of our calculus is defined by the transition relation given in Tab. 1. Labels for transitions are for silent actions, for free input, for free output, for bound output, , and (, and , resp.) for closed, open and faulty access or release actions over resource . The effect of bound output is to extrude the sent name from the initial scope to the external environment.
We assume a notion of structural congruence and we denote it by . This includes the standard laws of the π-calculus, such as the monoidal laws for the parallel composition and the choice operator. To simplify the definition of our Control Flow Analysis, we impose a discipline on the choice of fresh names, and therefore on alpha-conversion. Indeed, the result of analysing a process must still hold for all its derivative processes , including all the processes obtained from by alpha-conversion. In particular, the CFA uses the names and the variables occurring in . If they were changed by the dynamic evolution, the analysis values would become a sort of dangling reference, no longer connected with the actual values. To statically maintain the identity of values and variables, we partition all the names used by a process into finitely many equivalence classes. We denote with the equivalence class of the name , which is called the canonical name of . In order not to further overload our notation, we simply write for , when unambiguous. We further demand that two names can be alpha-renamed only when they have the same canonical name.
In addition, we introduce specific laws for managing the resource-aware constructs, reported in Fig. 2. If two processes and are equivalent, then also and when plugged inside the same resource boundaries are. Resource request and resource joint points can be swapped with the restriction boundary since restriction is not applied to resource names but only to channel names. The last law is crucial for managing the discharge of resources. This law allows rearrangements of available resources, e.g. an available resource is allowed to enter or escape within a resource boundary.
The rules , , , , , , and are the standard π-calculus ones. The rule describes actions of processes, e.g. the silent action, free input and free output. Concretely, sends the name along the channel and then behaves like , while receives a name via the channel , to which is bound, and then behaves like . We only observe that our semantics is a late one, i.e. is actually bound to a value when a communication occurs. Finally, performs the silent action and then behaves like .
The rule expresses the parallel computation of processes, while the rule represents a choice among alternatives. The rule is used to communicate free names. The rules and are rules for restriction. The first ensures that an action of is also an action of , provided that the restricted name is not in the action. In the case of in the action, the rule transforms a free output action into a bound output action , which basically expresses opening scope of a bound name. The rule describes communication of bound names, which also closes the scope of a bound name in communication.
We are now ready to comment on the semantic rules corresponding to the treatment of resources. The rule models a process that tries to perform an action (, resp.) on the resource . This attempt is seen as an open action, denoted by the label (, resp.).
Intuitively, if the process is inside the scope of (see the rule ), and the action satisfies the policy for , then the attempt will be successful and the corresponding action will be denoted by the label (see the rule ). If this is not the case, the process is stuck. The same holds if the process tries to release a resource with the action .
We introduce the rule to model the communication of resource names between processes.
When a resource is available, then it can be acquired by a process that enters the corresponding resource boundary , as stated by the rule .
Symmetrically, according to the rule , the process can release an acquired resource and update the state of its resources by appending to . In the resulting process, the process escapes the resource boundary. Furthermore, the resource becomes available, i.e. it encloses the empty process 0. If the process is not inside the scope of (see the rule ), then, as in the case of accesses, the process is stuck.
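The acquisition and release steps described above can be illustrated with a minimal Python sketch. It tracks ownership only: a request succeeds if some resource with the requested name is available, and a release attempted from outside the resource boundary leaves the process stuck. The names `Resource`, `Environment`, `request` and `release`, as well as the strings `"hammer"` and `"mallet"`, are assumptions made for illustration; policies and state updates are deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    owner: Optional[str] = None  # None: the resource is available


class Environment:
    """Hypothetical pool of published resources (illustration only)."""

    def __init__(self, resources: List[Resource]) -> None:
        self.resources = resources

    def request(self, name: str, process: str) -> Optional[Resource]:
        # Grant exclusive ownership of any available resource with this name.
        for res in self.resources:
            if res.name == name and res.owner is None:
                res.owner = process
                return res
        return None  # nothing available: the request is not fulfilled

    def release(self, res: Resource, process: str) -> bool:
        # A process outside the resource boundary cannot release it (stuck).
        if res.owner != process:
            return False
        res.owner = None  # the resource becomes available again
        return True


# Two hammers and one mallet, as in the workshop example.
workshop = Environment([Resource("hammer"), Resource("hammer"), Resource("mallet")])
first = workshop.request("hammer", "worker1")
second = workshop.request("hammer", "worker2")
third = workshop.request("hammer", "worker3")  # both hammers are already taken
```

Under these assumptions the third request is not fulfilled until one of the first two workers releases its hammer, which mirrors the exclusive access granted by a resource boundary.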
The rules check whether the execution of the action on the resource obeys the policy , i.e. whether the updated state , obtained by appending to the current state , is consistent w.r.t. . If the policy is obeyed, then the updated state is stored in the resource state according to the rule and the action becomes closed; if not, then the resource is forcibly released according to the rule and the action becomes faulty. Notice that is the rule managing the recovery from bad accesses to resources.
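The policy check on accesses can be sketched in Python as follows. A policy is represented by the characteristic function of its set of traces, and the state is the tuple of actions performed so far. The names `Resource`, `access` and `only`, the action names `"hard"` and `"soft"`, and the choice of leaving the state unchanged after a forced release are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Trace = Tuple[str, ...]


@dataclass
class Resource:
    name: str
    policy: Callable[[Trace], bool]  # characteristic function of the allowed traces
    state: Trace = ()                # actions performed so far
    owner: Optional[str] = None      # process currently inside the boundary


def access(res: Resource, action: str) -> str:
    """Try an action on an acquired resource; return 'closed' or 'faulty'."""
    candidate = res.state + (action,)
    if res.policy(candidate):
        res.state = candidate  # the updated state is stored in the resource
        return "closed"
    # Policy violated: forced release (assumption: the state is left unchanged).
    res.owner = None
    return "faulty"


def only(allowed: str) -> Callable[[Trace], bool]:
    # Hypothetical policy: every action in the trace must equal `allowed`.
    return lambda trace: all(a == allowed for a in trace)


hammer = Resource("hammer", only("hard"), owner="worker1")
mallet = Resource("mallet", only("soft"), owner="worker3")
hammer_outcome = access(hammer, "hard")  # obeys the hammer policy
mallet_outcome = access(mallet, "hard")  # a hard hit with a mallet
```

Here the hard hit on the hammer is recorded in its state, while the hard hit on the mallet is rejected and the mallet is taken away from its owner.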
The rules and express that actions can bypass resource boundaries for only if they do not involve the resource .
Finally, the rules and describe the abstract behaviour of the resource manager performing asynchronous resource reconfigurations. In other words, resource configuration is not under the control of processes. Resources are created and destroyed by external entities and processes can only observe their presence/absence. This is formally represented by the rules and .
To explain the operational semantics, we come back to our running example. The following trace illustrates how the workshop works. At the beginning, instantiates a new worker (a resource request point) when receiving a hard job:
where . At this point the new worker can take a hammer and other jobs are also available (on the channel ). In the following, for the sake of simplicity, we only show sub-processes that involve computation. Assume that the new worker takes a hammer, then we have the following transition:
Now, three workers are similarly instantiated for doing all remaining jobs.
In the current setting, the three new workers make one request on the remaining hammer and two requests on the mallet. Since we have only one mallet, only one of the two mallet requests can be served at a time. Suppose the first job gets done first; then we have the following transition:
Note that the hammer is available again. Similarly, the second job is done as follows:
If the third job were processed, then a forced release could occur. This happens because the worker attempts to do a hard hit by using a mallet in doing the job, which violates the mallet policy.
Finally, a similar trace arises for the fourth job.
3 Control Flow Analysis
In this section, we present a CFA for our calculus, extending the one for the π-calculus. The CFA computes a safe over-approximation of all the possible communications of resource and channel names on channels. Furthermore, it provides an over-approximation of all the possible usage traces on the given resources and records the names of the resources that can be possibly not released, thus providing information on possible bad usages. The analysis is performed under the perspective of processes. This amounts to saying that the analysis tries to answer the following question: “Are the resources initially granted sufficient to guarantee a correct usage?”. In other words, we assume that a certain fixed amount of resources is given and we do not consider any dynamic reconfiguration, possible in our calculus, due to the rules and . The reconfiguration is up to the resource manager and is not addressed by the CFA.
For the sake of simplicity, we provide the analysis for a subset of our calculus, in which processes enclosed in the scopes of resources are sequential processes (ranged over by ), as described by the following syntax. Intuitively, a sequential process represents a single thread of execution in which one or more resources can be used.
This implies that one single point for releasing each resource occurs in each non-deterministic branch of a process. The extension to general parallel processes is immediate. Nevertheless, it requires some more complex technical machinery in order to check whether all the parallel branches synchronise with each other before releasing the shared resource.
In order to facilitate our analysis, we further associate labels with resource boundaries as follows: and , in order to give a name to the sub-processes in the resource scopes. Note that this annotation can be performed in a pre-processing step and does not affect the semantics of the calculus. During the computation, resources are released and acquired by other processes. Statically, sequences of labels are used to record the sequences of sub-processes possibly entering the scope of a resource. Furthermore, to make our analysis more informative, we enrich the execution traces with special actions that record the fact that a resource has been possibly:
acquired by the process labelled : , with a successful request;
released by the process labelled : with a successful release;
taken away from the process labelled : because of an access action on that does not satisfy the policy.
The new set of traces is , where . The corresponding dynamic traces can be obtained by simply removing all the special actions.
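The relation between enriched and dynamic traces can be sketched in Python. The encoding is an assumption made for illustration: ordinary actions are strings, while special actions are pairs made of a hypothetical kind (`"acquired"`, `"released"`) and a process label.

```python
from typing import List, Tuple, Union

Special = Tuple[str, str]      # (kind, label): hypothetical encoding of a special action
Event = Union[str, Special]    # an ordinary action or a special action


def dynamic_trace(enriched: List[Event]) -> List[str]:
    """Drop all special actions, keeping only the ordinary ones in order."""
    return [event for event in enriched if isinstance(event, str)]


enriched = [("acquired", "l1"), "hard", "hard", ("released", "l1")]
plain = dynamic_trace(enriched)
```

With this encoding, `plain` contains only the two `"hard"` actions, in their original order.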
The result of analysing a process is a tuple called estimate of , that provides an approximation of resource behavior. More precisely, and offer an over-approximation of all the possible values that the variables in the system may be bound to, and of the values that may flow on channels. The component provides a set of traces of actions on each resource. Finally, records a set of the resources that can be possibly not released. Using this information, we can statically check resource usages against the required policies.
To validate the correctness of a given estimate , we state a set of clauses that operate upon judgments in the form , where is a sequence of pairs , recording the resource scope nesting. This sequence is initially empty, denoted by .
The analysis correctly captures the behavior of , i.e. the estimate is valid for all the derivatives of . In particular, the analysis keeps track of the following information:
An approximation of names bindings. If then the channel variable can assume the channel value . Similarly, if then the resource variable can assume the resource value .
An approximation of the values that can be sent on each channel. If , then the channel value can be output on the channel , while if , then the resource value can be output on the channel .
An approximation of resource behavior. If then is one of the possible traces over that is performed by a sequence of sub-processes, whose labels are juxtaposed in .
An approximation of the resources which are possibly locked by processes in deadlock for trying to access or to release a resource not in their scope. More precisely, if is in and occurs in , then the resource can be possibly acquired by a process that can be stuck and that therefore might not be able to release it.
The judgments of the CFA are given in Tab. 2, which are based on structural induction of processes. We use the following shorthands to simplify the treatment of the sequences . The predicate is used to check whether the pair occurs in , i.e. whether . With we indicate that the pair is replaced by in the sequence . With we indicate the sequence where the occurrence has been removed, i.e. the sequence , if .
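The three shorthands on sequences can be sketched in Python. The sketch assumes that each pair consists of a resource name and a trace, and that a given pair occurs at most once in the sequence; the function names `occurs`, `replace` and `remove` are hypothetical.

```python
from typing import Tuple

Trace = Tuple[str, ...]
Pair = Tuple[str, Trace]     # assumption: (resource name, trace)
Nesting = Tuple[Pair, ...]   # sequence recording the resource scope nesting


def occurs(seq: Nesting, pair: Pair) -> bool:
    """Check whether the pair occurs in the sequence."""
    return pair in seq


def replace(seq: Nesting, old: Pair, new: Pair) -> Nesting:
    """Return the sequence with `old` replaced by `new`."""
    i = seq.index(old)  # raises ValueError if `old` does not occur
    return seq[:i] + (new,) + seq[i + 1:]


def remove(seq: Nesting, pair: Pair) -> Nesting:
    """Return the sequence without the given occurrence."""
    i = seq.index(pair)  # only defined when the pair occurs
    return seq[:i] + seq[i + 1:]


nesting: Nesting = (("hammer", ()), ("mallet", ("soft",)))
updated = replace(nesting, ("hammer", ()), ("hammer", ("hard",)))
shrunk = remove(updated, ("mallet", ("soft",)))
```

Sequences are kept immutable, so each clause of the analysis can pass a modified copy to the sub-process it validates without affecting sibling branches.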
All the clauses dealing with a compound process check that the analysis also holds for its immediate sub-processes. In particular, the analysis of and that of are equal to the one of . This is an obvious source of imprecision (in the sense of over-approximation). We comment on the main rules. Besides the validation of the continuation process , the rule for output requires that the set of names that can be communicated along each element of includes the names to which can evaluate. Symmetrically, the rules for input demand that the set of names that can pass along is included in the set of names to which can evaluate. Intuitively, the estimate components take into account the possible dynamics of the process under consideration. The clauses’ checks mimic the semantic evolution, by modelling the semantic preconditions and the consequences of the possible synchronisations. For example, in the rule for input, the CFA checks whether the precondition of a synchronisation is satisfied, i.e. whether there is a corresponding output possibly sending a value that can be received by the analysed input. The conclusion imposes the additional requirements on the estimate components, necessary to give a valid prediction of the analysed synchronisation action, mainly that the variable can be bound to that value.
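The inclusion constraints imposed by the output and input rules can be turned into a small fixpoint computation. The Python sketch below is a simplification under stated assumptions: it handles only input and output prefixes, ignores the structure of the process (all prefixes are collected in a flat list), and does not treat resource variables specially. The names `analyse`, `bindings` and `channel_values` are hypothetical stand-ins for the first two components of an estimate.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Prefix = Tuple[str, str, str]  # (kind, subject, object), kind is "out" or "in"


def analyse(prefixes: List[Prefix], free_names: Iterable[str]
            ) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Compute the least sets satisfying the output/input inclusions."""
    bindings: Dict[str, Set[str]] = defaultdict(set)        # name -> possible values
    channel_values: Dict[str, Set[str]] = defaultdict(set)  # channel -> values flowing on it
    for name in free_names:
        bindings[name].add(name)  # a free name denotes itself
    changed = True
    while changed:
        changed = False
        for kind, subject, obj in prefixes:
            # Copy the set: bindings may grow while we iterate.
            for chan in list(bindings[subject]):
                if kind == "out":
                    # Values of the sent name must be allowed on the channel.
                    src, dst = bindings[obj], channel_values[chan]
                else:
                    # Values on the channel must be possible bindings of the variable.
                    src, dst = channel_values[chan], bindings[obj]
                if not src <= dst:
                    dst |= src
                    changed = True
    return dict(bindings), dict(channel_values)


# Prefixes of a process sending b on a, receiving x on a and then
# sending c on x, in parallel with an input of y on b.
prefixes = [("out", "a", "b"), ("in", "a", "x"), ("out", "x", "c"), ("in", "b", "y")]
bindings, channel_values = analyse(prefixes, free_names=["a", "b", "c"])
```

In this example the variable `x` may be bound to `b`, so the output on `x` is recorded as a possible output on `b`, and `y` may in turn be bound to `c`. The iteration terminates because the sets only grow and range over finitely many names.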
To gain greater precision in the prediction of resource usages, in the second rule, the continuation process is analysed, for all possible bindings of the resource variable . This explains why we have all the other rules for resources, without resource variables.
The rule for resource joint point updates to record that the immediate sub-process is inside the scope of the new resource and there it is analysed. If the process is empty, i.e. in the case the resource is available, the trace of actions is recorded in .
In the rule for resource request point, the analysis for is performed for every possible element from the component . This amounts to saying that the resource can be used starting from any possible previous trace . In order not to append the same trace more than once, we have the condition that does not contain . This prevents the process labelled from doing it. Furthermore, is enriched by the special action that records the fact that the resource can be possibly acquired by the process labelled .
According to the rule for access action, if the pair occurs in (i.e. if we are inside the resource scope of ) and the updated history obeys the policy , then the analysis result also holds for the immediate subprocess and is updated in , by replacing in with , therefore recording the resource accesses to possibly made by the sub-process labelled by .
In case the action possibly violates the policy associated with (see the last conjunct), the process labelled may lose the resource , as recorded by the trace in , , with the special action appended to . If instead the action on is not viable because the process is not in the scope of , then all the resources in the context could not be released, as recorded by the component .
According to the rule for release, the trace of actions over at is recorded in . Other sub-processes can access the resource starting from the trace . Furthermore, is removed from and this reflects the fact that the process can exit its scope, once it has released the resource . Similarly, in the last rule, is removed from and there the process is analysed. Again, if the action on is not possible because the process is not in the scope of , then all the resources in the context could not be released, as recorded by the component .
We briefly interpret the results of the CFA on our running example. A more complex exemplification of the CFA is given in the next example (see below). First we associate labels with the resource boundaries as follows:
It is easy to see that there is one policy violation, which is captured by our CFA in the component , from which we can extract the following trace: . It occurs when, in doing the third job, the worker tries to hit hard using a mallet. We know that the channel (, resp.) is supposed to send/receive hard jobs (soft jobs, resp.), i.e. sending/receiving (, resp.), and names and are supposed to be bound to and respectively. By checking the components and , we can explain the above violation too. On the one hand, we found that is a singleton set of , while is a set of and , which is a wrong binding of . On the other hand, similarly we found that contains only , while contains and , which is a wrong use of .
Example 3.2 (Robot Scenario)
We now consider a scenario where a set of robots collaborate to reach a certain goal, e.g. to move an item from one position to another. Without loss of generality, we assume that robots operate in a space represented by a two-dimensional grid. We also assume that certain positions over the grid are faulty, and therefore they cannot be crossed by robots. To move the item, a robot needs to take it, and this is allowed provided that the item is located within the range of the robot’s sensor. Moreover, since robots have a small amount of energy, they can perform just a few steps with the item. Finally, we consider three families of robots ( and ): each robot in the family has different computational capabilities.
Fig. 3 gives a pictorial description of the initial configuration of the scenario. Positions are represented by circles and double circles. Double circles indicate faulty positions. The item is located at position and the goal is to move it into the position . There is just one faulty position , crossing through which is considered a failure. Moreover, we consider a scenario where the three families of robots and are initially located at , and , respectively (e.g. all the robots of the family are located at ).
Sensors are modelled by clearly identified resources. The sensor of the robot family is specified by the resource , where is the name of the sensor,