Secondary use of data in EHR systems

Fan Yang (DTU Informatics, Technical University of Denmark, Lyngby, Denmark), Chris Hankin (ISST, Imperial College London, London, United Kingdom), Flemming Nielson (DTU Informatics, Technical University of Denmark, Lyngby, Denmark), Hanne Riis Nielson (DTU Informatics, Technical University of Denmark, Lyngby, Denmark)

We show how to use aspect-oriented programming to separate security and trust issues from the logical design of mobile, distributed systems. The main challenge is how to enforce various types of security policies, in particular predictive access control policies, that is, policies based on the future behavior of a program. A novel feature of our approach is that advice is able to analyze the future use of data. We consider a number of different security policies, concerning both primary and secondary use of data, some of which can only be enforced by analysis of process continuations.

1 Introduction

Whilst there is broad agreement that security and other non-functional properties should be designed into systems ab initio, it is also recognized that, as society becomes more IT-savvy, our expectations about security and privacy evolve. This is usually followed by changes in regulation in the form of standards and legislation. Thus, whilst we would still argue that security should feature in the initial design of a system, there is merit in separating out security and other non-functional properties so that they can be updated without disturbing the functional aspects of the system.

This paper focuses on designing a language for specifying policies for access control and explicit flow of information. The traditional approach to enforcing such security policies is to use a reference monitor [1] that dynamically tracks the execution of the program; it makes appropriate checks on each basic operation being performed, either blocking the operation or allowing it to proceed. In concrete systems this is implemented as part of the operating system or as part of the interpreter for the language at hand (e.g. the Java bytecode interpreter); in both cases it is part of the trusted computing base. Sometimes it is found to be more cost effective to systematically modify the code so as to explicitly perform the checks that the reference monitor would otherwise have imposed [2]. In either case, even a small modification of the security policies may involve substantial changes to the code or the underlying system.

The notion of aspect-oriented programming [3, 4] is an interesting approach to separation of concerns. The enforcement of security policies is an obvious candidate for such separation of concerns, e.g. because the security policy can be implemented by more skilled or more trusted programmers, or indeed because security considerations can be retrofitted by (re)defining advice to suit the (new) security policy. The detailed definition of the advice will then make decisions about how to possibly modify the operation being trapped. This calls for a modified language (like AspectJ [3] for Java) that supports the use of aspects and where a notion of trapping operations and applying advice has been incorporated. It is possible to systematically modify the code so as to explicitly perform the operations that the advice would have imposed (e.g. [3]).

In many cases the aspect-oriented approach provides a more flexible way of dealing with modifications in security policies [5, 6, 7, 8, 9] than the use of reference monitors. It facilitates the use of frameworks for security policies that may be well suited to the task at hand but that are perhaps not of general applicability and therefore not appropriate for incorporating into a reference monitor.

Outline of the paper.

In this paper we are primarily interested in the modelling of mobile, distributed systems. The work is based on the coordination language KLAIM [10] (reviewed in Section 2) that facilitates distribution of data, mobility of code and handling of dynamically evolving, open systems. The main contribution of the paper is the design of AspectKE, an aspect-oriented extension of KLAIM that facilitates the trapping of actions (presented in Section 3) as well as processes (presented in Section 5), which can enforce traditional access control policies (presented in Section 4) as well as predictive access control policies (presented in Section 6) – i.e., security policies based on the future behavior of a program.

To evaluate our language design we shall throughout the paper illustrate its features using a running example based on a health information system for a care facility for elderly people in New South Wales, Australia [11]. In Section 4 we show how to use AspectKE to enforce basic primary use of data policies, that is, policies concerned with the right to access data. Here we consider the three classical access control models, namely discretionary access control, mandatory access control and role-based access control [1]. Furthermore, we illustrate how multiple security policies can be integrated into existing systems, thereby allowing policies to be refined at later stages in the system development. In Section 6 we show how secondary use of data policies can be modelled; these policies are concerned with how data is used once it has been obtained [12]. They can be enforced by using the predictive access control policy enforcement mechanism offered by AspectKE. Here we exploit the ability to analyze not only the behavior of remotely executed processes but also the future use of data. To the best of our knowledge few, if any, proposals have used aspect-oriented programming to tackle secondary use of data policies and to provide a predictive access control policy enforcement mechanism.

Finally, in Section 7 we present related work and we conclude in Section 8 with a discussion of the experience gathered from a proof-of-concept implementation [13] and outline some future work.

2 Background: KLAIM

AspectKE is an extension of the KLAIM (Kernel Language for Agents Interaction and Mobility) coordination language [10] with support for aspect-oriented programming. In this section we review the fragment of KLAIM that is used for AspectKE in the following sections.

KLAIM is a language specifically designed to program distributed systems consisting of several mobile components that interact through multiple distributed tuple spaces (databases). KLAIM uses a Linda-like generative communication model [14] but, instead of using Linda’s global shared tuple space (shared database), KLAIM associates a local tuple space with each node of a net. Each node may also have processes associated with it; the KLAIM computing primitives allow programmers to distribute and retrieve data and processes to and from locations (nodes) of a net, evaluate processes at remote locations and introduce new locations to the net.
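The communication model described above can be sketched in a few lines of Python. This is only an illustrative model under our own assumptions, not KLAIM itself: the class `Net`, the marker `Bind` (standing for a defining occurrence of a variable), the location name `l1`, and the decision to return `None` where a real process would block are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bind:
    """Hypothetical marker for a defining occurrence of a variable in a template."""
    name: str


class Net:
    """A net whose nodes each own a local tuple space (no global shared space)."""

    def __init__(self):
        self.spaces = {}  # location name -> list of tuples stored there

    def newloc(self, loc):
        # Extend the net with a fresh, empty location.
        self.spaces.setdefault(loc, [])

    def out(self, fields, at):
        # Output: add a tuple to the tuple space of the target location.
        self.spaces[at].append(tuple(fields))

    @staticmethod
    def _match(template, tup):
        # Constants must be equal; Bind markers match anything and record a binding.
        if len(template) != len(tup):
            return None
        env = {}
        for pat, val in zip(template, tup):
            if isinstance(pat, Bind):
                env[pat.name] = val
            elif pat != val:
                return None
        return env

    def read(self, template, at):
        # Non-destructive input: return the bindings and keep the tuple.
        for tup in self.spaces[at]:
            env = self._match(template, tup)
            if env is not None:
                return env
        return None  # assumption: a real process would block here

    def in_(self, template, at):
        # Destructive input: return the bindings and delete the matched tuple.
        for i, tup in enumerate(self.spaces[at]):
            env = self._match(template, tup)
            if env is not None:
                del self.spaces[at][i]
                return env
        return None


net = Net()
net.newloc("l1")
net.out(("a", "b"), at="l1")
template = ("a", Bind("x"))
kept = net.read(template, at="l1")     # the tuple stays at l1
removed = net.in_(template, at="l1")   # the tuple is removed from l1
```

Spawning processes at remote locations is deliberately left out; the sketch only shows how data is distributed to and retrieved from local tuple spaces.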

2.1 Syntax of KLAIM

The syntax of a fragment of KLAIM is displayed in Table 1.

Table 1: KLAIM Syntax – Nets, Processes and Actions (Part of AspectKE).

A net (in Net) is a parallel composition of located processes and/or located tuples. For simplicity, components of tuples can be location constants only (compared with the original KLAIM, we do not allow processes to be components of tuples). We use the notation to represent a sequence of location constants and is used to represent the empty sequence. Nets must be closed: all variables must be in the scope of a defining occurrence (indicated by an exclamation mark).

A process (in Proc) can be a parallel composition of processes, a guarded sum of action prefixed processes, or a replicated process (indicated by the operator). We write 0 for a nullary sum, for a unary sum, and for a binary sum.

An action (in Act) operates on locations, tuples and processes: a tuple can be output to, input from (read and delete the source) and read from (read and keep the source) a location; processes can be spawned at a location; new locations can also be created. The actual operation performed by an action is called a capability (in Cap) – this is a key concept when formalizing uses of data later. We do not distinguish real locations and data: all of them are called locations (in Loc) in our setting, which can be location constants , defining (i.e. binding) occurrences of location variables (where the scope is the entire process to the right of the occurrence), and use of location variables .

Well-Formedness of Locations and Actions.

We do not allow multiple defining occurrences of the same variable in an action. We also prohibit bound variables and free variables from sharing any name in a single action. Thus we disallow as well as .

2.2 Semantics of KLAIM

Informally the meaning of a KLAIM program is as follows:

  1. a node is selected for the next step of execution

  2. if the process at the node is a choice, then one of the enabled choices is chosen non-deterministically and executed as described in the following four steps

  3. if the prefix of the process is an output action, the output is performed

  4. if the prefix of the process is an input (either destructive or non-destructive), the input action is enabled if there is a matching tuple at the target location, and the input is performed and appropriate variables are bound in the remainder of the process

  5. if the prefix is an eval, the process is spawned at the target location

  6. if the prefix is a newloc, the network is dynamically extended with a new location and the continuation process is given the address of that location

  7. then return to Step 1

Notice that we do not need to deal with parallelism and replication within nodes because, at the cost of having duplicate addresses in the network, these can be lifted to the net level.

2.3 Running Example

Health Care Information Systems are gradually becoming prevalent and indispensable to our society. An electronic health record (EHR), part of a system’s database, stores a patient’s data and is created, developed, and maintained by the health care providers.

To illustrate the use of KLAIM, we now introduce a typical EHR system, which is inspired by [11], and the scenario presented here is used throughout the paper.

The EHR database (EHDB) stores all patient healthcare records and we assume that there are two types of data recorded for each patient: medical records (MedicalRecord) and private notes (PrivateNote). Medical records are entries created by doctors and so are the private notes; however the latter are of a more confidential nature. Also we distinguish between past records (Past) that have been entered into the EHR system previously and recent records (Recent) that have been created since the patient was admitted to the hospital. We therefore assume that the EHR database contains tuples with the following five fields:

patient: The name of the patient
recordtype: The type of record (MedicalRecord or PrivateNote)
author: The author of the record
createdtime: The time of creation of the record (Recent or Past)
subject: The record's content

For example is a recent medical record of , created by and it has content .

Doctors and nurses, as well as the patient, can access a patient’s record. We model these actors as locations in a network; the process at the location represents the actions of the individual and the data is the individual’s local “knowledge”. As an example the following process expresses that DrSmith reads one of the Past medical records for Alice created by DrHansen before she was admitted to this hospital, writes some of the information in her own note (in location DrSmith) and then creates a new medical record for the patient:

Here DrSmith will first consult location EHDB and read a five-tuple whose first four components are Alice, MedicalRecord, DrHansen, and Past respectively; the corresponding fifth component is bound to a variable. The second action writes the content read by the first action to the location associated with DrSmith. The final action writes a new five-tuple for this patient to location EHDB, whose last three components indicate that the author is DrSmith, that it is a Recent medical record, and that the content is newtext.

To illustrate the semantics of KLAIM let us consider the following net, consisting of locations EHDB and DrSmith:

The execution may proceed as follows:

DrSmith first reads the tuple from EHDB; the binding of the variable is reflected in the continuation of the process. In the second step DrSmith outputs a tuple that consists of Alice together with the bound value (alicetext) to her own tuple space. In the final step, a new tuple that represents a new medical record is written to location EHDB.
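The three execution steps can be replayed with a small Python sketch. It is a simplified reading of the prose description, not the formal semantics: the variable name `content`, the helper names `Bind`, `Var`, `match` and `run`, and the exact initial contents of EHDB are our assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bind:
    """Defining occurrence of a variable in an input template."""
    name: str


@dataclass(frozen=True)
class Var:
    """Use of a previously bound variable."""
    name: str


def match(template, tup):
    """Return the bindings if tup matches template, otherwise None."""
    if len(template) != len(tup):
        return None
    env = {}
    for pat, val in zip(template, tup):
        if isinstance(pat, Bind):
            env[pat.name] = val
        elif pat != val:
            return None
    return env


def run(process, spaces):
    """Execute a sequence of (capability, fields, target) actions in order."""
    env = {}
    for cap, fields, target in process:
        # Apply the substitution accumulated so far to the action.
        fields = tuple(env[f.name] if isinstance(f, Var) else f for f in fields)
        if cap == "read":
            found = None
            for tup in spaces[target]:
                found = match(fields, tup)
                if found is not None:
                    break
            if found is None:
                return False  # no matching tuple: the process is stuck
            env.update(found)  # bindings are visible in the continuation
        elif cap == "out":
            spaces[target].append(fields)
        else:
            raise ValueError(f"capability {cap!r} is not modelled in this sketch")
    return True


spaces = {
    "EHDB": [("Alice", "MedicalRecord", "DrHansen", "Past", "alicetext")],
    "DrSmith": [],
}
dr_smith = [
    ("read", ("Alice", "MedicalRecord", "DrHansen", "Past", Bind("content")), "EHDB"),
    ("out", ("Alice", Var("content")), "DrSmith"),
    ("out", ("Alice", "MedicalRecord", "DrSmith", "Recent", "newtext"), "EHDB"),
]
finished = run(dr_smith, spaces)
```

After the run, the tuple space of DrSmith holds the pair built from Alice and the value read, and EHDB holds both the original and the new medical record.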

3 AspectKE: Trapping Actions

We now show how to integrate aspects into KLAIM by presenting the basic features of AspectKE, with a focus on how aspects trap actions in a KLAIM program. We consider a global set of aspects.

3.1 Syntax

The syntax of AspectKE is given by Table 1 (the KLAIM syntax) and Table 2.

Table 2: AspectKE Syntax - Aspects for Trapping Actions

Table 2 introduces a system (in System) that consists of a net and a sequence of global aspect declarations . An aspect declaration (in Asp) takes the form : is the aspect name, and (in Advice) is the advice to the trapped action. Each action (the Act in Table 1) is a potential join point that can be intercepted by AspectKE’s pointcut (in Cut).

Moreover, is introduced as a don’t-care parameter in the cut version of actions, and in the test primitive of conditional expressions (BExp). It can match any type of location used in the program. Note in the cut, the occurrence of and have different meaning from those of KLAIM; a plain variable in a pointcut can only match an actual location and banged (!) variables in the pointcut can only match against binding occurrences of variables, while the don’t-care () can match both in the join point.
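The three matching rules for fields in a cut can be made concrete with the following Python sketch. The names `CutVar`, `CutBind`, `DONT_CARE` and `match_cut` are hypothetical, and we additionally assume that a constant appearing in a cut matches only the same constant in the join point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bind:
    """Join-point field that is a binding occurrence of a variable."""
    name: str


@dataclass(frozen=True)
class CutVar:
    """Plain variable in a cut: matches only an actual location."""
    name: str


@dataclass(frozen=True)
class CutBind:
    """Banged variable in a cut: matches only a binding occurrence."""
    name: str


DONT_CARE = object()  # matches both kinds of field


def match_cut(cut_fields, action_fields):
    """Return the substitution if the cut traps the action, otherwise None."""
    if len(cut_fields) != len(action_fields):
        return None
    subst = {}
    for cut, actual in zip(cut_fields, action_fields):
        is_binder = isinstance(actual, Bind)
        if cut is DONT_CARE:
            continue
        if isinstance(cut, CutVar):
            if is_binder:
                return None
            subst[cut.name] = actual
        elif isinstance(cut, CutBind):
            if not is_binder:
                return None
            subst[cut.name] = actual.name
        elif cut != actual:  # assumption: constants match only themselves
            return None
    return subst


concrete = ("Alice", "MedicalRecord", "DrHansen", "Past", Bind("content"))
open_type = ("Alice", Bind("recordtype"), "DrHansen", "Past", Bind("content"))
cut_plain = (DONT_CARE, CutVar("recordtype"), DONT_CARE, DONT_CARE, DONT_CARE)
cut_banged = (DONT_CARE, CutBind("rt"), DONT_CARE, DONT_CARE, DONT_CARE)

trapped = match_cut(cut_plain, concrete)      # recordtype is an actual location
missed = match_cut(cut_plain, open_type)      # a binder is not an actual location
by_banged = match_cut(cut_banged, open_type)  # banged cut variable sees the binder
```

The case `missed` illustrates why a cut that uses a plain variable cannot trap an input action that leaves the same field open.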

Each aspect gives a unique run-time suggestion (either break or proceed), which may depend on the evaluation of a conditional expression. The suggestion break suppresses the trapped action, whilst proceed allows the trapped action to be executed. If multiple aspects trap an action, break takes precedence over proceed. The test primitive evaluates to tt if the tuple space of the given location contains a tuple matching its argument. Besides basic boolean expressions, conditions may also include bounded existential and universal quantification; this allows simple queries to the databases occurring in the nets.

In contrast to other aspect languages, the condition is part of the advice instead of being part of the pointcut (being evaluated before intercepting a join point). Evaluating the condition after intercepting a join point allows a more natural modelling of security policies.

Well-formedness of Cuts.

In addition to the well-formedness conditions for KLAIM, we require that the variables in a cut are pairwise distinct. We shall also impose that aspects are closed: any free variable in the body is defined in the cut. Additionally, when !u is used in a cut pattern, u should not be used in conditions except in the context of set expressions.

Example 1

To illustrate how aspects are written in AspectKE to work with a KLAIM program, the following simple aspect gives advice to the running example of Section 2.3.

The aspect traps an out action of processes running at location DrSmith that attempt to send a tuple with two fields. If the actual value of the second field is equal to alicetext, the aspect will break the execution of the action and its continuation process. Otherwise, the action continues.

3.2 Semantics

The base semantics is that of KLAIM (Section 2) but now, before executing an action (all actions in a KLAIM program are potential join points), we check to see if any aspect applies to the action and combine the advice of all applicable aspects. Each advice is either that the action be allowed to proceed or not. We resolve possible conflicts by ensuring that any aspect that disallows an action has priority. Aspects are applied in definition order but, because aspects can only allow or disallow the join point to proceed, the order is actually immaterial.
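The way advice from several aspects is combined can be sketched as follows. The representation of an aspect as a pair of Python functions (a pointcut predicate and an advice function), the dictionary encoding of actions, and the second aspect `permissive` are our assumptions for illustration; the first aspect follows the prose description of Example 1.

```python
BREAK, PROCEED = "break", "proceed"


def combine(suggestions):
    """Any break wins; if no aspect applies, the action simply proceeds."""
    return BREAK if BREAK in suggestions else PROCEED


def advise(aspects, action):
    """Ask every aspect whose pointcut traps the action for its suggestion."""
    suggestions = [advice(action) for traps, advice in aspects if traps(action)]
    return combine(suggestions)


# Aspect of Example 1 (as described in the text): trap two-field out actions
# issued at DrSmith and break when the second field equals alicetext.
example1 = (
    lambda a: a["at"] == "DrSmith" and a["cap"] == "out" and len(a["fields"]) == 2,
    lambda a: BREAK if a["fields"][1] == "alicetext" else PROCEED,
)
# Hypothetical second aspect that traps everything and always suggests proceed.
permissive = (lambda a: True, lambda a: PROCEED)

out_action = {"at": "DrSmith", "cap": "out",
              "fields": ("Alice", "alicetext"), "target": "DrSmith"}
read_action = {"at": "DrSmith", "cap": "read",
               "fields": ("Alice", "MedicalRecord", "DrHansen", "Past", "!content"),
               "target": "EHDB"}

first_order = advise([example1, permissive], out_action)
second_order = advise([permissive, example1], out_action)
untrapped = advise([example1], read_action)
```

Because `combine` only checks whether some suggestion is break, both definition orders give the same verdict, which is the sense in which the order is immaterial.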

Example 2

Suppose we have a system that contains the same net as the running example of Section 2.3 and the aspect of Example 1:

let in

and some steps of execution (omitting the aspect definition):

The aspect does not trap the read action, so the read action executes and binds content to alicetext. However, the aspect traps the first out action, and the resulting substitution is

and the case condition evaluates to tt, thus the aspect breaks the execution of this action and its continuation process.

4 Worked Examples: Advice for Access Control Models

To evaluate the expressiveness of the language and show its language features, we now show how AspectKE can be used to enforce access control policies by utilizing three well-known access control models, namely discretionary access control (Section 4.1), mandatory access control (Section 4.2) and role-based access control (Section 4.3), and how AspectKE can introduce new aspects for retrofitting new policies to existing systems (Section 4.4).

Since patient confidentiality is an important issue in the health care industry, it is imperative that EHRs are protected [15]. To help achieve this goal, governments define many types of security policies, encapsulated in various acts and guides (e.g. [16, 17]). Throughout the paper, we will enforce several security policies for the EHR system introduced in Section 2.3, each illustrating different features of the language.

The first is a primary use of data policy inspired by [11] which regulates the basic access control concerning the read and write rights owned by doctors and nurses:

Doctors can read all patients’ medical records and private notes, while nurses can read all patients’ medical records but cannot read any private notes. Medical records and private notes can only be created by doctors.

For simplicity, we restrict ourselves here to read, in and out actions; eval and newloc actions will be discussed further when enforcing other security policies.

4.1 Discretionary Access Control

We will show how to enforce the above policy with discretionary access control (DAC), which restricts access to objects based on the identity of subjects and/or the groups to which they belong [18]. We do so by using an access control matrix containing triples that identify which subjects can perform which operations on which objects. If we used the plain KLAIM programming model, we would have to equip the semantics of KLAIM with a reference monitor that consults the access control matrix whenever an action is executed to check whether the action is permitted. In AspectKE we can instead use aspects directly to inline the reference monitor that enforces this discretionary access control policy.

Example 3

The access control matrix is stored in location DAC, which contains tuples: . For example, if DrSmith is a doctor and NsOlsen is a nurse, then DAC might contain the following tuples:

We also assume that the location DAC can only be modified by privileged users, thus doctors and nurses cannot perform any in and out action on it. This can be enforced by other aspects but we omit them here.

The following aspect declarations will impose the desired requirements.

These aspects enforce the above policy by consulting DAC, where the access rights of each user are described. Note that the second field of the tuple in each cut action is recordtype, so the aspects trap only actions that specify a concrete record type.

Consider the following KLAIM program that is a variant of the running example in Section 2.3 (in that the user is nurse NsOlsen instead of doctor DrSmith) and is equipped with the above four aspects:

The first read action will be trapped by aspect , and the resulting substitution is

and the condition test(NsOlsen, MedicalRecord, read)@DAC is evaluated. Since NsOlsen has the appropriate right according to DAC we proceed and perform this read action thereby giving rise to the binding of to alicetext.

The second action will not be trapped by any of the aspects, so it will simply be performed and the tuple is output to location NsOlsen.

The last action will be trapped by aspect and after the substitution we evaluate the condition test(NsOlsen, MedicalRecord, out)@DAC which is evaluated to ff and thus we break the execution.

However, a KLAIM program can also execute read or in actions without specifying the record type. For example, by using !recordtype instead of recordtype, users can get a record as follows:

where a successful input action can retrieve any type of EHR record.

None of the above aspects can trap these input actions, so we have to introduce additional aspects to ensure that such input actions cannot bypass our aspects and thereby break the policy. The following simple aspects forbid any attempt to read or in (read and then remove) EHR records without specifying the record type:

One may wonder why we do not build these two aspects on top of the corresponding earlier aspects by directly replacing recordtype with !recordtype in their pointcuts. The reason is that the resulting aspects would not be well-formed: when trapping actions, recordtype would be bound to a variable rather than to a location, and such a variable cannot be used in a test condition.

4.2 Mandatory Access Control

In this subsection we will show how to enforce the above policy by using mandatory access control (MAC), which is a means of restricting access to objects based on the sensitivity (as represented by a label) of the information contained in the objects and the formal authorization (i.e., clearance) of subjects [18]. Before enforcing the above policy, we first impose a comparable classical MAC policy, namely the Bell-LaPadula security policy [1], which is based on a mandatory access control model. Later we enforce the above policy as a variant of the Bell-LaPadula policy. In the presentation, security levels are assigned to subjects (as clearances) and objects (as labels).

Example 4

In this scenario, we just need two security levels, and may assign security levels to subjects as follows: doctors have level high and nurses have level low; similarly we may assign objects as follows: private notes have level high and medical records have level low.

To model this policy we need to introduce a location MAC that stores tuples of the form: and . Continuing Example 3, we create the location MAC with the tuples:

As before we also assume that the location MAC can only be modified by privileged users.

Firstly, we enforce the Bell-LaPadula security policy [1] to illustrate that AspectKE can enforce a well-known mandatory access control policy. Then we enforce our example policy by making small modifications to the aspects that enforce the Bell-LaPadula policy.

The first part of the Bell-LaPadula security policy states that a subject is allowed to read or input data from any object provided that the subject's security level dominates that of the object. In our case, this guarantees no read up: low subjects (nurses) cannot read high objects (private notes) but can only read low objects (medical records), whereas high subjects (doctors) can access both kinds of records.

The no read up part of the policy can be enforced by aspects as follows:

The second part of the policy (a simplified form of the Bell-LaPadula star property [1]) states that a subject can write to any object provided that the security level of the object dominates that of the subject (no write down). In our case high subjects (doctors) cannot write to low objects (medical records), but low subjects (nurses) can write to both kinds of records.

The no write down part of the policy can be enforced by the aspect below:

Additionally, we have an aspect for the read action to prevent users from reading records without specifying the record type, and an aspect for the in action to prevent users from reading and deleting records:

These aspects correctly enforce our policy about reading patient records. However, the no write down policy is not quite right for our example. We therefore depart from the Bell-LaPadula policy and define:

This aspect allows doctors to write any kind of record.

The aspect together with reflect a mandatory access control model which satisfies our policy. In this case we only allow high users (doctors) to write patient records. Hence nurse NsOlsen in Example 3 cannot execute the third action as it will be blocked by , which would be allowed with from the Bell-LaPadula security policy.
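The difference between the Bell-LaPadula write rule and the variant used for our policy can be checked with a short Python sketch. The dictionaries standing in for the contents of location MAC, the numeric encoding of levels, and the function names are our assumptions; each function returns `True` when the corresponding advice would be proceed.

```python
LEVELS = {"low": 0, "high": 1}

# Assumed contents of location MAC: clearances of subjects and labels of objects.
SUBJECT_LEVEL = {"DrSmith": "high", "NsOlsen": "low"}
OBJECT_LEVEL = {"PrivateNote": "high", "MedicalRecord": "low"}


def dominates(a, b):
    """Level a dominates level b."""
    return LEVELS[a] >= LEVELS[b]


def no_read_up(user, recordtype):
    # Reading is allowed if the subject's level dominates the object's level.
    return dominates(SUBJECT_LEVEL[user], OBJECT_LEVEL[recordtype])


def no_write_down(user, recordtype):
    # Bell-LaPadula: writing is allowed if the object's level dominates the subject's.
    return dominates(OBJECT_LEVEL[recordtype], SUBJECT_LEVEL[user])


def only_high_may_write(user, recordtype):
    # Variant for the example policy: only high subjects (doctors) may write.
    return SUBJECT_LEVEL[user] == "high"


nurse_blp = no_write_down("NsOlsen", "MedicalRecord")
nurse_variant = only_high_may_write("NsOlsen", "MedicalRecord")
doctor_blp = no_write_down("DrSmith", "MedicalRecord")
doctor_variant = only_high_may_write("DrSmith", "MedicalRecord")
```

Under these assumptions the nurse's write is allowed by the Bell-LaPadula rule but blocked by the variant, while the doctor's write to a medical record is blocked by Bell-LaPadula but allowed by the variant.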

4.3 Role-Based Access Control

Role-based access control (RBAC) [19] is another access control mechanism which allows the central administration of security policies and is often more flexible and elegant for modelling security policies. The simplest model in the RBAC family is RBAC0, where there are three sets of entities called user, role, and permission. A user can be assigned multiple roles (role assignment) and a role can have multiple permissions (permission assignment) to corresponding operations. In addition, the user can initiate a session during which the user activates some subset of the roles that he or she has been assigned. A user can execute an operation only if the user's active roles have the permission to perform that operation.

Example 5

To implement the security policy for patient records, we use a model that does not differentiate a user’s assigned role and active role (we assume that the assigned roles of all users are activated by default), so we only need location RDB with tuples :

For permission assignment we also need a location PDB to describe each role's permissions. This can be done by storing tuples at PDB:

Once more we assume that the locations RDB and PDB can only be modified by privileged users.

The following aspects then implement the required policy:

These three aspects are useful for interrupting the execution when a user attempts to operate on EHR records with a concrete record type, which essentially relies on the tuples from RDB and PDB. They also show the benefit of admitting quantifiers into the conditional expressions.
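The role of the quantifier in such conditions can be sketched in Python. The tuple layouts of `RDB` and `PDB`, their contents, and the set `ROLES` are our assumptions, chosen to agree with the stated policy; `permitted` is a hypothetical rendering of a condition that existentially quantifies over roles.

```python
# Assumed role assignment: (user, role) tuples stored at RDB.
RDB = {("DrSmith", "Doctor"), ("NsOlsen", "Nurse")}

# Assumed permission assignment: (role, recordtype, capability) tuples stored at PDB.
PDB = {
    ("Doctor", "MedicalRecord", "read"),
    ("Doctor", "PrivateNote", "read"),
    ("Doctor", "MedicalRecord", "out"),
    ("Doctor", "PrivateNote", "out"),
    ("Nurse", "MedicalRecord", "read"),
}

ROLES = {"Doctor", "Nurse"}  # the finite set the quantifier ranges over


def permitted(user, recordtype, cap):
    """Bounded existential quantification: some role of the user grants cap."""
    return any(
        (user, role) in RDB and (role, recordtype, cap) in PDB
        for role in ROLES
    )


nurse_reads_record = permitted("NsOlsen", "MedicalRecord", "read")
nurse_reads_note = permitted("NsOlsen", "PrivateNote", "read")
nurse_writes_record = permitted("NsOlsen", "MedicalRecord", "out")
doctor_writes_note = permitted("DrSmith", "PrivateNote", "out")
```

Changing a user's rights then only requires updating the tuples in `RDB` or `PDB`; the condition itself stays the same.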

As in the previous subsections, we have to introduce additional aspects to capture user attempts to access EHR records without specifying the record type.