Stream on the Sky: Outsourcing Access Control Enforcement for Stream Data to the Cloud
There is an increasing trend for businesses to migrate their systems towards the cloud. Security concerns that arise when outsourcing data and computation to the cloud include data confidentiality and privacy. Given that a tremendous amount of data is being generated every day from a plethora of devices equipped with sensing capabilities, we focus on the problem of access controls over live streams of data based on triggers or sliding windows, which is a distinct and more challenging problem than access control over archival data. Specifically, we investigate secure mechanisms for outsourcing access control enforcement for stream data to the cloud. We devise a system that allows data owners to specify fine-grained policies associated with their data streams, then to encrypt the streams and relay them to the cloud for live processing and storage for future use. The access control policies are enforced by the cloud, without the latter learning about the data, while ensuring that unauthorized access is not feasible. To realize these ends, we employ a novel cryptographic primitive, namely proxy-based attribute-based encryption, which not only provides security but also allows the cloud to perform expensive computations on behalf of the users. Our approach is holistic, in that these controls are integrated with an XML-based framework (XACML) for high-level management of policies. Experiments with our prototype demonstrate the feasibility of such mechanisms, and early evaluations suggest graceful scalability with increasing numbers of policies, data streams and users.
An enormous amount of data is being generated continuously, by organizations and individuals carrying out their day-to-day activities, as well as by dedicated sensing and monitoring infrastructures. Examples include financial services for monitoring stock prices and sensor networks for meteorological, environmental, battlefield and traffic control [9, 10] monitoring. As personal devices, especially those equipped with sensing capabilities, are enjoying unprecedented growth, they are also becoming a prominent source of stream data. Applications such as participatory sensing [22, 28] and personal health monitoring collect continuous data from sensors fitted in smart phones or in other mobile devices.
The abundance of such data brings many new opportunities, such as real-time decision making and resource management at various scales, from personal area or home networks to a smart planet. Often, these applications assume sharing and mash-up of data from multiple sources, possibly created and owned by different stakeholders. One important requirement is to equip data owners with adequate control over the granularity at which data is shared with various parties. Another requirement is an infrastructure over which such sharing can be done in a scalable manner.
One can presume that the need for a scalable infrastructure can be readily met thanks to the advent of cloud computing. Businesses are moving their computing systems cloud-ward, availing themselves of the elastic resources, ease of management and good cost-benefit trade-off [11, 25, 18, 38]. While recent advances in the technologies have given users more control and better performance [13, 37], limited progress has been made towards security guarantees, particularly with respect to sharing data with multiple other parties at possibly different granularities. In general, many argue [11, 35, 34] that security remains one of the biggest obstacles to be overcome before the potential of the cloud can be fully realized.
This paper takes a step in exploring the design space of outsourcing the enforcement of access control of live data streams to the cloud. Our goal is to design a system that supports fine-grained access control policies in which most of the expensive computations are done by the cloud. We assume that a data owner outsources its stream data to an untrusted cloud. The system must allow the data owner to specify fine-grained access control policies for the data stream. It must ensure security with respect to access control enforcement, i.e. no unauthorized access is allowed even if the cloud colludes with dishonest data users. Finally, the system must protect the privacy of the outsourced data.
Access control for stream data is more challenging than for traditional non-stream (archival) data. In archival databases, access is defined on views, which are constructed by querying the databases and, being static, can be pre-computed. This technique is not applicable to stream data, because the data is potentially infinite in size and queries are continuous, carried out over newly arrived data. Most critically, the nature of stream data demands support for more fine-grained policies, especially those involving temporal constraints. Specifically, most policies can be categorized as trigger or sliding window policies:
- Trigger: A user is given access to a data record when its content satisfies a certain condition. For example, in a personal health monitoring system, a user tracks his blood pressure, heart rate, blood sugar, etc. using a portable device at regular intervals. On the one hand, such data is crucial for doctors to monitor patient recovery and to make early diagnoses. On the other hand, the user is concerned with his privacy and would not wish to disclose all the measurements. As a result, he can specify a policy dictating that certain doctors have access to his data only when his blood pressure exceeds a threshold.
- Sliding window: A user is given summaries of the data over specific windows, as opposed to having access to raw data. For example, a financial company monitoring stock prices every second can sell its raw data for a very high price. It can also offer more coarse-grained packages at lower prices which contain only the average stock prices over -second interval (). This gives the data owner greater flexibility in managing and generating profit from its data.
A naive design to realize these goals would be to encrypt the data using a symmetric encryption scheme, store it on the cloud and distribute the decryption keys to authorized users. Such a system [34, 27] relies on the data owner for access control enforcement, while the cloud acts only as a transit storage provider. Use of a more advanced encryption scheme, namely an attribute-based encryption (ABE) scheme [23, 14], can help relieve the data owner of much of the key management work. However, ABE deals only with individual data objects and therefore cannot naturally support sliding window policies. Additionally, ABE effectively pushes the access control management task towards the end-user, who needs to check all the ciphertexts and discard those that will result in invalid decryption. Such a modus operandi is inefficient and does not scale for large systems with many different data streams.
In this paper, we propose to use a proxy attribute-based encryption scheme for trigger policies, and extend it with support for sliding window policies. Our system ensures access control while also offloading expensive computations to the cloud. Furthermore, we decouple the task of policy management from security enforcement by using the XACML framework. The cloud handles this task seamlessly, and as a result the end-user only receives and decrypts ciphertexts of his authorized data. Adding an explicit layer of management on top of ABE helps the system scale better with more data streams and more policies. In summary, our main contributions are as follows:
1. We propose an extension to an ABE scheme which provides support for sliding window access control.
2. We design a system allowing a data owner to outsource stream data and access control enforcement to the cloud. The system supports both trigger and sliding window policies in a secure manner. It is also scalable, in the sense that most of the expensive operations are done by the cloud. We achieve both security and scalability by combining novel cryptographic primitives with the standard XACML framework for policy management.
3. We implement a prototype of our system. Through preliminary evaluation, we find that the overhead is reasonable.
The remainder of the paper is organized as follows. The next section presents the system and security model. Section III details the cryptographic protocols. Section IV follows with the design for policy management. We present our prototype and preliminary evaluation in Section V. Related work is discussed in Section VI before we conclude in Section VII.
II System Model
Our system consists of a number of data owners (or owners), one cloud provider and a number of data users (or users). As depicted in Figure 1, a data owner generates a stream of data and sends it to the cloud. Several users interested in the data will retrieve it through the cloud. The owner and users agree on the access policy beforehand. In summary, the data is outsourced to the cloud where it will be stored, managed and distributed to a set of users.
II-A Data Model
For simplicity, we assume that data generated by the owner are key-value tuples. A stream is a sequence of ) for where are integers and the value belongs to a known, finite domain. In a participatory sensing application, the may represent a geographical location while may represent a measurement of air pollution. In a time-series application, such as weather monitoring, may represent a time-stamp value while a temperature reading.
For a time-series data stream, i.e. , we define a data window, denoted as , as a sequence of data values whose keys are in between a starting index and an ending index . That is: . A non-overlapping sliding window — denoted as — is a sequence of non-overlapping data windows starting from index , having size . More precisely, the sliding window is .
II-B Access Control Model.
An access control policy is defined by the data that an authorized user can read. We consider two types of policies: trigger and sliding window. In the former, the user is granted access to a tuple (i.e. it can read the value ) if is equal to or exceeds a threshold . Denote as the trigger policy, where is a threshold value and is a comparison operator, then:
In a sliding window policy, a user can only access the averages of the data tuples that constitute a non-overlapping sliding window. In other words, the user cannot read individual data tuples, but only the summaries. Denote as the sliding window policy with being the starting index and being the window size, we have:
As an example, suppose a data owner is to share a weather database containing temperature readings collected at five-minute intervals starting from time . A sliding window policy may require that the user can only access the average temperature every hour, starting from time . This corresponds to the policy , meaning that the user can only read the averages of the windows , , , etc.
II-C Trust Model
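The two policy types can be pictured on plaintext data with a short sketch. This is only an illustration of the access semantics, not the paper's protocol: the function names, the choice of `>=` as the comparison operator, the sample stream, and the assumption that keys are consecutive integers are all made up for the example, and no encryption is involved.

```python
from statistics import mean

def trigger_view(stream, threshold):
    """Tuples a user may read under a trigger policy (value >= threshold)."""
    return [(k, v) for k, v in stream if v >= threshold]

def sliding_window_view(stream, start, size):
    """Averages over non-overlapping windows of `size` tuples from key `start`.

    Assumes keys are consecutive integers. Only complete windows are
    reported, and individual values are never returned.
    """
    values = [v for k, v in sorted(stream) if k >= start]
    return [mean(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Hypothetical stream of (key, value) tuples with keys 0..9.
stream = [(k, 20 + (k % 4)) for k in range(10)]

print(trigger_view(stream, 23))
print(sliding_window_view(stream, start=1, size=3))
```

In the weather example, readings arrive every five minutes, so an hourly policy corresponds to `size=12` in this sketch.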
We assume that data owners are honest. They outsource data streams to the cloud and wish to protect privacy of their data against the cloud. They define access policies for a set of users and wish to enforce the policies in a secure manner that does not permit unauthorized data access. The cloud is not trusted in that it tries to learn sensitive information in the data, and may collude with rogue users in order to provide them unauthorized access to the data. However, the cloud is trusted to carry out computations delegated to it by the owners and users. This means it does not skip or distort computations. A data user may be dishonest, in which case he attempts to gain unauthorized access, for which he may collude with the cloud. Finally, we assume that users do not collude with each other (we explain the reason for this assumption shortly).
II-D System Design Goals
Given the models above, we now state informally the goals that our system aims to achieve.
Goal 1 (privacy). The cloud cannot learn the outsourced data in plaintext.
Goal 2 (access control, trigger). A user given access according to the policy cannot access values belonging to another policy where or .
Goal 3 (access control, sliding window). A user given access according to the policy cannot access values belonging to any different policy. Specifically, the user cannot see the values belonging to where or . This means the user does not have access to data windows starting before . He can access windows starting from later indices, but those indices must be a multiple of , which is the same as skipping a number of windows. Moreover, the user cannot see the values defined by where or . This means the user cannot access data at finer granularity, nor can he access windows whose sizes are not multiples of . For instance, if a user is given access to windows of size , he must not be able to access windows of size or size . It is worth noting that the assumption about users not colluding with each other is necessary to achieve this goal. If two users with and where collude, they can derive windows of size or even windows of size .
Goal 4 (scalability). Most of the expensive computation should be off-loaded to the cloud. The protocols involving the data owners and users should be light-weight.
III The Protocols
Our protocols for both trigger and sliding window policies consist of 3 phases, as illustrated in Figure 1. First, the data owner and user negotiate an access control policy which can be either a trigger or sliding window policy (negotiation phase). Next, the data owner encrypts its data and sends it to the cloud (outsourcing phase). Upon receiving the ciphertexts, the cloud performs transformation on the ciphertexts and forwards the results to the authorized users (relaying phase). The users receive messages from the cloud and decrypt them to obtain the plaintext data.
A naive approach would be for the data owner to generate one encryption key for each policy, share this key with the authorized users, and encrypt the data using this key. This does not scale for two reasons. First, the amount of duplicated data that must be encrypted grows linearly with the number of policies. Second, for a sliding window policy , the owner has to transform the data locally and upload the encrypted version to the cloud. This process must be done for every combination of and .
In our protocol, there is only one encrypted copy of the data stored at the cloud for the trigger policies, and up-to copies for sliding window policies where is the number of unique window sizes. To this end, we make use of three important cryptographic primitives: attribute-based encryption (ABE), proxy re-encryption (PRE) and additive homomorphic encryption (AHE). ABE is used to enforce threshold conditions specified in trigger policies, as well as the condition on the starting index in sliding window policies. PRE is used by the cloud to transform ABE ciphertexts to simpler ciphertexts that can be decrypted by users. The aim of this transformation is to relieve the users of expensive operations involving ABE. For sliding window policies, AHE enables the cloud to compute the encrypted aggregates of the data windows over the ciphertexts without breaking data privacy.
III-B Cryptographic Primitives
III-B1 Attribute-Based Encryption.
Attribute-Based Encryption (ABE) schemes produce ciphertexts that can only be decrypted if a set of attributes satisfies certain conditions (or policies). Two types of ABE exist: key-policy ABE (KP-ABE) and ciphertext-policy ABE (CP-ABE). In both cases, a policy is expressed as an access structure over a set of attributes. For example, if the policy is defined as the predicate , then , . The plaintext can be recovered only if evaluating the policy over the given attributes returns true.
In KP-ABE, the message is encrypted with a set of attributes and an access structure is embedded in a decryption key. In CP-ABE, is embedded within the ciphertext and the user is given a set of attributes . In both cases, decryption will succeed and return if . KP-ABE and CP-ABE are based on secret sharing schemes that generate shares in such a way that the original secret can only be reconstructed if a certain access structure is satisfied. Constructions of KP-ABE and CP-ABE for any linear secret sharing scheme (modelled as a span program) are described in [23, 14]. In practice, the schemes rely on threshold tree structures for secret sharing, in which non-leaf nodes are threshold gates (AND and OR).
We adopt KP-ABE in our work, which is more suitable than CP-ABE, because the attributes are based on the plaintext message (the key value ). Although KP-ABE and CP-ABE are similar and in some cases can be used interchangeably, the former is more data-centric (concerning the question of who gets access to the given data), whereas the latter is more user-centric (concerning the question of which data the given user has access to). For the trigger policy we encrypt using KP-ABE with as the attribute and as the policy. Similarly, for the sliding window policy , we encrypt using as the attribute and as the policy.
Since the policies involve arithmetic comparison, we make use of the bag-of-bits method that translates an integer into a set of attributes. For example, with -bit integers, an attribute can be mapped to , and the policy can be mapped into where is an attribute representing values greater than or equal to . It can be seen that a policy involving an equality condition often involves more attributes than a policy with an inequality condition.
III-B2 Proxy Re-Encryption.
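The bag-of-bits idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's exact encoding: the attribute naming scheme (`bit{i}={b}`), the nested-tuple policy format, and the use of a literal `True` as the base case are all hypothetical conveniences. The sketch turns an integer into one attribute per bit and builds an AND/OR tree that is satisfied exactly when the encoded value is greater than or equal to a threshold.

```python
def bag_of_bits(value, n_bits):
    """Encode an integer as one attribute per bit position (hypothetical naming)."""
    return {f"bit{i}={(value >> i) & 1}" for i in range(n_bits)}

def ge_policy(threshold, n_bits):
    """Build an AND/OR tree satisfied by bag_of_bits(x) iff x >= threshold."""
    policy = True  # with no bits left to compare, the values are equal
    for i in range(n_bits):  # extend the comparison from the lowest bit upward
        bit_is_one = f"bit{i}=1"
        if (threshold >> i) & 1:
            # Threshold bit is 1: x needs this bit set AND the lower bits >=.
            policy = ("AND", bit_is_one, policy)
        else:
            # Threshold bit is 0: a set bit already wins, OR compare lower bits.
            policy = ("OR", bit_is_one, policy)
    return policy

def satisfies(policy, attrs):
    """Evaluate a policy tree over a set of attributes."""
    if policy is True:
        return True
    if isinstance(policy, str):
        return policy in attrs
    op, left, right = policy
    if op == "AND":
        return satisfies(left, attrs) and satisfies(right, attrs)
    return satisfies(left, attrs) or satisfies(right, attrs)

print(satisfies(ge_policy(5, 4), bag_of_bits(9, 4)))
```

An equality policy in this encoding would be an AND over all bit attributes of the target value, which is consistent with the observation that equality conditions tend to involve more attributes than inequalities.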
A Proxy Re-Encryption scheme (PRE) enables a third party to transform one ciphertext to another without learning the plaintext. Using a transformation key it can convert to which can be decrypted with , without learning . Traditional PRE schemes have been used for distributed file systems, in a way that saves the data owner from having different copies of the same data for different users. In this work, we extend a recent proxy-based ABE scheme which can transform a KP-ABE ciphertext into an ElGamal-like ciphertext whenever the access structure is satisfied. This scheme has the benefit of traditional PRE schemes, and it relieves users of expensive computations involving ABE.
III-B3 Additive Homomorphic Encryption.
Sliding window policies require that authorized users have access to the averages of data values within non-overlapping windows. To allow for such an aggregate operation over ciphertexts, the encryption scheme must be additively homomorphic, i.e. for an operator . Paillier is one such scheme, which is based on composite residuosity classes of group where is at least 1024-bit.
In this work, we use an ElGamal-like encryption scheme to achieve additive homomorphism, which can also be integrated with ABE. Essentially, consider a multiplicative group with generator . For , and as random values of , then . The homomorphic property holds, because . The drawback of this scheme is that decryption requires finding the discrete logarithm of in base . As in other works using a similar scheme [20, 26, 39], we assume that the plaintext domain is finite and small so that discrete logarithms can be computed by brute-force or even pre-computed and cached. For many types of applications where data values belong to a small set (such as temperature values), this is a reasonable assumption. Note that the security is not compromised due to the mapping of to , which merely acts as an additional layer used to achieve homomorphism.
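A minimal sketch of an ElGamal-like additively homomorphic scheme is given below. It uses the textbook "exponential ElGamal" construction over a toy prime-order subgroup, which is an assumption on our part: the paper's scheme is built on pairing groups and integrated with ABE, and the parameters `P`, `Q`, `G` and all function names here are chosen purely for illustration (they are far too small to be secure). The sketch shows the message being placed in the exponent, ciphertexts being multiplied to add plaintexts, and decryption falling back on a pre-computed discrete-logarithm table.

```python
import secrets

P = 467   # toy safe prime, P = 2*Q + 1 (illustrative only, not secure)
Q = 233   # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup of Z_P^*

def keygen():
    """Return (secret key x, public key h = G^x)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def encrypt(h, m):
    """Encrypt m as (G^r, G^m * h^r); the message sits in the exponent."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P

def add(c1, c2):
    """Component-wise product of ciphertexts encrypts the sum of plaintexts."""
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P

# Pre-computed discrete logarithms for the small plaintext domain 0..Q-1.
LOG_TABLE = {pow(G, m, P): m for m in range(Q)}

def decrypt(x, c):
    """Strip the mask h^r, then look up the discrete log of G^m."""
    g_m = (c[1] * pow(c[0], -x, P)) % P   # negative exponent needs Python 3.8+
    return LOG_TABLE[g_m]

x, h = keygen()
total = add(encrypt(h, 3), encrypt(h, 5))
print(decrypt(x, total))
```

The table lookup only works while sums stay inside the pre-computed range, which mirrors the assumption that the plaintext domain is finite and small.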
To prevent a user from learning the individual values constituting a window, we introduce a blind factor into the ciphertext, which can only be removed when the proper number of ciphertexts are put together. Suppose the windows are of size , the owner generates randomly and gives to the user. The ciphertext and cannot be decrypted individually, since and are unknown to the user. However, can be decrypted by computing and recovering by taking its discrete logarithm. In our protocol, the user only needs to know for the starting window, with which the unblind factors for the subsequent windows can be efficiently computed.
III-C Protocol for Sliding Window Policies
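The blinding idea can be illustrated on plain integers, leaving encryption aside. The sketch below is an assumption-laden simplification: in the protocol the blind factors live inside the ciphertexts, and the user derives the unblind factors of later windows from the first one, neither of which is shown here. The modulus `Q` and the function names are hypothetical. Each value is masked with its own random blind, and the user is given only the combined unblind factor for a whole window.

```python
import secrets

Q = 233  # toy prime modulus (illustrative only)

def blind_window(values):
    """Mask each value with a fresh random blind.

    Returns the masked values and the single unblind factor for the window.
    """
    blinds = [secrets.randbelow(Q) for _ in values]
    masked = [(v + r) % Q for v, r in zip(values, blinds)]
    unblind = sum(blinds) % Q   # the only blinding secret given to the user
    return masked, unblind

def window_sum(masked, unblind):
    """Recover the window sum; works only when the full window is combined."""
    return (sum(masked) - unblind) % Q

masked, unblind = blind_window([20, 21, 22])
print(window_sum(masked, unblind))
```

Because each masked value is uniformly distributed on its own, a user holding only the window-level unblind factor learns the sum (and hence the average) but not the individual readings.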
Figure 2 illustrates the high-level protocol for sliding window policies. In the negotiation phase, the data owner and user agree on the sliding window parameters . The owner generates an access tree for the condition , a transformation key and a decryption key for the user. It sends to the user and to the cloud. For every tuple , the owner encrypts using KP-ABE with attribute . Each tuple is encrypted times, each for a different window size and using a different blind factor. During the relaying phase, the cloud selects ciphertexts whose attribute is greater than or equal to , and transforms them using into other ciphertexts that can be decrypted with another key . Every of the transformed ciphertexts are used as inputs to the ComputeSum function which produces an encrypted sum value which can be decrypted using .
Our protocol relies on the properties of bilinear maps for security. Let be two multiplicative groups of prime order , be the generator of . is a bilinear map if it is efficiently computable, for and . Given random group elements , it is difficult to compute (Computational Bilinear Diffie-Hellman Assumption). Also, it is difficult to distinguish from a random element (Decisional Bilinear Diffie-Hellman Assumption).
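For reference, the standard textbook statements of bilinearity and of the two hardness assumptions can be written as follows. The symbol names ($\mathbb{G}_1$, $\mathbb{G}_2$, $p$, $g$, $e$, and the exponents $a, b, c, z$) are assumed notation for this illustration and may differ from the notation used in the protocol description.

```latex
% Bilinear map between groups of prime order p, with g a generator of G_1
e : \mathbb{G}_1 \times \mathbb{G}_1 \to \mathbb{G}_2, \qquad
e(u^{a}, v^{b}) = e(u, v)^{ab}
\quad \forall\, u, v \in \mathbb{G}_1,\; a, b \in \mathbb{Z}_p, \qquad
e(g, g) \neq 1 .

% Computational Bilinear Diffie-Hellman (CBDH):
% given (g, g^{a}, g^{b}, g^{c}) for random a, b, c, it is hard to compute
e(g, g)^{abc} .

% Decisional Bilinear Diffie-Hellman (DBDH):
% for random a, b, c, z, it is hard to distinguish
(g^{a}, g^{b}, g^{c}, e(g, g)^{abc})
\quad \text{from} \quad
(g^{a}, g^{b}, g^{c}, e(g, g)^{z}) .
```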
Negotiation phase. During setup, all parties agree on a security parameter . Let be the attribute universe, i.e. . The owner chooses a master secret value , and other secret . For every window size , it generates a set of random values . The public key contains .
Let be a user with a sliding policy . The owner generates a random value . Next, the policy is converted into an access tree whose leaves are the bag-of-bit representation of , as explained in . Every non-leaf node is a threshold gate: an AND node is a 2-out-of-2 threshold gate, an OR node is a 1-out-of-2 gate. This tree is used to share the master secret . For a node with threshold , choose a random polynomial of degree (so that points are needed to reconstruct the polynomial). For the root node, set . For any other node, set where is the polynomial of ’s parent and is the node index. The leaf node associated with an attribute is given the value .
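The top-down sharing of a secret over a tree of threshold gates can be sketched as follows. This is a plain-integer illustration under stated assumptions: the field modulus `PRIME`, the tuple-based tree format, and the helper names are hypothetical, and reconstruction is done directly on the shares, whereas in the protocol the corresponding interpolation happens in the exponent during the cloud's Transform step. Each gate with threshold k draws a random polynomial of degree k-1 whose value at 0 is the gate's own share, and hands its i-th child the polynomial's value at i.

```python
import secrets

PRIME = 2**61 - 1  # field modulus (a Mersenne prime), chosen for illustration

def leaf(attr):
    return ("leaf", attr)

def gate(k, *children):
    """k-out-of-n threshold gate: AND is 2-of-2, OR is 1-of-2."""
    return ("gate", k, list(children))

def share_secret(node, secret):
    """Return a copy of the tree in which every leaf carries its share."""
    if node[0] == "leaf":
        return ("leaf", node[1], secret)
    _, k, children = node
    # Random polynomial q of degree k-1 with q(0) = secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shared = []
    for index, child in enumerate(children, start=1):
        q_at_index = sum(c * pow(index, e, PRIME)
                         for e, c in enumerate(coeffs)) % PRIME
        shared.append(share_secret(child, q_at_index))
    return ("gate", k, shared)

def reconstruct(node, attrs):
    """Recover the node's secret from leaves whose attribute is in attrs."""
    if node[0] == "leaf":
        return node[2] if node[1] in attrs else None
    _, k, children = node
    points = [(i, v) for i, c in enumerate(children, start=1)
              if (v := reconstruct(c, attrs)) is not None]
    if len(points) < k:
        return None  # access structure not satisfied at this gate
    points = points[:k]
    secret = 0
    for i, v in points:
        coeff = 1  # Lagrange coefficient of point i evaluated at 0
        for j, _ in points:
            if j != i:
                coeff = coeff * j % PRIME * pow((j - i) % PRIME, -1, PRIME) % PRIME
        secret = (secret + v * coeff) % PRIME
    return secret

tree = gate(2, gate(1, leaf("a"), leaf("b")), leaf("c"))  # (a OR b) AND c
shared = share_secret(tree, 12345)
print(reconstruct(shared, {"a", "c"}), reconstruct(shared, {"a", "b"}))
```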
For the policy , the owner computes:
Finally, the user is sent
where and are the transformation and user decryption keys respectively.
Outsourcing phase. To encrypt , the data owner first translates into bag-of-bit attributes . Next, it chooses a random value and computes for all attribute . It also computes:
Finally, the following ciphertext is forwarded to the cloud:
Relaying phase. The transformation proceeds recursively as follows. When is a leaf node associated with attribute :
When is a non-leaf node, we call for all that are ’s children. Let be the Lagrange coefficient for an element and . Let be the output of , and be the set of ’s children such that . If , return . Otherwise:
By calling Transform on the root node of the access tree, we obtain . Let . For the window, the cloud has to gather the full set of for . Then, it executes ComputeSum as follows:
and sends to the user.
Having received this message from the cloud, the user computes:
Finally, the average is computed as:
III-D Protocol for Trigger Policy
The protocol for trigger policies is very similar to that for sliding window. Its main differences are that the access policies may involve other conditions besides as in sliding window, and that no blind factor is needed since authorized users can access individual data tuples.
During the negotiation phase, the owner creates the public key containing as before. Next, the access structure for the policy is constructed. The transformation key contains for every leaf node , and the user decryption key is . During the outsourcing phase, a ciphertext sent to the cloud is generated in the same way as with the sliding window policy, except for . During transformation, the cloud computes as before and sends to the user, who decrypts it by taking the discrete logarithm of .
Security. The protocols described above provide data privacy with respect to the cloud, because we use an encryption scheme which is an extension of the proxy-based ABE scheme proposed in . As shown in , the scheme is Replayable-CCA (or RCCA) secure. Our encryption is secure in the standard model (as opposed to security in random oracle model), since we use the small-universe construction of ABE.
The access control property for trigger policies holds as a result of ABE. For sliding window policies, we now discuss an informal proof showing that unauthorized access is not possible. Particularly, we need to show that a user given access according to a policy cannot decrypt values belonging to other policies where , , or . Since the ciphertext is encrypted with the attribute and the policy embedded in the transformation key requires , the Transform operation will fail for . Thus, the user is prevented from learning the values in for . From , one can infer the value for and , since . However, for , deriving from is not possible as it requires knowing the sum of a subset of . Therefore, the user cannot recover the plaintext because the unblind factors are incorrect. This similarly holds when or , because computing the blind factors in these cases requires knowing individual values of .
Scalability. The most expensive cryptographic operations are exponentiations and pairings. The latter are done only by the cloud during transformation. The number of pairings is proportional to the size of the access tree. Notice that the access tree is likely to be smaller for a sliding window policy than for a trigger policy, as there is a smaller number of attributes involved in constructing than in constructing . Without the cloud, users would have to perform these pairings themselves. Instead, in our protocol, the user only performs simple operations such as inversion, multiplication, and only one exponentiation. Having to find a discrete logarithm during decryption is a potential bottleneck. But as discussed earlier, we assume the data value domain is known and finite, hence the user may pre-compute the discrete logarithms, thus reducing this problem to a simple table look-up. If is the maximum value that can take, the look-up table has the size of for trigger policies, and for sliding policies where is the maximum window size.
The data owner’s computations occur during the negotiation phase and the outsourcing phase. Public and master keys are generated only once, hence the cost will amortize over time. Generation of transformation keys is done once per policy. Compared to data encryption during the outsourcing phase, this operation is much less frequent and can be considered as a constant. In other words, the overall setup cost is a one-off cost which amortizes over time. During the outsourcing phase, the owner performs one encryption per data value for trigger policies, and encryptions for sliding window policies ( being the number of window sizes). The cost per encryption is roughly the same for both trigger and sliding policies, and is dominated by the cost of exponentiations.
There are overheads incurred when storing the ciphertexts in the cloud. First, ABE encryption introduces overheads that are proportional to the number of attributes per ciphertext. This overhead is constant in our protocol, because the number of attributes is fixed at . Second, the data owner may wish to support sliding window policies of different window sizes, which entails storing multiple ciphertexts for the same data value at the cloud. However, when encrypting for different values of only will be different. As a result, the overhead is limited to where is the number of bits per group element.
IV Access Control Management
Before starting the Transform operation, the cloud checks if the attributes associated with the ciphertext satisfy the access structure associated with the policy. This is to avoid performing redundant transformations which return . As the number of streams and policies increases, management and matching of policies must be done in a systematic and scalable manner. To this end, we adopt a holistic approach that leverages the XACML framework, allowing the data owner to specify access control policies in XML format. It also provides a unified framework for managing policies at the cloud, for defining access control requests and matching them against the policies. Traditionally, XACML is used for access control in trusted domains, such as internal systems or in trusted collaborations. Our work utilizes XACML solely for policy management and not for access control enforcement. The latter is instead realized by our cryptographic protocols.
XACML is an OASIS framework for specifying and enforcing access control. It defines standards for writing policies, requests, matching of requests against policies and interpreting the response. Here, we briefly explain the main components of XACML; more details can be found in .
Requests in XACML are written in XML, which contain subject credentials and the system resources being accessed. Subjects and resources are specified in the Attribute elements included in the Subjects and Resources element respectively. Each policy in XACML contains a Target and a set of Rules. A Target element consists of a set of matching rules that must be met by the request before the rest of the policy can be evaluated. The XACML framework comprises two main components: a Policy Enforcement Point (PEP) and a Policy Decision Point (PDP). Requests are sent to PEP which translates them to canonical forms before sending to PDP. It interprets responses from PDP and sends the results back to the users. Policies are loaded into and managed by PDP. After receiving well-formed requests from PEP, it evaluates them against the loaded policies and sends the well-formed responses back to the PEP.
IV-B Using XACML With Our Protocols
In traditional settings, resource owners write XACML policies and upload them to an access control server. Users interested in the resource write XACML requests and submit them to the server, where they are evaluated by the PDP. If the decisions are Permit, the users are granted access.
In our setting, the cloud acts as the access control server and runs an XACML instance. Each data stream is represented as a resource. The data owner writes XACML policies representing the trigger or sliding window policies. For each policy, the cloud maintains a list of users to which the policy is given. The user does not have to write or submit XACML requests to the cloud. Instead, it is the owner who specifies an XACML request for every ciphertext it sends to the cloud. Specifically, the request generated for the tuple contains the value of in the Subject element. Once received at the cloud, it is evaluated against the loaded policies. The result is a set of matching policies and a set of users to whom access to the value should be granted. Figure 3 illustrates an XACML policy for sliding window policies where . Figure 4 shows a request constructed for the tuple where . When evaluated, the PDP will return a Deny decision, and hence the cloud will not perform transformation for the ciphertext of this tuple.
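The cloud-side pre-check can be pictured as a simple filter that runs before any transformation. The sketch below does not use an XACML library and does not reproduce the policy or request formats of Figures 3 and 4: the dictionary-based policy records, the field names, the stream and user identifiers, and the `evaluate` function are all hypothetical stand-ins for what the PDP does when it matches the owner-supplied request against the loaded policies.

```python
def evaluate(policies, stream_id, key):
    """Mimic a PDP decision for one incoming ciphertext.

    Returns ("Permit", users) when at least one sliding window policy on this
    stream has a start index at or below the tuple's key, else ("Deny", set()).
    """
    users = set()
    for policy in policies:
        if policy["stream"] == stream_id and key >= policy["start"]:
            users.update(policy["users"])
    return ("Permit", users) if users else ("Deny", set())

# Hypothetical policies loaded at the cloud.
policies = [
    {"stream": "weather", "start": 10, "users": {"alice"}},
    {"stream": "weather", "start": 50, "users": {"bob"}},
]

for key in (5, 20, 60):
    decision, users = evaluate(policies, "weather", key)
    # The cloud would run Transform only for the users of a Permit decision.
    print(key, decision, sorted(users))
```

Skipping the Deny cases up front is what saves the cloud from running pairing-heavy transformations that are bound to fail.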
V Prototype and Evaluation
V-A Prototype Implementation
We implement a prototype (the source code is available at https://code.google.com/p/streamcloud/) to demonstrate the feasibility of the proposed approach and to benchmark the various components under simple settings. We extend the KP-ABE implementation from , which uses the PBC library for pairing operations. Our extensions mainly concern sliding window policies, and include functions for the Transform operation and for computing the encrypted sums at the cloud. We use a Java implementation of XACML for managing and matching of policies. The communication between data owners, the cloud and users is implemented in Java, using sockets and a one-thread-per-connection concurrency model.
We set up simple experiments to investigate the costs of the main operations in our protocols. For each operation, we use latency as a metric for measuring computation cost, which is also representative of the overhead caused by the operation.
We use the fastest pairing implementation for our experiments (type A ), whose base field size is bits. One data owner sends one stream of data at a certain rate. There is one data user, for whom the owner specifies either a trigger policy or a sliding window policy (but not both). We fix the sliding window size to and the maximum data value to , which implies that the maximum sum of a window is . For trigger policies, we set to be equality comparison. Our experiments run on three machines: two desktops acting as the owner and the user, and one laptop acting as the cloud. Each desktop has 2 2.66 GHz DuoCore CPUs and 4 GB of RAM and runs Ubuntu Linux. The laptop has a 2.3 GHz Core i5 CPU and 4 GB of RAM and runs Snow Leopard. All machines are connected via the university LAN.
Figure 5 depicts the costs of cryptographic operations. The setup cost for the data owner is the highest, since it involves generating the public parameters and the transformation key. This cost dominates the setup time at the user. Recall that this is a one-time cost, and hence it does not affect run-time performance or the practicality of the approach. The cost for sliding window policies is slightly smaller than for trigger policies, because the transformation keys in the former are smaller (as discussed in the previous section). Another significant cost in the setup phase is pre-computing the look-up table for discrete logarithms of values in . The setup time at the cloud is much smaller, because the cloud only needs to read and initialize the public and transformation keys from byte arrays.
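The look-up table trades setup time for fast decryption: when a plaintext is recovered only as a group element g^m and m is known to lie in a small range, m can be read from a precomputed table instead of being searched for. A minimal sketch, assuming a toy multiplicative group modulo a safe prime rather than the pairing groups used in the prototype; the parameters `P`, `Q`, `G`, the bound of 100, and the function names are illustrative assumptions, not the paper's values.

```python
from typing import Dict

# Toy parameters (assumptions for illustration): P = 2*Q + 1 is a safe prime
# and G = 4 generates the subgroup of prime order Q.
P = 1019
Q = 509
G = 4


def build_dlog_table(max_value: int) -> Dict[int, int]:
    """Precompute the map G^m mod P -> m for every m in [0, max_value]."""
    if not 0 <= max_value < Q:
        raise ValueError("max_value must lie in [0, Q) so entries are distinct")
    table: Dict[int, int] = {}
    element = 1  # G^0
    for m in range(max_value + 1):
        table[element] = m
        element = (element * G) % P
    return table


def lookup_dlog(table: Dict[int, int], element: int) -> int:
    """Recover m from G^m mod P; raises KeyError if m is outside the table."""
    return table[element]


# One-time setup cost, paid before any value is decrypted.
TABLE = build_dlog_table(max_value=100)
```

Building the table costs one multiplication per entry and memory linear in the bound, which is consistent with this step showing up as a setup cost rather than a per-tuple cost.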
The cost per encryption is the same for both sliding window and trigger policies (as predicted in the previous section), which is around . We believe this cost is reasonable, especially considering that real data streams often generate data at long intervals (in the order of seconds or even minutes). The transformation cost at the cloud for trigger policies is orders of magnitude larger than for sliding window policies. This is because transformation for the policy requires pairings, whereas for policy only pairing is needed for large values of . The maximum latency is around , which we also consider reasonable. Decryption cost at the user is an order of magnitude smaller than transformation at the cloud, which further illustrates the benefit of outsourcing the bulk of the computation to the cloud.
To understand the overhead of encryption with respect to user-perceived latency, we measure the elapsed time between consecutive decrypted values at the user. For trigger policies, we fix to be the same as , so that the user has access to all the data. Figure 6 shows the inter-arrival time at the user. This metric has a lower bound as we increase the sending rate (by decreasing the send interval), and the bound equals the cost per encryption. This is expected, since encryption becomes the bottleneck at the data owner for high sending rates.
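The lower bound can be reproduced with a simple queueing model. The sketch below assumes that the owner encrypts tuples one at a time and that network, cloud and decryption delays are constant, so inter-arrival gaps at the user equal the gaps between encryption completions at the owner. The function names and the millisecond figures are illustrative assumptions, not measurements from the prototype.

```python
from typing import List


def encryption_finish_times(num_tuples: int, send_interval_ms: int,
                            encryption_cost_ms: int) -> List[int]:
    """Time at which each tuple leaves a single-threaded encrypting owner."""
    finish_times: List[int] = []
    busy_until = 0
    for i in range(num_tuples):
        generated_at = i * send_interval_ms
        start = max(generated_at, busy_until)  # wait for the previous encryption
        busy_until = start + encryption_cost_ms
        finish_times.append(busy_until)
    return finish_times


def inter_arrival_gaps(times: List[int]) -> List[int]:
    """Gaps between consecutive tuples."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Slow stream: gaps follow the send interval.
slow = inter_arrival_gaps(encryption_finish_times(5, send_interval_ms=100,
                                                  encryption_cost_ms=50))
# Fast stream: gaps cannot drop below the cost per encryption.
fast = inter_arrival_gaps(encryption_finish_times(5, send_interval_ms=10,
                                                  encryption_cost_ms=50))
```

Under these assumptions the steady-state gap is the larger of the send interval and the per-encryption cost, which matches the plateau described above.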
Finally, we investigate the cost of running the XACML framework. We upload a large number of policies and send matching requests to the cloud. The cost of policy matching, depicted in Figure 7, is small and scales gracefully with the number of policies. This suggests that the system can scale to large numbers of policies, streams and users.
VI Related Work
Existing works on access control for stream data assume trusted domains and focus on the specification and enforcement of access policies [17, 16, 30]. They differ from our work, which considers outsourcing access control to an untrusted environment. Similarly, previous works that use XACML for fine-grained access control have also focused on trusted domains . We use XACML only for policy management, and rely on encryption schemes for policy enforcement.
When the cloud is untrusted, systems such as CryptDB  and LLP  guarantee data privacy while still enabling meaningful, database-related queries on ciphertexts. Our work also provides data privacy, i.e. the cloud cannot learn plaintext values, but it focuses on the question of who can access the data and at what granularity. Access control in untrusted environments has been investigated in [27, 41], both of which deal with archival data. Our use of ABE allows for more flexible, fine-grained access control than in . The system proposed in  also employs KP-ABE and the cloud as a proxy, but it does not work with stream data, nor does it support sliding window policies. Furthermore, the cloud’s main role in  is during key revocation requiring data re-encryption, and unlike in our work, the cloud does not relieve users of expensive computations.
Attribute-Based Encryption schemes are being actively researched [31, 41, 23, 14, 29]. Our protocols can be viewed as an extension of the proxy-based ABE scheme . Compared to the original work, we have built a system that supports meaningful access control policies for stream data. We have additionally integrated the XACML framework for access control management. Finally, we have provided a holistic prototype implementation and a preliminary evaluation of the encryption scheme as well as of the system as a whole.
VII Conclusions and Future Work
In this paper, we have proposed a system that allows secure and scalable outsourcing of access control to the cloud. Our system works with stream data, for which access control is more complex than for archival data. We have extended an Attribute-Based Encryption scheme to support trigger and sliding window policies. We have employed the cloud as a computational proxy that performs the expensive cryptographic operations on behalf of the user. In particular, the cloud transforms complex ABE ciphertexts into ElGamal-like ciphertexts that can be decrypted only by authorized users. We have integrated the standard XACML framework to aid policy management. Thus, our approach to access control outsourcing is holistic and able to scale to large numbers of data streams and policies. Preliminary evaluation using a prototype indicates not only the feasibility but also the benefits of access control outsourcing.
While we have presented a working system as a proof of concept that outsourcing access control is feasible and practical for stream data, much is left for future work. First, our current prototype cannot handle large, concurrent workloads. Our immediate plan is either to improve the prototype by replacing the current one-thread-per-connection concurrency model with an event-driven architecture (SEDA ), or to use existing high-throughput stream processing engines such as STREAM  or Borealis . Either step will be followed by experiments at larger scale (more streams, more policies, more users, more cloud servers) over a real cloud infrastructure.
The current system can be extended to support multi-value streams by performing different encryptions for different values. We have not considered access revocation, which in our case requires the data owner to change the attribute set during encryption. We plan to investigate whether existing revocable KP-ABE schemes  can be integrated into our work, especially if they allow revocation to be outsourced to an untrusted cloud. Other interesting extensions are to add support for policies with negative attributes , and for encryption with hidden attributes. The former allows for a wider range of access policies, whereas the latter provides attribute privacy, which is necessary when encryption attributes are the actual data.
Finally, we would like to relax the trust assumption for the cloud. Currently, we assume that the cloud performs all the computations delegated to it honestly. However, guarding against a malicious cloud that skips or distorts computations will greatly increase the cost of providing security while maintaining good service quality.
- [1] Healthfrontier. www.healthfrontier.com.
- [2] Key-policy attribute-based encryption scheme implementation. http://www.cnsr.ictas.vt.edu/resources.html.
- [3] National Oceanic and Atmospheric Administration. www.noaa.gov.
- [4] OASIS eXtensible Access Control Markup Language (XACML). https://www.oasis-open.org/committees/xacml/.
- [5] The pairing-based cryptography library. http://crypto.stanford.edu/pbc.
- [6] xignite: on demand financial market data. xignite.com.
- [7] Stanford stream data manager. http://infolab.stanford.edu/stream, 2003.
- [8] Sun’s XACML implementation. http://sunxacml.sourceforge.net/, 2006.
- [9] Daniel J. Abadi, Don Carney, Ugur Cetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, and Stan Zdonik. Aurora: a new model and architecture for data stream management. VLDB Journal, 12(2):120–39, 2003.
- [10] Arvind Arasu, Mitch Cherniack, Eduardo Galvez, David Maier, Anurag S. Maskey, Esther Ryvkina, Michael Stonebraker, and Richard Tibbetts. Linear Road: a stream data management benchmark. In VLDB, pages 480–91, 2004.
- [11] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. Above the clouds: a Berkeley view of cloud computing. Technical Report UCB/EECS-2009-28, EECS Department, UCB, 2009.
- [12] Nuttapong Attrapadung. Revocation scheme for attribute-based encryption. RCIS Workshop, http://www.rcis.aist.go.jp/files/events/2008/RCIS2008/RCIS2008_3-5_Nuts%.pdf, 2008.
- [13] Hitesh Ballani, Paolo Costa, Thomas Karagiannis, and Ant Rowstron. Towards predictable datacenter networks. In SIGCOMM, pages 242–53, 2011.
- [14] John Bethencourt, Amit Sahai, and Brent Waters. Ciphertext-policy attribute-based encryption. In IEEE Symposium on Security and Privacy, pages 321–34, 2007.
- [15] Matt Blaze, Gerrit Bleumer, and Martin Strauss. Divertible protocols and atomic proxy cryptography. In EUROCRYPT’98, pages 127–44, 1998.
- [16] Barbara Carminati, Elena Ferrari, and Kian Lee Tan. Enforcing access control over data streams. In SACMAT, pages 21–30, 2007.
- [17] Barbara Carminati, Elena Ferrari, and Kian Lee Tan. Specifying access control policies on data streams. In DASFAA, pages 410–21, 2007.
- [18] Yao Chen and Radu Sion. To cloud or not to cloud? Musings on costs and viability. In 2nd ACM Symposium on Cloud Computing, 2011.
- [19] Mitch Cherniack, Hari Balakrishnan, Magdalena Balazinska, Don Carney, Ugur Cetintemel, Ying Xing, and Stan Zdonik. Scalable distributed stream processing. In CIDR, 2003.
- [20] Ronald Cramer, Rosario Gennaro, and Berry Schoenmakers. A secure and optimally efficient multi-authority election scheme. In EUROCRYPT’97, pages 103–118, 1997.
- [21] Tien Tuan Anh Dinh, Wang Wenqiang, and Anwitaman Datta. City on the sky: extending XACML for flexible, secure data sharing on the cloud. Journal of Grid Computing, 10(1):151–72, 2012.
- [22] Prabal Dutta, Paul M. Aoki, Neil Kumar, Alan Mainwaring, Chris Myers, Wesley Willett, and Allison Woodruff. Common sense: participatory urban sensing using a network of handheld air quality monitors. In 7th ACM Conference on Embedded Networked Sensor Systems, pages 349–50, 2009.
- [23] Vipul Goyal, Omkant Pandey, Amit Sahai, and Brent Waters. Attribute-based encryption for fine-grained access control of encrypted data. In CCS’06, pages 89–98, 2006.
- [24] Matthew Green, Susan Hohenberger, and Brent Waters. Outsourcing the decryption of ABE ciphertexts. In 20th USENIX Conference on Security, 2011.
- [25] Mohammad Hajjat, Xin Sun, Yu-Wei Eric Sung, David Maltz, Sanjay Rao, Kunwadee Sripanidkulchai, and Mohit Tawarmalani. Cloudward bound: planning for beneficial migration of enterprise applications to the cloud. In SIGCOMM, 2010.
- [26] Ge Zhong, Ian Goldberg, and Urs Hengartner. Louis, Lester and Pierre: Three protocols for location privacy. In 7th International Conference on Privacy Enhancing Technologies, pages 62–76, 2007.
- [27] Mahesh Kallahalla, Erik Riedel, Ram Swaminathan, Qian Wang, and Kevin Fu. Plutus: scalable secure file sharing on untrusted storage. In FAST, pages 29–42, 2003.
- [28] Nicholas D. Lane. Community-aware smartphone sensing systems. IEEE Internet Computing, 16(3):60–64, 2012.
- [29] Allison Lewko and Brent Waters. Decentralizing attribute-based encryption. In EUROCRYPT, pages 568–88, 2011.
- [30] Rimma V. Nehme, Elke A. Rundensteiner, and Elisa Bertino. A security punctuation framework for enforcing access control on streaming data. In International Conference on Data Engineering, pages 406–15, 2008.
- [31] Rafail Ostrovsky, Amit Sahai, and Brent Waters. Attribute-based encryption with non-monotonic access structures. In CCS’07, pages 195–203, 2007.
- [32] Pascal Paillier. Public-key cryptosystems based on composite degree residuosity classes. In EUROCRYPT, pages 223–38, 1999.
- [33] J.M. Pollard. Monte Carlo methods for index computation (mod p). Mathematics of Computation, 32(143):918–24, 1978.
- [34] Raluca Ada Popa, Jay Lorch, David Molnar, Helen J. Wang, and Li Zhuang. Enabling security in cloud storage SLAs with CloudProof. In USENIX’11, 2011.
- [35] Raluca Ada Popa, Nickolai Zeldovich, and Hari Balakrishnan. CryptDB: a practical encrypted relational DBMS. Technical Report MIT-CSAIL-TR-2011-005, CSAIL, MIT, 2011.
- [36] Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill Higher Education, 3rd edition, 2002.
- [37] Alan Shieh, Srikanth Kandula, Albert Greenberg, Changhoon Kim, and Bikas Saha. Sharing the data center network. In NSDI, 2011.
- [38] Byung Chul Tak, Bhuvan Urgaonkar, and Anand Sivasubramaniam. To move or not to move? The economics of cloud computing. In 3rd USENIX Conference on Hot Topics in Cloud Computing, 2011.
- [39] Osman Ugus, Dirk Westhoff, Ralf Laue, Abdulhadi Shoufan, and S. A. Huss. Optimized implementation of elliptic curve based additive homomorphic encryption for wireless sensor. CoRR, 2009.
- [40] Matt Welsh, David Culler, and Eric Brewer. SEDA: An architecture for well-conditioned, scalable internet services. In SOSP, 2001.
- [41] Shucheng Yu, Cong Wang, Kui Ren, and Wenjing Lou. Achieving secure, scalable and fine-grained data access control in cloud computing. In INFOCOM, pages 534–542, 2010.