Mixed reality (MR) technology is now gaining ground due to advances in computer vision, sensor fusion, and realistic display technologies. With most research and development focused on delivering the promise of MR, only a few efforts address the privacy and security implications of this technology. This survey paper aims to bring these risks to light and to examine the latest security and privacy work on MR. Specifically, we list and review the different protection approaches that have been proposed to ensure user and data security and privacy in MR. We extend the scope to include work on related technologies such as augmented reality (AR), virtual reality (VR), and human-computer interaction (HCI) as crucial components, if not the origins, of MR, as well as a number of works from the larger area of mobile devices, wearables, and Internet-of-Things (IoT). We highlight the lack of investigation, implementation, and evaluation of data protection approaches in MR. Further challenges and directions on MR security and privacy are also discussed.
Security and Privacy Approaches in Mixed Reality: A Literature Survey
CCS Concepts: Human-centered computing → Mixed / augmented reality; Security and privacy → Human and societal aspects of security and privacy; Security and privacy → Privacy protections; Security and privacy → Usability in security and privacy
Originally, the term mixed reality (MR) was used to refer to the various devices – specifically, displays – that span the reality-virtuality continuum [Milgram et al. (1994)]. This means that augmented reality (AR) systems and virtual reality (VR) systems are both MR systems but lie at different points along the reality-virtuality continuum, as seen in Figure 1. Presently, mixed reality has a hybrid definition that combines aspects of AR and VR to deliver rich services and immersive experiences [Curtin (2017)]. In particular, mixed reality allows real objects to interact with synthetic virtual objects, and vice versa. By combining the synthetic presence offered by VR and the extension of the real world by AR, MR enables a virtually endless suite of applications that current AR and VR platforms, devices, and applications do not offer.
Advances in computer vision (particularly in object sensing, object tracking, and gesture identification), sensor fusion, and artificial intelligence have furthered human-computer interaction as well as the machine understanding of the real world. At the same time, advances in 3D rendering, optics (such as projections and holograms), and display technologies have made possible the delivery of realistic virtual experiences. All these technologies make MR possible. As a result, MR can now allow us to interact with machines in a totally different manner: for example, using gestures in the air instead of swiping on screens or tapping on keys. The output of our interactions with a machine will also no longer be confined within a screen. Instead, outputs will be mixed with our real-world experience, and we may not even be able to tell which is real and which is synthetic. Microsoft’s Hololens offers a glimpse of what these MR devices can do. It allows users to interact with holographic augmentations in a more seamless and direct manner [Goode and Warren (2015)]. Looking ahead, these MR devices are expected to integrate with existing technologies, e.g. smart phones, wearable devices, and IoT devices, and to extend, if not enhance, the capabilities of these devices.
Most of the work on mixed reality for the past two decades has been focused on delivering the necessary technology to make MR a possibility. As the necessary technology is starting to mature, and with the preceding release of AR and VR devices such as the Google Glass and Oculus Rift, many companies have started releasing MR-capable devices. As these devices become more available and affordable, their use, as with any other technology, may entail security and privacy implications, and some of these risks may not yet be known. For example, it has been demonstrated how facial images captured by a web camera can be cross-matched with publicly available online social network (OSN) profile photos to match names with faces and further determine social security numbers. With most MR devices coming out in a wearable, i.e. head-mounted, form factor and having at least one camera to capture the environment, it will be easy to perform such facial matching tasks in the wild without the subjects knowing it. Security and privacy, more often than not, come as an afterthought, as demonstrated by the OSN experiment. We are observing a similar trend in MR, as further explained in this survey paper.
To systematically capture this trend, we used the search tool of Scopus to gather AR and MR literature, and to further identify works that involve security and privacy. Although Scopus does not index literature from all sources, particularly in security and privacy research, the search tool can include secondary documents that have been cited by Scopus-indexed documents, which effectively covers most security and privacy literature such as that from the USENIX Security Symposium and the Internet Society’s Network and Distributed System Security (NDSS) Symposium. Figure 2 shows the yearly percentage of these papers. Despite the increase from 0.7% in 1997 to 5.8% in 2016, most papers only “mention” security and privacy, and only a few (1.42% for 2016) actually discuss the impacts, or present security and privacy approaches applied to, or using, AR/MR.
Early surveys on AR and MR focused on categorizing the technologies existing at the time. In 1994, Milgram and Kishino presented a taxonomy for classifying mixed reality displays based on the user-interface – from monitor-based video displays to completely immersive environments – and plotted these devices along their reality-virtuality continuum [Milgram and Kishino (1994)]. On the other hand, in contrast to this one-dimensional continuum, Benford et al. (1996) presented two classifications for mixed reality: a two-dimensional categorization of shared space or collaborative mixed reality technologies according to concepts of transportation
Succeeding endeavours have focused on collecting all relevant technologies necessary to AR and VR. The various early challenges – such as matching the real and virtual displays, aligning the virtual objects with the real world, and the various errors that need to be addressed such as optical distortion, misalignment, and tracking – have been discussed broadly [Azuma (1997)]. This was complemented by a follow-up survey that focuses on the enabling technologies, interfacing, and visualization [Azuma et al. (2001)]. A much more recent survey updated the existing challenges to the following: performance, alignment, interaction, mobility, and visualization [Rabbi and Ullah (2013)]. Another one looked into a specific type of AR, mobile AR, and the different technologies that enable mobility with AR [Chatzopoulos et al. (2017)]. Lastly, a review of the various head-mounted display technologies for consumer electronics [Kress and Starner (2013)] was also undertaken. While all of these different challenges and technologies are important to enable AR, none of these survey or review papers have focused on the fundamental issues of security and privacy in AR or MR.
This survey paper makes the following contributions:
To the best of our knowledge, this is the first survey on the relevant security and privacy challenges and approaches on mixed reality.
We provide a data-centric categorization of the various works, which groups them into five major aspects and, further, into subcategories, and we present generic system block diagrams to capture these different aspects.
We include a collection of other relevant work that is not necessarily directed at mixed reality but is expected to be related to, or part of, the security and privacy considerations for MR.
Lastly, we also include a listing of the latest mixed reality devices and platforms.
Before proceeding to the review of the various security and privacy work, we clarify that we do not focus on network security and related topics. We rely on the effectiveness of existing security and privacy measures that protect communication networks and data transmission in general.
The rest of the survey proceeds as follows. Section 2 explains the categorization used on the literature to capture the different aspects of MR. The details of these different security and privacy work in the literature are discussed in Section 3. Lastly, in Section 4, the remaining challenges and future directions are discussed before finally concluding this survey in Section 5.
2 Categorizing the Approaches
Previous surveys on MR tackled the issues of computer vision, sensing, and augmentation with the goal of trying to understand all the necessary aspects to deliver the technology. A few others have pointed out non-technical issues such as ethical considerations [Heimo et al. (2014)] and value-sensitive design approaches [Friedman and Kahn Jr (2000)] that push for the consideration of data ownership, privacy, secrecy, and integrity. A more recent work emphasized the three aspects for protection in AR – input, data access, and output – over varying system complexity (from single application, to multiple applications, and, eventually, to multiple systems) [Roesner et al. (2014b)]. We expand on these three aspects and include interaction and device protection as equally important aspects.
Figure 3 presents an example of an MR environment with the supporting data services, and shows how data flows within the environment and through the data services. On the left half of the diagram is the ‘view’ of the mixed reality device which, in this example, is a see-through MR head-mounted device (HMD) or, simply, an MR eye-wear. Within the view are the physical objects which are ‘seen’ by the MR device, as indicated by the green arrows. The synthetic augmentations shown in the diagram are represented by the red-dotted arrows. On the right half of the diagram are the various supporting data services that process the data. The bi-directional solid arrows represent the access of these applications to captured data and the delivery of outputs to be augmented. Representative points of the five aspects of protection within the data flow are also labelled in Figure 3 and we use these labels to further explain the categorization.
Input Protection focuses on the challenges in ensuring security and privacy of data that is gathered and inputted to the MR platform. These data may contain sensitive information. For example, in Figure 3, the MR eye-wear can capture the sensitive information on the user’s desktop screen (labelled 1). This is user-sensitive information that needs to be protected. Similarly, the same device can also capture information that may not be sensitive to the user but is sensitive to other entities such as bystanders. This can include facial information of bystanders, which raises the issue of bystander privacy. Aside from readily sensitive objects, the device may capture other objects in the environment that may be benign but can be used by other entities to infer knowledge about the user which the user did not originally intend to share. The different input protection approaches are reviewed in Section 3.1.
After sensing, data is usually collected by the system. Data from multiple sensors are aggregated and then stored in a database or other forms of data storage. Applications then need to access these data in order to deliver output in the form of user-consumable information or services. However, almost all widely used computing platforms allow applications to collect and store data individually (as shown in the access of supporting data services labelled 2 in Figure 3), and users have no control over their data once it has been collected and stored by these applications. Many security and privacy risks have been raised concerning the access and use of user data by third-party agents, particularly on user data gathered from wearables [Felt et al. (2012)], mobile devices [Lee et al. (2015)], and on-line activity [Ren et al. (2016)]. MR technology faces even greater risks as richer information can be gathered using its highly sensitive sensors. Section 3.2 will present a review of the different data protection approaches for MR and the other non-MR or generic approaches.
After processing the data, applications send outputs to the mixed reality device to be displayed or rendered. However, in MR, applications may inadvertently have access to the outputs of other applications. If an untrusted application has access to other outputs, then it can potentially modify those outputs, making them unreliable. For example, malicious applications can modify the sugar level shown in the smart information (labelled 3) hovering over the cup in Figure 3. It is also possible that one application’s output is another’s input, which necessitates access to an output object by multiple applications. The integrity and reliability of these outputs have to be ensured. All of these should be considered in output protection, and the different approaches are discussed in Section 3.3.
In user interaction protection, we expand the coverage of protection to ensure protected sharing and collaborations (labelled 4 in Figure 3) in MR. In contrast to current widely adopted technologies like the desktop computer and the smart phone, MR can enable entirely new and different ways of interacting with the world, with machines, and with other users. One of the key expectations is that users can share MR experiences with assurance of the security and privacy of their information. Details of the approaches in protecting user interactions are discussed in Section 3.4.
The last aspect, device protection, focuses on the actual physical MR device, and, by extension, implicitly protects data that goes through all the other four aspects by ensuring device-level protection. In Section 3.5, the different novel approaches in device access and physical display protection are discussed.
Finally, in Figure 4, the five categories are shown with their further subcategories. A simplified representation of the process described in Figure 3 is shown as a pipeline in Figure 5, which has three essential blocks: detection, transformation, and rendering. The first three aspects address the risks within this processing pipeline – protecting how applications, during the transformation stage, access real-world input data gathered during detection, which may be sensitive, and generate reliable outputs during rendering. Detection focuses on gathering information such as user view orientation and direction, location, and surrounding objects. Thus, the detection process primarily depends on the sensing capabilities of the device. After detection, the information gathered is transformed or processed to deliver services. Depending on the service or application, different transformations are used. Finally, the results of the transformation are delivered to the user by rendering them through the device’s output interfaces. However, actual approaches may be applied beyond the simple boundaries we have defined here. Thus, some input and output protection approaches are applied in the transformation stage, while some data access protection approaches, e.g. data aggregation, are applied in the detection and rendering stages. Furthermore, the interaction protection and device protection approaches cannot be laid out along the pipeline unlike the other three, as the intended targets of these two categories transcend this pipeline.
The presented categorization does not exclusively delineate the five aspects, and it is important to note that most of the approaches that will be discussed can fall under more than one category or subcategory. Notwithstanding, this survey paper complements the earlier surveys by presenting an up-to-date collection of security and privacy research and development for the past two decades on MR and related technologies, and by categorizing these various works according to the presented data-centric categorization. The next section proceeds to discuss the various approaches that have been proposed to address each of the five major aspects.
3 Security and Privacy Approaches
We present in Figure 6 a generic block diagram representation of an MR platform and how data, primarily visual and audio data, flow. It is a more detailed version of Figure 3 that emphasizes how applications access the input and output resources of existing devices or platforms for MR. File systems of popular mobile platforms, such as Android and iOS, operate similarly to this diagram, where each application has a dedicated file system or database, as seen on the right side of the MR platform, for operation. Both mobile operating systems have already implemented additional measures for security and privacy such as Android’s permission control and iOS’s sandboxed applications. However, despite those protection mechanisms, once an application has been granted access to those resources, it has indefinite access to the data that can be collected through those resources. In addition, users have no control over how applications use the collected information that resides in each application’s dedicated data store.
In an MR-specific sense, once an MR application has been given access to the visual sensing facility of a device, it effectively has access to a rich set of information. This is further aggravated when visual data is combined with other data such as audio and location data. In the following subsections, we present the various security and privacy work that has been done on MR and related technologies, especially on AR. We have organized these approaches according to the five major categories and, if applicable, to their further subcategories. Some of the presented solutions address several aspects and fall under more than one category. For these cases, we focus on the primary objective or challenge that the approach targets.
3.1 Input Protection Approaches
The most common input protection approaches usually involve the removal of latent and sensitive information from the input data stream. These approaches are generally called input sanitization techniques. These techniques are implemented as an intermediary layer between the sensor interfaces and the applications as shown in Figure 7. In general, this protection layer also acts as an input access control mechanism in addition to performing sanitization.
The sanitization techniques can be categorized according to the policy enforcement – whether intrinsic or extrinsic policies for protection are used. In intrinsic enforcement, the user, the device, or the system itself imposes the protection policies that dictate the input sanitization that is applied. On the other hand, extrinsic input protection arises from the need to protect sensitive objects external to the user that are not considered by the intrinsic policies. In the following subsections, the sanitization techniques are presented as either intrinsic or extrinsic approaches. We include earlier implementations of such techniques which were targeted at generic visual-capturing devices. A separate subsection is dedicated to user input protection, where, aside from input sanitization, techniques for secret user inputs are also discussed.
Intrinsic Input Sanitization
Intrinsic input sanitization policies are usually user-defined. For example, the Darkly system [Jana et al. (2013b)] for perceptual applications uses OpenCV in its intermediary input protection layer to implement multi-level feature sanitization. The basis for the level or degree of sanitization is the set of user-defined policies. The users can impose different degrees of sensitivity permissions which affect the amount of detail or features that can be provided to the applications, i.e. stricter policies mean fewer features are provided. For example, facial information can vary from showing facial feature contours (of eyes, nose, brows, mouth, and so on) to just the head contour depending on the user’s preferences. The user can actively control the level of information that is provided to the applications.
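The idea of multi-level feature sanitization can be illustrated with a minimal Python sketch. It is not Darkly's implementation: the three sensitivity levels, the feature names, and the mapping between them are hypothetical, and the feature values stand in for contours that a vision library would extract.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical user-chosen policy levels; higher means stricter."""
    LOW = 0     # release all known facial contours
    MEDIUM = 1  # release only coarse facial contours
    HIGH = 2    # release only the head outline


# Hypothetical mapping: the strictest level at which a feature is still released.
FEATURE_MAX_LEVEL = {
    "head_contour": Sensitivity.HIGH,
    "eye_contours": Sensitivity.MEDIUM,
    "mouth_contour": Sensitivity.MEDIUM,
    "brow_contours": Sensitivity.LOW,
    "nose_contour": Sensitivity.LOW,
}


def sanitize_features(detected, level):
    """Return only the features that the policy level allows an app to see.

    Features that are not listed in FEATURE_MAX_LEVEL (e.g. raw pixels)
    are never released.
    """
    return {
        name: value
        for name, value in detected.items()
        if name in FEATURE_MAX_LEVEL and FEATURE_MAX_LEVEL[name] >= level
    }
```

With a stricter level the application receives fewer features: at `Sensitivity.HIGH` only `head_contour` survives, while at `Sensitivity.LOW` every listed contour is passed through.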
A context-based intrinsic sanitization framework [Zarepour et al. (2016)] improved on the non-contextual policies of Darkly. It determines if there are sensitive objects in the captured images, like faces or car registration plates, and automatically implements sanitization. Sensitive features are sanitized by blurring them out, while images of sensitive locations (e.g. bathrooms) are deleted entirely. Similarly, PlaceAvoider [Templeman et al. (2014)] also classifies images as sensitive or not, depending on the features extracted from the image, but deletion is not automatic and still depends on the user. Despite the context-based nature of the sanitization, the policy that governs how to interpret the extracted contexts is still user-defined; thus, we consider both sanitization techniques as intrinsic. However, intrinsic policy enforcement can be considered as self-policing, which can potentially have a myopic view of the privacy preferences of other users and objects. Furthermore, intrinsic policies can only protect the inputs that are explicitly identified in the policies. Moreover, these sanitization approaches primarily implement media (i.e. image) protection control. Translating them to video is not straightforward due to the frame rate demand. Nonetheless, there are a few video sanitization approaches in the generic, non-MR context.
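The context-based decision described above (delete images of sensitive locations, blur sensitive features, keep everything else) can be sketched as a small policy function. This is only an illustration under assumptions: the label sets, the `decide` function, and the detection format are hypothetical, and the scene and object labels are assumed to come from an upstream classifier.

```python
# Hypothetical user-defined policy: which scene and object labels are sensitive.
SENSITIVE_LOCATIONS = {"bathroom"}
SENSITIVE_OBJECTS = {"face", "license_plate"}


def decide(scene_label, detections):
    """Return (action, boxes) for one captured image.

    `detections` is a list of dicts with a "label" and a bounding "box".
    The action is "delete", "blur" (with the boxes to blur), or "keep".
    """
    if scene_label in SENSITIVE_LOCATIONS:
        return ("delete", [])
    boxes = [d["box"] for d in detections if d["label"] in SENSITIVE_OBJECTS]
    if boxes:
        return ("blur", boxes)
    return ("keep", [])
```

A PlaceAvoider-style variant would return a flag for the user to review instead of the automatic `"delete"` action.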
Video Sanitization. The previously discussed sanitization techniques were targeted at generic capturing devices; they mostly sanitize images, and perform the sanitization after the image has been stored. For MR platforms that require a real-time video feed, there is a need for on-the-fly sanitization of data to ensure security and privacy. A privacy-sensitive visual monitoring system [Szczuko (2014)] was implemented by removing individuals from a video surveillance feed and rendering 3D animated humanoids in place of the detected and visually-removed individuals. Another privacy-aware live video analytic system called OpenFace-RTFace [Wang et al. (2017)] focused on performing fast video sanitization by combining it with face recognition. The OpenFace-RTFace system lies near the edge of the network, or on cloudlets. Similar approaches to edge or cloud-assisted information sanitization can be performed for MR.
Extrinsic Input Sanitization
In extrinsic input protection, the input policy module can receive input policies, e.g. privacy preferences, from the environment. An early implementation [Truong et al. (2005)] involved outright capture interference to prevent sensitive objects from being captured by unauthorized visual capturing devices. A camera-projector setup is used. The camera detects unauthorized visual capture devices, and the projector beams a directed light source to “blind” the unauthorized device. This technique can be generalized as a form of physical access control, or, specifically, a deterrent to physical or visual access. However, this implementation requires a dedicated setup for every sensitive space or object, and the light beams can be disruptive.
Other approaches involve the use of existing communication channels or infrastructure for endorsing or communicating policies to capture devices, and to ensure that enforcement is less disruptive. The goal was to implement a fine-grained permission layer to “automatically” grant or deny access to continuous sensing or capture of any real-world object. A simple implementation on a privacy-aware see-through system [Hayashi et al. (2010)] allowed other users that are “seen-through” to be blurred out or sanitized and shown as human icons only if the viewer is not their friend. However, this requires that users have access to the shared database and explicitly identify friends. Furthermore, enabling virtually anyone or, in this case, anything to specify policies opens new risks such as forgery and malicious policies.
To address authenticity issues in this so-called world-driven access control, policies can be transmitted as digital certificates [Roesner et al. (2014c)] using a public key infrastructure (PKI). Thus, the PKI provides cryptographic protection to media access and sanitization policy transmission. However, the use of a shared database requires that the privacy preferences of all possible users or sensitive objects be pushed to this shared database. Furthermore, it excludes or, unintentionally, leaves out users or objects that are not part of the database – or, perhaps, are unaware of it – which, then, defeats the purpose of a world-driven protection.
I-pic [Aditya et al. (2016)] removes the involvement of shared databases. Instead, users endorse privacy choices via a peer-to-peer approach using Bluetooth Low Energy (BLE) devices. However, I-pic is only a capture-or-not system. PrivacyCamera [Li et al. (2016b)] is another peer-to-peer approach but is not limited to BLE. Also, it performs face blurring, instead of just capture-or-not, using endorsed GPS information to determine if sensitive users are within camera view. On the other hand, Cardea [Shu et al. (2016)] allows users to use hand gestures to endorse privacy choices. In Cardea, users can show their palms to request protection, or a peace sign to signal that no protection is needed. However, these three approaches are primarily targeted at bystander privacy protection, i.e. facial information sanitization.
MarkIt [Raval et al. (2014)] can provide protection to any user or object by using privacy markers and gestures (similar to Cardea) to endorse privacy preferences to cameras. It was integrated into Android’s camera subsystem to prevent applications from leaking private information [Raval et al. (2016)] by sanitizing sensitive media. This is a step closer to automatic extrinsic input sanitization, but it requires visual markers for detecting sensitive objects. Furthermore, all these extrinsic approaches have only been targeted at visual capture applications and not at AR- or MR-specific ones.
User Input Protection
Another essential input that needs to be protected is user input. We put a separate emphasis on this as user input entails a command to the machine, while the inputs handled by the previously discussed sanitization techniques do not necessarily invoke commands. Aside from privacy, user inputs also need to be secured against external inference threats such as shoulder-surfing attacks.
Currently, the most widely adopted user input interfaces are the tactile types, specifically, the keyboard, computer mouse, and touch interfaces. However, these current tactile inputs are limited by the dimension
Securing User Inputs. EyeDecrypt [Forte et al. (2014)] uses a visual cryptography technique to protect sensitive input interfaces, such as ATM PIN pads. The publicly viewable input interface is encrypted, and the secret key is kept or known by the user. The user views the encrypted public interface through an AR device, which visually decrypts it using the secret key. As a result, only the user can see the actual input interface through the AR display. This approach further provides physical access protection by utilizing out-of-band channels to securely transmit the cryptographic keys between two parties (e.g. the client, through the ATM interface, and the bank).
A similar AR-based approach was used to secretly scramble keyboard keys to hide key inputs from external inference [Maiti et al. (2017)]. Only the user, through the AR device, can see the actual key arrangement of the keyboard. Both EyeDecrypt and this keyboard scrambler use visual cryptography as a physical access protection over these shared or public resources. However, these techniques suffer greatly from visual alignment issues, i.e. aligning the physical display with the objects rendered through the augmented display.
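The keyboard scrambling idea can be sketched as a secret random mapping from physical key positions to symbols that is known only to the input system and the user's AR device. The sketch below is a simplified illustration, not the cited scheme: the function names are hypothetical, the layout is assumed to be shared over a secure channel, and a real system would typically refresh the layout (e.g. per session or per keystroke).

```python
import secrets


def make_scrambled_layout(keys):
    """Return a random mapping: physical key position -> symbol shown in AR."""
    shuffled = list(keys)
    # Fisher-Yates shuffle driven by a cryptographically strong RNG.
    for i in range(len(shuffled) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return dict(enumerate(shuffled))


def positions_for(text, layout):
    """Physical positions the user presses, guided by the AR overlay."""
    position_of = {symbol: pos for pos, symbol in layout.items()}
    return [position_of[ch] for ch in text]


def decode(positions, layout):
    """The input system recovers the text from the pressed positions."""
    return "".join(layout[pos] for pos in positions)
```

An onlooker only observes which physical positions are pressed; without the layout, those positions do not reveal the symbols, while `decode(positions_for(pin, layout), layout)` returns the original `pin` for the legitimate system.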
Furthermore, these types of inputs are not fully suited for three-dimensional interactions, which are desired in an MR environment. Thus, there is a need for new user input interfaces to allow three-dimensional inputs. Early approaches used gloves [Dorfmuller-Ulhaas and Schmalstieg (2001), Thomas and Piekarski (2002)] that can determine hand movements, but advances in computer vision have led to tether- and glove-free 3D interactions. Gesture inference from smart watch movement has also been explored as a possible input channel, particularly finger-writing inference [Xu et al. (2015)]. This allows future systems to use latent movement information as possible input channels and move away from keypads and keyboards. However, at the same time, being able to detect keyboard strokes using smart watch movement information also poses security and privacy risks [Maiti et al. (2016)].
Protecting Vision-based User Inputs. Vision-based natural user interfaces (NUI), such as the Leap Motion [Zhao and Seah (2016)] and Microsoft Xbox Kinect, have been integrated with MR systems to allow users to interact with virtual objects in three dimensions. These NUI devices detect users’ body gestures, e.g. of arms, hands, or fingers, using video or depth sensing. However, the use of visual capture to detect input means that applications that require gesture inputs can inadvertently capture other sensitive inputs within view. Similar privacy risks arise and, thus, so does the need for some form of sanitization. To address these risks, Prepose [Figueiredo et al. (2016)] provided secure gesture detection and recognition as an intermediary layer. The Prepose core then only sends gesture events to the applications, which effectively removes the necessity for having access to the raw input feed. Thus, it provides least privilege access control to applications, i.e. only the necessary event information is transmitted to the third-party applications. Furthermore, it provides a programming core to allow future detection of new gestures.
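The event-only interface described above can be sketched as a trusted gateway that alone sees raw frames and forwards nothing but gesture event names to subscribed applications. This is a minimal sketch and not Prepose's actual API: `GestureGateway`, `is_hand_raised`, and the dictionary-based frame format are hypothetical.

```python
from typing import Callable, Dict, List

Frame = dict  # hypothetical stand-in for a raw video/depth frame


class GestureGateway:
    """Trusted layer: sees raw frames, releases only gesture event names."""

    def __init__(self, recognizers: Dict[str, Callable[[Frame], bool]]):
        self._recognizers = recognizers
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, gesture: str, callback: Callable[[str], None]) -> None:
        """Register an application callback for one known gesture."""
        if gesture not in self._recognizers:
            raise KeyError(f"no recognizer registered for {gesture!r}")
        self._subscribers.setdefault(gesture, []).append(callback)

    def process_frame(self, frame: Frame) -> None:
        """Run every recognizer on the frame and notify subscribers."""
        for gesture, matches in self._recognizers.items():
            if matches(frame):
                for callback in self._subscribers.get(gesture, []):
                    callback(gesture)  # the frame itself is never passed on


def is_hand_raised(frame: Frame) -> bool:
    # Toy rule on a hypothetical skeleton encoding: hand above head.
    return frame["hand_y"] > frame["head_y"]
```

An application that subscribes to `"hand_raised"` learns only that the gesture occurred; anything else in the frame, such as the raw RGB data, stays inside the gateway.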
A preceding work to Prepose implemented a similar idea of inserting a hierarchical recognizer [Jana et al. (2013a)] as an intermediary input protection layer. They inserted Recognizers into the Xbox Kinect to address input sanitization as well as to provide input access control. The policy is user-defined; thus, it is an intrinsic approach. Similar to Darkly, the goal is to implement a least privilege approach to application access to inputs – applications are only given the least amount of information necessary to run. For example, a dance game on Xbox, e.g. Dance Central or Just Dance, only needs body movement information and does not need facial information; thus, the dance games are only provided with the moving skeletal information and not the raw video feed of the user while playing. To handle multiple levels of input policies, the recognizer implements a hierarchy of privileges in a tree structure, with the root having the highest privilege, i.e. access to RGB and depth information, and the leaves having lesser privileges, i.e. access to skeletal information.
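The privilege tree can be illustrated with a short sketch. The tree below and the rule that a grant on a node also covers everything beneath it are assumptions made for illustration; the node names other than the RGB/depth root and the skeleton are hypothetical.

```python
# Hypothetical recognizer tree: each node maps to its parent (None for the root).
PARENT = {
    "rgb_depth": None,            # root: raw RGB and depth frames
    "skeleton": "rgb_depth",      # derived from the raw frames
    "face": "rgb_depth",
    "hand_position": "skeleton",  # derived from the skeleton
}


def is_allowed(granted: str, requested: str) -> bool:
    """True if `requested` is `granted` itself or lies below it in the tree.

    Assumption: a permission on a node covers the less sensitive data
    derived from it, but never the more sensitive data above it.
    """
    node = requested
    while node is not None:
        if node == granted:
            return True
        node = PARENT[node]
    return False
```

Under this rule, a dance game that is granted `"skeleton"` can read `"hand_position"` but is denied both `"face"` and the raw `"rgb_depth"` feed.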
However, a least privilege approach requires that the recognizers know what types of inputs or objects the different applications will require. Prepose addresses this for future gestures but not for future objects. For example, an MR painting application may require the detection of different types of brushes, but the current recognizer does not know how to ‘see’ or detect the brushes. This issue was raised by the researchers behind MarkIt, who try to address it by using markers to “tell” devices what to see and what not to see; however, its implementation was limited to camera applications and did not cover AR- or MR-specific ones.
3.2 Data Protection
We can divide the different data protection techniques based on the data flow. First, after sensing, data is gathered and aggregated, thus, protected data aggregation is necessary. Then, to deliver output, applications will need to process the data, thus, privacy-preserving data processing is required. Ultimately, the data storage has to be protected as well. Thus, three major data protection levels arise: aggregation, processing, and storage.
Generally, the aim of these various approaches or algorithms is to learn something from the data without privacy leakage – without learning anything about a particular user or any sensitive entity. Usually, privacy-preserving algorithms use privacy definitions as reference for protection and are applied as a form of data mining protection. Example privacy definitions are k-anonymity and differential privacy. k-anonymity [Sweeney (2002)] ensures that each record is indistinguishable from at least k-1 other records. It usually involves data perturbation or manipulation techniques to ensure privacy, and it scales poorly with data dimensionality, which can be expected to be high for MR platforms and devices with a large number of sensors or input data sources. Differentially private algorithms [Dwork et al. (2014)], on the other hand, insert randomness into the data to provide plausible deniability. The privacy guarantees of differentially private algorithms are well studied [McSherry and Talwar (2007)]. Randomized response [Erlingsson et al. (2014)] is an example of a differentially private data collection algorithm.
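To make the idea of plausible deniability concrete, the sketch below uses the classic coin-flip form of randomized response for a single yes/no attribute. It is an illustration only and not the mechanism of the cited system: the function names and the fair-coin probabilities are assumptions chosen for simplicity. With these probabilities a truthful "yes" is reported as "yes" with probability 0.75 and a truthful "no" with probability 0.25, so the likelihood ratio is 3 and each report satisfies differential privacy with ε = ln 3.

```python
import random
from typing import Iterable


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report the truth with probability 1/2, otherwise report a fair coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports: Iterable[bool]) -> float:
    """Invert the noise: P(report yes) = 0.5 * p + 0.25, so p = 2 * (observed - 0.25)."""
    reports = list(reports)
    observed = sum(reports) / len(reports)
    return 2.0 * (observed - 0.25)


# Illustrative population: 30% of users hold the sensitive attribute.
rng = random.Random(0)
truths = [i < 3000 for i in range(10000)]
reports = [randomized_response(t, rng) for t in truths]
estimate = estimate_true_rate(reports)
```

The collector never sees a trustworthy individual answer, yet the aggregate estimate approaches the true rate as the number of reports grows.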
Applications have to be modified in order to implement these privacy-aware schemes. However, there are proposals that eliminate the need for code modification, such as GUPT [Mohan et al. (2012)], which focuses on the sampling and aggregation process to ensure distribution of the differential privacy budget. There are also other privacy definitions or metrics used in the literature. In the following subsections, the different approaches on data aggregation, processing, and storage are covered.
Protected Data Collection and Aggregation
Protected data collection is also a form of input protection, but we shift the focus to data after sensing and to how systems (and eventually applications) handle sensor data as a whole. The intermediate layer labelled 1 in Figure 8 shows how these data collection and aggregation approaches are usually implemented. For example, SemaDroid [Xu and Zhu (2015)] is a privacy-aware sensor management framework that extends the current sensor management framework of Android and allows users to specify and control fine-grained permissions for applications accessing sensors. Just like the input protection approaches, SemaDroid is implemented as an intermediary protection layer that provides application access control to sensors and sensor data to prevent privacy leakage. What differentiates it from a pure input protection technique is the auditing and reporting of potential leakage, and its use for privacy bargaining. This allows users to ‘trade’ their data or privacy in exchange for services from the applications. There is a significant body of work on privacy bargaining and the larger area of privacy economics, but we do not elaborate on it further and point readers to Acquisti’s work on privacy economics [Acquisti et al. (2016)].
Evidently, the SemaDroid framework was primarily designed for mobile devices, i.e. Android-based devices, and their data. MR data, on the other hand, is expected to be visual-heavy, and the information collected is not confined to users but also includes other external sensitive information that can be visually captured. MR-targeted protected sensor management has yet to be designed. The previously discussed input sanitization approaches are also promising protected data gathering techniques and can complement MR-specific sensor management.
Complementing protected data gathering approaches is privacy-preserving data aggregation (PDA), which has been adopted in information collection systems [He et al. (2007), He et al. (2011)] with multiple data collection or sensor points, such as wireless sensor networks or body area networks. The premise of PDA is to obtain an aggregate statistic or piece of information without learning individual information, whether of individual sensors or individual users. A similar PDA approach specific to MR data has yet to be designed and evaluated.
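The premise of PDA can be illustrated with additive slicing: each node splits its reading into random slices, hands one slice to each peer, and reports only the sum of the slices it holds. The sketch below is an assumption-laden toy, not the protocol of the cited papers: it simulates all nodes in one process, omits the encrypted links a real deployment would need, and the names `split_into_shares`, `private_sum`, and `MODULUS` are hypothetical. Readings are assumed to be non-negative integers whose total is smaller than `MODULUS`.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # arithmetic is done modulo this value (illustrative choice)


def split_into_shares(value: int, n_shares: int) -> List[int]:
    """Split `value` into `n_shares` random slices that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def private_sum(readings: List[int]) -> int:
    """Aggregate readings so that no single partial sum reveals one node's reading."""
    n = len(readings)
    # shares[i][j] is the slice that node i hands to node j.
    shares = [split_into_shares(r, n) for r in readings]
    # Each node j reports only the sum of the slices it received.
    partial_sums = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The aggregator adds the partial sums; the random slices cancel out.
    return sum(partial_sums) % MODULUS


total = private_sum([3, 10, 25, 7])
```

The aggregator sees only uniformly random-looking partial sums, yet their total equals the sum of the original readings.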
Protected Data Processing
A similar premise holds for data processing: applications process user data to deliver services, but with data security and user privacy in mind. These secure and private processing algorithms are implemented as secure computation algorithms, as seen in the block labelled 2 in Figure 8. Secure multi-party computation (SMC) has been proposed as a method for processing, i.e. computing, outputs from two or more data sources without necessarily knowing the actual data each source has. There are various approaches to performing SMC, such as garbled circuits [Yao (1986), Huang et al. (2011)], or cryptographic techniques such as fully homomorphic encryption (FHE) [Gentry (2009)]. Homomorphic encryption techniques allow queries or computations over encrypted data.
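As a small illustration of computing over encrypted data, the sketch below implements a toy version of the Paillier cryptosystem, which is only additively homomorphic and therefore much weaker than the fully homomorphic schemes cited above. It is shown here as a swapped-in, simpler technique. The primes are tiny and insecure, and the function names are hypothetical; the point is only that multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets
from typing import Tuple


def keygen(p: int = 1009, q: int = 1013) -> Tuple[int, Tuple[int, int]]:
    """Return (public modulus n, private key (lam, mu)). Toy primes, NOT secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, message: int) -> int:
    """Encrypt 0 <= message < n as (n + 1)^m * r^n mod n^2 with random r."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, private_key: Tuple[int, int], ciphertext: int) -> int:
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n) * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
encrypted_total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
```

A party holding only `n` can combine the two ciphertexts, and only the key holder can decrypt `encrypted_total` to recover the sum.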
Primarily, these protected processing approaches are utilizing cryptographic (i.e. secure) protection and are usually implemented in a distributed manner (i.e. multi-party) to provide further protection. All these techniques complement each other and can be used simultaneously on a singular system. Inevitably, we understand the equal importance of all these technologies and how they can be used on MR, but these data protection techniques are technology agnostic. Therefore, any or all of these techniques can be applied to MR and it will only be a matter of whether the technique is appropriate for the amount of data and level of sensitivity of data that is tackled in MR environments.
An example in Visual Data Processing These secure data processing techniques have been applied to privacy-preserving video data processing, for example in a virtual cloth try-on [Sekhavat (2017)] that uses secret sharing and secure two-party computation techniques. The anthropometric information
Data Storage Solutions to Protection
After collection and aggregation, applications store user data in separate databases over which users have minimal or no control. Privacy concerns about how these applications use user data beyond the expected utility to the user have been raised [Ren et al. (2016), Lee et al. (2015), Felt et al. (2012)]. When trustworthiness is not ensured, protected data storage solutions, such as personal data stores (PDS) with managed application access permission control, are necessary. PDSs allow users to control their data and which applications have access to it. In addition, further boundary protection and monitoring can be enforced on the flow of data in and out of the PDS.
Figure 8 shows a generic block diagram of how a PDS protects user data by running it in a protected sandbox machine that may monitor the data that is provided to the applications. Usually, applet versions of the applications (represented by the smaller App blocks within the PDS in the diagram) run within the sandbox. Various PDS implementations have been proposed, such as the personal data vault (PDV), OpenPDS, and the Databox. The PDV was one of the earlier implementations of a PDS, but it supports only a small number of data sources, i.e. location information, and does not have an application programming interface (API) for other data sources or types. Nonetheless, its authors demonstrated how data storage and data sharing to applications can be decoupled [Mun et al. (2010)].
OpenPDS addresses the PDV’s lack of an API. OpenPDS allows any application to have access to user data through SafeAnswers [de Montjoye et al. (2014)]. SafeAnswers (SA) are pre-submitted and pre-approved query application modules (just like an applet in Figure 8) which allow applications to retrieve results from the PDS. However, requiring applications to have a set of pre-approved queries reduces the flexibility of openPDS.
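The pre-approved query idea can be sketched as a store that never hands out raw records and only runs query modules the user has approved. This is a hypothetical illustration, not the openPDS or SafeAnswers API: the class `PersonalDataStore`, its methods, and the example query are all assumed names.

```python
from collections import Counter
from typing import Callable, Dict, List


class PersonalDataStore:
    """Hypothetical PDS: raw records stay inside; only approved answers leave."""

    def __init__(self, records: List[dict]) -> None:
        self._records = list(records)
        self._approved: Dict[str, Callable[[List[dict]], object]] = {}

    def approve(self, name: str, query: Callable[[List[dict]], object]) -> None:
        """The user (or a reviewer acting for the user) approves a query module."""
        self._approved[name] = query

    def answer(self, name: str) -> object:
        """Run an approved query inside the store and return only its result."""
        if name not in self._approved:
            raise PermissionError(f"query {name!r} has not been approved")
        return self._approved[name](self._records)


def most_visited_place(records: List[dict]) -> str:
    """Example query module: returns one coarse answer, not the location trace."""
    return Counter(r["place"] for r in records).most_common(1)[0][0]


store = PersonalDataStore([
    {"hour": 9, "place": "office"},
    {"hour": 13, "place": "cafe"},
    {"hour": 15, "place": "office"},
])
store.approve("most_visited_place", most_visited_place)
```

An application calling `store.answer("most_visited_place")` receives a single aggregate answer, while any query that was not approved beforehand is rejected, which also illustrates the loss of flexibility noted above.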
Databox also involves the use of a sandbox machine where users store their data, and in which applications run a trusted, containerized piece of query application. Running containers in the sandbox allows users to control which pieces of information can exit the sandbox. However, as with SafeAnswers, requiring application developers to build a containerized application for their queries may be a hindrance to adoption. Despite that, Databox pushes for a privacy ecosystem which empowers users to trade their data for services, similar to privacy bargains in a privacy economy [Crabtree et al. (2016)]. This privacy ecosystem can possibly help address the adoption issues of developing containerized privacy-aware query applications, because users can now demand service in exchange for their data.
3.3 Output Protection
The prime value of MR is to deliver immersive experiences. To achieve that, applications ship services and experiences in the form of rendered outputs. In general, there are three possible types of outputs in MR systems: real-world anchored outputs, non-anchored outputs, and outputs of external displays. The first two types are both augmented outputs. The last type refers to outputs of other external displays which can be utilized by MR systems, and vice versa. Protecting these outputs is of paramount importance aside from ensuring input and data protection. As a result there are three enduring points or aspects of protection when it comes to the output: protecting external displays, output control, and protected rendering.
Secure Output Displays
In general, MR technology needs the cooperation of different input sensing, data processing, and output display resources to provide these immersive experiences. Focusing on the display part, there are instances when sensitive outputs need to be protected from external inference threats or visual channel exploits such as shoulder-surfing attacks. This premise for protection also intersects well with that of secure user inputs (as discussed in Section 3.1.3) – to provide secrecy and privacy in certain sensitive contexts which require input and output secrecy, such as ATM bank transactions, PIN input, and so on. MR can be leveraged to provide this kind of protection, and it has been demonstrated as a viable approach. For example, EyeGuide [Eaddy et al. (2004)] used a near-eye HMD to provide a navigation service that delivers secret and private navigation information augmented on a public map display. Because the EyeGuide display is practically secret, shoulder surfing is prevented.
Content hiding methods Other approaches involve the actual hiding of content. For example, VRCodes [Woo et al. (2012)] takes advantage of the rolling shutter effect to hide codes that are invisible to human eyes but can be detected by cameras at a specific frame rate. A similar approach has been used to hide AR tags in video [Lin et al. (2017)]. This type of technique can hide content from human attackers but is still vulnerable to machine inference or computer vision-aided capture.
Visual cryptography Secret display approaches have also been used in visual cryptographic techniques such as visual secret sharing (VSS) schemes. VSS allows the ‘mechanical’ decryption of secrets by overlaying the visual cipher with the visual key. However, classical VSS was targeted at printed content [Chang et al. (2010)] and requires strict alignment, which is difficult even for AR and MR displays, particularly handhelds and HMDs. The VSS technique can be relaxed by using code-based secret sharing, e.g. barcodes, QR codes, 2D barcodes, and so on. The ciphers are publicly viewable while the key is kept secret. An AR device can then be used to read the cipher and augment the decrypted content over the cipher. This type of visual cryptography has been applied to both print [Simkin et al. (2014)] and electronic displays [Lantz et al. (2015), Andrabi et al. (2015)].
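The ‘mechanical’ decryption by overlay can be illustrated with a two-share scheme that expands each secret pixel into two subpixels. This is a sketch under assumptions and is not taken from the cited systems: the function names are hypothetical, pixels are modelled as bits (1 = black, 0 = transparent), and stacking two transparencies is modelled as a bitwise OR. A white secret pixel gives both shares the same random pattern, while a black secret pixel gives the second share the complementary pattern, so only black pixels become fully black when stacked.

```python
import random
from typing import List, Sequence, Tuple

Pattern = Tuple[int, int]
PATTERNS: Tuple[Pattern, Pattern] = ((0, 1), (1, 0))  # 1 = black subpixel


def make_shares(secret_bits: Sequence[int], rng: random.Random):
    """Split a row of secret pixels (1 = black) into a cipher share and a key share."""
    share1: List[Pattern] = []
    share2: List[Pattern] = []
    for bit in secret_bits:
        pattern = rng.choice(PATTERNS)
        complement = (1 - pattern[0], 1 - pattern[1])
        share1.append(pattern)
        share2.append(complement if bit else pattern)
    return share1, share2


def overlay(share1: Sequence[Pattern], share2: Sequence[Pattern]) -> List[Pattern]:
    """Stacking transparencies: a subpixel is black if it is black in either share."""
    return [(a[0] | b[0], a[1] | b[1]) for a, b in zip(share1, share2)]


def decode(stacked: Sequence[Pattern]) -> List[int]:
    """A fully black pair reads as a black secret pixel, a half-black pair as white."""
    return [1 if px[0] + px[1] == 2 else 0 for px in stacked]


secret_row = [1, 0, 1, 1, 0]
cipher_share, key_share = make_shares(secret_row, random.Random())
recovered = decode(overlay(cipher_share, key_share))
```

Each share on its own is a sequence of uniformly random patterns, which is why the cipher can be shown publicly while the key share stays on the private display. The overlay only works when the subpixels line up exactly, which is the alignment requirement discussed in the text.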
Electronic displays are, however, prone to attacks from malicious applications which have access to the display. One of these possible attacks is cipher rearrangement when multiple ciphers are shown. To prevent this on untrusted electronic displays, a visual ordinal cue [Fang and Chang (2010)] can be combined with the ciphers to give users an immediate signal if the ciphers have been rearranged. EyeDecrypt [Forte et al. (2014)] also provides a defence for the case where the viewing device, i.e. the AR HMD, is untrusted, by performing the visual decryption in a secure server rather than on the AR device itself. In general, these visual cryptography and content-hiding methods provide physical access control and protect information in shared or public resources.
These techniques can also be used to provide protection for sensitive content on displays during input sensing. Instead of providing privacy protection through post-capture sanitization, the captured ciphers will remain secure as long as the secret shares or keys are kept secure. Thus, even if the ciphers are captured during input sensing, the content stays secure.
Despite these improvements in visual cryptography using AR or MR displays, the usability of this technique is still confined to specific sensitive use cases due to the alignment requirements. Also, this type of protection is only applicable to secrets that are pre-determined, specifically, information or activities that are known to be sensitive, such as password input or ATM PIN input. These techniques are helpful in providing security and privacy during such activities in shared or public spaces due to the secrecy provided by the near-eye displays, which can perform the decryption and visual augmentation. Evidently, they only protect the output or displayed content of external displays but not the actual content which is displayed through the AR or MR device. In the next subsection, we focus on how access control and policy enforcement can be used for output reliability.
Output control policies for Safe and Reliable outputs
Output control policies are the guiding framework for how MR devices handle outputs from different applications. This includes the management of display priority, which could be in terms of transparency, arrangement, occlusion, and other possible spatial attributes. Due to the loose output access control in existing MR devices and platforms, an output access control framework [Lebeck et al. (2016)] with object-level granularity has been proposed for output handling to make enforcement easier. The mechanism can be implemented as an intermediary layer, as in Figure 9, and follows a set of output policies. In a follow-up work, the authors presented a design framework [Lebeck et al. (2017)] for output policy specification and enforcement which combined output policies from Microsoft’s HoloLens developer guidelines and the U.S. Department of Transportation’s National Highway Traffic Safety Administration (NHTSA) (for user safety in automobile-installed AR). Here are two example descriptions of their policies: “Don’t obscure pedestrians or road signs” is inspired by the NHTSA; “Don’t allow AR objects to occlude other AR objects” is inspired by the HoloLens guidelines. They designed a prototype platform called Arya that implements application output control based on the specified output policies, and evaluated Arya on various simulated scenarios. As of yet, Arya is the only AR or MR output access control approach in the literature.
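An object-level output policy of the kind quoted above can be sketched as a condition on each AR object plus an enforcement action. The code below is a simplified illustration and does not reproduce Arya's policy language or API: the types `Box` and `ARObject`, the function `enforce_no_occlusion`, the 2D screen-space boxes, and the choice of lowering opacity as the enforcement action are all assumptions.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned screen-space rectangle (hypothetical simplification of 3D)."""
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


@dataclass(frozen=True)
class ARObject:
    name: str
    box: Box
    alpha: float = 1.0  # 1.0 = opaque, 0.0 = invisible


def enforce_no_occlusion(ar_objects: List[ARObject],
                         protected_regions: List[Box],
                         fallback_alpha: float = 0.2) -> List[ARObject]:
    """Policy: an AR object must not obscure a protected real-world region.

    Condition: the object's box overlaps a protected region.
    Action: make the object mostly transparent instead of trusting the app.
    """
    result = []
    for obj in ar_objects:
        if any(obj.box.overlaps(region) for region in protected_regions):
            obj = replace(obj, alpha=min(obj.alpha, fallback_alpha))
        result.append(obj)
    return result


pedestrian = Box(x=100, y=50, w=40, h=120)  # e.g. from a pedestrian detector
ad = ARObject("advert", Box(x=110, y=60, w=200, h=80))
hud = ARObject("speed_hud", Box(x=0, y=300, w=80, h=30))
rendered = enforce_no_occlusion([ad, hud], [pedestrian])
```

Because the check runs in the intermediary layer for every object, an application cannot bypass the policy; it can only observe that its object was made transparent.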
Secure and Private Rendering
Other MR environments incorporate any surface or medium as a possible output display medium. Protected output rendering protects the medium and, by extension, whatever is on the medium. For example, when a wall is used as a display surface in an MR environment, the applications that use it as a display do not need to know what is on the wall; they only have to know that there is a surface that can be used as a display. Least privilege has been used in this context [Vilk et al. (2014)]. For example, in a room-scale MR environment, only the skeletal information of the room, and the location and orientation of the detected surfaces (or display devices), are made known to the applications that wish to display content on these display surfaces [Vilk et al. (2015)]. It is also interesting to note that this specific case intersects very well with input protection, because what is protected here is the possibly sensitive information that can be captured while trying to determine the possible surfaces for displaying. Also, such room-scale MR environments are usually used for collaborative purposes. Protecting user collaborations and interactions is the focus of the next subsection.
3.4 Protecting User Interactions
This vision of sharing experiences has already been realized in various forms such as print media, television, and on-line. There are now a number of ways in which users can share their experiences on-line, particularly in social networks. Any user can share a photo or a video, either posted or live, or even video chat with one or more users simultaneously. When it comes to collaboration, technologies such as audio-visual teleconferencing and computer-supported collaborative work (CSCW) have been around to enable live sharing of information among multiple users. In addition to this, MR offers the sharing of a much more immersive experience. Through MR, a user can virtually be transported to a concert venue without leaving their house. An aircraft engineering team can collaboratively design and discuss without being physically together, while they simultaneously interact with a virtual prototype that is floating within their field of view. These are called shared space technologies, as more than one user interacts in the same shared space, as shown in Figure 10.
The three previous aspects are primarily protecting user data from potentially malicious or untrusted third-party applications. However, with shared space technologies, users are deliberately sharing information to other users. Thus, a new challenge arises: how can user security and privacy be ensured despite users deliberately sharing and collaborating? Moreover, with MR allowing sharing and interaction regardless of a physical limit, there may be unknown security and privacy risks.
Concerns on the boundaries in MR – “transparent boundaries between physical and synthetic spaces” – and on the directionality of these boundaries have been raised early on [Benford et al. (1998)]. The directionality can influence the balance of power, mutuality and privacy between users in shared spaces. For example, the boundary (labelled 1) in Figure 10 allows User 2 to receive full information (solid arrow labelled 2) from User 1 while User 1 receives partial information (broken arrow labelled 3) from User 2. The boundary enables an ‘imbalance of power’ which can have potential privacy and ethical effects on the users.
Early attempts at ensuring user privacy in a collaborative MR context provided users with a private space, as shown in Figure 10, that displays outputs only for that specific user, alongside a shared space for collaboration. For example, the PIP, or personal interaction panel, was designed to serve as a private interface for actions and tasks that the user does not want to share with the collaborative virtual space [Szalavári and Gervautz (1997)]. It is composed of a tracked “dumb” panel and a pen. It was later used as a gaming console, but we first take a look at some other early work on collaborative interactions and then proceed with examples of how to manage privacy in shared spaces.
Privacy in Collaborative Interactions
Early collaborative platform prototypes [Billinghurst and Kato (1999), Regenbrecht et al. (2002), Grasset and Gascuel (2002), Schmalstieg and Hesina (2002), Hua et al. (2004)] demonstrated fully three-dimensional collaboration in MR; however, none addressed the concerns arising from information sharing and from the boundaries created by shared spaces. EMMIE (Environmental Management for Multi-user Information Environments) [Butz et al. (1999)] is a hybrid multi-interface collaborative environment which uses AR as a 3D interface while allowing users to specify the privacy of certain information or objects through privacy lamps and vampire mirrors [Butz et al. (1998)]. EMMIE’s privacy lamps are virtual lamps that ‘emit’ a light cone; users place objects within the light cone to mark them as private. The vampire mirrors, on the other hand, are used to determine the privacy of objects by showing full reflections of public objects, while private objects are either invisible or transparent. However, the privacy lamps and vampire mirrors only protect virtual or synthetic content and do not provide protection for real-world objects.
Kinected Conference [DeVincenzi et al. (2011)] allows the participants to use gestures to impose a temporary private session during a video conference. Aside from that, the authors implemented synthetic focusing using Microsoft Kinect’s depth sensing capability (other participants are blurred in order to direct focus to the participant who is speaking), as well as augmented graphics hovering above the users’ heads to show their information, such as name, shared documents, and speaking time. The augmented graphics serve as feed-through information to deliver signals that would have been available in a shared physical space but are not readily conveyed between remote physical spaces.
SecSpace [Reilly et al. (2014)] explores a feed-through mechanism to allow a more natural approach to user management of privacy in a collaborative MR environment. In contrast to Kinected Conference’s gesture-based privacy session, and EMMIE’s privacy lamps and vampire mirrors, users in SecSpace are provided feed-through information that would allow them to negotiate their privacy preferences. Figure 10 shows an example situation in which User n enters the shared space (labelled 4) on the same physical space as User 2 which triggers an alarm (labelled 5) or notification for User 1. The notification serves as a feed-through signalling that crosses over the MR boundary. By informing participants of such information, an imbalance of power can be rebalanced through negotiations.
Non-AR feed-through signalling has also been used in a non-shared space context, as in candid interactions [Ens et al. (2015)], which uses wearable bands that light up in different colors depending on the smart-phone activity of the user, or other wearable icons that change shape depending on which application the icon is associated with. However, the pervasive nature of these feed-through mechanisms can still pose security and privacy risks; thus, these mechanisms should be regulated and properly managed. In addition, the infrastructure necessary to enable these pervasive feed-through systems, especially for SecSpace, may be a barrier to adoption. A careful balance ought to be sought between the users’ privacy in a shared space and the utility of the space as a communication medium.
In the next subsection, we explore the different approaches used in MR gaming with multiple users. Competitive gaming demands secrecy and privacy in order to make strategies while performing other tasks in a shared environment. Thus, it is a very apt use case for implementing user protection in a shared space.
Protection in Shared Space Gaming
Using the PIP as a gaming console and an AR HMD for each user, the region that is defined within the PIP “dumb” panel serves as the private region [Szalavári et al. (1998)]. For example, in a game of Mah-jongg, the PIP panel serves as the user’s space for secret tiles while all users can see the public tiles through their HMDs. The PIP pen is used to pick-and-drop tiles between the private space and the public space. On the other hand, TouchSpace implements a larger room-scale MR game. It uses an HMD that can switch between see-through AR and full VR, an entire floor as shared game space with markers, and a wand for user interactions with virtual objects [Cheok et al. (2002)].
The AR game BragFish [Xu et al. (2008)] implements a similar idea on privacy to that of the PIP with the use of a handheld AR device, i.e. Gizmondo. In BragFish, a game table with markers serves as the shared space, while each user has the handheld AR that serves as the private space for each user. The handheld AR device has a camera that is used to “read” the markers associated to a certain game setting, and it frees the user from the bulky HMDs as in PIP and TouchSpace. The Gizmondo handheld device has also been used in another room-scale AR game [Mulloni et al. (2008)]. Similarly, camera phones have been used as a handheld AR device in a table top marker-based setup for collaborative gaming [Henrysson et al. (2005)].
In general, there are two basic spaces necessary in collaborative mixed reality as manifested in gaming: a shared space for public objects, and a private space for user-sensitive tasks such as making strategies. All examples, such as PIP and BragFish, assume that each user can freely share and exchange content or information through the shared platform that their devices are running on. Furthermore, these privacy-sensitive shared space approaches are also, to some extent, inherently distributed, which provides further security and privacy. In the next section, we focus on how this simple act of sharing can be protected in a mixed reality context without a pre-existing shared platform.
All the shared space systems discussed so far rely on a unified architecture to enable interactions and sharing. However, there might be cases in which sharing is necessary but not in a shared space context, and an entire architecture to support sharing, as in SecSpace or EMMIE, is not readily available. Looks Good To Me (LGTM) is an authentication protocol for device-to-device sharing [Gaebel et al. (2016)]. It leverages the camera(s) and wireless capabilities of existing AR HMDs. Specifically, it uses the combination of distance information from wireless localization and facial recognition information to cross-authenticate users. In other words, using an AR HMD that has a camera and wireless connectivity, users can simply look at each other to authenticate and initiate sharing. The visual channel acts as an out-of-band channel.
The previous subsections on protected user interactions have been focused on trying to address the innate challenges and limitations that exist due to the “boundaries” created by these shared spaces. PIP, EMMIE, and BragFish all tried to address the privacy concerns by providing users with a private space in addition to the shared space. In addition to a private space, SecSpace allowed the users to negotiate privacy while collaborating in trying to address issues on balance of power which was earlier raised by Benford et al. Differently, LGTM is an inter-device authentication protocol for initiating sharing without a centralized entity to handle the authentication.
3.5 Device Protection
Given the capabilities of these MR devices, various privacy and security risks and concerns have been raised. Various data protection approaches have also been discussed in the previous subsections. To complement these approaches, the devices themselves have to be protected as well. There are two general aspects that need protection at the device level: device access and display protection. Device access control ensures that authorized users are provided access while unauthorized ones are barred, while display protection ensures that the visual channel cannot be used for malicious inference or similar attacks. Visual cryptography approaches for display protection have already been discussed in Section 3.3.1 (and partly in 3.1.3). In the subsequent section on display protection, optical strategies are discussed.
Currently, passwords remain the most widely used method for authentication [Dickinson (2016)]. To enhance protection, multi-factor authentication (MFA), which uses two or more independent methods for authentication, is now being adopted. It usually involves the traditional password method coupled with, say, a dynamic key that can be sent to the user via SMS, email, or voice call. The two-factor variant has been recommended as a security enhancement, particularly in on-line services like e-mail, cloud storage, e-commerce, banking, and social networks.
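The two-factor flow described above, a password plus a dynamic key delivered over a separate channel, can be sketched with the Python standard library. This is a minimal illustration under assumptions, not a production design: the class `TwoFactorLogin`, the six-digit code, the 120-second lifetime, and the iteration count are illustrative choices, and the actual delivery of the code via SMS, email, or voice call is left out.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional, Tuple


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256 (iteration count is illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class TwoFactorLogin:
    """Hypothetical verifier for one user: password factor plus one-time code factor."""

    def __init__(self, password: str, code_ttl: float = 120.0) -> None:
        self._salt = secrets.token_bytes(16)
        self._password_hash = hash_password(password, self._salt)
        self._code_ttl = code_ttl
        self._pending: Optional[Tuple[str, float]] = None  # (code, expiry time)

    def issue_code(self, now: Optional[float] = None) -> str:
        """Create a one-time code; a real system would send it out of band."""
        now = time.time() if now is None else now
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending = (code, now + self._code_ttl)
        return code

    def verify(self, password: str, code: str, now: Optional[float] = None) -> bool:
        """Both factors must pass; the code is consumed on any attempt."""
        now = time.time() if now is None else now
        password_ok = hmac.compare_digest(
            hash_password(password, self._salt), self._password_hash)
        pending, self._pending = self._pending, None  # single use
        if pending is None:
            return False
        expected, expiry = pending
        code_ok = hmac.compare_digest(code.encode(), expected.encode()) and now <= expiry
        return password_ok and code_ok
```

A login succeeds only when the password matches and the submitted code equals the most recently issued, unexpired code, so a stolen password alone is not sufficient.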
Aside from passwords, PIN-based and pattern-based methods are popular for mobile device authentication. A recent study [George et al. (2017)] evaluated the usability and security of these established PIN- and pattern-based authentication methods in virtual interfaces, and showed comparable results in terms of execution time compared to the original non-virtual interface. We now take a look at other novel authentication methods that leverage the existing and potential capabilities of MR devices.
Gesture- and Active Physiological-based Authentication We look at the various possible gestures that can easily be captured by MR devices, specifically finger, hand, and head gestures. Mid-air finger and hand gestures have been shown to achieve an accuracy between 86% and 91% (the accuracy corresponding to the equal error rate, or EER) using a 3D camera-based motion controller, i.e. Leap Motion, over a test population of 200 users [Aslan et al. (2014)]. A combination of head gestures and blinking gestures triggered by a series of images shown through the AR HMD has also been evaluated and achieves a balanced accuracy of approximately 94% in user identification over a population of 20 users [Rogers et al. (2015)]. On the other hand, Headbanger uses head movements triggered by an auditory cue (i.e. music) and achieved a True Acceptance Rate (TAR) of 95.7% over a test population of 30 users [Li et al. (2016a)]. Other possible gestures or active physiological signals, such as breathing [Chauhan et al. (2017)], are also potential methods.
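Since several of these results are reported through the EER, a short sketch of how it is typically obtained may help. The code below sweeps a decision threshold over match scores and picks the point where the false acceptance rate (FAR) and false rejection rate (FRR) are closest. The score lists are made-up illustrative values, not data from the cited studies, and reading the accuracy at the EER as 1 − EER is an assumption about how such figures are commonly derived.

```python
from typing import Sequence, Tuple


def error_rates(genuine: Sequence[float], impostor: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return far, frr


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (EER estimate, threshold) where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up similarity scores: higher means "more likely the enrolled user".
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.7]
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
accuracy_at_eer = 1.0 - eer
```

For these example scores the two error rates meet at a threshold of 0.6, where one of five genuine attempts is rejected and one of five impostor attempts is accepted.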
Passive Physiological-based Authentication In this subsection, we focus on passive methods using physiological or biometric signals. Physiological-signal-based key agreement (PSKA) [Venkatasubramanian et al. (2010)] used PPG features locked in a fuzzy vault for secure inter-sensor communications in body area networks (BAN). Although existing MR devices do not have PPG sensing capabilities, the PSKA method can be utilized for specific use cases in which MR devices need to communicate with other devices in a BAN, such as other wearables which may be capable of PPG sensing. On the other hand, SkullConduct [Schneegass et al. (2016)] uses the bone conduction capability of the Google Glass for user identification (with a TAR of 97%) and authentication (EER of 6.9%). All these novel methods show promise in how latent gestures, physiological signals, and device capabilities can be leveraged for user identification and authentication.
Multi-modal Biometric Authentication Multi-modal biometric authentication combines two or more modes in a singular method instead of involving other methods or bands of communication in MFA. One multi-modal method combines facial, iris, and periocular information for user authentication and has an EER of 0.68% [Raja et al. (2015)]. GazeTouchPass combines gaze gestures and touch keys as a singular pass-key for smart phones to counter shoulder-surfing attacks on touch-based pass keys [Khamis et al. (2016)]. These types of authentication methods can readily be applied to MR devices that has gaze tracking and other near-eye sensors.
Display Capture Protection
Currently available personal AR or MR see-through HMDs project or display content through lenses. The displayed content on the see-through lenses can leak and be observed externally, and it has been shown [Kohno et al. (2016)] that external capture devices, i.e. cameras, can be used to capture and extract information from the leakage. Optical protection strategies (ibid) have been proposed, such as the use of polarization on the outer layer, the use of narrowband illumination, or a combination of polarization in RGB and narrowband filters to maximize transmission while minimizing leakage. As of yet, this is the only work on MR display leakage protection using optical strategies.
However, there are other capture protection strategies that have been tested on non-MR devices and that allow objects to inherently or actively protect themselves. For example, the TaPS widgets use the optical reflective properties of a scattering foil to show content only at a certain viewing angle, i.e. 90° [Möllers and Borchers (2011)]. Active camouflaging techniques have also been used on mobile phones, allowing the screen to blend with its surroundings just like a chameleon [Pearson et al. (2017)]. Both TaPS widgets and the chameleon-inspired camouflaging are device-level approaches which can also be used for device protection. Essentially, these approaches physically hide sensitive objects or information from visual capture.
3.6 Summary of Security and Privacy Approaches in Mixed Reality
The extent of each control was either fully applied, or partially applied, while the blank indicates not applied, not applicable, or not mentioned. The controls have been applied to either an MR context, or a proto-MR context. * Least privilege is applied to AC.
The previous discussions on the different security and privacy approaches were arranged based on the categorization presented in Section 2. Now, we present an overall comparison of these approaches based on which security and privacy controls they use. We also highlight the focus of the different approaches discussed and, consequently, reveal some shortcomings, gaps, or potential areas of further study or investigation.
Security and Privacy Controls
For Table 1, we use the US National Institute of Standards and Technology’s (NIST) listed security and privacy controls [NIST (2013)] as a basis to formulate 11 generic control families, which we use to generalize and compare the approaches from a high-level perspective. The 11 security and privacy controls are labelled as follows: AC = Access Control Policy; DA = Device Authentication; MP = Media Protection Policy; MA = Media Access; MS = Media Sanitization; PA = Physical Access Control, and Access to Output Devices; IS = Information in Shared Resources and Information Sharing; TC = Transmission Confidentiality and Integrity; CP = Cryptographic Protection, Key Establishment and Management, and/or PKI certificates; DP = Distributed Processing and Storage; OC = Out-of-band Channels. We manually identify the controls used by the approaches and the extent to which each has been applied (indicated by the shade on the boxes). In addition, we have also labelled the approaches to show which of them have actually been applied to an MR context. The label proto-MR indicates that the approach has been applied in a non-MR context but has clear use cases in MR. However, we did not include the generic non-MR targeted approaches, in order to focus only on MR and proto-MR approaches.
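The labelling scheme above can be sketched as a small data structure. The 11 control codes and names are taken from the text. The numeric weights for fully and partially applied controls, the `control_usage` helper, and the two example rows are assumptions made for illustration and do not reproduce any entry of Table 1.

```python
from collections import Counter
from typing import Dict

# The 11 generic control families, as labelled in the text.
CONTROLS: Dict[str, str] = {
    "AC": "Access Control Policy",
    "DA": "Device Authentication",
    "MP": "Media Protection Policy",
    "MA": "Media Access",
    "MS": "Media Sanitization",
    "PA": "Physical Access Control, and Access to Output Devices",
    "IS": "Information in Shared Resources and Information Sharing",
    "TC": "Transmission Confidentiality and Integrity",
    "CP": "Cryptographic Protection, Key Establishment and Management, "
          "and/or PKI certificates",
    "DP": "Distributed Processing and Storage",
    "OC": "Out-of-band Channels",
}

# Assumed numeric encoding of the table's shading (not from the paper).
FULL, PARTIAL = 1.0, 0.5


def control_usage(matrix: Dict[str, Dict[str, float]]) -> Counter:
    """Sum the extent to which each control is applied across approaches.

    `matrix` maps an approach name to {control code: extent}; a missing
    code corresponds to a blank cell.
    """
    usage = Counter({code: 0.0 for code in CONTROLS})
    for approach, applied in matrix.items():
        for code, extent in applied.items():
            if code not in CONTROLS:
                raise KeyError(f"{approach}: unknown control {code!r}")
            usage[code] += extent
    return usage


# Hypothetical rows, for illustration only.
example = {
    "approach_a": {"MS": FULL, "MP": PARTIAL, "MA": PARTIAL},
    "approach_b": {"MS": FULL, "AC": FULL},
}
usage = control_usage(example)
```

Calling `usage.most_common()` on such a tally ranks the controls by how heavily they are applied, which is one way to arrive at a ranking of most and least used controls.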
Controls under each category Most of the input protection approaches fully implement media sanitization (labelled MS) as a form of control, as well as media protection (MP) and access (MA) policies, to some extent. Two of them use access control policies (AC) to limit or provide abstraction on how applications access the input resources, i.e. sensing interfaces, and they further implement least privilege access control. AC controls target resources, while MA controls target the actual media.
Most of the data protection approaches discussed in Section 3.2 are non-MR. Only one work has actually been applied in an MR context, as shown in Table 1. Therefore, there is a big opportunity to look into MR-targeted data protection approaches in all three major aspects: collection, processing, and storage.
The first three output protection approaches partially provide protection of outputs in shared resources (IS) and physical access control (PA) to the output interfaces. Similarly, the second set of three approaches provide IS protection, but to a full extent, as well as TC and CP protection. The two remaining output approaches focus more on the access (AC) that applications have to the output resources, i.e. displays.
IS controls cover two aspects: protecting information sharing and protecting information in shared resources. The output protection approaches that use IS controls focus primarily on the shared output resource. On the other hand, the protection approaches for user interactions provide protection primarily in an actual sharing activity in a shared space; thus, these interaction protection approaches address both aspects of IS controls. Furthermore, most interaction protection approaches also use distributed processing (DP) controls.
Obviously, among all the approaches, only the device protection ones are using device authentication (DA) controls, while the last three device protection approaches were more focused on the physical aspect (PA) of device access.
Generalizations and gaps In general, regardless of category, the most used control is media sanitization (MS), followed by the protected sharing controls (IS). The next most used are the combined media protection policy and access controls (MP and MA). The least used control was out-of-band channels (labelled OC), followed by distributed processing (DP) and resource access controls (AC). The most used controls are more privacy-leaning, while security-leaning controls (such as TC, CP, and DP) were underutilized.
The categorization has, to some extent, localised the types of controls used by the approaches under the same category. Different sets of controls are required by each category, and the table shows the clustering of the different approaches according to which controls they use. It shows the controls that most approaches in one category focus on and, conversely, the controls they do not use (primarily because those controls are not applicable). Furthermore, there were also other control families that were not captured in this summary table, such as data mining protection, auditing (and reporting), and storage protection, but were covered by the generic non-MR approaches discussed in Section 3.2. This highlights the lack of data protection approaches that are targeted for MR, a gap made clearer by Figure 11, which shows the distribution of the discussed approaches. Technically, however, all of these approaches protect data; we emphasize the approaches that are not under the other four categories, i.e. those focused on providing protection during data collection, processing, and storage. There are numerous generic data protection approaches, but most have not been investigated, implemented, or evaluated in MR. Thus, it is opportune to investigate the feasibility of using these approaches (e.g. data mining protection and storage protection) on MR data.
Aside from gaps in protection that can potentially be addressed by future work, there are open challenges in security and privacy, as well as in mixed reality in general, that need to be addressed. The next section highlights these challenges and future directions.
4 Current Devices and Open challenges
In the previous sections, we have elaborated on the different security and privacy approaches (Section 3) in the research literature. Now, we present the remaining challenges brought about by the existing MR devices and the remaining security and privacy challenges as a whole.
4.1 Current Devices and Platforms
Most of the security and privacy work discussed in Section 3 has only been applied on predecessor or prototype platforms of MR. Now, we take a broader look at the security and privacy aspects of existing devices and platforms. First, we present a list of the currently available software tool kits (i.e. SDKs and/or APIs) and devices in Tables 2 and 3, respectively. Then, we proceed with the security and privacy challenges that arise from them.
Mixed Reality Platforms
Most of these tool kits are multi-operating system and vision-based. Specifically, they utilize monocular (single vision) detection for 2D, 3D, and (for some) simultaneous localization and mapping (or SLAM). Of these multi-OS, vision-based tool kits, only the ARToolkit
There are also platform- and device-dependent tool kits listed. The ARKit
|Software Tool Kit||Detection or Tracking||Platform||Development Language||Licensing||Current Version|
|ARToolKit||2D objects||Android, iOS, Linux, macOS, Unity3D||C, C++, C#, Java||Free (open source)||ARToolKit6 (2017)|
|Layar||2D objects||Android, iOS, Blackberry||HTTP (RESTful), JSON||Paid||Layar v7.0|
|Vuforia||3D objects, marker-based visual SLAM||Android, iOS, Windows (selected devices), Unity 3D||C++, C#, Objective-C, Java||Free (with paid versions)||Vuforia 6.5|
|EasyAR||3D objects, visual SLAM||Android, iOS, macOS, Windows, Unity3D||C, C++, Objective-C, Java||Free (with paid version)||EasyAR SDK 2.0|
|kudan||3D object, visual SLAM||Android, iOS, Unity 3D||Objective-C, Java||Free (with paid versions)||kudan v1.0|
|ARKit||3D objects, visual and depth SLAM||iOS||Objective-C||Free||ARKit (2017)|
|OSVR||2D objects, orientation-based SLAM||Vuzix, OSVR, HTC Vive, Oculus, SteamVR||C, C++, C#, JSON (for plug-ins)||Free (open source)||OSVR Hacker Development Kit (HDK) 2.0|
|Device||Sensors||Controls or UI||Processor||OS||Outputs||Connectivity|
|Google Glass 1.0 Explorer Edition||Camera (5MP still/ 720p video), Gyroscope, accelerometer, compass, light and proximity sensor, microphone||Touchpad, remote control via app (iOS, Android)||TI OMAP 4430 1.2GHz dual core (ARMv7)||Android 4.4||640x360 (LCoS, prism), Bone conduction audio transducer||WiFi 802.11 b/g, Bluetooth, microUSB|
|Google Glass 2.0 Enterprise Edition||Camera (5MP still/ 720p video), Gyroscope, accelerometer, compass, light and proximity sensor, microphone||Touchpad, remote control via app (iOS, Android)||Intel Atom (32-bit)||Android 4.4||640x360 (LCoS, prism), Audio speaker||WiFi 802.11 dual-band a/b/g/n/ac, Bluetooth LE, microUSB|
|ReconJet Pro||Camera, Gyroscope, accelerometer, compass, pressure, IR||2 buttons, 4-axis optical touchpad||1GHz dual-core ARM Cortex-A9||ReconOS 4.4 (Android-based)||16:9 WQVGA, Audio speaker||WiFi 802.11 b/g, Bluetooth, GPS, microUSB|
|Vuzix M300||Camera (13MP still/1080p video), proximity (inward and outward), gyroscope, accelerometer, compass, 3-DoF head tracking, dual microphones||4 buttons, 2-axis touchpad, remote control via app (iOS, Android), voice||Dual-core Intel Atom||Android 6.0||16:9 screen equivalent to 5 in screen at 17 in, Audio speaker||WiFi 802.11 dual-band a/b/g/n/ac, Bluetooth 4.1/2.1, microUSB|
|Vuzix M100||Camera (5MP still/1080p video), proximity (inward and outward), gyroscope, accelerometer, compass, 3-DoF head tracking, light, dual microphones||4 buttons, remote control via app (iOS, Android), voice, gestures||TI OMAP 4460 1.2GHz||Android 4.04||16:9 WQVGA (equivalent to 4 in at 14 in), Audio speaker||WiFi 802.11 b/g/n, Bluetooth, microUSB|
|ODG R-7||Camera (1080p @ 60fps, 720p @ 120fps), 2 accelerometer, 2 gyroscope, 2 magnetometer, altitude, humidity, light, 2 microphones||at least 2 buttons, trackpad, Bluetooth keyboard||Qualcomm Snapdragon 805 2.7GHz quad-core||ReticleOS (Android-based)||Dual 720p 16:9 stereoscopic see-thru displays, Stereo Audio ports with earbuds||WiFi 802.11ac, Bluetooth 4.1, GNSS (GPS/GLONASS), microUSB|
|Epson Moverio BT-300||Camera (5MP with motion tracking), magnetic, accelerometer, gyroscope, microphone||trackpad (tethered)||Intel Atom 5, 1.44GHz quad-core||Android 5.1||Binocular 1280x720 transparent display, Earphones||WiFi 802.11 dual-band a/b/g/n/ac, Bluetooth, GPS, microUSB|
|HTC Vive*||Camera, gyroscope, accelerometer, laser position sensor, microphones||Two handheld controllers with trackpad, triggers; two external base stations for tracking||(directly via host computer, GPU recommended)||Windows for host computer (SteamVR 3D engine)||Binocular 1080x1200 (110 FoV), Audio jack||HDMI or Display Port and USB 3.0|
|Meta 2 Glasses||Camera (720p), hand tracking, 3 microphones||hand gestures||(directly via host computer, GPU recommended)||Windows for host computer (Unity SDK 3D engine)||2550x1440 (90 FoV), 4 surround sound speakers||HDMI|
|Microsoft HoloLens||4 Environment understanding cameras, depth camera, 2MP camera, 1 MR capture camera, inertial measurement unit, light, 4 microphones||buttons, Voice and gestures||Intel 32-bit, and a dedicated holographic processing unit (HPU)||Windows 10||16:9 see-through holographic lenses (30 FoV), Audio (surround) speakers, audio jack||WiFi 802.11ac, Bluetooth 4.1 LE, microUSB|
Mixed Reality Devices
Table 3 lists some commercially available HMDs. Most of the devices listed are partially see-through devices such as the Recon Jet
The other type is the optical see-through HMD which uses specialised optics to display objects within the user’s FOV. The special optics are configured in such a way to allow light to pass through and let the users see their physical surroundings unlike the displays on the video see-through types. Epson’s Moverio BT-300
Security and Privacy of Existing Devices and Platforms
However, none of these tool kits, platforms, and devices has considered the security and privacy risks highlighted in this survey. There was an effort to pinpoint risks and identify long-term protection for some platforms, specifically Wikitude, Layar, and Junaio [McPherson et al. (2015)], but no actual or systematic evaluation was done on either the risks or the proposed protection. A systematic security and privacy analysis of MR applications, devices, and platforms has to be performed in order to identify other potential and latent risks. Such an analysis would allow these systems to be evaluated against security and privacy requirements that go beyond the known threats and exploits, particularly those brought up in Sections 2 and 3. Upon consideration of these requirements, we can then design protection mechanisms targeted for these devices and platforms.
Regulation of Mixed Reality applications Moreover, the current MR devices and platforms now provide a peek at what future devices will offer. Given the capabilities of these devices, both in sensing and rendering, numerous security and privacy concerns have been raised and need to be addressed [Conti et al. (2013), Dwight (2016), Tynan (2017)]. In fact, real vulnerabilities have already been revealed, particularly in the numerous released applications. For example, in its early releases, the network messages between the Pokemon Go application and its servers were vulnerable to man-in-the-middle attacks [Colceriu (2016)], which allow hackers to get precise location information of users. In addition, numerous malicious versions of the application [Burgess (2016), Frink (2016)] have been downloaded and used by many users. Therefore, there is a need for regulation, whether applied locally on the devices or platforms, or on a larger scale through a much more stringent reputation- and trust-based market of applications.
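As an illustration of one generic client-side mitigation against man-in-the-middle attacks, the sketch below builds a TLS context that insists on certificate and hostname verification and adds a simple certificate pin check. It is not the fix adopted by any particular application: the function names are hypothetical, and pinning the hash of the whole certificate is a simplification (deployed pinning schemes often pin the public key instead).

```python
import hashlib
import hmac
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Return a client TLS context that rejects unverified servers."""
    context = ssl.create_default_context()
    # These are already the defaults of create_default_context();
    # they are set explicitly so the intent is visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def matches_pin(der_certificate: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 digest of a DER-encoded certificate with a pin
    shipped with the application (hypothetical pinning policy)."""
    actual = hashlib.sha256(der_certificate).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.lower())
```

A client would wrap its socket with this context and, after the handshake, pass the bytes returned by `SSLSocket.getpeercert(binary_form=True)` to `matches_pin` before sending any location data.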
Native Support Furthermore, the entire MR processing pipeline (of detection, transformation, and rendering, as shown in Figure 5) is performed by the applications themselves and in a very monolithic manner. They need direct access to sensing interfaces to perform detection, and to output interfaces for rendering. Once granted permission, AR applications inevitably have indefinite access to these resources and their data. Native, operating system-level support for detection and rendering can provide an access control framework to ensure the security and privacy of user data. OS-level abstractions have been proposed [D’Antoni et al. (2013)] to “expose only the data required to [third-party] applications”. Apple’s ARKit is one such move to provide native support for MR application development on Apple's platforms, but its granularity of access control is still coarse and similar risks are still present.
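The idea of exposing only the data an application requires can be sketched as a broker that sits between the sensors and the applications. This is a minimal illustration under assumed names (`Frame`, `RecognizerBroker`, and the recognizer labels are all hypothetical); it is not the design proposed in the cited work nor any platform's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Frame:
    """Raw sensor data; it stays inside the broker."""
    pixels: bytes
    hand_position: Tuple[float, float, float]
    faces_in_view: int


Recognizer = Callable[[Frame], object]


@dataclass
class RecognizerBroker:
    """Runs recognizers on raw frames and hands each application only
    the recognizer outputs it has been granted (least privilege)."""
    recognizers: Dict[str, Recognizer]
    grants: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def grant(self, app: str, *names: str) -> None:
        unknown = set(names) - set(self.recognizers)
        if unknown:
            raise KeyError(f"unknown recognizers: {sorted(unknown)}")
        self.grants[app] = frozenset(names)

    def deliver(self, app: str, frame: Frame) -> Dict[str, object]:
        # Applications without a grant receive nothing; nobody gets pixels.
        allowed = self.grants.get(app, frozenset())
        return {name: self.recognizers[name](frame) for name in sorted(allowed)}


broker = RecognizerBroker({
    "hand": lambda f: f.hand_position,
    "face_count": lambda f: f.faces_in_view,
})
broker.grant("game", "hand")
frame = Frame(pixels=b"\x00", hand_position=(0.1, 0.2, 0.3), faces_in_view=2)
game_view = broker.deliver("game", frame)
```

Here the hypothetical `game` application sees only the hand position, never the raw pixels or the number of faces in view, and an application with no grant receives an empty result.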
Targeted Data Protection As highlighted in Section 3.6, there is a lack of MR-targeted data protection approaches, except for one specific use case on virtual cloth try-ons. Due to the rich and dynamic data associated with MR, generic data protection approaches may not be readily applicable or translatable to MR systems. Thus, there is a pressing need to design and evaluate data protection approaches (e.g. encryption, privacy-preserving algorithms, and personal data stores) targeted for MR data.
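As one example of a generic privacy-preserving algorithm of the kind mentioned above, the following sketch applies the Laplace mechanism from differential privacy to a count before it is released. The scenario (a count derived from MR sensing) and the function names are assumptions for illustration; whether such a mechanism is adequate for MR data is exactly the kind of question that remains to be evaluated.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count: float, epsilon: float, rng: random.Random,
                  sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    A smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical use: report how many objects a device detected, with noise.
rng = random.Random(0)
noisy = private_count(true_count=10, epsilon=1.0, rng=rng)
```

The sensitivity of 1 assumes that adding or removing one individual's data changes the count by at most one; richer MR data such as images or spatial maps would need a different sensitivity analysis.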
4.2 High-level challenges in Mixed Reality
Recently, mixed reality technology has been given a lift-off by companies, such as Microsoft and Google, which are now focusing on delivering technologies that will provide mixed reality services at a personal level. Even content and social media companies are joining the trend, with expectations that MR will enable new forms of content and that it will definitely be social [Rubin (2017)]. Now, we take a look at challenges in the broader context of usability, interoperability, and the future of MR as a technology in general.
User welfare, society, and policy Possible social consequences [Feiner (1999)] may arise from the user’s appearance while wearing the device, in addition to the implications for bystander privacy, which have already been brought up several times. Existing (universal and legal) concepts of privacy and policy may need to be revisited to catch up with MR becoming mainstream. As these technologies will be delivering information overlaid on the physical space, the correctness, safety, and legality of this information have to be ensured [Roesner et al. (2014a)]. Other user welfare challenges include the reduction of the physical burden of wearing the device; eye discomfort from occlusion problems, incorrect focus, or output jitter; and the cognitive disconnect from the loss of tactile feedback when “interacting” with virtual objects.
Profitability and Interoperability Ultimately, before MR becomes widely adopted, interoperability and profitability are additional aspects that need to be considered. Manufacturers and developers have to ensure the profitability of these devices once delivered to the market [Kroeker (2010)]. Subsequently, ensuring the interoperability of these devices with existing technologies is key to user adoption of AR and MR devices. There are already demonstrations of how web browsing can be interoperable between AR and non-AR platforms [Lee et al. (2009)]. As future and upcoming MR services will be web-based or web-assisted, this can serve as a basis.
Networking Challenges Due to the expected increase in MR applications, services, platforms, and devices, network demand is also expected to increase if these services are to be delivered to users at an acceptable latency. This demand is not confined to the delivery or transmission of MR data, but also extends to processing (i.e. data transformation) and rendering. The data rates provided by existing communication networks may not be ready for such demand [Braud et al. (2017)]. Thus, it is imperative that current and upcoming network design and infrastructure be rethought; this remains a challenge.
Smart Future Overall, as with all other technologies that try to push through the barriers, challenges in processing, data storage, communication, energy management, and battery still remain at the core. Despite all these, a smart world future is still in sight, where smart objects integrate to form smart spaces that, in turn, form smart hyperspaces [Ma et al. (2005)]. This smart world will be composed of ‘things’ with enough intelligence to be self-aware and to manage their own processing, security, and energy. MR has been shown to enable a cross-reality space [Lifton et al. (2009)] (freely crossing from physical to synthetic and vice versa) that can allow these self-aware devices to understand the real world while managing all information virtually. Lastly, we reiterate that existing networks have to be analysed and evaluated to determine whether they can support this future, and should be redesigned if necessary.
This is the first survey to take on the endeavour of collecting, categorizing, and reviewing various security and privacy approaches in MR. In particular, we have raised the various known and latent security and privacy risks associated with the potential functionalities of MR, and gathered a comprehensive collection of security and privacy approaches on MR and related technologies. We have identified five aspects of protection in MR, namely input, data access, output, interactivity, and device integrity, and categorized the approaches according to these five aspects. In addition, we have identified the generic security and privacy controls used by the approaches. We have used these general controls to present a high-level description of the approaches and to compare them. Among the controls, media protection and sanitization were the most widely used, primarily for input protection, while other security controls such as resource access control, data mining protection, cryptography, and so on are underutilized. Therefore, while the utilization and adoption of these MR devices and systems are still not widespread, there is still enough time and opportunity to design, investigate, and implement security and privacy mechanisms (algorithms and methods) that can be integrated with existing and upcoming MR systems.
- thanks: J. de Guzman is a PhD student at University of New South Wales, Sydney, Australia through a scholarship from the Engineering Research and Development for Technology (ERDT) Program of the Department of Science and Technology (DOST) of the Philippine Government. He is conducting his studentship at the Networks Group of Data 61, CSIRO, Australia. E-mail: firstname.lastname@example.org K. Thilakarathna was a research scientist for the Networks Group of the Cyber Physical Systems Research Program of Data 61 and is now with University of Sydney. A. Seneviratne was the director of the Cyber Physical Systems Research Program of Data 61, of which the Networks Group was a part, and he is also a professor at University of New South Wales.
- Transportation refers to the extent to which the users are transported from their physical locality to a virtual or remote space.
- Artificiality is the extent to which the user space has been synthesized from a real physical space.
- Spatiality is the extent to which properties of natural space and movement are supported by the shared space.
- Keyboards and other input pads can be considered as one-dimensional interfaces, while the mouse and touch interfaces provide two-dimensional space interactions with a limited third dimension using scroll, pan, and zoom capabilities.
- Anthropometric information consists of body measurements of an individual that capture size, shape, body composition, and so on.
- ARToolKit6, (Accessed August 4, 2017), https://artoolkit.org/
- wikitude, (Accessed August 4, 2017), https://www.wikitude.com/
- Layar, (Accessed August 4, 2017), https://www.layar.com/
- Zapbox: Mixed Reality for Everyone, (Accessed August 4, 2017), https://www.zappar.com/zapbox/
- ARKit - Apple Developer, (Accessed August 4, 2017), https://developer.apple.com/arkit/
- What is OSVR?, (Accessed August 4, 2017), http://www.osvr.org/what-is-osvr.html
- Meta 2 SDK Features, (Accessed August 4, 2017), https://www.metavision.com/develop/sdkfeatures
- Windows Mixed Reality, (Accessed August 4, 2017), https://developer.microsoft.com/en-us/windows/mixed-reality
- Recon Jet, (Accessed 4 Aug 2017), https://www.reconinstruments.com/products/jet
- Products, (Accessed 4 Aug 2017), https://www.vuzix.com/Products
- Buy VIVE Hardware, (Accessed 4 Aug 2017), https://www.vive.com/au/product/
- Moverio BT-300 Smart Glasses (AR/Developer Edition), (Accessed 4 Aug 2017), https://epson.com/For-Work/Wearables/Smart-Glasses/Moverio-BT-300-Smart-Glasses-%28AR-Developer-Edition%29-/p/V11H756020
- R-7 Smartglasses System, (Accessed 4 Aug 2017), https://shop.osterhoutgroup.com/products/r-7-glasses-system
- The Meta 2: Made for AR App Development, (Accessed 4 Aug 2017), https://buy.metavision.com/
- HoloLens hardware details, (Accessed 4 Aug 2017), https://developer.microsoft.com/en-us/windows/mixed-reality/hololens_hardware_details/
- Alessandro Acquisti. 2011. Privacy in the age of augmented reality. (2011).
- Alessandro Acquisti, Curtis R Taylor, and Liad Wagman. 2016. The economics of privacy. (2016).
- Paarijaat Aditya, Rijurekha Sen, Peter Druschel, Seong Joon Oh, Rodrigo Benenson, Mario Fritz, Bernt Schiele, Bobby Bhattacharjee, and Tong Tong Wu. 2016. I-pic: A platform for privacy-compliant image capture. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 235–248.
- Sarah J Andrabi, Michael K Reiter, and Cynthia Sturton. 2015. Usability of Augmented Reality for Revealing Secret Messages to Users but Not Their Devices. In SOUPS. 89–102.
- Ilhan Aslan, Andreas Uhl, Alexander Meschtscherjakov, and Manfred Tscheligi. 2014. Mid-air authentication gestures: an exploration of authentication based on palm and finger motions. In Proceedings of the 16th International Conference on Multimodal Interaction. ACM, 311–318.
- Ronald Azuma, Yohan Baillot, Reinhold Behringer, Steven Feiner, Simon Julier, and Blair MacIntyre. 2001. Recent advances in augmented reality. IEEE computer graphics and applications 21, 6 (2001), 34–47.
- Ronald T Azuma. 1997. A survey of augmented reality. Presence: Teleoperators and virtual environments 6, 4 (1997), 355–385.
- Steve Benford, Chris Brown, Gail Reynard, and Chris Greenhalgh. 1996. Shared spaces: transportation, artificiality, and spatiality. In Proceedings of the 1996 ACM conference on Computer supported cooperative work. ACM, 77–86.
- Steve Benford, Chris Greenhalgh, Gail Reynard, Chris Brown, and Boriana Koleva. 1998. Understanding and constructing shared spaces with mixed-reality boundaries. ACM Transactions on computer-human interaction (TOCHI) 5, 3 (1998), 185–223.
- Mark Billinghurst and Hirokazu Kato. 1999. Collaborative mixed reality. In Proceedings of the First International Symposium on Mixed Reality. 261–284.
- T. Braud, F. H. Bijarbooneh, D. Chatzopoulos, and P. Hui. 2017. Future Networking Challenges: The Case of Mobile Augmented Reality. (June 2017), 1796-1807 pages. https://doi.org/10.1109/ICDCS.2017.48
- Matt Burgess. 2016. Malicious Pokémon Go app is putting Android phones at risk. (2016). http://www.wired.co.uk/article/pokemon-go-malicious-android-app-problems
- Andreas Butz, Clifford Beshers, and Steven Feiner. 1998. Of vampire mirrors and privacy lamps: Privacy management in multi-user augmented environments. In Proceedings of the 11th annual ACM symposium on User interface software and technology. ACM, 171–172.
- Andreas Butz, Tobias Höllerer, Steven Feiner, Blair MacIntyre, and Clifford Beshers. 1999. Enveloping Users and Computers in a Collaborative 3D Augmented Reality. In Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR ’99). IEEE Computer Society, Washington, DC, USA, 35–.
- Joy Jo-Yi Chang, Ming-Jheng Li, Yi-Chun Wang, and Justie Su-Tzu Juan. 2010. Two-image encryption by random grids. In Communications and Information Technologies (ISCIT), 2010 International Symposium on. IEEE, 458–463.
- Dimitris Chatzopoulos, Carlos Bermejo, Zhanpeng Huang, and Pan Hui. 2017. Mobile Augmented Reality Survey: From Where We Are to Where We Go. IEEE Access (2017).
- Jagmohan Chauhan, Yining Hu, Suranga Seneviratne, Archan Misra, Aruna Seneviratne, and Youngki Lee. 2017. BreathPrint: Breathing Acoustics-based User Authentication. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 278–291.
- Adrian David Cheok, Xubo Yang, Zhou Zhi Ying, Mark Billinghurst, and Hirokazu Kato. 2002. Touch-space: Mixed reality game space based on ubiquitous, tangible, and social computing. Personal and ubiquitous computing 6, 5-6 (2002), 430–442.
- Alina Colceriu. 2016. Catching Pokemon GO in Your Network. (2016). https://www.ixiacom.com/company/blog/catching-pokemon-go-your-network
- Gregory Conti, Edward Sobiesk, Paul Anderson, Steven Billington, Alex Farmer, Cory Kirk, Patrick Shaffer, and Kyle Stammer. 2013. Unintended, malicious and evil applications of augmented reality. (2013). https://www.helpnetsecurity.com/2013/02/12/65279unintended-malicious-and-evil-applications-of-augmented-reality/
- Andy Crabtree, Tom Lodge, James Colley, Chris Greenhalgh, Richard Mortier, and Hamed Haddadi. 2016. Enabling the new economic actor: data protection, the digital economy, and the Databox. Personal and Ubiquitous Computing 20, 6 (2016), 947–957.
- Keith Curtin. 2017. Mixed Reality will be most important tech of 2017. (Jan. 2017). https://thenextweb.com/insider/2017/01/07/mixed-reality-will-be-most-important-tech-of-2017
- Loris D’Antoni, Alan M Dunn, Suman Jana, Tadayoshi Kohno, Benjamin Livshits, David Molnar, Alexander Moshchuk, Eyal Ofek, Franziska Roesner, T Scott Saponas, et al. 2013. Operating System Support for Augmented Reality Applications. In HotOS, Vol. 13. 21–21.
- Yves-Alexandre de Montjoye, Erez Shmueli, Samuel S Wang, and Alex Sandy Pentland. 2014. openpds: Protecting the privacy of metadata through safeanswers. PloS one 9, 7 (2014), e98790.
- Anthony DeVincenzi, Lining Yao, Hiroshi Ishii, and Ramesh Raskar. 2011. Kinected conference: augmenting video imaging with calibrated depth and audio. In Proceedings of the ACM 2011 conference on Computer supported cooperative work. ACM, 621–624.
- Ben Dickinson. 2016. 5 authentication methods putting passwords to shame. (March 2016). https://thenextweb.com/insider/2016/03/31/5-technologies-will-flip-world-authentication-head
- Klaus Dorfmuller-Ulhaas and Dieter Schmalstieg. 2001. Finger tracking for interaction in augmented environments. In Augmented Reality, 2001. Proceedings. IEEE and ACM International Symposium on. IEEE, 55–64.
- Davis Dwight. 2016. Real-world risks in an augmented reality. (2016). https://www.csoonline.com/article/3101644/techology-business/real-world-risks-in-an-augmented-reality.html
- Cynthia Dwork, Aaron Roth, et al. 2014. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science 9, 3–4 (2014), 211–407.
- Marc Eaddy, Gabor Blasko, Jason Babcock, and Steven Feiner. 2004. My own private kiosk: Privacy-preserving public displays. In Wearable Computers, 2004. ISWC 2004. Eighth International Symposium on, Vol. 1. IEEE, 132–135.
- Barrett Ens, Tovi Grossman, Fraser Anderson, Justin Matejka, and George Fitzmaurice. 2015. Candid interaction: Revealing hidden mobile and wearable computing activities. In Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology. ACM, 467–476.
- Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. 2014. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM, 1054–1067.
- Chengfang Fang and Ee-Chien Chang. 2010. Securing interactive sessions using mobile device through visual channel and visual inspection. In Proceedings of the 26th Annual Computer Security Applications Conference. ACM, 69–78.
- Steven K Feiner. 1999. The importance of being mobile: some social consequences of wearable augmented reality systems. In Augmented Reality, 1999. (IWAR’99) Proceedings. 2nd IEEE and ACM International Workshop on. IEEE, 145–148.
- Adrienne Porter Felt, Elizabeth Ha, Serge Egelman, Ariel Haney, Erika Chin, and David Wagner. 2012. Android permissions: User attention, comprehension, and behavior. In Proceedings of the eighth symposium on usable privacy and security. ACM, 3.
- Lucas Silva Figueiredo, Benjamin Livshits, David Molnar, and Margus Veanes. 2016. Prepose: Privacy, Security, and Reliability for Gesture-Based Programming. In Security and Privacy (SP), 2016 IEEE Symposium on. IEEE, 122–137.
- Andrea G Forte, Juan A Garay, Trevor Jim, and Yevgeniy Vahlis. 2014. EyeDecrypt – Private interactions in plain sight. In International Conference on Security and Cryptography for Networks. Springer, 255–276.
- Batya Friedman and Peter H Kahn Jr. 2000. New directions: A Value-Sensitive Design approach to augmented reality. In Proceedings of DARE 2000 on designing augmented reality environments. ACM, 163–164.
- Lyle Frink. 2016. UPDATE: Augmented Malware with Pokémon Go. (2016). https://blog.avira.com/augmented-malware-pokemon-go/
- Ethan Gaebel, Ning Zhang, Wenjing Lou, and Y Thomas Hou. 2016. Looks Good To Me: Authentication for Augmented Reality. In Proceedings of the 6th International Workshop on Trustworthy Embedded Devices. ACM, 57–67.
- Craig Gentry. 2009. A fully homomorphic encryption scheme. Stanford University.
- Ceenu George, Mohamed Khamis, Emanuel von Zezschwitz, Marinus Burger, Henri Schmidt, Florian Alt, and Heinrich Hussmann. 2017. Seamless and Secure VR: Adapting and Evaluating Established Authentication Systems for Virtual Reality.
- Lauren Goode and Tom Warren. 2015. This is what Microsoft HoloLens is really like. (2015). https://www.theverge.com/2016/4/1/11334488/microsoft-hololens-video-augmented-reality-ar-headset-hands-on
- Raphaël Grasset and Jean-Dominique Gascuel. 2002. MARE: Multiuser Augmented Reality Environment on Table Setup. In ACM SIGGRAPH 2002 Conference Abstracts and Applications (SIGGRAPH ’02). ACM, New York, NY, USA, 213–213. https://doi.org/10.1145/1242073.1242226
- Masayuki Hayashi, Ryo Yoshida, Itaru Kitahara, Yoshinari Kameda, and Yuichi Ohta. 2010. An installation of privacy-safe see-through vision. Procedia-Social and Behavioral Sciences 2, 1 (2010), 125–128.
- Wenbo He, Xue Liu, Hoang Nguyen, Klara Nahrstedt, and Tarek Abdelzaher. 2007. PDA: Privacy-preserving data aggregation in wireless sensor networks. In INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE, 2045–2053.
- Wenbo He, Xue Liu, Hoang Viet Nguyen, Klara Nahrstedt, and Tarek Abdelzaher. 2011. PDA: privacy-preserving data aggregation for information collection. ACM Transactions on Sensor Networks (TOSN) 8, 1 (2011), 6.
- Olli I Heimo, Kai K Kimppa, Seppo Helle, Timo Korkalainen, and Teijo Lehtonen. 2014. Augmented reality - Towards an ethical fantasy? In Ethics in Science, Technology and Engineering, 2014 IEEE International Symposium on. IEEE, 1–7.
- A. Henrysson, M. Billinghurst, and M. Ollila. 2005. Face to face collaborative AR on mobile phones. In Fourth IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR’05). 80–89. https://doi.org/10.1109/ISMAR.2005.32
- Hong Hua, Leonard D Brown, and Chunyu Gao. 2004. SCAPE: supporting stereoscopic collaboration in augmented and projective environments. IEEE Computer Graphics and Applications 24, 1 (2004), 66–75.
- Yan Huang, David Evans, Jonathan Katz, and Lior Malka. 2011. Faster Secure Two-Party Computation Using Garbled Circuits. In USENIX Security Symposium, Vol. 201.
- Suman Jana, David Molnar, Alexander Moshchuk, Alan M Dunn, Benjamin Livshits, Helen J Wang, and Eyal Ofek. 2013a. Enabling Fine-Grained Permissions for Augmented Reality Applications with Recognizers. In USENIX Security.
- Suman Jana, Arvind Narayanan, and Vitaly Shmatikov. 2013b. A Scanner Darkly: Protecting user privacy from perceptual applications. In Security and Privacy (SP), 2013 IEEE Symposium on. IEEE, 349–363.
- Mohamed Khamis, Florian Alt, Mariam Hassib, Emanuel von Zezschwitz, Regina Hasholzner, and Andreas Bulling. 2016. GazeTouchPass: Multimodal authentication using gaze and touch on mobile devices. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 2156–2164.
- Tadayoshi Kohno, Joel Kollin, David Molnar, and Franziska Roesner. 2016. Display Leakage and Transparent Wearable Displays: Investigation of Risk, Root Causes, and Defenses. Technical Report.
- Bernard Kress and Thad Starner. 2013. A review of head-mounted displays (HMD) technologies and applications for consumer electronics. In Proc. SPIE, Vol. 8720. 87200A.
- Kirk L Kroeker. 2010. Mainstreaming augmented reality. Commun. ACM 53, 7 (2010), 19–21.
- Patrik Lantz, Bjorn Johansson, Martin Hell, and Ben Smeets. 2015. Visual Cryptography and Obfuscation: A Use-Case for Decrypting and Deobfuscating Information Using Augmented Reality. In International Conference on Financial Cryptography and Data Security. Springer, 261–273.
- Kiron Lebeck, Tadayoshi Kohno, and Franziska Roesner. 2016. How to safely augment reality: Challenges and Directions. In Proceedings of the 17th International Workshop on Mobile Computing Systems and Applications. ACM, 45–50.
- Kiron Lebeck, Kimberly Ruth, Tadayoshi Kohno, and Franziska Roesner. 2017. Securing Augmented Reality Output. In Security and Privacy (SP), 2017 IEEE Symposium on. IEEE, 320–337.
- Linda Lee, Serge Egelman, Joong Hwa Lee, and David Wagner. 2015. Risk Perceptions for Wearable Devices. (2015). arXiv:1504.05694
- Ryong Lee, Daisuke Kitayama, Yong-Jin Kwon, and Kazutoshi Sumiya. 2009. Interoperable augmented web browsing for exploring virtual media in real space. In Proceedings of the 2nd International Workshop on Location and the Web. ACM, 7.
- Ang Li, Qinghua Li, and Wei Gao. 2016b. PrivacyCamera: Cooperative Privacy-Aware Photographing with Mobile Phones. In Sensing, Communication, and Networking (SECON), 2016 13th Annual IEEE International Conference on. IEEE, 1–9.
- Sugang Li, Ashwin Ashok, Yanyong Zhang, Chenren Xu, Janne Lindqvist, and Marco Gruteser. 2016a. Whose move is it anyway? Authenticating smart wearable devices using unique head movement patterns. In Pervasive Computing and Communications (PerCom), 2016 IEEE International Conference on. IEEE, 1–9.
- Joshua Lifton, Mathew Laibowitz, Drew Harry, Nan-Wei Gong, Manas Mittal, and Joseph A Paradiso. 2009. Metaphor and manifestation cross-reality with ubiquitous sensor/actuator networks. IEEE Pervasive Computing 8, 3 (2009).
- Pei-Yu Lin, Bin You, and Xiaoyong Lu. 2017. Video exhibition with adjustable augmented reality system based on temporal psycho-visual modulation. EURASIP Journal on Image and Video Processing 2017, 1 (2017), 7.
- Jianhua Ma, Laurence Tianruo Yang, Bernady O Apduhan, Runhe Huang, Leonard Barolli, Makoto Takizawa, and Timothy K Shih. 2005. A walkthrough from smart spaces to smart hyperspaces towards a smart world with ubiquitous intelligence. In Parallel and Distributed Systems, 2005. Proceedings. 11th International Conference on, Vol. 1. IEEE, 370–376.
- Anindya Maiti, Oscar Armbruster, Murtuza Jadliwala, and Jibo He. 2016. Smartwatch-based keystroke inference attacks and context-aware protection mechanisms. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. ACM, 795–806.
- Anindya Maiti, Murtuza Jadliwala, and Chase Weber. 2017. Preventing shoulder surfing using randomized augmented reality keyboards. In Pervasive Computing and Communications Workshops (PerCom Workshops), 2017 IEEE International Conference on. IEEE, 630–635.
- Richard McPherson, Suman Jana, and Vitaly Shmatikov. 2015. No Escape From Reality: Security and Privacy of Augmented Reality Browsers. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 743–753. https://doi.org/10.1145/2736277.2741657
- Frank McSherry and Kunal Talwar. 2007. Mechanism design via differential privacy. In Foundations of Computer Science, 2007. FOCS’07. 48th Annual IEEE Symposium on. IEEE, 94–103.
- Paul Milgram and Fumio Kishino. 1994. A taxonomy of mixed reality visual displays. IEICE TRANSACTIONS on Information and Systems 77, 12 (1994), 1321–1329.
- Paul Milgram, Haruo Takemura, Akira Utsumi, Fumio Kishino, et al. 1994. Augmented reality: A class of displays on the reality-virtuality continuum. In Telemanipulator and telepresence technologies, Vol. 2351. 282–292.
- Prashanth Mohan, Abhradeep Thakurta, Elaine Shi, Dawn Song, and David Culler. 2012. GUPT: privacy preserving data analysis made easy. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 349–360.
- Max Möllers and Jan Borchers. 2011. TaPS widgets: interacting with tangible private spaces. In Proceedings of the ACM International Conference on Interactive Tabletops and Surfaces. ACM, 75–78.
- Alessandro Mulloni, Daniel Wagner, and Dieter Schmalstieg. 2008. Mobility and social interaction as core gameplay elements in multi-player augmented reality. In Proceedings of the 3rd international conference on Digital Interactive Media in Entertainment and Arts. ACM, 472–478.
- Min Mun, Shuai Hao, Nilesh Mishra, Katie Shilton, Jeff Burke, Deborah Estrin, Mark Hansen, and Ramesh Govindan. 2010. Personal data vaults: a locus of control for personal data streams. In Proceedings of the 6th International Conference. ACM, 17.
- Joint Task Force Transformation Initiative NIST. 2013. Security and Privacy Controls for Federal Information Systems and Organizations. Technical Report.
- Jennifer Pearson, Simon Robinson, Matt Jones, Anirudha Joshi, Shashank Ahire, Deepak Sahoo, and Sriram Subramanian. 2017. Chameleon devices: investigating more secure and discreet mobile interactions via active camouflaging. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. ACM, 5184–5196.
- Ihsan Rabbi and Sehat Ullah. 2013. A survey on augmented reality challenges and tracking. Acta Graphica znanstveni časopis za tiskarstvo i grafičke komunikacije 24, 1-2 (2013), 29–46.
- Kiran B Raja, Ramachandra Raghavendra, Martin Stokkenes, and Christoph Busch. 2015. Multi-modal authentication system for smartphones using face, iris and periocular. In Biometrics (ICB), 2015 International Conference on. IEEE, 143–150.
- Nisarg Raval, Animesh Srivastava, Kiron Lebeck, Landon Cox, and Ashwin Machanavajjhala. 2014. MarkIt: Privacy markers for protecting visual secrets. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication. ACM, 1289–1295.
- Nisarg Raval, Animesh Srivastava, Ali Razeen, Kiron Lebeck, Ashwin Machanavajjhala, and Landon P Cox. 2016. What you mark is what apps see. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 249–261.
- H.T. Regenbrecht, M. Wagner, and G. Baratoff. 2002. MagicMeeting: A Collaborative Tangible Augmented Reality System. Virtual Reality 6, 3 (01 Oct 2002), 151–166.
- Derek Reilly, Mohamad Salimian, Bonnie MacKay, Niels Mathiasen, W Keith Edwards, and Juliano Franz. 2014. SecSpace: prototyping usable privacy and security for mixed reality collaborative environments. In Proceedings of the 2014 ACM SIGCHI symposium on Engineering interactive computing systems. ACM, 273–282.
- Jingjing Ren, Ashwin Rao, Martina Lindorfer, Arnaud Legout, and David Choffnes. 2016. ReCon: Revealing and controlling PII leaks in mobile network traffic. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 361–374.
- Franziska Roesner, Tamara Denning, Bryce Clayton Newell, Tadayoshi Kohno, and Ryan Calo. 2014a. Augmented reality: hard problems of law and policy. In Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing: adjunct publication. ACM, 1283–1288.
- Franziska Roesner, Tadayoshi Kohno, and David Molnar. 2014b. Security and Privacy for Augmented Reality Systems. Commun. ACM 57, 4 (April 2014), 88–96. https://doi.org/10.1145/2580723.2580730
- Franziska Roesner, David Molnar, Alexander Moshchuk, Tadayoshi Kohno, and Helen J Wang. 2014c. World-driven access control for continuous sensing. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1169–1181.
- Cynthia E Rogers, Alexander W Witt, Alexander D Solomon, and Krishna K Venkatasubramanian. 2015. An approach for user identification for head-mounted displays. In Proceedings of the 2015 ACM International Symposium on Wearable Computers. ACM, 143–146.
- Peter Rubin. 2017. Facebook’s Bizarre VR App Is Exactly Why Zuck Bought Oculus. (2017). https://www.wired.com/2017/04/facebook-spaces-vr-for-your-friends/
- Dieter Schmalstieg and Gerd Hesina. 2002. Distributed applications for collaborative augmented reality. In Virtual Reality, 2002. Proceedings. IEEE. IEEE, 59–66.
- Stefan Schneegass, Youssef Oualil, and Andreas Bulling. 2016. SkullConduct: Biometric user identification on eyewear computers using bone conduction through the skull. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 1379–1384.
- Yoones A Sekhavat. 2017. Privacy preserving cloth try-on using mobile augmented reality. IEEE Transactions on Multimedia 19, 5 (2017), 1041–1049.
- Jiayu Shu, Rui Zheng, and Pan Hui. 2016. Cardea: Context-Aware Visual Privacy Protection from Pervasive Cameras. arXiv preprint arXiv:1610.00889 (2016).
- Mark Simkin, Dominique Schröder, Andreas Bulling, and Mario Fritz. 2014. Ubic: Bridging the gap between digital cryptography and the physical world. In European Symposium on Research in Computer Security. Springer, 56–75.
- Latanya Sweeney. 2002. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 05 (2002), 557–570.
- Zsolt Szalavári, Erik Eckstein, and Michael Gervautz. 1998. Collaborative gaming in augmented reality. In Proceedings of the ACM symposium on Virtual reality software and technology. ACM, 195–204.
- Zsolt Szalavári and Michael Gervautz. 1997. The personal interaction Panel–a Two-Handed interface for augmented reality. In Computer graphics forum, Vol. 16. Wiley Online Library.
- Piotr Szczuko. 2014. Augmented reality for privacy-sensitive visual monitoring. In International Conference on Multimedia Communications, Services and Security. Springer, 229–241.
- Robert Templeman, Mohammed Korayem, David J Crandall, and Apu Kapadia. 2014. PlaceAvoider: Steering First-Person Cameras away from Sensitive Spaces. In NDSS.
- Bruce H Thomas and Wayne Piekarski. 2002. Glove based user interaction techniques for augmented reality in an outdoor environment. Virtual Reality 6, 3 (2002), 167–180.
- Khai Truong, Shwetak Patel, Jay Summet, and Gregory Abowd. 2005. Preventing camera recording by designing a capture-resistant environment. UbiComp 2005: Ubiquitous Computing (2005), 903–903.
- Dan Tynan. 2017. Augmented reality could be next hacker playground. (2017). https://www.the-parallax.com/2017/06/09/augmented-reality-hacker-playground/
- Krishna K Venkatasubramanian, Ayan Banerjee, and Sandeep Kumar S Gupta. 2010. PSKA: Usable and secure key agreement scheme for body area networks. IEEE Transactions on Information Technology in Biomedicine 14, 1 (2010), 60–68.
- John Vilk, David Molnar, Benjamin Livshits, Eyal Ofek, Chris Rossbach, Alexander Moshchuk, Helen J Wang, and Ran Gal. 2015. SurroundWeb: Mitigating privacy concerns in a 3D web browser. In Security and Privacy (SP), 2015 IEEE Symposium on. IEEE, 431–446.
- John Vilk, David Molnar, Eyal Ofek, Chris Rossbach, Benjamin Livshits, Alexander Moshchuk, Helen J Wang, and Ran Gal. 2014. Least Privilege Rendering in a 3D Web Browser. Technical Report.
- Junjue Wang, Brandon Amos, Anupam Das, Padmanabhan Pillai, Norman Sadeh, and Mahadev Satyanarayanan. 2017. A Scalable and Privacy-Aware IoT Service for Live Video Analytics. (2017).
- Grace Woo, Andrew Lippman, and Ramesh Raskar. 2012. VRCodes: Unobtrusive and active visual codes for interaction by exploiting rolling shutter. In Mixed and Augmented Reality (ISMAR), 2012 IEEE International Symposium on. IEEE, 59–64.
- Chao Xu, Parth H Pathak, and Prasant Mohapatra. 2015. Finger-writing with smartwatch: A case for finger and hand gesture recognition using smartwatch. In Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications. ACM, 9–14.
- Yan Xu, Maribeth Gandy, Sami Deen, Brian Schrank, Kim Spreen, Michael Gorbsky, Timothy White, Evan Barba, Iulian Radu, Jay Bolter, et al. 2008. BragFish: exploring physical and social interaction in co-located handheld augmented reality games. In Proceedings of the 2008 international conference on advances in computer entertainment technology. ACM, 276–283.
- Zhi Xu and Sencun Zhu. 2015. SemaDroid: A privacy-aware sensor management framework for smartphones. In Proceedings of the 5th ACM Conference on Data and Application Security and Privacy. ACM, 61–72.
- Andrew Chi-Chih Yao. 1986. How to generate and exchange secrets. In Foundations of Computer Science, 1986., 27th Annual Symposium on. IEEE, 162–167.
- Eisa Zarepour, Mohammadreza Hosseini, Salil S Kanhere, and Arcot Sowmya. 2016. A context-based privacy preserving framework for wearable visual lifeloggers. In Pervasive Computing and Communication Workshops (PerCom Workshops), 2016 IEEE International Conference on. IEEE, 1–4.
- Juncheng Zhao and Hock Soon Seah. 2016. Interaction in marker-less augmented reality based on hand detection using leap motion. In Proceedings of the 15th ACM SIGGRAPH Conference on Virtual-Reality Continuum and Its Applications in Industry-Volume 1. ACM, 147–150.