# Mobile Augmented Reality Survey: A Bottom-up Approach

## Abstract

Augmented Reality (AR) is becoming mobile. Mobile devices have many constraints but also rich new features that traditional desktop computers do not have. There are several survey papers on AR, but none is dedicated to Mobile Augmented Reality (MAR). Our work aims to close this gap. The contents are organized with a bottom-up approach. We first present the state of the art in system components, including hardware platforms, software frameworks and display devices, followed by enabling technologies such as tracking and data management. We then survey the latest technologies and methods to improve run-time performance and energy efficiency for practical implementation. On top of these, we introduce the application fields and several typical MAR applications. Finally, we conclude the survey with several challenging problems that are under exploration and will require substantial research effort in the future.

## 1 Introduction

#### Training and Education

The Naval Research Lab (NRL) developed a MAR-based military training system [?] to train soldiers for military operations in different environments. The battlefield was augmented with virtual 3D goals and hazards that could be deployed beforehand or dynamically at run time. Traskback and Haller [?] used MAR technology to augment oil refinery training. Traditional training was conducted in classrooms, or on-site when the plant was shut down for safety reasons. The system enabled on-site training on running devices so that trainees could examine the run-time workflow. Klopfer et al. [?] proposed a collaborative MAR education system for museums. Players used Pocket PCs and walkie-talkies to interview virtual characters and operate virtual instruments. To finish the task, children were encouraged to engage with exhibits more deeply and broadly. Schmalstieg and Wagner [?] built a MAR game engine based on Studierstube ES. To finish the task, players had to search around using cues displayed on their handheld devices when they pointed the devices at exhibits. Freitas and Campos [?] developed an education system, SMART, for low-grade students. Virtual 3D models such as cars and airplanes were overlaid on real-time video to demonstrate concepts of transportation and animations.

#### Geometry Modeling and Scene Construction

Baillot et al. [?] developed a MAR-based authoring tool to create geometry models. Modelers extracted key points from real objects and then constructed geometric primitives from these points to create 3D models. The created models were registered and aligned with the real objects for checking and verification. Piekarski and Thomas [?] built a similar system for creating outdoor objects. It used pinch gloves and hand tracking technologies to manipulate models. The system was especially suitable for creating geometric models of very large objects (e.g. buildings), as users could stand some distance away. Ledermann et al. [?] developed a high-level authoring tool, APRIL, to design MAR presentations. They integrated it into a furniture design application. Users could design and construct virtual models with real furniture as a reference in the same view. Henrysson et al. [?] employed MAR to construct 3D scenes on a mobile phone in a novel way. Motions of the mobile phone were tracked and interpreted as translation and rotation manipulations of virtual objects. Bergig et al. [?] developed a 3D sketching system to create virtual scenes for augmented mechanical experiments. Users designed the experiment scene by hand, superimposed on a real drawing. Hagbi et al. [?] extended it to support virtual scene construction for augmented games.

#### Assembly and Maintenance

Klinker et al. [?] developed a MAR application for nuclear plant maintenance. The system created an information model based on legacy paper documents so that users could easily obtain related information overlaid on real devices. The system was able to highlight faulty devices and supplied instructions to repair them. Billinghurst et al. [?] created a mobile phone system to offer users step-by-step guidance for assembly. A virtual view of the next step was overlaid on the current view to help users decide which component to add and where to place it. Henderson and Feiner [?] developed a MAR-based assembly system. Auxiliary information such as virtual arrows, labels and dashed alignment lines was overlaid on the current view to facilitate maintenance. A case study showed that users completed tasks significantly faster and more accurately than when looking up guidebooks. An empirical study [?] showed that MAR helped to reduce assembly errors by 82%. In addition, it decreased users' mental effort. However, how to balance user attention between the real world and virtual content, so as to avoid distraction due to over-reliance, is still an open problem.

#### Information Assistant Management

Goose et al. [?] developed a MAR-based industrial service system to check equipment status. The system used tagged visual markers to obtain identification information, which was sent to management software to query equipment state. Data such as pressure and temperature were sent back and overlaid for display on the PDA. White et al. [?] developed a head-worn MAR system to help botanists manage specimens in the field. The system searched a species database and listed related species samples side by side with physical specimens for comparison and identification. Users scrolled the virtual voucher list by rotating their heads horizontally and zoomed a virtual voucher by nodding. Deffeyes [?] implemented an Asset Assistant Augmented Reality system to help data center administrators find and manage assets. QR codes were used to recognize assets. Asset information was retrieved from a MAMEO server and overlaid on the current view of the asset.

MAR has also found markets in other fields. It has been used to enhance visualization and plan operations by placing augmented graphic scans over the surgeon’s field of view [?]. Rosenthal et al. [?] used the “x-ray vision” feature to look through the body and make sure the needle was inserted at the right place. MAR has also been used to manage personal information [?]. Another large class of applications is AR browsers on mobile phones [?]. An AR browser is similar to a MAR navigation application, but places more emphasis on location-based services (LBS). Grubert et al. [?] conducted a detailed survey of AR browsers on mobile phones.

### 4.2 Representative Systems

#### MARS

MARS [?] is a MAR system for both indoor and outdoor use, developed by a team at Columbia University. They have built several iterations of the prototype and developed a series of hardware and software infrastructures. The system comprises a backpack laptop, a see-through head-worn display and several input devices, including a stylus and a trackpad. An orientation tracker and RTK GPS are used to obtain pose information. A campus guide system has been developed based on MARS. Visitors are able to obtain detailed information overlaid on items in their current field of view. They can also view virtual models of demolished buildings on their original sites. The system supports virtual menus overlaid on the user's field of view to conduct different tasks. It can also be used for other applications such as tourism, journalism, military training and wayfinding.

#### ARQuake

ARQuake [?] is a single-player outdoor MAR game based on the popular desktop game Quake. Virtual monsters are overlaid on the current view of the real world. The player moves around the real world and uses real props and metaphors to kill virtual monsters. Real buildings are modeled but not rendered; the models are used only for view occlusion. The game uses GPS and a digital compass to track the player’s position and orientation. A vision-based tracking method is used for indoor environments. As virtual objects may be difficult to distinguish from natural surroundings outdoors, the system has to be run several times to find a distinguishable color configuration for later use.

#### BARS

BARS [?] is a battlefield augmented reality system for training soldiers in urban environments. Soldiers’ perception of the battlefield environment is augmented by overlaying the locations of buildings, enemies and companions on the current field of view. A wireframe plan is superimposed on a real building to show its interior structure. An icon is used to report the location of a sniper for threat warning or collaborative attack. A connection and database manager is employed to distribute data efficiently. Each object is created at remote servers, but only a simplified ghost copy is used on clients to reduce bandwidth traffic. The system requires a two-step calibration. The first step calculates the mapping from the output of a sensing device to the real position and orientation of the sensors; the second maps the reference frame of the sensing unit to the reference frame of the user’s viewpoint.
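The two-step calibration can be read as a chain of coordinate transforms. The following is a minimal sketch, assuming 2D rigid transforms for brevity; the function names and all numeric values are hypothetical and are not taken from BARS itself.

```python
import math

def make_transform(theta_deg, tx, ty):
    """Build a 3x3 homogeneous 2D rigid transform: rotate by theta, then translate."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]

def compose(a, b):
    """Return the matrix product a * b (b is applied first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(transform, point):
    """Apply a homogeneous transform to a 2D point."""
    x, y = point
    return (transform[0][0] * x + transform[0][1] * y + transform[0][2],
            transform[1][0] * x + transform[1][1] * y + transform[1][2])

def world_from_view(world_from_tracker, tracker_reading, sensor_from_view):
    """Chain both calibration results with the live tracker reading."""
    # Step 1: map the raw sensor output to the sensor's real pose in the world.
    world_from_sensor = compose(world_from_tracker, tracker_reading)
    # Step 2: map the sensing unit's frame to the user's viewpoint frame.
    return compose(world_from_sensor, sensor_from_view)

# Hypothetical calibration results and one tracker reading.
world_from_tracker = make_transform(90.0, 1.0, 0.0)  # found in step 1
sensor_from_view = make_transform(0.0, 0.0, 0.1)     # found in step 2
tracker_reading = make_transform(0.0, 2.0, 0.0)      # live sensor output

eye_in_world = apply(
    world_from_view(world_from_tracker, tracker_reading, sensor_from_view),
    (0.0, 0.0))
print(eye_in_world)
```

Keeping the two calibration results as separate matrices means that the sensor-to-viewpoint offset can be re-estimated without repeating the sensor calibration.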

#### Medien.welten

Medien.welten [?] is a MAR system that has been deployed at Technisches Museum Wien in Vienna. The system is based on Studierstube ES. A scene graph is designed to facilitate the construction and organization of 3D scenes. The total memory footprint is limited to 500k to meet the severe hardware limitations of handheld devices. Game logic and state are stored in an XML database to guard against client failure caused by wireless signal shielding. Medien.welten enables players to use an augmented virtual interface on handheld devices to manipulate and communicate with real exhibits. Through interactive manipulation, players gain both an enjoyable experience and knowledge of the exhibits.

#### MapLens

MapLens [?] is an augmented paper map. The system uses the mobile phone’s viewfinder, or “magic lens”, to augment a paper map with geographical information. When users view a paper map through the embedded camera, feature points on the map are tracked and matched against feature points tagged with geographical information to obtain GPS coordinates and pose information. The GPS coordinates are used to search an online HyperMedia database (HMDB) and retrieve location-based media such as photos and other metadata, which are overlaid on the paper map according to the current pose. Augmented maps can be used in collaborative systems. Users can share GPS-tagged photos by uploading images to the HMDB so that others can view the new information. This establishes common ground for multiple users to negotiate and discuss in order to solve a task collaboratively. Results show that it is more efficient than digital maps.

#### Virtual LEGO

Virtual LEGO [?] uses a mobile phone to manipulate virtual graphics objects for 3D scene creation. The motion of the mobile phone is used to interact with virtual objects. Virtual objects are fixed relative to the mobile phone. When users move their phones, the objects move according to the movement of the phone relative to the real world. In translation mode, the selected object is translated by the same distance as the mobile phone. In rotation mode, the translation of the mobile phone is projected onto a virtual Arcball and converted into a rotation direction and angle to rotate the virtual object. The objects are organized in a hierarchical structure so that the transformation of a parent object is propagated to its sub-objects. A tracking method based on multiple visual markers is employed to guarantee the accuracy and robustness of mobile phone tracking. Results show that the manipulation is more efficient than button interfaces such as keypad and joypad, albeit with relatively low accuracy.
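To illustrate how a translation can be turned into a rotation, the sketch below follows a common Arcball formulation: the start and end points of the motion are projected onto a unit sphere, and the rotation axis and angle are derived from the two resulting vectors. The exact mapping used by Virtual LEGO is not specified here, so the function names and the normalized input range are assumptions.

```python
import math

def project_to_arcball(x, y):
    """Map a 2D point (normalized to roughly [-1, 1]) onto the unit sphere.
    Points outside the unit disc are clamped to the sphere's rim."""
    d2 = x * x + y * y
    if d2 <= 1.0:
        return (x, y, math.sqrt(1.0 - d2))
    d = math.sqrt(d2)
    return (x / d, y / d, 0.0)

def arcball_rotation(start, end):
    """Return (axis, angle_rad) of the rotation taking start to end on the sphere."""
    a = project_to_arcball(*start)
    b = project_to_arcball(*end)
    # The cross product gives the rotation axis; its length is sin(angle).
    axis = (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
    norm = math.sqrt(sum(c * c for c in axis))
    dot = sum(p * q for p, q in zip(a, b))
    if norm < 1e-12:
        # No usable motion: return an arbitrary axis with a zero angle.
        return (0.0, 0.0, 1.0), 0.0
    angle = math.atan2(norm, dot)
    return tuple(c / norm for c in axis), angle

# Moving from the center of the view to its right edge.
axis, angle = arcball_rotation((0.0, 0.0), (1.0, 0.0))
print(axis, math.degrees(angle))
```

In a hierarchical scene, the resulting rotation would be applied to the selected parent object and inherited by its sub-objects.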

#### InfoSPOT

InfoSPOT [?] is a MAR system that helps facility managers (FMs) access building information. It augments FMs’ situational awareness by overlaying device information on the view of the real environment, enabling them to solve problems quickly and make critical decisions during inspection activities. The Building Information Modeling (BIM) model is parsed into geometry and data parts, which are linked by unique identifiers. Geo-reference points are surveyed beforehand to obtain accurate initial registration and indoor localization. The geometry part is used to render panoramas of specific locales to reduce sensor drift and latency. When FMs click the virtual icon of a physical object on the iPad screen, its identifier is extracted and used to search the data model for information such as product manufacturer, installation date and product life.
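The split into geometry and data parts joined by a unique identifier can be sketched as follows. This is an illustrative data structure only: the class names, fields and sample values are hypothetical and do not reflect InfoSPOT's actual schema.

```python
from dataclasses import dataclass

@dataclass
class GeometryPart:
    object_id: str    # unique identifier shared with the data part
    icon_rect: tuple  # (left, top, right, bottom) of the on-screen icon, in pixels

@dataclass
class DataPart:
    manufacturer: str
    installation_date: str
    expected_life_years: int

def find_tapped_object(geometry_parts, tap):
    """Return the identifier of the first icon containing the tap, or None."""
    x, y = tap
    for part in geometry_parts:
        left, top, right, bottom = part.icon_rect
        if left <= x <= right and top <= y <= bottom:
            return part.object_id
    return None

def fetch_info(geometry_parts, data_parts, tap):
    """Resolve a screen tap to the linked data record via the identifier."""
    object_id = find_tapped_object(geometry_parts, tap)
    if object_id is None:
        return None
    return data_parts.get(object_id)

# Hypothetical sample model with one device.
geometry_parts = [GeometryPart("ahu-01", (100, 100, 200, 180))]
data_parts = {"ahu-01": DataPart("ExampleCorp", "2010-05-01", 15)}

print(fetch_info(geometry_parts, data_parts, (150, 120)))
```

Keeping the data part in a separate keyed store means that the rendering side only has to carry identifiers, not the full product records.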

#### Google Glass

Google Glass [?] is a wearable AR device developed by Google. It displays information on a glass surface in front of the user’s eyes and lets users control the interface with natural-language voice commands. Google Glass supports several native functions of mobile phones, such as sending messages, taking pictures, recording video, searching for information and navigation. Videos and images can be shared with others through Google+. The current product uses a smartphone as a network relay for Internet access. As it focuses only on text- and image-based augmentation on a tangible interface, it does not require tracking or alignment of virtual and real objects. At present it is available only to developers and is reported to be widely delivered later this year.

#### Wikitude

Wikitude [?] is an LBS-based AR browser that augments information on mobile phones. It is referred to as an “AR browser” because it augments the view with web-based information. Wikitude overlays text and image information on the current view when users point their mobile phones at geo-located sites. Wikitude combines GPS and digital compass sensors to track pose. Content is organized in the KML and ARML formats to support geographic annotation and visualization. Users can also register custom web services to obtain specific information.
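As a rough illustration of how a GPS position and a compass heading can decide where a geo-located annotation appears, the sketch below computes the bearing from the user to a point of interest and maps it to a horizontal screen position. It is not Wikitude's implementation: the function names, the 60-degree field of view, the 1080-pixel width and the linear angle-to-pixel mapping are all simplifying assumptions.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def screen_x(user_lat, user_lon, heading_deg, poi_lat, poi_lon,
             fov_deg=60.0, width_px=1080):
    """Horizontal pixel position of the annotation, or None if the
    point of interest lies outside the camera's field of view."""
    bearing = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    # Signed angle between the camera heading and the POI, in [-180, 180).
    relative = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(relative) > fov_deg / 2.0:
        return None
    # Linear approximation: the field of view spans the screen width.
    return round((relative / fov_deg + 0.5) * width_px)

# A POI due east of the user, with the phone pointing east, then west.
print(screen_x(0.0, 0.0, 90.0, 0.0, 0.001))
print(screen_x(0.0, 0.0, 270.0, 0.0, 0.001))
```

Because this approach relies only on GPS and compass readings, the overlay is only as accurate as those sensors; no visual alignment with the scene is performed.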

Table ? lists the system components and enabling technologies of the aforementioned MAR applications. Early applications employed backpack notebook computers for computing tasks. External HMDs were required to provide optical see-through display. As mobile devices have become powerful and popular, numerous applications use them as computing platforms. The embedded camera and built-in screen are used for video see-through display. Single tracking methods have also been replaced with hybrid methods to obtain highly accurate results in both indoor and outdoor environments. Recent applications offload computation to servers and the cloud to gain acceleration and reduce the requirements on the client mobile device. With rapid advances in all of these aspects, MAR will be widely used in more application fields.

## 5 Challenging problems

### 5.1 Technology limitations

MAR builds on the various technologies mentioned above. Many problems, such as network QoS deficiency and display limitations, remain unsolved in their own fields. Some problems are induced by the combination of multiple technologies. For instance, battery capacity is designed to sustain common functions such as picture capturing and Internet access, which are expected to be used intermittently. MAR applications require the camera, GPS receiver and Internet connection to operate together for long periods, which can drain the battery quickly. Many highly accurate tracking approaches are available in the computer vision field, but they cannot be used directly on mobile devices because of limited computing capability. We have discussed technology-related challenges in previous sections. Several papers [?] have also investigated them in detail. MAR also has several intrinsic problems from the technology aspect, which are largely underexplored but deserve careful consideration. We address these challenges in the following sections.

### 5.2 Privacy and security

Privacy and security are especially serious concerns for MAR because of the various potential sources of invasion, including personal identification, location tracking and private data storage. Many MAR applications depend on personal location information to provide services. For instance, in client-server applications, the user’s position is transmitted to a third-party server for tracking and analysis, where it may be collected over time to trace user activity. The problem is more serious for collaborative MAR applications, as users have to share information with others. This not only gives others an opportunity to snoop on private information but also raises the question of how to trust the quality and authenticity of user-generated information supplied by others. To protect privacy, we require both trust models for data generation and certification mechanisms for data access. Google Glass is a typical example of users’ concern about privacy. Although it is not widely delivered yet, it has already raised the concern that users could identify strangers by facial recognition or surreptitiously record and broadcast private conversations. Acquisti et al. [?] discussed the privacy problem of facial recognition in AR applications. Their experiment implied that users could infer sensitive information by matching faces against facial images from online sources such as social network sites (SNS). They proposed several privacy guidelines, including openness, use limitation, purpose specification and collection limitation, to protect privacy.

### 5.3 Application breakthrough

Most existing MAR applications are only prototypes for experiment and demonstration purposes. MAR has great potential to change the way we interact with the real world, but it still lacks killer applications to show its capability, which may make it less attractive to most users. Breakthrough applications are likely to be ones that, for example, let drivers see through buildings to avoid cars coming from cross streets, or help backhoe operators avoid fiber-optic cables buried underground. We witnessed a similar situation in the development of Virtual Reality (VR) over the past decades, so we should create feasible applications to spare MAR the damage that VR suffered when it went from hype to oblivion. Google Glass is a milestone product that has raised public interest, but it is still held back by the absence of killer applications. Google has delivered its Explorer Edition to developers so that significant applications can be created before it becomes widely available to the public.

### 5.4 Over-emphasized self-support

Many MAR systems are designed to be self-contained so that they are free from environmental support. Self-support is emphasized in order to map completely unknown surroundings and improve user experience. However, it introduces complexity and limitations. For instance, many systems employ visual feature methods to avoid deploying visual markers beforehand, but heavy computational overhead and poor robustness make such systems even less practical for most applications. Besides, what useful annotations can be expected if we know nothing about the environment? It is still unclear whether a system needs to be universal for completely unprepared surroundings. With the development of pervasive computing and the Internet of Things (IoT), computing and identification are woven into the fabric of daily life and become indistinguishable from the environment. The system may therefore be deeply integrated with the environment rather than isolated from it. Another reason given for emphasizing self-support is outdoor usage. A case study [?] showed that GPS coverage during usage was much lower than expected. As GPS is shielded in indoor environments, this indicates that users may spend most of their time indoors, so making the system completely self-contained may not be so urgent.

### 5.5 Social acceptance

Many factors, such as device intrusion, privacy and safety considerations, may affect the social acceptance of MAR. To reduce system intrusion, we should both miniaturize computing and display devices and supply a natural interactive interface. Early applications equipped with backpack laptop computers and HMDs were seriously intrusive. Progress in the miniaturization and performance of mobile devices alleviates the problem to a certain extent, but these devices still do not work in a natural way. Users have to raise their hands to point cameras at real objects during system operation, which may cause physical fatigue. Privacy is also a serious concern. A “Stop the Cyborgs” movement has attempted to convince people to ban Google Glass on their premises. Many companies also post anti-Google Glass signs in their offices. As MAR occasionally distracts users’ attention from the real world, it also induces safety problems when users are operating motor vehicles or walking in the streets. All these issues together hinder the social acceptance of MAR technology.

## 6 Conclusions

In this paper, we give a complete and detailed survey of MAR technology in terms of system components, enabling technologies and application fields. Although several problems remain from both technical and application aspects, MAR is regarded as one of the most promising mobile applications. MAR has become an important way to interact with the real world and will change our daily life.

With the rapid development of cloud computing and wireless networks, mobile cloud computing has become a new trend that combines the high flexibility of mobile devices with the high-performance capabilities of cloud computing. It will play a key role in MAR applications since it can take over heavy computational tasks to save energy and extend battery lifetime. Furthermore, cloud services for MAR applications can operate as caches and decrease the computational cost for both MAR devices and cloud providers. As MAR applications run on a remote server, we can also overcome the limitations of mobile operating systems with the help of mobile browsers. It is also possible to combine multiple mobile devices for cooperative mobile computing, which is suitable for collaborative MAR applications such as multi-player games, collaborative design and virtual meetings. Although several problems remain, such as bandwidth limitation, service availability, heterogeneity and security, mobile cloud computing and cooperative mobile computing appear to be promising new technologies for promoting MAR development to a higher level.
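The trade-off between offloading and bandwidth limitation can be made concrete with a first-order energy estimate. The sketch below is an illustration under simplifying assumptions, not a model taken from this survey: it compares the energy of computing locally with the energy of idling while the cloud computes plus the energy of transmitting the input data. All function names, power figures and workload sizes are hypothetical.

```python
def local_energy_j(cycles, mobile_hz, compute_power_w):
    """Energy (joules) to run the task on the mobile device."""
    return compute_power_w * cycles / mobile_hz

def offload_energy_j(cycles, cloud_hz, idle_power_w,
                     payload_bits, bandwidth_bps, radio_power_w):
    """Energy (joules) spent by the device while the cloud runs the task:
    idling during remote computation plus transmitting the input data."""
    wait = idle_power_w * cycles / cloud_hz
    transfer = radio_power_w * payload_bits / bandwidth_bps
    return wait + transfer

def should_offload(cycles, mobile_hz, cloud_hz, compute_power_w,
                   idle_power_w, payload_bits, bandwidth_bps, radio_power_w):
    """Offload only when it costs the device less energy than local execution."""
    return (offload_energy_j(cycles, cloud_hz, idle_power_w,
                             payload_bits, bandwidth_bps, radio_power_w)
            < local_energy_j(cycles, mobile_hz, compute_power_w))

# Hypothetical task: 10 G cycles of work on an 8 Mbit camera frame.
task = dict(cycles=10e9, mobile_hz=1e9, cloud_hz=10e9,
            compute_power_w=0.9, idle_power_w=0.3,
            payload_bits=8e6, radio_power_w=1.3)

print(should_offload(bandwidth_bps=4e6, **task))  # fast link
print(should_offload(bandwidth_bps=1e5, **task))  # slow link
```

With these assumed numbers, the same task is worth offloading on the fast link but not on the slow one, which is why bandwidth limitation appears among the open problems.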

Present MAR applications are limited to mobile devices such as tablets and mobile phones. We believe that these devices are transitional choices for MAR, as they were not originally designed for it. They happen to be suitable but are not ideal. Only dedicated devices such as Google Glass can fully exploit the potential of MAR. With the development of mobile computing and wearable computers such as AR glasses and wristwatch devices, we look forward to the renaissance and prosperity of MAR in the near future.
