CollaboVR: A Reconfigurable Framework for Multi-user to Communicate in Virtual Reality

Abstract

Despite various collaborative software that supports expressing ideas, people still largely prefer physical notebooks or whiteboards. The reason is that they provide free-form expressions, co-presence of all participants and easy collaboration. However, when working with remote participants, people often choose the convenience of video conferencing, perhaps with screen sharing. We propose CollaboVR, a reconfigurable framework for distributed and co-located multi-user communication in Virtual Reality. We tested CollaboVR with an application that lets participants create freehand drawings and 3D objects, while allowing participants to adjust two key variables: (1) User arrangement (participants adjust the location of their views of other participants) and (2) Input orientation (participants adjust the input to be vertical or horizontal). Preliminary user studies show that CollaboVR is a useful collaborative tool. Users report that some user arrangements and input orientations work best for brainstorming, while others work best for presentation or collaborative refinement of designs.

Virtual Reality; freehand drawing; computer-supported collaborative work.

Teaser figure: CollaboVR is a reconfigurable framework for multi-user communication in Virtual Reality. Here are four use cases of CollaboVR. Figure (a) demonstrates how people use it to schedule daily life. Figure (b) demonstrates a face-to-face presentation on the topic of hyper dimension. Figure (c) illustrates how to discuss furniture placement with a roommate. Figure (d) illustrates a person learning baroque style design. Different combinations of user arrangement and input orientation are applied to these use cases for more effective communication.

CCS Concepts: Human-centered computing → Virtual reality; Human-centered computing → Collaborative interaction

1 Introduction

Virtual Reality (VR) is increasingly being explored as a tool for human-computer interaction (HCI), spurred by the availability of high-quality consumer headsets in recent years. VR enables a rich design space in HCI by providing 3D input and immersive experiences. In 1998, the “Office of the future” [44] was proposed to allow remotely located people to feel as though they were together in a shared office space, via a hybrid of modalities including telepresence, large panoramic displays and manipulation of shared 3D objects. The core idea was that VR, as well as Augmented Reality (AR) and projection-based immersive experiences, had the potential to enhance communication and collaboration among groups of people. Since then, significant progress has been made in exploring techniques for communication [21, 39], collaborative work [23, 52], infrastructure [31, 36, 55] and various modalities [11, 28, 29, 35] in the field of multi-user experiences. However, we found that many works still use mono audio as the only direct communication method during immersive collaborative experiences [60], and others provide only indirect communication approaches such as changing the color of a shared target [19]. In this paper, we design a VR communication framework that could benefit a variety of collaborative experiences.

Much research has contributed to improving communication through VR, AR or tabletop interfaces for both co-located and distributed groups. As observed by Lindemann [56], gestures or visual aids are commonly used while speaking to others in daily life. Awareness of these behaviors is technically considered workspace awareness, defined as the up-to-the-moment understanding of other people’s interaction in a shared workspace [30]. Support for workspace awareness is widely employed in research systems. One trend is improving communication through shared displays. Some work enables face-to-face interaction, such as VideoDraw [53], ClearBoard [21] and FacingBoard [30]. More traditional techniques, including whiteboards and side-by-side projected displays, have been shown to improve group performance as well [42]. Furthermore, MMSpace implemented kinetic displays so that each display can change its orientation to realize face-to-face interaction for different participants [39]. Another trend is the use of tabletop tangible devices [5, 24], which also enhance the co-presence of other people in the group. How we perceive and manage our interpersonal space, called proxemics [32], is also highly relevant. Inspired by this previous work, we propose to design a multi-user system which supports different options for user arrangement (the ways that participants can adjust the views of other participants) and input orientation (the ways that participants can adjust the orientation of the input surface).

We propose CollaboVR, a reconfigurable framework for distributed and co-located multi-user communication in VR. The framework includes the following:

• A protocol connecting the application to the VR clients. To provide rich and smart visual aids, we choose Chalktalk, an open-source digital presentation and communication language to support freehand drawing and 3D objects [41]. More details are described later.

• A star network to support communication between distributed clients and the server.

• Rich interactions for manipulating drawings so that all of the visual aids (freehand drawing or 3D objects) can be further manipulated after being drawn.

• Multiple user arrangements in which participants can adjust the location of their views of other participants.

• Multiple input orientations in which participants can adjust the vertical versus horizontal angle of the input area.

Our main contributions in this paper are as follows:

1. Designing and implementing a reconfigurable framework for multi-user communication.

2. A system evaluation which indicates that CollaboVR is a useful multi-user tool in terms of communication.

3. Qualitative feedback which indicates that different combinations of two key variables (user arrangement and input orientation) function differently for different collaborative purposes.

2 Related Work

CollaboVR is a framework to assist communication. By definition, communication is the act of expressing and understanding among a group. Similarly, sensemaking is the understanding of the meaning of a communicative act [40]. Sensemaking is a widely researched concept in the area of information visualization. Dervin describes sensemaking as using ideas, emotions and memories to bridge a gap in understanding in a group [8]. Learning how collaborative sensemaking is supported through different design considerations is very useful for multi-user communication. In this section, we first introduce collaborative sensemaking approaches. Second, we summarize how workspace awareness benefits collaboration and how previous studies enhance workspace awareness. Last, we introduce immersive collaboration and communication and assess their advantages and limitations.

2.1 Collaborative Sensemaking

Many previous works have researched sensemaking in different domains of HCI and computer-supported collaborative work (CSCW) [1, 4, 27, 40]. Given that sensemaking involves data analysis [61], different designs of 2D displays and digital tabletops are frequently discussed. Prior work has shown two things. First, large and shared displays benefit sensemaking groups in several contexts. Sharodal designed CoSense [8] with a shared display, conducted an ethnographic study, and examined how collaborative sensemaking can be supported. Vogt et al. found that the large display facilitated the paired sensemaking process, allowing teams to spatially arrange information and conduct individual work as needed [57]. Moreover, multiple digital tabletops were used for sensemaking tasks [20, 34]. Second, personal displays may lead to decreased collaboration in co-located settings [7, 58]. When designing CollaboVR, we considered the idea of “multiple” displays, displays with “different” angles, as well as adding “personal” displays into the mix, which led to the design of different input orientations and the placement of visual aids.

2.2 Workspace Awareness

Workspace awareness is the collection of up-to-the-minute knowledge a participant has of other participants’ interaction with the workspace [16]. It includes the awareness of others’ locations, activities, and intentions relative to the task and to the space. Maintaining workspace awareness enables participants to work together more effectively [17, 18]. Workspace awareness plays an important role in simplifying communication, taking turns and action prediction  [18]. In brief, maintaining and enhancing workspace awareness is beneficial to collaboration.

One trend is the use of see-through displays for distributed collaboration. The idea started with Tang and Minneman, who designed VideoDraw [53] and VideoWhiteBoard [54]. Both approaches are two-user experiences. On each side, a video camera was placed to capture the local user and the drawing. A projector was attached to present the remote user and the drawing on top of the local display. ClearBoard [22] extended the idea and used digital media. Instead of using projectors, the media displayed the video feed of the remote user and drawing to maintain workspace awareness. Similarly, KinectArms [12] used a tangible tabletop as the medium, and rendered the arm of the remote user for mixed presence. Furthermore, Jiannan et al. [30] developed FacingBoard with two-sided transparent displays. Analogous to ClearBoard, the entire upper body is displayed to the other participants so that gaze awareness is supported. To maintain gaze interaction, FacingBoard reversed the graphics on the display. However, some column-sensitive content, such as text and maps, then became incorrect. To solve this problem, FacingBoard selectively flipped the column-sensitive content individually and adjusted the position of the content. However, when people pinpointed a specific sub-area within the content, the gaze and the place being pinpointed were inconsistent for both users. In our system, we propose different user arrangements to enhance workspace awareness. We also provide a similar face-to-face experience. In contrast to FacingBoard, we manipulate the users’ locations to maintain gaze awareness rather than flipping the content, and we support collaboration between more than two people. This is detailed later.

2.3 Immersive Collaboration and Communication

Much work has been done in collaborative applications in VR and mixed reality (MR). Some has focused on multi-user gaming experiences. SynchronizAR designed a registration algorithm for mobile AR so that participants could join the co-located experience without needing to take extra steps to ensure high-quality positional tracking [19]. Some has focused on developing telepresence experiences and bridging the gap between the physical and virtual worlds. InForce created a set of novel interaction techniques, including haptic feedback for distributed collaboration  [35]. MetaSpace performed full body tracking for distributed users to create a sense of presence [50]. Holoportation demonstrated 3D reconstructions of an entire space including reconstruction of people [38]. Immersive group-to-group telepresence allowed distributed groups of users to meet in a shared virtual 3D world through two coupled projection-based setups [2]. Depth and color cameras were used for reconstruction. Some designed a collaborative tool for productive work, such as editing and modeling. SpaceTime focused on improving the experience for two experts collaborating on design work together [60]. It designed “parallel objects” to resolve interaction conflicts and to support design workflows. Some put more effort in object manipulation and navigation. T(ether) is a spatially aware display system for co-located collaborative manipulation and animation of objects  [26]. Trackable markers on pads and digital gloves allow participants to manipulate objects in space through gestures. Andre et al. designed an application to support object manipulation tasks and scene navigation from different views  [24]. Some developed a distributed system for remote assistance. Virtual Replicas for Remote Assistance is a remote collaboration system allowing a remote expert to guide local users to assemble machine parts by using virtual replicas [37]. 
Others, such as Geollery [9, 10], focused on social experiences among users by creating and studying an interactive MR social media platform.

For collaborative purposes such as social networking and telepresence, engagement and a sense of being there are the most important qualities. In those scenarios, communication performance is not the focus. For collaborative purposes such as productive work, games, assistance or object manipulation, however, which require complicated and specific operations and information exchange during collaboration, communication performance becomes more important. Our goal is to build a reconfigurable framework to fit different purposes of collaboration.

Next, we examine trends in communication in immersive environments. Some have proposed asymmetrical communication. ShareVR enables communication between an HMD user and a non-HMD user [14]. By using floor projection and mobile displays to visualize the virtual world, the non-HMD user can interact with the HMD user and become part of the VR experience. Mutual human actuation [6] runs pairs of users at the same time and has them provide human actuation to each other. Communication between the pair is through shared interactive props.

Interacting with digital content in a shared space is also common. Three’s Company [52] explored the design of a system for three-way collaboration over a shared visual workspace. They illustrated the utility of multiple configurations of users around a distributed workspace. TwinSpace supports deep interconnectivity and flexible mappings between virtual and physical spaces [45]. Your Place and Mine explored three ways of mapping two differently sized physical spaces to shared virtual spaces to understand how social presence, togetherness, and movement are influenced [49]. Tan et al. built a face-to-face presentation system for remote audiences [51]. Tele-Board [15] designed a groupware system focused on creative working modes using a traditional whiteboard and sticky notes in digital form for distributed users. Hrvoje Benko et al. proposed a unique spatial AR system that enables two users to interact with a shared virtual scene and each other in a face-to-face arrangement [3]. We found that much work focused on two-user or pair communication. Our work proposes a solution to scale the number of participants. In addition, we have observed that many studies have experimented with a face-to-face setup with a shared display or a round table for communication. We vary two factors, user arrangement and input orientation, to investigate their respective advantages and limitations.

3 CollaboVR Overview

In this section, we present an overview of CollaboVR’s system architecture. CollaboVR consists, essentially, of a protocol that serializes input and display data from each user of an application, routes that data through a network, and then de-serializes and interprets the data to correctly render the results into graphics. Currently, we have tested CollaboVR with the application Chalktalk [41]. Chalktalk is an open-source digital presentation and communication language. It allows a presenter to create and interact with animated digital sketches on a blackboard-like interface. There are some other smart sketch-based online software programs, such as Autodraw [13], sketch2code [25] and Miro [48] that can assist drawing. We chose to use Chalktalk because it is an open-source platform, and so we can easily define the dataflow between the application and CollaboVR. If the input and output data structure of any given application is accessible, CollaboVR can be easily adapted to work with multiple applications.

CollaboVR consists of a star network. The network is written in Node.js and C# and it synchronizes data across devices and supports custom data formats. For CollaboVR, we have two kinds of information: rendering data and user data. For rendering data, we first pass the user input from each client to the server. Then, the server transmits the user input together with its user identifier to the application. Next, the server receives the serialized display data from the application. Finally, the server broadcasts the display data to each client for rendering. For user data, we directly broadcast the user avatar and audio data to each client after it has been received.
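The two data paths described above can be sketched as a small in-memory hub. This is only an illustrative sketch in Python (the actual system is written in Node.js and C#); the class name `StarServer`, the message kinds `"display"` and `"user"`, and the stand-in `fake_app` function are assumptions made for the example and do not come from the CollaboVR code base.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class StarServer:
    """Central hub of a star network: clients only ever talk to the server."""

    # Stand-in for the application (Chalktalk in the paper):
    # (user id, user input) -> serialized display data.
    application: Callable[[str, bytes], bytes]
    # Hypothetical per-client inbox: client id -> delivered (kind, payload) pairs.
    inboxes: Dict[str, List[Tuple[str, bytes]]] = field(default_factory=dict)

    def connect(self, client_id: str) -> None:
        self.inboxes[client_id] = []

    def broadcast(self, kind: str, payload: bytes) -> None:
        for inbox in self.inboxes.values():
            inbox.append((kind, payload))

    def on_user_input(self, client_id: str, user_input: bytes) -> None:
        # Rendering data: forward the input with its user identifier to the
        # application, then broadcast the display data that comes back.
        display_data = self.application(client_id, user_input)
        self.broadcast("display", display_data)

    def on_user_data(self, client_id: str, avatar_or_audio: bytes) -> None:
        # User data: broadcast directly, without involving the application.
        self.broadcast("user", client_id.encode() + b":" + avatar_or_audio)


def fake_app(user_id: str, user_input: bytes) -> bytes:
    """Hypothetical application that just tags the input it received."""
    return b"rendered[" + user_id.encode() + b"]" + user_input
```

With two connected clients, one call to `on_user_input` and one call to `on_user_data` leave the same two messages in every inbox, which is the broadcast behavior the paragraph describes.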

CollaboVR provides rich interactions for manipulating drawings. After the clients receive and render the display data from the application, the display data are treated as interactive objects in a 3D world. We provide manipulation of these objects so that users can easily express their ideas. This manipulation includes duplication, transformation (rotation, scaling and translation), deletion, and colorization.

Deploying CollaboVR requires only a VR device running Unity for each client, a server machine running Node.js and an optional router for ensuring low latency for data transmission. If a VR client needs to run on a Windows machine, the server code could run on one of the client machines. If a router is not available for setup, the server and all the clients can communicate via their external IP addresses.

Figure 1 presents the per-frame workflow of CollaboVR. First, figure 1(a) and figure 1(b) illustrate the third-person perspective view of the VR client before data synchronization. The user in figure 1(a) is drawing a dinosaur to present his or her favorite animal. The user in figure 1(b) is waiting. Then, the data, including both users’ identifiers, avatar information, audio and the drawing input information, are serialized and sent to the server. As figure 1(c) illustrates, different data are processed with different labels such as AVATAR_JOIN, STYLUS, SELECT_OBJ and MOVE_OBJ. The server behaves as a stateless machine. The server sends the data related to drawing input to the application API and broadcasts user data to all clients. The application then receives the drawing input data from the server as shown in figure 1(d). The application (Chalktalk) processes the input and turns the drawing into 3D objects, as shown in figure 1(e). We can see that the freehand drawn dinosaur becomes a 3D dinosaur capable of interactive animation. Our protocol serializes DisplayData from the application (Chalktalk) and sends that data to the server. The server broadcasts DisplayData to all the clients. Finally, all the VR clients can see the interactive 3D objects and each other, as shown in figure 1(f) and figure 1(g).
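To make the idea of labeled, serialized messages concrete, the following sketch packs a stroke into bytes and unpacks it again. The label names are taken from figure 1(c) as described above; everything else (the numeric codes, the header layout, the function names `pack_message` and `unpack_message`) is an assumption for illustration and is not the actual CollaboVR wire format.

```python
import struct
from typing import List, Tuple

# Label names come from the paper; the numeric codes are invented for this sketch.
LABELS = {"AVATAR_JOIN": 0, "STYLUS": 1, "SELECT_OBJ": 2, "MOVE_OBJ": 3}
NAMES = {code: name for name, code in LABELS.items()}

HEADER = "<BHH"  # assumed layout: label code, user id, point count (little-endian)
POINT = "<3f"    # one 3D point as three 32-bit floats

Point = Tuple[float, float, float]


def pack_message(label: str, user_id: int, points: List[Point]) -> bytes:
    """Serialize a labeled message carrying a list of 3D points."""
    header = struct.pack(HEADER, LABELS[label], user_id, len(points))
    body = b"".join(struct.pack(POINT, *p) for p in points)
    return header + body


def unpack_message(data: bytes) -> Tuple[str, int, List[Point]]:
    """Inverse of pack_message: recover label, user id and points."""
    code, user_id, count = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    step = struct.calcsize(POINT)
    points = [struct.unpack_from(POINT, data, offset + step * i) for i in range(count)]
    return NAMES[code], user_id, points
```

Because the label sits at a fixed position in the header, a stateless server can decide where to route a packet by reading one byte, without keeping per-client state.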

4 Design Space

We propose two variables in the design space of CollaboVR: user arrangement and input orientation. Building on previous work on workspace awareness, we focus on maintaining and enhancing workspace awareness, to enable participants to work together more effectively. We added a control variable that users can use to alter their views of other participants. In other words, they can manipulate the spatial arrangement by which they see other users.

Inspired by previous work on collaborative sensemaking, we notice that multiple and shared large displays are useful for collaborative work in terms of 2D information. CollaboVR is an immersive 3D graphics world. Instead of “displays”, we pre-placed multiple “interactive boards” in the virtual environment, meaning that if the z coordinate of the content is 0, the content will be placed on the interactive board. If the content has a positive or negative z coordinate, users will see the content float in front of or behind the interactive board. By default, the interactive board is placed vertically like a blackboard. The tilt of the interactive input board can range from 0° (vertical) to 90° (horizontal). To reduce the learning curve for users, we provide only two options: vertical (0°) and horizontal (90°).
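The relation between board-local coordinates and the tilt angle can be sketched as a single rotation. The axis convention below is an assumption made for the example (the paper does not specify one): local x runs right along the board, local y runs up along the board, and local z is the board normal pointing toward the user; in world space, y is up and z points toward the user. The function name `board_to_world` is hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def board_to_world(local: Vec3, origin: Vec3, tilt_deg: float) -> Vec3:
    """Map a board-local point to world space.

    tilt_deg = 0 gives a vertical board (like a blackboard);
    tilt_deg = 90 lays the board flat (like a table).
    Content with local z = 0 lies on the board; positive z floats in front.
    """
    x, y, z = local
    t = math.radians(tilt_deg)
    # Rotate about the board's x axis so its top edge tips away from the user.
    world_y = y * math.cos(t) + z * math.sin(t)
    world_z = -y * math.sin(t) + z * math.cos(t)
    return (origin[0] + x, origin[1] + world_y, origin[2] + world_z)
```

Under this convention, a point that floats 0.5 units in front of a vertical board ends up 0.5 units above the board once the board is horizontal.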

4.1 User Arrangement

We provide two user arrangements for CollaboVR: (1) default and (2) mirrored. In the default user arrangement, the virtual environments of all users are overlaid directly. In figure 2, the green rectangle roughly shows the available tracking area and the blue rectangles show the interactive boards. All the sub-figures are top-down views. Figure 2(a) and (b) illustrate the original positions of two clients on their own sides. Figure 2(c) shows the default user arrangement. The clients see the other participants in their original positions. In the mirrored user arrangement, all the other users’ locations are flipped to the other side of the interactive board. In figure 2(d), user 1 is in a mirrored user arrangement, so user 2 is flipped to the other side of the left interactive board because user 2 is looking at that board. Consider the gaze interaction. Spot A is the same content that both users are looking at. After the flipping operation, the gaze directions of user 1 and user 2 toward spot A are maintained. Unlike in FacingBoard [30], the content is not mirror-reversed, so the content is still correct for the viewer. In this user arrangement, the users can see each other for better workspace awareness. Figure 2(e) illustrates the scenario in which two users are looking at different interactive boards. User 2 doesn’t block the view of user 1, and they can barely see each other when their focused boards are different. Figure 2(f) illustrates how we process multiple users in this setup. Each user is flipped based on the direction in which they are looking.
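Geometrically, flipping a user to the other side of a board is a reflection across the board's plane. The sketch below shows that reflection for a position and for a gaze direction. How the system decides which board a user is looking at is not described in the paper, so `looked_at_board` uses a simple alignment heuristic that is purely an assumption; all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Board = Tuple[Vec3, Vec3]  # (center, unit normal)


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def reflect_point(p: Vec3, board: Board) -> Vec3:
    """Mirror a position across the board plane: p - 2((p - c) . n) n."""
    center, n = board
    d = dot(sub(p, center), n)
    return (p[0] - 2 * d * n[0], p[1] - 2 * d * n[1], p[2] - 2 * d * n[2])


def reflect_direction(v: Vec3, board: Board) -> Vec3:
    """Mirror a direction across the board plane: v - 2(v . n) n."""
    _, n = board
    d = dot(v, n)
    return (v[0] - 2 * d * n[0], v[1] - 2 * d * n[1], v[2] - 2 * d * n[2])


def looked_at_board(position: Vec3, gaze: Vec3, boards: Sequence[Board]) -> int:
    """Assumed heuristic: pick the board whose center best aligns with the gaze."""
    def alignment(i: int) -> float:
        to_board = sub(boards[i][0], position)
        length = math.sqrt(dot(to_board, to_board)) or 1.0
        return dot(gaze, to_board) / length
    return max(range(len(boards)), key=alignment)


def mirror_user(position: Vec3, gaze: Vec3, boards: Sequence[Board]) -> Tuple[Vec3, Vec3]:
    """Position and gaze of a remote user as shown in the mirrored arrangement."""
    board = boards[looked_at_board(position, gaze, boards)]
    return reflect_point(position, board), reflect_direction(gaze, board)
```

Because a point on the board plane is its own reflection, a user who was looking at spot A on the board still looks at spot A after the flip, while the content itself is never mirrored.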

4.2 Input Orientation

CollaboVR enables the user to adjust the interactive board for input to be either vertical or horizontal. By default, the interactive board is vertical, like a large display but in 3D. As teaser figure (a) demonstrates, the user’s experience is close to writing on a whiteboard. When the user turns on the horizontal interactive board, the experience is close to writing on a tablet or desk, as seen in figure 3. For other users who are not writing, the content is rendered in real time on the vertical board (see figure 3(a)). By doing this, we avoid the real-life situation in which content on a table is not readable for all the users around it. When the user is writing on the horizontal board, he or she is free to look at the personal horizontal board or the shared vertical board. (See the dash-dot lines representing potential gaze directions in figure 3(b).) Moreover, the content on the horizontal board differs from the content on the shared vertical board in two ways. One is the scale. Considering that the reach distance when writing on a horizontal plane is smaller than on a vertical plane, we adjust the scale of the horizontal board. The other is the dimension. We squeeze the content on the horizontal board. (See how the table looks in figure 3(b).) We implement squeezing because we want to simulate a tablet-style input, and keep the design space clean as well. To enhance the awareness of where the user is writing, we render the projection point of the user’s controller as a 3D/2D cursor. (See the torus in figure 3.)
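Two of the operations mentioned above have a simple geometric form: projecting the controller onto the board to place the cursor, and scaling plus squeezing content for the personal horizontal board. The sketch below illustrates both. The default factors `scale=0.5` and `squeeze=0.1` are invented placeholders, since the paper does not report the values used, and the function names are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project_onto_board(p: Vec3, center: Vec3, normal: Vec3) -> Vec3:
    """Closest point to p on the board plane (normal must be unit length).

    This is where a cursor for the controller would be drawn.
    """
    d = sum((pi - ci) * ni for pi, ci, ni in zip(p, center, normal))
    return (p[0] - d * normal[0], p[1] - d * normal[1], p[2] - d * normal[2])


def to_personal_board(point: Vec3, scale: float = 0.5, squeeze: float = 0.1) -> Vec3:
    """Map board-local content to the smaller, flattened horizontal board.

    scale shrinks the content to match the shorter reach on a horizontal
    surface; squeeze additionally flattens the depth (local z) so the
    content reads like tablet input. Both factors are assumed values.
    """
    x, y, z = point
    return (x * scale, y * scale, z * scale * squeeze)
```

Setting `squeeze` to 0 would flatten the content completely onto the board, while 1 would keep its full scaled depth.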

5 System Implementation

To build CollaboVR, we implement a network framework with a flexible protocol, a calibration approach for co-located users, a user interface for drawing manipulation, and other functions.

5.1 Network and Protocol

The network is a star network using UDP. We chose UDP because we value low latency. The user data and rendering data need to be transmitted every frame. The server-side code is written in Node.js and the client code is written in Node.js and Unity C#. We define a synchronizable object as an object that needs to be synchronized each frame for the client that registered it. Each synchronizable object has a label and a data stream. The label is a unique identifier for the client to register. The data stream includes the frequency of sending this object and the real-time data. We provide two types of frequency in the system: one-time and per frame. A one-time synchronizable object is actually a command. It does not occur every frame and does not need to be synchronized every frame. For the command object, we use two-way handshaking. The client sends the object to the server, the server returns an object including an acknowledgement to the client, and then the client deregisters the object with this local label. A per-frame synchronizable object could be the current avatar representation, the audio data or the display data from the application. We design a protocol to wrap all the display data. The data protocol includes information about all the rendered lines and meshes by encoding their attributes, such as the width of the lines and the shape of the meshes. Each client deserializes the data from the server and renders the points as lines and meshes.
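The life cycle of one-time and per-frame synchronizable objects can be sketched as follows. The sketch simulates the network in memory and makes one interpretive assumption: since a command stays registered until its acknowledgement arrives, it is sent again on later frames if a packet is lost, which seems a natural fit for UDP but is not stated explicitly in the paper. The labels `"avatar"` and `"delete-object"` and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Optional


class Frequency(Enum):
    ONE_TIME = "one-time"    # a command, acknowledged once
    PER_FRAME = "per-frame"  # e.g. avatar, audio, display data


@dataclass
class SyncObject:
    label: str
    frequency: Frequency
    data: bytes


class Client:
    def __init__(self) -> None:
        self.registered: Dict[str, SyncObject] = {}

    def register(self, obj: SyncObject) -> None:
        self.registered[obj.label] = obj

    def on_ack(self, label: str) -> None:
        # Second half of the handshake: deregister an acknowledged command.
        obj = self.registered.get(label)
        if obj is not None and obj.frequency is Frequency.ONE_TIME:
            del self.registered[label]


def server_receive(obj: SyncObject) -> Optional[str]:
    """Return the label to acknowledge for commands; per-frame data needs none."""
    return obj.label if obj.frequency is Frequency.ONE_TIME else None


def run_frame(client: Client, dropped: FrozenSet[str] = frozenset()) -> List[str]:
    """Send everything still registered; labels in `dropped` simulate packet loss."""
    sent = list(client.registered.values())
    for obj in sent:
        if obj.label in dropped:
            continue  # the packet never reaches the server
        ack = server_receive(obj)
        if ack is not None:
            client.on_ack(ack)
    return [obj.label for obj in sent]
```

In this sketch, a command whose packet is dropped on one frame is simply sent again on the next frame and disappears from the registry once acknowledged, while per-frame objects keep being sent indefinitely.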

5.2 Calibration for Co-located Scenarios

CollaboVR works for both co-located and distributed scenarios. For distributed users, we simply overlap their virtual environments because they do not have any spatial relationship in reality. For co-located users, we need to calibrate all the users so that they are all in the same coordinate system. The key idea for calibration is that different clients should have a shared proxy. For example, a shared image can be provided to all the co-located users in that environment. Vuforia [43] is widely used for image recognition and tracking. Similarly, a shared map is helpful too [33]. Here we use the HTC Vive Pro as the co-located device. Technically, the HTC Vive Pro is capable of both VR and MR. We only use VR mode in our system. Meanwhile, MR mode helps us explain the co-located user setup, as figure 4 illustrates. The shared proxy in the Vive system is the Vive base station. Each machine running Vive can retrieve the transformation information of the base station. Because all machines (assuming n machines) have their own coordinate systems, we have n pairs of position and rotation of the base station in different coordinate systems. We chose one base station as the proxy based on its unique serial number. Then, we used the first client that connected to the server as the reference node. All subsequent clients applied the inverse matrix between the base station of the reference node and their own base station to their VR environments. In figure 4, user 1 is drawing a physics model. Figure 4(a) presents the first-person perspective view from user 2, and figure 4(b) presents a front view of the scene.
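One way to read the calibration step is as a change of coordinates through the shared base station. If B_ref is the pose of the base station in the reference client's coordinates and B_i is the pose of the same station in client i's coordinates, then a point p_i in client i's space maps to B_ref · B_i⁻¹ · p_i in the reference space. The sketch below implements this reading with plain 4x4 matrices; it is an interpretation rather than the authors' code, the helper names are hypothetical, and poses are restricted to a yaw rotation plus translation to keep the example short.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vec3 = Tuple[float, float, float]


def pose(yaw_deg: float, tx: float, ty: float, tz: float) -> Matrix:
    """4x4 rigid transform: rotation about the up (y) axis, then translation."""
    c, s = math.cos(math.radians(yaw_deg)), math.sin(math.radians(yaw_deg))
    return [[c, 0.0, s, tx],
            [0.0, 1.0, 0.0, ty],
            [-s, 0.0, c, tz],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m: Matrix) -> Matrix:
    """Inverse of a rigid transform: transposed rotation, and -R^T t as translation."""
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(r_t[i][k] * m[k][3] for k in range(3)) for i in range(3)]
    return [r_t[0] + [t[0]], r_t[1] + [t[1]], r_t[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]


def calibration(station_in_ref: Matrix, station_in_client: Matrix) -> Matrix:
    """Matrix that maps a client's coordinates into the reference client's."""
    return matmul(station_in_ref, rigid_inverse(station_in_client))


def apply(m: Matrix, p: Vec3) -> Vec3:
    h = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[i][k] * h[k] for k in range(4)) for i in range(3))
```

A quick sanity check of this reading is that the base station itself must land on the same spot for everyone: the station's position in a client's coordinates should map to the station's position in the reference coordinates.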

5.3 Drawing Manipulation

CollaboVR includes a user interface for manipulating objects after they are drawn. We provide duplication, transformation, deletion and colorization. To achieve this, we designed a pie menu triggered by the controller. The workflow for a manipulation is as follows. First, the user places the controller so that it hovers over the drawing of interest. Second, the user presses the thumbstick of the dominant controller, and the pie menu appears as in figure 5(c). Third, the user moves the thumbstick to select a menu item (see figure 5(d)). Finally, the user applies the movement corresponding to the command and releases the thumbstick. If the selected item is deletion, the operation executes immediately after release. If the selected item is transformation, the user must move the controller one more time to complete the rotation, scaling or translation. If the selected item is copy, the duplicated object is placed at the position where the user releases the thumbstick (see figure 5(a)). The color palette is toggled by button one, illustrated in figure 5(a). The user can drag a color from the color palette onto any drawing, as in World Builder [46].
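Selecting a pie-menu item from a thumbstick deflection comes down to converting the stick position into an angle and finding the sector it falls in. The sketch below shows one common way to do this. The menu order, the dead-zone radius and the function name `pick_menu_item` are assumptions for illustration, since the paper does not specify the menu layout.

```python
import math
from typing import Optional, Sequence

# Hypothetical layout, listed clockwise starting from "up".
MENU = ("copy", "rotate", "scale", "translate", "delete")


def pick_menu_item(x: float, y: float,
                   items: Sequence[str] = MENU,
                   dead_zone: float = 0.3) -> Optional[str]:
    """Return the menu item under the thumbstick, or None near the center.

    x and y are thumbstick axes in [-1, 1], with y positive when pushed up.
    """
    if math.hypot(x, y) < dead_zone:
        return None
    # Angle measured clockwise from straight up, in [0, 2*pi).
    angle = math.atan2(x, y) % (2 * math.pi)
    sector = 2 * math.pi / len(items)
    # Shift by half a sector so the first item is centered on "up".
    index = int(((angle + sector / 2) % (2 * math.pi)) // sector)
    # Guard against floating-point rounding at the very last boundary.
    return items[min(index, len(items) - 1)]
```

The dead zone keeps small accidental stick movements from selecting anything, which matters here because releasing the thumbstick commits the command.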

5.4 Other

The controller’s trigger switches the commands of the two controllers (see figure 5). As the user’s view might be blocked by other users’ avatars, we implement a spectator mode. Users can see the view from different users in the lower right corner. To encourage all users to work on the task together, we implement a permission strategy. Only one user can draw at a time. Once the user who is drawing releases the permission, other users can grab permission to draw. Figure 5 shows the permission indicator.
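The permission strategy behaves like a single token that one user holds at a time. A minimal sketch, with a hypothetical class name and method names, could look like this:

```python
from typing import Optional


class DrawPermission:
    """Single drawing token: at most one user holds it at any time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def grab(self, user: str) -> bool:
        """Take the token if it is free; return True if `user` now holds it."""
        if self.holder is None:
            self.holder = user
        return self.holder == user

    def release(self, user: str) -> bool:
        """Only the current holder can release the token."""
        if self.holder != user:
            return False
        self.holder = None
        return True

    def can_draw(self, user: str) -> bool:
        return self.holder == user
```

In a networked setting the token would live on the server so that simultaneous grab requests are resolved in one place; that placement is an assumption here, not something the paper specifies.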

6 Use Cases

CollaboVR is designed as a reconfigurable framework for users to communicate in VR. Here, we envision potential use cases for CollaboVR.

Daily Life Schedule and Brainstorming

CollaboVR can be used for daily life scheduling as shown in teaser figure (a). With the default user arrangement and input orientation, the experience is close to whiteboard writing. CollaboVR enables freehand drawing, so users can write and draw their plans and coordinate with friends. Shapes and colors are useful while scheduling. When they have different ideas, users can easily duplicate the current drawing onto another interactive board to express new alternatives.

Presentation

CollaboVR can be used for presentation as shown in teaser figure (b). By enabling the mirrored user arrangement and the default input orientation, the presenter and the audience will be placed on different sides of the content. It is easy for the presenter to know where the audience’s focus is, and it is useful for the audience to follow the presenter’s gesture and content at the same time.

Spatial Arrangement

CollaboVR can help with spatial arrangements, especially for 3D scenarios. Imagine that the user has just moved into a new apartment and needs to discuss the placement of the furniture with a roommate. As teaser figure (c) demonstrates, the user can draw furniture with a combination of primitive shapes and place them directly at the preferred locations. Spatial arrangement is difficult to describe clearly through words and gestures, and it often requires multiple views when using drawings. CollaboVR can be helpful for such scenarios.

Item Design

CollaboVR can help with collaborative design. By enabling the mirrored user arrangement and horizontal input orientation, users can draw their items in detail with friends. The experience is close to drawing on a digital tablet with a pen. Other users can explain their opinions while pointing at the area of interest.

7 User Study: System Evaluation

In this study, we evaluated the interaction cycle, design variables and prototype implementation of CollaboVR. We were interested in how participants interact with the system. During the study, we collected primarily qualitative feedback to gain insight into the experience of using the system.

Introduction and Training

Participants were placed in groups of four. Each group was first introduced to the user study and gave consent for video recording. Then they were given a 10-minute lecture through a large monitor on the concept of Chalktalk. During the lecture we demonstrated how to do freehand drawing and how to create pre-defined 3D objects. In the next 10 minutes, participants were given a live demo on how to use CollaboVR. As part of the demo, an experimenter put on the headset and described how to use each button on the controller as well as the functionality of the system, including obtaining permission to draw, manipulating drawings and objects in 3D, and other functionalities. After that, each participant was moved to a different location. They learned how to interact with CollaboVR individually until all of them were able to draw and manipulate objects (this took around 10 minutes). Finally, they were brought to the shared virtual environment for design tasks.

Living room design

Next, every group completed three 10-minute sessions. Each session used a different combination of user arrangement and input orientation: condition 1 (C1) is default user arrangement and default input orientation; condition 2 (C2) is mirrored user arrangement and default input orientation; condition 3 (C3) is default user arrangement and horizontal input orientation. We counterbalanced the order of the conditions across groups using a Latin square. In each session, the participants were asked to design a living room containing only three items: a table, a chair and a couch. Each participant picked one item and wrote down a design for that item and a layout for the three items. They then used CollaboVR to express their original ideas and reach an agreement on the living room design. After each session, they wrote down their final decisions for the design of the living room. (The writing was collected for further analysis.) Because we provided only three items for a group of four people, participants needed to resolve conflicts and come to a consensus about the design through our system.
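The counterbalancing step above can be sketched as follows. This is a minimal illustration, assuming a cyclic Latin square and three groups (12 participants in groups of four); the specific square and the group-to-row assignment used in the study are not reported, so the function name `latin_square` and the orders printed here are hypothetical.

```python
def latin_square(conditions):
    """Build a cyclic Latin square: row r is the condition list rotated left by r.

    Every condition appears exactly once in each row (one group's session order)
    and exactly once in each column (one session position across groups).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


# Condition labels follow the paper; the assignment of rows to groups is assumed.
conditions = ["C1", "C2", "C3"]
orders = latin_square(conditions)

for group, order in enumerate(orders, start=1):
    print(f"Group {group}: {' -> '.join(order)}")
```

With three conditions and three groups, each condition is experienced first, second and third exactly once across the groups, which is the property counterbalancing relies on.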

Semi-structured interview

Upon completion of all the sessions, we presented the participants with a set of statements and asked them to rate how much they agreed with each of those statements on a 7-point Likert scale. We then administered a semi-structured interview asking about their experience, trying to gain insight into usability and use cases of the system.

7.2 Participants

We recruited a total of 12 participants (5 female, 1 left-handed; age range: 20 - 30, , ) via campus email lists and word-of-mouth. None of the participants had been involved with this project before. The participants had varying levels of VR experience (rating scale from 1 (less experience) to 7 (more experience), , ).

7.3 Apparatus

CollaboVR was implemented in Unity on Windows desktops with Nvidia GTX 1080 cards. We used Oculus Rift CV1 with two Oculus Touch controllers for the study. Before the experiments, we paired the controllers with the headset on four computers. We connected the four computers to the router through Ethernet cables.

7.4 Data Collection and Analysis

Through the survey and interview, the study focused mainly on system usability and qualitative feedback. In addition, we conducted one-way analysis of variance (ANOVA) tests to examine the effect of user arrangement and input orientation on task performance, defined as the amount of detail in each group's living room design for each session. We measured it by counting the details (such as color, shape, texture and layout) in what participants wrote before and after each session. Our null hypothesis is that task performance does not differ across user arrangements and input orientations. We predicted that (1) the mirrored user arrangement would foster higher performance and (2) the horizontal input orientation would lead to lower performance. Questionnaire data related to the usability of user arrangement and input orientation were analyzed using repeated measures analysis of variance (RM-ANOVA). The level of significance was set at .
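To make the one-way ANOVA on task performance concrete, the sketch below computes the F statistic from between-condition and within-condition sums of squares. The detail counts are hypothetical placeholders, not data from the study, and the function name `one_way_anova_f` is introduced here only for illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-condition variability: how far each condition mean is from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-condition variability: how far each observation is from its condition mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL detail counts (one value per group per condition), for illustration only.
detail_counts = {
    "C1": [12, 15, 14],
    "C2": [16, 18, 17],
    "C3": [9, 11, 10],
}

f_stat, df_b, df_w = one_way_anova_f(list(detail_counts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The null hypothesis of no difference between conditions is rejected when the p-value associated with this F statistic falls below the chosen significance level. A library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value if SciPy is available.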

7.5 Results and Discussion

Usability

CollaboVR’s pipeline was quickly understood by all users: they learned how to get permission, draw freehand, create 3D objects, manipulate drawings afterwards, and spectate other users. 83.33% of the participants agreed that the system is easy to use (, ). As one comment put it, “It is intuitive to do the drawing in 3D.” Participants found the permission strategy useful: “having to get permission to draw is really good at making the environment easier to work in.”(P7,F). 91.67% of the participants agreed that the system is helpful for showing and expressing ideas (, ).

There could be merit once you’re doing something more complex to ’hey look, this thing, let’s do x,y,z…’ and that’s a lot easier to use this system.(P11,M)

Furthermore, 91.67% of the participants agreed that it is easy to follow the content a partner is working on (, ), and 83.33% of the participants agreed that the system helps them keep track of a partner’s idea (, ). Partly because users can “move themselves easily”(P1,M), they can easily follow a partner’s drawing and thoughts. In fact, 75% of the participants agreed that they would use this tool to collaborate on personal projects (, ).

That’s what I’d use for my own work. It’s like a more –not quite as artsy – business tool. I would use it to get my ideas across to people.(P8,M)

Similarly, P11(M) thought, “it’s totally a great prototyping idea/prototyping system. Can’t say it’ll replace AutoCAD, but in a few years it will do that.”

In addition, 91.67% of the participants thought CollaboVR is helpful for communication and collaboration (, ), and 83.33% of the participants agreed that they could anticipate what a partner would do next (, ). P3(F) commented that, “when [another user] started to draw the legs for the table, I quickly get his idea about the design of the legs, so he doesn’t need to say what kind of legs he wants.” The participants found the experience engaging and said they would love to spend more time in it with friends. “It’s definitely a fun environment, entertaining.”(P7,F).
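Because the study had 12 participants, each percentage reported in this subsection corresponds to a whole number of people. The short sketch below makes that conversion explicit; the helper name `count_from_percent` is hypothetical, and the only inputs are the sample size and the percentages quoted above.

```python
N_PARTICIPANTS = 12  # sample size reported in the Participants subsection


def count_from_percent(percent, n=N_PARTICIPANTS):
    """Convert a reported percentage back to the nearest whole number of participants."""
    return round(percent / 100 * n)


# Percentages quoted in the usability results.
for percent in (83.33, 91.67, 75.0):
    count = count_from_percent(percent)
    print(f"{percent}% of {N_PARTICIPANTS} participants = {count} participants")
```

Reading the results as counts (for example, 10 or 11 of 12 participants agreeing) is a useful reminder of how much a single response moves each percentage at this sample size.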

User Arrangement

Based on Table 1, there are no significant differences between the default and mirrored user arrangements in task performance or ease of use. However, 58.33% of the participants rated the mirrored session as their favorite, probably because users have better workspace awareness in this session and can keep the content and the other users in view together. Two quotes express this: “[…] mirrored, it is easy and convenient to communicate with others.” from P3(F) and “people didn’t block my view, and I could see the content clearly.” from P5(M). 33% of the participants preferred the default setup most. P2(M) commented, “because it is comparable to reality.”, and P1(M) had a similar opinion. We see two possible reasons. First, some participants have less VR experience; the default session is the setup closest to real life, so they can get used to it quickly. Second, the mirrored user arrangement flips other users to the other side of the interactive board, which might lead to the feeling that “i felt really far away from everyone.”(P7,F).

Next, we collected feedback on the ideal scenarios for each user arrangement. The mirrored user arrangement could be used for delivering a presentation. P4(M) suggested this because “you can better control your drawing meanwhile keep an eye on people’s reaction through the mirror.” P5(M) considered it from a student’s view: “kind of felt like Khan Academy in 3D vision.” This is consistent with previous work focusing on face-to-face experience. P9(F) thought she could benefit from this setup when brainstorming because no one blocks the view: “you can see everybody but you have your own space.”

Input Orientation

From Table 1, we can see that significant differences were found between the input orientations. Horizontal input orientation negatively impacts performance and usability. P8(M) commented that “I think this setup could be really good.”, “like the tablet.”, and “the horizontal board should be slanted more.” P8(M) is a 3D modeler who frequently uses a tablet for detailed drawing, so he has more experience with tablet-style input. However, P12(F) felt “that type of idea works well with a physical tablet”. One reason why this setup has a negative impact is probably that most participants are not familiar with such an input. We rendered the same content on both the personal horizontal board and the shared vertical board, and some users found it difficult to map between the two views. On the other hand, users drawing horizontally can rest their elbows on the chair arm, so they were comfortable drawing for longer than when drawing in mid-air. Furthermore, the participants tended to draw richer details because they felt like they were holding a pen. 8.3% of the participants preferred this setup the most and emphasized that “I had more control over what I was drawing.” In summary, horizontal input orientation lets users draw or write as they would on a tablet, which encourages them to draw for a longer time and gives them more control. However, users may need more time to get used to it.

When discussing suitable scenarios for this setup, P8(M) thought a live demo or presentation could benefit from it, especially a time-consuming one. He described “himself giving a presentation to other people looking at the big monitor while an audience was looking at the large monitor-like board.” and “just want to focus on the board.” Meanwhile, P11(M) thought it would be helpful for collaborative design and suggested that we replace the controller with a pen. We found all of these thoughts inspiring.

8 Discussion, Limitations and Future Work

The results of our study suggest that CollaboVR can be used by non-experts to communicate with others in tasks such as brainstorming and presentation. The system is relatively easy to use and the experience is engaging and entertaining. Participants are willing to hang out with friends in the system. Of the two key variables we proposed, the mirrored user arrangement has a positive but not significant effect. Participants felt that others were synced up with them because of greater awareness. The horizontal input orientation shows potential for detailed and longer drawing sessions. However, its learning curve is steeper than that of the other setups for users who have less tablet experience. In addition, a physical tablet [47] or a pen [59] as input could be helpful for this setup.

One limitation of the user study is the lack of diversity among the participants. Most participants are from the department of computer science, and the age range is narrow. The results of our study may therefore not generalize easily to other users, such as artists, finance professionals and older adults. Another limitation of the system is that participants found it difficult to track who was speaking. Visual hints, such as attaching a speaker icon to the avatar, or mouth tracking would be useful.

In the future, we plan to conduct more user studies to explore how CollaboVR affects communication in different scenarios, such as presentation and brainstorming. Inspired by P7(F), supporting a search engine is an interesting direction. We also want to investigate other user arrangements and input orientations.

9 Conclusion

In this paper we presented CollaboVR, a reconfigurable framework for distributed and co-located multi-user communication in VR. We described the system architecture, the design of two key variables (user arrangement and input orientation), the rich user interface, and its implementation. The system was evaluated in a user study, which showed that participants can easily interact with the system and found it entertaining and useful for communication. Participants stated that they would consider CollaboVR as a daily-life tool and could envision its potential. We also evaluated the key variables in terms of both usability and performance. Users reported that different combinations of the two key variables suit different collaborative purposes, such as brainstorming, presentation, and design.

10 Acknowledgments

We thank all the volunteers for participating in our experiments and video shoots.

References

1. S. Albolino, R. Cook and M. O’Connor (2007) Sensemaking, safety, and cooperative work in the intensive care unit. Cognition, Technology & Work 9 (3), pp. 131–137. Cited by: §2.1.
2. S. Beck, A. Kunert, A. Kulik and B. Froehlich (2013) Immersive group-to-group telepresence. IEEE Transactions on Visualization and Computer Graphics 19 (4), pp. 616–625. Cited by: §2.3.
3. H. Benko, A. D. Wilson and F. Zannier (2014) Dyadic projected spatial augmented reality. In Proceedings of the 27th annual ACM symposium on User interface software and technology, pp. 645–655. Cited by: §2.3.
4. D. Billman and E. A. Bier (2007) Medical sensemaking with entity workspace. In Proceedings of the SIGCHI conference on human factors in computing systems, pp. 229–232. Cited by: §2.1.
5. S. Brave, H. Ishii and A. Dahley (1998) Tangible interfaces for remote collaboration and communication.. In CSCW, Vol. 98, pp. 169–178. Cited by: §1.
6. L. Cheng, S. Marwecki and P. Baudisch (2017) Mutual human actuation. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, pp. 797–805. Cited by: §2.3.
7. C. Chung, C. Lee and C. Liu (2013) Investigating face-to-face peer interaction patterns in a collaborative web discovery task: the benefits of a shared display. Journal of Computer Assisted Learning 29 (2), pp. 188–206. Cited by: §2.1.
8. B. Dervin (1992) From the mind’s eye of the user: the sense-making qualitative-quantitative methodology. Sense-making methodology reader. Cited by: §2.1, §2.
9. R. Du, D. Li and A. Varshney (2019) Geollery: a mixed reality social media platform. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 685. Cited by: §2.3.
10. R. Du and A. Varshney (2016) Social street view: blending immersive street views with geo-tagged social media.. In Web3D, pp. 77–85. Cited by: §2.3.
11. S. Follmer, D. Leithinger, A. Olwal, A. Hogge and H. Ishii (2013) InFORM: dynamic physical affordances and constraints through shape and object actuation.. In Uist, Vol. 13, pp. 417–426. Cited by: §1.
12. A. M. Genest, C. Gutwin, A. Tang, M. Kalyn and Z. Ivkovic (2013) KinectArms: a toolkit for capturing and displaying arm embodiments in distributed tabletop groupware. In Proceedings of the 2013 conference on Computer supported cooperative work, pp. 157–166. Cited by: §2.2.
13. Google (2019) AutoDraw is a new kind of drawing tool that pairs the magic of machine learning with drawings from talented artists to help everyone create anything visual, fast. Note: https://www.autodraw.com/ Cited by: §3.
14. J. Gugenheimer, E. Stemasov, J. Frommel and E. Rukzio (2017) Sharevr: enabling co-located experiences for virtual reality between hmd and non-hmd users. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 4021–4033. Cited by: §2.3.
15. R. Gumienny, L. Gericke, M. Quasthoff, C. Willems and C. Meinel (2011) Tele-board: enabling efficient collaboration in digital design spaces. In Proceedings of the 2011 15th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 47–54. Cited by: §2.3.
16. C. Gutwin and S. Greenberg (1996) Workspace awareness for groupware. In Conference Companion on Human Factors in Computing Systems, pp. 208–209. Cited by: §2.2.
17. C. Gutwin and S. Greenberg (1998) Design for individuals, design for groups: tradeoffs between power and workspace awareness. Cited by: §2.2.
18. C. Gutwin and S. Greenberg (2002) A descriptive framework of workspace awareness for real-time groupware. Computer Supported Cooperative Work (CSCW) 11 (3-4), pp. 411–446. Cited by: §2.2.
19. K. Huo, T. Wang, L. Paredes, A. M. Villanueva, Y. Cao and K. Ramani (2018) SynchronizAR: instant synchronization for spontaneous and spatial collaborations in augmented reality. In The 31st Annual ACM Symposium on User Interface Software and Technology, pp. 19–30. Cited by: §1, §2.3.
20. P. Isenberg, D. Fisher, M. R. Morris, K. Inkpen and M. Czerwinski (2010) An exploratory study of co-located collaborative visual analytics around a tabletop display. In 2010 IEEE Symposium on Visual Analytics Science and Technology, pp. 179–186. Cited by: §2.1.
21. H. Ishii, M. Kobayashi and J. Grudin (1993) Integration of interpersonal space and shared workspace: clearboard design and experiments. ACM Transactions on Information Systems (TOIS) 11 (4), pp. 349–375. Cited by: §1, §1.
22. H. Ishii and M. Kobayashi (1992) ClearBoard: a seamless medium for shared drawing and conversation with eye contact. In Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 525–532. Cited by: §2.2.
23. A. Kunert, A. Kulik, S. Beck and B. Froehlich (2014) Photoportals: shared references in space and time. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing, pp. 1388–1399. Cited by: §1.
24. A. Kunert, T. Weissker, B. Froehlich and A. Kulik (2019) Multi-window 3d interaction for collaborative virtual reality. IEEE transactions on visualization and computer graphics. Cited by: §1, §2.3.
25. M. A. Lab (2019) Sketch 2 code. transform any hands-drawn design into a html code with ai. Note: https://sketch2code.azurewebsites.net/ Cited by: §3.
26. D. Lakatos, M. Blackshaw, A. Olwal, Z. Barryte, K. Perlin and H. Ishii (2014) T (ether): spatially-aware handhelds, gestures and proprioception for multi-user 3d modeling and animation. In Proceedings of the 2nd ACM symposium on Spatial user interaction, pp. 90–93. Cited by: §2.3.
27. J. Landgren and U. Nulden (2007) A study of emergency response work: patterns of mobile phone interaction. In Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 1323–1332. Cited by: §2.1.
28. D. Leithinger, S. Follmer, A. Olwal and H. Ishii (2014) Physical telepresence: shape capture and display for embodied, computer-mediated remote collaboration. In Proceedings of the 27th annual ACM symposium on User interface software and technology, pp. 461–470. Cited by: §1.
29. D. Leithinger, S. Follmer, A. Olwal and H. Ishii (2015) Shape displays: spatial interaction with dynamic physical form. IEEE computer graphics and applications 35 (5), pp. 5–11. Cited by: §1.
30. J. Li, S. Greenberg, E. Sharlin and J. Jorge (2014) Interactive two-sided transparent displays: designing for collaboration. In Proceedings of the 2014 conference on Designing interactive systems, pp. 395–404. Cited by: §1, §2.2, §4.1.
31. A. Maimone, X. Yang, N. Dierk, A. State, M. Dou and H. Fuchs (2013) General-purpose telepresence with head-worn optical see-through displays and projector-based lighting. In 2013 IEEE Virtual Reality (VR), pp. 23–26. Cited by: §1.
32. N. Marquardt and S. Greenberg (2015) Proxemic interactions: from theory to practice. Synthesis Lectures on Human-Centered Informatics 8 (1), pp. 1–199. Cited by: §1.
33. Microsoft (2019) HoloLens 2. mixed reality is ready for business. Note: https://www.microsoft.com/en-us/hololens Cited by: §5.2.
34. M. R. Morris, J. Lombardo and D. Wigdor (2010) WeSearch: supporting collaborative search and sensemaking on a tabletop display. In Proceedings of the 2010 ACM conference on Computer supported cooperative work, pp. 401–410. Cited by: §2.1.
35. K. Nakagaki, D. Fitzgerald, Z. J. Ma, L. Vink, D. Levine and H. Ishii (2019) InFORCE: bi-directional ‘force’ shape display for haptic interaction. In Proceedings of the Thirteenth International Conference on Tangible, Embedded, and Embodied Interaction, pp. 615–623. Cited by: §1, §2.3.
36. K. O’hara, J. Kjeldskov and J. Paay (2011) Blended interaction spaces for distributed team collaboration. ACM Transactions on Computer-Human Interaction (TOCHI) 18 (1), pp. 3. Cited by: §1.
37. O. Oda, C. Elvezio, M. Sukan, S. Feiner and B. Tversky (2015) Virtual replicas for remote assistance in virtual and augmented reality. In Proceedings of the 28th Annual ACM Symposium on UIST, pp. 405–415. Cited by: §2.3.
38. S. Orts-Escolano, C. Rhemann, S. Fanello, W. Chang, A. Kowdle, Y. Degtyarev, D. Kim, P. L. Davidson, S. Khamis and M. Dou (2016) Holoportation: virtual 3d teleportation in real-time. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, pp. 741–754. Cited by: §2.3.
39. K. Otsuka (2016) MMSpace: kinetically-augmented telepresence for small group-to-group conversations. In Virtual Reality (VR), 2016 IEEE, pp. 19–28. Cited by: §1, §1.
40. S. A. Paul (2009) Understanding together: sensemaking in collaborative information seeking. Cited by: §2.1, §2.
41. K. Perlin, Z. He and K. Rosenberg (2018) Chalktalk: a visualization and communication language–as a tool in the domain of computer science education. arXiv preprint arXiv:1809.07166. Cited by: 1st item, §3.
42. C. Plaue and J. Stasko (2009) Presence & placement: exploring the benefits of multiple shared displays on an intellective sensemaking task. In Proceedings of the ACM 2009 international conference on Supporting group work, pp. 179–188. Cited by: §1.
43. PTC (2019) Innovate with industrial augmented reality. Note: https://www.ptc.com/en/products/augmented-reality Cited by: §5.2.
44. R. Raskar, G. Welch, M. Cutts, A. Lake, L. Stesin and H. Fuchs (1998) The office of the future: a unified approach to image-based modeling and spatially immersive displays. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pp. 179–188. Cited by: §1.
45. D. F. Reilly, H. Rouzati, A. Wu, J. Y. Hwang, J. Brudvik and W. K. Edwards (2010) TwinSpace: an infrastructure for cross-reality team spaces. In Proceedings of the 23nd annual ACM symposium on User interface software and technology, pp. 119–128. Cited by: §2.3.
46. reklamistcomua (2019) World builder (virtual reality). Note: https://www.youtube.com/watch?v=FheQe8rflWQ Cited by: §5.3.
47. I. Rosenberg and A. Zarraga (2019) Sensel’s mission is to build the next wave of touch technologies that will empower users, disrupt industries, and revolutionize interaction with the digital world. Note: sensel.com Cited by: §8.
48. T. C. Software (2019) Miro. collaboration without constraints. Note: https://Miro.com/ Cited by: §3.
49. M. Sra, A. Mottelson and P. Maes (2018) Your place and mine: designing a shared vr experience for remotely located users. In Proceedings of the 2018 Designing Interactive Systems Conference, DIS ’18, New York, NY, USA, pp. 85–97. External Links: ISBN 978-1-4503-5198-0, Link, Document Cited by: §2.3.
50. M. Sra and C. Schmandt (2015) Metaspace: full-body tracking for immersive multiperson virtual reality. In Adjunct Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology, pp. 47–48. Cited by: §2.3.
51. K. Tan, D. Gelb, R. Samadani, I. Robinson, B. Culbertson and J. Apostolopoulos (2010) Gaze awareness and interaction support in presentations. In Proceedings of the 18th ACM International Conference on Multimedia, MM ’10, New York, NY, USA, pp. 643–646. External Links: ISBN 978-1-60558-933-6, Link, Document Cited by: §2.3.
52. A. Tang, M. Pahud, K. Inkpen, H. Benko, J. C. Tang and B. Buxton (2010) Three’s company: understanding communication channels in three-way distributed collaboration. In Proceedings of the 2010 ACM conference on Computer supported cooperative work, pp. 271–280. Cited by: §1, §2.3.
53. J. C. Tang and S. L. Minneman (1990) VideoDraw: a video interface for collaborative drawing. In Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 313–320. Cited by: §1, §2.2.
54. J. C. Tang and S. Minneman (1991) VideoWhiteboard: video shadows to support remote collaboration. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 315–322. Cited by: §2.2.
55. J. Thomas, R. Bashyal, S. Goldstein and E. Suma (2014) MuVR: a multi-user virtual reality platform. In 2014 IEEE Virtual Reality (VR), pp. 115–116. Cited by: §1.
56. B. Tversky, M. Suwa, M. Agrawala, P. Hanrahan, D. Phan, J. Klingner, M. Daniel, P. Lee and J. Haymaker (2003) Human behavior in design: individuals, teams, tools. Sketches for Design and Design of Sketches, pp. 79–86. Cited by: §1.
57. K. Vogt, L. Bradel, C. Andrews, C. North, A. Endert and D. Hutchings (2011) Co-located collaborative sensemaking on a large high-resolution display with multiple input devices. In IFIP Conference on Human-Computer Interaction, pp. 589–604. Cited by: §2.1.
58. J. R. Wallace, S. D. Scott, T. Stutz, T. Enns and K. Inkpen (2009) Investigating teamwork and taskwork in single-and multi-display groupware systems. Personal and Ubiquitous Computing 13 (8), pp. 569–581. Cited by: §2.1.
59. P. Wu, R. Wang, K. Kin, C. Twigg, S. Han, M. Yang and S. Chien (2017) DodecaPen: accurate 6dof tracking of a passive stylus. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, pp. 365–374. Cited by: §8.
60. H. Xia, S. Herscher, K. Perlin and D. Wigdor (2018) Spacetime: enabling fluid individual and collaborative editing in virtual reality. In The 31st Annual ACM Symposium on UIST, pp. 853–866. Cited by: §1, §2.3.
61. J. S. Yi, Y. Kang, J. T. Stasko and J. A. Jacko (2008) Understanding and characterizing insights: how do people gain insights using information visualization?. In Proceedings of the 2008 Workshop on BEyond time and errors: novel evaLuation methods for Information Visualization, pp. 4. Cited by: §2.1.