Autonomous UAV Navigation with Domain Adaptation

Jaeyoon Yoo    Yongjun Hong    Sungroh Yoon
Abstract

Unmanned Aerial Vehicle (UAV) autonomous navigation has attracted wide attention in the machine learning field. Autonomous navigation in outdoor environments is particularly difficult because acquiring a massive dataset covering various environments is expensive and the environment changes dynamically. In this paper, we apply domain adaptation with an adversarial learning framework to UAV autonomous navigation. We succeed in navigating a UAV along various courses without assigning any label information to real outdoor images. We also provide empirical and theoretical results that explain why our approach is feasible.


1 Introduction

Unmanned Aerial Vehicle (UAV) navigation without human control has been a challenging task in the machine learning field. Most approaches to this task depend on many sensors such as GPS, depth sensors, and others. However, navigating with those sensors is problematic because most of them have their own limitations. For example, a GPS sensor often does not operate well when the UAV is indoors or in a forest. A depth sensor increases the weight of the body, which aggravates the motors' burden. The high price of such sensors is another obstacle.

Therefore, UAV flight with raw 2D images has inevitably attracted attention. Unlike other sensors, a monocular camera is light and cheap, and acquiring images in real time works in any environment. In addition, the recent success of deep learning in the image domain [13] gives us a great opportunity to use monocular camera images for UAV navigation. In this paper, we exploit the deep learning framework to develop a UAV autonomous navigation system.

Recently, many machine learning approaches to UAV flight have been proposed [24, 7, 20]. However, UAV flight in a real outdoor environment remains difficult for several reasons. First, collecting real images with exact labels is expensive, and assigning proper labels to a massive set of existing images is laborious. In addition to the labeling problem, some methods require full state information such as the distance from obstacles or the altitude, which can only be acquired in a laboratory environment.

These troubles are not limited to UAV flight but arise widely across machine learning. To address them, many works have adopted domain adaptation [19]. Domain adaptation means adapting a source domain to a target domain: one uses source domain data to train a model for some task so that it also works well in the target domain. In robotics, including UAV navigation, one can use a simulator as the source domain and exploit simulator data to train a model that works in the real environment, the target domain, because the simulator does not have the limitations mentioned above. In addition, the quality of artificial graphics environments has improved considerably, as seen in virtual reality and computer games, which makes simulators even more attractive.

In this paper, we develop a UAV autonomous navigation system by exploiting domain adaptation, in particular the recently introduced domain adaptation with adversarial learning. By training our deep learning model with many labeled simulation-rendered images and a few unlabeled real images, we succeed in navigating a UAV automatically in a real outdoor environment.

The rest of the paper is organized as follows. Sections 2 and 3 give a brief introduction to Generative Adversarial Networks (GAN) and domain adaptation and review related work. In Section 4, we describe our approach to UAV navigation, and the experimental results are provided in Section 5. Section 6 is devoted to a mathematical guarantee of the performance of our approach. Section 7 discusses the ideas presented, and Section 8 concludes.

We summarize our contributions as follows.

  • We succeed in UAV navigation without any labels for real surrounding images. To the best of our knowledge, we are the first to apply domain adaptation with adversarial learning networks to UAV flight.

  • We build a simulation environment to be used as the source domain in the domain adaptation framework. We also improve on the existing supervised learning version of UAV navigation with a domain adaptation framework, which is highly attractive in that it does not need target domain labels.

  • We provide a mathematical guarantee that shows why classification of real surrounding images works despite the absence of labels. We show that target domain classification performance is guaranteed by a well-behaved classifier on the transferred domain.

2 Background

2.1 Generative Adversarial Networks

Goodfellow et al. [8] propose Generative Adversarial Networks (GAN), a popular method for synthesizing images that adopts a generator $G$ and a discriminator $D$. $G$ produces synthetic images so as to fool $D$, and $D$ takes the counterpart role of $G$ in that it aims to distinguish real images from fake images generated by $G$. This adversarial behavior leads to the following minimax objective, where $p_{data}$ refers to the real data probability distribution, $p_z$ refers to the prior on the latent noise fed to $G$ (whose outputs follow the generated data distribution $p_g$), and $\mathcal{Z}$ and $\mathcal{X}$ stand for the latent space and the real data space respectively.

$\min_{G}\max_{D}\; \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$  (1)
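For readers who prefer code, the following is a minimal NumPy sketch of the losses implied by Eq. (1); the function name and the epsilon smoothing are ours, not part of the paper.

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Losses implied by the minimax objective in Eq. (1).

    d_real: discriminator outputs D(x) in (0, 1) for x ~ p_data
    d_fake: discriminator outputs D(G(z)) in (0, 1) for z ~ p_z
    Minimizing d_loss w.r.t. D and g_loss w.r.t. G realizes the game.
    """
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = np.mean(np.log(1.0 - d_fake + eps))  # G minimizes log(1 - D(G(z)))
    return d_loss, g_loss
```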

2.2 Domain Adaptation

Domain adaptation is a kind of transfer learning that tries to adapt a source domain to another unseen domain, the target domain, while task performance is maintained in the target domain. It can be thought of as transferring prior knowledge from the source domain $\mathcal{D}_S$ to the target domain $\mathcal{D}_T$. More formally, let $\mathcal{X}$ and $\mathcal{Y}$ be the data space and label space respectively. The source domain has a distribution $p_S$ on $\mathcal{X} \times \mathcal{Y}$ and the target domain has a different distribution $p_T$ on the same space. Domain adaptation then solves the following problem: given labeled source domain data drawn from $p_S$ and unlabeled target domain data drawn from the marginal distribution $p_T(x)$, learn a function $h: \mathcal{X} \rightarrow \mathcal{Y}$ that works on the target domain distribution $p_T$.

The simplest way to conduct domain adaptation is simply to ignore the difference between the two domains, known as domain discrepancy: train the model on the source domain and apply it to the target domain directly. In many cases, however, this does not work well because of the domain discrepancy [14]. The most popular way to overcome domain discrepancy is to construct a common representation space for the two domains [2, 14, 22]. This approach attempts to obtain a domain-invariant representation space where task performance does not depend on the domain. The underlying intuition is that as the distributions of the two domains on the representation space become closer, the domain discrepancy gets smaller and the classification task on the target domain works better. To find such a representation space, one could use hand-crafted features, but they do not generalize and require domain knowledge. For these reasons, there have been attempts to apply the GAN framework to learn such a common representation automatically [2, 22]. Adversarial learning with a discriminator leads the feature representations to be indistinguishable between the two domains while remaining useful for the task classifier, which achieves the goal of domain adaptation.

3 Related Work

To conduct UAV navigation with real images, several works have proposed supervised learning methods in which images and corresponding labels are collected and a controller is trained with the paired data. Ross et al. [20] adopted linear regression using real 2D images and corresponding command labels. They proposed an algorithm called DAGGER to tackle the mismatch between the desirable path distribution in the training phase and the executed path distribution in the test phase. It trains the linear controller with typical supervised learning, executes the partially trained controller, gathers the new images encountered during execution, assigns them appropriate labels, and trains the linear controller again with the newly collected paired data. By iterating this process, it aims to improve the supervised controller's performance. Meanwhile, Giusti et al. [7] suggested viewing UAV navigation as a classification problem instead of a regression problem. By classifying the orientation in which the UAV is pointing, it gives the proper steering command so that the UAV follows the trail. Smolyanskiy et al. [24] improved this classification formulation by adding a negative entropy regularization term to the objective function and a lateral classifier that predicts where the UAV is located with respect to the road. On the other hand, Kahn et al. [11] trained UAV navigation in a simulator by exploiting guided policy search, which can be thought of as a kind of supervised learning; it differs from typical supervised learning in that labels are obtained automatically using trajectory optimization. In this paper, we follow the classification framework of Giusti et al. [7] and Smolyanskiy et al. [24] for our UAV navigation because it simplifies the problem and avoids the sparse steering commands of the regression setting.

The common property of the above works is that they adopt supervised learning and therefore need laborious labeling. These approaches have a fundamental limitation in that labeling images cannot be done automatically, and executing a partially trained model in the real environment may seriously damage the UAV. In addition, Kahn et al. [11] needs full information beyond the raw image, such as a depth image and numerical measurements, which is difficult to obtain outdoors. In a simulation environment there is no such problem: one can measure any physical property such as location and velocity and does not need to worry about the UAV's safety. Mirowski et al. [16] and Sadeghi & Levine [21] interpreted UAV navigation as reinforcement learning (RL), aiming to train the agent from trial and error in a simulation environment. Since RL needs exploration, which may cause many collisions early in training, applying RL in the real environment directly is highly improper. However, because of the discrepancy between simulator images and real images, such as texture and color distribution, a model trained in the simulation environment is not guaranteed to work well in the real environment. Mirowski et al. [16] tackled this discrepancy by exploiting depth images, which are more invariant between the two environments; they used a depth sensor or reconstructed depth from images. However, a depth sensor increases the cost and weight of the UAV, and reconstructing depth images is inaccurate, introduces error, and has a high computational cost. Meanwhile, Sadeghi & Levine [21] applied Q-learning to UAV navigation and showed that the model also worked well in a real indoor environment even though no real images were offered during training. However, indoor environments are relatively simple and plain compared with outdoor environments, so it remains doubtful whether the approach also works well in real outdoor environments.

Some works apply domain adaptation with adversarial learning to their own tasks. Bousmalis et al. [5] adopted this framework for the grasping task. Their approach looks similar to ours, but we utilize domain adaptation for the UAV navigation task and additionally provide a mathematical guarantee. Shrivastava et al. [23] refined rendered eye images to be realistic using a GAN; using the refined images, they improved the performance of an eye classifier. Unlike us, they adopted two independent training steps, the first an adversarial learning phase and the second a classification task.

4 Method

4.1 Building simulator environment data

We build our simulation environment using Gazebo [12], a widely known robot simulator. As our navigation system targets the Dulle-gil trail, an asphalt road surrounded by bushes and tall trees as seen in the top row of Figure 1, we build an environment similar to the targeted real environment, as seen in the middle row of Figure 1.

Figure 1: (Top) Real target domain image examples-Gwanak Mountain Dulle-gil trail. (Middle) Simulator source domain image examples. (Bottom) Masked images for road in the simulator images.

Figure 2: The randomly generated road coordinates to build the simulator environment.

Figure 3: The lateral/head classes for a road. Although the center class slightly covers the region across the yellow centerline, most center-class images are located on the right side of the centerline.

Figure 4: Overall flow

We build our simulation environment in two phases. First, we obtain diverse road data by randomly picking course coordinates for generalization. By determining the road's curvature and length according to a heuristic rule, we obtain natural road coordinates; Figure 2 shows examples of roads generated by the simulator, and a sketch of this sampling procedure is given below. After that, we create objects such as maple trees, ginkgo trees, bushes, and fences using the open source CAD tool Blender and arrange them alongside the road randomly.
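The road sampling procedure is described only at a high level in the paper, so the following is a hypothetical sketch of such a heuristic; the segment-length range and maximum turn angle are illustrative parameters, not the values we actually used.

```python
import numpy as np

def generate_road_waypoints(n_segments=20, seg_len_range=(5.0, 15.0),
                            max_turn_deg=25.0, seed=None):
    """Sample a smooth sequence of 2D road waypoints.

    Hypothetical heuristic: each segment has a random length and a bounded
    random change of heading, which yields natural-looking curves.
    """
    rng = np.random.default_rng(seed)
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for _ in range(n_segments):
        length = rng.uniform(*seg_len_range)
        heading += np.deg2rad(rng.uniform(-max_turn_deg, max_turn_deg))
        x += length * np.cos(heading)
        y += length * np.sin(heading)
        points.append((x, y))
    return np.array(points)
```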

4.2 UAV navigation with domain adaptation

As mentioned earlier, we view UAV navigation as a multi-class classification problem. As in Smolyanskiy et al. [24], we train a deep learning model that classifies, from the camera image input, where the UAV is located with respect to the road (left/center/right lateral class) and where the UAV is heading along the road (left/straight/right head class). Figure 3 shows how we divide the lateral and head classes. To train our model, we collect simulation-rendered images with corresponding lateral and head labels and real images without any labels. To address the gap between simulation-rendered images and target domain images and to deal with the absence of labels for real environment images, we adopt domain adaptation with an adversarial learning framework.

Among several methods of domain adaptation with adversarial learning, Bousmalis et al. [4] is popular in that it focuses on transferring source domain images so that they look like target domain images, while other approaches concentrate only on extracting domain-invariant non-image features [2, 22]. We transfer a source domain image, a simulation-rendered image, into a target domain image, a real surrounding image, as shown in Figure 5, so that classification in the target domain can be achieved without any target domain labels.

The overall flow of our domain adaptation model for UAV navigation is shown in Figure 4, and a sketch of one training step's data flow is given below. The model takes target images, source images, and a noise vector drawn from a uniform distribution. In addition, the lateral and head class labels for the source images are given to the model. In the generator, a source image and a random noise vector are converted into a transferred image that should look like a target image. The discriminator takes transferred images and target images and distinguishes whether its input comes from the generator or the target domain. The classifier takes the transferred image and the ground-truth labels of the source image it comes from and learns to classify its lateral/head class.
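As an illustration of this data flow (not code from the paper), here is a minimal sketch with hypothetical callables standing in for the three networks.

```python
# A minimal sketch of one training step's data flow (hypothetical function
# names; the actual network architectures are not specified here).
def forward_flow(generator, critic, classifier, x_source, y_lat, y_head,
                 x_target, noise):
    # Generator maps a simulator image plus a noise vector to a transferred
    # image that should resemble the target (real) domain.
    x_transferred = generator(x_source, noise)

    # Critic scores transferred vs. real target images (adversarial signal).
    critic_fake = critic(x_transferred)
    critic_real = critic(x_target)

    # Classifier predicts lateral/head classes on the transferred image and
    # is supervised by the source labels the image inherited.
    lat_logits, head_logits = classifier(x_transferred)
    return critic_real, critic_fake, (lat_logits, y_lat), (head_logits, y_head)
```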

4.3 Objective functions

Domain adaptation with adversarial learning is basically composed of a transferred-image generator $G$, a discriminator $D$, and a task classifier $C$, as shown in Figure 4. Therefore, the objective function for our UAV navigation system needs an adversarial objective term and a classification objective term. The following paragraphs detail each objective term from the viewpoint of domain adaptation.

4.3.1 Adversarial learning

Among the many variants of GAN, we adopt the Fisher GAN of Mroueh & Sercu [17]. The reason is that Fisher GAN belongs to the Integral Probability Metric (IPM) family, which is strongly and consistently convergent and more robust to disjoint supports of probability distributions [10]. Fisher GAN is also known to have computational benefits over other IPM-based methods such as Arjovsky et al. [3] and Gulrajani et al. [9]. It should be noted that since we adopt an IPM rather than the standard GAN objective, we rename the discriminator the critic $f$. The objective term for adversarial learning is as follows, where $\lambda$ is the Lagrange multiplier and $\rho$ is the quadratic penalty weight [17].

$\mathcal{L}_{adv} = \hat{\mathcal{E}}(f, G) + \lambda\big(1 - \hat{\Omega}(f, G)\big) - \frac{\rho}{2}\big(1 - \hat{\Omega}(f, G)\big)^2$  (2)

where $\hat{\mathcal{E}}(f, G) = \mathbb{E}_{x \sim \mathbb{P}_T}[f(x)] - \mathbb{E}_{x_s \sim \mathbb{P}_S,\, z}[f(G(x_s, z))]$ and $\hat{\Omega}(f, G) = \frac{1}{2}\mathbb{E}_{x \sim \mathbb{P}_T}[f(x)^2] + \frac{1}{2}\mathbb{E}_{x_s \sim \mathbb{P}_S,\, z}[f(G(x_s, z))^2]$; the critic $f$ maximizes $\mathcal{L}_{adv}$ while the generator $G$ minimizes it.
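A minimal NumPy sketch of the critic-side computation of Eq. (2), following the augmented-Lagrangian form of Fisher GAN; variable names and the default penalty weight are illustrative.

```python
import numpy as np

def fisher_adversarial_objective(f_real, f_fake, lam, rho=1e-6):
    """Augmented-Lagrangian form of the Fisher GAN objective, Eq. (2).

    f_real: critic outputs f(x) on real target-domain images
    f_fake: critic outputs f(G(x_s, z)) on transferred images
    lam   : current value of the Lagrange multiplier
    rho   : quadratic penalty weight (illustrative default)
    The critic maximizes the returned objective; the generator minimizes it.
    """
    mean_diff = np.mean(f_real) - np.mean(f_fake)                         # E_hat(f, G)
    second_moment = 0.5 * (np.mean(f_real ** 2) + np.mean(f_fake ** 2))   # Omega_hat(f, G)
    constraint = 1.0 - second_moment
    objective = mean_diff + lam * constraint - 0.5 * rho * constraint ** 2
    return objective, constraint  # constraint is also used to update lam
```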

4.3.2 Classifier

We adopt the typical softmax cross entropy objective for the classification task. In addition, motivated by Smolyanskiy et al. [24], we add a negative entropy regularization term and a penalty loss term. The negative entropy regularization prevents the classifier from producing extremely sharp outputs, and the penalty loss additionally punishes the classifier for swapping the left and right labels. The resulting classification loss follows, where $p^h_i$ and $p^l_j$ are the classifier's predictions for head position $i \in$ {left, straight, right} and lateral position $j \in$ {left, center, right}, and $y^h_i$, $y^l_j$ are the ground-truth labels of head and lateral position respectively. It should be noted that $p_{left|right}$ and $p_{right|left}$ in the last term refer to the predicted probability of the left label when the ground truth is right and of the right label when the ground truth is left, respectively. $\lambda_{ent}$ and $\lambda_{swap}$ refer to the weights of each loss term.

$\mathcal{L}_{cls} = -\sum_i y^h_i \log p^h_i - \sum_j y^l_j \log p^l_j + \lambda_{ent}\Big(\sum_i p^h_i \log p^h_i + \sum_j p^l_j \log p^l_j\Big) + \lambda_{swap}\Big(p^h_{left|right} + p^h_{right|left} + p^l_{left|right} + p^l_{right|left}\Big)$  (3)
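A NumPy sketch of the classification loss in Eq. (3); the class-index convention and the weight values are assumptions for illustration, not the settings used in the paper.

```python
import numpy as np

# Index convention assumed for illustration: head classes [left, straight,
# right] and lateral classes [left, center, right].
def classifier_loss(p_head, y_head, p_lat, y_lat,
                    w_entropy=0.1, w_swap=1.0, eps=1e-8):
    """Cross entropy + negative-entropy regularizer + left/right swap penalty.

    p_head, p_lat : softmax outputs, shape (batch, 3)
    y_head, y_lat : one-hot ground-truth labels, shape (batch, 3)
    """
    ce = -np.mean(np.sum(y_head * np.log(p_head + eps), axis=1)) \
         - np.mean(np.sum(y_lat * np.log(p_lat + eps), axis=1))

    # Negative entropy: penalizing it discourages overly sharp predictions.
    neg_entropy = np.mean(np.sum(p_head * np.log(p_head + eps), axis=1)) \
                  + np.mean(np.sum(p_lat * np.log(p_lat + eps), axis=1))

    # Extra punishment when left and right are swapped with each other.
    swap = np.mean(y_head[:, 0] * p_head[:, 2] + y_head[:, 2] * p_head[:, 0]) \
           + np.mean(y_lat[:, 0] * p_lat[:, 2] + y_lat[:, 2] * p_lat[:, 0])

    return ce + w_entropy * neg_entropy + w_swap * swap
```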

4.3.3 Overall objective

Summing the above objective terms, we use the following overall objective function to optimize our model, where $\alpha$ and $\beta$ are the weights of $\mathcal{L}_{adv}$ and $\mathcal{L}_{cls}$ respectively.

$\min_{G,\,C}\max_{f}\; \alpha\,\mathcal{L}_{adv} + \beta\,\mathcal{L}_{cls}$  (4)

4.4 Steering controller

As our model outputs only the class probabilities and not the command to steer the UAV, we need an additional steering controller module. We modify the steering controller of Smolyanskiy et al. [24] as follows.

When our model gives the softmax output $(p_{left}, p_{center}, p_{right})$ for the lateral class and $(q_{left}, q_{straight}, q_{right})$ for the head class, our steering controller sends the following y-axis linear velocity $v_y$ and z-axis angular velocity $\omega_z$ to the UAV.

$v_y = \beta_{lat}\,(p_{right} - p_{left}), \qquad \omega_z = \beta_{head}\,(q_{right} - q_{left})$  (5)

where $\beta_{lat}, \beta_{head} > 0$ are scaling constants.

A positive y-axis linear velocity makes the UAV move left, and a positive z-axis angular velocity makes it rotate counter-clockwise. Aside from these commands, the UAV flies forward at a constant velocity by default.
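A sketch of a controller consistent with these sign conventions, together with the median smoothing mentioned in Section 5.1; the gains are illustrative, not the paper's values.

```python
import numpy as np

def steering_command(p_lat, p_head, k_lat=0.5, k_head=0.5):
    """Map the two softmax outputs to velocity commands (a sketch consistent
    with the sign conventions above; k_lat and k_head are illustrative gains).

    p_lat  : (p_left, p_center, p_right) lateral-class probabilities
    p_head : (p_left, p_straight, p_right) head-class probabilities
    """
    # Positive v_y moves the UAV left, so being on the right side of the road
    # (high p_lat[2]) pushes it back toward the center.
    v_y = k_lat * (p_lat[2] - p_lat[0])
    # Positive omega_z yaws counter-clockwise, so heading right (high
    # p_head[2]) makes the UAV rotate back toward the trail direction.
    omega_z = k_head * (p_head[2] - p_head[0])
    return v_y, omega_z

# Commands computed every 50 ms can be aggregated with a median over the
# one-second commanding interval, as described in Section 5.1.
def smooth(commands):
    return np.median(np.asarray(commands), axis=0)
```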

5 Experiment

5.1 Experiment Setup

We gathered around 8,000 images for each lateral and head class by randomizing the position of the UAV in the simulator and fixing the tilt angle to -45, 0, and 45 degrees with respect to the road. Beyond the simulation-rendered images for the source domain, we collected around 30,000 images in total for the target domain by recording at the Gwanak Mountain trail in Seoul. In both cases, we fixed the altitude of the UAV at around 1.5 m. We used the Parrot Bebop 2 drone [18] for our experiments. To send commands from the trained model, we utilized the parrot bebop ROS package with a joystick connection; we sent commands to the drone via the joystick and activated joystick steering only when the drone was stuck in a dangerous position. We implemented our UAV navigation system with TensorFlow [1]. During testing, we computed our model's output every 50 ms on a GTX 1060; however, we set the commanding interval to one second to protect pedestrians, and we sent the median of the gathered commands to the UAV to alleviate noise.

5.2 The quality of transferred image and off-line test

It is natural to expect that the transferred images should look like target domain images for navigation to succeed. Figure 5 shows the source images and the corresponding transferred images; the images in the bottom row show the most similar images in the target domain.

Figure 5: Each row shows source domain images, transferred images, and target domain images respectively. Each column represents the lateral left/head straight, lateral center/head left, lateral center/head straight, and lateral center/head right cases respectively. It should be noted that the target domain images are chosen as the most similar ones to the transferred image in each case.
Model             Lateral offset label             Head orientation label
                  left      center    right        left      straight   right
Source image SL   0.0%      65.5%     55.2%        76.8%     64.1%      45.0%
Source mask SL    31.4%     0.0%      30.3%        1.7%      45.2%      46.4%
Target image SL   90.1%     99.8%     55.4%        91.7%     100.0%     96.9%
Our model         92.0%     52.8%     46.5%        70.6%     55.8%      57.7%
(a) Classification accuracy

Model             Lateral offset label                          Head orientation label
                  left         center       right              left          straight      right
Source image SL   0.07±0.19    0.35±0.40    0.55±0.44          -0.70±0.52    0.09±0.55     0.04±0.89
Source mask SL    0.35±0.83    -0.28±0.79   -0.32±0.75         0.29±0.34     0.52±0.29     0.57±0.28
Target image SL   -0.83±0.22   0.002±0.04   0.52±0.37          -0.89±0.23    -0.004±0.01   0.94±0.16
Our model         -0.87±0.37   -0.14±0.59   0.31±0.66          -0.68±0.46    0.03±0.59     0.40±0.70
(b) Average steering command (mean ± standard deviation)
Table 3: Off-line test results. Source image SL (supervised learning) is a supervised model trained with simulation-rendered images, source mask SL is a supervised model trained with road-masked simulation-rendered images, and target image SL is a supervised model trained with outdoor images. It should be noted that the accuracy of the target image SL model on the right lateral class in Table 3(a) is quite low compared to the other classes. Although we have another target image SL model that reaches 99% accuracy on the right lateral class, that model performs much worse on outdoor navigation than the target SL model in Table 3(a); we therefore list the better-behaved target image SL model in this table. Despite the low accuracy, appropriate commands can still lead to good navigation.

As shown in Figure 1, we initially used two kinds of source images. One contains not only the road but also the background such as trees and sky; the other contains only the road, with all other pixel values set to zero, as seen in the bottom row of Figure 1. Even though both versions succeeded in transferring source domain images so that they look like target domain images, the model trained with unmasked source images classified worse and was less robust than the model trained with road-masked source images during UAV navigation in the real outdoor environment. Therefore, we used the road-masked images afterwards.

We ran off-line tests for our trained models. We observed not only the classification accuracy but also the steering commands on target test data that was not used in the training phase. We checked both because UAV navigation depends on the steering command, not on the classification accuracy. We compare our model with the supervised learning (SL) models in Table 3. The steering command for the left/center/right lateral classes and the left/straight/right head classes should be negative, zero, and positive respectively; the larger the absolute value, the sharper the move. Our model succeeded in classifying and giving correct commands comparable to the target supervised model, while both source supervised models failed. Furthermore, even though the classification accuracy for some classes is quite low in our model, the UAV can still move forward properly because the steering commands guide it correctly.

Figure 6: Test courses. Pink arrows indicate curved corners. [Left] Course 1: two slight corners. [Center] Course 2: two sharp curves, a street unseen during training. [Right] Course 3: a different environment, a sidewalk road rather than asphalt, with two corners.

5.3 UAV navigation results

We verify the performance of our approach based on two criteria: first, how many human interventions are required to finish navigating a given trail course, and second, how often the UAV recovers from intentional disturbance. We note that our approach focuses on domain adaptation, not on the model architecture, so we compare our model with an SL model trained on source domain images and one trained on target domain images. Since the source mask SL model showed the lowest performance in the off-line test, we omit it from the navigation results.

5.3.1 Autonomous navigation

We test our model on three courses. The lengths of the courses are about 190 m, 60 m, and 110 m respectively. These courses are sufficient to test our model because they contain roads curving to the left and right, as seen in Figure 6. Course 1 in Figure 6 has relatively slight curves, whereas courses 2 and 3 have sharp curves. Table 4 shows the average number of human interventions until the UAV finished each course. Considering the performance of the model trained with source domain images, we conclude that domain adaptation is necessary. Our model also shows successful performance compared to the supervised model trained with target domain images. In addition, the UAV once navigated 270 m without human intervention, and it could have gone further if we had not terminated the run. Our UAV navigation video is available at https://youtu.be/Pxv6kJpC8tY.

Moreover, our model also operated well on course 3, which is somewhat different from the environment where we collected the target domain images. There is still a road in the center with bushes alongside it, but the road is covered by sidewalk blocks whose texture and color differ from those of the asphalt lane. Most importantly, our model succeeded in navigating course 3 even though images of that environment were never collected or used during the training phase.

5.3.2 Recovery from disturbance

We also tested our model with a focus on how often the UAV recovers from disturbance. Wind blows outdoors and strong air currents are generated by the UAV itself; the UAV also has to avoid objects, including humans. It is therefore quite common for the UAV to deviate from the desirable location or path, so recovering from such hindrances is very important. We intentionally placed the UAV in undesirable positions and measured the ratio at which it returned to a desirable position. Table 5 shows the recovery ratio for the extreme cases (lateral left/head left and lateral right/head right). Our model recovered as often as the target SL model.

                  course 1     course 2     course 3
Source image SL   > 10         -            -
Target image SL   0.2 (1/5)    1.2 (6/5)    0.2 (1/5)
Our model         0.8 (4/5)    1.4 (7/5)    0.0 (0/5)
Table 4: Average number of human interventions to finish each course. We count the total number of interventions over 5 navigation trials on each course. Our model shows better performance compared to the target image SL model. On the other hand, the source image SL model failed to finish course 1. Course 1 can be regarded as easier than the other courses in that it was seen during training whereas the others were not; therefore, we omit the performance of the source image SL model on courses 2 and 3.
                  left/left       right/right
Target image SL   95% (19/20)     60% (12/20)
Our model         95% (19/20)     60% (12/20)
Table 5: Success rate of recovery. We count the number of successful recoveries out of 20 intentional disturbances in the extreme cases. Our model shows similar performance to the target SL model. left/left and right/right stand for lateral left/head left and lateral right/head right respectively.

6 Mathematical discussion

Our paper addresses the lack of target domain labels by making source images look like target images and assigning them their source domain labels. Behind our solution lies the intuition that if a classifier works well on the transferred images and the transferred images are similar to the target domain images, then the classifier should also fit the target domain images. In this section, we show that this intuition is theoretically correct and guarantee the performance of the classifier.

According to Theorem 1.1 in Mroueh & Sercu [17], the Fisher IPM is equal to the Chi-squared distance $\chi_2(\mathbb{P}_T, \mathbb{P}_G) = \Big(\int \frac{(\mathbb{P}_T(x) - \mathbb{P}_G(x))^2}{\frac{\mathbb{P}_T(x) + \mathbb{P}_G(x)}{2}}\,dx\Big)^{1/2}$, where $\mathbb{P}_T$ and $\mathbb{P}_G$ refer to the target domain image probability distribution and the transferred image probability distribution respectively. Therefore, if adversarial learning with the Fisher GAN framework is successful, we may assume that $\chi_2(\mathbb{P}_T, \mathbb{P}_G) \le \epsilon_1$ holds for an arbitrarily small positive number $\epsilon_1$. Using the relationship with the Pearson divergence $\chi^2_P(\mu\|\nu) = \int \frac{(\mu(x)-\nu(x))^2}{\nu(x)}\,dx$, we can get the following equation.

$\chi^2_P\Big(\mathbb{P}_T \,\Big\|\, \frac{\mathbb{P}_T + \mathbb{P}_G}{2}\Big) = \frac{1}{4}\,\chi_2(\mathbb{P}_T, \mathbb{P}_G)^2 \le \frac{\epsilon_1^2}{4}$  (6)

Gibbs & Su [6] show that the total variation (TV) distance and $\chi^2_P$ satisfy the following inequality, which holds if $\mu$ is dominated by $\nu$, where $\mu$ and $\nu$ are probability distributions.

$d_{TV}(\mu, \nu) \le \sqrt{\chi^2_P(\mu \,\|\, \nu)}$  (7)

Since the support of $\frac{\mathbb{P}_T + \mathbb{P}_G}{2}$ contains the support of $\mathbb{P}_T$, we can derive equation (8) by using inequality (7) with $\mathbb{P}_T$ substituted for $\mu$ and $\frac{\mathbb{P}_T + \mathbb{P}_G}{2}$ for $\nu$.

$d_{TV}\Big(\mathbb{P}_T, \frac{\mathbb{P}_T + \mathbb{P}_G}{2}\Big) \le \sqrt{\chi^2_P\Big(\mathbb{P}_T \,\Big\|\, \frac{\mathbb{P}_T + \mathbb{P}_G}{2}\Big)} \le \frac{\epsilon_1}{2}$  (8)

With the definition $d_{TV}(\mu, \nu) = \sup_{A}|\mu(A) - \nu(A)|$, where $A$ ranges over measurable sets of the probability space, we can derive the following equations (9) and (10).

$d_{TV}\Big(\mathbb{P}_T, \frac{\mathbb{P}_T + \mathbb{P}_G}{2}\Big) = \sup_A \Big|\mathbb{P}_T(A) - \frac{\mathbb{P}_T(A) + \mathbb{P}_G(A)}{2}\Big| = \frac{1}{2}\, d_{TV}(\mathbb{P}_T, \mathbb{P}_G)$  (9)
$d_{TV}(\mathbb{P}_T, \mathbb{P}_G) \le \epsilon_1$  (10)

Now, we assume there exists a classifier $C$ that fits well on the transferred data (i.e., its average error under $\mathbb{P}_G$ is small). We now show that this classifier should also work well in the target domain using the above equations. The mathematical flow is quite similar to that of Ross et al. [20], but it can be thought of as a special case with a single time step.

We define $e_C(x)$ to be the error rate (i.e., inaccuracy, taking values in the interval $[0,1]$) of the classifier for a fixed data point $x$. Then the error on the transferred data domain can be written as equation (11). It should be noted that we may assume $\epsilon_G \le \epsilon_2$ holds for an arbitrarily small positive number $\epsilon_2$ for a well-behaved classifier, where $\epsilon_G$ stands for the average error on domain $\mathbb{P}_G$.

$\epsilon_G = \mathbb{E}_{x \sim \mathbb{P}_G}\big[\, e_C(x) \,\big] \le \epsilon_2$  (11)

Then, we show that the error rate of the classifier on the target domain is also small by the following relation, where the second inequality holds because $0 \le e_C(x) \le 1$.

$\mathbb{E}_{x \sim \mathbb{P}_T}\big[\, e_C(x) \,\big] = \mathbb{E}_{x \sim \mathbb{P}_G}\big[\, e_C(x) \,\big] + \int e_C(x)\,\big(d\mathbb{P}_T(x) - d\mathbb{P}_G(x)\big) \le \epsilon_2 + d_{TV}(\mathbb{P}_T, \mathbb{P}_G) \le \epsilon_2 + \epsilon_1$  (12)

As $\epsilon_1$ and $\epsilon_2$ are sufficiently small, the error on the target domain is also small. That is, the classifier's performance on the target domain is guaranteed to be close to its performance on the transferred data domain, which explains why our empirical results in Section 5 are good.
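The inequalities used above can be sanity-checked numerically on toy discrete distributions; this snippet only illustrates the bound and does not involve the actual image distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two discrete distributions standing in for the target distribution P_T and
# the transferred-image distribution P_G (toy stand-ins, not real image data).
p_t = rng.dirichlet(np.ones(10))
p_g = rng.dirichlet(np.ones(10))
mix = 0.5 * (p_t + p_g)

tv = 0.5 * np.sum(np.abs(p_t - p_g))          # d_TV(P_T, P_G)
pearson = np.sum((p_t - mix) ** 2 / mix)      # chi^2_P(P_T || (P_T + P_G)/2)

# Inequality chain of Section 6: d_TV(P_T, mix) <= sqrt(chi^2_P) and
# d_TV(P_T, mix) = d_TV(P_T, P_G) / 2.
assert 0.5 * tv <= np.sqrt(pearson) + 1e-12

# Error transfer: for any per-sample error e(x) in [0, 1],
# E_{P_T}[e] <= E_{P_G}[e] + d_TV(P_T, P_G).
e = rng.uniform(0.0, 1.0, size=10)
assert np.sum(e * p_t) <= np.sum(e * p_g) + tv + 1e-12
```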

7 Discussion

We succeeded in navigating a UAV autonomously without any labels for target domain images. However, there were some failure cases. First of all, our model is less robust to lighting conditions such as strong sunlight and darkness. We collected real environment images between 3 and 5 pm in the fall and winter, so almost all images were exposed to only a small amount of sunlight; the transferred images were therefore biased toward a similar state. We tested at the same time of day and our model showed good performance then, but it failed at midday. When the sun is strong, the road in the camera image changes its color from gray to yellow, which makes navigation fail because there were no such images among the target domain images. Therefore, to be robust to strong sunlight, many images under various lighting conditions may be necessary. UAV navigation also failed on a snowy road; as with sunlight, this is because a snowy road never appeared during training. Secondly, not only our model but also the target image SL model failed when branches from trees invaded the airspace over the road. Although the UAV is over the road, protruding branches catch on its wings, which also leads to failure; since the road is still visible through the pine leaves, it is difficult for the UAV to avoid these obstacles. These problems remain as future work.

There are several kinds of models in the domain adaptation with adversarial learning framework. In this paper, we adopted the framework of Bousmalis et al. [4]. We also tried the model of Shen et al. [22] but found it hard to judge the training progress with non-image features; by observing the transferred images, it was much easier to tune proper hyper-parameters. However, we take a different approach from Bousmalis et al. [4] in that we do not adopt a content similarity objective term in the training phase. Since the location of the road should be invariant between the source image and the transferred image for successful domain adaptation, we originally added a content similarity loss of the form $\mathcal{L}_{content} = \mathbb{E}_{x_s, z}\big[\,\| (x_s - G(x_s, z)) \odot m \|_2^2\,\big]$, where $m$ stands for the road mask and $\odot$ refers to element-wise multiplication.

We tested models with the content similarity term, but in those cases the transferred images did not move far enough away from the source images and the models performed poorly in the off-line test. Therefore, we masked the simulation-rendered images with the road region instead of using the content similarity term. We confirmed that our model, which uses masked source images without the content similarity term, still keeps the location of the road, as shown in Figure 5, and performs much better than the models with the content similarity term.

We also tried models with unmasked source images, as shown in the middle row of Figure 1. Although the transferred images of these models look quite similar to the target domain images, they performed badly in the off-line test. Analyzing the reason for this also remains as future work.

In Section 6, we analyzed the performance of our model. However, the analysis assumes successful adversarial learning, i.e., that the transferred image distribution and the real image distribution are close to each other. Beyond the visual evidence, we checked this closeness by embedding the images into a low-dimensional space using t-SNE [15]. As Figure 7 shows, the transferred data is much closer to the real data than the source data is, which supports our assumption.
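A sketch of such a qualitative check with scikit-learn's t-SNE; the feature arrays are placeholders for the embeddings of the three image sets, not code from the paper.

```python
import numpy as np
from sklearn.manifold import TSNE

# Embed source, transferred and target feature vectors together and inspect
# how the clusters overlap. The three inputs are placeholder arrays of shape
# (n_samples, n_features).
def embed_domains(source_feats, transferred_feats, target_feats, seed=0):
    feats = np.concatenate([source_feats, transferred_feats, target_feats])
    labels = np.array([0] * len(source_feats)
                      + [1] * len(transferred_feats)
                      + [2] * len(target_feats))
    emb = TSNE(n_components=2, random_state=seed).fit_transform(feats)
    return emb, labels
```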

Figure 7: T-SNE result.

8 Conclusion

In this paper, we apply domain adaptation with adversarial learning to train a deep neural network for UAV navigation. It is notable that our model shows quite good performance even though we did not use any labels for real outdoor images. We also show how to use abundant and accessible simulator data for UAV navigation. In addition to the empirical results, we provide a mathematical guarantee for our approach.

References

  • Abadi et al. [2016] Abadi, Martín, Agarwal, Ashish, Barham, Paul, Brevdo, Eugene, Chen, Zhifeng, Citro, Craig, Corrado, Greg S, Davis, Andy, Dean, Jeffrey, Devin, Matthieu, et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
  • Ajakan et al. [2014] Ajakan, Hana, Germain, Pascal, Larochelle, Hugo, Laviolette, François, and Marchand, Mario. Domain-adversarial neural networks. arXiv preprint arXiv:1412.4446, 2014.
  • Arjovsky et al. [2017] Arjovsky, Martin, Chintala, Soumith, and Bottou, Léon. Wasserstein gan. arXiv preprint arXiv:1701.07875, 2017.
  • Bousmalis et al. [2016] Bousmalis, Konstantinos, Silberman, Nathan, Dohan, David, Erhan, Dumitru, and Krishnan, Dilip. Unsupervised pixel-level domain adaptation with generative adversarial networks. arXiv preprint arXiv:1612.05424, 2016.
  • Bousmalis et al. [2017] Bousmalis, Konstantinos, Irpan, Alex, Wohlhart, Paul, Bai, Yunfei, Kelcey, Matthew, Kalakrishnan, Mrinal, Downs, Laura, Ibarz, Julian, Pastor, Peter, Konolige, Kurt, et al. Using simulation and domain adaptation to improve efficiency of deep robotic grasping. arXiv preprint arXiv:1709.07857, 2017.
  • Gibbs & Su [2002] Gibbs, Alison L and Su, Francis Edward. On choosing and bounding probability metrics. International statistical review, 70(3):419–435, 2002.
  • Giusti et al. [2016] Giusti, Alessandro, Guzzi, Jérôme, Cireşan, Dan C, He, Fang-Lin, Rodríguez, Juan P, Fontana, Flavio, Faessler, Matthias, Forster, Christian, Schmidhuber, Jürgen, Di Caro, Gianni, et al. A machine learning approach to visual perception of forest trails for mobile robots. IEEE Robotics and Automation Letters, 1(2):661–667, 2016.
  • Goodfellow et al. [2014] Goodfellow, Ian, Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron, and Bengio, Yoshua. Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
  • Gulrajani et al. [2017] Gulrajani, Ishaan, Ahmed, Faruk, Arjovsky, Martin, Dumoulin, Vincent, and Courville, Aaron. Improved training of wasserstein gans. arXiv preprint arXiv:1704.00028, 2017.
  • Hong et al. [2017] Hong, Yongjun, Hwang, Uiwon, Yoo, Jaeyoon, and Yoon, Sungroh. How generative adversarial nets and its variants work: An overview of gan. arXiv preprint arXiv:1711.05914, 2017.
  • Kahn et al. [2016] Kahn, Gregory, Zhang, Tianhao, Levine, Sergey, and Abbeel, Pieter. Plato: Policy learning using adaptive trajectory optimization. arXiv preprint arXiv:1603.00622, 2016.
  • Koenig & Howard [2004] Koenig, Nathan and Howard, Andrew. Design and use paradigms for gazebo, an open-source multi-robot simulator. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2149–2154, Sendai, Japan, Sep 2004.
  • Krizhevsky et al. [2012] Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105, 2012.
  • Long et al. [2015] Long, Mingsheng, Cao, Yue, Wang, Jianmin, and Jordan, Michael. Learning transferable features with deep adaptation networks. In International Conference on Machine Learning, pp. 97–105, 2015.
  • Maaten & Hinton [2008] Maaten, Laurens van der and Hinton, Geoffrey. Visualizing data using t-sne. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
  • Mirowski et al. [2016] Mirowski, Piotr, Pascanu, Razvan, Viola, Fabio, Soyer, Hubert, Ballard, Andy, Banino, Andrea, Denil, Misha, Goroshin, Ross, Sifre, Laurent, Kavukcuoglu, Koray, et al. Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673, 2016.
  • Mroueh & Sercu [2017] Mroueh, Youssef and Sercu, Tom. Fisher gan. arXiv preprint arXiv:1705.09675, 2017.
  • Parrot [2016] Parrot, SA. Parrot Bebop 2. Retrieved from Parrot.com: http://www.parrot.com/products/bebop2, 2016.
  • Patel et al. [2015] Patel, Vishal M, Gopalan, Raghuraman, Li, Ruonan, and Chellappa, Rama. Visual domain adaptation: A survey of recent advances. IEEE signal processing magazine, 32(3):53–69, 2015.
  • Ross et al. [2013] Ross, Stéphane, Melik-Barkhudarov, Narek, Shankar, Kumar Shaurya, Wendel, Andreas, Dey, Debadeepta, Bagnell, J Andrew, and Hebert, Martial. Learning monocular reactive uav control in cluttered natural environments. In Robotics and Automation (ICRA), 2013 IEEE International Conference on, pp. 1765–1772. IEEE, 2013.
  • Sadeghi & Levine [2016] Sadeghi, Fereshteh and Levine, Sergey. Cad2rl: Real single-image flight without a single real image. arXiv preprint arXiv:1611.04201, 2016.
  • Shen et al. [2017] Shen, Jian, Qu, Yanru, Zhang, Weinan, and Yu, Yong. Adversarial representation learning for domain adaptation. arXiv preprint arXiv:1707.01217, 2017.
  • Shrivastava et al. [2016] Shrivastava, Ashish, Pfister, Tomas, Tuzel, Oncel, Susskind, Josh, Wang, Wenda, and Webb, Russ. Learning from simulated and unsupervised images through adversarial training. arXiv preprint arXiv:1612.07828, 2016.
  • Smolyanskiy et al. [2017] Smolyanskiy, Nikolai, Kamenev, Alexey, Smith, Jeffrey, and Birchfield, Stan. Toward low-flying autonomous mav trail navigation using deep neural networks for environmental awareness. arXiv preprint arXiv:1705.02550, 2017.