Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

Jing Zhang, Wanqing Li, and Philip Ogunbona (University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia; jz960@uowmail.edu.au, wanqing@uow.edu.au, philipo@uow.edu.au) and Dong Xu (University of Sydney, Sydney, Australia; dong.xu@sydney.edu.au)
Abstract.

This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. Such a problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. This problem-oriented review of the advances in transfer learning has not only revealed the challenges in transfer learning for visual recognition, but also identified the problems (eight of the seventeen) that have scarcely been studied. This survey not only presents an up-to-date technical review for researchers, but also offers a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.

1. Introduction

Humans have an exceptional ability to transfer what is learned in one context to another (Woodworth and Thorndike, 1901; Perkins and Salomon, 1992). Machine learning algorithms, though often inspired by the human brain, usually require a huge number of training examples to learn a new model from scratch and often fail to apply a learned model to test data acquired under conditions different from those of the training data, mainly due to domain divergence and task divergence (Pan and Yang, 2010). This is particularly true in visual recognition (Torralba and Efros, 2011), where external factors such as environment, lighting, background, sensor type, view angle and post-processing can cause a distribution shift, or even a feature space divergence, between two datasets for the same task; the tasks, i.e. the categories of the objects, may even be different.

To use previously available data effectively for current tasks with scarce data, models or knowledge learned from one domain have to be transferred to a new domain for the current task. Transfer learning has been actively researched in the past decade, and one of its topics, domain adaptation, where the previous and current tasks are the same, has been especially extensively studied. This extensive study has led to about a dozen tutorial and survey papers published since 2009, from the analysis of the nature of dataset shift (Quionero-Candela et al., 2009) to the formal definition and task-oriented categorization of transfer learning (Pan and Yang, 2010), and to the recent tutorials and surveys on deep learning based domain adaptation (Venkateswara et al., 2017a; Csurka, 2017). Most of these survey papers (Margolis, 2011; Moreno-Torres et al., 2012; Beijbom, 2012; Cook et al., 2013; Sun et al., 2015; Shao et al., 2015; Lu et al., 2015; Patel et al., 2015; Weiss et al., 2016; Venkateswara et al., 2017a) are method-driven and review the evolution of the technologies up to their time of publication. Many of them are on particular topics, for instance, domain adaptation (Margolis, 2011; Beijbom, 2012; Sun et al., 2015; Patel et al., 2015; Venkateswara et al., 2017a; Csurka, 2017), dataset shift (Moreno-Torres et al., 2012), activity recognition (Cook et al., 2013), and speech and language processing (Wang et al., 2015). While these review papers have provided researchers in the field with valuable references and contributed significantly to the advances of the technologies, they have not examined the full landscape of transfer learning and the maturity of the technologies so as to serve as a reference for machine learning practitioners. Unlike these existing survey papers, this paper takes a new problem-oriented perspective and presents a comprehensive review of transfer learning methods for cross-dataset visual recognition. Specifically,

  • It defines a set of data and label attributes, categorises in a fine-grained way the cross-dataset recognition into seventeen problems based on these attributes, and presents a comprehensive review of the transfer learning methods, both shallow and deep, developed to date for each problem.

  • It assesses the suitability of the widely used datasets for evaluating transfer learning algorithms for each of the seventeen problems.

  • The problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem, how well each problem has been studied to date and the available solutions to each problem.

  • Through the problem-oriented analysis, challenges and future directions have been identified. In particular, few studies have been reported on eight of the seventeen problems.

  • This survey not only presents an up-to-date technical review for researchers, but also offers a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.

In addition, none of the previous survey papers covers all of the seventeen problems. For instance, Weiss et al. (2016) focus on nine (of the seventeen) problems on homogeneous and heterogeneous domain adaptation and transfer learning with heterogeneous label spaces; Venkateswara et al. (2017b) mainly review the literature of two problems in homogeneous domain adaptation using deep learning; and Csurka (2017) focuses on seven problems in domain adaptation.

The rest of the paper is organised as follows. Section 2 explains the terminologies used in the paper, defines the problem-oriented taxonomy of cross-dataset recognition, and summarises the transfer learning approaches to cross-dataset recognition. The seventeen problems identified in the taxonomy are categorised into four scenarios: homogeneous feature and label spaces, heterogeneous feature spaces, heterogeneous label spaces and heterogeneous feature and label spaces. Sections 3 through 6 review and analyse respectively the advances of techniques in addressing the problems under the four scenarios. Section 7 discusses and examines the suitability of the most commonly used datasets for cross-dataset transfer learning for all the problems. Section 8 discusses the challenges and future research directions. Section 9 concludes the paper.

2. Overview

This section begins with the definitions of terminologies used throughout the paper and then provides a summary of the approaches that have been developed for transfer learning.

2.1. Terminologies and Definitions

In this paper, we follow the definitions of “domain” and “task” given by (Pan and Yang, 2010).

Definition 2.1 ().

(Domain (Pan and Yang, 2010)) “A domain is defined as $\mathcal{D} = \{\mathcal{X}, P(X)\}$, which is composed of two components: a feature space $\mathcal{X}$ and a marginal probability distribution $P(X)$, where $X = \{x_1, \ldots, x_n\} \in \mathcal{X}$.”

Definition 2.2 ().

(Task (Pan and Yang, 2010)) “Given a specific domain $\mathcal{D}$, a task is defined as $\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}$, which is composed of two components: a label space $\mathcal{Y}$ and a predictive function $f(\cdot)$, where $f(\cdot)$ can be seen as a conditional distribution $P(Y|X)$ and $Y = \{y_1, \ldots, y_n\} \in \mathcal{Y}$.”

Definition 2.3 ().

(Dataset) A dataset is defined as $\mathcal{S} = \{\mathcal{D}, \mathcal{T}\}$, which is a collection of data that belong to a specific domain $\mathcal{D}$ with a specific task $\mathcal{T}$.

Often $P(X)$ and $f(\cdot)$ are unknown and need to be estimated and learned, respectively. If for each sample $x_i$ in the dataset its label $y_i$ is given, $\mathcal{S}$ is labelled; otherwise, $\mathcal{S}$ is unlabelled.

Definition 2.4 ().

(Transfer Learning (Pan and Yang, 2010)) “In general, given a source domain $\mathcal{D}_S$ and learning task $\mathcal{T}_S$, a target domain $\mathcal{D}_T$ and learning task $\mathcal{T}_T$, transfer learning aims to help improve the learning of the target predictive function $f_T(\cdot)$ in $\mathcal{D}_T$ using the knowledge in $\mathcal{D}_S$ and $\mathcal{T}_S$, where $\mathcal{D}_S \neq \mathcal{D}_T$, or $\mathcal{T}_S \neq \mathcal{T}_T$.” Note that the special case where $\mathcal{T}_S = \mathcal{T}_T$ and $\mathcal{D}_S \neq \mathcal{D}_T$ is known as Domain Adaptation. Specifically, in the context of cross-dataset recognition, the aim of transfer learning is to learn a robust classifier for a dataset (i.e. the target dataset $\mathcal{S}_T$) by effectively utilising the knowledge offered through other datasets (i.e. the source datasets $\mathcal{S}_S$).

2.2. Problem-oriented Taxonomy of Cross-dataset Recognition

In cross-dataset recognition, there are often two datasets. One, referred to as the source dataset, is used in training, and the other, referred to as the target dataset, is to be recognized. Their domains and/or tasks are different, and their characteristics determine what methods can or should be used. In this paper, we define a set of attributes to characterise the source and target datasets. These attributes have led to a comprehensive taxonomy of cross-dataset recognition problems that provides a unique perspective for this survey.

  • Attributes on data:

    • Feature space: the consistency of feature spaces (i.e. different feature extraction methods or different data modalities) between the source and target datasets.

    • Data availability: the availability and sufficiency of target data in the training stage.

    • Balanced data: whether the numbers of data samples in each class are balanced.

    • Sequential/Online data: whether the data are sequential/online and evolving over time.

  • Attributes on label:

    • Label availability: the availability of labels in source and target datasets.

    • Label space: whether the data categories of the two datasets are identical.

Based on these attributes, the following four scenarios are defined as the first layer of the problem taxonomy to guide the survey.

  • Homogeneous feature spaces and label spaces: The feature spaces and label spaces of the source and target datasets are identical. But domain divergence (i.e. different data distributions) exists across the source and target datasets.

  • Heterogeneous feature spaces: the feature spaces of the source and target datasets are different (i.e. domain divergence occurs), but their label spaces are the same.

  • Heterogeneous label spaces: the label spaces of the source and target datasets are different (i.e. task divergence occurs), but their feature spaces are the same.

  • Heterogeneous feature spaces and label spaces: both the feature spaces and the label spaces of the source and target datasets are different (i.e. both domain and task divergence occurs).

The problems corresponding to the four scenarios are further divided into sub-problems using other data attributes such as the data being balanced and/or sequential/online. Fig. 1 shows the problem-oriented taxonomy for cross-dataset recognition, which shows seventeen different problems.

Figure 1. A problem-oriented taxonomy for cross-dataset recognition including the number of papers that are found to address the problems.

2.3. Approaches

Many approaches have been developed for transfer learning across datasets (Pan and Yang, 2010) at the instance level, i.e. re-weighting some source samples based on their divergence from the target domain; at the feature level, i.e. learning “good” feature representations that have minimum domain shift; and at the classifier level, i.e. learning an optimal target classification model using the data from both the source and target domains as well as the source model. This section summarises the most typical approaches to transfer learning for cross-dataset recognition, including the Statistical approach, Geometric approach, Higher-level Representation, Correspondence approach, Class-based approach, Self Labelling, and Hybrid approach. These approaches have been reported explicitly or implicitly in the literature. In particular, the basic assumptions of each approach are analysed and presented in this section, and several commonly used methods are illustrated under each approach. Due to the page limit, only a brief description of each approach and its methods is presented; see the supplementary material for details.

Statistical Approach:

is employed in transferring the knowledge at the levels of instances, features and classifiers by measuring and minimizing the divergence of statistical distributions between the source and target datasets. This approach generally assumes sufficient data in each dataset to approximate the respective statistical distributions. The typical methods are Instance re-weighting (Huang et al., 2006), Feature space mapping (Pan et al., 2009) and Classifier parameter mapping (Quanz and Huan, 2009).

Geometric Approach:

bridges datasets according to their geometrical properties. It assumes domain shift can be reduced using the relationship of geometric structures between the source and target datasets. Typical methods include Subspace alignment (Fernando et al., 2013), Intermediate subspaces (Gopalan et al., 2011; Gong et al., 2012), and Manifold alignment (without correspondence) (Cui et al., 2014a).

Higher-level Representation Approach:

aims at finding higher-level representations that are representative, compact, and invariant between datasets. This approach does not require any labelled data or the existence of a correspondence set, but assumes that domain invariant higher-level representations exist between datasets. Note that this approach is commonly used together with other approaches for better transfer, but it is also used independently without any mechanism to reduce the domain divergence explicitly. Typical methods are Sparse coding (Raina et al., 2007), Low-rank representation (Shao et al., 2012), Deep Neural Networks (Donahue et al., 2014; Razavian et al., 2014; Yosinski et al., 2014), Stacked Denoising Auto-encoders (SDAs) (Glorot et al., 2011; Chen et al., 2012), and Attribute space (Lampert et al., 2009; Akata et al., 2013).

Correspondence Approach:

uses paired correspondence samples from different domains to construct the relationship between domains. A set of corresponding samples (i.e. the same object captured from different view angles, or by different sensors) are required. The typical methods are Sparse coding with correspondence (Zheng et al., 2012) and Manifold alignment (with correspondence) (Zhai et al., 2010) .

Class-based Approach:

uses label information as a guidance for connecting the source and target datasets. Hence, the labelled data from each dataset are assumed to be available, whether sufficient or not. The commonly used methods include Feature augmentation (Daumé III, 2007), Metric learning (Saenko et al., 2010), Linear Discriminative Model (Yang et al., 2007), and Bayesian Model (Fei-Fei et al., 2006).

Self Labelling:

uses the source domain samples to train an initial model to obtain the pseudo labels of target domain data. Then the target data and their pseudo labels are incorporated to retrain the model. The procedure continues iteratively until convergence. A typical example is Self-training (Dai et al., 2007b; Tan et al., 2009).
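As an illustration, the following minimal self-training sketch (assuming numpy arrays of source/target features; the confidence-based selection rule and the per-round quota are illustrative choices, not the exact procedure of any cited method) iteratively pseudo-labels the most confident target samples and retrains the classifier:

```python
import numpy as np
from sklearn.svm import SVC

def self_train(X_src, y_src, X_tgt, rounds=5, top_k=50):
    """Iteratively pseudo-label confident target samples and retrain."""
    X_pool, y_pool = X_src.copy(), y_src.copy()
    clf = SVC(probability=True).fit(X_pool, y_pool)
    remaining = np.arange(len(X_tgt))
    for _ in range(rounds):
        if len(remaining) == 0:
            break
        proba = clf.predict_proba(X_tgt[remaining])
        picked = np.argsort(-proba.max(axis=1))[:top_k]   # most confident samples
        pseudo = clf.classes_[proba.argmax(axis=1)[picked]]
        X_pool = np.vstack([X_pool, X_tgt[remaining[picked]]])
        y_pool = np.concatenate([y_pool, pseudo])
        remaining = np.delete(remaining, picked)
        clf = SVC(probability=True).fit(X_pool, y_pool)   # retrain with pseudo-labels
    return clf
```

In practice, the stopping criterion and the per-round quota top_k control how much pseudo-label noise is absorbed into the model.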

Hybrid Approach:

combines two or more of the above approaches for better knowledge transfer. Several example combinations are Correspondence and Higher-level representation (Huang and Wang, 2013), Higher-level representation and Statistical (Long et al., 2013a; Long and Wang, 2015; Wei et al., 2016), Statistical and Geometric (Zhang et al., 2017a), Statistical and Self labelling (Dai et al., 2007a), Correspondence and Class-based (Diethe et al., 2008), Statistical and Class-based (Duan et al., 2012a), and Higher-level representation and Class-based (Zhu and Shao, 2014).

In the following sections, we present a comprehensive review on what approaches have been or can be used for the cross-dataset recognition problems shown in Figure 1.

3. Homogeneous Feature Spaces and Label Spaces

In this scenario, $\mathcal{X}_S = \mathcal{X}_T$ and $\mathcal{Y}_S = \mathcal{Y}_T$. Hence, $\mathcal{D}_S$ and $\mathcal{D}_T$ generally differ in their distributions ($P(X_S) \neq P(X_T)$). Sufficient labelled source domain data are generally assumed to be available, and different assumptions are made on the target domain, leading to different sub-problems.

3.1. Labelled Target Dataset

In this problem, a small number of labelled data in target domain are available. However, the labelled target data are generally insufficient for learning an effective classifier. This is also called supervised domain adaptation or few-shot domain adaptation in the literature.

Class-based Approach

The most commonly used approach in supervised domain adaptation is class-based, since the labelled data from both domains are available in the training stage. For example, Daumé III (2007) proposes a feature augmentation based method where each feature is replicated into a high-dimensional space containing a general and a domain-specific version:

$$\Phi_S(\mathbf{x}^s) = [\mathbf{x}^s;\, \mathbf{x}^s;\, \mathbf{0}_d], \qquad \Phi_T(\mathbf{x}^t) = [\mathbf{x}^t;\, \mathbf{0}_d;\, \mathbf{x}^t] \qquad (1)$$

where $\mathbf{x}^s \in \mathbb{R}^d$ is a source domain sample, $\mathbf{x}^t \in \mathbb{R}^d$ is a target domain sample, $\mathbf{0}_d$ is the $d$-dimensional zero vector, $d$ is the feature dimension, and $n_s$ and $n_t$ are the total numbers of samples in the source and target domains, respectively.
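The feature replication in Eq. (1) takes only a few lines to implement; a minimal numpy sketch (the function names are ours, for illustration) maps both domains into the tripled space, after which any standard supervised classifier can be trained on the union of the augmented data:

```python
import numpy as np

def augment_source(Xs):
    # Phi_S(x) = [x; x; 0]: shared copy followed by the source-specific copy
    return np.hstack([Xs, Xs, np.zeros_like(Xs)])

def augment_target(Xt):
    # Phi_T(x) = [x; 0; x]: shared copy followed by the target-specific copy
    return np.hstack([Xt, np.zeros_like(Xt), Xt])
```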

The idea of supervised metric learning has also been used (Zhang and Yeung, 2010; Perrot and Habrard, 2015). The core idea is to exploit the task relationships between domains to boost the target task. Another group of methods (Yang et al., 2007; Jiang et al., 2008; Xu et al., 2014b) transfers the parameters of discriminative classifiers (e.g. SVM) across datasets. Recently, Motiian et al. (2017a) propose to create pairs of source and target instances to handle the scarce target labelled data. In addition, they extend adversarial learning (Goodfellow et al., 2014) to align the semantic information of the classes.

A more realistic setting is that samples from only a subset of classes are available in the target domain; the adapted features are then generalized to unseen categories in the target dataset. While some categories are not available in the target dataset, the label spaces of the two domains are still assumed to be the same, so we discuss these methods under the problem of homogeneous label spaces. Generally, these methods assume the shift between domains is category-independent. For example, Saenko et al. (2010) present a supervised metric learning based method that learns a metric minimizing the distribution shift by using target labelled data from a subset of the categories:

$$\min_{W}\; \operatorname{tr}(W) - \log\det(W) \quad \text{s.t.}\;\; d_W(\mathbf{x}^s_i, \mathbf{x}^t_j) \leq u \;\text{if}\; y^s_i = y^t_j, \quad d_W(\mathbf{x}^s_i, \mathbf{x}^t_j) \geq \ell \;\text{if}\; y^s_i \neq y^t_j \qquad (2)$$

where $u$ and $\ell$ are the threshold parameters, $\mathbf{x}^s_i$ and $\mathbf{x}^t_j$ represent a source domain sample and a target domain sample, respectively, $y^s_i$ and $y^t_j$ represent their corresponding labels, $d_W(\mathbf{x}^s_i, \mathbf{x}^t_j)$ is the distance between $\mathbf{x}^s_i$ and $\mathbf{x}^t_j$, and $W$ is the distance matrix to be learned. The learned transformation is then applied to unseen target test data that may come from categories different from those in the target training data. Similarly, some recent methods learn to recognize unseen target categories (that have been seen in the source domain) under deep learning frameworks by exploiting the semantic structure, either via soft labels (the averaged softmax activations over all source samples in each category) (Tzeng et al., 2015) or via a Siamese architecture (Motiian et al., 2017b). For example, Figure 2 illustrates the network architecture of the domain and task transfer method proposed by Tzeng et al. (2015), which uses soft labels. In this work, the learned source semantic structure is transferred to the target domain by optimizing the network to produce activation distributions that match those learned for the source data.

Figure 2. The network architecture of the domain and task transfer method of Tzeng et al. (2015). (Figure used courtesy of Tzeng et al. (2015))
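To make the soft-label idea concrete, the PyTorch sketch below (a simplified reading of Tzeng et al. (2015), not a verified re-implementation; the temperature T and the function names are our assumptions) averages the softened softmax activations per source class and uses them as targets for the labelled target samples:

```python
import torch
import torch.nn.functional as F

def class_soft_labels(src_logits, src_labels, num_classes, T=2.0):
    # Per-class average of temperature-softened source softmax activations;
    # assumes every class appears at least once in src_labels.
    probs = F.softmax(src_logits / T, dim=1)
    return torch.stack([probs[src_labels == c].mean(dim=0)
                        for c in range(num_classes)])

def soft_label_loss(tgt_logits, tgt_labels, soft_labels, T=2.0):
    # Cross-entropy between the soft source labels of each target sample's
    # class and the softened target predictions.
    log_p = F.log_softmax(tgt_logits / T, dim=1)
    return -(soft_labels[tgt_labels] * log_p).sum(dim=1).mean()
```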

Self Labelling

Dai et al. (2007b) propose TrAdaBoost, which extends boosting-based methods by decreasing the weights of the instances that are most dissimilar to the target distribution in order to weaken their impact.

Hybrid Approach

The higher-level representation approach and the class-based approach have been used together for better cross-dataset representation. For example, a discriminative dictionary can be learned such that same-class samples from different domains have similar sparse codes (Zhu et al., 2014; Shekhar et al., 2013). Besides discriminative dictionary learning, label information can also be used to guide deep neural networks in reducing domain shift. For example, Koniusz et al. (2017) fuse the source and target CNN streams at the classifier level, where the scatters of the two network streams of the same class are aligned while those of different classes are separated.

3.2. Labelled plus Unlabelled Target Dataset

Compared to the scenario where only limited labelled target data are present, abundant unlabelled target data are additionally available for training in this problem (often known as semi-supervised domain adaptation in the literature) to provide additional structural information. This setting is realistic in real-world applications because unlabelled data are easy to obtain.

Class-based Approach

Duan et al. (2012b) extend SVM-based supervised classifier transfer methods with unlabelled target data. They propose a regularizer which enforces that the learned target classifier and the pre-learned source classifiers have similar decision values on the unlabelled target instances:

$$\Omega(f^T) = \frac{1}{2} \sum_{s=1}^{S} \gamma_s \sum_{i=n_l+1}^{n} \left( f^T(\mathbf{x}^t_i) - f^s(\mathbf{x}^t_i) \right)^2 \qquad (3)$$

where $f^T(\mathbf{x}^t_i)$ and $f^s(\mathbf{x}^t_i)$ represent the decision values of the unlabelled target samples from the target classifier and the $s$-th auxiliary classifier, respectively, $n_l$ and $n$ are the number of labelled target samples and the total number of target samples, and $\gamma_s$ is the weight measuring the relevance between the $s$-th source domain and the target domain.
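Given the classifier outputs, Eq. (3) reduces to a few lines; a minimal numpy sketch (the array shapes are assumptions for illustration):

```python
import numpy as np

def decision_value_regularizer(f_target, f_sources, gammas):
    # f_target:  (n_u,)   target classifier decision values on unlabelled data
    # f_sources: (S, n_u) decision values of the S auxiliary source classifiers
    # gammas:    (S,)     relevance weight of each source domain
    diffs = f_target[None, :] - f_sources
    return 0.5 * np.sum(gammas[:, None] * diffs ** 2)
```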

Self Labelling

Some research extends distance-based classifiers, such as the k-Nearest Neighbour (Tommasi and Caputo, 2013) and Nearest Class Mean (Csurka et al., 2014) classifiers, to learn a domain invariant metric iteratively. Specifically, Tommasi and Caputo (2013) present a method that learns a metric per class based on the NBNN algorithm by progressively selecting target instances and combining them with a subset of the source data while imposing a large margin separation among classes. Similarly, Csurka et al. (2014) extend the NCM classifier to a Domain Specific Class Means (DSCM) classifier and iteratively add high-confidence unlabelled target samples to the training set. A co-training-based method is proposed by Chen et al. (2011) to facilitate the gradual inclusion of target features and instances in training. This method iteratively learns feature views and a target predictor upon the views.

Hybrid Approach

A group of methods for semi-supervised domain adaptation combines the class-based and statistical approaches to make use of both labelled and unlabelled target data. The key idea is that statistical criteria (e.g. the MMD metric between the source data and the unlabelled target data) are used as an additional constraint in discriminative learning methods (e.g. multiple kernel learning (MKL) (Duan et al., 2012d, a), or the least squares method (Yao et al., 2015)).

Yamada et al. (2014) generalize the EASYADAPT method (Daumé III, 2007) to the semi-supervised setting. They propose to project input features into a higher dimensional space and to estimate weights for the training samples based on the ratio of the test and training marginal distributions in that space, using unlabelled target samples.

3.3. Unlabelled Target Dataset

In this problem, no labelled target domain data are available, but sufficient unlabelled target domain data are observable for transfer learning. This problem, also named unsupervised domain adaptation, is more realistic and challenging and has attracted increasing attention.

Statistical Approach

The Maximum Mean Discrepancy (MMD) criterion is commonly used in unsupervised domain adaptation. Generally, the MMD distance between domains is reduced by re-weighting the samples (Huang et al., 2006; Sun et al., 2011; Gong et al., 2013a), mapping to another feature space (Pan et al., 2009; Baktashmotlagh et al., 2013; Long et al., 2013b; Zhang et al., 2017a), or regularizing the source domain classifier using unlabelled target domain data (Quanz and Huan, 2009; Long et al., 2014). For example, Pan et al. (2009) propose to find a domain invariant feature mapping function $\phi(\cdot)$ such that the MMD between the marginal distributions of the two domains, $P(\phi(X_S))$ and $P(\phi(X_T))$, in the mapped feature space is small:

$$\mathrm{MMD}(X_S, X_T) = \left\| \frac{1}{n_s} \sum_{i=1}^{n_s} \phi(\mathbf{x}^s_i) - \frac{1}{n_t} \sum_{j=1}^{n_t} \phi(\mathbf{x}^t_j) \right\|^2_{\mathcal{H}} \qquad (4)$$
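For illustration, the numpy sketch below computes the biased empirical estimate of the squared MMD in Eq. (4), with an RBF kernel standing in for the feature map $\phi$ (the bandwidth sigma is a free parameter one would tune):

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(Xs, Xt, sigma=1.0):
    # Biased empirical estimate of the squared MMD with an RBF kernel
    return (rbf_kernel(Xs, Xs, sigma).mean()
            + rbf_kernel(Xt, Xt, sigma).mean()
            - 2 * rbf_kernel(Xs, Xt, sigma).mean())
```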

Besides MMD, other statistical criteria, such as the Kullback-Leibler divergence (Sugiyama et al., 2008), Hellinger distance (Baktashmotlagh et al., 2014), Quadratic divergence (Si et al., 2010), and mutual information (Shi and Sha, 2012), are also used for comparing two distributions. Sun et al. (2016) propose CORrelation ALignment (CORAL), which minimizes distribution divergence by aligning the covariances of the source and target data.
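CORAL admits a particularly compact implementation; a minimal numpy/scipy sketch (the regularizer eps is our addition for numerical stability) whitens the source features and then re-colours them with the target covariance:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def coral(Xs, Xt, eps=1e-5):
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    Xs_white = Xs @ fractional_matrix_power(Cs, -0.5)          # whitening
    return (Xs_white @ fractional_matrix_power(Ct, 0.5)).real  # re-colouring
```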

Instead of learning a global transformation, Optimal Transport (Courty et al., 2016) learns a local transformation such that each source datum is mapped to target data and the marginal distribution is preserved.

Rather than assuming a single domain in a dataset, some methods assume a dataset may contain several distinctive sub-domains due to the large variations in visual data. For example, Gong et al. (2013b) automatically discover latent domains from multi-source domains to characterize the inter-domain variations and, hence, to construct discriminative models.

Geometric Approach

Gopalan et al. (2011) propose a Sampling Geodesic Flow (SGF) method that samples intermediate subspace representations between the source and target generative subspaces. The two generative subspaces are viewed as two points on a manifold, and the intermediate subspaces are sampled along the geodesic flow between them. Lastly, all the data are mapped to the concatenation of all the subspaces to obtain the final representation. Figure 3 illustrates the SGF method. Gong et al. (2012) extend SGF to a geodesic flow kernel (GFK) method by proposing a kernel such that an infinite number of subspaces are integrated to represent the incremental changes. The methods in (Gopalan et al., 2011, 2014) and (Gong et al., 2012, 2014a) opened up opportunities for researchers to construct intermediate representations that characterize the domain changes. For example, Zhang et al. (2013b) bridge the source and target domains by inserting virtual views along a virtual path for cross-view recognition. Rather than manipulating subspaces, Cui et al. (2014b) represent the source and target domains as covariance matrices and interpolate intermediate covariance matrices to bridge the two domains. Some methods (Ni et al., 2013; Xu et al., 2015) generate several intermediate domains by learning domain-adaptive dictionaries between domains. The idea of intermediate domains has also been employed in the deep learning framework (Chopra et al., 2013).

Figure 3. Illustration of the SGF method (Figure used courtesy of Gopalan et al. (2011))

Instead of modelling intermediate domains, some methods align the two domains directly (Fernando et al., 2013; Aljundi et al., 2015; Cui et al., 2014a; Lu et al., 2017). For instance, Fernando et al. (2013) propose to align the source subspace to the target subspace directly by learning a linear transformation function.
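A minimal sketch of subspace alignment in the spirit of Fernando et al. (2013) (scikit-learn PCA; the subspace dimension d would normally be chosen by validation):

```python
import numpy as np
from sklearn.decomposition import PCA

def subspace_alignment(Xs, Xt, d=50):
    Us = PCA(n_components=d).fit(Xs).components_.T  # source basis, (D, d)
    Ut = PCA(n_components=d).fit(Xt).components_.T  # target basis, (D, d)
    M = Us.T @ Ut                                   # alignment matrix
    Xs_aligned = Xs @ Us @ M                        # source data in the aligned space
    Xt_proj = Xt @ Ut                               # target data in its own subspace
    return Xs_aligned, Xt_proj
```

A classifier trained on Xs_aligned can then be applied directly to Xt_proj.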

Higher-level Representation

The low-rank criterion is commonly used to learn the domain invariant representations (Jhuo et al., 2012; Shao et al., 2014; Ding et al., 2015a). Generally, these methods assume that the data from different domains lie in a shared low-rank structure.

Bengio (2012) argues that more transferable features can be learned by deep networks since they are able to extract the unknown factors of variation that are intrinsic to the data. Donahue et al. (2014) propose the deep convolutional representations named DeCAF, where a deep CNN model is pre-trained on the (generally large-scale) source dataset in a fully supervised fashion, and the features (defined by the pre-learned source convolutional network weights) are then transferred to the target data. Deep auto-encoders have also been used for cross-dataset tasks to extract more transferable features by reconstruction (Glorot et al., 2011; Kan et al., 2015; Chen et al., 2012; Jiang et al., 2016; Ghifary et al., 2016b). For instance, Ghifary et al. (2016b) propose a Deep Reconstruction-Classification Network (DRCN) that learns a shared deep CNN model for both the classification task on the source samples and the reconstruction task on the target samples.

Self Labelling

Recently, Panareda Busto and Gall (2017) propose the open set domain adaptation problem, where only some of the classes are shared between the source and target datasets. The task is to label each target sample either with one of the shared classes or as unknown. We discuss this setting under the homogeneous label space problem because the unknown classes are simply detected as unknown rather than recognized as particular classes. They solve this problem by first assigning some of the target data the labels of the known classes and then reducing the shift between the shared classes in the source and target datasets by a subspace alignment method (similar to (Fernando et al., 2013)). The two procedures are learned iteratively.

Hybrid Approach

Combining different approaches generally triggers better knowledge transfer. Some methods (Zheng et al., 2012; Huang and Wang, 2013) learn two dictionaries on pairs of corresponding samples and encourage the sparse representations of each sample pair to be similar. Some methods use both the geometric and statistical approaches (Sun and Saenko, 2015; Zhang et al., 2017a). For example, Zhang et al. (2017a) propose to learn two projections, one for each domain, to reduce both the geometrical shift and the statistical shift. Differently, Gholami et al. (2017) jointly learn a low dimensional subspace and a classifier through a Bayesian learning framework.

Though deep networks can generally learn more transferable features (Bengio, 2012; Donahue et al., 2014), the higher-level features computed by the last few layers are usually task-specific and not transferable to new target tasks (Yosinski et al., 2014). Hence, some recent work incorporates statistical approaches into the deep learning framework (a higher-level representation approach) to further reduce the domain bias. For instance, the MMD loss is incorporated into the objective of deep models to reduce the divergence of the marginal distributions (Tzeng et al., 2014; Long and Wang, 2015; Long et al., 2016; Venkateswara et al., 2017b) (e.g. Figure 4 illustrates the Deep Adaptation Networks (DAN) proposed in (Long and Wang, 2015)) or the joint distributions (Long et al., 2017) between domains.

Figure 4. Illustration of the DAN method (Figure used courtesy of Long and Wang (2015))

Instead of the MMD metric, Sun and Saenko (2016) extend the CORrelation ALignment (CORAL) method (Sun et al., 2016), which aligns the covariances of the source and target data, to a deep learning based method. Zellinger et al. (2017) propose the Central Moment Discrepancy (CMD) method, which aligns the higher order central moments of the distributions through order-wise moment differences. Self-labelling has also been used in deep neural network based methods. Saito et al. (2017) propose an asymmetric tri-training method, where shared feature extraction layers drive three classifier sub-networks: the first two networks are used to label the unlabelled target samples, and the third learns the final adapted classifier for the target domain using the pseudo-labels produced by the first two.

Statistical criteria (e.g. the MMD distance (Wei et al., 2016; Bousmalis et al., 2016) and adversarial similarity losses (Bousmalis et al., 2016)) have also been incorporated into deep autoencoders for learning more transferable features.

Figure 5. Illustration of the ReverseGrad method (Figure used courtesy of Ganin and Lempitsky (2015))

Motivated by adversarial learning (Goodfellow et al., 2014), GAN-based domain adaptation methods have been proposed with the key idea that the JS divergence between domains is reduced (Ganin and Lempitsky, 2015; Ganin et al., 2016; Tzeng et al., 2017; Bousmalis et al., 2017). For example, the gradient reversal algorithm (ReverseGrad) proposed by Ganin and Lempitsky (2015) minimizes the $\mathcal{H}$-divergence by considering the domain invariance as a binary classification task and employing a gradient reversing strategy (as shown in Figure 5). Tzeng et al. (2017) propose to learn separate feature extraction networks for the two domains and incorporate a domain classifier such that the embeddings produced by the source and target CNNs cannot be distinguished. Bousmalis et al. (2017) propose a GAN-based method that adapts the source domain data at the pixel level such that they are indistinguishable from the target domain data. Differently, Liu and Tuzel (2016) propose a Coupled GAN (CoGAN) method that learns a joint distribution by jointly modelling two GANs, where the first generates the source data and the second generates the target images. Instead of enforcing samples from different domains to be non-discriminable, CoGAN shares the weights of the layers that decode high-level features, encoding the assumption that images from different domains share the same high-level representations but have different low-level representations.
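The gradient reversal strategy is typically implemented as a custom autograd operator; a minimal PyTorch sketch (the scaling factor lam plays the role of the adaptation weight in ReverseGrad):

```python
import torch

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; multiplies gradients by -lam in the
    # backward pass, so the feature extractor learns to fool the domain
    # classifier placed on top of it.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```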

3.4. Imbalanced Unlabelled Target Dataset

This problem assumes that the target domain is class-imbalanced and comes only with unlabelled data; thus, the statistical approach can be used. This problem is quite common in practice and is known as prior probability shift, or imbalanced data, in classification. For instance, abnormal activities (e.g. kick, punch, fight, and fall down) are much less frequent than normal activities (e.g. walk, sit, eat, and drink) in video surveillance but require a higher recognition rate.

Statistical Approach

In the classification scenario, the prior probability ($P(Y)$) shift was often considered a class imbalance problem (Japkowicz and Stephen, 2002; Zhang et al., 2013a). Zhang et al. (2013a) tackle the prior probability shift by re-weighting the source samples using an idea similar to the Kernel Mean Matching method (Huang et al., 2006). They also define the situation where both $P(Y)$ and $P(X|Y)$ are shifted across datasets and propose a kernel approach to reduce the distribution shift by re-weighting and transforming the source data. It is assumed that the source data can be transferred to the target domain by a location-scale (LS) transformation (i.e. $P(X|Y)$ differs only in location and scale). Instead of assuming that all the features can be transferred to the target domain by an LS transformation, Gong et al. (2016) propose to learn conditional invariant components through a linear transformation, after which the source samples are re-weighted to reduce the shift of $P(Y)$ and $P(X|Y)$ between domains.

Recently, Yan et al. (2017) take both the domain shift and the class weight bias across domains into account. To account for the class prior probabilities, they introduce class-specific weights. Specifically, the domain adaptation is performed by iteratively generating pseudo-labels for the target samples, learning the source class weights, and tuning the deep CNN model parameters.
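A much-simplified sketch of the class-weighting idea (ours, not Yan et al.'s exact weighted-MMD formulation): estimate the target class priors from pseudo-labels and re-weight each source sample by the prior ratio of its class:

```python
import numpy as np

def class_prior_weights(y_src, pseudo_y_tgt, num_classes, eps=1e-8):
    # Estimate p_T(y) from target pseudo-labels and p_S(y) from source labels,
    # then weight each source sample by p_T(y) / p_S(y) to correct prior shift.
    p_src = np.bincount(y_src, minlength=num_classes) / len(y_src)
    p_tgt = np.bincount(pseudo_y_tgt, minlength=num_classes) / len(pseudo_y_tgt)
    return (p_tgt / (p_src + eps))[y_src]   # one weight per source sample
```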

3.5. Sequential/Online Labelled Target Data

In practice, the target data can be sequential video streams or continuously evolving data, and the distribution of the target data may also change over time. Since the target data are labelled, this problem is named supervised sequential/online domain adaptation.

Self Labelling

Xu et al. (2014c) assume a weak-labelling setting and propose an incremental method for object detection across domains. Specifically, the adaptation model is a weighted ensemble of the source and target classifiers, and the ensemble weights are updated over time.

3.6. Sequential/Online Unlabelled Target Data

Similar to the problem in Section 3.5, the target data are sequential in this problem; however, no labelled target data are available. This problem is named unsupervised sequential/online domain adaptation and is related to, but different from, concept drift. Concept drift (Gama et al., 2014) refers to changes in the conditional distribution ($P(Y|X)$) while the marginal distribution ($P(X)$) stays unchanged, whereas in sequential/online domain adaptation the changes between the two domains are caused by changes of the input data distribution.

Geometric Approach

Hoffman et al. (2014) extend the Subspace Alignment method (Fernando et al., 2013) to handle a continuously evolving target domain, as shown in Figure 6. Both the subspaces and the metrics that align the two subspaces are updated after each new target sample is received. Bitarafan et al. (2016) tackle the continuously evolving target domain using the idea of GFK (Gong et al., 2012) to construct a linear transformation, which is updated after each new batch of unlabelled target domain data arrives. Each arriving batch of target data is classified after the transformation and included in the source domain for recognizing the next batch of data.

Figure 6. Illustration of the continuous domain adaptation method Hoffman et al. (2014) (Figure used courtesy of Hoffman et al. (2014))

Self Labelling

Jain and Learned-Miller (2011) address online adaptation in the face detection task by adapting pre-trained classifiers using a Gaussian process regression scheme. The intuition is that the “easy-to-detect” faces can help the detection of “hard-to-detect” faces by normalizing the co-occurring “hard-to-detect” faces and thus reducing their difficulty of detection. Xu et al. (2016) propose an online domain adaptation model for multiple object tracking using a two-level hierarchical tree framework, where the leaf nodes correspond to the object detectors and the root node corresponds to the class detector. The adaptation is executed in a progressive manner.

3.7. Unavailable Target Data

This problem is also named domain generalization in the literature, where the target domain data are not available for adaptation. Thus, multiple source datasets are generally required to learn dataset invariant knowledge that can be generalized to a new dataset. Note that domain generalization is distinguished from multi-source domain adaptation (MSDA) (Sun et al., 2015; Duan et al., 2009; Hoffman et al., 2012; Duan et al., 2012b; Gong et al., 2013b; Xu et al., 2018), since MSDA generally requires access to the target data for adaptation. We discuss transfer learning from multiple sources in detail in Section 8.3.

Higher-level Representation

Most of the existing work tackles this problem by learning domain invariant and compact representations from multiple source domains (Blanchard et al., 2011; Khosla et al., 2012; Muandet et al., 2013; Fang et al., 2013; Stamos et al., 2015; Ghifary et al., 2015; Ghifary et al., 2016a; Motiian et al., 2017b; Li et al., 2017g). For example, Khosla et al. (2012) explicitly model the bias of each source domain and estimate the weights for the unbiased data by removing the source domain biases. Muandet et al. (2013) propose Domain-Invariant Component Analysis (DICA), a kernel-based method, to learn an invariant mapping that reduces the domain shift while preserving discriminative information. Fang et al. (2013) propose an unbiased metric learning approach to learn an unbiased metric from multiple biased datasets. Ghifary et al. (2015) propose a Multi-Task Autoencoder (MTAE) method, which substitutes the artificially induced corruption in standard denoising autoencoders with specific variations of the objects (e.g. rotation) to form multiple views. Hence, MTAE learns representations that are invariant to multiple related domains.

Ensembles of classifiers learned from multiple sources have also been used for generalizing to an unseen target domain (Xu et al., 2014a; Niu et al., 2015a, b; Li et al., 2017f). Xu et al. (2014a) propose to reduce the domain shift in an exemplar-SVM framework by regularizing positive samples from the same latent domain to have similar likelihoods under each exemplar classifier. Similarly, Niu et al. (2015a) extend this idea to source domain samples with multi-view features. Niu et al. (2015b) explicitly discover the multiple hidden domains (Gong et al., 2013b), and an ensemble of classifiers is then formed by learning a single classifier for each individual category in each discovered hidden domain.

4. Heterogeneous Feature Spaces

This section discusses the problems where $\mathcal{D}_S$ and $\mathcal{D}_T$ are different due to $\mathcal{X}_S \neq \mathcal{X}_T$, but $\mathcal{Y}_S = \mathcal{Y}_T$. The different feature spaces can arise from different data modalities or different feature extraction methods. Similar to the scenario defined in Section 3, sufficient labelled source domain data are assumed to be available in the following sub-problems.

4.1. Labelled Target Dataset

This problem assumes that limited labelled target data are available for adaptation; it is named supervised heterogeneous domain adaptation.

Higher-level Representation

Some methods assume that only the feature spaces are different while the distributions are the same between the source and target datasets. Since the labelled data in the target dataset are scarce, Zhu et al. (2011) propose to use auxiliary heterogeneous data from the Web that contain both modalities to extract the semantic concepts and find the shared latent semantic feature space between different modalities.

Class-based Approach

The class-based approach has also been used to connect heterogeneous feature spaces. Finding the relationship between different feature spaces can be seen as translating between different languages. Hence, Dai et al. (2009) propose a translator using a language model to translate between different data modalities or feature spaces by borrowing the class label information. Kan et al. (2012) propose a multi-view discriminant analysis method that learns view-specific linear mappings, one per view, to find a view-invariant space by using label information, where the between-class variation from all views is maximized while the within-class variation from all views is minimized. The manifold alignment method (Wang and Mahadevan, 2011) has also been used for heterogeneous domain adaptation with the class-based approach.

Inspired by (Daumé III, 2007), feature augmentation based methods have also been proposed for heterogeneous domain adaptation (Duan et al., 2012c; Li et al., 2014). They learn two transformations that map the data from the two domains into a shared subspace, and the transformed features in the subspace are then augmented with the original data as well as zeros (as shown in Figure 7).

Figure 7. Illustration of a feature augmentation method for heterogeneous domain adaptation. (Figure used courtesy of Li et al. (2014))

Kulis et al. (2011) extend (Saenko et al., 2010) to learn an asymmetric mapping that transforms samples between domains using labelled data from both domains, with an assumption similar to (Saenko et al., 2010) that the label spaces of the target training set and target test set are non-overlapping subsets of the source label space. Different from previous metric learning based domain adaptation that learns an asymmetric feature transformation between heterogeneous features (Kulis et al., 2011), an asymmetric metric between classifiers can also be learned to bridge the source and target classifiers on heterogeneous features (Zhou et al., 2014).

Hybrid Approach

The first group of work focuses on cross-modal representation learning by combining the class-based and higher-level representation approaches. Gong et al. (2014b) propose a three-view Canonical Correlation Analysis (CCA) model that explicitly incorporates the high-level semantic information (i.e. high-level labels or topics) as a third view. A recent work (Wang et al., 2017) incorporates adversarial learning into supervised representation learning for cross-modal retrieval.

Another line of research assumes that both the feature spaces and the data distributions are different. Shekhar et al. (2015) extend (Shekhar et al., 2013) to heterogeneous feature spaces, where two projections and a latent dictionary are jointly learned to simultaneously find a common discriminative low-dimensional space and reduce the distribution shift. Similarly, Sukhija et al. (2016) assume the label distributions between domains are shared; the shared label distributions are then used as pivots to derive a sparse projection between the two domains.

4.2. Labelled plus Unlabelled Target Dataset

In this problem, both limited labelled and sufficient unlabelled target data are available; this setting is named semi-supervised heterogeneous domain adaptation.

Statistical Approach

Tsai et al. (2016) propose the Cross-Domain Landmark Selection (CDLS) method for heterogeneous domain adaptation (HDA) using the statistical approach (MMD). Specifically, CDLS derives a heterogeneous feature transformation that results in a domain-invariant subspace for associating the heterogeneous domains, and assigns a weight to each instance according to its adaptation ability, using both labelled and unlabelled target samples.

Correspondence Approach

Zhai et al. (2010) assume that, in addition to a set of labelled correspondence pairs between the source and target datasets, some unlabelled data from both datasets are also available. Specifically, given a set of correspondence samples $C$ between the two domains, one can learn the mapping matrices $W_s$ and $W_t$ for the source and target sets, respectively, in order to preserve the correspondence relationships after mapping:

$$\min_{W_s, W_t} \sum_{(\mathbf{x}^s_i,\, \mathbf{x}^t_j) \in C} \left\| W_s^{\top} \mathbf{x}^s_i - W_t^{\top} \mathbf{x}^t_j \right\|^2 + \lambda_s \Omega_s(W_s) + \lambda_t \Omega_t(W_t) \qquad (5)$$

where $\mathbf{x}^s_i$ and $\mathbf{x}^t_j$ represent a source domain sample and a target domain sample, respectively, and $\Omega_s(W_s)$ and $\Omega_t(W_t)$ are the manifold regularization terms which are used to preserve the intrinsic manifold structures of the source and target domains.

Class-based Approach

Xiao and Guo (2015) propose a kernel matching method, where a kernel matrix of the target domain is matched to a source domain sub-matrix by exploiting the label information such that the target samples are mapped to similar source samples. The unlabelled target samples are expected to be aligned with the source samples from the same class, guided by the labelled target samples via kernel affinity measures between samples.

Hybrid Approach

Wu and Ji (2016) introduce a constrained deep transfer feature learning method that incorporates correspondence into the higher-level representation approach. Specifically, several pairs of source and target samples are used to capture the joint distribution and bridge the two domains. A large number of additional source samples are then transferred to the target domain through pseudo-labelling for further target domain feature learning.

4.3. Unlabelled Target Dataset

This problem assumes that no labelled target domain data are available; we name it unsupervised heterogeneous domain adaptation. Here, the feature spaces can be completely different between datasets. Alternatively, the source data may consist of multiple modalities while the target data contain only one of the modalities, or vice versa.

Statistical Approach

Chen et al. (2014) and Li et al. (2017b) assume that the source datasets contain multiple modalities, the target dataset contains only one modality, and a distribution shift between datasets also exists. Specifically, the statistical approach (e.g. MMD) is used to project the source and target common modalities to a shared subspace in order to reduce the distribution mismatch. In the meantime, the multiple source modalities are also transformed to the same representation in the shared space. The shared space and the robust classifier are refined iteratively.

Correspondence Approach

The co-occurrence data between different feature spaces or modalities have been employed for heterogeneous domain adaptation (Qi et al., 2011; Yang et al., 2016).

Hybrid Approach

The correspondence approach or the statistical approach is generally incorporated into the higher-level representation approach for transferring between data modalities or feature spaces.

Canonical Correlation Analysis (CCA) (Anderson, 1984) is a standard approach to learning two linear projections of two sets of data such that the projections are maximally correlated; it requires paired data but no label supervision. Many cross-modal recognition or retrieval methods incorporate the idea of CCA (Andrew et al., 2013; Feng et al., 2014; Yan and Mikolajczyk, 2015) into deep models. Cross-media multiple deep networks (CMDN) (Peng et al., 2016) jointly preserve the intra-media and inter-media information and then hierarchically combine them to learn the rich cross-media correlation. Castrejón et al. (2016) introduce a cross-modal representation method across the RGB modality, sketch modality, clipart, and textual descriptions of indoor scenes. The cross-modal convolutional neural networks are regularized using statistical regularization so that they have a shared representation that is invariant to different modalities.
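As a minimal illustration of the CCA building block (scikit-learn; the random matrices below stand in for real paired image and text features):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X_img = rng.normal(size=(500, 128))   # hypothetical image features
X_txt = rng.normal(size=(500, 64))    # hypothetical paired text features

cca = CCA(n_components=10).fit(X_img, X_txt)
Z_img, Z_txt = cca.transform(X_img, X_txt)  # maximally correlated shared space
```

Cross-modal retrieval can then be performed by nearest-neighbour search between Z_img and Z_txt.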

The paired correspondence data are used in (Gupta et al., 2016), where a cross-modal supervision transfer method is proposed. The deep CNNs are pre-trained on the source data (e.g. a large-scale labelled RGB dataset). Then the paired target data (unlabelled RGB and depth image pairs) are used to transfer the source parameters to the target networks by constraining the paired samples from different modalities to have similar representations.

A line of research focuses on the task of translation between different domains. For example, in machine translation between languages, sentence pairs are presented in the form of a parallel training corpus for learning the translation system. Traditional translation systems (Koehn et al., 2003) are generally phrase-based, with sub-components that are usually learned separately. Differently, a newly emerged approach named neural machine translation (Kalchbrenner and Blunsom, 2013; Sutskever et al., 2014; Cho et al., 2014; Bahdanau et al., 2015) constructs and trains a neural network that takes a sentence as input and outputs the translated sentence.

Similarly, in the computer vision domain, image-to-image translation (Isola et al., 2017) has also been extensively exploited, which aims at converting an image from one representation of a given scene to another (e.g. texture synthesis (Li and Wand, 2016), sketch to photograph (Isola et al., 2017), RGB to depth (Gupta et al., 2016), time hallucination (Shih et al., 2013; Laffont et al., 2014; Isola et al., 2017), image to semantic labels (Long et al., 2015; Eigen and Fergus, 2015; Xie and Tu, 2015), simulated to real image (Shrivastava et al., 2017), style transfer (Li and Wand, 2016; Wang and Gupta, 2016; Gatys et al., 2016; Johnson et al., 2016; Zhang et al., 2016a), and general image-to-image translation (Liu and Tuzel, 2016; Isola et al., 2017; Yi et al., 2017; Kim et al., 2017; Zhu et al., 2017; Benaim and Wolf, 2017; Li et al., 2017c; Liu et al., 2017)). The key idea for tackling these tasks is to learn a translation model between paired (correspondence approach) or unpaired samples (statistical approach) from different domains. Recent deep learning based techniques have greatly advanced the image-to-image translation task, for example, the deep convolutional neural network based methods (Long et al., 2015; Xie and Tu, 2015; Eigen and Fergus, 2015; Gatys et al., 2016; Johnson et al., 2016; Zhang et al., 2016a) and the Generative Adversarial Network (GAN (Goodfellow et al., 2014)) based methods (Wang and Gupta, 2016; Li and Wand, 2016; Liu and Tuzel, 2016; Shrivastava et al., 2017; Isola et al., 2017; Yi et al., 2017; Kim et al., 2017; Zhu et al., 2017; Benaim and Wolf, 2017; Li et al., 2017c; Liu et al., 2017). Though the original purpose of some of these works on translation between domains may not be cross-dataset recognition, the ideas can be borrowed for cross-modality or cross feature space recognition. If a proper translation between domains can be obtained, the target task can be boosted by the translated source domain data.

5. Heterogeneous Label Spaces

This section discusses the problems where $\mathcal{Y}_S \neq \mathcal{Y}_T$ while $\mathcal{X}_S = \mathcal{X}_T$. For example, in classification tasks, when the label spaces of the two datasets are different, there may still exist shared knowledge between previous categories (e.g. horse) and new categories (e.g. zebra) that can be used for learning the new categories. The source domain is assumed to be labelled except in the last sub-problem (Section 5.5).

5.1. Labelled Target Dataset

This setting is common in the deep learning context. In practice, deep networks are rarely trained from scratch (with random initialization), since target datasets rarely have sufficient labelled data. Thus, transfer learning is generally used: the deep models pre-trained on a very large source dataset are used either as an initialization (followed by fine-tuning on the target data) or as a fixed feature extractor for the target task, which is generally different from the original task (i.e. different label spaces).

The fine-tuning procedure is similar to one-shot or few-shot learning. The key difference is that in fine-tuning the available target data are sufficient for the target task, whereas in few-shot learning the target data are generally scarce (e.g. only one sample per class in the extreme case). Few-shot learning is also closely connected to multi-task learning. The difference is that one-shot learning emphasizes the recognition of the target data with limited labelled data, while the objective of multi-task learning is to improve all the tasks, each of which has good training data.

Higher-level Representation Approach

Since training deep learning models requires a large-scale dataset to avoid overfitting, transfer learning techniques (Yosinski et al., 2014) can be used for small-scale target datasets. The most commonly used technique is to initialize the weights from a pre-trained model and then use the target training data to fine-tune the parameters for the target task. When the pre-trained source model is used as the initialization, two strategies can be employed. The first is to fine-tune all the layers of the deep neural network, while the second is to freeze several earlier layers and only fine-tune the later layers to reduce the effects of overfitting. This is inspired by the observation that the features extracted from the early layers are more general (e.g. edges or colours) and thus transferable to different tasks, whereas the later layers are gradually more specific to the details of the original source tasks. Other transfer methods (Donahue et al., 2014; Razavian et al., 2014) directly use the deep convolutional nets pre-trained on a large dataset (e.g. ImageNet (Deng et al., 2009)), normally after removing the last one or two fully connected layers, as a fixed feature extractor for the target data.
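Both strategies reduce to a few lines in modern frameworks; the PyTorch/torchvision sketch below (ResNet-18 and a hypothetical 10-class target task, chosen purely for illustration) freezes the earlier, more generic layers and replaces the classification head:

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze everything except the last residual block and the classifier head.
for name, param in model.named_parameters():
    if not name.startswith(("layer4", "fc")):
        param.requires_grad = False

# Replace the head for a hypothetical 10-class target task, then fine-tune.
model.fc = nn.Linear(model.fc.in_features, 10)
```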

Note that when the pre-trained deep models are used as an initialization or a fixed feature extractor in the deep learning frameworks, only the pre-trained weights need to be stored without the need of storing the original large scale source data, which is appealing.

Class-based Approach

Patricia and Caputo (2014) treat the pre-trained models from multiple source domains as experts to augment the target features: the output confidence values of the prior models are treated as features, and the features from the target samples are augmented with these confidence values to build a target classifier. Several classifier-based methods have been proposed to transfer the parameters of classifiers using generative models (Fei-Fei et al., 2006; Lake et al., 2011) or discriminative models (Tommasi et al., 2010; Aytar and Zisserman, 2011; Ma et al., 2014; Jie et al., 2011). The key idea is to use the source models as prior knowledge to regularize the models of the target task. These methods are also called Hypothesis Transfer Learning (HTL), since they assume no explicit access to the source domain data and only use source models learned from a source domain. HTL has been theoretically analysed in (Kuzborskij and Orabona, 2013; Wang and Schneider, 2014; Du et al., 2017).

Hybrid Approach

Recently, deep learning based approaches have been proposed for few-shot learning, most of which are metric-learning methods. One early neural network approach to one-shot learning is the Siamese network (Koch et al., 2015), which employs a twin structure to rank the similarity between inputs. Vinyals et al. (2016) propose matching networks, where a differentiable neural attention mechanism is applied over a learned embedding of the limited labelled target data; this method can be considered a weighted nearest-neighbour classifier in an embedded space. Snell et al. (2017) propose prototypical networks, which transform the input into an embedding space and take the prototype of each class as the mean of its embedded support set. Differently, Ravi and Larochelle (2017) propose a meta-learning based few-shot learning method, where a meta-learner LSTM (Hochreiter and Schmidhuber, 1997) produces the updates for training the few-shot neural network classifier. Given a few labelled target examples, this approach generalizes well on the target set.
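The prototype computation at the core of prototypical networks (Snell et al., 2017) is compact enough to sketch; the embedding network and the episode data below are placeholders, and the distance-based scoring is the essential part.

```python
# A minimal sketch of prototypical-network scoring: prototypes are
# class means of embedded support examples, and queries are scored by
# negative squared distance to each prototype.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))

def prototypical_logits(support_x, support_y, query_x, n_classes):
    z_s, z_q = embed(support_x), embed(query_x)
    # One prototype per class: the mean of its embedded support set.
    prototypes = torch.stack(
        [z_s[support_y == c].mean(dim=0) for c in range(n_classes)])
    # Negative squared Euclidean distance serves as the class score.
    return -torch.cdist(z_q, prototypes) ** 2

# A 3-way, 5-shot episode with random placeholder tensors.
sx = torch.randn(15, 64)
sy = torch.arange(3).repeat_interleave(5)
qx, qy = torch.randn(6, 64), torch.randint(0, 3, (6,))
loss = nn.functional.cross_entropy(prototypical_logits(sx, sy, qx, 3), qy)
```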

5.2. Unlabelled Target Dataset

Some research also tackles the heterogeneous label space problem by assuming that only unlabelled target data are available. This problem can be named unsupervised transfer learning.

Higher-level Representation

The higher-level representation approach is generally used for this problem. Two different scenarios are considered in the literature.

The first scenario assumes that only the label spaces between datasets are disjoint, while the distribution shift is not considered. Since no labelled target data are available, the unseen class information is generally gained from a higher-level semantic space shared between datasets. For example, some research assumes that a human-specified high-level semantic space (e.g. attributes (Palatucci et al., 2009) or text descriptions (Reed et al., 2016)) shared between datasets is available. Given a defined attribute or text-description ontology, each class can be represented by a vector in the semantic space. However, acquiring attribute annotations or text descriptions is expensive. Hence, to avoid human annotation, another strategy learns the semantic space from large, unrestricted, but freely available text corpora (e.g. Wikipedia) to derive a word vector space (Frome et al., 2013; Mikolov et al., 2013; Socher et al., 2013). Related work on semantic spaces (e.g. attributes, text descriptions, or word vectors) is further discussed in Section 5.4, since the target data are generally not required when a semantic space is involved.
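As a toy illustration of the word vector strategy, the sketch below derives class embeddings from a free text corpus with word2vec, assuming gensim 4.x; the two-sentence corpus and the class names are obviously placeholders for a corpus such as Wikipedia.

```python
# A toy sketch of deriving class embeddings from free text with
# word2vec; assumes gensim 4.x, and the corpus is a placeholder.
from gensim.models import Word2Vec

corpus = [["the", "zebra", "has", "a", "striped", "tail"],
          ["the", "pig", "has", "a", "curly", "tail"]]
model = Word2Vec(corpus, vector_size=50, min_count=1, seed=0)

# Each class, seen or unseen, is represented by its name's word
# vector, so no manual attribute annotation is needed.
class_embedding = {c: model.wv[c] for c in ["zebra", "pig"]}
```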

The second scenario assumes that, apart from the different label spaces, a domain shift (i.e. a distribution shift of the features) also exists between datasets (Fu et al., 2015; Li et al., 2015; Kodirov et al., 2015; Wang et al., 2016; Zhang and Saligrama, 2016b; Ye and Guo, 2017; Xu et al., 2017b). This is named the projection domain shift problem by Fu et al. (2015). For example, as illustrated in Figure 8, both zebra and pig have the attribute 'hasTail', but the visual appearances and the distributions of the tails of zebras and pigs are very different. To reduce the domain shift explicitly, the (unlabelled) training data in the target domain are generally required to be available. For example, Fu et al. (2015) introduce a multi-view embedding space in a transductive setting, such that different semantic views are aligned. Kodirov et al. (2015) propose a regularised sparse representation framework that uses the target class prototypes estimated from target images to regularise the projections of the target data, thus overcoming the projection domain shift problem.

Figure 8. Examples of projection domain shift. (Figure used courtesy of Fu et al. (2015))

5.3. Sequential/Online Labelled Target Data

This problem assumes that the target data arrive sequentially and can come from different classes; it is also called sequential/online transfer learning and is closely related to lifelong learning (Thrun, 1998; Ruvolo and Eaton, 2013; Li and Hoiem, 2016). Both concepts focus on continuous learning processes over evolving tasks. However, sequential/online transfer learning emphasizes improving the target domain performance (without sufficient target training data), whereas lifelong learning tries to improve the future target task (with sufficient target training data) as well as all the past tasks (Chen et al., 2015). Lifelong learning can thus be seen as incremental/online multi-task learning.

Self Labelling

Nater et al. (2011) address an action recognition scenario where each unseen activity to be recognized has only one labelled sample. They build a multi-class model that uses the prior knowledge of seen classes and progressively learns the new classes; the newly labelled activities are then integrated into the previous model to update the activity model. Zhao and Hoi (2010) propose an ensemble learning based online transfer learning method (OTL) that learns a classifier in an online fashion from the target data and combines it with the pre-learned source classifier. The combination weights are tuned dynamically based on the loss between the ground-truth label of the incoming sample and the current prediction. Tommasi et al. (2012) then extend OTL (Zhao and Hoi, 2010) to address online transfer learning from multiple sources.
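The ensemble idea behind OTL can be sketched as follows; note this is a simplified stand-in for the update rule of Zhao and Hoi (2010), using an exponential re-weighting of the two experts by their squared losses and scikit-learn's SGDClassifier as the online target learner. All data are random placeholders.

```python
# A simplified sketch of the OTL ensemble idea (not the exact update
# rule of Zhao and Hoi (2010)): a fixed pre-learned source classifier
# is combined with an online target classifier, and the combination
# weights decay exponentially with each expert's squared loss.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
# Hypothetical pre-learned source classifier (placeholder data).
Xs, ys = rng.normal(size=(500, 8)), rng.integers(0, 2, 500)
source_clf = SGDClassifier(loss="log_loss", random_state=0).fit(Xs, ys)

# Online target classifier, initialized on a first dummy example.
target_clf = SGDClassifier(loss="log_loss", random_state=0)
target_clf.partial_fit(rng.normal(size=(1, 8)), [0], classes=[0, 1])

w, eta = np.array([0.5, 0.5]), 0.5     # combination weights, decay rate
for _ in range(100):                   # sequential target examples
    x, y = rng.normal(size=(1, 8)), int(rng.integers(0, 2))
    p_src = source_clf.predict_proba(x)[0, 1]
    p_tgt = target_clf.predict_proba(x)[0, 1]
    p = (w[0] * p_src + w[1] * p_tgt) / w.sum()   # ensemble prediction
    # Down-weight each expert according to its own loss on the
    # revealed ground-truth label.
    w *= np.exp(-eta * np.array([(p_src - y) ** 2, (p_tgt - y) ** 2]))
    target_clf.partial_fit(x, [y])     # update the online learner
```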

5.4. Unavailable Target Data

This problem is also named zero-shot learning in the literature, where unseen target categories are to be recognized without access to any target data. Different from domain generalization (see Section 3.7), in zero-shot learning the categories of the unseen target data differ from the source categories. As mentioned in Section 5.2, the unseen categories can generally be connected to the seen categories via auxiliary information, such as a common semantic space.

Higher-level Representation

Most of the methods for this problem rely on the existence of a labelled source dataset of seen categories and prior knowledge about the semantic relationship between the unseen and seen categories. In general, the seen and unseen categories are correlated in a high-level semantic space. Such a semantic space can be an attribute space (Palatucci et al., 2009), a text description space (Reed et al., 2016), or a word vector space (Frome et al., 2013; Mikolov et al., 2013; Socher et al., 2013). Since multiple semantic spaces are often complementary to each other, some methods have been proposed to fuse them (Akata et al., 2015; Zhang et al., 2017b).

The attribute space is the most commonly used intermediate semantic space. Attributes are defined as properties observable in images, described with human-designated names such as “white”, “hairy”, or “four-legged”. Hence, in addition to label annotations, attribute annotations are required for each class. However, the attributes are annotated per class rather than per image, so the effort to annotate a new category is small. Two main strategies have been proposed for recognizing unseen object categories using attributes. The first is recognition using independent attributes, which consists of learning an independent classifier per attribute (Lampert et al., 2009; Palatucci et al., 2009; Kumar et al., 2009; Liu et al., 2011; Parikh and Grauman, 2011). At test time, the attribute values of the test data are predicted using the independent classifiers and the labels are then inferred. Since attribute detectors are expected to generalize well on both seen and unseen categories, some research is devoted to discovering discriminant attributes (Rastegari et al., 2012; Chen and Grauman, 2014; Qin et al., 2017), modelling the uncertainty of attributes (Wang and Ji, 2013; Jayaraman and Grauman, 2014), or robustly detecting attributes from images (Gan et al., 2016; Bucher et al., 2016). However, Akata et al. (2013) argue that the attribute classifiers in previous works are learned independently of the end task, so they may be able to predict the attributes of new images but may not infer the classes effectively. Hence, the second strategy is recognition by assuming a fixed transformation $W$ between the attributes and the class labels (Akata et al., 2015; Romera-Paredes and Torr, 2015; Zhang and Saligrama, 2015, 2016a; Akata et al., 2016; Qiao et al., 2016; Xian et al., 2016; Li et al., 2017d), learning all attributes simultaneously through a compatibility function $F(x, y) = \theta(x)^{\top} W \phi(y)$, where the image embedding $\theta(x)$ and the class embedding $\phi(y)$ are both given. To sum up, attribute-based zero-shot learning methods are promising for recognizing unseen classes, but have the key drawback that attribute annotations are still required for each class. Instead of attributes, the second semantic space is image text descriptions (Reed et al., 2016), which provide a natural language interface; however, as with the attribute space, expensive manual annotation is required to obtain good performance. The third semantic space is the word vector space (Frome et al., 2013; Mikolov et al., 2013; Socher et al., 2013; Lei Ba et al., 2015), which is derived from a huge text corpus and is generally learned by a deep neural network. The word vector space is attractive since extensive annotation is not required to obtain the semantic space.
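A minimal sketch of the second (compatibility-based) strategy follows; here $W$ is fit by plain least squares purely for illustration, whereas the methods cited above use ranking or structured losses, and all data are random placeholders.

```python
# A minimal sketch of compatibility-based zero-shot recognition: fit W
# so that F(x, y) = theta(x)^T W phi(y) scores the correct class
# highest; least squares is an illustrative stand-in objective.
import numpy as np

rng = np.random.default_rng(0)
d_img, d_attr, n_seen = 64, 10, 5
phi_seen = rng.integers(0, 2, (n_seen, d_attr)).astype(float)  # class attributes
X = rng.normal(size=(200, d_img))          # image embeddings theta(x)
y = rng.integers(0, n_seen, 200)           # seen-class labels

# Least-squares fit of W so that x^T W approximates the attribute
# vector of the image's class.
W = np.linalg.lstsq(X, phi_seen[y], rcond=None)[0]   # (d_img, d_attr)

# At test time, unseen classes are scored by compatibility.
phi_unseen = rng.integers(0, 2, (3, d_attr)).astype(float)
x_test = rng.normal(size=(1, d_img))
pred = (x_test @ W @ phi_unseen.T).argmax()          # best unseen class
```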

5.5. Unlabelled Source Dataset

This problem assumes that the source data are unlabelled but the information they contain (e.g. basic visual patterns) can be used for the target tasks; this is known as self-taught learning.

Higher-level Representation

Raina et al. (2007) first presented the idea of “self-taught learning”. They learn a sparse coding from the source data to extract higher-level features. Variations of their method have been proposed, either generalizing Gaussian sparse coding to exponential-family sparse coding (Lee et al., 2009) or taking into consideration the supervision information contained in labelled images (Wang et al., 2013). Moreover, Kumagai (2016) provides a theoretical analysis of self-taught learning, focusing on the learning bound of sparsity-based methods.
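A minimal sketch of the sparse-coding recipe follows, in the spirit of Raina et al. (2007); scikit-learn's DictionaryLearning stands in for their sparse coder and all data are random placeholders.

```python
# A minimal sketch of self-taught learning via sparse coding: learn a
# dictionary on unlabelled source data, then use its sparse codes as
# higher-level features for the labelled target task.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_source = rng.normal(size=(300, 20))      # unlabelled source data
X_target = rng.normal(size=(40, 20))       # scarce labelled target data
y_target = rng.integers(0, 2, 40)

# Learn an overcomplete dictionary of basis vectors from the
# unlabelled source data only.
dico = DictionaryLearning(n_components=32,
                          transform_algorithm="lasso_lars",
                          random_state=0).fit(X_source)

# The sparse codes act as higher-level features for the target task.
clf = LinearSVC().fit(dico.transform(X_target), y_target)
```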

The idea of self-taught learning has also been used in the deep learning framework, where the unlabelled data are used to pre-train the network to obtain a good starting point for its parameters (Le et al., 2011; Gan et al., 2014; Kuen et al., 2015). For instance, Gan et al. (2014) use unlabelled samples to pre-train the first layer of a convolutional deep belief network (CDBN) to initialize the network parameters. Kuen et al. (2015) extract domain-invariant features from unlabelled source image patches for tracking tasks using stacked convolutional autoencoders.

6. Heterogeneous Feature Spaces and Label Spaces

In this section, a more challenging scenario is discussed, where $\mathcal{X}_S \neq \mathcal{X}_T$ and $\mathcal{Y}_S \neq \mathcal{Y}_T$, i.e. both the feature spaces and the label spaces differ between the source and target. There is little work on this scenario, due to its difficulty and to the common assumption that sufficient labelled source domain data are available.

6.1. Labelled Target Dataset

This problem assumes that labelled target data are available. We name this problem heterogeneous supervised transfer learning.

Higher-level Representation

Rather than assuming completely different feature spaces, most methods in this setting assume that the source domain contains multi-modal data while the target domain has only one of the source modalities. Ding et al. (2015b) propose to uncover the missing target modality by finding similar data in the source domain, incorporating a latent factor to recover the missing modality based on a low-rank criterion (as illustrated in Figure 9). Similarly, Jia et al. (2014) propose to transfer the knowledge of RGB-D (RGB and depth) data to a dataset that has only RGB data; they apply a latent low-rank tensor method to discover the common subspace of the two datasets.

Hybrid Approach

Hu and Yang (2011) assume that the feature spaces, the label spaces, and the underlying distributions all differ between the source and target datasets, and propose to transfer knowledge between different activity recognition tasks by learning a mapping between different sensors. They adopt an idea similar to translated learning (Dai et al., 2009) to find a translator between the different feature spaces using a statistical approach (e.g. JS divergence). Web knowledge is then used to link the different label spaces via self-labelling.

Figure 9. Example of multiple source modalities and one target modality. (Figure used courtesy of Ding et al. (2015b))

6.2. Sequential/Online Labelled Target Data

This problem assumes that the sequential/online target data have a different feature space from the source data; it is named heterogeneous sequential/online transfer learning.

Self Labelling

As mentioned in Section 5.3, Zhao and Hoi (2010) propose the OTL method for online transfer learning. They also consider the case of heterogeneous feature spaces by assuming the source domain feature space to be a subspace of the target domain feature space. A multi-view approach is then proposed that adopts a co-regularization principle to learn two target classifiers online and simultaneously from the two views (the source domain feature space and the new feature space). An unseen target example is classified by the combination of the two target classifiers.

7. Datasets

Table 1 lists the commonly used visual datasets for transfer learning. They are categorised into object recognition, hand-written digit recognition, face recognition, person re-identification, scene categorization, action recognition, and video event detection. In the table, ✓ indicates that the dataset has been evaluated on the corresponding problem, while # indicates a dataset that could potentially be used to evaluate algorithms for the problem although, to our knowledge, no results have been publicly reported. Due to the page limit, readers are referred to the supplementary material and the references for more detailed information on the datasets.

Datasets P3.1 P3.2 P3.3 P3.4 P3.5 P3.6 P3.7 P4.1 P4.2 P4.3 P5.1 P5.2 P5.3 P5.4 P5.5 P6.1 P6.2

Object

Office(Saenko et al., 2010)
Office+Caltech(Gong et al., 2012) # #
Cross-dataset testbed(Tommasi and Tuytelaars, 2014) # # # # #
Office-Home(Venkateswara et al., 2017b) #
VLCS(Khosla et al., 2012) #
ImageCLEF-DA(Caputo and Patricia, 2014) # # #
PACS(Li et al., 2017g) # # #
CIFAR-10 v.s. STL-10(French et al., 2017) # #
RGB-D v.s. Caltech256(Chen et al., 2014) # # # #
Syn Signs v.s. GTSRB(Ganin and Lempitsky, 2015) # #
NUS-WIDE(Chua et al., 2009) # #
Wikipedia dataset(Pereira et al., 2014) # #
Pascal Sentence(Rashtchian et al., 2010) # #
MSCOCO(Lin et al., 2014) # #
aP&Y(Farhadi et al., 2009)
AwA(Lampert et al., 2009)
Caltech-UCSD CUB(Wah et al., 2011)
Caltech-256(Tommasi et al., 2012)
Car over time(Hoffman et al., 2014)
STL-10 dataset(Coates et al., 2011)
LabelMe NUS-WIDE(Wang et al., 2013)
Outdoor scene v.s. Caltech101(Raina et al., 2007)

Digit&Character

MNIST v.s. MNIST-M(Ganin and Lempitsky, 2015) #
MNIST v.s. SVHN(Ganin and Lempitsky, 2015) #
USPS v.s. SVHN(Ganin and Lempitsky, 2015) #
SYN DIGITS v.s. SVHN(Ganin and Lempitsky, 2015) #
Omniglot(Lake et al., 2011)
Digits v.s. English characters(Raina et al., 2007)
English characters v.s. Font characters(Raina et al., 2007)

Face

CMU Multi-PIE(Gross et al., 2010)
CMU-PIE v.s. Yale B(Ding et al., 2015b)
Oulu-CASIA NIR&VIS v.s. BUAA-VisNir(Ding et al., 2015b)
CUHK Face Sketch(Wang and Tang, 2009) #
CASIA NIR-VIS 2.0(Li et al., 2013) # #
ePRIP VIS-Sketch(Mittal et al., 2014) # #

Person

VIPeR(Gray et al., 2007)
CUHK02(Li and Wang, 2013)
PRID(Hirzer et al., 2011)
ILIDS(Zheng, 2009)
CAVIAR(Cheng et al., 2011)
3DPeS(Baltieri et al., 2011)

Scene

CMPlaces(Castrejón et al., 2016) # #
SUN Attribute(Patterson et al., 2014)
Scene over time(Hoffman et al., 2014)
NYUD2(Gupta et al., 2016)

Action

UCF YouTube v.s. HMDB51(Zhu and Shao, 2014)
KTH v.s. UCF YouTube(Ma et al., 2014)
KTH v.s. CareMedia(Ma et al., 2014)
KTH v.s. MSR Action(Cao et al., 2010)
HumanEva v.s. KSA(Ma et al., 2014)
A combination of KTH, Weizmann, UIUC(Liu et al., 2011)
Multiview IXMAS dataset (Weinland et al., 2007)
N-UCLA Multiview Action3D(Wang et al., 2014)
ACT dataset (Cheng et al., 2012; Niu et al., 2015a)
MSR Pair Action 3D v.s. MSR Daily(Jia et al., 2014)
Transferring Activities(Nater et al., 2011)

Event

TRECVID 2005(Yang et al., 2007)
TRECVID 2010&2011(Ma et al., 2012)
TRECVID MED 13(Xu et al., 2017a)
ImageNet v.s. TRECVID 2011(Tang et al., 2012) # #
ImageNet v.s. LabelMe Video(Tang et al., 2012) # #
Table 1. Suitability of the widely used datasets, where ✓ indicates that the dataset has been used for the corresponding problem and # indicates that the dataset can potentially be used for the problem. Problem notations: P3.1, Supervised domain adaptation (DA); P3.2, Semi-supervised DA; P3.3, Unsupervised DA; P3.4, Supervised online DA; P3.5, Semi-supervised online DA; P3.6, Unsupervised online DA; P3.7, Domain generalization; P4.1, Supervised heterogeneous DA; P4.2, Semi-supervised heterogeneous DA; P4.3, Unsupervised heterogeneous DA; P5.1, Few-shot learning; P5.2, Unsupervised transfer learning (TL); P5.3, Online TL; P5.4, Zero-shot learning; P5.5, Self-taught learning; P6.1, Heterogeneous TL; P6.2, Heterogeneous online TL.

8. Challenges and Future Directions

Transfer learning is a promising and important approach to cross-dataset visual recognition and has been extensively studied in the past decades with much success. Figure 1 shows the problem-oriented taxonomy, and the statistics on the number of papers per problem show that most previous works concentrate on a subset of the problems presented in Figure 1. Specifically, only nine of the seventeen problems are relatively well studied, namely those where the source and target domains share at least either their feature spaces or their label spaces, the source domain data are labelled and balanced, and the target domain data are balanced and non-sequential. The remaining eight problems, especially those where the target data are imbalanced or sequential, are much less explored. Such a landscape, together with the fast-advancing deep learning approaches, reveals many challenges and opens many future opportunities for cross-dataset visual recognition, as elaborated below.

8.1. Deep Transfer Learning

As deep learning advances, transfer learning has also shifted from traditional shallow-learning based approaches to deep neural network based approaches. In practice, deep networks for the target task are rarely trained from scratch (i.e. with random initialization), since target datasets rarely have sufficient samples; transfer learning is therefore generally used. Deep models pre-trained on a very large source dataset are used either as an initialization (Yosinski et al., 2014) (the model is then fine-tuned on the target data) or as a fixed feature extractor for the target task of interest (Donahue et al., 2014; Razavian et al., 2014).

Similarly, in deep domain adaptation, the deep models are either used as feature extractors (with shallow domain adaptation methods applied for further adaptation) (Yao et al., 2015; Ghifary et al., 2016a; Courty et al., 2016; Tsai et al., 2016; Zhang et al., 2017a; Koniusz et al., 2017), or used in an end-to-end fashion (i.e. the domain adaptation module is integrated into the deep model) (Tzeng et al., 2015; Long and Wang, 2015; Long et al., 2016; Ganin et al., 2016; Long et al., 2017; Tzeng et al., 2017; Bousmalis et al., 2017). It is still unclear which approach performs better. The advantage of using deep models as feature extractors is the much lower computational cost, since shallow domain adaptation methods are generally much faster than deep learning based methods; another advantage is that many shallow methods have a global optimum. The drawback is that the degree of adaptation in the shallow methods may be insufficient to fully leverage the deeply extracted features. On the other hand, the advantage of integrating an adaptation module into deep models is two-fold: it is end-to-end trainable, and the adaptation can be performed at multiple levels of features. The drawbacks are the computational cost and the local optima. To date, the two approaches have produced similar performance on some datasets (Long et al., 2016; Koniusz et al., 2017; Motiian et al., 2017b; Zhang et al., 2018b), although the end-to-end deep systems involve more parameters and incur higher computational costs. A missing piece in the literature is a systematic study and comparison of the two approaches under the same or similar conditions. For instance, both deep and shallow methods can use the MMD metric between distributions as a constraint in the objective function, so a comparison of the two approaches using the MMD metric could be conducted.
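For reference, the MMD statistic mentioned above can be computed in a few lines; the sketch below gives the biased estimate with an RBF kernel, with the bandwidth parameter chosen arbitrarily and placeholder feature matrices.

```python
# A minimal sketch of the (biased) RBF-kernel MMD between source and
# target features; gamma is an arbitrary bandwidth choice.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def mmd2(Xs, Xt, gamma=1.0):
    # MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    return (rbf_kernel(Xs, Xs, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean()
            - 2 * rbf_kernel(Xs, Xt, gamma).mean())

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 16))  # source features
Xt = rng.normal(0.5, 1.0, size=(100, 16))  # shifted target features
print(mmd2(Xs, Xt))                        # grows with domain shift
```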

The adversarial nets derived from GANs (Goodfellow et al., 2014) are appealing in deep learning based transfer methods. The adversarial loss measures the JS divergence between two sets of data. In practice, the adversarial loss achieves better results and requires smaller batch sizes than the MMD loss (Ganin et al., 2016; Li et al., 2017a). Adversarial nets-based transfer methods have been applied to many transfer learning tasks, such as domain adaptation (Ganin and Lempitsky, 2015; Ganin et al., 2016; Liu and Tuzel, 2016; Tzeng et al., 2017; Bousmalis et al., 2017), partial domain adaptation (Cao et al., 2017; Zhang et al., 2018a), cross-modal transfer (Wang and Gupta, 2016; Li and Wand, 2016; Liu and Tuzel, 2016; Shrivastava et al., 2017; Isola et al., 2017; Yi et al., 2017; Kim et al., 2017; Zhu et al., 2017; Benaim and Wolf, 2017; Li et al., 2017c; Liu et al., 2017), and zero-shot learning (Zhu et al., 2018; Xian et al., 2018). However, some of the drawbacks of GANs may carry over to adversarial nets-based transfer methods, such as the lack of a clear stopping criterion and the difficulty of training.
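The core of the adversarial alignment used by DANN-style methods (Ganin et al., 2016) is a gradient reversal layer placed between the feature extractor and a domain discriminator; the sketch below shows that mechanism with placeholder networks, not a full training pipeline.

```python
# A minimal sketch of gradient reversal for adversarial domain
# alignment: the discriminator learns to separate source from target
# features, while the reversed gradient pushes the feature extractor
# to confuse it. Networks and data are placeholders.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)            # identity in the forward pass

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lamb * grad, None  # reversed (scaled) gradient

feature = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
domain_clf = nn.Linear(16, 1)          # source-vs-target discriminator

xs, xt = torch.randn(64, 32), torch.randn(64, 32)
f = feature(torch.cat([xs, xt]))
logits = domain_clf(GradReverse.apply(f, 1.0)).squeeze(1)
labels = torch.cat([torch.ones(64), torch.zeros(64)])  # 1=source, 0=target
adv_loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
adv_loss.backward()  # discriminator and extractor get opposing gradients
```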

8.2. Partial Domain Adaptation

Partial domain adaptation aims at adapting from a source dataset to an unlabelled target dataset whose label space is known to be a subset of that of the source (Hsu et al., 2015; Cao et al., 2017; Zhang et al., 2018a), or, in a more general and challenging setting, where only a subset of the source and target label spaces overlaps (Panareda Busto and Gall, 2017). The former may be considered a special case of transfer learning between heterogeneous label spaces; a typical and practical example is transferring from a large source dataset with more classes to a small target dataset with fewer classes. The latter is a problem bearing aspects of both domain adaptation and zero-shot learning. Generally, the distribution shift is caused not only by the label space difference but also by the intrinsic divergence of the distributions (i.e. distribution shifts exist even on the classes shared between source and target). Partial domain adaptation is a more realistic setting than conventional unsupervised domain adaptation. Solutions to this problem would expand the applications of domain adaptation and provide a basic mechanism for online transfer learning and adaptation. However, few papers on partial domain adaptation have been published.

8.3. Transfer Learning from Multiple Sources

Multi-source domain adaptation (MSDA) (Sun et al., 2015; Duan et al., 2009; Hoffman et al., 2012; Duan et al., 2012b; Gong et al., 2013b; Xu et al., 2018) refers to adaptation from multiple source domains that have exactly the same label space as the target domain. Intuitively, MSDA methods should obtain superior performance compared to the single-source setting. However, in practice, adaptation from multiple sources generally gives only similar or even worse results than transferring from one of the source domains (though not every one of them) (Jhuo et al., 2012; Shekhar et al., 2013), probably due to negative transfer. In addition, in real-world applications most source data contain multiple unknown latent domains (Hoffman et al., 2012; Gong et al., 2013b). Thus, how to discover latent domains and how to measure domain similarities remain fundamental issues.

A more realistic setting is incomplete multi-source domain adaptation (IMSDA) (Ding et al., 2016; Xu et al., 2018), where each source label space is only a subset of the target label space and the union of the multiple source label spaces covers the target label space. IMSDA is more challenging than MSDA, since the distribution shifts among the sources and the target domain are harder to reduce due to the incompleteness of each source domain. The problem becomes even more challenging as the number of sources increases.

Multiple sources can also be generalised to a target task without any target data, which is referred to as domain generalization (Blanchard et al., 2011; Khosla et al., 2012; Muandet et al., 2013; Fang et al., 2013; Stamos et al., 2015; Ghifary et al., 2015; Ghifary et al., 2016a; Motiian et al., 2017b; Li et al., 2017g). Domain generalization is of practical significance but has been less addressed in previous research. Since no target data are available, domain generalization often has to learn a semantically meaningful model shared across different domains.

8.4. Sequential/Online Transfer Learning

In sequential/online transfer learning (Zhao and Hoi, 2010), the source data may not be fully available when the adaptation or transfer is performed, and/or the target data may arrive sequentially. In addition, the source or even the target data cannot be fully stored and revisited in the future learning process. The adapted model is often required not only to perform well on the new target data but also to maintain its performance on the source data or previously seen data. Such a setting is sometimes known, under certain assumptions, as incremental learning or transfer learning without forgetting (Li and Hoiem, 2016; Lee et al., 2017; Shin et al., 2017). Few studies on this problem have been reported, as shown in Figure 1.

8.5. Data Imbalance

The issue of data imbalance in the target dataset has been largely neglected in previous research. Imbalanced source data can be converted to balanced data by discarding or re-weighting the training (source) data during learning, but the target data can hardly follow such a process, especially when the target data are insufficient. Data imbalance can be another source of distribution divergence between datasets and is ubiquitous in real-world applications. So far, there has been little study of how existing algorithms for cross-dataset recognition perform on imbalanced target data or how the imbalance affects algorithm performance.
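The source-side re-weighting mentioned above is straightforward; a minimal sketch with inverse-frequency sample weights (one simple choice among many) is given below on placeholder data.

```python
# A minimal sketch of re-weighting an imbalanced (source) training set
# with inverse class-frequency sample weights.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = (rng.random(1000) < 0.05).astype(int)      # ~5% minority class

freq = np.bincount(y) / len(y)                 # per-class frequency
sample_weight = 1.0 / freq[y]                  # inverse-frequency weights
clf = LogisticRegression().fit(X, y, sample_weight=sample_weight)
# Equivalent shortcut: LogisticRegression(class_weight="balanced")
```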

8.6. Few-shot and Zero-shot Learning

Few-shot learning and zero-shot learning are interesting and practical sub-problems of transfer learning, which aim to transfer source models efficiently to the target task with only a few (few-shot) or even no (zero-shot) target data. In few-shot learning, the target data are generally scarce (in the extreme case, only one training sample is available for each class). Thus, the standard supervised learning framework cannot provide an effective solution for learning new classes from only a few samples (Fei-Fei et al., 2006; Lake et al., 2011). This challenge is even more pronounced in the deep learning context, which generally relies on larger datasets and suffers from overfitting when data are insufficient (Vinyals et al., 2016; Snell et al., 2017).

Compared to few-shot learning, zero-shot learning does not require any target data. A key challenge in zero-shot learning is the issue of projection domain shift (Fu et al., 2014), which has been neglected by most previous work. Since the source and target categories are disjoint, the projection obtained from the source categories is biased when applied directly to the target categories. For example, both zebra (a source class) and pig (a target class) have the attribute 'hasTail', but the visual appearances of the tails of zebras and pigs are very different (as shown in Figure 8). However, dealing with the projection domain shift problem generally requires unlabelled target data. Thus, further exploration of new solutions to reduce the projection domain shift would benefit effective zero-shot learning. Another future direction is the exploration of more high-level semantic spaces for connecting seen and unseen classes. The most frequently used high-level semantics are manually annotated attributes or text descriptions. Some recent work (Frome et al., 2013; Mikolov et al., 2013; Socher et al., 2013; Lei Ba et al., 2015) employs word vectors as the semantic space without relying on human annotation, but the performance of zero-shot learning using word vectors is generally poorer than that using manually labelled attributes.

A recent work (Xian et al., 2017) presents a comprehensive analysis of the recent advances in zero-shot learning. It critically compares and analyses the state-of-the-art methods and unifies the training/test data splits as well as the evaluation protocols for zero-shot learning. Its evaluation protocol emphasizes generalized zero-shot learning, which is considered more realistic and challenging: traditional zero-shot learning generally assumes that the training categories do not appear at test time, whereas the generalized zero-shot setting relaxes this assumption to the case where both seen and unseen categories are present at test time. This provides standard evaluation protocols and data splits for fair comparison and realistic evaluation in the future.

8.7. Cross-modal Recognition

Cross-modal transfer, a sub-problem of heterogeneous domain adaptation and heterogeneous transfer learning as shown in Figure 1, refers to transfer between different data modalities (e.g. text v.s. image, image v.s. video, RGB v.s. depth). Compared to cross-modal retrieval (Wang et al., 2017) and translation (Isola et al., 2017), fewer works are dedicated to cross-modal recognition through adaptation or transfer learning. Recognition across data modalities is ubiquitous in real-world applications. For instance, depth images acquired by recently released depth cameras are much rarer than RGB images; effectively using the rich and massive labelled RGB images to help the recognition of depth images can reduce the extensive effort of data collection and annotation. Some preliminary works can be found in (Jia et al., 2014; Gupta et al., 2016; Li et al., 2017c, b; Wang et al., 2018).

8.8. Transfer Learning from Weakly Labelled Web Data

Data on the Internet are generally weakly labelled. Textual information (e.g. captions, user tags, or descriptions) can easily be obtained from the web as additional meta-information for visual data. Thus, effectively adapting the visual representations learned from weakly labelled data (e.g. web data) or from co-existing data of other modalities to new tasks is interesting and practically important. A recent work releases a large-scale weakly labelled web image dataset, WebVision (Li et al., 2017e).

8.9. Self-taught Learning

A natural assumption in most of the literature is that the source data are extensive and labelled. This may be because the source data are generally treated as auxiliary data for instructing the target task, and unlabelled source data could be unrelated and may lead to negative transfer. However, some research argues that redundant unlabelled source data can still be a treasure, providing a good starting point for the parameters of the target task, as mentioned in Section 5.5. How to effectively leverage the massively available unlabelled source data to improve transfer learning approaches is an interesting open problem.

8.10. Large Scale and Versatile Datasets for Transfer Learning

The development of algorithms usually depends heavily on the datasets available for evaluation. Most current visual datasets for cross-dataset recognition are small in terms of either the number of classes or the number of samples, and they are especially unsuitable for evaluating deep learning algorithms. Establishing a truly large-scale, versatile (i.e. suitable for different problems), and realistic dataset would drive the research a significant step forward. As is well known, the creation of a large-scale dataset can be unaffordably expensive; combining and re-targeting existing datasets can be an effective and economical alternative, as demonstrated in (Zhang et al., 2016b). As shown in Table 1, few visual recognition datasets are designed for online transfer learning (e.g. P3.5, P3.6, P5.3, and P6.2). Most current online transfer learning work deals with detection tasks (Xu et al., 2014c) or text recognition tasks (Zhao and Hoi, 2010). To advance transfer learning approaches towards broader and more realistic applications, it is essential to create large-scale datasets for online transfer learning.

9. Conclusion

Transfer learning from previous data for current tasks has a wide range of real-world applications, and many transfer learning algorithms for cross-dataset visual recognition have been developed in the last decade, as reviewed in this paper. A key question that often puzzles a practitioner or a researcher is which algorithm should be adopted for a given task. This paper intends to answer that question by providing a problem-oriented taxonomy of transfer learning for cross-dataset recognition and a comprehensive survey of recently developed algorithms with respect to the taxonomy. Specifically, we believe the choice of an algorithm for a given target task should be guided by the attributes of both the source and target datasets, and the problem-oriented taxonomy offers an easy way to look up the problem and the methods likely to solve it. In addition, the taxonomy has also shown that many challenging problems in transfer learning for visual recognition have not been well studied; research is likely to focus on these problems in the future.

Though it is impossible for this survey to cover all the published papers on this topic, the selected works well represent the recent advances, and the in-depth analysis of these works has revealed future research directions in transfer learning for cross-dataset visual recognition.

Acknowledgements.
This work is partially supported by the Australian Research Council Future Fellowship under Grant FT180100116.

References

  • Akata et al. (2016) Zeynep Akata, Mateusz Malinowski, Mario Fritz, and Bernt Schiele. 2016. Multi-cue zero-shot learning with strong supervision. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 59–68.
  • Akata et al. (2013) Zeynep Akata, Florent Perronnin, Zaid Harchaoui, and Cordelia Schmid. 2013. Label-embedding for attribute-based classification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 819–826.
  • Akata et al. (2015) Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, and Bernt Schiele. 2015. Evaluation of output embeddings for fine-grained image classification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 2927–2936.
  • Aljundi et al. (2015) Rahaf Aljundi, Rémi Emonet, Damien Muselet, and Marc Sebban. 2015. Landmarks-based Kernelized Subspace Alignment for Unsupervised Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 56–63.
  • Anderson (1984) Theodore Wilbur Anderson. 1984. An introduction to multivariate statistical analysis. Vol. 2.
  • Andrew et al. (2013) Galen Andrew, Raman Arora, Jeff Bilmes, and Karen Livescu. 2013. Deep canonical correlation analysis. In Proc. International Conference on Machine Learning. 1247–1255.
  • Aytar and Zisserman (2011) Yusuf Aytar and Andrew Zisserman. 2011. Tabula rasa: Model transfer for object category detection. In Proc. IEEE International Conference on Computer Vision. IEEE, 2252–2259.
  • Bahdanau et al. (2015) Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proc. International Conference on Learning Representations.
  • Baktashmotlagh et al. (2013) Mahsa Baktashmotlagh, Mehrtash T Harandi, Brian C Lovell, and Mathieu Salzmann. 2013. Unsupervised domain adaptation by domain invariant projection. In Proc. IEEE International Conference on Computer Vision. IEEE, 769–776.
  • Baktashmotlagh et al. (2014) Mahsa Baktashmotlagh, Mehrtash T Harandi, Brian C Lovell, and Mathieu Salzmann. 2014. Domain adaptation on the statistical manifold. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2481–2488.
  • Baltieri et al. (2011) Davide Baltieri, Roberto Vezzani, and Rita Cucchiara. 2011. 3dpes: 3d people dataset for surveillance and forensics. In Proc. joint ACM workshop on Human gesture and behavior understanding. ACM, 59–64.
  • Beijbom (2012) Oscar Beijbom. 2012. Domain adaptations for computer vision applications. Technical Report. University of California San Diego.
  • Benaim and Wolf (2017) Sagie Benaim and Lior Wolf. 2017. One-Sided Unsupervised Domain Mapping. In Advances in Neural Information Processing Systems.
  • Bengio (2012) Yoshua Bengio. 2012. Deep Learning of Representations for Unsupervised and Transfer Learning. Unsupervised and Transfer Learning Challenges in Machine Learning, Volume 7 (2012), 19.
  • Bitarafan et al. (2016) Adeleh Bitarafan, Mahdieh Soleymani Baghshah, and Marzieh Gheisari. 2016. Incremental Evolving Domain Adaptation. IEEE Transactions on Knowledge and Data Engineering 28, 8 (Aug 2016), 2128–2141.
  • Blanchard et al. (2011) Gilles Blanchard, Gyemin Lee, and Clayton Scott. 2011. Generalizing from several related classification tasks to a new unlabeled sample. In Proc. Advances in Neural Information Processing Systems. 2178–2186.
  • Bousmalis et al. (2017) Konstantinos Bousmalis, Nathan Silberman, David Dohan, Dumitru Erhan, and Dilip Krishnan. 2017. Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Bousmalis et al. (2016) Konstantinos Bousmalis, George Trigeorgis, Nathan Silberman, Dilip Krishnan, and Dumitru Erhan. 2016. Domain separation networks. In Advances in Neural Information Processing Systems. 343–351.
  • Bucher et al. (2016) Maxime Bucher, Stéphane Herbin, and Frédéric Jurie. 2016. Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classiffication. In Proc. European Conference on Computer Vision. Springer, 730–746.
  • Cao et al. (2010) Liangliang Cao, Zicheng Liu, and Thomas S Huang. 2010. Cross-dataset action detection. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1998–2005.
  • Cao et al. (2017) Zhangjie Cao, Mingsheng Long, Jianmin Wang, and Michael I Jordan. 2017. Partial Transfer Learning with Selective Adversarial Networks. arXiv preprint arXiv:1707.07901 (2017).
  • Caputo and Patricia (2014) Barbara Caputo and Novi Patricia. 2014. Overview of the imageclef 2014 domain adaptation task. In ImageCLEF 2014: Overview and analysis of the results.
  • Castrejón et al. (2016) Lluís Castrejón, Yusuf Aytar, Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba. 2016. Learning Aligned Cross-Modal Representations from Weakly Aligned Data. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2940–2949.
  • Chen and Grauman (2014) Chao-Yeh Chen and Kristen Grauman. 2014. Inferring analogous attributes. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 200–207.
  • Chen et al. (2014) Lin Chen, Wen Li, and Dong Xu. 2014. Recognizing RGB images by learning from RGB-D data. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1418–1425.
  • Chen et al. (2011) Minmin Chen, Kilian Q Weinberger, and John Blitzer. 2011. Co-training for domain adaptation. In Proc. Advances in Neural Information Processing Systems. 2456–2464.
  • Chen et al. (2012) Minmin Chen, Zhixiang Xu, Fei Sha, and Kilian Q Weinberger. 2012. Marginalized Denoising Autoencoders for Domain Adaptation. In Proc. International Conference on Machine Learning. 767–774.
  • Chen et al. (2015) Zhiyuan Chen, Nianzu Ma, and Bing Liu. 2015. Lifelong learning for sentiment classification. In Association for Computational Linguistics.
  • Cheng et al. (2011) Dong Seon Cheng, Marco Cristani, Michele Stoppa, Loris Bazzani, and Vittorio Murino. 2011. Custom Pictorial Structures for Re-identification. Proc. British Machine Vision Conference.
  • Cheng et al. (2012) Zhongwei Cheng, Lei Qin, Yituo Ye, Qingming Huang, and Qi Tian. 2012. Human daily action analysis with multi-view and color-depth data. In Proc. European Conference on Computer Vision. Springer, 52–61.
  • Cho et al. (2014) Kyunghyun Cho, Bart van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches. Syntax, Semantics and Structure in Statistical Translation (2014), 103.
  • Chopra et al. (2013) Sumit Chopra, Suhrid Balakrishnan, and Raghuraman Gopalan. 2013. DLID: Deep learning for domain adaptation by interpolating between domains. In Proc. ICML Workshop on Challenges in Representation Learning. Citeseer.
  • Chua et al. (2009) Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, and Yantao Zheng. 2009. NUS-WIDE: a real-world web image database from National University of Singapore. In Proc. ACM international conference on image and video retrieval. ACM, 48.
  • Coates et al. (2011) Adam Coates, Andrew Ng, and Honglak Lee. 2011. An analysis of single-layer networks in unsupervised feature learning. In Proc. international conference on artificial intelligence and statistics. 215–223.
  • Cook et al. (2013) Diane Cook, Kyle D Feuz, and Narayanan C Krishnan. 2013. Transfer learning for activity recognition: A survey. Knowledge and information systems 36, 3 (2013), 537–556.
  • Courty et al. (2016) Nicolas Courty, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. 2016. Optimal transport for Domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence (2016).
  • Csurka (2017) Gabriela Csurka. 2017. A comprehensive survey on domain adaptation for visual applications. In Domain Adaptation in Computer Vision Applications. Springer, 1–35.
  • Csurka et al. (2014) Gabriela Csurka, Boris Chidlovskii, and Florent Perronnin. 2014. Domain adaptation with a domain specific class means classifier. In Proc. European Conference on Computer Vision Workshops. Springer, 32–46.
  • Cui et al. (2014a) Zhen Cui, Hong Chang, Shiguang Shan, and Xilin Chen. 2014a. Generalized unsupervised manifold alignment. In Proc. Advances in Neural Information Processing Systems. 2429–2437.
  • Cui et al. (2014b) Zhen Cui, Wen Li, Dong Xu, Shiguang Shan, Xilin Chen, and Xuelong Li. 2014b. Flowing on Riemannian manifold: Domain adaptation by shifting covariance. IEEE Transactions on Cybernetics 44, 12 (2014), 2264–2273.
  • Dai et al. (2009) Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. 2009. Translated learning: Transfer learning across different feature spaces. In Proc. Advances in Neural Information Processing Systems. 353–360.
  • Dai et al. (2007a) Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. 2007a. Transferring naive bayes classifiers for text classification. In Proc. AAAI Conference on Artificial Intelligence, Vol. 22. 540.
  • Dai et al. (2007b) Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. 2007b. Boosting for transfer learning. In Proc. International Conference on Machine Learning. ACM, 193–200.
  • Daumé III (2007) Hal Daumé III. 2007. Frustratingly easy domain adaptation. In Proc. Annual Meeting of the Association of Computational Linguistics. 256–263.
  • Deng et al. (2009) Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 248–255.
  • Diethe et al. (2008) Tom Diethe, David R Hardoon, and John Shawe-Taylor. 2008. Multiview Fisher discriminant analysis. In NIPS Workshop on Learning from Multiple Sources. Citeseer.
  • Ding et al. (2015a) Zhengming Ding, Ming Shao, and Yun Fu. 2015a. Deep Low-Rank Coding for Transfer Learning. In Proc. International Joint Conference on Artificial Intelligence. 3453–3459.
  • Ding et al. (2015b) Zhengming Ding, Ming Shao, and Yun Fu. 2015b. Missing Modality Transfer Learning via Latent Low-Rank Constraint. IEEE Transactions on Image Processing 24, 11 (2015), 4322–4334.
  • Ding et al. (2016) Zhengming Ding, Ming Shao, and Yun Fu. 2016. Transfer learning for image classification with incomplete multiple sources. In Proc. International Joint Conference on Neural Networks. IEEE, 2188–2195.
  • Donahue et al. (2014) Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. 2014. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. In Proc. International Conference on Machine Learning. 647–655.
  • Du et al. (2017) Simon S Du, Jayanth Koushik, Aarti Singh, and Barnabas Poczos. 2017. Hypothesis Transfer Learning via Transformation Functions. In Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). 574–584.
  • Duan et al. (2012a) Lixin Duan, Ivor W Tsang, and Dong Xu. 2012a. Domain transfer multiple kernel learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 3 (2012), 465–479.
  • Duan et al. (2009) Lixin Duan, Ivor W Tsang, Dong Xu, and Tat-Seng Chua. 2009. Domain adaptation from multiple sources via auxiliary classifiers. In Proc. International Conference on Machine Learning. ACM, 289–296.
  • Duan et al. (2012d) Lixin Duan, Dong Xu, IW-H Tsang, and Jiebo Luo. 2012d. Visual event recognition in videos by learning from web data. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 9 (2012), 1667–1680.
  • Duan et al. (2012b) Lixin Duan, Dong Xu, and Ivor W Tsang. 2012b. Domain adaptation from multiple sources: A domain-dependent regularization approach. IEEE Transactions on Neural Networks and Learning Systems 23, 3 (2012), 504–518.
  • Duan et al. (2012c) Lixin Duan, Dong Xu, and Ivor W Tsang. 2012c. Learning with Augmented Features for Heterogeneous Domain Adaptation. In Proc. International Conference on Machine Learning. 711–718.
  • Eigen and Fergus (2015) David Eigen and Rob Fergus. 2015. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In Proc. IEEE International Conference on Computer Vision. 2650–2658.
  • Fang et al. (2013) Chen Fang, Ye Xu, and Daniel Rockmore. 2013. Unbiased metric learning: On the utilization of multiple datasets and web images for softening bias. In Proc. IEEE International Conference on Computer Vision. IEEE, 1657–1664.
  • Farhadi et al. (2009) Ali Farhadi, Ian Endres, Derek Hoiem, and David Forsyth. 2009. Describing objects by their attributes. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1778–1785.
  • Fei-Fei et al. (2006) Li Fei-Fei, Rob Fergus, and Pietro Perona. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence 28, 4 (2006), 594–611.
  • Feng et al. (2014) Fangxiang Feng, Xiaojie Wang, and Ruifan Li. 2014. Cross-modal retrieval with correspondence autoencoder. In Proc. ACM international conference on Multimedia. ACM, 7–16.
  • Fernando et al. (2013) Basura Fernando, Amaury Habrard, Marc Sebban, and Tinne Tuytelaars. 2013. Unsupervised visual domain adaptation using subspace alignment. In Proc. IEEE International Conference on Computer Vision. IEEE, 2960–2967.
  • French et al. (2017) Geoffrey French, Michal Mackiewicz, and Mark Fisher. 2017. Self-ensembling for domain adaptation. arXiv preprint arXiv:1706.05208 (2017).
  • Frome et al. (2013) Andrea Frome, Greg S Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Tomas Mikolov, et al. 2013. Devise: A deep visual-semantic embedding model. In Advances in neural information processing systems. 2121–2129.
  • Fu et al. (2014) Yanwei Fu, Timothy M Hospedales, Tao Xiang, Zhenyong Fu, and Shaogang Gong. 2014. Transductive multi-view embedding for zero-shot recognition and annotation. In European Conference on Computer Vision. Springer, 584–599.
  • Fu et al. (2015) Yanwei Fu, Timothy M Hospedales, Tao Xiang, and Shaogang Gong. 2015. Transductive multi-view zero-shot learning. IEEE transactions on pattern analysis and machine intelligence 37, 11 (2015), 2332–2345.
  • Gama et al. (2014) João Gama, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. 2014. A survey on concept drift adaptation. Comput. Surveys 46, 4 (2014), 44.
  • Gan et al. (2016) Chuang Gan, Tianbao Yang, and Boqing Gong. 2016. Learning Attributes Equals Multi-Source Domain Generalization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 87–97.
  • Gan et al. (2014) Junying Gan, Lichen Li, Yikui Zhai, and Yinhua Liu. 2014. Deep self-taught learning for facial beauty prediction. Neurocomputing 144 (2014), 295–303.
  • Ganin and Lempitsky (2015) Yaroslav Ganin and Victor Lempitsky. 2015. Unsupervised Domain Adaptation by Backpropagation. In Proc. International Conference on Machine Learning. 1180–1189.
  • Ganin et al. (2016) Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. 2016. Domain-adversarial training of neural networks. Journal of Machine Learning Research 17, 59 (2016), 1–35.
  • Gatys et al. (2016) Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 2414–2423.
  • Ghifary et al. (2016a) Muhammad Ghifary, David Balduzzi, W Bastiaan Kleijn, and Mengjie Zhang. 2016a. Scatter Component Analysis: A Unified Framework for Domain Adaptation and Domain Generalization. IEEE Transactions on Pattern Analysis and Machine Intelligence PP, 99 (2016), 1–1.
  • Ghifary et al. (2015) Muhammad Ghifary, W Bastiaan Kleijn, Mengjie Zhang, and David Balduzzi. 2015. Domain Generalization for Object Recognition with Multi-task Autoencoders. In Proc. IEEE International Conference on Computer Vision. IEEE, 2551–2559.
  • Ghifary et al. (2016b) Muhammad Ghifary, W Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, and Wen Li. 2016b. Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation. In European Conference on Computer Vision. Springer, 597–613.
  • Gholami et al. (2017) Behnam Gholami, Ognjen (Oggi) Rudovic, and Vladimir Pavlovic. 2017. PUnDA: Probabilistic Unsupervised Domain Adaptation for Knowledge Transfer Across Visual Categories. In Proc. IEEE International Conference on Computer Vision.
  • Glorot et al. (2011) Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proc. International Conference on Machine Learning. 513–520.
  • Gong et al. (2013a) Boqing Gong, Kristen Grauman, and Fei Sha. 2013a. Connecting the dots with landmarks: Discriminatively learning domain-invariant features for unsupervised domain adaptation. In Proc. International Conference on Machine Learning. 222–230.
  • Gong et al. (2013b) Boqing Gong, Kristen Grauman, and Fei Sha. 2013b. Reshaping visual datasets for domain adaptation. In Proc. Advances in Neural Information Processing Systems. 1286–1294.
  • Gong et al. (2014a) Boqing Gong, Kristen Grauman, and Fei Sha. 2014a. Learning kernels for unsupervised domain adaptation with applications to visual object recognition. International Journal of Computer Vision 109, 1-2 (2014), 3–27.
  • Gong et al. (2012) Boqing Gong, Yuan Shi, Fei Sha, and Kristen Grauman. 2012. Geodesic flow kernel for unsupervised domain adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2066–2073.
  • Gong et al. (2016) Mingming Gong, Kun Zhang, Tongliang Liu, Dacheng Tao, Clark Glymour, and Bernhard Schölkopf. 2016. Domain adaptation with conditional transferable components. In Proc. International Conference on Machine Learning. 2839–2848.
  • Gong et al. (2014b) Yunchao Gong, Qifa Ke, Michael Isard, and Svetlana Lazebnik. 2014b. A multi-view embedding space for modeling internet images, tags, and their semantics. International journal of computer vision 106, 2 (2014), 210–233.
  • Goodfellow et al. (2014) Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672–2680.
  • Gopalan et al. (2011) Raghuraman Gopalan, Ruonan Li, and Rama Chellappa. 2011. Domain adaptation for object recognition: An unsupervised approach. In Proc. IEEE International Conference on Computer Vision. IEEE, 999–1006.
  • Gopalan et al. (2014) Raghavan Gopalan, Ruonan Li, and Rama Chellappa. 2014. Unsupervised adaptation across domain shifts by generating intermediate data representations. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 11 (2014), 2288–2302.
  • Gray et al. (2007) Douglas Gray, Shane Brennan, and Hai Tao. 2007. Evaluating appearance models for recognition, reacquisition, and tracking. In Proc. IEEE International Workshop on Performance Evaluation for Tracking and Surveillance, Vol. 3. 1–7.
  • Gross et al. (2010) Ralph Gross, Iain Matthews, Jeffrey Cohn, Takeo Kanade, and Simon Baker. 2010. Multi-pie. Image and Vision Computing 28, 5 (2010), 807–813.
  • Gupta et al. (2016) Saurabh Gupta, Judy Hoffman, and Jitendra Malik. 2016. Cross modal distillation for supervision transfer. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 2827–2836.
  • Hirzer et al. (2011) Martin Hirzer, Csaba Beleznai, Peter M Roth, and Horst Bischof. 2011. Person re-identification by descriptive and discriminative classification. In Proc. Scandinavian conference on Image analysis. Springer, 91–102.
  • Hochreiter and Schmidhuber (1997) Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
  • Hoffman et al. (2014) Judy Hoffman, Trevor Darrell, and Kate Saenko. 2014. Continuous manifold based adaptation for evolving visual domains. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 867–874.
  • Hoffman et al. (2012) Judy Hoffman, Brian Kulis, Trevor Darrell, and Kate Saenko. 2012. Discovering latent domains for multisource domain adaptation. In Proc. European Conference on Computer Vision. Springer, 702–715.
  • Hsu et al. (2015) Tzu Ming Harry Hsu, Wei Yu Chen, Cheng-An Hou, Yao-Hung Hubert Tsai, Yi-Ren Yeh, and Yu-Chiang Frank Wang. 2015. Unsupervised Domain Adaptation With Imbalanced Cross-Domain Data. In Proc. IEEE International Conference on Computer Vision. IEEE, 4121–4129.
  • Hu and Yang (2011) Derek Hao Hu and Qiang Yang. 2011. Transfer learning for activity recognition via sensor mapping. In Proc. International Joint Conference on Artificial Intelligence, Vol. 22. 1962–1967.
  • Huang and Wang (2013) De-An Huang and Yu-Chiang Frank Wang. 2013. Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In Proc. IEEE International Conference on Computer Vision. IEEE, 2496–2503.
  • Huang et al. (2006) Jiayuan Huang, Arthur Gretton, Karsten M Borgwardt, Bernhard Schölkopf, and Alex J Smola. 2006. Correcting sample selection bias by unlabeled data. In Proc. Advances in Neural Information Processing Systems. 601–608.
  • Isola et al. (2017) Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Jain and Learned-Miller (2011) Vidit Jain and Erik Learned-Miller. 2011. Online domain adaptation of a pre-trained cascade of classifiers. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 577–584.
  • Japkowicz and Stephen (2002) Nathalie Japkowicz and Shaju Stephen. 2002. The class imbalance problem: A systematic study. Intelligent Data Analysis 6, 5 (2002), 429–449.
  • Jayaraman and Grauman (2014) Dinesh Jayaraman and Kristen Grauman. 2014. Zero-shot recognition with unreliable attributes. In Proc. Advances in Neural Information Processing Systems. 3464–3472.
  • Jhuo et al. (2012) I-Hong Jhuo, Dong Liu, DT Lee, Shih-Fu Chang, et al. 2012. Robust visual domain adaptation with low-rank reconstruction. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2168–2175.
  • Jia et al. (2014) Chengcheng Jia, Yu Kong, Zhengming Ding, and Yun Raymond Fu. 2014. Latent tensor transfer learning for RGB-D action recognition. In Proc. ACM International Conference on Multimedia. ACM, 87–96.
  • Jiang et al. (2016) Wenhao Jiang, Hongchang Gao, Fu-lai Chung, and Heng Huang. 2016. The l2, 1-Norm Stacked Robust Autoencoders for Domain Adaptation. In Proc. AAAI Conference on Artificial Intelligence.
  • Jiang et al. (2008) Wei Jiang, Eric Zavesky, Shih-Fu Chang, and Alex Loui. 2008. Cross-domain learning methods for high-level visual concept classification. In Proc. IEEE International Conference on Image Processing. IEEE, 161–164.
  • Jie et al. (2011) Luo Jie, Tatiana Tommasi, and Barbara Caputo. 2011. Multiclass transfer learning from unconstrained priors. In Proc. IEEE International Conference on Computer Vision. IEEE, 1863–1870.
  • Johnson et al. (2016) Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. In European Conference on Computer Vision.
  • Kalchbrenner and Blunsom (2013) Nal Kalchbrenner and Phil Blunsom. 2013. Recurrent Continuous Translation Models. In Proc. Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics.
  • Kan et al. (2015) Meina Kan, Shiguang Shan, and Xilin Chen. 2015. Bi-shifting Auto-Encoder for Unsupervised Domain Adaptation. In Proc. IEEE International Conference on Computer Vision. IEEE, 3846–3854.
  • Kan et al. (2012) Meina Kan, Shiguang Shan, Haihong Zhang, Shihong Lao, and Xilin Chen. 2012. Multi-view discriminant analysis. In Proc. European Conference on Computer Vision. Springer, 808–821.
  • Khosla et al. (2012) Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A Efros, and Antonio Torralba. 2012. Undoing the damage of dataset bias. In Proc. European Conference on Computer Vision. Springer, 158–171.
  • Kim et al. (2017) Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, and Jiwon Kim. 2017. Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. In Proc. International Conference on Machine Learning, Doina Precup and Yee Whye Teh (Eds.), Vol. 70. International Convention Centre, Sydney, Australia, 1857–1865.
  • Koch et al. (2015) Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. 2015. Siamese Neural Networks for One-shot Image Recognition. In Proc. ICML Deep Learning Workshop.
  • Kodirov et al. (2015) Elyor Kodirov, Tao Xiang, Zhenyong Fu, and Shaogang Gong. 2015. Unsupervised domain adaptation for zero-shot learning. In Proc. IEEE International Conference on Computer Vision. IEEE, 2452–2460.
  • Koehn et al. (2003) Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proc. Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1. Association for Computational Linguistics, 48–54.
  • Koniusz et al. (2017) Piotr Koniusz, Yusuf Tas, and Fatih Porikli. 2017. Domain Adaptation by Mixture of Alignments of Second- or Higher-Order Scatter Tensors. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Kuen et al. (2015) Jason Kuen, Kian Ming Lim, and Chin Poo Lee. 2015. Self-taught learning of a deep invariant representation for visual tracking via temporal slowness principle. Pattern Recognition 48, 10 (2015), 2964–2982.
  • Kulis et al. (2011) Brian Kulis, Kate Saenko, and Trevor Darrell. 2011. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1785–1792.
  • Kumagai (2016) Wataru Kumagai. 2016. Learning Bound for Parameter Transfer Learning. In Advances in Neural Information Processing Systems. 2721–2729.
  • Kumar et al. (2009) Neeraj Kumar, Alexander C Berg, Peter N Belhumeur, and Shree K Nayar. 2009. Attribute and simile classifiers for face verification. In Proc. IEEE International Conference on Computer Vision. IEEE, 365–372.
  • Kuzborskij and Orabona (2013) Ilja Kuzborskij and Francesco Orabona. 2013. Stability and Hypothesis Transfer Learning. In Proc. International Conference on Machine Learning. 942–950.
  • Laffont et al. (2014) Pierre-Yves Laffont, Zhile Ren, Xiaofeng Tao, Chao Qian, and James Hays. 2014. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics 33, 4 (2014), 149.
  • Lake et al. (2011) Brenden Lake, Ruslan Salakhutdinov, Jason Gross, and Joshua Tenenbaum. 2011. One shot learning of simple visual concepts. In Proc. Cognitive Science Society, Vol. 33.
  • Lampert et al. (2009) Christoph H Lampert, Hannes Nickisch, and Stefan Harmeling. 2009. Learning to detect unseen object classes by between-class attribute transfer. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 951–958.
  • Le et al. (2011) Quoc V Le, Will Y Zou, Serena Y Yeung, and Andrew Y Ng. 2011. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3361–3368.
  • Lee et al. (2009) Honglak Lee, Rajat Raina, Alex Teichman, and Andrew Y Ng. 2009. Exponential family sparse coding with applications to self-taught learning. In Proc. International Joint Conference on Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1113–1119.
  • Lee et al. (2017) Sang-Woo Lee, Jin-Hwa Kim, Jaehyun Jun, Jung-Woo Ha, and Byoung-Tak Zhang. 2017. Overcoming catastrophic forgetting by incremental moment matching. In Advances in Neural Information Processing Systems. 4655–4665.
  • Lei Ba et al. (2015) Jimmy Lei Ba, Kevin Swersky, Sanja Fidler, et al. 2015. Predicting deep zero-shot convolutional neural networks using textual descriptions. In Proc. IEEE International Conference on Computer Vision. 4247–4255.
  • Li et al. (2017c) Chunyuan Li, Hao Liu, Changyou Chen, Yuchen Pu, Liqun Chen, Ricardo Henao, and Lawrence Carin. 2017c. Alice: Towards understanding adversarial learning for joint distribution matching. In Advances in Neural Information Processing Systems. 5501–5509.
  • Li and Wand (2016) Chuan Li and Michael Wand. 2016. Precomputed real-time texture synthesis with Markovian generative adversarial networks. In European Conference on Computer Vision.
  • Li et al. (2017a) Chun-Liang Li, Wei-Cheng Chang, Yu Cheng, Yiming Yang, and Barnabás Póczos. 2017a. MMD GAN: Towards deeper understanding of moment matching network. In Advances in Neural Information Processing Systems. 2200–2210.
  • Li et al. (2017g) Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy M Hospedales. 2017g. Deeper, Broader and Artier Domain Generalization. In Proc. IEEE International Conference on Computer Vision. 5542–5550.
  • Li et al. (2013) Stan Z Li, Dong Yi, Zhen Lei, and Shengcai Liao. 2013. The CASIA NIR-VIS 2.0 face database. In Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops. IEEE, 348–353.
  • Li et al. (2017b) Wen Li, Lin Chen, Dong Xu, and Luc Van Gool. 2017b. Visual Recognition in RGB Images and Videos by Learning from RGB-D Data. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017).
  • Li et al. (2014) Wen Li, Lixin Duan, Dong Xu, and Ivor W Tsang. 2014. Learning with augmented features for supervised and semi-supervised heterogeneous domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 6 (2014), 1134–1148.
  • Li et al. (2017e) Wen Li, Limin Wang, Wei Li, Eirikur Agustsson, and Luc Van Gool. 2017e. WebVision Database: Visual Learning and Understanding from Web Data. arXiv preprint arXiv:1708.02862 (2017).
  • Li and Wang (2013) Wei Li and Xiaogang Wang. 2013. Locally aligned feature transforms across views. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 3594–3601.
  • Li et al. (2017f) Wen Li, Zheng Xu, Dong Xu, Dengxin Dai, and Luc Van Gool. 2017f. Domain generalization and adaptation using low rank exemplar SVMs. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017).
  • Li et al. (2015) Xin Li, Yuhong Guo, and Dale Schuurmans. 2015. Semi-supervised zero-shot classification with label representation learning. In Proc. IEEE International Conference on Computer Vision. 4211–4219.
  • Li et al. (2017d) Yanan Li, Donghui Wang, Huanhang Hu, Yuetan Lin, and Yueting Zhuang. 2017d. Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths. Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Li and Hoiem (2016) Zhizhong Li and Derek Hoiem. 2016. Learning without forgetting. In European Conference on Computer Vision. Springer, 614–629.
  • Lin et al. (2014) Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In European Conference on Computer Vision. Springer, 740–755.
  • Liu et al. (2011) Jingen Liu, Benjamin Kuipers, and Silvio Savarese. 2011. Recognizing human actions by attributes. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3337–3344.
  • Liu et al. (2017) Ming-Yu Liu, Thomas Breuel, and Jan Kautz. 2017. Unsupervised Image-to-Image Translation Networks. In Advances in Neural Information Processing Systems.
  • Liu and Tuzel (2016) Ming-Yu Liu and Oncel Tuzel. 2016. Coupled generative adversarial networks. In Advances in Neural Information Processing Systems. 469–477.
  • Long et al. (2015) Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 3431–3440.
  • Long et al. (2013a) Mingsheng Long, Guiguang Ding, Jianmin Wang, Jiaguang Sun, Yuchen Guo, and Philip S Yu. 2013a. Transfer sparse coding for robust image representation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 407–414.
  • Long and Wang (2015) Mingsheng Long and Jianmin Wang. 2015. Learning Transferable Features with Deep Adaptation Networks. In Proc. International Conference on Machine Learning. 97–105.
  • Long et al. (2014) Mingsheng Long, Jianmin Wang, Guiguang Ding, Sinno Jialin Pan, and Philip S Yu. 2014. Adaptation regularization: A general framework for transfer learning. IEEE Transactions on Knowledge and Data Engineering 26, 5 (2014), 1076–1089.
  • Long et al. (2013b) Mingsheng Long, Jianmin Wang, Guiguang Ding, Jiaguang Sun, and Philip S Yu. 2013b. Transfer feature learning with joint distribution adaptation. In Proc. IEEE International Conference on Computer Vision. IEEE, 2200–2207.
  • Long et al. (2017) Mingsheng Long, Jianmin Wang, and Michael I Jordan. 2017. Deep transfer learning with joint adaptation networks. In Proc. International Conference on Machine Learning.
  • Long et al. (2016) Mingsheng Long, Han Zhu, Jianmin Wang, and Michael I Jordan. 2016. Unsupervised domain adaptation with residual transfer networks. In Advances in Neural Information Processing Systems. 136–144.
  • Lu et al. (2017) Hao Lu, Lei Zhang, Zhiguo Cao, Wei Wei, Ke Xian, Chunhua Shen, and Anton van den Hengel. 2017. When Unsupervised Domain Adaptation Meets Tensor Representations. In Proc. IEEE International Conference on Computer Vision.
  • Lu et al. (2015) Jie Lu, Vahid Behbood, Peng Hao, Hua Zuo, Shan Xue, and Guangquan Zhang. 2015. Transfer learning using computational intelligence: a survey. Knowledge-Based Systems 80 (2015), 14–23.
  • Ma et al. (2012) Zhigang Ma, Yi Yang, Yang Cai, Nicu Sebe, and Alexander G Hauptmann. 2012. Knowledge adaptation for ad hoc multimedia event detection with few exemplars. In Proc. ACM International Conference on Multimedia. ACM, 469–478.
  • Ma et al. (2014) Zhigang Ma, Yi Yang, Feiping Nie, Nicu Sebe, Shuicheng Yan, and Alexander G Hauptmann. 2014. Harnessing lab knowledge for real-world action recognition. International Journal of Computer Vision 109, 1-2 (2014), 60–73.
  • Margolis (2011) Anna Margolis. 2011. A literature review of domain adaptation with unlabeled data. Technical Report.
  • Mikolov et al. (2013) Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. 3111–3119.
  • Mittal et al. (2014) Paritosh Mittal, Aishwarya Jain, Gaurav Goswami, Richa Singh, and Mayank Vatsa. 2014. Recognizing composite sketches with digital face images via SSD dictionary. In Proc. IEEE International Joint Conference on Biometrics. IEEE, 1–6.
  • Moreno-Torres et al. (2012) Jose G Moreno-Torres, Troy Raeder, RocíO Alaiz-RodríGuez, Nitesh V Chawla, and Francisco Herrera. 2012. A unifying view on dataset shift in classification. Pattern Recognition 45, 1 (2012), 521–530.
  • Motiian et al. (2017a) Saeid Motiian, Quinn Jones, Seyed Iranmanesh, and Gianfranco Doretto. 2017a. Few-Shot Adversarial Domain Adaptation. In Advances in Neural Information Processing Systems. 6673–6683.
  • Motiian et al. (2017b) Saeid Motiian, Marco Piccirilli, Donald A Adjeroh, and Gianfranco Doretto. 2017b. Unified Deep Supervised Domain Adaptation and Generalization. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 5715–5725.
  • Muandet et al. (2013) Krikamol Muandet, David Balduzzi, and Bernhard Schölkopf. 2013. Domain Generalization via Invariant Feature Representation. In Proc. International Conference on Machine Learning. 10–18.
  • Nater et al. (2011) Fabian Nater, Tatiana Tommasi, Helmut Grabner, Luc Van Gool, and Barbara Caputo. 2011. Transferring activities: Updating human behavior analysis. In Proc. IEEE International Conference on Computer Vision Workshops. IEEE, 1737–1744.
  • Ni et al. (2013) Jie Ni, Qiang Qiu, and Rama Chellappa. 2013. Subspace interpolation via dictionary learning for unsupervised domain adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 692–699.
  • Niu et al. (2015a) Li Niu, Wen Li, and Dong Xu. 2015a. Multi-view domain generalization for visual recognition. In Proc. IEEE International Conference on Computer Vision. 4193–4201.
  • Niu et al. (2015b) Li Niu, Wen Li, and Dong Xu. 2015b. Visual Recognition by Learning from Web Data: A Weakly Supervised Domain Generalization Approach. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2774–2783.
  • Palatucci et al. (2009) Mark Palatucci, Dean Pomerleau, Geoffrey E Hinton, and Tom M Mitchell. 2009. Zero-shot learning with semantic output codes. In Proc. Advances in Neural Information Processing Systems. 1410–1418.
  • Pan et al. (2009) Sinno Jialin Pan, Ivor W Tsang, James Tin-Yau Kwok, and Qiang Yang. 2009. Domain Adaptation via Transfer Component Analysis. In Proc. International Joint Conference on Artificial Intelligence. 1187.
  • Pan and Yang (2010) Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 10 (2010), 1345–1359.
  • Panareda Busto and Gall (2017) Pau Panareda Busto and Juergen Gall. 2017. Open Set Domain Adaptation. In Proc. IEEE International Conference on Computer Vision.
  • Parikh and Grauman (2011) Devi Parikh and Kristen Grauman. 2011. Relative attributes. In Proc. IEEE International Conference on Computer Vision. IEEE, 503–510.
  • Patel et al. (2015) Vishal M Patel, Raghuraman Gopalan, Ruonan Li, and Rama Chellappa. 2015. Visual domain adaptation: A survey of recent advances. IEEE Signal Processing Magazine 32, 3 (2015), 53–69.
  • Patricia and Caputo (2014) Novi Patricia and Barbara Caputo. 2014. Learning to learn, from transfer learning to domain adaptation: A unifying perspective. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1442–1449.
  • Patterson et al. (2014) Genevieve Patterson, Chen Xu, Hang Su, and James Hays. 2014. The sun attribute database: Beyond categories for deeper scene understanding. International Journal of Computer Vision 108, 1-2 (2014), 59–81.
  • Peng et al. (2016) Yuxin Peng, Xin Huang, and Jinwei Qi. 2016. Cross-media shared representation by hierarchical learning with multiple deep networks. In Proc. International Joint Conference on Artificial Intelligence. AAAI Press, 3846–3853.
  • Pereira et al. (2014) Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Nikhil Rasiwasia, Gert RG Lanckriet, Roger Levy, and Nuno Vasconcelos. 2014. On the role of correlation and abstraction in cross-modal multimedia retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 3 (2014), 521–535.
  • Perkins and Salomon (1992) David N Perkins and Gavriel Salomon. 1992. Transfer of learning. International Encyclopedia of Education 2 (1992), 6452–6457.
  • Perrot and Habrard (2015) Michaël Perrot and Amaury Habrard. 2015. A Theoretical Analysis of Metric Hypothesis Transfer Learning. In Proc. International Conference on Machine Learning. 1708–1717.
  • Qi et al. (2011) Guo-Jun Qi, Charu Aggarwal, and Thomas Huang. 2011. Towards semantic knowledge propagation from text corpus to web images. In Proc. International Conference on World Wide Web. ACM, 297–306.
  • Qiao et al. (2016) Ruizhi Qiao, Lingqiao Liu, Chunhua Shen, and Anton van den Hengel. 2016. Less is more: zero-shot learning from online textual documents with noise suppression. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 2249–2257.
  • Qin et al. (2017) Jie Qin, Li Liu, Ling Shao, Fumin Shen, Bingbing Ni, Jiaxin Chen, and Yunhong Wang. 2017. Zero-shot action recognition with error-correcting output codes. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Quanz and Huan (2009) Brian Quanz and Jun Huan. 2009. Large margin transductive transfer learning. In Proc. ACM conference on Information and knowledge management. ACM, 1327–1336.
  • Quionero-Candela et al. (2009) Joaquin Quionero-Candela, Masashi Sugiyama, Anton Schwaighofer, and Neil D Lawrence. 2009. Dataset shift in machine learning. The MIT Press.
  • Raina et al. (2007) Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, and Andrew Y Ng. 2007. Self-taught learning: transfer learning from unlabeled data. In Proc. International Conference on Machine learning. ACM, 759–766.
  • Rashtchian et al. (2010) Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. 2010. Collecting image annotations using Amazon’s Mechanical Turk. In Proc. NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk. Association for Computational Linguistics, 139–147.
  • Rastegari et al. (2012) Mohammad Rastegari, Ali Farhadi, and David Forsyth. 2012. Attribute Discovery via Predictable Discriminative Binary Codes. In European Conference on Computer Vision. 876–889.
  • Ravi and Larochelle (2017) Sachin Ravi and Hugo Larochelle. 2017. Optimization as a Model for Few-Shot Learning. In Proc. International Conference on Learning Representations.
  • Razavian et al. (2014) Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. 2014. CNN features off-the-shelf: an astounding baseline for recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition Workshops. IEEE, 512–519.
  • Reed et al. (2016) Scott Reed, Zeynep Akata, Honglak Lee, and Bernt Schiele. 2016. Learning deep representations of fine-grained visual descriptions. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 49–58.
  • Romera-Paredes and Torr (2015) Bernardino Romera-Paredes and Philip Torr. 2015. An embarrassingly simple approach to zero-shot learning. In Proc. International Conference on Machine Learning. 2152–2161.
  • Ruvolo and Eaton (2013) Paul Ruvolo and Eric Eaton. 2013. ELLA: An Efficient Lifelong Learning Algorithm. In Proc. International Conference on Machine Learning. 507–515.
  • Saenko et al. (2010) Kate Saenko, Brian Kulis, Mario Fritz, and Trevor Darrell. 2010. Adapting visual category models to new domains. In Proc. European Conference on Computer Vision. Springer, 213–226.
  • Saito et al. (2017) Kuniaki Saito, Yoshitaka Ushiku, and Tatsuya Harada. 2017. Asymmetric Tri-training for Unsupervised Domain Adaptation. In Proc. International Conference on Machine Learning.
  • Shao et al. (2015) Ling Shao, Fan Zhu, and Xuelong Li. 2015. Transfer learning for visual categorization: a survey. IEEE Transactions on Neural Networks and Learning Systems 26, 5 (2015), 1019–1034.
  • Shao et al. (2012) Ming Shao, Carlos Castillo, Zhenghong Gu, and Yun Fu. 2012. Low-rank transfer subspace learning. In Proc. IEEE International Conference on Data Mining. IEEE, 1104–1109.
  • Shao et al. (2014) Ming Shao, Dmitry Kit, and Yun Fu. 2014. Generalized transfer subspace learning through low-rank constraint. International Journal of Computer Vision 109, 1-2 (2014), 74–93.
  • Shekhar et al. (2013) Sumit Shekhar, Vishal M Patel, Hien V Nguyen, and Rama Chellappa. 2013. Generalized domain-adaptive dictionaries. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 361–368.
  • Shekhar et al. (2015) Sumit Shekhar, Vishal M Patel, Hien Van Nguyen, and Rama Chellappa. 2015. Coupled projections for adaptation of dictionaries. IEEE Transactions on Image Processing 24, 10 (2015), 2941–2954.
  • Shi and Sha (2012) Yuan Shi and Fei Sha. 2012. Information-Theoretical Learning of Discriminative Clusters for Unsupervised Domain Adaptation. In Proc. International Conference on Machine Learning. 1079–1086.
  • Shih et al. (2013) Yichang Shih, Sylvain Paris, Frédo Durand, and William T Freeman. 2013. Data-driven hallucination of different times of day from a single outdoor photo. ACM Transactions on Graphics 32, 6 (2013), 200.
  • Shin et al. (2017) Hanul Shin, Jung Kwon Lee, Jaehong Kim, and Jiwon Kim. 2017. Continual learning with deep generative replay. In Advances in Neural Information Processing Systems. 2994–3003.
  • Shrivastava et al. (2017) Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, and Russ Webb. 2017. Learning from simulated and unsupervised images through adversarial training. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE.
  • Si et al. (2010) Si Si, Dacheng Tao, and Bo Geng. 2010. Bregman divergence-based regularization for transfer subspace learning. IEEE Transactions on Knowledge and Data Engineering 22, 7 (2010), 929–942.
  • Snell et al. (2017) Jake Snell, Kevin Swersky, and Richard S Zemel. 2017. Prototypical Networks for Few-shot Learning. In Proc. International Conference on Learning Representations.
  • Socher et al. (2013) Richard Socher, Milind Ganjoo, Christopher D Manning, and Andrew Ng. 2013. Zero-shot learning through cross-modal transfer. In Proc. Advances in Neural Information Processing Systems. 935–943.
  • Stamos et al. (2015) Dimitris Stamos, Samuele Martelli, Moin Nabi, Andrew McDonald, Vittorio Murino, and Massimiliano Pontil. 2015. Learning with dataset bias in latent subcategory models. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3650–3658.
  • Sugiyama et al. (2008) Masashi Sugiyama, Shinichi Nakajima, Hisashi Kashima, Paul V Buenau, and Motoaki Kawanabe. 2008. Direct importance estimation with model selection and its application to covariate shift adaptation. In Proc. Advances in Neural Information Processing Systems. 1433–1440.
  • Sukhija et al. (2016) Sanatan Sukhija, Narayanan C Krishnan, and Gurkanwal Singh. 2016. Supervised Heterogeneous Domain Adaptation via Random Forests. In Proc. International Joint Conference on Artificial Intelligence. AAAI Press.
  • Sun et al. (2016) Baochen Sun, Jiashi Feng, and Kate Saenko. 2016. Return of Frustratingly Easy Domain Adaptation. In Proc. AAAI Conference on Artificial Intelligence.
  • Sun and Saenko (2015) Baochen Sun and Kate Saenko. 2015. Subspace Distribution Alignment for Unsupervised Domain Adaptation. In Proc. British Machine Vision Conference.
  • Sun and Saenko (2016) Baochen Sun and Kate Saenko. 2016. Deep CORAL: Correlation Alignment for Deep Domain Adaptation. In Proc. Transferring and Adapting Source Knowledge in Computer Vision (TASK-CV) in conjunction with the ECCV.
  • Sun et al. (2011) Qian Sun, Rita Chattopadhyay, Sethuraman Panchanathan, and Jieping Ye. 2011. A two-stage weighting framework for multi-source domain adaptation. In Proc. Advances in Neural Information Processing Systems. 505–513.
  • Sun et al. (2015) Shiliang Sun, Honglei Shi, and Yuanbin Wu. 2015. A survey of multi-source domain adaptation. Information Fusion 24 (2015), 84–92.
  • Sutskever et al. (2014) Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104–3112.
  • Tan et al. (2009) Songbo Tan, Xueqi Cheng, Yuefen Wang, and Hongbo Xu. 2009. Adapting naive bayes to domain adaptation for sentiment analysis. In Proc. European Conference on Information Retrieval. Springer, 337–349.
  • Tang et al. (2012) Kevin Tang, Vignesh Ramanathan, Li Fei-Fei, and Daphne Koller. 2012. Shifting weights: Adapting object detectors from image to video. In Advances in Neural Information Processing Systems. 638–646.
  • Thrun (1998) Sebastian Thrun. 1998. Lifelong learning algorithms. Learning to learn 8 (1998), 181–209.
  • Tommasi and Caputo (2013) Tatiana Tommasi and Barbara Caputo. 2013. Frustratingly easy NBNN domain adaptation. In Proc. IEEE International Conference on Computer Vision. IEEE, 897–904.
  • Tommasi et al. (2010) Tatiana Tommasi, Francesco Orabona, and Barbara Caputo. 2010. Safety in numbers: Learning categories from few examples with multi model knowledge transfer. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3081–3088.
  • Tommasi et al. (2012) Tatiana Tommasi, Francesco Orabona, Mohsen Kaboli, and Barbara Caputo. 2012. Leveraging over prior knowledge for online learning of visual categories. In Proc. British Machine Vision Conference. 1–11.
  • Tommasi and Tuytelaars (2014) Tatiana Tommasi and Tinne Tuytelaars. 2014. A testbed for cross-dataset analysis. In Proc. European Conference on Computer Vision. Springer, 18–31.
  • Torralba and Efros (2011) Antonio Torralba and Alexei Efros. 2011. Unbiased look at dataset bias. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1521–1528.
  • Tsai et al. (2016) Yao-Hung Hubert Tsai, Yi-Ren Yeh, and Yu-Chiang Frank Wang. 2016. Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 5081–5090.
  • Tzeng et al. (2015) Eric Tzeng, Judy Hoffman, Trevor Darrell, and Kate Saenko. 2015. Simultaneous deep transfer across domains and tasks. In Proc. IEEE International Conference on Computer Vision. IEEE, 4068–4076.
  • Tzeng et al. (2017) Eric Tzeng, Judy Hoffman, Kate Saenko, and Trevor Darrell. 2017. Adversarial discriminative domain adaptation. Proc. IEEE Conference on Computer Vision and Pattern Recognition (2017).
  • Tzeng et al. (2014) Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. 2014. Deep domain confusion: Maximizing for domain invariance. arXiv preprint arXiv:1412.3474 (2014).
  • Venkateswara et al. (2017a) Hemanth Venkateswara, Shayok Chakraborty, and Sethuraman Panchanathan. 2017a. Deep-Learning Systems for Domain Adaptation in Computer Vision: Learning Transferable Feature Representations. IEEE Signal Processing Magazine 34, 6 (2017), 117–129.
  • Venkateswara et al. (2017b) Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. 2017b. Deep Hashing Network for Unsupervised Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 5018–5027.
  • Vinyals et al. (2016) Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. 2016. Matching networks for one shot learning. In Advances in Neural Information Processing Systems. 3630–3638.
  • Wah et al. (2011) C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. 2011. The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001. California Institute of Technology.
  • Wang et al. (2017) Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, and Heng Tao Shen. 2017. Adversarial Cross-Modal Retrieval. In Proc. ACM on Multimedia Conference. ACM, 154–162.
  • Wang and Mahadevan (2011) Chang Wang and Sridhar Mahadevan. 2011. Heterogeneous domain adaptation using manifold alignment. In Proc. International Joint Conference on Artificial Intelligence. 1541–1546.
  • Wang et al. (2016) Donghui Wang, Yanan Li, Yuetan Lin, and Yueting Zhuang. 2016. Relational knowledge transfer for zero-shot learning. In Proc. AAAI Conference on Artificial Intelligence. AAAI Press, 2145–2151.
  • Wang et al. (2013) Hua Wang, Feiping Nie, and Heng Huang. 2013. Robust and discriminative self-taught learning. In International Conference on Machine Learning. 298–306.
  • Wang et al. (2014) Han Wang, Xinxiao Wu, and Yunde Jia. 2014. Video Annotation via Image Groups from the Web. IEEE Transactions on Multimedia 16, 5 (2014), 1282–1291.
  • Wang et al. (2015) Pichao Wang, Wanqing Li, Zhimin Gao, Chang Tang, Jing Zhang, and Philip Ogunbona. 2015. ConvNets-Based Action Recognition from Depth Maps through Virtual Cameras and Pseudocoloring. In Proc. ACM International Conference on Multimedia. ACM, 1119–1122.
  • Wang et al. (2018) Pichao Wang, Wanqing Li, Jun Wan, Philip Ogunbona, and Xinwang Liu. 2018. Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition. In Proc. AAAI Conference on Artificial Intelligence. AAAI Press.
  • Wang and Gupta (2016) Xiaolong Wang and Abhinav Gupta. 2016. Generative image modeling using style and structure adversarial networks. In European Conference on Computer Vision.
  • Wang and Ji (2013) Xiaoyang Wang and Qiang Ji. 2013. A unified probabilistic approach modeling relationships between attributes and objects. In Proc. IEEE International Conference on Computer Vision. 2120–2127.
  • Wang and Schneider (2014) Xuezhi Wang and Jeff Schneider. 2014. Flexible transfer learning under support and model shift. In Advances in Neural Information Processing Systems. 1898–1906.
  • Wang and Tang (2009) Xiaogang Wang and Xiaoou Tang. 2009. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 11 (2009), 1955–1967.
  • Wei et al. (2016) Pengfei Wei, Yiping Ke, and Chi Keong Goh. 2016. Deep Nonlinear Feature Coding for Unsupervised Domain Adaptation. In Proc. International Joint Conferences on Artificial Intelligence.
  • Weinland et al. (2007) Daniel Weinland, Edmond Boyer, and Remi Ronfard. 2007. Action recognition from arbitrary views using 3d exemplars. In Proc. IEEE International Conference on Computer Vision. 1–7.
  • Weiss et al. (2016) Karl Weiss, Taghi M Khoshgoftaar, and DingDing Wang. 2016. A survey of transfer learning. Journal of Big Data 3, 1 (2016), 1–40.
  • Woodworth and Thorndike (1901) RS Woodworth and EL Thorndike. 1901. The influence of improvement in one mental function upon the efficiency of other functions. (I). Psychological Review 8, 3 (1901), 247.
  • Wu and Ji (2016) Yue Wu and Qiang Ji. 2016. Constrained Deep Transfer Feature Learning and Its Applications. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE.
  • Xian et al. (2016) Yongqin Xian, Zeynep Akata, Gaurav Sharma, Quynh Nguyen, Matthias Hein, and Bernt Schiele. 2016. Latent embeddings for zero-shot classification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 69–77.
  • Xian et al. (2018) Yongqin Xian, Tobias Lorenz, Bernt Schiele, and Zeynep Akata. 2018. Feature Generating Networks for Zero-Shot Learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE.
  • Xian et al. (2017) Yongqin Xian, Bernt Schiele, and Zeynep Akata. 2017. Zero-shot learning-The Good, the Bad and the Ugly. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 3077–3086.
  • Xiao and Guo (2015) Min Xiao and Yuhong Guo. 2015. Feature space independent semi-supervised domain adaptation via kernel matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 1 (2015), 54–66.
  • Xie and Tu (2015) Saining Xie and Zhuowen Tu. 2015. Holistically-nested edge detection. In Proc. IEEE International Conference on Computer Vision. 1395–1403.
  • Xu et al. (2015) Hongyu Xu, Jingjing Zheng, and Rama Chellappa. 2015. Bridging the Domain Shift by Domain Adaptive Dictionary Learning. In Proc. British Machine Vision Conference.
  • Xu et al. (2014b) Jiaolong Xu, Sebastian Ramos, David Vazquez, and Antonio M Lopez. 2014b. Domain adaptation of deformable part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 12 (2014), 2367–2380.
  • Xu et al. (2014c) Jiaolong Xu, Sebastian Ramos, David Vázquez, and Antonio M López. 2014c. Incremental Domain Adaptation of Deformable Part-based Models.. In Proc. British Machine Vision Conference.
  • Xu et al. (2016) J. Xu, D. Vázquez, K. Mikolajczyk, and A. M. López. 2016. Hierarchical online domain adaptation of deformable part-based models. In Proc. IEEE International Conference on Robotics and Automation. 5536–5541.
  • Xu et al. (2018) Ruijia Xu, Ziliang Chen, Wangmeng Zuo, Junjie Yan, and Liang Lin. 2018. Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Xu et al. (2017a) Xun Xu, Timothy Hospedales, and Shaogang Gong. 2017a. Transductive zero-shot action recognition by word-vector embedding. International Journal of Computer Vision 123, 3 (2017), 309–333.
  • Xu et al. (2017b) Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen, and Jingkuan Song. 2017b. Matrix Tri-Factorization with Manifold Regularizations for Zero-shot Learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Xu et al. (2014a) Zheng Xu, Wen Li, Li Niu, and Dong Xu. 2014a. Exploiting low-rank structure from latent domains for domain generalization. In Proc. European Conference on Computer Vision. Springer, 628–643.
  • Yamada et al. (2014) Makoto Yamada, Leonid Sigal, and Yi Chang. 2014. Domain Adaptation for Structured Regression. International Journal of Computer Vision 109, 1-2 (2014), 126–145.
  • Yan and Mikolajczyk (2015) Fei Yan and Krystian Mikolajczyk. 2015. Deep correlation for matching images and text. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3441–3450.
  • Yan et al. (2017) Hongliang Yan, Yukang Ding, Peihua Li, Qilong Wang, Yong Xu, and Wangmeng Zuo. 2017. Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Yang et al. (2007) Jun Yang, Rong Yan, and Alexander G Hauptmann. 2007. Cross-domain video concept detection using adaptive svms. In Proc. ACM International Conference on Multimedia. ACM, 188–197.
  • Yang et al. (2016) Liu Yang, Liping Jing, Jian Yu, and Michael K Ng. 2016. Learning transferred weights from co-occurrence data for heterogeneous transfer learning. IEEE Transactions on Neural Networks and Learning Systems 27, 11 (2016), 2187–2200.
  • Yao et al. (2015) Ting Yao, Yingwei Pan, Chong-Wah Ngo, Houqiang Li, and Tao Mei. 2015. Semi-supervised Domain Adaptation with Subspace Learning for Visual Recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2142–2150.
  • Ye and Guo (2017) Meng Ye and Yuhong Guo. 2017. Zero-Shot Classification with Discriminative Semantic Representation Learning. Proc. IEEE Conference on Computer Vision and Pattern Recognition (2017).
  • Yi et al. (2017) Zili Yi, Hao Zhang, Ping Tan, and Minglun Gong. 2017. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation. In Proc. IEEE International Conference on Computer Vision.
  • Yosinski et al. (2014) Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. 2014. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems. 3320–3328.
  • Zellinger et al. (2017) Werner Zellinger, Thomas Grubinger, Edwin Lughofer, Thomas Natschläger, and Susanne Saminger-Platz. 2017. Central moment discrepancy (CMD) for domain-invariant representation learning. In Proc. International Conference on Learning Representations.
  • Zhai et al. (2010) Deming Zhai, Bo Li, Hong Chang, Shiguang Shan, Xilin Chen, and Wen Gao. 2010. Manifold Alignment via Corresponding Projections. In Proc. British Machine Vision Conference. BMVA Press, 3.1–3.11.
  • Zhang et al. (2018a) Jing Zhang, Zewei Ding, Wanqing Li, and Philip Ogunbona. 2018a. Importance Weighted Adversarial Nets for Partial Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE.
  • Zhang et al. (2017a) Jing Zhang, Wanqing Li, and Philip Ogunbona. 2017a. Joint Geometrical and Statistical Alignment for Visual Domain Adaptation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Zhang et al. (2018b) Jing Zhang, Wanqing Li, and Philip Ogunbona. 2018b. Unsupervised Domain Adaptation: A Multi-task Learning-based Method. arXiv preprint arXiv:1803.09208 (2018).
  • Zhang et al. (2016b) Jing Zhang, Wanqing Li, Pichao Wang, Philip Ogunbona, Song Liu, and Chang Tang. 2016b. A Large Scale RGB-D Dataset for Action Recognition. In Proc. International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS’16) in conjunction with International Conference on Pattern Recognition.
  • Zhang et al. (2013a) Kun Zhang, Krikamol Muandet, Zhikun Wang, et al. 2013a. Domain adaptation under target and conditional shift. In Proc. International Conference on Machine Learning. 819–827.
  • Zhang et al. (2017b) Li Zhang, Tao Xiang, and Shaogang Gong. 2017b. Learning a Deep Embedding Model for Zero-Shot Learning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.
  • Zhang et al. (2016a) Richard Zhang, Phillip Isola, and Alexei A Efros. 2016a. Colorful image colorization. In European Conference on Computer Vision.
  • Zhang and Yeung (2010) Yu Zhang and Dit-Yan Yeung. 2010. Transfer metric learning by learning task relationships. In Proc. ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 1199–1208.
  • Zhang and Saligrama (2015) Ziming Zhang and Venkatesh Saligrama. 2015. Zero-shot learning via semantic similarity embedding. In Proc. IEEE International Conference on Computer Vision. IEEE, 4166–4174.
  • Zhang and Saligrama (2016a) Ziming Zhang and Venkatesh Saligrama. 2016a. Zero-shot learning via joint latent similarity embedding. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. 6034–6042.
  • Zhang and Saligrama (2016b) Ziming Zhang and Venkatesh Saligrama. 2016b. Zero-shot recognition via structured prediction. In European Conference on Computer Vision. Springer, 533–548.
  • Zhang et al. (2013b) Zhong Zhang, Chunheng Wang, Baihua Xiao, Wen Zhou, Shuang Liu, and Cunzhao Shi. 2013b. Cross-view action recognition via a continuous virtual path. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2690–2697.
  • Zhao and Hoi (2010) Peilin Zhao and Steven C Hoi. 2010. OTL: A framework of online transfer learning. In Proc. International Conference on Machine Learning. 1231–1238.
  • Zheng et al. (2012) Jingjing Zheng, Zhuolin Jiang, P Jonathon Phillips, and Rama Chellappa. 2012. Cross-View Action Recognition via a Transferable Dictionary Pair. In Proc. British Machine Vision Conference, Vol. 1. 1–11.
  • Zheng (2009) W-S Zheng. 2009. Associating Groups of People. In Proc. British Machine Vision Conference.
  • Zhou et al. (2014) Joey Tianyi Zhou, Ivor W Tsang, Sinno Jialin Pan, and Mingkui Tan. 2014. Heterogeneous Domain Adaptation for Multiple Classes. In Proc. International Conference on Artificial Intelligence and Statistics. 1095–1103.
  • Zhu and Shao (2014) Fan Zhu and Ling Shao. 2014. Weakly-supervised cross-domain dictionary learning for visual recognition. International Journal of Computer Vision 109, 1-2 (2014), 42–59.
  • Zhu et al. (2017) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proc. IEEE International Conference on Computer Vision.
  • Zhu et al. (2014) Y. Zhu, W. Chen, and G. Guo. 2014. Evaluating spatiotemporal interest point features for depth-based action recognition. Image and Vision Computing 32, 8 (2014), 453–464.
  • Zhu et al. (2011) Yin Zhu, Yuqiang Chen, Zhongqi Lu, Sinno Jialin Pan, Gui-Rong Xue, Yong Yu, and Qiang Yang. 2011. Heterogeneous Transfer Learning for Image Classification. In Proc. AAAI Conference on Artificial Intelligence.
  • Zhu et al. (2018) Yizhe Zhu, Mohamed Elhoseiny, Bingchen Liu, and Ahmed Elgammal. 2018. Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts. In Proc. IEEE Conference on Computer Vision and Pattern Recognition.