
# Stellar clusters in the inner Galaxy and their correlation with cold dust emission

The full catalog of 695 stellar clusters within the ATLASGAL Galactic range is only available in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/

Esteban F. E. Morales (Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany; Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, 53121 Bonn, Germany)
Friedrich Wyrowski (Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, 53121 Bonn, Germany)
Frederic Schuller (Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, 53121 Bonn, Germany; European Southern Observatory, Alonso de Córdova 3107, Casilla 19001, Santiago, Chile)
Karl M. Menten (Max-Planck-Institut für Radioastronomie, Auf dem Hügel 69, 53121 Bonn, Germany)
Received 3 April 2013 / Accepted 16 September 2013
###### Key Words.:
open clusters and associations: general – Galaxy: disk – Galaxy: stellar content – submillimeter: ISM – stars: formation – catalogs
###### Abstract

Context: Stars are born within dense clumps of giant molecular clouds, constituting young stellar agglomerates known as embedded clusters, which only evolve into bound open clusters under special conditions.

Aims: We statistically study all embedded clusters (ECs) and open clusters (OCs) known so far in the inner Galaxy, investigating particularly their interaction with the surrounding molecular environment and the differences in their evolution.

Methods: We first compiled a merged list of 3904 clusters from optical and infrared cluster catalogs in the literature, including 75 new (mostly embedded) clusters discovered by us in the GLIMPSE survey. From this list, 695 clusters are within the Galactic range |l| ≤ 60° and |b| ≤ 1.5° covered by the ATLASGAL survey, which was used to search for correlations with submm dust continuum emission tracing dense molecular gas. We defined an evolutionary sequence of five morphological types: deeply embedded cluster (EC1), partially embedded cluster (EC2), emerging open cluster (OC0), OC still associated with a submm clump in the vicinity (OC1), and OC without correlation with ATLASGAL emission (OC2). Alongside this classification, we performed a thorough literature survey of these 695 clusters, compiling a considerable number of physical and observational properties in a catalog that is publicly available.

Results: We found that an OC defined observationally as OC0, OC1, or OC2 and confirmed as a real cluster is equivalent to the physical concept of OC (a bound exposed cluster) for ages in excess of  Myr. Some observed OCs younger than this limit can actually be unbound associations. We found that our OC and EC samples are roughly complete up to  kpc and  kpc from the Sun, respectively, beyond which the completeness decays exponentially. Using available age estimates for a few ECs, we derived an upper limit of 3 Myr for the duration of the embedded phase. Combined with the OC age distribution within 3 kpc of the Sun, which shows an excess of young exposed clusters compared to a theoretical fit that considers classical disruption mechanisms, we computed an embedded and young cluster dissolution fraction of . This high fraction is thought to be produced by several factors and not only by the classical paradigm of fast gas expulsion.

Conclusions:

## 1 Introduction

Stars form by gravitational collapse of high-density fluctuations in the interstellar molecular gas, which are generated by supersonic turbulent motions (e.g., Klessen 2011). Following the nomenclature of Williams et al. (2000), star formation takes place in dense ( cm) clumps, which are in turn fragmented into denser ( cm) cores, in which individual stars or small multiple systems are born. Given this nature of the star formation process, stars are born correlated in space and time, with typical scales of 1 pc and 1 Myr, respectively (see Kroupa 2011), constituting young stellar agglomerates known as embedded clusters (ECs). Bressert et al. (2010) studied the spatial distribution of star formation within 500 pc from the Sun and found that, in fact, most of the young stellar objects (YSOs) in their sample are found in regions with number densities greater than , which is more than an order of magnitude higher than the density of field stars in the Galactic disk, (Chabrier 2001).

Many of the ECs defined in this way, however, are not gravitationally bound and will not become classical open clusters (OCs), i.e., bound stellar agglomerates that are free of gas and have lifetimes on the order of 100 Myr. It is very important to make the distinction from the start because there is often some confusion about this in the literature. In the definition used throughout this work (see Section 1.2), ECs are not necessarily the direct progenitors of bound OCs, but just the natural outcome of the star formation process, which is “clustered” with respect to the field stars.

The dynamical evolution of an EC is quite complex and can progress in several possible ways, depending on both the characteristics of the recently born stellar population and the physical properties of the parent molecular cloud. A gravitationally unbound molecular cloud or an unbound region of a molecular complex might still be able to form stars in subregions that are locally bound (e.g., Bonnell et al. 2011), but the resulting EC born there is globally unbound and quickly disperses into the field. On the other hand, within a molecular complex, especially in bound regions, many ECs might merge and form a few large entities (Maschberger et al. 2010). If a certain EC (once born or after merging) manages to remain gravitationally bound in the gas potential, at some point the effect of stellar feedback starts to influence the parent molecular material in the vicinity. These feedback mechanisms include protostellar outflows, evaporation driven by non-ionizing ultraviolet radiation, photoionization and subsequent H ii region expansion, stellar winds, radiation pressure and, eventually, supernovae. Again, the relative importance of a certain dissipation process is determined by the physical conditions of the system and the environment (Fall et al. 2010).

The energy and momentum introduced by stellar feedback eventually disrupt the clump and sweep the residual gas out of the cluster volume. The stars of this emerging cluster are now held together only by the stellar gravitational potential, which might not be sufficient to keep the stars together, so that the cluster dissolves. This is the classical “infant mortality” paradigm established by Lada & Lada (2003). However, Kruijssen et al. (2012) argue that this effect is only important in low-density regions, and by analyzing the dynamical state of the ECs arising from star formation hydrodynamic simulations, they find that in dense regions the formed clusters are actually bound and even close to virial equilibrium. They propose that those clusters are instead destroyed via tidal shocks from the surrounding dense gas. An alternative disruption mechanism for small-N systems or larger clusters with a hierarchical substructure has recently been studied by Moeckel et al. (2012), who find through N-body simulations that those clusters undergo a quick expansion owing to fast internal relaxation. Bound exposed clusters are therefore the few survivors of all these processes and represent the remnants of originally more massive ECs.

The observational study of ECs is fundamental to account for most of the newly formed stellar population in the Galaxy and to investigate the interaction with its parent molecular material through stellar feedback. In the past decade, thanks to the development of all-sky infrared imaging surveys, such as 2MASS and GLIMPSE (see Section 1.1), many new ECs have been discovered in the Galaxy (e.g., Dutra et al. 2003a; Bica et al. 2003b; Mercer et al. 2005; Borissova et al. 2011), significantly increasing the number of known systems. However, so far there have only been a few systematic studies of the whole current sample of ECs and OCs in a significant fraction of the Galactic plane (e.g., Bonatto & Bica 2011; Kharchenko et al. 2012), and none of these studies has distinguished clearly the embedded population from the OC sample (see below). The main goal of this paper is to fill this gap.

Here, we statistically study all OCs and ECs known so far in the inner Galaxy from different cluster catalogs in the literature, after compiling a considerable number of physical and observational properties of these objects, particularly their degree of correlation with the surrounding molecular environment, if present. We take advantage of the recently completed ATLASGAL submm continuum survey (see Section 1.1), which provides a spatially unbiased view of the distribution of the dense molecular material in the Milky Way. While the distinction of ECs from OCs in these catalogs has primarily been made via correlations with known H ii regions or nebulae seen in the infrared, the ATLASGAL survey allows us to objectively tell (in combination with distance information for cases of ambiguous physical relation) whether or not these objects are associated with dense molecular gas, as well as to possibly detect the presence of stellar feedback via simple morphological criteria.

This paper is organized as follows. In the remainder of this introduction, we briefly present the main observational data and the nomenclature used throughout this work (Sections 1.1 and 1.2, respectively). In Section 2, we describe the literature compilation of a merged list of Galactic OCs and ECs, including a new search for ECs we conducted on the GLIMPSE survey; more details about the literature cluster lists used here are given in Appendix A. Section 3 summarizes the construction of an extensive catalog for the cluster sample within the Galactic range covered by ATLASGAL, with many pieces of information, including characteristics of the submm and mid-infrared emission, correlation with known objects, distances (kinematic and/or stellar), ages, and membership in large molecular complexes. A more detailed description of all the assumptions and procedures adopted when organizing this information in the catalog is given in Appendix B. In Section 4, we report the results of a statistical analysis performed on this catalog, in which we delineate a morphological evolutionary sequence with decreasing correlation with ATLASGAL emission, classify the sample into ECs and OCs, and separately study their distance distribution, completeness, and age distribution. Finally, Section 5 summarizes the main conclusions of this paper.

### 1.1 Observations: Galactic surveys

The APEX Telescope Large Area Survey of the Galaxy (ATLASGAL, Schuller et al. 2009) is the first unbiased submm continuum survey of the whole inner Galactic disk, covering a total of 360 square degrees of the sky with Galactic coordinates in the range |l| ≤ 60° and |b| ≤ 1.5°. The observations were carried out at 870 μm using the Large APEX Bolometer Camera (LABOCA; Siringo et al. 2009) of the APEX telescope (Güsten et al. 2006), located on Llano de Chajnantor, Chile, at 5100 m of altitude. With an antenna diameter of 12 m, the observations reach an angular resolution of 19.2″ at this wavelength (throughout this paper, angular resolution refers to the full width at half-maximum of the point-spread function, or telescope beam). The submm continuum emission mainly represents thermal radiation from cool dust, which is generally optically thin and, therefore, an excellent tracer of the amount of interstellar material on the line of sight. The ATLASGAL survey reaches an average rms noise level of  mJy/beam, which translates into a detection limit of of total molecular mass (for a nominal distance of 2 kpc and a dust temperature of  K).

In the infrared, we primarily use two large-scale surveys that cover the inner Galactic plane: the Two Micron All Sky Survey (2MASS, Skrutskie et al. 2006), which provides near-infrared (NIR) images of the whole sky in the J (1.25 μm), H (1.65 μm), and Ks (2.16 μm) filters, with an angular resolution of ; and the Galactic Legacy Infrared Mid-Plane Survey Extraordinaire (GLIMPSE, Benjamin et al. 2003; Churchwell et al. 2009), which is a set of various mid-infrared (MIR) surveys of the Galactic plane carried out with the InfraRed Array Camera (IRAC, Fazio et al. 2004) on board the Spitzer Space Telescope (Werner et al. 2004). Here we use the GLIMPSE I and II surveys, which cover the ranges 5° ≤ |l| ≤ 65° and |b| ≤ 1°; 2° ≤ |l| < 5° and |b| ≤ 1.5°; and |l| < 2° and |b| ≤ 2°, comprising a total of 274 square degrees. The IRAC camera provides images in four filters centered at wavelengths of 3.6, 4.5, 5.8, and 8.0 μm, with an angular resolution of .
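The quoted survey areas can be cross-checked against the coordinate ranges. The short sketch below is illustrative only: the ATLASGAL footprint (|l| ≤ 60°, |b| ≤ 1.5°) and the three GLIMPSE I/II strips (|b| ≤ 1° for 5° ≤ |l| ≤ 65°, |b| ≤ 1.5° for 2° ≤ |l| < 5°, |b| ≤ 2° for |l| < 2°) are assumed here from the commonly quoted survey footprints, and the helper name `strip_area_sq_deg` is hypothetical.

```python
import math

def strip_area_sq_deg(l_min, l_max, b_max):
    """Exact sky area (square degrees) of the strip l_min <= l <= l_max, |b| <= b_max.

    Solid angle = dl [rad] * (sin(b_max) - sin(-b_max)); converted to square degrees.
    All inputs are in degrees.
    """
    dl = l_max - l_min
    return dl * math.degrees(2.0 * math.sin(math.radians(b_max)))

# Assumed ATLASGAL footprint: |l| <= 60 deg, |b| <= 1.5 deg
atlasgal_area = strip_area_sq_deg(-60.0, 60.0, 1.5)

# Assumed GLIMPSE I + II footprint, built from symmetric strips
glimpse_area = (
    2 * strip_area_sq_deg(5.0, 65.0, 1.0)    # 5 <= |l| <= 65, |b| <= 1
    + 2 * strip_area_sq_deg(2.0, 5.0, 1.5)   # 2 <= |l| < 5,  |b| <= 1.5
    + strip_area_sq_deg(-2.0, 2.0, 2.0)      # |l| < 2,       |b| <= 2
)

print(f"ATLASGAL: {atlasgal_area:.1f} sq. deg, GLIMPSE I+II: {glimpse_area:.1f} sq. deg")
```

Under these assumed ranges, the areas round to the 360 and 274 square degrees quoted in the text; the curvature correction is negligible this close to the Galactic plane.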

The GLIMPSE surveys have revealed very peculiar structures in star-forming regions (a summary is provided in Section 2 of Churchwell et al. 2009). The 8.0 μm filter is particularly useful to detect the presence of bright fluorescent emission from polycyclic aromatic hydrocarbons (PAHs), which are excited by the stellar far ultraviolet (UV) field, but are destroyed by the harder UV radiation present within ionized gas regions. Thus, PAH emission is often observed from IR bubbles, which appear projected as ring-like structures and in many cases are tracing molecular material swept up by the expansion of H ii regions created by the ionizing radiation from massive stars (Deharveng et al. 2010). On the other hand, infrared dark clouds (IRDCs), already found in previous MIR surveys, are seen as extinction features against the bright and diffuse mid-infrared Galactic background. They represent the densest and coldest condensations within giant molecular clouds and are the most likely sites of future star formation.

For a few regions within the ATLASGAL Galactic range not covered by the GLIMPSE survey, we use data from the Wide-field Infrared Survey Explorer (WISE, Wright et al. 2010), which mapped the entire sky in four infrared bands centered at 3.4, 4.6, 12, and 22 μm, with an angular resolution of in the first three bands. Despite the lower sensitivity and coarser resolution compared with GLIMPSE, bright PAH emission and prominent IRDCs can still be identified in the WISE images, especially at 12 μm (see Section B.3).

### 1.2 “Stellar cluster” definitions

In this paper, we define:

• an embedded cluster (EC) as any stellar group recently born and still containing an important fraction of residual gas within and surrounding its volume, keeping in mind that it may never become a bound open cluster on its own. Since star formation takes place in molecular clouds, this definition is equivalent to the concept of a correlated star formation event introduced by Kroupa (2011); we keep the term “cluster” in order to match older designations in the literature.

• an open cluster (OC) as any agglomerate of spatially correlated stars that is relatively free of the remaining gas. We use this observational definition of OC (see also Section 4.3) in order to account for those objects that observationally appear like classical OCs, but whose dynamical state is unknown; in some cases they can actually be gravitationally unbound.

• a physical OC as a gravitationally bound OC (i.e., a classical OC).

• an association as an unbound OC.

In this work, we sometimes use the term “star clusters” generically for all the classes defined above, especially when concerning observations. Bound, exposed star clusters, however, will always be referred to explicitly as physical OCs.

## 2 Compilation of cluster lists

Although the number of known OCs and ECs in the Galaxy has considerably increased over the last years, the current cluster sample is still far from being complete. As we discuss in Section 4.5, the detection of a stellar cluster in the inner Galactic plane is particularly difficult, due to the high extinction and the crowded stellar background, making the cluster sample severely incomplete for distances larger than a few kpc from the Sun. If we are able to quantify this incompleteness, however, all the statistical results can properly be corrected, as we do in this work. Of course, the more complete the cluster sample, the smaller the corresponding uncertainties.

We thus performed an extensive compilation of all Galactic star cluster catalogs from the literature. For completeness, this compilation was initially not restricted to the ATLASGAL Galactic range; we only did it afterwards for the comparison with ATLASGAL emission and all the subsequent analysis. The catalogs are listed in the first three columns of Table 1, where we give, respectively, an ID used throughout this work, the corresponding reference, and its category according to the wavelength at which the clusters are detected: optical, NIR or MIR. Optical clusters are taken mostly from the current version (3.1, from November, 2010) of the catalog by Dias et al. (2002). NIR cluster catalogs are compilations, or lists from visual and automated searches mainly performed on the 2MASS survey. MIR clusters represent the objects detected by Mercer et al. (2005) in the GLIMPSE data, and the new clusters discovered by us using a different search method on the same survey, which were missing in the Mercer et al. (2005) list (see Section 2.1). In our total sample, we also included individual star clusters from the literature not listed in the previous catalogs (referred to as “Not cataloged clusters” in Table 1). A more detailed description of the diverse catalogs and references used to construct our cluster sample is given in Appendix A. This literature compilation has been updated till August, 2011.

Since we are dealing with different cluster catalogs that were constructed independently, a specific object can be present in more than one list. We therefore implemented a simple merging procedure to obtain a unique sample of stellar clusters. The first condition to identify one repetition, i.e., the same object in two different catalogs, was that the angular distance between the two given center positions be less than both listed angular diameters. We checked all objects merged under this criterion, looking for the corresponding cluster names, when available, and confirmed a repetition when the names coincided. Otherwise (names not available or different), two clusters were considered the same object when the angular distance was less than both angular radii, which were also required to agree within a factor of 5. The last condition was imposed to account for the case when a compact infrared cluster shares the same field of view as a (different) optical cluster with a large angular size. This cross-identification process was not intended to be perfect, but good enough not to affect the statistical results of the whole cluster sample. Within the ATLASGAL Galactic range, a much more thorough revision was done (see Section 3), further refining the cross-identifications, and even recognizing a few duplications and spurious clusters, which were excluded from the final sample (see Section A.4).
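The merging criterion described above can be sketched as a small predicate. This is only an illustration of the stated rules, not the code actually used for the compilation: the dictionary keys (`lon`, `lat`, `diam`, `name`) and the function names are hypothetical, and positions and diameters are assumed to be in degrees.

```python
import math

def angular_separation_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees (haversine formula; inputs in degrees)."""
    l1, b1, l2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((b2 - b1) / 2) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

def same_cluster(c1, c2, max_size_ratio=5.0):
    """Return True if two catalog entries are treated as the same object."""
    sep = angular_separation_deg(c1["lon"], c1["lat"], c2["lon"], c2["lat"])
    d_small, d_large = sorted((c1["diam"], c2["diam"]))
    name1, name2 = c1.get("name"), c2.get("name")

    # Case 1: names available and coincident -> separation below both diameters
    if name1 and name2 and name1 == name2:
        return sep < d_small

    # Case 2: names missing or different -> separation below both radii,
    # and the two sizes must agree within a factor of max_size_ratio
    return sep < d_small / 2 and d_large / d_small <= max_size_ratio

# Hypothetical entries: a compact infrared cluster projected on a large optical cluster
compact_ir = {"lon": 10.000, "lat": 0.0, "diam": 0.01, "name": None}
large_optical = {"lon": 10.002, "lat": 0.0, "diam": 0.50, "name": "Optical A"}
print(same_cluster(compact_ir, large_optical))  # sizes differ by a factor of 50
```

The size-ratio test is what keeps the compact infrared cluster in the example from being absorbed into the unrelated, much larger optical cluster.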

In Table 1, for a given reference, we give both the absolute (original) number of clusters in the catalog and the number of different entries with respect to all catalogs listed before it (i.e., after merging). The optical catalogs were put first, so that any cluster visible in the optical is considered an optical cluster. The infrared lists (including the NIR and MIR clusters) were positioned afterwards in chronological order, and therefore roughly follow the discovery time. Absolute and after-merging numbers are presented for the total sky range of every list, the ATLASGAL Galactic range (|l| ≤ 60° and |b| ≤ 1.5°), and finally for only those associated with ATLASGAL emission according to the criterion explained in Section 4.1. We caution that the numbers of clusters given there are those obtained after removing a few spurious objects and globular clusters (listed in Table 6).

After cross-identifications, we ended up with a final sample of 3904 stellar clusters, of which 2247 are optical, 1493 NIR, and 164 MIR clusters. Taking into account the repetitions within each category, but not between them, the numbers of objects are 2247 for optical, 1950 for NIR, and 197 for MIR. Note that the low number of MIR clusters is due to the confined Galactic range of the GLIMPSE survey; actually, when only considering the ATLASGAL range, which is similar to the GLIMPSE range, the numbers of objects are of the same order for the different categories: 227 optical, 315 NIR, and 153 MIR clusters, after merging.
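As a quick consistency check on these figures, the following sketch sums the per-category counts quoted in the text and derives how many infrared entries coincide with clusters already present in earlier lists. The dictionary names are illustrative only.

```python
# Counts quoted in the text (after merging across all catalogs)
after_merging = {"optical": 2247, "NIR": 1493, "MIR": 164}
# Counts with repetitions removed only within each category
within_category = {"optical": 2247, "NIR": 1950, "MIR": 197}
# Counts restricted to the ATLASGAL Galactic range (after merging)
atlasgal_range = {"optical": 227, "NIR": 315, "MIR": 153}

total_all_sky = sum(after_merging.values())
total_atlasgal = sum(atlasgal_range.values())

# Entries of each category that duplicate a cluster from a category listed earlier
already_listed = {k: within_category[k] - after_merging[k] for k in after_merging}

print(total_all_sky, total_atlasgal, already_listed)
```

The totals reproduce the 3904 clusters of the merged list and the 695 clusters within the ATLASGAL range; the differences indicate that 457 NIR entries and 33 MIR entries correspond to objects already counted in a preceding category.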

As argued in Section A.4, for ECs (as defined in this work) we expect a minimal contamination by spurious detections, whereas for OCs that have not been confirmed by follow-up studies, we estimate a spurious contamination rate of , following Froebrich et al. (2007b).

### 2.1 New search for ECs in GLIMPSE

The GLIMPSE on-line viewer from the Space Science Institute is a very useful tool to quickly examine color images of the whole survey, constructed from data collected in the four IRAC filters at 3.6, 4.5, 5.8, and 8.0 μm. By inspecting some specific regions with this viewer, we noticed that some heavily embedded clusters are still missing in the Mercer et al. (2005) list. An EC consists mostly of YSOs, which are intrinsically redder than field stars due to thermal emission from circumstellar dust, so that they are distinguished from background/foreground stars mainly by their red colors. Such a cluster would therefore produce a clearer spatial overdensity of stars in a point source catalog previously filtered by a red-color criterion, and would be more likely missed in a search for overdensities considering the totality of point sources, due to the high number of field stars. We believe that this is the principal reason for the incompleteness of the Mercer et al. (2005) catalog.

We then implemented a very simple automated algorithm using the GLIMPSE point source catalog to find the locations of EC candidates. First, we selected all point sources satisfying a red-color criterion, [4.5] − [8.0] ≥ 1, following Robitaille et al. (2008), who applied this condition to create their catalog of GLIMPSE intrinsically red sources. As already explained in that work, the use of these specific IRAC bands is supported by the fact that the interstellar extinction law is approximately flat between 4.5 and 8.0 μm, and therefore the contamination by extinguished field stars in this selection is reduced compared to other red-color criteria. By applying this condition to the entire GLIMPSE catalog, 268 513 sources were selected. We did not impose the additional brightness and quality restrictions used by Robitaille et al. (2008) because we favor the number of sources (and therefore higher sensitivity to possible YSO overdensities) rather than strict completeness and photometric reliability, which are not needed to only detect the locations of potential ECs. With the 268 513 selected sources, a stellar surface density map was constructed by counting the number of sources within boxes of 0.01° (36″), in steps of 0.002° (7.2″). This significant oversampling was adopted in order to detect density enhancements that would have fallen into two or more boxes if we had used non-overlapping bins. The bin size corresponds to the typical angular dimension of some ECs serendipitously found using the on-line GLIMPSE viewer. To account for larger overdensities, a second stellar density map was produced with a bin size of 0.018° (64.8″), using the same step size of 0.002°.
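The selection and oversampled counting described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the code used for the search: the color cut is assumed to be [4.5] − [8.0] ≥ 1 (the Robitaille et al. 2008 criterion), the function names and the toy field are hypothetical, and the sliding boxes are built from fine cells of one step (0.002°), so that 5 × 5 cells give the 0.01° box and 9 × 9 cells the 0.018° box.

```python
import numpy as np

STEP_DEG = 0.002  # sampling step of the density maps quoted in the text

def is_red(mag_45, mag_80, min_color=1.0):
    """Assumed red-color cut: [4.5] - [8.0] >= min_color (magnitudes)."""
    return (np.asarray(mag_45) - np.asarray(mag_80)) >= min_color

def sliding_box_counts(lon, lat, lon0, lat0, n_lon, n_lat, cells_per_box):
    """Count sources in overlapping square boxes of cells_per_box fine cells per side.

    Sources are first binned into fine cells of STEP_DEG; box sums are then
    obtained from a summed-area table, one box per step in each direction.
    """
    ix = np.floor((np.asarray(lon) - lon0) / STEP_DEG).astype(int)
    iy = np.floor((np.asarray(lat) - lat0) / STEP_DEG).astype(int)
    inside = (ix >= 0) & (ix < n_lon) & (iy >= 0) & (iy < n_lat)
    fine = np.zeros((n_lon, n_lat), dtype=int)
    np.add.at(fine, (ix[inside], iy[inside]), 1)

    sat = np.zeros((n_lon + 1, n_lat + 1), dtype=int)
    sat[1:, 1:] = fine.cumsum(axis=0).cumsum(axis=1)
    k = cells_per_box
    return sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]

# Toy field of 50 x 50 fine cells: six red sources in a tight group plus one
# field star that fails the color cut (all values are made up for illustration).
lon = (20.5 + np.array([0, 1, 2, 0, 1, 2, 1])) * STEP_DEG
lat = (20.5 + np.array([0, 0, 0, 1, 1, 1, 0])) * STEP_DEG
mag_45 = np.full(7, 10.0)
mag_80 = np.array([8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 9.7])

red = is_red(mag_45, mag_80)
small = sliding_box_counts(lon[red], lat[red], 0.0, 0.0, 50, 50, cells_per_box=5)
large = sliding_box_counts(lon[red], lat[red], 0.0, 0.0, 50, 50, cells_per_box=9)
print(red.sum(), small.max(), large.max())
```

Because adjacent boxes overlap by four fifths of their width, a compact group is always fully contained in at least one box, which is the purpose of the oversampling.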

The red-source density maps were checked in a test field, and we found that thresholds of 5 sources for the small bin, and 7 sources for the large bin, are needed to detect the positions of all clusters that can be identified by eye using the GLIMPSE on-line viewer within that area, although at the same time these low thresholds yield the detection of many spurious red-source overdensities that do not contain clusters. We decided to keep these thresholds in order not to miss any real cluster that might have a low number of members listed in the point source catalog, and to perform a visual inspection of the images after the automated search to filter out all spurious detections. We also noticed that using the GLIMPSE point source archive instead of the catalog is roughly equivalent to using the catalog with a lower threshold, so as long as we choose a correct threshold, the use of the more reliable GLIMPSE catalog (with respect to the archive) is justified. Within the whole GLIMPSE area, we detected 702 independent positions of overdensities (bins containing non-intersecting subsets of red sources), corresponding to 172 bins of 36″ with densities ≥ 5 sources/bin, 195 bins of 64.8″ with densities ≥ 7 sources/bin, and 335 locations satisfying the thresholds for both bin sizes. It should be noted that, since the red-color criterion produced density maps with low crowding and the local background density is therefore always close to zero, a more sophisticated algorithm is not needed. In fact, the red-source density maps have a mean and a standard deviation of 0.039 and 0.21 sources/bin for the small bin, and 0.13 and 0.43 sources/bin for the large bin, which means that the adopted thresholds lie more than 15 standard deviations above the mean. Again, we emphasize that the automated search was only used to find possible locations of ECs; we did not intend to capture the complete YSO population of a given cluster in this process.
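The significance of the adopted thresholds follows directly from the map statistics quoted above. The sketch below simply evaluates (threshold − mean) / standard deviation for each bin size and checks that the three groups of detections add up to the 702 independent positions; variable names are illustrative.

```python
# Map statistics quoted in the text (sources per bin)
maps = {
    "small (0.01 deg)":  {"threshold": 5, "mean": 0.039, "std": 0.21},
    "large (0.018 deg)": {"threshold": 7, "mean": 0.13,  "std": 0.43},
}

# Threshold expressed in standard deviations above the mean background
significance = {
    name: (m["threshold"] - m["mean"]) / m["std"] for name, m in maps.items()
}

# Detections: small bin only, large bin only, both bin sizes
n_small_only, n_large_only, n_both = 172, 195, 335
n_positions = n_small_only + n_large_only + n_both

for name, s in significance.items():
    print(f"{name}: {s:.1f} sigma")
print("independent positions:", n_positions)
```

Both thresholds sit far above the nearly empty background, which is why a simple fixed cut is sufficient here.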


Notes to Table 2. Units of right ascension are hours, minutes, and seconds, and units of declination are degrees, arcminutes, and arcseconds. Column 6 gives the estimated angular diameter. Column 7 gives the estimated number of stellar members within the assumed radius, considered as a lower limit due to possible non-detection of low-mass stars. Column 8 indicates the detection method: automated search (A), or on-line viewer (V). Column 9 lists different flags determined after visual inspection of the GLIMPSE three-color images, indicating: association with extended 8.0 μm emission (E8) or localized diffuse 4.5 μm emission (E4); cluster embedded in an infrared dark cloud (DC); cluster composed of red sources and additional bright normal stars (BR); cluster composed of bright normal stars alone (B); presence of additional probable YSOs, identified as sources uniquely detected at 8.0 μm (U8), or compact 8.0 μm objects not listed in the point source catalog or archive (C8); sparse, not centrally condensed morphology (S); cluster identified by eye in a nearby location of an automatically detected overdensity, but not exactly at the same position (V2).

As pointed out above, a subsequent visual selection was performed by examining the GLIMPSE images, based on a series of criteria that are explained in the following. Because the GLIMPSE on-line viewer has limited pixel resolution and is not efficient for inspecting a large number of specific locations, we downloaded original GLIMPSE cutouts around these 702 positions and constructed three-color images using the 3.6 (blue), 4.5 (green), and 8.0 μm (red) IRAC bands. This by-eye inspection led us to finally select 88 overdensities as locations of clusters, 17 of which are identified as known clusters from our literature compilation presented before. The remaining 71 new objects are listed in Table 2. The adopted identification is a record number (column 1) preceded by the acronym “G3CC” (GLIMPSE 3-Color Cluster, referring to the fact that the clusters were finally selected on the GLIMPSE three-color images). The final coordinates and the angular diameter (column 6) were estimated by eye on the GLIMPSE three-color images by fitting circles interactively with the display software SAO Image DS9. The visual criteria applied to select the 88 overdensities are identified for each new object as flags in the last column of Table 2. Figure 1 shows GLIMPSE three-color images of 6 clusters, illustrating these different criteria. An almost ubiquitous characteristic of the selected clusters (present in 82 cases) is their association with typical mid-infrared star formation signposts (see Section 1.1), namely: extended 8.0 μm emission in the immediate surroundings (flag E8, see Fig. 1(a,b,c,d,f)), likely representing radiation from UV-excited PAHs or warm dust; more localized extended 4.5 μm emission within the cluster area (flag E4, Fig. 1(a)), which might trace gas shocked by outflow activity from protostars (see Cyganowski et al. 2008, and references therein); and presence of an infrared dark cloud in which the cluster is embedded (flag DC, Fig. 1(a,e)). We also indicate whether a cluster appears to have more stellar members than those identified by the red-color criterion, including the following situations: cluster composed of red sources and additional bright normal (not reddened) stars (flag BR, Fig. 1(d)), suggesting that the cluster is in a more evolved phase, probably emerging from the molecular cloud; cluster exclusively composed of bright normal stars (flag B, but only two cases, in conjunction with flag V2, see below); and presence of additional probable YSOs within the cluster, identified as sources uniquely detected at 8.0 μm (flag U8, representing extreme cases of red color), or compact 8.0 μm objects not listed in the point source catalog or archive (flag C8, Fig. 1(b,c,d,f)), due to the bright and variable extended emission at this wavelength, saturation for bright sources, or localized diffuse emission around a particular source that makes its apparent size larger than a point source. The other flags indicate when the cluster shows up as a sparse, not centrally condensed set of sources (flag S, Fig. 1(b)), or if the cluster was noticed by eye on the GLIMPSE images in a nearby location of an automatically detected overdensity, but not exactly at the same position (flag V2).

The remaining positions were rejected as clusters, and typically correspond to background stars extinguished by dark clouds or seen behind foreground 8.0 µm diffuse emission, producing a red-source density enhancement by chance, sometimes together in the same line of sight with a couple of intrinsically red sources (YSOs) which, however, do not represent a cluster on their own. Quantitatively, we found that, in general, most of the rejected positions are overdensities with fewer elements than the ones selected as clusters. In fact, if we choose stricter thresholds of 8 sources for the small bin, and 10 sources for the large bin, instead of the originally used 5 and 7, respectively, the total set of overdensities decreases from 702 to just 87 independent positions, 37 of which represent our clusters. This would mean an improved “success” rate of for the automated method rather than the original . Furthermore, if we consider the effective number of elements in the 88 bins originally selected as being locations of clusters, i.e., summing possible additional stellar members (flags BR,C8,U8) within the bins, we find that 61 of our clusters satisfy the new threshold. We emphasize, however, that the additional stellar members of each cluster were recognized after detailed inspection of the GLIMPSE images, so that the use of low density thresholds in the automated method was necessary to identify the initial cluster locations, despite the consequent detection of many spurious red-source overdensities. If we had used the stricter thresholds from the beginning, we would have missed clusters. Column 7 of Table 2 lists for every cluster the estimated number of stellar members within the assumed radius, , counting the YSOs selected by the red-color criterion and the additional members identified in the images (flags BR,C8,U8). Note that this number represents a lower limit, especially in distant clusters, since lower mass members could still be undetected due to the limited angular resolution and sensitivity of the GLIMPSE data.
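The effect of the stricter thresholds can be checked with simple arithmetic. The sketch below assumes that the “success” rate is the number of positions accepted as clusters divided by the total number of overdensities; this definition and the function name `success_rate` are illustrative assumptions, while the counts are the ones quoted above.

```python
def success_rate(n_clusters, n_overdensities):
    """Fraction of automatically detected overdensities accepted as clusters.

    Assumed definition, for illustration only.
    """
    if n_overdensities <= 0:
        raise ValueError("n_overdensities must be positive")
    return n_clusters / n_overdensities


# Counts quoted in the text.
original = success_rate(88, 702)  # thresholds of 5 (small bin) and 7 (large bin)
stricter = success_rate(37, 87)   # thresholds of 8 (small bin) and 10 (large bin)

print(f"original thresholds: {original:.1%}")
print(f"stricter thresholds: {stricter:.1%}")
```

Under this assumed definition the rate rises by more than a factor of three, at the price of the clusters that no longer pass the thresholds.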

We note that, because our simple automated method to find YSO overdensities is based on the GLIMPSE point source catalog, it is unavoidably biased towards young ECs that are not yet associated with very bright extended emission, which would hide many of the cluster members from the point source detection algorithm. Fortunately, it is quite likely that those bright nebulae were already inspected for the presence of clusters in previous by-eye searches (see Section 4.5), so probably only a few of them are really missing in our total compiled sample. We nevertheless tried to complete our list of new clusters by performing a systematic visual inspection with the on-line viewer over the entire area surveyed by GLIMPSE, also including fully exposed clusters that appear bright at 3.6 µm (equivalent to flag ‘B’). From this process we found 23 additional clusters, of which, however, only 4 are new discoveries with respect to our literature compilation. They are marked in column 8 of Table 2 with a ‘V’, while the ones detected by the automated method are indicated with an ‘A’. We remark that, of the 17 known clusters we rediscovered from the red-source overdensities, only 3 are from the Mercer et al. (2005) list. This practically null overlap between the two detection methods demonstrates that our search is fully complementary and particularly useful to detect ECs, confirming the ideas we presented at the beginning of this Section.

Although our literature compilation of clusters is up to date until August 2011, it is interesting to cross-check our list of new GLIMPSE clusters with the ECs recently discovered by Majaess (2013), who applied a combination of color and spectral index criteria to find YSO candidates using the WISE and 2MASS catalogs, and then looked for clusters by visually inspecting the spatial distribution of the YSOs. We found that only 5 new GLIMPSE clusters (they are indicated in Table 2) are associated with objects from the published list by Majaess (2013); in particular, these 5 clusters are contained within the corresponding objects identified by Majaess (2013), which cover a much larger area. Due to the coarser angular resolution of WISE data with respect to GLIMPSE data, the typical stellar densities in our ECs are probably too high to make all the individual members detectable at the WISE resolution, and consequently they are hidden in the Majaess (2013) YSO selection.

## 3 Properties of the cluster sample

The next step of this work was to characterize the ATLASGAL emission, if present, at the positions of the star clusters compiled in Section 2, and to compare this emission with NIR and MIR images. Hereafter, our study is naturally restricted to the ATLASGAL Galactic range ( and ), and we refer to the list of the 695 stellar clusters within that range as the “whole cluster sample” (or simply as the “cluster sample”), unless otherwise noted. Together with this process, we performed a critical literature review in order to add and update distances and ages for an important fraction of the sample, as well as to look for connections with known H ii regions, IRDCs, and IR bubbles. We organize all this information in a unique catalog, whose construction is summarized in the following, and described in more detail in Appendix B. The catalog is only available in electronic form at the CDS, together with a companion list of all the references with the corresponding identification numbers used throughout the table. For illustration, an excerpt of the catalog is given in Appendix C.

### 3.1 ATLASGAL and MIR emission

In order to search for submm dust continuum emission tracing molecular gas likely associated with the clusters, we examined the ATLASGAL emission around the cluster positions. The column Morph is a text flag that gives information about the morphology of the detected ATLASGAL emission versus the IR emission. It is composed of two parts separated by a period. The first part describes how the ATLASGAL emission is distributed throughout the immediate star cluster area, including the following cases:

• emb: cluster fully embedded, with its center matching the submm clump peak (Fig. 2, top).

• p-emb: cluster partially embedded, whose area is not completely covered, or the submm clump peak is significantly shifted from the (proto-) stars locations (Fig. 2, bottom).

• surr: possibly associated submm emission surrounding the cluster or close to its boundaries (Fig. 3, top).

• few: one or a few ATLASGAL clumps within the cluster area (mostly for optical clusters having a large angular size), not necessarily physically related with the cluster.

• few*: the same morphology as before, but now the clump(s) is (are) likely associated with the star cluster according to previous studies in the literature, or because the kinematic distance derived from molecular lines agrees with the stellar distance. See Section 3.3 for a brief description of the distance determinations.

• exp: exposed cluster, without ATLASGAL emission in immediate surroundings (Fig. 3, middle and bottom).

• exp*: cluster that is physically exposed, but presents submm emission within the cluster area which appears in the same line of sight, but with a kinematic distance discrepant from the stellar distance (the cluster would be categorized as few or surr if no distance information were available).

We indicate in the second part of the column Morph (after the period) details about the mid-infrared morphology of each cluster, after visually inspecting GLIMPSE three-color images made with the 3.6 (blue), 4.5 (green) and 8.0 µm (red) bands. For a few clusters with no coverage in the GLIMPSE survey (7% of the cluster sample), we instead examined WISE three-color images using the 3.4, 4.6 and 12 µm filters. This flag includes the following cases:

• bub-cen: presence of an IR bubble which seems to be produced by the cluster through stellar feedback, and appears in the images centered near the cluster position (Fig. 3, top).

• bub-cen-trig: the same situation as before, together with the presence of possible YSOs at the periphery of the bubble identified by their reddened appearance in the images, suggesting triggered star formation generated by the cluster (see also Fig. 3, top).

• bub-edge: in this case, the cluster itself appears at the edge of an IR bubble, suggesting that it was probably formed by triggering from an independent cluster or massive star.

• pah: presence of bright and irregular emission at 8.0 µm (12 µm for WISE) which seems to be produced by the cluster through stellar radiative feedback (Fig. 2, bottom); it is attributed to radiation from UV excited PAHs or warm dust, but is not clearly identified as an IR bubble (though it sometimes shows bubble-like borders). [Footnote 8: This situation is conceptually different from the one indicated by the flag E8 for G3CC objects (see Section 2.1), where any extended 8.0 µm emission in the vicinity of the cluster is flagged. Here, the emission has to be located throughout most of the cluster area and appear as produced by the whole cluster.]

### 3.2 Correlation with known objects

Associated IR bubbles that are listed in the catalogs by Churchwell et al. (2006, 2007) are identified in the table column Bub. On the GLIMPSE three-color images and on the 8.0 µm images (WISE three-color and 12 µm images when GLIMPSE data were not available), we also identified the presence of an infrared dark cloud in which the cluster appears to be embedded (column IRDC; see Fig. 2, top), and we give the designation from the catalogs by Simon et al. (2006) or Peretto & Fuller (2009) when the object is listed there. Finally, we searched in the literature for associated H ii regions (column HII_reg), and we flagged the sources that have been classified in the literature as ultra compact (UC) H ii regions.

### 3.3 Distance and age

An important part of this work was to assign distances to as many clusters as possible. In this regard, we took advantage of the fact that many of the ATLASGAL clumps at the locations or in the vicinity of the stellar clusters have measurements of molecular line LSR velocities (e.g., Wienen et al. 2012; Bronfman et al. 1996; Urquhart et al. 2008). Using these velocities and a combined rotation curve based on the models by Brand & Blitz (1993) and Levine et al. (2008), we computed kinematic distances for the clumps (column KDist) and, therefore, for the corresponding clusters when they were assumed to be physically associated. The kinematic distance ambiguity (KDA) was disentangled mainly by searching for previous resolutions in the literature (e.g., Caswell & Haynes 1987; Faúndez et al. 2004; Anderson & Bania 2009; Roman-Duval et al. 2009), for the clumps themselves or nearby H ii regions in the phase space. A total of 424 clusters have kinematic distance estimates for the ATLASGAL clumps, 92% of which have available KDA solutions. The uncertainties (column e_KDist) have been determined by shifting the LSR velocities by  km s$^{-1}$ to account for random motions, following Reid et al. (2009), who suggest this value as the typical virial velocity dispersion of a massive star-forming region.
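The procedure described above can be illustrated with a deliberately simplified sketch. It does not use the combined Brand & Blitz (1993) and Levine et al. (2008) rotation curve adopted in this work; instead it assumes a flat rotation curve, circular orbits at $b \approx 0$, and illustrative values for the solar Galactocentric distance and circular speed (`R0_KPC`, `THETA0_KMS`). All function names are hypothetical. It returns the near and far solutions of the KDA and estimates the uncertainty by shifting the LSR velocity by a chosen amount `dv`.

```python
import math

R0_KPC = 8.3        # assumed Sun-Galactic center distance (illustrative value)
THETA0_KMS = 220.0  # assumed flat circular speed (illustrative value)


def kinematic_distances(glon_deg, v_lsr, r0=R0_KPC, theta0=THETA0_KMS):
    """Return (near, far) heliocentric distances in kpc, or None if undefined.

    Flat rotation curve: v_lsr = theta0 * sin(l) * (r0 / R - 1), solved for
    the Galactocentric radius R and then for the two line-of-sight roots.
    Intended for sources inside the solar circle.
    """
    sin_l = math.sin(math.radians(glon_deg))
    cos_l = math.cos(math.radians(glon_deg))
    denom = v_lsr + theta0 * sin_l
    if sin_l == 0.0 or denom == 0.0:
        return None
    r_gal = r0 * theta0 * sin_l / denom
    disc = r_gal ** 2 - (r0 * sin_l) ** 2
    if r_gal <= 0.0 or disc < 0.0:
        return None  # velocity not allowed by the model at this longitude
    root = math.sqrt(disc)
    return r0 * cos_l - root, r0 * cos_l + root


def distance_with_error(glon_deg, v_lsr, dv, near=True):
    """Distance and uncertainty from shifting v_lsr by -dv and +dv."""
    idx = 0 if near else 1
    center = kinematic_distances(glon_deg, v_lsr)
    low = kinematic_distances(glon_deg, v_lsr - dv)
    high = kinematic_distances(glon_deg, v_lsr + dv)
    if center is None or low is None or high is None:
        return None
    return center[idx], 0.5 * abs(high[idx] - low[idx])


# Usage with an arbitrary velocity shift (not the value adopted in the paper).
print(distance_with_error(30.0, 50.0, dv=5.0, near=True))
```

The choice between the near and far solution is left to the caller, mirroring the fact that the KDA has to be resolved with external information.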

We also compiled values for the stellar distance (column SDist) and age (column Age), estimated from studies of the stellar population of the clusters. These data were obtained from the original cluster catalogs or from new references found in SIMBAD. To prevent underestimation of the uncertainties (provided in columns e_SDist and e_Age), we imposed minimum errors depending on the computation method for the stellar distance, and on the range for the age (the latter following Bonatto & Bica 2011). Stellar distances are available for 222 clusters (32% of the sample), and ages for 209 clusters (30% of the sample). The most common method for stellar distance and age determination is isochrone fitting (e.g., Loktin et al. 2001), which implies that these parameters are available mainly for exposed clusters (see Section 4.7).

The final adopted distance for each cluster (column Dist) was chosen to be the available distance estimate with the lowest uncertainty. In some cases, we adopted independent distance estimates from the literature if they were more accurate than SDist and KDist (e.g., from maser parallax measurements; see Reid et al. 2009, and references therein). Clusters within a particular complex (identified in the column Complex) were assumed to be all located at the same distance, determined from the literature, or kinematically from an average position and velocity.
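The basic selection rule stated above (adopt the available estimate with the lowest uncertainty) can be sketched as follows. The dictionary layout and the function name `adopt_distance` are hypothetical, and the special treatment of clusters within complexes is not included.

```python
def adopt_distance(estimates):
    """Pick the available distance estimate with the smallest uncertainty.

    `estimates` maps a method name to (distance_kpc, uncertainty_kpc),
    or to None when that method is not available for the cluster.
    Returns (method, distance_kpc, uncertainty_kpc), or None if empty.
    """
    available = {k: v for k, v in estimates.items() if v is not None}
    if not available:
        return None
    best = min(available, key=lambda name: available[name][1])
    distance, uncertainty = available[best]
    return best, distance, uncertainty


# Toy values, for illustration only.
example = {"SDist": (2.4, 0.6), "KDist": (2.1, 0.4), "other": None}
print(adopt_distance(example))
```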

In total, there are distance determinations (Dist) for 538 clusters, i.e., for 77% of our sample. Naturally, there is a dichotomy in the distance estimation method depending on whether or not the cluster is associated with an ATLASGAL source with available velocity, so that most exposed clusters uniquely have stellar distances, whereas the distances for ECs are mainly kinematic or from associations with complexes. However, it is still possible to compare stellar and kinematic determinations for a subsample of 38 clusters (mostly embedded) which have distances available from both methods. This comparison is shown in Figure 4, where plus symbols mean agreement between stellar and kinematic distances within the corresponding uncertainties, and circles are the cases in which there is a discrepancy between both techniques; the color indicates which distance estimate was finally adopted in our catalog: stellar (red), kinematic (blue), and other (black). The plot reveals that in our cluster sample, both methods are quite consistent with each other, with 84% agreement (32 out of 38 objects). We note that among the discrepant cases, there are two ECs (points  kpc and  kpc in the plot) whose method for age and (stellar) distance estimation was found to be particularly inaccurate (see Section 4.7.1).

The rms between the stellar and kinematic distances compared in Figure 4 is 1.28 kpc, which represents the combined error, for this particular subsample, of both stellar and kinematic distances added in quadrature. If we compute this error from the estimated uncertainties e_KDist and e_SDist averaged over the subsample, we obtain a value of 1.59 kpc, which means that we slightly overestimated some of the uncertainties, probably because we were quite conservative in determining the minimum errors for the stellar distances (see Section B.5). The average uncertainties are  kpc and  kpc for the subsample of the 38 clusters used for comparison, and  kpc and  kpc for the whole sample. The high average error for the stellar distance in the subsample with respect to the whole sample is due to the fact that most of these clusters have stellar distances estimated from the spectrophotometric method, which is more inaccurate than, e.g., main sequence or isochrone fitting (see Section B.5). The average estimated uncertainty in the adopted distance is  kpc for the whole sample (and 0.52 kpc for the subsample).

## 4 Analysis

### 4.1 Morphological evolutionary sequence

Here, we use the characterization of the ATLASGAL emission found throughout each cluster’s area and/or environment (described in Section 3.1) to define main morphological types and delineate an evolutionary sequence. First, in order to test our visual ATLASGAL morphological flags specified above (corresponding to the first part of the column Morph, and represented hereafter by m), we compared them against the more quantitative parameter of our catalog, which is the projected distance of the nearest ATLASGAL emission pixel, normalized to the cluster angular radius. We found a reasonable correlation: for all deeply ECs (m = emb), for partially ECs (m = p-emb), for clusters surrounded by submm emission (m = surr), and for exposed clusters (m = exp). Exposed clusters with only comprise a few cases with a large angular size and very faint emission close to their borders. The remaining morphological flags are very specific and we do not expect any correlation with the quantity Clump_sep.

Denoting by Cf the first digit of the flag Clump_flag from our catalog (a value means that the nearest ATLASGAL clump is likely associated with the cluster), and using the logical operators $\wedge$, $\vee$ and $\neg$ (‘and’, ‘or’, and ‘not’, respectively), we define five morphological types as follows:

• EC1:

• EC2:

• OC0:

• OC1:

• OC2: (

The morphological type for each cluster is given in the column Morph_type of our catalog. Figures 2 and 3 present one example cluster for each morphological type, shown in GLIMPSE three-color images, and 2MASS three-color images overlaid with ATLASGAL contours. In simpler words, given that star clusters are expected to be less and less associated with molecular gas as time evolves, due to gas dispersal driven by stellar feedback, we have defined above a morphological evolutionary sequence, with decreasing correlation with ATLASGAL emission. EC1 are deeply ECs (Fig. 2, top), EC2 are partially ECs (Fig. 2, bottom), OC0 are emerging exposed clusters (Fig. 3, top), and finally there are two kinds of totally exposed clusters: OC1 are still physically associated with molecular gas in their surrounding neighborhood (an ATLASGAL clump at a projected distance of Clump_sep times the cluster radius, see Fig. 3, middle), whereas OC2 are all the remaining exposed clusters, which present no correlation with ATLASGAL emission (Fig. 3, bottom).

Note, however, that this classification is not perfect. For example, although the gas velocity and stellar distance data are quite extensive, they are not complete enough to identify all the , and cases, so that some misclassification might occur in the type OC2. Similarly, the physical link between the submm emission and the ECs was based on the morphology seen in the images, and some chance alignments might still be present in a few cases (estimated to be about 5%, see Section 4.2). Therefore, the defined morphological types should primarily be considered in a statistical way, and for individual objects they must be treated with caution. Column 2 of Table 3 lists how many objects fall in each morphological type for the whole cluster sample. Note that the low number of OC1 clusters could be partially due to the observational difficulty in identifying an exposed cluster physically associated with molecular gas in its surroundings, as remarked before. Column 3 gives the number of clusters with available distances, and the remaining columns will be described in Section 4.6.

With this morphological classification, it is easy to determine (again, statistically) which clusters are associated with ATLASGAL emission: simply as those with types EC1,EC2,OC0 or OC1. These clusters are counted for every catalog in the last two columns of Table 1, as absolute and after-merging numbers of objects ( and , respectively). As expected, optical clusters are rarely associated with ATLASGAL emission (only of them, most of which are of type OC0 or OC1), since otherwise they would be barely visible at optical wavelengths due to dust extinction. On the other hand, the majority of the NIR and MIR clusters are physically related with submm dust radiation ( and 74% of them, respectively). Although this is also expected because infrared emission is much less affected by dust extinction than visible light, these high percentages might partially be a consequence of the detection method of the infrared cluster catalogs, which in most cases tried to intentionally highlight the EC population. For example, the 2MASS by-eye searches by Dutra et al. (2003a) and Bica et al. (2003b) were done towards known radio/optical nebulae, and our new GLIMPSE cluster candidates were detected after applying a red-color criterion (see Section 2.1). In these particular catalogs, almost the totality of objects are associated with ATLASGAL emission.

### 4.2 Chance alignments

We computed the probability of chance alignments of our stellar clusters with ATLASGAL clumps, and with the different known objects that were searched for spatial correlation in our catalog (see Section 3.2), in order to test the validity of the assumption of physical relation when this is based only on the position of the objects on the sky. For a given sample of objects, this probability was estimated semi-analytically by assuming that the objects within (where most sources are located for all samples used) and the longitude range originally covered, are uniformly distributed over that area, and that their angular sizes are distributed according to the observed sizes. We first calculated the probability of overlap of each cluster with one or more objects from this hypothetical sample, and then we averaged these probabilities over two different sets of clusters: morphological types EC1 and EC2 together (hereafter EC-); and types OC0, OC1 and OC2 together (hereafter OC-).
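One possible way to write such a semi-analytic estimate is sketched below. It is not necessarily the exact computation used here: it assumes circular clusters and objects on a flat sky patch, counts an overlap whenever the center separation is smaller than the sum of the two radii, treats the objects as independent, and neglects edge effects. The function names and toy numbers are hypothetical.

```python
import math


def overlap_probability(cluster_radius, object_radii, area):
    """Probability that a cluster overlaps at least one object.

    Objects are assumed uniformly and independently distributed over `area`;
    radii and area must use consistent units (e.g., deg and deg^2).
    """
    p_none = 1.0
    for r in object_radii:
        # Chance that this single object's center falls close enough to overlap.
        p_single = min(1.0, math.pi * (cluster_radius + r) ** 2 / area)
        p_none *= 1.0 - p_single
    return 1.0 - p_none


def mean_overlap_probability(cluster_radii, object_radii, area):
    """Average of the per-cluster overlap probabilities over a set of clusters."""
    probs = [overlap_probability(rc, object_radii, area) for rc in cluster_radii]
    return sum(probs) / len(probs)


# Toy example: 1000 identical objects of radius 0.01 deg over 100 deg^2.
objects = [0.01] * 1000
print(mean_overlap_probability([0.01, 0.05], objects, area=100.0))
```

Because the single-object probability grows with the square of the summed radii, clusters with larger angular sizes obtain markedly higher chance alignment probabilities in this sketch.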

For ATLASGAL clumps, we adopted a total number of 6451 objects within and , from the compact source catalog by Contreras et al. (2013), which, together with their estimated effective radii, gives an average chance alignment probability of 8.8% for clusters with types EC-, and 32% for clusters with types OC-. Considering that the submm and infrared morphologies of deeply ECs (type EC1) usually support the real physical relation with molecular gas (e.g., matching peaks of submm emission and stellar density), and that partially ECs (type EC2) are generally associated with more than one ATLASGAL clump, in practice the fraction of chance alignments of EC- clusters with ATLASGAL compact sources is likely below 5%, which is low enough to not affect the statistics of this work. Due to their larger angular sizes, clusters of types OC- are more prone to be aligned with ATLASGAL clumps by chance, and therefore our additional requirements to assume that an exposed cluster is associated with ATLASGAL emission are justified (morphological criteria or matching distances for types OC0 and OC1).

For the known objects considered in our catalog, we assume that there are 4936 IR bubbles in the range and (Simpson et al. 2012) [Footnote 10: This is a recent catalog of IR bubbles which is much more complete than the Churchwell et al. (2006, 2007) catalogs, but was not used in this work because it was published after our cluster catalog was constructed. In any case, we searched for IR bubbles by eye at every cluster position to describe the MIR morphology (see Section 3.1).], IRDCs within and (from the catalogs by Simon et al. 2006; Peretto & Fuller 2009), and 944 H ii regions in the range and (from the recently discovered and previously known H ii regions listed in Anderson et al. 2011). In this case, to compute the chance alignment probability of each cluster with the objects of a given sample, we also required that the objects were larger than half the size of the cluster and that the distance between the object’s position and the cluster center were less than the sum of both radii divided by two, so that the alignment really mimics a physical relation misidentified by eye. The averaged probabilities are quite similar for clusters with types EC- and OC-, and they are all low: for IR bubbles, for IRDCs, and for H ii regions.

### 4.3 Observational classification of OCs and ECs

We can also use the morphological evolutionary sequence established in Section 4.1 to observationally define in our sample the concepts of EC and OC. Since any stellar agglomerate that appears deeply or partially embedded in ATLASGAL emission would satisfy our physical definition of EC presented in Section 1.2, we simply use as observational definition the embedded morphological types: EC = EC1 $\vee$ EC2. We consider the remaining morphological types as OCs, but excluding those objects that have not been confirmed by follow-up studies, since we expect for them a high contamination rate by spurious candidates (see Section A.4): OC = (OC0 $\vee$ OC1 $\vee$ OC2) $\wedge$ (ref_Conf not empty), where ref_Conf is the column in the catalog indicating the reference for cluster confirmation (see Section B.5).
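The two observational definitions translate directly into a small classification function. The sketch below assumes that the catalog columns Morph_type and ref_Conf are available as strings (with an empty string or `None` meaning no confirmation reference); the function name and this representation are hypothetical.

```python
EMBEDDED_TYPES = {"EC1", "EC2"}
EXPOSED_TYPES = {"OC0", "OC1", "OC2"}


def observational_class(morph_type, ref_conf):
    """Return 'EC', 'OC', or None for an unconfirmed exposed candidate.

    morph_type: value of the Morph_type column.
    ref_conf:   value of the ref_Conf column (empty or None if unconfirmed).
    """
    if morph_type in EMBEDDED_TYPES:
        return "EC"
    confirmed = ref_conf is not None and str(ref_conf).strip() != ""
    if morph_type in EXPOSED_TYPES and confirmed:
        return "OC"
    return None


print(observational_class("EC2", ""))    # embedded types need no confirmation
print(observational_class("OC2", "12"))  # exposed and confirmed
print(observational_class("OC2", ""))    # exposed but unconfirmed
```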

However, this observational definition of OC does not necessarily mean that the cluster is bound by its own gravity, and therefore is not fully equivalent to the concept of physical OC defined in Section 1.2. To investigate under which conditions both definitions agree, we can apply the empirical criterion proposed by Gieles & Portegies Zwart (2011), which distinguishes between physical OCs and associations by comparing the age of the object with its crossing time, $t_{\rm cross}$, computed as if it were in virial equilibrium. In useful physical units, Equation (1) of Gieles & Portegies Zwart (2011) becomes [Footnote 11: Before converting to physical units, we corrected a mistake in the original equation by Gieles & Portegies Zwart (2011): the transformation from virial radius to projected half-light radius is just for a Plummer model, so that the constant in their equation is instead of 10.]

$$
t_{\rm cross} = 9.33 \left(\frac{100\,M_\odot}{M}\right)^{1/2} \left(\frac{R_{\rm eff}}{\rm pc}\right)^{3/2}\,{\rm Myr}, \tag{1}
$$

where $M$ and $R_{\rm eff}$ are, respectively, the mass and the observed 2D projected half-light radius of the cluster. Unfortunately, mass estimates and accurate structural parameters are usually not directly available in the OC catalogs; in particular, there are no mass data in the Dias et al. (2002) catalog, and the given sizes come from individual studies compiled there and are mostly derived from visual inspection. We therefore used the masses and radii determined by Piskunov et al. (2007), who fitted a three-parameter King profile (King 1962) to the observed stellar surface density distribution of 236 objects taken from a homogeneous sample of 650 optical clusters in the solar neighborhood (Kharchenko et al. 2005b, a), which is a subset of the current version of the Dias et al. (2002) catalog. Piskunov et al. (2007) estimated the masses from the tidal radii, and the effective radius entering in Equation (1) can be derived from both the core and tidal radius (we used Equation (B1) of Wolf et al. 2010). Because only 14 of the clusters analyzed by Piskunov et al. (2007) are within the ATLASGAL sky coverage, in order to improve the statistics we applied the Gieles & Portegies Zwart (2011) criterion to the 236 studied objects, under the assumption that they are all OCs as observationally defined by us. This supposition is quite acceptable since they are optically-detected clusters and indeed within the ATLASGAL range almost all of them (13 out of 14) are classified as OCs.

We computed the crossing times using Equation (1), and in Figure 5 they are plotted versus the corresponding ages available from the Kharchenko et al. (2005b, a) catalogs. The dashed line is the identity $t_{\rm cross} =$ Age, which divides the physical OCs ($t_{\rm cross} \leq$ Age) from associations ($t_{\rm cross} >$ Age). It can be seen in the plot that, because the resulting crossing times are relatively short (), the majority of the objects studied by Piskunov et al. (2007) are physical OCs for ages in excess of 10 Myr. In fact, for , which is the threshold above which the age distribution can uniquely be explained through classical cluster disruption mechanisms (see Section 4.7.2), only 2.6% of the objects are formally associations. We thus conclude that our observational definition of OC agrees with the physical one provided by Gieles & Portegies Zwart (2011, what we call a physical OC) for ages greater than  Myr, which corresponds to 74% of our OC sample within the ATLASGAL range. Younger OCs can be either associations, as a result of early dissolution, or already physical OCs.
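For reference, Equation (1) and the comparison with the age can be written as a short sketch. The numerical form of the crossing time is taken from Equation (1); the function names are hypothetical, and the convention that an object is a physical OC when its crossing time does not exceed its age follows the Gieles & Portegies Zwart (2011) criterion as described above.

```python
def crossing_time_myr(mass_msun, r_eff_pc):
    """Crossing time from Equation (1), in Myr.

    mass_msun: cluster mass in solar masses.
    r_eff_pc:  projected half-light radius in pc.
    """
    if mass_msun <= 0.0 or r_eff_pc < 0.0:
        raise ValueError("mass must be positive and radius non-negative")
    return 9.33 * (100.0 / mass_msun) ** 0.5 * r_eff_pc ** 1.5


def is_physical_oc(age_myr, mass_msun, r_eff_pc):
    """True if the object is older than (or as old as) its crossing time."""
    return crossing_time_myr(mass_msun, r_eff_pc) <= age_myr


# Toy example: a 400 Msun cluster with a 1 pc half-light radius.
print(crossing_time_myr(400.0, 1.0))
print(is_physical_oc(age_myr=20.0, mass_msun=400.0, r_eff_pc=1.0))
```

Since the crossing time scales as $M^{-1/2} R_{\rm eff}^{3/2}$, compact and massive objects satisfy the criterion at much younger ages than extended, low-mass ones.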

### 4.4 Spatial distribution

In this Section, we study the spatial distribution in the Galaxy, and with respect to the Sun, of the clusters in our sample with available distance estimates. Figure 6 shows the Galactic distribution of the clusters separated in the (a) OC and (b) EC categories defined in the previous Section, on top of an artist’s conception of the Milky Way viewed from the north Galactic pole (R. Hurt from the Spitzer Science Center, in consultation with R. Benjamin). The image was constructed based on multiwavelength data obtained from the literature, and we have scaled it to  kpc (Genzel et al. 2010, see Section B.4). It is clear from the image that ECs probe the inner Galaxy more deeply than the OC sample, which is concentrated within a few kpc from the Sun ( kpc). This, of course, is an observational effect mainly produced by the difficulty in detecting exposed clusters against the Galactic background, compared to ECs (see Section 4.5), and enhanced by the fact that some genuine OCs have no distance estimates and therefore cannot be included in the spatial distribution analysis (e.g., there are 123 clusters of type OC2 without available distance, half of which might be real). ECs are spread over larger distances from the Sun ( kpc) and, although few of them can be detected beyond the Galactic center, a paucity of ECs is hinted within the Galactic bar, augmented by some apparent crowding close to both ends of the bar. The Galactic distribution of ECs is consistent with the spiral structure delineated on the background image; however, the large distance uncertainties ( kpc on average, see Section 3.3), and the limited distance coverage, prevent the ECs from clearly defining the spiral arms on their own.

To really quantify how deep our OC and EC samples reach into the inner Galaxy, and to estimate the completeness fraction at a given distance, we need to study the observed heliocentric distance distribution of the clusters, and compare it to what is expected from making some basic assumptions. In the following, we denote by $D$ the distance of the cluster from the Sun, projected on the Galactic plane [Footnote 12: In practice, we did not distinguish between the distance and the projected distance . Since the maximum latitude within the ATLASGAL range is , the difference is less than 0.03%, far below the distance uncertainties.], and by $z$ the height of the cluster above the Galactic plane. For simplicity, we also define $Z \equiv z - Z_0$, where $Z_0$ is the displacement of the Sun above the plane; this is actually what we obtain directly from the cluster distance and its Galactic latitude $b$, $Z = D\tan b$. [Footnote 13: In this paper, for simplicity we have assumed that the plane is parallel to the “true” Galactic plane, although in reality this is not the case (Goodman et al., in preparation). While this has a negligible effect on the distance distribution and the completeness, it may distort the derived height distribution when considering clusters at large distances from the Sun (see Section 4.4.3).] The observed $Z$- and $D$-distributions are shown, respectively, in Figures 7 and 8, for our cluster sample separated in OC and EC categories. In the construction of the histograms, we used fixed bins of  pc and  kpc, but since the distance uncertainties are quite nonuniform, we have fractionally spread the ranges determined by the central values and their uncertainties over the covered bins. In other words, for a cluster with a given distance and uncertainty, we considered all the bins overlapping with the range spanned by the distance minus and plus its uncertainty, and in each bin we added the fraction (with respect to the total width of the range, i.e., twice the uncertainty) comprised by the corresponding overlap. The total OC and EC distance distributions were obtained by repeating this procedure for all the clusters. The $Z$-distributions were constructed using the same method, and the fitted curves plotted in Figures 7 and 8 are explained in the following.
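The fractional binning described above can be sketched as follows. The function name and the handling of two corner cases are assumptions made for illustration: a value with zero uncertainty is placed entirely in the bin that contains it, and any part of a range falling outside the histogram limits is simply dropped.

```python
def fractional_histogram(values, errors, bin_edges):
    """Histogram in which each object is spread over [value - err, value + err].

    Each object contributes to every bin the fraction of its range that
    overlaps the bin, so an object fully inside the histogram adds up to 1.
    """
    counts = [0.0] * (len(bin_edges) - 1)
    for value, err in zip(values, errors):
        low, high = value - err, value + err
        width = high - low
        for i in range(len(counts)):
            left, right = bin_edges[i], bin_edges[i + 1]
            if width <= 0.0:
                # No uncertainty: whole object goes into its containing bin.
                if left <= value < right:
                    counts[i] += 1.0
                continue
            overlap = min(high, right) - max(low, left)
            if overlap > 0.0:
                counts[i] += overlap / width
    return counts


# Two toy clusters (distances and uncertainties in kpc), bins of 1 kpc.
print(fractional_histogram([1.0, 0.25], [0.5, 0.25], [0.0, 1.0, 2.0]))
```

Compared with a standard histogram, this gives poorly constrained objects a smoother, lower-amplitude contribution, so that a few clusters with large errors do not produce spurious peaks in single bins.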

#### 4.4.1 Assumed model for the spatial distribution

In general, we can assume that the spatial number-density of OCs or ECs in the Galaxy is described by a combination of two independent exponential-decay laws for the cylindrical coordinates $R$ and $z$, centered on the Galactic center: $\rho(R,z) = \rho_0\,\varphi_R(R)\,\varphi_z(z)$, with and . This is a common functional form used to characterize the Galactic distribution of stars (see Section 1.1.2 of Binney & Tremaine 2008), and has already been applied in previous OC studies (Bonatto et al. 2006; Piskunov et al. 2006). One might want to consider the imprint of spiral arm structure in the azimuthal distribution of ECs, since they are still embedded in molecular clouds, but here we are interested in the distance and height longitude-averaged distributions, for which azimuthal substructure is less important. Furthermore, as noted above, our EC distances are not accurate enough to constrain the location of the spiral arms. If we transform the density to a coordinate system centered at the Sun, and assume that we are observing the totality of the clusters in the Galaxy within the ATLASGAL range ( and , with and ), the resulting density (not averaged in longitude yet) can be written as

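$$
\rho_{\rm tot}(D,\ell,Z) =
\begin{cases}
\rho_0\,\varphi(D,\ell)\,\varphi_z(Z+Z_0) & \text{if } |Z| \le D\tan b_1 \\
0 & \text{else,}
\end{cases}
\tag{2}
$$

where

$$
\varphi(D,\ell) \equiv \varphi_R\!\left(\sqrt{R_0^2 + D^2 - 2R_0 D\cos\ell}\,\right). \tag{3}
$$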

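Now we can derive an analytical expression for the $D$-distribution of an ideally complete sample:

$$
\begin{aligned}
\Phi^{\rm tot}_D(D) &\equiv \int_{-\infty}^{\infty}\int_{-\ell_1}^{\ell_1} \rho_{\rm tot}(D,\ell,Z)\, D\, d\ell\, dZ & (4)\\
&= \Sigma_0\, f_{b_1}(D)\, D \int_{-\ell_1}^{\ell_1} \varphi(D,\ell)\, d\ell\,, & (5)
\end{aligned}
$$

where $\Sigma_0$ is the surface number-density on the Galactic disk for , and we have defined the function $f_{b_1}(D)$ as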

 (6)

which arises from the fact that the limited latitude coverage restricts the integration in $Z$ at each distance.
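As a consistency check on the step from Equation (4) to Equation (5), the latitude restriction in Equation (2) can be carried through the integral explicitly. The following is only a sketch derived from Equations (2), (4) and (5) as written above; it fixes the product $\Sigma_0 f_{b_1}(D)$ but assumes nothing about how the normalization is split between the two factors.

$$
\begin{aligned}
\Phi^{\rm tot}_D(D)
&= \int_{-D\tan b_1}^{D\tan b_1}\int_{-\ell_1}^{\ell_1}
   \rho_0\,\varphi(D,\ell)\,\varphi_z(Z+Z_0)\, D\, d\ell\, dZ \\
&= \left[\rho_0 \int_{-D\tan b_1}^{D\tan b_1} \varphi_z(Z+Z_0)\, dZ\right]
   D \int_{-\ell_1}^{\ell_1} \varphi(D,\ell)\, d\ell\,,
\end{aligned}
$$

so that agreement with Equation (5) requires

$$
\Sigma_0\, f_{b_1}(D) = \rho_0 \int_{-D\tan b_1}^{D\tan b_1} \varphi_z(Z+Z_0)\, dZ .
$$

The integrand is separable because $\varphi(D,\ell)$ does not depend on $Z$ and $\varphi_z$ does not depend on $\ell$, which is why the latitude cut enters only through a distance-dependent factor.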

#### 4.4.2 Completeness fraction

In practice, however, as already mentioned and discussed in Section 4.5, we are unable to detect the totality of the clusters within the ATLASGAL range, due to the difficulty in star cluster identification towards the inner Galaxy. Indeed, the $D$-distributions that we really observe for OCs and ECs (see Figure 8) do not increase with distance up to the Galactic center (), as we would expect from Equation (5); instead, they reach a maximum at a nearby distance and then decay considerably, especially for optical clusters. The observed $D$-distributions are dominated by the high incompleteness at increasingly larger distances from the Sun, and therefore are insensitive to large scale structure on the Galactic disk such as the scale length. Attempts to include the scale length in the parametric fit to the distance distributions described below resulted in heavily degenerate output parameters and practically no constraint on their values. We then eliminated the dependence of the model on the scale length by making the rough approximation that the underlying radial distribution of clusters is uniform, i.e., $\varphi_R \equiv 1$. This is supported by the fact that, due to the incompleteness, most clusters in our sample are within a few kpc from the Sun, where the variations in $\varphi_R$ can be considered small relative to the completeness decay. The constants $\rho_0$ and $\Sigma_0$ must now be interpreted as Solar neighborhood values, and from Equation (5) the complete $D$-distribution becomes

$$\Phi_D^{\rm tot}(D) = 2\,\ell_1\,\Sigma_0\,f_{b_1}(D)\,D\,. \tag{7}$$

On the other hand, defining a fractional factor $f_c(D)$ that quantifies the completeness of the cluster sample as a function of distance [14], we can express the observed $D$-distribution as

$$\Phi_D(D) = 2\,\ell_1\,\Sigma_0\,f_c(D)\,f_{b_1}(D)\,D\,. \tag{8}$$

Footnote 14: Ideally, one should consider a completeness fraction that also depends on Galactic longitude, $f_c(D,\ell)$, as we expect lower cluster detectability for low $|\ell|$, where the stellar background is higher. However, since we made the approximation $\varphi(D,\ell) = 1$, the integration in longitude would only affect the term $f_c(D,\ell)$, and therefore the factor $f_c(D)$ we used can be thought of as a longitude-averaged completeness fraction.

In order to assign a particular parametric shape to the completeness fraction, we chose an ansatz for $f_c(D)$ based on previous statistical works on OCs over the whole sky. Bonatto et al. (2006) studied the WEBDA database [15] as it was at that time and found, by completeness simulations, that their analyzed OC sample is highly incomplete in the inner Galaxy, even within what they called the “restricted zone”, defined as an annulus segment with Galactocentric distances in the range . The completeness fraction they determined decays almost immediately from to (see their Fig. 11; note that  kpc in that work). However, Piskunov et al. (2006) claim that the Kharchenko et al. (2005b, a) OC catalogs constitute a complete sample up to about 0.85 kpc from the Sun. This is nicely illustrated in their Fig. 1, where a flat distribution of surface number-density of clusters is exhibited up to that distance, after which the distribution starts to decrease considerably. If the completeness fraction of their sample in the inner Galaxy were similar to that obtained by Bonatto et al. (2006), the surface density distribution would be a decreasing function immediately from  kpc rather than from  kpc [16]. We think that this discrepancy is mainly caused by two effects: 1) the cluster sample studied by Bonatto et al. (2006) (654 objects with known distances) is less complete than, e.g., the current version of the Dias et al. (2002) catalog used in this work (1309 clusters with available distances), which is equivalent to the Kharchenko et al. (2005b, a) sample within 0.85 kpc; and 2) the “restricted zone” considered by Bonatto et al. (2006) covers a larger area than the circle defined by the completeness limit of Piskunov et al. (2006) (radius of 0.85 kpc centered at the Sun), and thus includes regions where the OC sample is indeed incomplete. In fact, we performed a quick test on the current Dias et al. (2002) catalog by constructing the distribution of Galactocentric radii of clusters within 1 kpc from the Sun, and we obtained a shape that is not incompatible with an exponential law in the whole range, as opposed to the distribution derived by Bonatto et al. (2006, their Fig. 9).

Footnote 15: WEBDA is an on-line OC database originally developed by Mermilliod (1996), and available at http://www.univie.ac.at/webda/; the clusters of this database are included in the Dias et al. (2002) catalog.

Footnote 16: We checked by numerical integration of that the rise of the surface density distribution in the inner Galaxy due to an exponential Galactic disk is practically imperceptible for  kpc, and therefore, a flat distribution cannot be the combined result of incompleteness and exponential disk structure.

Based on the above discussion, the completeness fraction for our OC sample is likely $f_c = 1$ up to a close distance from the Sun, $D_c$, and then starts to decay significantly. We assume that the decay is exponential:

$$f_c(D) = \begin{cases} 1 & \text{if } D \le D_c \\ e^{-(D-D_c)/s_0} & \text{else.} \end{cases} \tag{9}$$

This parametrization allows us to investigate the possibility that the sample is always incomplete, as for Bonatto et al. (2006), by just imposing $D_c = 0$. We employ the same functional form for the completeness fraction of ECs, but of course varying the parameters $D_c$ and $s_0$.
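The behavior of this parametrization can be sketched numerically. The snippet below implements the completeness fraction of Equation (9) and the observed distribution of Equation (8); the function names, the parameter values, and the default of setting the latitude factor to unity are illustrative assumptions, not the fitted model. It shows that the product $D\,f_c(D)$ rises linearly up to $D_c$ and, when $s_0 > D_c$, peaks at $D = s_0$ before decaying, which is the qualitative shape of the observed histograms.

```python
import math

def f_c(D, Dc, s0):
    """Completeness fraction of Eq. (9): unity up to Dc, exponential decay beyond."""
    return 1.0 if D <= Dc else math.exp(-(D - Dc) / s0)

def phi_D_obs(D, sigma0, Dc, s0, l1_deg=60.0, f_b1=None):
    """Observed distance distribution of Eq. (8).

    f_b1 is an optional callable for the latitude factor; if omitted it is
    set to 1 (an assumption made here only to keep the sketch short).
    """
    latitude_factor = 1.0 if f_b1 is None else f_b1(D)
    return 2.0 * math.radians(l1_deg) * sigma0 * f_c(D, Dc, s0) * latitude_factor * D

# Hypothetical parameters (kpc and kpc^-2), chosen only for illustration.
grid = [i * 0.01 for i in range(1, 1001)]
peak = max(grid, key=lambda D: phi_D_obs(D, sigma0=1.0, Dc=1.0, s0=1.5))
print("distribution peaks near D =", peak, "kpc")
```

Setting `Dc=0` in this sketch reproduces the always-incomplete case considered by Bonatto et al. (2006).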

#### 4.4.3 Fit for the height distribution

Before proceeding to fit Equation (8) to the observed $D$-distributions, we first need some estimates for $z_0$ and $z_h$, which are used to compute the factor $f_{b_1}(D)$. We obtain those estimates from the $Z$-distribution, which can be analytically written as

$$\Phi_Z(Z) = e^{-|Z+z_0|/z_h}\int_{|Z|/\tan b_1}^{\infty}\frac{\Phi_D(D)}{2z_h\,f_{b_1}(D)}\,{\rm d}D\,. \tag{10}$$

The advantage in writing this equation explicitly in terms of $\Phi_D(D)$ is that we can directly use the observed $D$-distribution instead of its analytical expression (and compute the integral numerically), so that it is possible to fit the $Z$-distribution with only two free parameters, $z_0$ and $z_h$, independently of the fit for the distance distribution. All the fits were performed using the Levenberg-Marquardt least-squares minimization package mpfit (Markwardt 2009), implemented in IDL, and we have assumed Poisson uncertainties. The best fit of Equation (10) to the observed $Z$-distribution of OCs is shown in Figure 7(a) as a solid curve, and the corresponding fitted parameters are  pc and  pc. These values are in excellent agreement with the ones derived by Bonatto et al. (2006), if we consider their scale height within the Solar circle (which is the case for almost the totality of our OC sample).

The observed $Z$-distribution of ECs (Figure 7(b)) is much more irregular than that of OCs, and therefore a proper fit is not possible. This is likely due to the fact that ECs are spread over a larger area than OCs, and therefore present lower statistics in the Solar neighborhood and larger average errors in $Z$ (). In addition, ECs are usually grouped in complexes, as we will see in Section 4.8 and can already be noted in Figure 6(b), where some particular locations appear crowded with many close objects, enhancing the non-uniformity of their spatial distribution. However, if we adopt the same parameters $z_0$ and $z_h$ derived from the OC sample and compute the predicted distribution from Equation (10) (naturally, now using the observed $\Phi_D(D)$ of ECs), the resulting curve is roughly consistent with the observed $Z$-distribution, as shown in Figure 7(b) (solid line). The most systematic discrepancy can be identified for  pc, where there is a significant deficit of observed clusters with respect to the predicted distribution, probably due to the difficulty in detecting ECs below the Galactic disk at large distances. Indeed, Figure 7(b) also shows the observed $Z$-distribution for ECs with  kpc (darker inner histogram) and the corresponding prediction (dashed curve), and we can see that in this case the deficit of observed clusters below the Galactic plane is only marginal. Another explanation might be the fact that we have assumed that the $b = 0$ plane is parallel to the Galactic disk, while in reality the combined effect of the offset of the Sun above the “true” Galactic plane, and of the Galactic center below the $b = 0$ plane, slightly tilts the $b = 0$ plane towards the south of the Galaxy (see Goodman et al., in preparation), so that clusters at large distances from the Sun and below the Galactic plane would appear at more negative values in the true $Z$-distribution. This could help to populate the bins in the range of the deficit of observed clusters, and would also explain why the deficit is less important for the distribution of clusters with  kpc.

#### 4.4.4 Fit for the distance distribution

Using now the values for $z_0$ and $z_h$ obtained from the OC sample, which are also consistent with the EC height distribution, to compute the factor $f_{b_1}(D)$ defined in Equation (6), we fitted the analytical distribution from Equation (8) to the observed $D$-distributions of OCs and ECs, with free parameters $\Sigma_0$, $D_c$, and $s_0$. The last two parameters are implicit in the completeness factor $f_c(D)$ defined in Equation (9). The best fits are overplotted as solid curves on the corresponding histograms of Figure 8, and the fitted parameters are given in Table 4. As can already be noted in the plots and is confirmed by the reduced $\chi^2$ values (0.90 for OCs, and 1.48 for ECs), the assumed form of the completeness fraction (Equation (9)) is a good representation of the overall detectability of star clusters in the inner Galaxy. The few outliers in the observed distribution with respect to the fitted analytical function for OCs with distances  kpc mainly correspond to exposed clusters recently discovered at infrared wavelengths. A similar tendency is hinted at for ECs with  kpc, although in this case these outliers are also consistent with the irregular nature of the distribution in general, which slightly deviates (at the one-sigma level) from the fitted curve at other distance bins. However, some problems with the resolution of the KDA, resulting in ECs incorrectly assigned to the far distance, cannot be ruled out.

It is remarkable that, despite the lower statistics caused by restricting to the ATLASGAL range, the fitted completeness limit of our OC sample,  kpc, is consistent with that derived by Piskunov et al. (2006) for their all-sky sample in the Solar neighborhood [17]. For ECs, both the completeness limit and the completeness scale length are larger than the corresponding values of the OC distribution (see Table 4), quantitatively confirming that, from an observational point of view, the EC sample traces larger distances from the Sun than the ones traced by our OC sample.

Footnote 17: Very recently, a significant effort in obtaining distances and other parameters of most of the known OCs and ECs has been published by Kharchenko et al. (2013), who claim an overall completeness limit of 1.8 kpc. Since ECs are not dominant within a complete sample, the new limit represents an intrinsic improvement in the OC completeness.

The fitted completeness limits for OCs and ECs are significantly above zero, practically discarding the possibility that the cluster samples are always incomplete in the inner Galaxy, as suggested by Bonatto et al. (2006) for OCs. To further test this option, we performed an alternative fit of Equation (8) to the observed $D$-distributions, now fixing $D_c = 0$. For each distribution in Figure 8, the resulting best fit is shown as a dashed line, and we immediately notice that this alternative fit is poorer than the one with $D_c$ as a free parameter, especially for OCs. Indeed, we applied a Kolmogorov-Smirnov test to all the fitted distribution functions in a distance range free of far-distance outliers ( kpc for OCs,  kpc for ECs), and we found that the $D_c = 0$ fit can be rejected with a significance level of 5% for OCs, and 6.5% for ECs. We thus conclude that the OC and EC samples in the inner Galaxy are roughly complete up to a distance of  kpc and  kpc, respectively, as derived from the free-$D_c$ fits.

### 4.5 Discussion on the completeness

In general, the existence of a stellar cluster is observationally established by an excess surface density of stars over the background, so that its detectability depends on its richness, its angular size, the number of resolved individual members and their apparent brightness (which is directly related to the distance), the surface density of field stars, and the amount of extinction on the line of sight (Lada & Lada 2003). Consequently, it is particularly difficult to identify a star cluster in the inner Galactic plane, where both the stellar background and the extinction are relatively high, or a very distant cluster, for which its members appear faint and could be confused as a few single stars due to limited angular resolution of the observations. In fact, we have shown in the previous Section that the current samples of OCs and ECs in the inner Galaxy are complete up to only a close distance from the Sun, and then the completeness heavily decreases as distance increases.

We have also seen that incompleteness affects the OC sample more severely than the ECs, i.e., the latter have a higher completeness limit and a less drastic decay in the completeness fraction. At first glance, this might seem contradictory since ECs are, by definition, embedded in molecular clouds and thus subject to a high degree of in situ dust extinction. However, at infrared wavelengths, ECs become easier to detect than exposed clusters because it is easier to distinguish them from the field population. Since ECs are usually associated with illuminated interstellar material, they can be identified by eye towards the locations of known nebulae or star-forming regions (e.g., Dutra et al. 2003a; Bica et al. 2003b; Borissova et al. 2011), even if the clusters are partially resolved or highly contaminated by extended emission. In other words, although bright nebular emission can prevent young stars from being found by point source detection algorithms and therefore hide the host EC from automated searches, at the same time it can help to identify such a cluster when searched by eye against a high stellar background. For clusters with fainter or less irregular extended emission, automated searches can also take advantage of some distinctive characteristic of ECs (like the red-color criterion of our GLIMPSE search, see Section 2.1) to separate them from the background, which is in general not feasible for an evolved OC because its member stars present observational properties similar to those of the field population.

It is interesting to compare our distance distribution of ECs (Figure 8(b)) with that of individual Spitzer-detected YSOs (Robitaille et al. 2008), as simulated by Robitaille & Whitney (2010) using a population synthesis model. They show that the synthetic YSOs that would have been detected by Spitzer and included in the Robitaille et al. (2008) catalog correspond to massive objects with a mass distribution that peaks at . The corresponding distance distribution of this model is presented in Fig. 1 of Beuther et al. (2012) for the range. The plot reveals a high number of far YSOs up to distances of  kpc, showing that, despite the high extinction, individual (massive) YSOs can be detected deep into the Galactic plane, as opposed to ECs. We therefore think that the low detectability of a far EC is mainly due to the faint apparent brightness of its low-mass population and confusion of its members, so that the whole cluster might be misidentified as an individual massive young star. At near-infrared wavelengths, however, extinction could still play an important role in hiding a far EC.

### 4.6 Definition of a representative sample

We can quantify how many OCs and ECs we are missing within a certain distance from the Sun, using the analytical expressions for the observed distance distribution, $\Phi_D(D)$ (Equation (8)), and for the distance distribution that would be observed if we detected the totality of the clusters in the inner Galaxy, $\Phi_D^{\rm tot}(D)$ (Equation (7)), and using the fitted parameters given in Table 4. We define the cumulative completeness fraction, $F_c(D)$, as the ratio of the number of observed clusters with distances $\le D$ to the number that would represent a complete sample within $D$:

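$$F_c(D) \equiv \frac{N_{\rm cl}(\le D)}{N_{\rm cl}^{\rm tot}(\le D)} = \frac{\int_0^D \Phi_D(D')\,{\rm d}D'}{\int_0^D \Phi_D^{\rm tot}(D')\,{\rm d}D'}\,. \tag{11}$$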

Now we can define a representative cluster sample as all objects with distances $D \le D_{\rm rep}$ for which the fraction $F_c(D_{\rm rep})$ is above a certain threshold in both the OC and EC samples (this naturally places the restriction on the OC sample alone, since it is more incomplete). We chose a threshold of 0.25, for which the distance has to be  kpc. For simplicity, we just adopt $D_{\rm rep} = 3$ kpc, where and for the OC and EC samples, respectively. Note that although the selection of the threshold is somewhat arbitrary, if we keep in mind the above fractions, we only need a certain distance limit where the samples are not too incomplete and at the same time have a reasonable absolute number of objects to perform a statistical analysis.
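The cumulative completeness fraction of Equation (11) can be evaluated by direct numerical integration, as in the following minimal sketch. The common prefactor $2\ell_1\Sigma_0$ cancels in the ratio, so only $f_c$ and the latitude factor matter. The helper names, the trapezoidal integrator, the example parameters, and the default of a unit latitude factor are assumptions made for illustration and do not reproduce the fitted values of Table 4.

```python
import math

def f_c(D, Dc, s0):
    """Completeness fraction of Eq. (9)."""
    return 1.0 if D <= Dc else math.exp(-(D - Dc) / s0)

def trapezoid(func, a, b, n=4000):
    """Composite trapezoidal rule on [a, b] with n equal steps."""
    h = (b - a) / n
    inner = sum(func(a + i * h) for i in range(1, n))
    return h * (0.5 * func(a) + inner + 0.5 * func(b))

def cumulative_completeness(D, Dc, s0, f_b1=lambda d: 1.0):
    """F_c(D) of Eq. (11); f_b1 defaults to 1 as a simplifying assumption."""
    observed = trapezoid(lambda d: f_c(d, Dc, s0) * f_b1(d) * d, 0.0, D)
    complete = trapezoid(lambda d: f_b1(d) * d, 0.0, D)
    return observed / complete

# Hypothetical parameters in kpc, for illustration only.
for D in (0.5, 1.0, 2.0, 3.0):
    print(D, round(cumulative_completeness(D, Dc=1.0, s0=1.5), 3))
```

A representative distance can then be read off as the largest $D$ for which this fraction stays above the adopted threshold in the more incomplete sample.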

In Column 4 of Table 3, we list the number of clusters with $D \le 3$ kpc for each morphological type; the total number of ECs in the representative sample is 98. To count the number of OCs, our definition also requires the clusters to be confirmed (ref_Conf not empty). The number of confirmed clusters with $D \le 3$ kpc is given in Column 5 for each morphological type, from which we obtain a total number of 146 OCs in the representative sample. With the fractions computed before, it is also possible to estimate the number of clusters that we would observe within 3 kpc, if we had complete samples of OCs and ECs. The corresponding estimates are listed in Column 6, and were simply derived by dividing the observed numbers by the corresponding cumulative completeness fraction, $F_c(3\,{\rm kpc})$, of the EC or OC sample. Note that the large number of OC2 clusters in this ideally complete sample is due to the fact that they cover a wide age range. The age distribution of our sample is analyzed in the next Section.

### 4.7 Ages

We would expect that the ages of the stellar clusters increase along the morphological evolutionary sequence defined in Section 4.1. By dividing the cluster sample into such morphological types, we indeed obtained an increasing tendency in the corresponding age distributions. However, we were unable to estimate an average age or age ranges for each individual type, given the low number of clusters with available ages that fall within each category, except for OC2. In the whole sample, for types EC1, EC2, OC0 and OC1 there are, respectively, only 9, 16, 15 and 9 objects with age estimates, whereas for OC2 clusters there are 160. Note that for types OC0 and OC1, the total number of objects is also low (see Table 3), so that the main reason for the small number of age estimates is the low absolute statistics. On the other hand, for the much more numerous EC1 and EC2 morphological types (and possibly also part of the OC0 type), the lack of age estimates may simply be caused by the difficulties involved in obtaining these values.

It is still possible, however, to derive an upper limit for the ages of the ECs (EC1 and EC2 together), and also to study the age distribution of the whole OC population (OC0, OC1 and OC2 together), as described below.

#### 4.7.1 Upper limit age of ECs

The EC ages compiled from the literature were estimated using a variety of methods, including: comparison with theoretical isochrones on a Hertzsprung-Russell diagram constructed after spectroscopic classification in the near-infrared (e.g., Furness et al. 2010), use of the relation between the circumstellar disk fraction in the cluster and its age (following Haisch et al. 2001), and comparison with synthetic clusters constructed by Monte Carlo simulations (Stead & Hoare 2011), among others. We remark that of the 25 ECs with available age estimates, there are two objects that seem to be artificial outliers, with ages too old for embedded objects, namely  Myr and  Myr (respectively, clusters VVV CL100 and VVV CL059 from Borissova et al. 2011) [19]. These two objects are precisely the only ECs in our sample whose age was determined together with the distance via isochrone fitting, and the high uncertainty of this method for very young clusters is indeed acknowledged by the authors (Borissova et al. 2011). In a few other cases where isochrone fitting was used to derive the age of an EC, an independent measure of the distance was used as input in order to reduce the uncertainty (e.g., Ojha et al. 2010).

Footnote 19: Note that the quoted uncertainties are from our catalog, and might be larger than the values given in the original paper because we adopted minimum errors for the age estimates (see Section 3.3).

Excluding these two outliers from our sample, we found that 91% (21 out of 23) of the ECs with available age estimates are younger than 3 Myr. Furthermore, given the high errors in this age range, even the remaining two clusters are consistent with being younger than 3 Myr, within the uncertainties: age of  Myr for BDS2003 139 (Stead & Hoare 2011), and  Myr for DBS2003 118 (Roman-Lopes 2007) [19]. We therefore adopt an upper limit of 3 Myr for the embedded phase, which represents a better constraint than the 5 Myr limit often quoted in the literature (from Leisawitz et al. 1989). Since practically all available EC ages in our sample are  Myr, the same result is obtained if we consider the representative sample ($D \le 3$ kpc), despite the low statistics (10 out of 11 ECs are formally younger than 3 Myr, after removing one outlier).

#### 4.7.2 Age distribution of OCs

The much higher number of OCs with available age estimates allowed us to study their age distribution, which is shown in Figure 9 for the representative sample (a total of 143 OCs). Assuming a constant cluster formation rate (CFR), the decreasing number of OCs as time evolves is due to the effect of different disruption processes. Lamers & Gieles (2006) provide a theoretical parameterization of the survival time of initially bound OCs in the solar neighborhood, taking into account four main mechanisms: stellar evolution, tidal stripping by the Galactic gravitational field, shocking by spiral arms, and encounters with giant molecular clouds. They show that the observed age distribution for a constant CFR and a power-law cluster initial mass function with a slope of $-2$ can be written as

$$\Phi_a(a) = C\left[\left(\frac{M_{\rm lim}(a)}{M_\odot}\right)^{-1} - \left(\frac{M_{\rm max}}{M_\odot}\right)^{-1}\right], \tag{12}$$

where $a$ is the age, $C$ is a constant, $M_{\rm lim}(a)$ is the initial mass of a cluster that, at an age $a$, reaches a mass equal to the detection limit (assumed to be 100 $M_\odot$), and $M_{\rm max}$ is the maximum initial mass of clusters that are formed. It can be shown that the cluster formation rate within the initial mass range $[100\,M_\odot, M_{\rm max}]$ is related to the factor $C$ by

$${\rm CFR} = C\left[\frac{1}{100} - \left(\frac{M_{\rm max}}{M_\odot}\right)^{-1}\right]. \tag{13}$$
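Equations (12) and (13) can be sketched in a few lines, which also makes explicit that the age distribution at zero age equals the CFR when the limiting initial mass equals the detection limit. In the snippet below, `toy_m_lim` is a purely hypothetical monotonic placeholder standing in for the digitized Lamers & Gieles (2006) curve, which is not reproduced here; the clipping to zero once $M_{\rm lim}(a) \ge M_{\rm max}$ and all numerical values are likewise assumptions of this illustration.

```python
def phi_a(a, C, M_max, M_lim):
    """Age distribution of Eq. (12); masses in solar units, M_lim is a callable."""
    m = M_lim(a)
    if m >= M_max:
        # No cluster is born massive enough to remain detectable at this age
        # (clipping added here; Eq. (12) itself would turn negative).
        return 0.0
    return C * (1.0 / m - 1.0 / M_max)

def cfr_from_C(C, M_max, M_det=100.0):
    """Cluster formation rate of Eq. (13) for a detection limit M_det (solar masses)."""
    return C * (1.0 / M_det - 1.0 / M_max)

def toy_m_lim(a_myr):
    """Hypothetical placeholder for M_lim(a); NOT the Lamers & Gieles (2006) curve."""
    return 100.0 * (1.0 + a_myr / 200.0) ** 2

C_EXAMPLE, M_MAX_EXAMPLE = 50.0, 3.0e4  # illustrative values only
print(cfr_from_C(C_EXAMPLE, M_MAX_EXAMPLE))
print([round(phi_a(a, C_EXAMPLE, M_MAX_EXAMPLE, toy_m_lim), 4) for a in (0, 10, 100, 1000)])
```

In an actual fit, `toy_m_lim` would be replaced by an interpolation of the tabulated theoretical curve.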

We fitted $\Phi_a(a)$ from Equation (12) to the observed age distribution of OCs in the representative sample, with two free parameters, the normalization and $M_{\rm max}$; the input function $M_{\rm lim}(a)$ was obtained by digitizing the dashed curve in Fig. 2 of Lamers & Gieles (2006). We plot the resulting best fit as a solid curve in Figure 9, corresponding to the parameters  Myr and . It is clear from the figure that there is an excess of observed young OCs with respect to the fitted theoretical distribution, whereas for older ages the fit is a good representation of the data. The observed excess of young OCs could be the result of two effects. First, young OCs dominate at larger distances because they contain more luminous stars, so that within an incomplete sample the proportion of young OCs is relatively higher than that of older clusters (Piskunov et al. 2006). Second, since the parameterization of Lamers & Gieles (2006) considers the dissolution of initially bound OCs due to classical mechanisms, the observed over-population of young clusters might consist of associations, i.e., clusters which are already unbound due to disruption processes that are not accounted for by Lamers & Gieles (2006). These associations will quickly dissolve into the field and, therefore, will not be able to populate the older age bins of the distribution in the future.

While the age-dependent incompleteness is likely playing a role within our 3 kpc limit, it is interesting to investigate whether or not there is also a contribution from the presence of associations, for which we need to restrict the sample to smaller distances, where the incompleteness is not important. We found that the excess of observed young OCs still holds if we perform the fit for samples restricted to successively smaller distances, down to  kpc; nevertheless, the low statistics in the Solar neighborhood within the ATLASGAL range prevents us from performing this test on an even more restricted subsample of our catalog. We therefore fitted the model to all-sky samples of OCs, namely, the Dias et al. (2002) catalog and the Kharchenko et al. (2005b, a) sample, restricted to a certain limit in projected distance, . For clusters with  kpc, in both samples, we recovered the results from Lamers & Gieles (2006) [20], whose observed age distribution shows practically no excess of young OCs with respect to the fitted curve (see their Fig. 3). If we restrict the samples to  kpc, however, the age distribution for the Dias et al. (2002) catalog presents a statistically significant over-population of young OCs, whereas for the Kharchenko et al. (2005b, a) sample the excess is only marginal.

Footnote 20: This is totally expected for the Kharchenko et al. sample, since Lamers & Gieles (2006) used basically the same clusters. The only difference is that they did not include the objects newly detected by Kharchenko et al. (2005a). On the other hand, the fact that for the Dias et al. (2002) sample we obtain the same result implies that there are no systematic effects arising from differences between both samples, in particular regarding the age estimates.

Given that the Kharchenko et al. (2005b, a) sample is a subset of the Dias et al. (2002) catalog, this behavior means that the young excess in the sample with  kpc cannot purely be due to the age-dependent incompleteness, since otherwise we would obtain a more noticeable effect in the less complete sample. Then, there must necessarily be a contribution from the presence of associations. The excess is less significant for the Kharchenko et al. catalog and not noticeable for clusters in both samples with  kpc, probably because there is an observational limitation in detecting associations at very close distances, due to their larger sizes. In summary, we think that the excess of young clusters in our representative OC sample ($D \le 3$ kpc) with respect to the theoretical description of Lamers & Gieles (2006) is caused by a combination of age-dependent incompleteness and the presence of associations.

The age distribution shown in Figure 9 was constructed using a bin width large enough to ensure good statistics over the whole age range, but we can refine the grid to constrain better a certain feature, as long as the presentation remains statistically significant. By constructing the age distribution with smaller bin widths and doing the fitting again, we found that the transition after which the theoretical description fits well the data occurs at an age of , i.e.,  Myr. Consistently, we have seen in Section 4.3 that the  Myr limit is roughly the age before which an observed OC might be either an association or a physical OC, whereas observed OCs older than that are practically always bound and therefore are disrupted through “classical” mechanisms over a longer timescale.

#### 4.7.3 Young cluster dissolution

Similarly to the estimation of the cumulative completeness fraction (see Section 4.6), we can use the analytical expressions for the distance distributions from Section 4.4 to transform the absolute CFR in the representative sample to an incompleteness-corrected cluster formation rate per unit area, $\dot{\Sigma}$, representative of the inner Galaxy close to the Sun. It can be easily shown that the conversion is

$$\dot{\Sigma} = \frac{{\rm CFR}(D \le D_{\rm rep})}{\ell_1\,D_{\rm eff}^2(D_{\rm rep})}\,, \tag{14}$$

where

$$D_{\rm eff}^2(D) \equiv 2\int_0^D f_c(D')\,f_{b_1}(D')\,D'\,{\rm d}D'\,. \tag{15}$$
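The conversion in Equations (14) and (15) can be checked numerically with the short sketch below. For a complete sample with full vertical coverage ($f_c = f_{b_1} = 1$), the effective distance reduces to $D_{\rm eff} = D$, and $\ell_1 D^2$ (with $\ell_1$ in radians) is simply the area of the circular sector $|\ell| \le \ell_1$ out to $D$. The function names, the trapezoidal integration, and the example CFR value are hypothetical choices made for this illustration.

```python
import math

def d_eff_squared(D, f_c, f_b1, n=4000):
    """D_eff^2 of Eq. (15) via the trapezoidal rule; f_c and f_b1 are callables."""
    h = D / n
    integrand = lambda d: f_c(d) * f_b1(d) * d
    inner = sum(integrand(i * h) for i in range(1, n))
    return 2.0 * h * (0.5 * integrand(0.0) + inner + 0.5 * integrand(D))

def sigma_dot(cfr, D_rep, l1_deg, f_c, f_b1):
    """Formation rate per unit area of Eq. (14); l1 is converted to radians."""
    return cfr / (math.radians(l1_deg) * d_eff_squared(D_rep, f_c, f_b1))

# Complete-sample limit: D_eff equals D, so the area is that of a sector.
one = lambda d: 1.0
print(d_eff_squared(3.0, one, one))          # close to 9 kpc^2
print(sigma_dot(10.0, 3.0, 60.0, one, one))  # hypothetical CFR of 10 per Myr
```

Any incompleteness ($f_c < 1$) or limited latitude coverage ($f_{b_1} < 1$) lowers $D_{\rm eff}$ below $D$, and therefore raises the inferred rate per unit area for a given observed CFR.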

For the OC sample,  kpc, which implies that the fitted cluster formation rate per unit area is  Myr kpc. This value can now be compared with the analogous parameter in the Lamers & Gieles (2006) fit for a complete all-sky sample within 0.6 kpc from the Sun,  Myr kpc. Together with the maximum mass of they obtain, we can see that both fits are consistent within the uncertainties, assuming that their errors are similar to ours (theirs are not provided). On the other hand, from the observed number of OCs in our representative sample with ages , we derive  Myr kpc (using Poisson errors), which sets an upper limit of to the fraction of observed young OCs that are actually associations. The observed cluster formation rate corrected by age-dependent incompleteness is some value between and that can be parametrized as , where is a factor in the range ( for no age-dependent incompleteness, and for no intrinsic young excess).

To obtain a realistic estimate of the fraction of young clusters that will dissolve or merge with other agglomerate(s), and therefore will not become physical OCs on their own, we also need an equivalent estimate for the formation rate of ECs. For that, we can simply take the local surface density obtained from fitting the distance distribution of ECs (Table 4), and divide it by their upper limit age of 3 Myr, resulting in  Myr$^{-1}$ kpc$^{-2}$. This EC formation rate, however, is not directly comparable to that of OCs, since within 3 kpc from the Sun we are likely detecting ECs with masses below the detection limit of 100 $M_\odot$ adopted by Lamers & Gieles (2006) for OCs, as shown, e.g., by Lada & Lada (2003), whose EC catalog includes objects with masses down to 20 $M_\odot$, with a large number of clusters with masses in the range . Fortunately, we found that the uncertainty in the fraction of ECs with masses above 100 $M_\odot$, , is not dominant and does not prevent us from computing a good estimate of the young dissolution fraction.

If we assume that is in the range , we obtain that the fraction of ECs and young exposed clusters, , that will not become physical OCs is

where the uncertainty has been numerically computed assuming Gaussian random variables, except for and which were drawn from uniform probability distributions in the corresponding domains ( range for , see above). The value is in excellent agreement with that obtained by Lada & Lada (2003). However, the explanation proposed by these authors, that this high fraction is produced by the dissolution of ECs after fast gas expulsion, has been modified (or extended) considerably in recent years. As we have reviewed in the Introduction, depending on the physical conditions of each individual system and its environment, several other phenomena can contribute to the high observed number of ECs relative to physical OCs, namely: associations that are dissolving from birth, merging of young subclusters, and young cluster dispersion due to tidal shocks from the environment or due to fast relaxation for small-$N$ systems.

### 4.8 Correlations

In this Section, we look for correlations between the morphological types defined in Section 4.1 and other information compiled in our cluster catalog, such as the MIR morphology and association with known objects. The percentages of clusters that satisfy the studied criteria within each morphological type are presented in Table 5. Column 2 gives the percentage of clusters that appear to be exciting PAH emission through UV radiation from their stars, as traced by bright diffuse 8 µm emission (12 µm for WISE) or the presence of IR bubbles (MIR morphology bub-cen, bub-cen-trig, or pah, see Section 3.1). Column 3 lists the fraction of clusters that seem to be triggering further star formation at the edge of the associated IR bubble (MIR morphology bub-cen-trig alone), whereas Column 4 indicates the fraction of clusters that are located at the edge of an IR bubble (MIR morphology bub-cen-edge). Columns 5, 6 and 7 give, respectively, the percentage of objects that are associated with IRDCs, H II regions of any type, and UCH II regions alone. Finally, Column 8 lists the fraction of clusters that are part of a complex of several clusters (see Section B.6). In this table we present the statistics calculated for the whole cluster sample, because we obtained the same results for the representative sample, within the uncertainties (assumed to be Poisson errors). The only exception is the association with infrared dark clouds, for which we give the fractions within the representative sample. This is expected since an IRDC can only be identified at a relatively near distance because, to be detectable, it has to manifest itself as a dark extinction feature in front of the diffuse Galactic background. We also computed the statistics restricted to clusters with GLIMPSE data available, in order to minimize possible systematic errors arising from the lower resolution and sensitivity of the WISE images (see Section B.3), but since only 7% of the clusters have no GLIMPSE data, we obtained results identical to those presented in Table 5.

We note from the table that the presence of stellar feedback as traced by PAH emission and H II regions is very important in the first four stages of the evolutionary sequence. When excluding UCH II regions, we found that both indicators of feedback are roughly equivalent, i.e., the same clusters present both tracers. That a few clusters have PAH emission but no H II region is probably due to the incompleteness of the current sample of H II regions. Alternatively, in some cases we might be dealing with lower mass clusters whose UV radiation is strong enough to excite the PAH molecules, but not to produce a detectable region of ionized gas (Allen et al. 2007). On the other hand, the few H II regions without PAH emission are probably more evolved, or UCH II regions not identified as such. However, it is remarkable that although the identification of an ultracompact region was only based on the literature, such objects are much more frequently associated with the first morphological type, which presumably covers the youngest clusters. The almost null correlation of OC2 clusters with indicators of stellar feedback is consistent with the fact that these clusters are mostly classical OCs and already gas-free.

Concerning triggered star formation, we see that only EC2, OC0, and OC1 clusters are able to produce it, in roughly 10% of the cases. EC1 clusters cannot, because they are too embedded and have not yet started to sweep up the surrounding material; in turn, their own formation might have been triggered by another cluster or massive star, but only in a very small fraction of cases (see Column 4). We warn, however, that our diagnoses of triggered star formation are based purely on morphology, so its actual occurrence in these cases is far from conclusive.

Infrared dark clouds are mostly associated with the first morphological type, confirming that they trace the earliest phases of star cluster formation. Interestingly, we found that the presence of IRDCs and PAH emission are almost mutually exclusive: within the representative sample, both tracers combined practically account for the totality of EC1 clusters, with almost null intersection. In other words, IRDCs and PAH emission trace, respectively, an earlier and later stage within the deeply embedded phase (type EC1). A simple interpretation for this behavior is that at some point IRDCs are “illuminated” by the radiation of the recently formed ECs, before their actual disruption, so that they become undetectable as extinction features in the mid-infrared but still prominent in the submm dust continuum emission traced by ATLASGAL.

Although we have not identified the totality of complexes of physically related clusters in our sample, Table 5 shows a clear tendency for ECs to be grouped in complexes. In contrast, OCs are much more isolated (the type OC2 dominates the OC population). Only those OCs that are still associated with some molecular gas (types OC0, OC1) present a similar degree of grouping with other clusters as ECs. This is consistent with the fact that star formation occurs in giant molecular cloud complexes with a hierarchical structure, in which star-forming regions with a relatively higher stellar density would be observationally identified as ECs. Many of them will dissolve, while others, if close enough, will undergo a merging process as a result of dynamical evolution, all in a timescale shorter than  Myr (see Section 4.7). The final outcome, after the parent molecular cloud is destroyed, might therefore be very few physical OCs, or even a single one, which will appear relatively isolated.

## 5 Conclusions

We have statistically studied all ECs and OCs known so far in the inner Galactic plane and their correlation with dense molecular gas, taking particular advantage of the improved cluster sample over the past decade and the ATLASGAL submm continuum survey, which traces cold dust and dense molecular gas. The main results and conclusions presented in this paper are summarized as follows.

1. We compiled a merged full-sky list of 3904 ECs and OCs in the Galaxy, collected from several optical and infrared cluster catalogs in the literature, dealing properly with cross-identifications.

2. As part of the above compilation, we performed our own search for ECs in the mid-infrared GLIMPSE survey, complementing the catalog of 92 exposed and less-embedded clusters detected by Mercer et al. (2005) in the same data. Our method basically consisted of visual inspection of three-color images around positions previously selected as potential YSO overdensities, which correspond to enhancements in a stellar density map of the GLIMPSE point source catalog filtered by a red color criterion. With this technique, we found 75 new clusters.

3. The sample of 695 ECs and OCs within the ATLASGAL Galactic range ( and ) was studied in more detail, particularly regarding the correlation with submm emission. We constructed an extensive catalog (available in electronic form at the CDS) with all the relevant information on these objects, including: the characteristics of the submm and mid-infrared emission; correlation with IRDCs, IR bubbles, and H ii regions; distances (kinematic and/or stellar) and ages; and membership in big molecular complexes.

4. Based on the morphology of the submm emission and, for exposed clusters, on the agreement of the clump kinematic distances and cluster stellar distances, we defined an evolutionary sequence with decreasing correlation with ATLASGAL emission: deeply embedded clusters (EC1), partially embedded clusters (EC2), emerging exposed clusters (OC0), totally exposed clusters still physically associated with molecular gas in their surrounding neighborhood (OC1), and all the remaining exposed clusters, with no correlation with ATLASGAL emission (OC2).

5. The morphological evolutionary sequence correlates well with other observational indicators of evolution. In particular, we found that IR bubbles/PAH emission and H ii regions are both equivalently important in the first four stages of the evolutionary sequence, suggesting that ionization is one of the main feedback mechanisms in our cluster sample. IRDCs are significant mostly in the first type (EC1), tracing a very early phase prior to the stage in which the EC starts to “illuminate” the host molecular clump while still embedded (EC1 clusters with PAH emission). The presence of big complexes containing several clusters is, again, relevant in the first four morphological types, which is consistent with the fact that star formation occurs in giant molecular clouds, and that older OCs (OC2) are just the bound survivors of a very complex process of merging and dissolution of young agglomerates.

6. We observationally defined an EC as any cluster with morphological types EC1 or EC2; OCs were defined as all the remaining types, OC0, OC1, and OC2, but were required to be confirmed by follow-up studies, in order to minimize the contamination by spurious candidates.

7. We found that our observational definition of OC agrees with the physical one (a bound exposed cluster, referred to in this work as a physical OC) for ages greater than  Myr. In our sample, some OCs younger than this limit can actually be associations.

8. By fitting the observed heliocentric distance distribution for OCs and ECs within the ATLASGAL range, we found that our OC and EC samples are roughly complete up to distances of  kpc and  kpc, respectively. Beyond these limits, the completeness of the OC and EC samples decays exponentially with scale lengths of  kpc and  kpc, respectively.

9. We argued that ECs probe the inner Galactic plane more deeply than OCs because, at infrared wavelengths, ECs can be more easily distinguished from the field population than OCs. On the other hand, a very distant EC is hardly detected, owing to the combined effect of extinction, the faint apparent brightness of its low-mass population, and confusion of its members.

10. From a subsample of 23 ECs with available age estimates, we derived an upper limit of 3 Myr for the duration of the embedded phase.

11. We studied the OC age distribution within 3 kpc from the Sun, which was used to fit the theoretical parametrization of Lamers & Gieles (2006) of different disruption mechanisms for bound OCs. We found an excess of observed young OCs with respect to the fit, thought to be a combined effect of age dependent incompleteness and presence of associations for ages  Myr.

12. We derived formation rates of 0.54, 1.18, and 6.50 Myr⁻¹ kpc⁻² for bound OCs, all observed young OCs, and ECs, respectively, which translates into an EC dissolution fraction of . This high fraction is thought to be produced by a combination of the following effects: dissolving associations from birth; merging of young subclusters; and young cluster dispersion due to fast gas expulsion, tidal shocks from the environment, or fast relaxation for small- systems.
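
The completeness behavior summarized in result 8 (a flat, roughly complete regime followed by an exponential decay) can be written as a simple piecewise function. The sketch below is only illustrative: the function name and the parameter values in the example call are placeholders, not the fitted values of this work.

```python
import math


def completeness(d_kpc, d_complete_kpc, scale_length_kpc):
    """Fraction of clusters recovered at heliocentric distance d_kpc.

    Equal to 1 out to d_complete_kpc, then decays exponentially
    with the given scale length.
    """
    if scale_length_kpc <= 0:
        raise ValueError("scale_length_kpc must be positive")
    if d_kpc <= d_complete_kpc:
        return 1.0
    return math.exp(-(d_kpc - d_complete_kpc) / scale_length_kpc)


# Placeholder parameters, chosen only to show the shape of the function.
for d in (0.5, 1.0, 2.0, 4.0):
    print(d, round(completeness(d, d_complete_kpc=1.0, scale_length_kpc=1.5), 3))
```

Multiplying an assumed intrinsic surface density of clusters by this factor gives the observed surface density at each distance, which is how such a function would typically enter a fit to the distance distribution.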

The new generation of all-sky near-infrared surveys, such as the UKIDSS Galactic Plane Survey (Lucas et al. 2008) and VISTA Variables in the Vía Láctea (VVV, Minniti et al. 2010), will constitute valuable tools to discover new OCs and ECs in the Galactic plane and to start filling in the highly incomplete parts of the plane beyond 1 or 2 kpc from the Sun (for OCs and ECs, respectively). In the future, we plan to update our cluster database for the inner Galaxy to include the new discoveries. Furthermore, the improved sensitivity and resolution of these surveys relative to 2MASS will allow studies of the stellar population of ECs which appear too crowded and/or faint in the 2MASS data. Very importantly, this will increase the number of young clusters with available estimates of their physical properties, such as ages and masses. In particular, stellar masses can be combined with estimates of gas masses (e.g., from ATLASGAL) to derive star formation efficiencies and investigate possible trends with the age and the presence of feedback, placing important constraints on star formation theories.
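
As a minimal illustration of how stellar and gas masses could be combined, the sketch below assumes the commonly used definition SFE = M_stars / (M_stars + M_gas). This definition is not spelled out in the text above, and the function name and input masses are hypothetical.

```python
def star_formation_efficiency(m_stars, m_gas):
    """Return M_stars / (M_stars + M_gas); both masses in the same units."""
    if m_stars < 0 or m_gas < 0:
        raise ValueError("masses must be non-negative")
    total = m_stars + m_gas
    if total == 0:
        raise ValueError("total mass must be positive")
    return m_stars / total


# Hypothetical cluster: 100 solar masses in stars, 300 in dense gas.
print(star_formation_efficiency(100.0, 300.0))
```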

###### Acknowledgements.
We thank the referee for making useful suggestions that improved the clarity of the paper, and Thomas Robitaille for reading the manuscript and providing helpful comments. We acknowledge the useful discussions with Pavel Kroupa, Maria Messineo (about the GLIMPSE search for ECs), and Marion Wienen (about kinematic distances). We also benefited from the email discussions with D. Froebrich (about his catalog of clusters), A. Moisés (about NIR spectrophotometric distances), and M. Gieles (about Equation (1)). This research is based on: data from the ATLASGAL project, which is a collaboration between the Max-Planck-Gesellschaft (MPIfR and MPIA), the European Southern Observatory and the Universidad de Chile; observations made with the Spitzer Space Telescope, which is operated by the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA; data products from the 2MASS, which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology, funded by the National Aeronautics and Space Administration and the National Science Foundation; and data products from the WISE, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. This work has made use of the SIMBAD database, operated at CDS, Strasbourg, France, NASA’s Astrophysics Data System, and the VizieR database of astronomical catalogs (Ochsenbein et al. 2000). This paper has made use of information from the Red MSX Source survey database at www.ast.leeds.ac.uk/RMS which was constructed with support from the Science and Technology Facilities Council of the UK. E.F.E.M. was supported for part of this research through a stipend from the International Max Planck Research School (IMPRS) for Astronomy and Astrophysics at the Universities of Bonn and Cologne. This work was partially carried out in the Max Planck Research Group “Star formation throughout the Milky Way Galaxy” at the Max Planck Institute for Astronomy (MPIA).

## References

• Allen et al. (2007) Allen, L., Megeath, S. T., Gutermuth, R., et al. 2007, Protostars and Planets V, 361
• Anderson & Bania (2009) Anderson, L. D. & Bania, T. M. 2009, ApJ, 690, 706
• Anderson et al. (2011) Anderson, L. D., Bania, T. M., Balser, D. S., & Rood, R. T. 2011, ApJS, 194, 32
• Baba et al. (2009) Baba, J., Asaki, Y., Makino, J., et al. 2009, ApJ, 706, 471
• Benjamin et al. (2003) Benjamin, R. A., Churchwell, E., Babler, B. L., et al. 2003, PASP, 115, 953
• Beuther et al. (2012) Beuther, H., Tackenberg, J., Linz, H., et al. 2012, ApJ, 747, 43
• Bica et al. (2008) Bica, E., Bonatto, C., & Camargo, D. 2008, MNRAS, 385, 349
• Bica et al. (2003a) Bica, E., Dutra, C. M., & Barbuy, B. 2003a, A&A, 397, 177
• Bica et al. (2003b) Bica, E., Dutra, C. M., Soares, J., & Barbuy, B. 2003b, A&A, 404, 223
• Binney & Tremaine (2008) Binney, J. & Tremaine, S. 2008, Galactic Dynamics: Second Edition (Princeton University Press)
• Blitz (1991) Blitz, L. 1991, in IAU Symposium, Vol. 144, The Interstellar Disk-Halo Connection in Galaxies, ed. H. Bloemen, 41–52
• Bonatto & Bica (2008) Bonatto, C. & Bica, E. 2008, A&A, 485, 81
• Bonatto & Bica (2011) Bonatto, C. & Bica, E. 2011, MNRAS, 415, 2827
• Bonatto et al. (2006) Bonatto, C., Kerber, L. O., Bica, E., & Santiago, B. X. 2006, A&A, 446, 121
• Bonnell et al. (2011) Bonnell, I. A., Smith, R. J., Clark, P. C., & Bate, M. R. 2011, MNRAS, 410, 2339
• Borissova et al. (2011) Borissova, J., Bonatto, C., Kurtev, R., et al. 2011, A&A, 532, A131
• Borissova et al. (2006) Borissova, J., Ivanov, V. D., Minniti, D., & Geisler, D. 2006, A&A, 455, 923
• Borissova et al. (2005) Borissova, J., Ivanov, V. D., Minniti, D., Geisler, D., & Stephens, A. W. 2005, A&A, 435, 95
• Borissova et al. (2003) Borissova, J., Pessev, P., Ivanov, V. D., et al. 2003, A&A, 411, 83
• Brand & Blitz (1993) Brand, J. & Blitz, L. 1993, A&A, 275, 67
• Bressert et al. (2010) Bressert, E., Bastian, N., Gutermuth, R., et al. 2010, MNRAS, 409, L54
• Bronfman et al. (1996) Bronfman, L., Nyman, L.-A., & May, J. 1996, A&AS, 115, 81
• Carraro et al. (2006) Carraro, G., Janes, K. A., Costa, E., & Méndez, R. A. 2006, MNRAS, 368, 1078
• Caswell & Haynes (1987) Caswell, J. L. & Haynes, R. F. 1987, A&A, 171, 261
• Celnik et al. (1979) Celnik, W., Rohlfs, K., & Braunsfurth, E. 1979, A&A, 76, 24
• Chabrier (2001) Chabrier, G. 2001, ApJ, 554, 1274
• Churchwell et al. (2009) Churchwell, E., Babler, B. L., Meade, M. R., et al. 2009, PASP, 121, 213
• Churchwell et al. (2006) Churchwell, E., Povich, M. S., Allen, D., et al. 2006, ApJ, 649, 759
• Churchwell et al. (2007) Churchwell, E., Watson, D. F., Povich, M. S., et al. 2007, ApJ, 670, 428
• Contreras et al. (2013) Contreras, Y., Schuller, F., Urquhart, J. S., et al. 2013, A&A, 549, A45
• Cyganowski et al. (2008) Cyganowski, C. J., Whitney, B. A., Holden, E., et al. 2008, AJ, 136, 2391
• Davies et al. (2012) Davies, B., de La Fuente, D., Najarro, F., et al. 2012, MNRAS, 419, 1860
• Davies et al. (2008) Davies, B., Figer, D. F., Law, C. J., et al. 2008, ApJ, 676, 1016
• Deharveng et al. (2010) Deharveng, L., Schuller, F., Anderson, L. D., et al. 2010, A&A, 523, A6
• Dias et al. (2002) Dias, W. S., Alessi, B. S., Moitinho, A., & Lépine, J. R. D. 2002, A&A, 389, 871
• Dutra & Bica (2000) Dutra, C. M. & Bica, E. 2000, A&A, 359, L9
• Dutra & Bica (2001) Dutra, C. M. & Bica, E. 2001, A&A, 376, 434
• Dutra et al. (2003a) Dutra, C. M., Bica, E., Soares, J., & Barbuy, B. 2003a, A&A, 400, 533
• Dutra et al. (2003b) Dutra, C. M., Ortolani, S., Bica, E., et al. 2003b, A&A, 408, 127
• Fall et al. (2010) Fall, S. M., Krumholz, M. R., & Matzner, C. D. 2010, ApJ, 710, L142
• Faúndez et al. (2004) Faúndez, S., Bronfman, L., Garay, G., et al. 2004, A&A, 426, 97
• Faustini et al. (2009) Faustini, F., Molinari, S., Testi, L., & Brand, J. 2009, A&A, 503, 801
• Fazio et al. (2004) Fazio, G. G., Hora, J. L., Allen, L. E., et al. 2004, ApJS, 154, 10
• Fritz et al. (2011) Fritz, T. K., Gillessen, S., Dodds-Eden, K., et al. 2011, ApJ, 737, 73
• Froebrich et al. (2007a) Froebrich, D., Meusinger, H., & Scholz, A. 2007a, MNRAS, 377, L54
• Froebrich et al. (2008) Froebrich, D., Meusinger, H., & Scholz, A. 2008, MNRAS, 390, 1598
• Froebrich et al. (2010) Froebrich, D., Schmeja, S., Samuel, D., & Lucas, P. W. 2010, MNRAS, 409, 1281
• Froebrich et al. (2007b) Froebrich, D., Scholz, A., & Raftery, C. L. 2007b, MNRAS, 374, 399
• Furness et al. (2010) Furness, J. P., Crowther, P. A., Morris, P. W., et al. 2010, MNRAS, 403, 1433
• Genzel et al. (2010) Genzel, R., Eisenhauer, F., & Gillessen, S. 2010, Reviews of Modern Physics, 82, 3121
• Gieles & Portegies Zwart (2011) Gieles, M. & Portegies Zwart, S. F. 2011, MNRAS, 410, L6
• Glushkova et al. (2010) Glushkova, E. V., Koposov, S. E., Zolotukhin, I. Y., et al. 2010, Astronomy Letters, 36, 75
• Grocholski & Sarajedini (2003) Grocholski, A. J. & Sarajedini, A. 2003, MNRAS, 345, 1015
• Güsten et al. (2006) Güsten, R., Nyman, L. Å., Schilke, P., et al. 2006, A&A, 454, L13
• Haisch et al. (2001) Haisch, Jr., K. E., Lada, E. A., & Lada, C. J. 2001, ApJ, 553, L153
• Hanson et al. (2010) Hanson, M. M., Kurtev, R., Borissova, J., et al. 2010, A&A, 516, A35
• Herbst (1975) Herbst, W. 1975, AJ, 80, 212
• Ivanov et al. (2002) Ivanov, V. D., Borissova, J., Pessev, P., Ivanov, G. R., & Kurtev, R. 2002, A&A, 394, L1
• Jackson et al. (2008) Jackson, J. M., Finn, S. C., Rathborne, J. M., Chambers, E. T., & Simon, R. 2008, ApJ, 680, 349
• Kang et al. (2010) Kang, M., Bieging, J. H., Kulesa, C. A., et al. 2010, ApJS, 190, 58
• Kharchenko et al. (2005a) Kharchenko, N. V., Piskunov, A. E., Röser, S., Schilbach, E., & Scholz, R.-D. 2005a, A&A, 440, 403
• Kharchenko et al. (2005b) Kharchenko, N. V., Piskunov, A. E., Röser, S., Schilbach, E., & Scholz, R.-D. 2005b, A&A, 438, 1163
• Kharchenko et al. (2012) Kharchenko, N. V., Piskunov, A. E., Schilbach, E., Röser, S., & Scholz, R.-D. 2012, A&A, 543, A156
• Kharchenko et al. (2013) Kharchenko, N. V., Piskunov, A. E., Schilbach, E., Röser, S., & Scholz, R.-D. 2013, A&A, 558, A53
• King (1962) King, I. 1962, AJ, 67, 471
• Klessen (2011) Klessen, R. S. 2011, in EAS Publications Series, ed. C. Charbonnel & T. Montmerle, Vol. 51, 133–167
• Kronberger et al. (2006) Kronberger, M., Teutsch, P., Alessi, B., et al. 2006, A&A, 447, 921
• Kroupa (2011) Kroupa, P. 2011, in Stellar Clusters & Associations: A RIA Workshop on Gaia, 17–27
• Kruijssen et al. (2012) Kruijssen, J. M. D., Maschberger, T., Moeckel, N., et al. 2012, MNRAS, 419, 841
• Kumar et al. (2004) Kumar, M. S. N., Kamath, U. S., & Davis, C. J. 2004, MNRAS, 353, 1025
• Kumar et al. (2006) Kumar, M. S. N., Keto, E., & Clerkin, E. 2006, A&A, 449, 1033
• Kurayama et al. (2011) Kurayama, T., Nakagawa, A., Sawada-Satoh, S., et al. 2011, PASJ, 63, 513
• Kurtev et al. (2008) Kurtev, R., Ivanov, V. D., Borissova, J., & Ortolani, S. 2008, A&A, 489, 583
• Lada & Lada (2003) Lada, C. J. & Lada, E. A. 2003, ARA&A, 41, 57
• Lamers & Gieles (2006) Lamers, H. J. G. L. M. & Gieles, M. 2006, A&A, 455, L17
• Leisawitz et al. (1989) Leisawitz, D., Bash, F. N., & Thaddeus, P. 1989, ApJS, 70, 731
• Levine et al. (2008) Levine, E. S., Heiles, C., & Blitz, L. 2008, ApJ, 679, 1288
• Lockman (1989) Lockman, F. J. 1989, ApJS, 71, 469
• Loktin et al. (2001) Loktin, A. V., Gerasimenko, T. P., & Malysheva, L. K. 2001, Astronomical and Astrophysical Transactions, 20, 607
• Longmore et al. (2011) Longmore, A. J., Kurtev, R., Lucas, P. W., et al. 2011, MNRAS, 416, 465
• Lucas et al. (2008) Lucas, P. W., Hoare, M. G., Longmore, A., et al. 2008, MNRAS, 391, 136
• Majaess (2013) Majaess, D. 2013, Ap&SS, 344, 175
• Marasco & Fraternali (2012) Marasco, A. & Fraternali, F. 2012, in European Physical Journal Web of Conferences, Vol. 19, 8007
• Markwardt (2009) Markwardt, C. B. 2009, in Astronomical Society of the Pacific Conference Series, Vol. 411, Astronomical Data Analysis Software and Systems XVIII, ed. D. A. Bohlender, D. Durand, & P. Dowler, 251
• Martins et al. (2005) Martins, F., Schaerer, D., & Hillier, D. J. 2005, A&A, 436, 1049
• Maschberger et al. (2010) Maschberger, T., Clarke, C. J., Bonnell, I. A., & Kroupa, P. 2010, MNRAS, 404, 1061
• McClure-Griffiths & Dickey (2007) McClure-Griffiths, N. M. & Dickey, J. M. 2007, ApJ, 671, 427
• McMillan & Binney (2010) McMillan, P. J. & Binney, J. J. 2010, MNRAS, 402, 934
• Mercer et al. (2005) Mercer, E. P., Clemens, D. P., Meade, M. R., et al. 2005, ApJ, 635, 560
• Mermilliod (1996) Mermilliod, J.-C. 1996, in Astronomical Society of the Pacific Conference Series, Vol. 90, The Origins, Evolution, and Destinies of Binary Stars in Clusters, ed. E. F. Milone & J.-C. Mermilliod, 475
• Messineo et al. (2009) Messineo, M., Davies, B., Ivanov, V. D., et al. 2009, ApJ, 697, 701
• Minniti et al. (2010) Minniti, D., Lucas, P. W., Emerson, J. P., et al. 2010, New A, 15, 433
• Moeckel et al. (2012) Moeckel, N., Holland, C., Clarke, C. J., & Bonnell, I. A. 2012, MNRAS, 425, 450
• Moisés et al. (2011) Moisés, A. P., Damineli, A., Figuerêdo, E., et al. 2011, MNRAS, 411, 705
• Ochsenbein et al. (2000) Ochsenbein, F., Bauer, P., & Marcout, J. 2000, A&AS, 143, 23
• Ojha et al. (2010) Ojha, D. K., Kumar, M. S. N., Davis, C. J., & Grave, J. M. C. 2010, MNRAS, 407, 1807
• Paunzen & Netopil (2006) Paunzen, E. & Netopil, M. 2006, MNRAS, 371, 1641
• Peretto & Fuller (2009) Peretto, N. & Fuller, G. A. 2009, A&A, 505, 405
• Phelps & Janes (1994) Phelps, R. L. & Janes, K. A. 1994, ApJS, 90, 31
• Piatti & Clariá (2001) Piatti, A. E. & Clariá, J. J. 2001, A&A, 379, 453
• Pinheiro et al. (2010) Pinheiro, M. C., Copetti, M. V. F., & Oliveira, V. A. 2010, A&A, 521, A26
• Piskunov et al. (2006) Piskunov, A. E., Kharchenko, N. V., Röser, S., Schilbach, E., & Scholz, R.-D. 2006, A&A, 445, 545
• Piskunov et al. (2007) Piskunov, A. E., Schilbach, E., Kharchenko, N. V., Röser, S., & Scholz, R.-D. 2007, A&A, 468, 151
• Porras et al. (2003) Porras, A., Christopher, M., Allen, L., et al. 2003, AJ, 126, 1916
• Reid et al. (2009) Reid, M. J., Menten, K. M., Zheng, X. W., et al. 2009, ApJ, 700, 137
• Robitaille et al. (2008) Robitaille, T. P., Meade, M. R., Babler, B. L., et al. 2008, AJ, 136, 2413
• Robitaille & Whitney (2010) Robitaille, T. P. & Whitney, B. A. 2010, ApJ, 710, L11
• Roman-Duval et al. (2009) Roman-Duval, J., Jackson, J. M., Heyer, M., et al. 2009, ApJ, 699, 1153
• Roman-Lopes (2007) Roman-Lopes, A. 2007, A&A, 471, 813
• Roman-Lopes & Abraham (2006) Roman-Lopes, A. & Abraham, Z. 2006, AJ, 131, 2223
• Sato et al. (2010) Sato, M., Reid, M. J., Brunthaler, A., & Menten, K. M. 2010, ApJ, 720, 1055
• Schönrich et al. (2010) Schönrich, R., Binney, J., & Dehnen, W. 2010, MNRAS, 403, 1829
• Schuller et al. (2009) Schuller, F., Menten, K. M., Contreras, Y., et al. 2009, A&A, 504, 415
• Simon et al. (2006) Simon, R., Jackson, J. M., Rathborne, J. M., & Chambers, E. T. 2006, ApJ, 639, 227
• Simpson et al. (2012) Simpson, R. J., Povich, M. S., Kendrew, S., et al. 2012, MNRAS, 424, 2442
• Siringo et al. (2009) Siringo, G., Kreysa, E., Kovács, A., et al. 2009, A&A, 497, 945
• Skrutskie et al. (2006) Skrutskie, M. F., Cutri, R. M., Stiening, R., et al. 2006, AJ, 131, 1163
• Solin et al. (2012) Solin, O., Ukkonen, E., & Haikala, L. 2012, A&A, 542, A3
• Stead & Hoare (2011) Stead, J. J. & Hoare, M. G. 2011, MNRAS, 418, 2219
• Strader & Kobulnicky (2008) Strader, J. & Kobulnicky, H. A. 2008, AJ, 136, 2102
• Straw et al. (1989) Straw, S. M., Hyland, A. R., & McGregor, P. J. 1989, ApJS, 69, 99
• Tadross (2008) Tadross, A. L. 2008, New A, 13, 370
• Urquhart et al. (2008) Urquhart, J. S., Hoare, M. G., Lumsden, S. L., Oudmaijer, R. D., & Moore, T. J. T. 2008, in Astronomical Society of the Pacific Conference Series, Vol. 387, Massive Star Formation: Observations Confront Theory, ed. H. Beuther, H. Linz, & T. Henning, 381
• Urquhart et al. (2012) Urquhart, J. S., Hoare, M. G., Lumsden, S. L., et al. 2012, MNRAS, 420, 1656
• Urquhart et al. (2011) Urquhart, J. S., Moore, T. J. T., Hoare, M. G., et al. 2011, MNRAS, 410, 1237
• Werner et al. (2004) Werner, M. W., Roellig, T. L., Low, F. J., et al. 2004, ApJS, 154, 1
• Wienen et al. (2012) Wienen, M., Wyrowski, F., Schuller, F., et al. 2012, A&A, 544, A146
• Williams et al. (2000) Williams, J. P., Blitz, L., & McKee, C. F. 2000, Protostars and Planets IV, 97
• Williams et al. (1994) Williams, J. P., de Geus, E. J., & Blitz, L. 1994, ApJ, 428, 693
• Wolf et al. (2010) Wolf, J., Martinez, G. D., Bullock, J. S., et al. 2010, MNRAS, 406, 1220
• Wright et al. (2010) Wright, E. L., Eisenhardt, P. R. M., Mainzer, A. K., et al. 2010, AJ, 140, 1868
• Xu et al. (2006) Xu, Y., Reid, M. J., Zheng, X. W., & Menten, K. M. 2006, Science, 311, 54

## Appendix A Cluster lists in the literature

In this appendix, we describe the diverse catalogs and references used for our cluster compilation, separated into three categories according to the wavelength at which the clusters are detected: optical, NIR, and MIR clusters. Furthermore, we present a brief discussion of the contamination by false cluster candidates. As for Table 1, the numbers of clusters quoted in the text represent values after removing these spurious objects and some globular clusters (listed in Table 6), unless explicitly mentioned.

### A.1 Optical clusters

Dias et al. (2002) provide the most complete catalog of optically visible OCs and candidates, containing revised data compiled from old catalogs and from isolated papers recently published. The list is regularly updated on a dedicated webpage, with additional clusters seen in the optical and revised fundamental parameters from new references. We used version 3.1 (from November 2010), which contains 2117 objects, of which 99.7% have estimated angular diameters, and 59.4% have simultaneous reddening, distance and age determinations. Kinematic information is also given for a fraction of clusters: 22.9% of the list have both radial velocity and proper motion data. It should be noted that this catalog aims at collecting not only the OCs first detected in the optical, but also most of (ideally, all) the clusters that were detected in the infrared and are visible in the optical. For example, 293 objects from the 998 2MASS-detected clusters of Froebrich et al. (2007b) were included in the last version of the catalog, based on by-eye inspection of the Digitized Sky Survey (DSS) images.

We also included in our compilation the list of new Galactic OC candidates by Kronberger et al. (2006), who visually inspected DSS and 2MASS images towards selected regions and subsequently analyzed the 2MASS color-magnitude diagrams of the candidates. The clusters were divided into different lists, some of them with fundamental parameters determined, and are all included in the Dias et al. (2002, ver. 3.1) catalog, except most of the stellar fields classified as suspected OC candidates (their Table 2e), which add 130 objects to the optical cluster sample.

### A.2 NIR clusters

Stellar clusters detected by NIR imaging, mainly from surveys of individual star-forming regions, are compiled from the literature by Porras et al. (2003), Lada & Lada (2003), and Bica et al. (2003a). The first two catalogs are exclusively limited to nearby regions (distances less than 1 kpc and  kpc, respectively); Bica et al. (2003a) did not use that restriction, but their list is only representative for nearby distances too ( kpc). It is not surprising that the three compilations overlap considerably, as is shown in Table 1. All together, these catalogs contribute 297 additional objects with respect to the optical cluster sample.

However, most of the NIR clusters correspond to recent discoveries using the 2MASS survey. More than 300 new clusters were found by visual inspection of a huge number of 2MASS , , and especially images (Dutra & Bica 2000, 2001; Bica et al. 2003b; Dutra et al. 2003a). In the pioneering work of Dutra & Bica (2000), 58 star clusters and candidates were originally detected by doing a systematic visual search on a field of centered close to the Galactic Center, and towards the directions of H ii regions and dark clouds for ; most of them were later observed at higher angular resolution, however, and 36 turned out to be spurious detections, mainly due to the high contamination from field stars in this area (see Section A.4). An additional 42 objects were discovered by Dutra & Bica (2001), who searched for ECs around the central positions of optical and radio nebulae in the Cygnus X region and other specific regions of the sky (they are included in the literature compilation by Bica et al. 2003a). They extended the method to the whole Milky Way (Dutra et al. 2003a; Bica et al. 2003b, southern and equatorial/northern Galaxy, respectively), inspecting a sample of 4450 nebulae collected from the literature, and they found a total of 337 new clusters.

In addition to the visual inspection technique, a large number of 2MASS star clusters have been discovered by automated searches, which are based on the selection of enhancements on stellar surface density maps constructed with the point source catalog. The early works of Ivanov et al. (2002) and Borissova et al. (2003) led to 14 detections (the ones not present in any of the catalogs mentioned above are counted in the “Not cataloged (NIR)” row of Table 1); similarly, Kumar et al. (2006) found 54 ECs, of which 20 are new detections, focusing the search around the positions of massive protostellar candidates. More recently, Froebrich et al. (2007b) searched for 2MASS clusters along the entire Galactic Plane with , automatically looking for star density enhancements, and manually selecting all remaining objects possessing the same visual appearance in the star density maps as known star clusters. They identified a total of 1788 star cluster candidates, 1021 of which turned out to be new discoveries and were presented as a catalog; an estimate of the contamination suggested that about half of these new candidates are real star clusters. A considerable number of objects from the Froebrich et al. (2007b) list have been analyzed in more detail by a variety of authors, and they were compiled by Froebrich et al. (2008). For these objects and the ones recently studied by Froebrich et al. (2010) (comprising a total of 68 clusters), we use the refined coordinates and diameters instead of the original ones. The follow-up studies compiled by Froebrich et al. (2008) also unveil 22 spurious clusters and one globular cluster (see Table 6). A similar automatic 2MASS search done by Glushkova et al. (2010) in the range, which includes the verification of the obtained star density enhancements by the analysis of color-magnitude diagrams and radial density distributions, produced a list of new clusters (most of them included in the last version of the catalog by Dias et al. 2002), providing physical parameters for a total of 168 new and previously discovered objects.
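
The general idea behind such automated searches, selecting enhancements on a stellar surface density map built from a point source catalog, can be sketched as follows. This is a deliberately simplified illustration and not the algorithm of any of the cited works: it bins positions on a flat square grid, takes the median count of occupied cells as the background, and flags cells that exceed it by a chosen number of Poisson standard deviations. The function name, grid choice, background estimator, and threshold are all assumptions made here for illustration.

```python
import math
from collections import Counter


def density_enhancements(points, cell_size, n_sigma=3.0):
    """Return the grid cells whose star counts exceed the background.

    points    : iterable of (x, y) positions (flat-sky approximation)
    cell_size : side of the square grid cells, same units as the positions
    n_sigma   : threshold in units of the Poisson scatter of the background
    """
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )
    if not counts:
        return []
    # Background estimate: median count over occupied cells (assumption).
    values = sorted(counts.values())
    background = values[len(values) // 2]
    threshold = background + n_sigma * math.sqrt(background)
    return sorted(cell for cell, n in counts.items() if n > threshold)


# Synthetic field: 4 stars per unit cell on a 10 x 10 grid, plus a
# clump of 20 extra stars placed inside cell (3, 4).
points = [
    (i + dx, j + dy)
    for i in range(10)
    for j in range(10)
    for dx in (0.25, 0.75)
    for dy in (0.25, 0.75)
]
points += [(3.5 + 0.01 * k, 4.5) for k in range(20)]
print(density_enhancements(points, cell_size=1.0))  # → [(3, 4)]
```

In a real search, the candidates flagged in this way would still need the kind of verification described above (color-magnitude diagrams, radial density profiles, or visual inspection), since field-star fluctuations and extinction holes also produce density peaks.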

Expectations for the near future are that the new generation of all-sky NIR surveys, such as the United Kingdom Infrared Deep Sky Survey (UKIDSS) and VISTA Variables in the Vía Láctea (VVV), will give rise to the discovery of many more stellar clusters, thanks to their improved limiting magnitude and angular resolution compared to 2MASS. A cluster search using these data has already been performed by Borissova et al. (2011), who found 96 previously unknown stellar clusters by visually inspecting multiwavelength NIR images of the VVV survey in the covered disk area ( and ), towards directions of star formation signposts (masers, radio, and infrared sources). The objects listed in their catalog were required to present distinguishable sequences on the color-color and color-magnitude diagrams, after applying a field-star decontamination algorithm, in order to minimize the presence of false detections. Automated cluster searches in the UKIDSS and VVV surveys are being done by the corresponding teams (according to unpublished data, there seem to be more than 300 new clusters detected so far by the UKIDSS team). An independent automated search on UKIDSS, leading to the discovery of 167 additional clusters and multiple star forming regions, has already been published by Solin et al. (2012), after the last update of our cluster compilation was done.

In our star cluster compilation, we also included recent NIR studies towards specific star-forming regions, or individual star clusters, which are not listed in the previous catalogs. In their NIR survey of 26 high-mass star-forming regions, Faustini et al. (2009) identified the presence of 23 clusters, 16 of which are new discoveries. Additional individual new objects are counted as “Not cataloged clusters (NIR)” in Table 1.

### A.3 MIR clusters

As a result of the high sensitivity of the GLIMPSE mid-infrared survey, Mercer et al. (2005) managed to find 92 new star clusters (2 of which are globular clusters) using an automated algorithm applied to the GLIMPSE point source catalog and archive, and a visual inspection of the image mosaics to search for ECs (the GLIMPSE Galactic range at that time was and