Properties of Liquid Clusters in Largescale Molecular Dynamics Nucleation Simulations
Abstract
We have performed largescale LennardJones molecular dynamics simulations of homogeneous vaportoliquid nucleation, with atoms. This large number allows us to resolve extremely low nucleation rates, and also provides excellent statistics for cluster properties over a wide range of cluster sizes. The nucleation rates, cluster growth rates, and size distributions are presented in Diemand et al. [J. Chem. Phys. 139, 74309 (2013)], while this paper analyses the properties of the clusters. We explore the cluster temperatures, density profiles, potential energies and shapes. A thorough understanding of the properties of the clusters is crucial to the formulation of nucleation models. Significant latent heat is retained by stable clusters, by as much as for clusters with size . We find that the clusters deviate remarkably from spherical  with ellipsoidal axis ratios for critical cluster sizes typically within and . We examine cluster spin angular momentum, and find that it plays a negligible role in the cluster dynamics. The interfaces of large, stable clusters are thiner than planar equilibrium interfaces by . At the critical cluster size, the cluster central densities are between lower than the bulk liquid expectations. These lower densities imply largerthanexpected surface areas, which increase the energy cost to form a surface, which lowers nucleation rates.
pacs:
05.10.a, 05.70.Fh, 05.70.Ln, 05.70.Np, 36.40.Ei, 64.60.qe, 64.70.Hz, 64.60.Kw, 64.10.+h, 83.10.Mj, 83.10.Rs, 83.10.Tv
I Introduction
A homogeneous vapor, when supersaturated, changes phase to liquid through the process of nucleation. This transformation is stochastically driven through the erratic formation of clusters, made up of atoms clinging together in droplets large enough that the free energy barrier is surpassed. Despite the ubiquity of this process in nature, attempts to model this process have met with difficulty, because the properties of the nanoscale liquidlike clusters are not well known Kalikmanov (2013). Unlike real laboratory experiments (Sinha, S. et al. (2010)Sinha et al. (2010) for example), computer simulations offer detailed information on the properties and evolution of clusters. However, direct simulations of nucleation are typically performed using a few thousand atoms only, and are therefore limited to extremely high nucleation rates and to forming a small number stable clusters (e.g. Wedekind et al. (2007)Wedekind et al. (2007) and Napari et al. (2009)(Napari, Julin, and Vehkam?ki, 2009a)). A large number of small simulations allows us to constrain the critical cluster properties in the high nucleation rate regime, but there seems to be limited information in the literature on critical cluster properties. An alternative approach is to simulate clusters in equilibrium with a surrounding vapor, however the resulting cluster properties seem to differ significantly from those seen in nucleation simulations Napari, Julin, and Vehkam?ki (2009a).
We report here on the properties of the clusters which form in largescale molecular dynamics LennardJones simulations. These direct simulations of homogeneous nucleation are much larger and probe much lower nucleation rates than any previous direct nucleation simulation. Some of the simulations even cover the same temperatures, pressures, supersaturations and nucleation rates as the recent Argon supersonic nozzle (SSN) experimentSinha et al. (2010) and allow, for the first time, direct comparisons to be made. The nine simulations we analyse here are part of a larger suite of runs: results of these runs pertaining to nucleation rates and comparisons to nucleation models and the SSN experimentSinha et al. (2010), are presented in Diemand et al. (2013)Diemand et al. (2013).
The large size of these simulations is primarily necessitated by the rarity of nucleation events at these low supersaturations. However, a further benefit gained from large simulations is the substantial number of nucleated droplets which are able to continue growing without significant decreases in the vapor density. This allows us to study, with good statistics, the properties of clusters as they grow, embedded within an realistic unchanging environment. This is particularly important in understanding the role that the droplet’s surface plays in the development of the droplet  as a bustling interface between the denser (and evergrowing) core, and the vapor outside. The nucleation properties of the simulations, the cluster growth rates and size distributions, and comparisons to nucleation models are presented in Diemand et al. (2013)Diemand et al. (2013), while here we explore the properties of the clusters themselves. Studying the properties of the nanosized liquid clusters which form, both stable and unstable, is of service to understanding the details of the nucleation process, the reasons behind the shortcomings of the available nucleation models, and aids in the blueprinting and selection of ingredients for future ones.
Section II provides details on the numerics of the simulations. In Section III we present the temperatures of the clusters and in Section IV we show the clusters’ potential energies. Section V addresses cluster rotation and angular momentum. The shapes the clusters take on is detailed in section VI. The cluster density profiles are explored in section VII. Section VIII will address cluster sizes and we use this information to revisit nucleation theory in section IX.
Ii Numerical Simulations
The simulations were performed with the Largescale Atomic/Molecular Massively Parallel Simulator Plimpton (1995), an opensource code developed at Sandia National Laboratories. The interaction potential is LennardJones,
(1) 
though truncated at , as well as shifted to zero. The properties of the Lennard Jones fluid depend on the cutoff scale, and our chosen cutoff is widely used in simulations, resulting in properties similar to Argon at reasonable computational cost. The integration routine is the wellknown Verlet integrator (often referred to as leapfrog), with a timestep of , regardless of the simulation temperature. In the Argon system, the units become , , and
The initial conditions correspond to randomPanneton, L’Ecuyer, and Matsumoto (2006) positions, and random velocities corresponding to a chosen temperature, in a cube with periodic boundary conditions. The simulations are carried out at constant average temperature, but not constant energy. The velocities of all the atoms in the simulation are rescaled, so that the average temperature of the simulation box remains constantWedekind, Reguera, and Strey (2007). The condensation process converts potential energy into kinetic energy as the clusters form, adding heat to the clusters. However because the amount of atoms in gas relative to the number in the condense phase is so large, the amount of thermostatting required is negligible. Because the nucleation rates explored in these simulations are relatively low, relative to the number of monomers in the simulation, very few clusters form. This means that the average heat produced through the transformation is extremely low, and so the amount of velocity rescaling necessary almost negligible.
Run ID  T  Diemand et al. (2013)  Diemand et al. (2013)  

[]  
T10n60  1.0  
T10n55  1.0  
T8n30  0.8  
T6n80  0.6  
T6n65  0.6  
T6n55  0.6  
T5n40  0.5  
T4n10  0.4  
T4n7  0.4 
The initial density of the simulations are indicated in table 1. The monomer density drops rapidly as the gas subcritical cluster distribution forms. The clusters are identified by the Stillinger Stillinger (1963) algorithm, whereby neighbours with small enough separations are joined into a common group. The linking length is temperature dependent, and given in table 2. It is set by the distance below which two monomers would be bound in the potential (1), were their velocities to correspond to the thermodynamic average, (Yasuoka, Matsumoto, and Kataoka, 1994; Tanaka et al., 2011). The type of initial conditions, the simulation timestep, thermostatting consequences, and the LennardJones cutoff, have been investigated through convergence tests in Diemand et al. (2013)Diemand et al. (2013).
The runs we have chosen to analyse here cover a broad temperature range , and a nucleation rate range  . All the results presented in this paper come from single snapshots taken at the end of each run. Because of the different nucleation rates, cluster growth rates, critical cluster sizes, and because all runs were terminated at a different time, different runs have markedly different size distributions at the time of termination. For example, while run T8n3 formed stable clusters, the largest of which has members, run T6n55 formed only a single stable cluster, yet made up of atoms (pictured in figure 1), while T10n55 formed clusters up to , although still below the critical size for this run. However, even a run which forms no stable clusters is still of value, as the subcritical droplets, whose properties we are interested in, are always present in the gas. Sometimes it is convenient to group the atoms of each cluster into one of two categories: according to whether they belong to the core, or to the surface. Atoms are considered core atoms if they have at least neighbours within the search radius. is temperature dependent, and is chosen such that the number of neighbours at the bulk density is approximately constant. The cutoffs are given in table 2. Unless otherwise indicated, error bars indicate the root mean square scatter in the measured quantity. In many cases, it is instructive to make comparisons between the liquid in the clusters of various sizes (and therefore curvatures), and the bulk liquid. To facilitate this comparison, we perform supplementary MD saturation simulations of the vaporliquid phase equilibrium(Werth et al., 2013; Baidakov et al., 2007; Trokhymchuk and Alejandre, 1999; Mecke, Winkelmann, and Fischer, 1997), in order to calculate the relevant quantities directly. Appendix B details these simulations, whose parametervalue results are given in table 3.
Iii Temperatures
We define the temperature of an ensemble of atoms from their mean kinetic energy,
(2) 
where is the total number of atoms in the ensemble, are the atom masses, and the velocities are those relative to the simulation box. For small, outofequilibrium clusters it can be troublesome to define temperature as an average over kinetic energy. However, by taking ensemble averages, the large number of small, subcritical clusters in the simulations mitigate this complication. The cluster temperature differences with the average bath temperature, is plotted for a few runs in figure 2, and shows that clusters smaller and at the critical cluster size are in thermodynamic equilibrium with the gas. This can be expected as subcritical clusters are as likely to accrete a monomer as they are to evaporate one: Because the growth rate is equal to the loss rate, subcritical clusters quickly lose latent heat due to the many interactions that they undergo as they randomwalk the size ladder. Wedekind et al. 2007 Wedekind, Reguera, and Strey (2007) find a similar behaviour in simulations. Their temperature differences are larger and set in at smaller cluster sizes than those in our simulations, as expected at their much higher nucleation rates.
Larger than the critical size, the cluster temperatures relative to the gas temperature increase with the cluster size. For simulations at the same temperature, the higher supersaturation case has a higher latent heat retention. This is likely due to the higher growth rates, caused by the higher collision rate and also by the higher probability that a monomer sticks to a clusterTanaka et al. (2011); Diemand et al. (2013).
The only set of runs without a significant postcritical signal are the high temperature runs. It is possible that evaporation, which is proportional to Diemand et al. (2013), is efficient enough at the low supersaturations of the runs to keep the clusters closer to thermal equilibrium with the gas.
If we divide the atoms of every cluster into two population types: core atoms and surface atoms, based on the number of neighbours that each atom has, we can investigate their temperature differences. Across all of our simulations, we find no significant difference in the core temperatures vs. the surface temperatures. The clusters are conductive enough for the surfaces to maintain thermodynamic equilibrium with their cores.
Figure 3 shows the temperature probability distribution for clusters of various sizes, for the hightemperature run T10n6. Due to the asymmetry of the distribution for smaller clusters, the average temperature is not equal to the most probable one, . The distributions as a function of cluster size were derived by McGraw and Laviolette McGraw and LaViolette (1995):
(3) 
where provides normalisation, and is the cluster’s heat capacity. This predicted form fits the distributions shown in Figure 3 extremely well and allows us to derive the most probable temperature very accurately. We fit this curve to the temperature probability distributions to all runs and cluster sizes where we have sufficient statistics and plot the resulting values in figure 4. Figures 2 and 4 also show comparisons with Feder et al (1966)’s(Feder et al., 1966) classical nonisothermal nucleation theory (CNINT), assuming a sticking probability of one, no carrier gas and no evaporation. According to this theory is negative below the critical cluster size, zero for critical clusters and positive above the critical size. The CNINT agrees only qualitatively with the MD results. Discrepencies are expected because the classical nucleation theory does not match the critical cluster sizes, size distributions and nucleation rates found in our MD simulations Diemand et al. (2013). Similar qualitative agreement was found in simulations with much higher nucleation rates inWedekind, Reguera, and Strey (2007).
For the heat capacity , all simulations are consistent with a simple linear fit with slope of against , i.e. the heat capacity per molecule is almost equal to the ideal value for a monoatomic gas.
We remark that:

For clusters larger than the critical cluster size, significant latent heat is retained by the clusters. An exception to this are the high temperatures runs, for which temperatures of clusters greater than the critical size are consistent with zero (within the scatter)  no clear latent heat retention signal is observed. The excess temperature is marginally larger for the lower gas temperature () runs than the higher ().

Within the scatter, the cluster core temperatures are the same as the surface temperatures. Heat is conducted efficiently enough that latent heat, which is added to the surface is transferred to the core rapidly enough to maintain equilibrium  within the bounds of statistical veracity.

The temperature probability distributions as a function of cluster size have the expected form  reshaping from the Maxwell distribution towards the normal distribution as cluster size grows.

For clusters smaller than the critical size, the most probable temperature relative to the gas temperature is universal across all runs.
Iv Potential Energies
In atomistic theories of nucleation, e.g. the Fisher droplet modelFisher (1967) and other atomistic models (see e.g. Kashchiev (2000)(Kashchiev, 2000) and Kalikmanov (2013)(Kalikmanov, 2013) for details), one relates the surface energy of a cluster to its total potential energy :
(4) 
where is the potential energy per particle in the bulk liquid phase. In this approach the surface energy of a cluster is simply the difference between its actual potential energy and the one it would have if all its members were embedded in bulk liquid. We have tried this approach using the bulk liquid potential energies measured in the equilibrium simulations described in the Appendix and find that the resulting surface terms are too large: In the atomistic model the free energy difference at saturation (i.e. where the volume term vanishes) are equal to the surface term, with a constant shift to have zero for monomers:
(5) 
For , the classical volume term is no longer null:
(6) 
This estimate for the free energy lies far above the true free energy landscape, which we reconstructed from the size distribution in the simulation using a new analysis method (Tanaka, K. et al (2014), in preparation). Figure 16 plots this reconstruction . In comparison, the atomistic theory free energy curve lies far above these estimates, reaching a critical size , at which point the atomistic free energy is . The same was found in other simulations, i.e. this simple implementation of an atomistic model seems to overestimate the surface energy and therefore underestimates the nucleation rates by large factors. On the other hand, if not the bulk liquid phase value is chosen for , but the core potential energy at size (red dots on figure 5), then the corresponding surface energy is too low.
The lower panel of figure 6 shows the potential energy per particle as a function of distance from the centerofmass for the large cluster pictured in figure 1. It reaches the same values as in the bulkliquid at the same temperature. While we observe the kinetic energy differences (i.e. temperature differences, see Section III) between core and surface atoms to be consistent with zero, the potential energies of the core are expected to be considerably lower than those of the surface particles, as they have more neighbours. Figure 5 plots the potential energies of core and surface atoms for a few runs. As the clusters grow, they become more tightly packed, and so both the surface and core potential energies drop. The potential energies per core particle are expected to reach a minimum in the limit that the clusters grow large enough to have core potential energies equal to the bulk liquid. In our lowtemperature simulations, this occurs at , and for our high temperatures simulations at .
To predict nucleation rates accurately one needs an good estimate of the surface term for clusters near the critical size. Figure 5 shows that the core particles in these relatively small clusters are still strongly affected by the surface and the transition region, i.e. their potential energies per particle are far less negative than those found for true bulk liquid particles. This discrepancy might be related to the failure of the simple atomistic model described above. Replacing in Eqn. 4 with the less negative values actually found in the cores of critical clusters can significantly improve the estimates, at least for , and lead to better nucleation rate estimates.
V Rotation
Because clusters grow through isotropic interactions with vapor atoms, it can be expected that the spin of the clusters decreases for larger clusters. The angular momentum of a single particle in a cluster is
(7) 
where C.o.M denotes that the quantity is taken relative to the cluster centerofmass, and we have set the mass to unity. The magnitude of the total angular momentum of the cluster is then
(8) 
We define the related quantity
(9) 
The dimensionless quantity provides an indication of the extent to which the members of the cluster spin in a common direction. Figure 7 plots this ratio as a function of cluster size for two runs at the same temperature. Also plotted are the values of this quantity from constantdensity bulk liquid simulations. This is done by evaluating the quantity for atoms within a randomly entered spherical boundary of different sizes. This provides a noise estimate to which the nucleation simulation results may be compared. Across all runs we find that for small clusters sizes (), the spin is slightly above the noise level, but decays rapidly for larger clusters. Across all simulations, at size the ratio is within , with the high temperature runs at the high end, and the low at the lower. This is to be expected if the axis of rotation is random relative to the ellipsoidal axis.
Vi Shapes
One assumption usually made by nucleation models is that the clusters are spherical (with few exceptions, see e.g. Prestipino et al. (2012) Prestipino, Laio, and Tosatti (2012)). This is motivated by the sphericity of large liquid droplets, which bear this shape as it minimises the surface area, and therefore the surface energy. While the cluster shapes in our simulations can deviate significantly from any symmetries, for the sake of simplicity, we will investigate the cluster shapes by analysing the extent to which the clusters are ellipsoidal. We use principal component analysis to calculate the cluster ellipticities. For each cluster, we calculate the semimajor ellipsoidal axes by following the procedure outlined in Zemp et al 2011Zemp et al. (2011), which applies this approach to investigate the shapes of dark matter substructure in simulations. We reintroduce this procedure in appendix A. Figure 8 illustrates the results of this analysis for a typical cluster from one of our simulations.
The cluster axis ratios as a function of cluster size are plotted in figure 9. For a single run, figure 10 shows the probability distribution for the axis ratios, and how they change with cluster size. Although the clusters become more spherical as they grow, they are still significantly ellipsoidal at all sizes  in contrast to the standard model assumption that both sub and postcritical clusters are spherical. We observe a trend of increasing ellipticity as temperatures increase. Especially important to nucleation are the shapes around the critical cluster sizes. For each simulation, we find that the critical clusters have axis ratios and . These ellipticities are rather significant, and contrary to standard assumption of spherical shapes in nucleation models.
To explore to origin of these nonspherical cluster shapes we performed supplementary simulations: Eight spherical liquid clusters of size are placed in a gas at saturation and . After letting the system equilibrate for 10’000 timesteps, we perform the PCA on the clusters. Their axis ratios are in the second panel (blue crosses and pink circles) of figure 9. We find that they bear the same ellipsoidal distortions as the clusters in the nucleation simulation. We therefore conclude that it unlikely that the clusters’ ellipsoidal shapes are in some way dependent on their formation through the nucleation process. Section V investigates whether angular momentum plays a significant role in cluster dynamics, and finds it relatively insignificant. The relatively low spins suggest that the large ellipticities (figure 9) are not supported by angular momentum.
This leaves us to conclude that the main cause of the cluster ellipticities are thermal fluctuations. High temperatures lead to larger surface fluctuations, and cause larger deviations from sphericity found at high temperatures. Interestingly, the average differences between the axis lengths are nearly independent of cluster size, but increase with temperature: We find that and for all clusters in the simulations. Large clusters are rounder (axis ratios closer to one) mainly because their larger size makes the nearly constant absolute differences become smaller in relative terms. The average differences grow with temperature and at we find and . Short animations of evolving clusters are available, in which their nonsphericity is clearly visible(supp, ).
Vii Central Densities and Transition Regions
An essential aspect of tiny clusters is the interface layer between the constantdensity, liquid interior and the gas outside. The surface energy of the cluster depends on the properties of this layer. Of particular importance to the classical nucleation theory is the surface tension of the droplet, which depends on the interfacial pressure profile. Classical nucleation models assume that clusters are homogeneous, spherical droplets, with a sharp, well defined boundary. At this boundary the fluid properties are assumed to jump from bulk liquid properties on the inside to bulk vapor properties on the outside (see for example Kalikmanov, V.I. (2013)(Kalikmanov, 2013) and Kashchiev (2000)Kashchiev (2000)). In this section we show that the cluster properties found in our simulation deviate significantly from these assumptions.
Numerical simulationsChapela et al. (1977) of liquidvapor interfaces have shown that the density transition is continuous, and for spherical droplets, is wellapproximated by
(10) 
where is the density within the cluster, the gas density, the corona position, and its width. In each cluster’s centerofmass frame, we bin the spherical number density, using a bin size of The number density profiles for clusters of the same size are used to make ensemble averages, to which equation 10 can be fit.
Because the interface width is computed from fits to a spherical density profile, yet the clusters do not exhibit spherical symmetry, even a sharp transition region would result in a density profile with . Under the more realistic assumption of ellipticity, this artificial contribution to can be estimated. Figure 11 shows the ratio of and the long axis length , as a function of the axis ratios. We find that this artificial contribution to is smaller than our measured the interface widths in practically all cases. In other words, this effect, which in principle could lead us to overestimate significantly, is actually negligible.
The upper panel of Figure 6 shows the binned centerofmass density profile of the large cluster shown in figure 1, as well as the fit. Figures 12 and 13 show that the inner density and interface width respectively depend on cluster size: Generally, both the inner density and the interface width increase with cluster size (with the exception of the anomalous inner densities in small, high temperature clusters, see below). Inner densities and interface widths may then be compared to the analogous quantities from equilibrium simulations of planar vaporliquid interfaces at various temperatures.
We note that:

At small sizes (), the clusters hardly have a core, and so equation (10) gives them only a transition component.

At the critical cluster sizes , the inner densities are significantly lower than in bulk liquid. This implies a surface area larger than expected from classical nucleation theories.

For the clusters are warmer than the gas (see figure 2), and due to thermal expansion, have a lower density than the bulk would have at the gas temperature. The bulk densities taken at the cluster temperatures agrees with the central cluster densities only for our very largest clusters ().

For all simulations and cluster sizes (i.e. subcritical and postcritical), the cluster transition regions are thinner than the planar. equilibrium interfaces simulated at the same temperature.
MonteCarlo simulations ten Wolde and Frenkel (1998), find that critical clusters have inner densities equal to that of bulk liquid, which is at tension with our observations. Napari et al. (2009)(Napari, Julin, and Vehkam?ki, 2009b) estimated interface widths by comparing the sizes of cluster determined with different cluster definitions. They conclude that critical clusters from direct nucleation simulations have a thicker transition region than spherical clusters in equilibrium with the surrounding vapor. Due to the different simulations and analysis methods a detailed comparison is difficult.
As mentioned above, figure 12 shows that the inner densities generally increase with cluster size, except for small, high temperature clusters. Fitting the density profile of small clusters is difficult, because they do not have a well defined, constant inner density. This could affect the resulting values. To confirm the surprising anomaly in the inner densities of small, high temperature clusters we also measure the central cluster densities within a sphere of the cluster centre of mass directly: Figure 14 plots the ensembleaveraged number of particles within . This alternative measure confirms the findings from directly, without any fitting procedure: Generally, as the clusters grow, they become more tightly bound, leading to an increased central density. The small clusters in our high temperature runs at show a different trend: the central density first decreases, and then increases again. At the minimum central density occurs at , and at it occurs in the range This anomaly is evident both in the values from the fits and in the densities within the central . We are currently unable to explain this behaviour.
Viii Cluster sizes
We have two means for measuring cluster sizes. The one is with the principle component analysis procedure, detailed in section VI and appendix A, and the other with density profile fits, explained in section VII. The principle component analysis route, under the assumption of a constant density ellipsoid, yields the three ellipsoidal axes and for each cluster. The second method assigns to each density profile the centre of the transition region, . For nucleation, sizes are important because they provide an estimate for the cluster surface area, which helps set the total surface energy  a key component for nucleation theories. The simplest nucleation models assume that a cluster with members is spherical, and has a density equal to that of the bulk at the same temperature, from which a size, and therefore surface area, can be calculated. For three ellipsoidal axes and , the surface area may be analytically obtained using the approximate relation Klamkin (1971)
(11) 
which has a worstcase relative error . Figure 15 compares, using our two sizemeasuring methods, the sizes and surface areas of the clusters to the standard nucleation model assumptions.
Both methods give larger sizes and surface areas than the classical model assumptions, due to the densities being lower than the bulk. Both methods suffer from large uncertainties and it is unclear how their resulting surface areas relate to the area of the true (unknown) surface of tension. The surface of tension lies somewhere in the transition regions. We have tried to calculate the radius of the surface of tension and the surface energy by assuming spherical interfaces and using the IrvingKirkwood(Irving and Kirkwood, 1950a) pressure tensor approach applied to spherical droplets(Irving and Kirkwood, 1950b; Hi Lee, 2012). However, the transient nature of the nonequilibrium clusters in our nucleation simulations does not allow for the accumulation of strongenough statistics for us to get a useful surface energy signal. We are therefore unable to constrain the cluster surface tension and the location of their true surface of tension.
For critical clusters we observe ratios from about 0.6 to 0.85 (see Figure 13). This introduces very large possible shifts in the surface areas: Placing the surface of tension at instead of would increase the critical cluster sizes by 30 to 43 percent, and their surface areas by factors of 2.2 to 2.9. Setting the surface of tension to instead of would decrease surface areas by factors of 2.9 to 5.3. These examples illustrate how large the uncertainties in the area of the surface of tension are, which introduces huge uncertainties of many orders of magnitudes into any nucleation rate predictions based on these surface areas.
Compared with the spherical density profile based size definition, which uses the midpoint of the transition region as cluster radius, the principle component analysis method usually gives larger sizes. This is because of the assumption that clusters are constant density ellipsoids, when converting the eigenvalues to axis lengths using equation 20. The PCA analysis weights outer members heavily in the computation, yet these outer members are not part of a constant density neighbourhood, because they belong to the tail of the transition region. This effect decreases as the size of the transition region becomes small relative to the cluster size, i.e. when . At low temperatures, the PCA route yields smaller clusters than the density profile method. This, because lowtemperature clusters are more spherical than higher temperature ones.
Ix Revisiting nucleation models
Nucleation models for the free energy of formation aspire to find the balance between the energy gain and cost due to creation of volume and surface. The volume term is wellunderstood as its contribution to the formation energy dominates in the largecluster limit, whose properties are therefore straightforward to verify. Nucleation models’ shortcomings are thought to be due to an insufficient understanding of the surface energy contribution to the free energy, which dominates for small clusters, and which is therefore difficult to verify. Most nucleation models in the literature offer various forms for the surface energy component. For example, a common choice for the surface energy expresses the surface tension of a spherical cluster as a correction to the planar surface tension. This prescription for the free energy takes the form (Oxtoby, 1992; Laaksonen, Ford, and Kulmala, 1994)
(12) 
where is the cluster radius. The DillmanMeier approach(Dillmann and Meier, 1991) lets the term play the role of a Tolmanlike Tolman (1949) correction. The SP model as used in Tanaka et al. (Tanaka et al., 2011) and Diemand et al. (Diemand et al., 2013) makes the choice
(13) 
where is set using the second virial coefficient, and
with the density taken to be equal to that of the bulk density. The classical nucleation model on the other hand, lets the surface tension take on the planar value, regardless of cluster size, setting . Both the CNT and SP models however, along with many others, make the same choice for the surface area, setting it to In this section we explore the effect of replacing this surface area estimate with the directlymeasured values
For the surface tension, we use the SP model parameters (13). We impose the further stipulation that the free energy of formation for a cluster of size one is zero:
(14) 
Figure 16 plots modelled free energy curves for a single run, and compares them to a kinematically reconstructed (Tanaka, K. et al (2014), in preparation) free energy. Our techniques for estimating the sizes (see section VIII and figure 15) show that because the cluster densities are lower than the bulk values, the surface areas are larger than the traditional nucleation model surface area assumptions. This increases the cost in forming a surface, lowering nucleation rates. Figure 17 compares the resultant nucleation rates to those measured directly from the simulation using the YasuokaMatsumoto(Yasuoka and Matsumoto, 1998) (or threshold) method. In almost all cases, the directlymeasured surface areas lower the nucleation rate. The density profile surface area estimates improve the model estimation by a factor  However, the PCA surface area estimates significantly underestimate the nucleation rates, especially at high temperature, where surface fluctuations dominate the size measuring method. One should not keep ambitions for retrieving perfectagreement nucleation rates with the procedure used in this section, as our ‘empirical’ surface energy model is still at the behest of a theoretical surface tension model, which likely holds shortcomings unfortunately typical to droplet surface tension models. Given that reasonable size measurements improve the model predictions, we are lead to conclude that it is not just the surface tension modelling which needs improvement  but that models must address surface area estimates directly, taking into account the notyetbulk central densities of clusters.
X Conclusions
This work offers detailed description of cluster formation in unprecedented largescale LennardJones molecular dynamics simulations of homogeneous vaportoliquid nucleation. Our main findings are

Significant latent heat is retained by large, stable clusters: as much as for clusters with . Small, subcritical clusters on the other hand have the exact same average temperature as the surrounding vapor.

Cluster shapes deviate significantly from spherical: ellipsoidal axis ratios for critical cluster sizes lie typically within and .

Cluster spin is small and plays a negligible role in the cluster dynamics.

For critical, subcritical, and postcritical clusters, the central potential energies per particle are significantly less negative than in the bulk liquid. They reach the bulk values only at large () sizes.

Central cluster densities generally increase with cluster size. However, for small, high temeperature clusters we uncover a surprising exception to this rule: their central densities decrease with size, reach a minimum (at for and at for ) and then join the general trend of increasing central densities with larger sizes.

For critical and subcritical clusters, , the central densities are significantly smaller than in the bulk liquid. At the critical cluster size, the cluster central densities are between lower than the bulk expectations. This implies a surface area larger than expected from classical nucleation theories.

For the clusters are warmer than the gas (see figure 2), and due to thermal expansion, have a lower density than the bulk would have at the gas temperature. The bulk densities and potential energies taken at the cluster temperatures agree with the central cluster properties only for our very largest clusters ().

For all simulations and cluster sizes (i.e. subcritical and postcritical), the cluster transition regions are thinner than the planar equilibrium interfaces simulated at the same temperature.

Cluster size measurements suggest larger sizes than assumed in classical nucleation models, implying lowerthanexpected nucleation rates. However, the exact area of the true surface of tension remains unknown, as does the surface tension itself  a major source of uncertainty in nucleation rate predictions.

Across all the cluster properties examined in this paper, there exists significant spread for clusters at each size. The standard approach to nucleation theory assumes all clusters of the same size have the same properties. This therefore allots to all clusters of a certain size, the same surface energy, as opposed to distributing them into disparate population types. The observed scatter in the cluster properties at each size suggests that the development of nucleation theories which address this may lead to a more realistic description of the process.
Acknowledgements.
We thank the referees for useful comments. We acknowledge a PRACE award (36 million CPU hours) on Hermit at HLRS. Additional computations were preformed on SuperMUC at LRZ, on Rosa at CSCS and on the zBox4 at UZH. J.D. and R.A. are supported by the Swiss National Science Foundation.Appendix A Determining cluster shapes with principal component analysis
We define the tensor
(15) 
which is the second moment of the mass distribution  the part of the moment of inertia tensor responsible for the describing the matter distribution. The shape tensor is defined by
(16) 
For discrete, equal mass particles, in the centreofmass frame, the elements of the shape tensor read
(17)  
(18) 
in cartesian coordinates relative to the cluster centre of mass, where the sum is over all the members of the cluster. If there exist vectors , for which satisfy
(19) 
then the triplet and form the eigenvectoreigenvalue pairs for . For an ellipsoid of uniform density, the axes and are related to the eigenvalues of the shape tensor via
(20) 
We choose the convention . Clusters not large enough to have a significant number of core atoms  therefore composed only of a fluffy surface, cannot be wellapproximated by a constantdensity ellipsoid. For these clusters, the method can overestimate the axis lengths, implying that this method does not provide a sound estimate of clusters’ sizes when they are small. However, the axis ratios provide a useful indicator of the cluster shapes regardless of the cluster size.
Appendix B Supplementary equilibrium simulations of planar vaporliquid interfaces
To determine the thermodynamic properties of the LennardJones fluid simulated here a range of liquidvapor phase equilibrium simulation were run, similar to those in (Werth et al., 2013; Baidakov et al., 2007; Trokhymchuk and Alejandre, 1999; Mecke, Winkelmann, and Fischer, 1997). We used the same setup and analysis as described in (Baidakov et al., 2007), except that we use a different cutoff scale () and a different timestep (). For simplicity, we calculate the surface tensions using the KirkwoodBuff pressure tensor only and to obtain better statistics the interface surface area was increased by doubling the size of the simulated rectangular parallelepiped in the x and y direction, i.e. we set .
Using the same cutoff scale () as in (Baidakov et al., 2007), we are able to exactly reproduce their results given at and . The results with our chosen cutoff scale () are shown in Figure 18. The surface tensions are about 5 percent lower, and the bulk liquid densities lower by than those found in (Baidakov et al., 2007). Figure 18 also shows fitting functions to our simulation results for the bulk liquid density
(21) 
with
(22) 
the planar interface thickness
(23) 
the planar surface tension
(24) 
and the potential energy per particle in the bulk liquid
(25) 
We use the values of these fitting functions throughout this paper, the values at and are listed in table 3. Results of the thermodynamic quantities that we calculate from slab simulations, and comparison to similar simulations by other authors are plotted in figure 19.
Note that at we have to rely on extrapolations . We could not get meaningful constraints from equilibrium simulations at this low very temperature, because our liquid slabs begin to freeze before a true equilibrium with the vapor is established. The same limitations were also reported in Baidakov et al. (2007)(Baidakov et al., 2007).
The saturation pressures in our equilibrium simulations agree well with the fitting function proposed in Trokhymchuk et al.Trokhymchuk and Alejandre (1999), and we use this fitting function and the actual pressure measured in the nucleation simulations to determine the supersaturation , see Diemand et al 2013(Diemand et al., 2013) for details.
1.0  0.682  0.437  4.77  1.94 

0.8  0.787  0.801  5.591  1.52 
0.6  0.874  1.17  6.344  6.21 
0.5  0.913  1.48  6.708  9.46 
0.4  0.950  1.57  7.10  13.9 
References
 Kalikmanov (2013) V. I. Kalikmanov, Nucleation Theory, Lecture Notes in Physics (Springer, Dordrecht, 2013).
 Sinha et al. (2010) S. Sinha, A. Bhabhe, H. Laksmono, J. Wölk, R. Strey, and B. Wyslouzil, The Journal of Chemical Physics 132, 064304 (2010).
 Wedekind et al. (2007) J. Wedekind, J. Wölk, D. Reguera, and R. Strey, The Journal of Chemical Physics 127, 154515 (2007).
 Napari, Julin, and Vehkam?ki (2009a) I. Napari, J. Julin, and H. Vehkam?ki, The Journal of Chemical Physics 131, 244511 (2009a).
 Diemand et al. (2013) J. Diemand, R. Angélil, K. K. Tanaka, and H. Tanaka, The Journal of Chemical Physics 139, 074309 (2013).
 Plimpton (1995) S. Plimpton, Journal of Computational Physics 117, 1 (1995).
 Panneton, L’Ecuyer, and Matsumoto (2006) F. Panneton, P. L’Ecuyer, and M. Matsumoto, ACM Trans. Math. Softw. 32, 1 (2006).
 Wedekind, Reguera, and Strey (2007) J. Wedekind, D. Reguera, and R. Strey, The Journal of Chemical Physics 127, 064501 (2007).
 Stillinger (1963) F. H. Stillinger, The Journal of Chemical Physics 38, 1486 (1963).
 Yasuoka, Matsumoto, and Kataoka (1994) K. Yasuoka, M. Matsumoto, and Y. Kataoka, The Journal of Chemical Physics 101, 7904 (1994).
 Tanaka et al. (2011) K. K. Tanaka, H. Tanaka, T. Yamamoto, and K. Kawamura, The Journal of Chemical Physics 134, 204313 (2011).
 Werth et al. (2013) S. Werth, S. V. Lishchuk, M. Horsch, and H. Hasse, Physica A Statistical Mechanics and its Applications 392, 2359 (2013), arXiv:1301.2546 [condmat.soft] .
 Baidakov et al. (2007) V. G. Baidakov, S. P. Protsenko, Z. R. Kozlova, and G. G. Chernykh, The Journal of Chemical Physics 126, 214505 (2007).
 Trokhymchuk and Alejandre (1999) A. Trokhymchuk and J. Alejandre, The Journal of Chemical Physics 111, 8510 (1999).
 Mecke, Winkelmann, and Fischer (1997) M. Mecke, J. Winkelmann, and J. Fischer, The Journal of chemical physics 107, 9264 (1997).
 Feder et al. (1966) J. Feder, K. Russell, J. Lothe, and G. Pound, Advances in Physics 15, 111 (1966), http://www.tandfonline.com/doi/pdf/10.1080/00018736600101264 .
 McGraw and LaViolette (1995) R. McGraw and R. A. LaViolette, The Journal of Chemical Physics 102, 8983 (1995).
 Fisher (1967) M. E. Fisher, Physics 3 (1967).
 Kashchiev (2000) D. Kashchiev, Nucleation: Basic Theory with Applications (ButterworthHeinemann, Oxford, 2000).
 Prestipino, Laio, and Tosatti (2012) S. Prestipino, A. Laio, and E. Tosatti, Phys. Rev. Lett. 108, 225701 (2012).
 Zemp et al. (2011) M. Zemp, O. Y. Gnedin, N. Y. Gnedin, and A. V. Kravtsov, The Astrophysical Journal 197, 30 (2011), arXiv:1107.5582 [astroph.IM] .
 Chapela et al. (1977) G. A. Chapela, G. Saville, S. M. Thompson, and J. S. Rowlinson, J. Chem. Soc., Faraday Trans. 2 73, 1133 (1977).
 ten Wolde and Frenkel (1998) P. R. ten Wolde and D. Frenkel, The Journal of Chemical Physics 109, 9901 (1998).
 Napari, Julin, and Vehkam?ki (2009b) I. Napari, J. Julin, and H. Vehkam?ki, The Journal of Chemical Physics 131, 244511 (2009b).
 Klamkin (1971) M. S. Klamkin, The American Mathematical Monthly 78, pp. 280 (1971).
 Irving and Kirkwood (1950a) J. H. Irving and J. G. Kirkwood, The Journal of Chemical Physics 18, 817 (1950a).
 Irving and Kirkwood (1950b) J. H. Irving and J. G. Kirkwood, The Journal of Chemical Physics 18, 817 (1950b).
 Hi Lee (2012) S. Hi Lee, Bull. Korean Chem. Soc. 33, 3805 (2012).
 Oxtoby (1992) D. W. Oxtoby, Journal of Physics: Condensed Matter 4, 7627 (1992).
 Laaksonen, Ford, and Kulmala (1994) A. Laaksonen, I. J. Ford, and M. Kulmala, Phys. Rev. E 49, 5517 (1994).
 Dillmann and Meier (1991) A. Dillmann and G. E. A. Meier, The Journal of Chemical Physics 94, 3872 (1991).
 Tolman (1949) R. C. Tolman, The Journal of Chemical Physics 17, 333 (1949).
 Yasuoka and Matsumoto (1998) K. Yasuoka and M. Matsumoto, The Journal of Chemical Physics 109, 8451 (1998).
 Gu, Watkins, and Koplik (2010) K. Gu, B. Watkins, and J. Koplik, Fluid Phase Equilibria 297, 77 (2010).
 Dunikov, Malyshenko, and Zhakhovskii (2001) D. O. Dunikov, S. P. Malyshenko, and V. V. Zhakhovskii, The Journal of Chemical Physics 115, 6623 (2001).
 Vrabec et al. (2006) J. Vrabec, G. K. Kedia, G. Fuchs, and H. Hasse, Molecular Physics 104, 1509 (2006), http://www.tandfonline.com/doi/pdf/10.1080/00268970600556774 .
 (37) See Supplementary Material available at JCP online, for short movies of the trajectories of stable clusters.