Morphological Properties of z ∼ 1.5−3.6 Star Forming Galaxies

# An HST/WFC3-IR Morphological Survey of Galaxies at z=1.5−3.6: I. Survey Description and Morphological Properties of Star Forming Galaxies

David R. Law [1, 2], Charles C. Steidel [3], Alice E. Shapley [2], Sarah R. Nagy [2], Naveen A. Reddy [1, 4], & Dawn K. Erb [5]

[1] Hubble Fellow.
[2] Department of Physics and Astronomy, University of California, Los Angeles, CA 90095; drlaw, aes@astro.ucla.edu, snagy@ucla.edu
[3] California Institute of Technology, MS 249-17, Pasadena, CA 91125; ccs@astro.caltech.edu
[4] National Optical Astronomy Observatories, 950 N. Cherry Ave., Tucson, AZ 85719
[5] Department of Physics, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201

###### Abstract

We present the results of a 42-orbit Hubble Space Telescope Wide-Field Camera 3 (HST/WFC3) survey of the rest-frame optical morphologies of star forming galaxies with spectroscopic redshifts in the range z = 1.5−3.6. The survey consists of 42 orbits of F160W imaging covering arcmin distributed widely across the sky and reaching a depth of 27.9 AB for a detection within a 0.2 arcsec radius aperture. Focusing on an optically selected sample of 306 star forming galaxies with stellar masses in the range , we find that typical circularized effective half-light radii range from kpc and describe a stellar mass - radius relation as early as . While these galaxies are best described by an exponential surface brightness profile (Sersic index ), their distribution of axis ratios is strongly inconsistent with a population of inclined exponential disks and is better reproduced by triaxial stellar systems with minor/major and intermediate/major axis ratios and 0.7 respectively. While rest-UV and rest-optical morphologies are generally similar for a subset of galaxies with HST/ACS imaging data, differences are more pronounced at higher masses . Finally, we discuss galaxy morphology in the context of efforts to constrain the merger fraction, finding that morphologically-identified mergers/non-mergers generally have insignificant differences in terms of physical observables such as stellar mass and star formation rate, although merger-like galaxies selected according to some criteria have statistically smaller effective radii and correspondingly larger .

galaxies: high-redshift — galaxies: fundamental parameters — galaxies: structure
Based in part on data obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California, and NASA, and was made possible by the generous financial support of the W. M. Keck Foundation.

## 1. Introduction

In recent years our understanding of the broad global characteristics of galaxies in the young universe has grown considerably. Using rest-frame UV and optical spectroscopy and multi-wavelength broadband photometry it has been possible to estimate their stellar and dynamical masses, average metallicities, ages, and star formation rates across cosmic time from to the present day (e.g., Shapley et al. 2005; Cowie & Barger 2008; Maiolino et al. 2008; Stark et al. 2009). Such studies indicate that the majority of structures observed in the local universe were already in place at (Papovich et al. 2005) and point to the era spanned by the redshift range as the peak epoch of both the cosmic star formation rate density (Dickinson et al. 2003; Reddy et al. 2008) and AGN activity in the universe (e.g., Miyaji et al. 2000).

In contrast to our knowledge of the global characteristics of such galaxies from ever-expanding samples however, our understanding of their internal structure and evolution has been limited by their small angular size. It has therefore been challenging to constrain the major mode of mass assembly in these galaxies (i.e., from major/minor mergers, hot mode or cold filamentary gas accretion, etc.). With typical half-light radii arcsec at (Bouwens et al. 2004; Nagy et al. 2011), such galaxies are barely resolved in the arcsec FWHM ground-based imaging and spectroscopy that form the backbone of the observational data.

Significant efforts have therefore been invested in imaging studies capitalizing on the high angular resolution afforded by the Hubble Space Telescope (HST). Early efforts to characterize the morphologies of galaxies at (e.g., Abraham et al. 1996; Giavalisco et al. 1996; Lowenthal et al. 1997; Bouwens et al. 2004; Conselice et al. 2004; Lotz et al. 2006; Papovich et al. 2005; Law et al. 2007b; and references therein) used the visible-wavelength surveying efficiency of the ACS (Advanced Camera for Surveys) to demonstrate that star forming galaxies typically have irregular, clumpy morphologies unlike the well-known Hubble sequence that has been established since (e.g., Conselice et al. 2005; Oesch et al. 2010). Indeed, rest-UV luminosity and morphology for such galaxies appear to be only poorly correlated with other physical observables such as stellar mass, outflow characteristics, and characteristic rotation velocity (e.g., Shapley et al. 2005; Law et al. 2007b).

Recent technological developments have permitted additional insights to be gleaned from ground-based observations using adaptive-optics (AO) fed imagers or integral field unit (IFU) spectrographs on 10m-class telescopes (e.g., Law et al. 2007a, 2009; Melbourne et al. 2008a,b, 2011; Stark et al. 2008; Förster-Schreiber et al. 2009; Wright et al. 2009; Jones et al. 2010). Such IFU spectroscopy mapping rest-frame optical nebular line emission (redshifted into the near-IR at ) from star forming galaxies has suggested that high redshift star forming galaxies often have dispersion-dominated kinematics at odds with the classical picture of galaxy formation via rotationally supported thin gas disks (e.g., Förster-Schreiber et al. 2009; Law et al. 2009). Instead, the dynamical evolution of these systems may be driven by gravitational instabilities within massive gas-rich clumps or low angular-momentum cosmological gas flows (e.g., Kereš et al. 2005; Bournaud et al. 2007; Genzel et al. 2008). What is immediately clear is that we do not yet understand the dynamical state of galaxies during the period when they are forming the majority of their stars.

Both early rest-UV imaging and AO IFU observations of high-redshift galaxies tend to trace regions of active star formation however, and in order to understand these galaxies we also wish to map the regions in which the bulk of the underlying stellar population live. While young and old populations may have a generally similar distribution for lower-mass galaxies (e.g., Conselice et al. 2011), more significant differences exist for galaxies with larger stellar mass (e.g., Dickinson 2000; Papovich et al. 2005; and §4.3). Efforts to characterize rest-optical galaxy morphologies using ground-based instruments and/or the HST/NICMOS camera have been made by (e.g.) Papovich et al. (2005), Franx et al. (2008), Toft et al. (2009), van Dokkum et al. (2010), and Mosleh et al. (2011), generally finding that galaxies at were significantly smaller at fixed stellar mass than in the local universe. Additionally, HST/NICMOS work by Kriek et al. (2009) has demonstrated that star-forming and quiescent galaxies differ substantially from each other in relative compactness of their rest-optical morphologies, and both differ from their kin in the local universe.

Given the narrow field of view of both ground-based AO-fed imagers (e.g., Carrasco et al. 2010) and the HST/NICMOS camera (e.g., Conselice et al. 2011a) however, it is only recently with the advent of the new WFC3 camera onboard HST that it has become practical to perform wide-field morphological surveys in the near-IR that trace rest-frame optical emission from galaxies at . The results of the first such studies in the UDF have been reported recently in the literature (e.g., Cameron et al. 2010; Cassata et al. 2010; Conselice et al. 2011b). Our recent survey has greatly extended these early results by obtaining HST/WFC3-IR morphological data for 306 galaxies in 10 fields widely distributed across the sky for which we have obtained dense spectroscopic sampling.

Preliminary results for the evolution of the stellar mass - radius relation were presented in Nagy et al. (2011). In this first contribution of a series of papers using the full sample, we introduce our survey and describe a selection of results concerning evolution of the characteristic size, shape, and major merger fraction for actively star forming galaxies. Future contributions (Law et al. 2012, in preparation) will discuss the relation between morphology and low-ionization gas-phase kinematics, treat quiescent galaxies and AGN, and discuss the morphology of uniquely interesting galaxies (e.g., Q2343-BX442; Law & Shapley 2012, in preparation) in greater detail. This paper is organized as follows: In §2 we describe the HST/WFC3 observing program and review the properties of the star forming galaxy sample. In §3 we present postage-stamp morphologies of the galaxy sample and discuss our morphological analysis techniques. An extended discussion of the robustness of the morphological statistics and the systematic variations between measurement systems commonly adopted in the literature is presented in the Appendix. §4 summarizes the basic morphological characteristics (luminosity profile, relation to rest-UV imaging, and intrinsic 3D shape) of the galaxy sample, and the implications of our data for the evolution of the stellar mass - effective radius relation are discussed in §5. Finally, we use a variety of morphological statistics to constrain the major merger fraction and its evolution with redshift in §6. We summarize our results in §7.

Throughout our analysis, we adopt a standard ΛCDM cosmology based on the seven-year WMAP results (Komatsu et al. 2011) in which km s$^{-1}$ Mpc$^{-1}$, , and .

## 2. Observational Data

### 2.1. Observations and Data Reduction

Data were obtained using the WFC3/IR camera onboard the Hubble Space Telescope (HST-WFC3) as part of the Cycle 17 program GO-11694 (PI: Law). This program comprised 42 orbits using the F160W filter (  Å, which traces rest-frame   Å at respectively), divided amongst fourteen pointings in ten different survey fields (see Table 1) for a combined sky coverage of arcmin$^2$ centered on lines of sight to bright () background QSOs. [Footnote 1: Two fields (Q1623+26 and Q2343+12) had additional pointings in order to include sightlines to additional bright background QSOs and to include the uniquely interesting systems Q2343-BX415 (Rix et al. 2007) and Q2343-BX418 (Erb et al. 2010).] Each pointing had a total integration time of 8100 seconds composed of nine 900 second exposures dithered using a custom nine-point sub-pixel offset pattern designed to uniformly sample the PSF.

The data were reduced using the MultiDrizzle (Koekemoer et al. 2002) software package to clean, sky subtract, distortion correct, and combine the individual frames. The raw WFC3 frames are undersampled with a pixel scale of 0.128 arcsec pixel$^{-1}$; these frames were drizzled to a pixel scale of 0.08 arcsec pixel$^{-1}$ using a pixel droplet fraction (pixfrac) of 0.7. This combination of parameters was found to give the cleanest, narrowest point-spread function (PSF) while ensuring that the RMS variation of the final weight map was less than % across the arcsec field of view. Using nine isolated and unsaturated stars in the Q1623+26 field we estimate that the FWHM of the PSF is arcsec (i.e., Nyquist sampled by the 0.08 arcsec drizzled pixels), varying by less than 4% across the detector and from field-to-field.

### 2.2. The Galaxy Sample

Our fourteen individual pointings are located within ten survey fields centered on lines of sight to bright background QSOs (). In the present contribution, we focus on actively star forming galaxies drawn from rest-UV color-selected catalogs of star forming galaxy candidates constructed according to the methods described by Steidel et al. (2003, 2004) and Adelberger et al. (2004). These catalogs are based on deep ground-based imaging and therefore select galaxies with independent of morphology or surface brightness (since even the largest galaxies are nearly unresolved in these seeing-limited images). Extensive ancillary information is available in these survey fields. In addition to deep ground-based optical imaging and rest-UV spectroscopy, many of the fields also have deep ground-based imaging, Spitzer IRAC/MIPS photometry, and for Q1549+19/Q1700+64 respectively spatially resolved HST/WFC3-UVIS and HST/ACS rest-UV imaging. All galaxy candidates in these catalogs are detected with WFC3 at down to AB.

Rather than relying on photometric redshifts, which typically have large uncertainties ( at ; van Dokkum et al. 2009), we restrict our attention to the subsample of galaxies with that have been spectroscopically confirmed using Keck/LRIS rest-UV spectra to lie in the redshift range ; i.e., the “BM” (), “BX” (), and “LBG” or -dropout () samples defined by Steidel et al. (2003, 2004). Systemic redshifts for the majority of our galaxies were derived from rest-UV absorption/emission line centroids using the prescriptions of Steidel et al. (2010); for 51 galaxies that have been successfully observed to date with long-slit (Erb et al. 2006b) and/or IFU spectroscopy (13 galaxies, Förster Schreiber et al. 2009; Law et al. 2009; Wright et al. 2009), systemic redshifts were derived from rest-optical nebular emission lines (e.g., Hα, [O iii]). [Footnote 2: Nebular emission line redshifts are better indicators of the systemic redshift than UV interstellar features at the km s$^{-1}$ level; see discussion by Steidel et al. (2010).]

Additionally, we omit from our sample any galaxies that lie within 1.5 arcsec of the edge of the WFC3-IR detector (where our dither coverage is incomplete), or which are known to contain AGN on the basis of rest-UV spectroscopy (24 systems; 12 bright QSOs with AB, and 12 faint AGN with AB). We discuss the morphological properties of these AGN in detail in a forthcoming contribution (Law et al. 2012, in preparation). The redshift and F160W magnitude distribution of the final sample of 306 galaxies are shown in Figure 1. As detailed in Table 1 the galaxies are roughly evenly distributed amongst the 10 fields (with additional pointings in Q1623+26 and Q2343+12). Motivated by the redshift ranges of the photometric selection criteria we loosely divide our galaxies into the three redshift ranges , , and , containing 72/127/107 galaxies respectively. Although we include galaxies up to in our analysis we note that the galaxy sample is very sparse for , as shown in Figure 1.

### 2.3. Initial Segmentation Map

Reduced HST/WFC3 images were registered to the same world coordinate system (WCS) as our deep ground-based optical/near-IR data using stars per pointing. Source Extractor (Bertin & Arnouts 1996) was then used to perform automated object detection (with no smoothing kernel) and produce an initial segmentation map in which each source is assigned a unique identifier. We set the source detection threshold to with a required minimum of 10 pixels above threshold for analysis, 32 deblending thresholds, and a minimum deblending contrast of 1%. [Footnote 3: Adopting reasonable alternative values for the smoothing kernel and deblending thresholds makes an imperceptible difference to our derived morphological statistics.] We adopt an RMS map proportional to the inverse square root of the weight map produced by MultiDrizzle, scaling by a correction factor (see discussion by Casertano et al. 2000) to account for the fact that the MultiDrizzle process introduces correlation in the pixel-to-pixel noise.
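As a minimal sketch of the RMS-map construction described above, the following Python snippet converts a weight map into an RMS map scaled by a constant factor. The function name `rms_from_weight` and the `correction` argument are hypothetical; the value of the correlated-noise correction factor actually adopted in this survey is not reproduced here and must be supplied by the user.

```python
import math


def rms_from_weight(weight_map, correction=1.0):
    """Return an RMS map proportional to 1/sqrt(weight).

    `weight_map` is a list of rows of per-pixel weights (e.g. from a
    drizzled weight image). `correction` is an assumed multiplicative
    factor for correlated noise; its value is not taken from the paper.
    Pixels with non-positive weight carry no information and are
    assigned infinite RMS.
    """
    return [
        [correction / math.sqrt(w) if w > 0 else math.inf for w in row]
        for row in weight_map
    ]


# Illustrative 2x2 weight map with one empty (zero-weight) pixel.
example_rms = rms_from_weight([[4.0, 0.0], [1.0, 16.0]], correction=2.0)
```

Assigning infinite RMS to zero-weight pixels is one possible convention for flagging regions without coverage; other pipelines may mask them instead.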

The initial segmentation map was manually inspected for each galaxy in our sample to ensure both that no spurious pixels were assigned to the galaxies and that each galaxy was not artificially broken into multiple objects. Since galaxies in the redshift range are well-known to be clumpy (e.g., Conselice et al. 2005; Law et al. 2007b; and references therein), this latter goal is non-trivial and Source Extractor frequently classifies multi-component galaxies as separate sources (see, e.g., Colley et al. 1996). While some neighboring clumps are likely to be physically associated with each other (if, for instance, they are embedded in a common envelope of low surface brightness emission), it is not always obvious which clumps are part of the target source and which are unassociated low- or high-redshift interlopers along the line of sight. Generally, we assume that all clumps that lie within a 1.5 arcsec ( kpc at ) radius about the -band centroid (i.e., the original detection image) are physically associated with a given galaxy unless there is evidence to the contrary (e.g., different spectroscopic redshifts, or dramatically different colors), and combine them under a single identifier. We discuss the validity of this method with respect to the incidence of genuine vs apparent pairs in §6.1.2.

In Figure 2, we present postage-stamp images of the galaxy sample (all pixels identified with sources other than the target galaxy sample have been cosmetically masked out by Gaussian random noise matched to the noise characteristics of the background sky). As expected on the basis of previous rest-UV morphological studies there is considerable diversity among the morphologies, which range from compact isolated sources to multi-component systems with extended regions of diffuse emission. While this initial segmentation map is adequate for estimating total source magnitudes and constructing postage-stamp images, it is inadequate for calculating quantitative morphologies; we discuss construction of second-pass segmentation maps in §3.8.

### 2.4. Photometry

We photometrically calibrated our data using the zeropoint magnitude of 25.96 AB given for the F160W filter in the HST/WFC3 data handbook. Masking all pixels identified with luminous sources using Source Extractor (Bertin & Arnouts 1996), we use a clipped mean to estimate that our drizzled images typically reach a limiting depth of 27.9 AB for a detection within a 0.2 arcsec radius aperture, or surface brightness sensitivity of 25.1 AB arcsec$^{-2}$. [Footnote 4: For comparison, the HUDF09 program (GO 11563; Bouwens et al. 2010) covered an area of 4.7 arcmin$^2$ to a depth of 28.8 AB, the GOODS-NICMOS survey (GNS; Conselice et al. 2011a) covered 45 arcmin$^2$ to a depth of 26.8 AB, and the ERS/GOODS-S program (GO 11359; Windhorst et al. 2011) covered an area of arcmin$^2$ to a depth of 27.2 AB.]

Initial estimates of the F160W magnitudes of the galaxies are obtained from the Source Extractor corrected isophotal magnitudes (MAG_ISOCOR), which are consistent to within 0.04 mag with estimates obtained from matched-aperture photometry from images smoothed to the angular resolution of the ground-based -band survey images for well-defined, isolated sources. We perform Monte Carlo tests of the statistical uncertainty and photometric biases in these magnitudes by inserting 1000 artificial galaxy models with known total magnitudes into randomly selected blank-field regions of the images and calculating the accuracy with which their magnitudes are recovered using Source Extractor. The galaxy models are constructed using GALFIT (see §3.2) to model the light profiles of five real galaxies in the Q1700+64 field that span a wide range of effective radii and Sersic index. These tests are performed for 0.5 magnitude bins spanning the range AB of the galaxy sample, and suggest (Table 2) that MAG_ISOCOR systematically underestimates the brightness of objects by mag. After correcting for this systematic offset, we find that the magnitudes of 10 isolated, bright ( AB), unsaturated stars in our target fields all agree with values published in the 2MASS point source catalog (Skrutskie et al. 2006) to within the photometric uncertainty of the catalog.

### 2.5. SED Fitting

Stellar masses, ages, and star formation rates were calculated by fitting the broadband spectral energy distribution (SED) of the galaxies with stellar population synthesis models using a customized IDL code (Reddy et al. 2012, in preparation). In addition to the HST/F160W and ground-based photometry many galaxy models also incorporate -band data, and in some cases Spitzer IRAC photometry. The SED fitting process is described in detail by Shapley et al. (2001, 2005), Erb et al. (2006c), and Reddy et al. (2006, 2010); in brief, we use Charlot & Bruzual (2011, in preparation) population synthesis models, a Chabrier (2003) initial mass function (IMF), and a constant () star formation history.

Although the statistical uncertainty of the magnitudes is small (see Table 2), the true uncertainty in the continuum magnitudes is significantly larger due to the uncertain contribution from nebular line emission that falls within the F160W bandpass ( Å). In order to ensure that the magnitudes do not unduly influence the SED fit with their small formal uncertainties (see also discussion by McLure et al. 2011) we attempt to quantify the additional uncertainty due to nebular emission in a physically motivated manner by bootstrapping approximate line fluxes from broadband scaling laws and typical nebular line ratios. We use the ground-based magnitudes to estimate the rest-frame monochromatic luminosity at 1500 Å, and convert this to a UV star formation rate using the Kennicutt (1998) relation. This UV SFR is corrected for extinction by estimating the UV slope from the photometry, and converting to an estimated extinction using the Meurer et al. (1999) relation in combination with a Calzetti et al. (2000) attenuation law (motivated by comparison with direct indicators of the dust emission at 24 μm). We then assume that the extinction-corrected UV SFR is equal to the Hα SFR (see discussion by Erb et al. 2006b), and use the Kennicutt (1998) relation to estimate the corresponding Hα nebular emission line flux. Based on standard atomic physics and the observations of Maiolino et al. (2008) and Erb et al. (2006a), we assume that the other strong rest-optical nebular emission lines have typical flux ratios given by: , , , , . All of these estimated emission line fluxes are converted to observed values using the extinction coefficients described above.
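The chain from UV luminosity to an intrinsic Hα luminosity can be sketched as follows. This is an illustration only: the coefficients are the commonly quoted Kennicutt (1998) and Meurer et al. (1999) values (which assume a Salpeter IMF), the function names are hypothetical, and the exact coefficients, IMF conversion, and line-ratio scalings used in this work are not reproduced. The final conversion of intrinsic line fluxes to observed values with a Calzetti et al. (2000) curve is omitted.

```python
def uv_sfr(l_nu_1500):
    """UV star formation rate (Msun/yr) from L_nu at 1500 A (erg/s/Hz).

    Uses the commonly quoted Kennicutt (1998) coefficient (assumption).
    """
    return 1.4e-28 * l_nu_1500


def meurer_a1600(beta):
    """UV attenuation in magnitudes from the UV slope beta.

    Uses the commonly quoted Meurer et al. (1999) relation
    A_1600 = 4.43 + 1.99 * beta, floored at zero (assumption).
    """
    return max(0.0, 4.43 + 1.99 * beta)


def halpha_luminosity_from_sfr(sfr):
    """Intrinsic H-alpha luminosity (erg/s) for a given SFR (Msun/yr).

    Inverts the commonly quoted Kennicutt (1998) H-alpha relation.
    """
    return sfr / 7.9e-42


def estimate_halpha_luminosity(l_nu_1500, beta):
    """Assume the extinction-corrected UV SFR equals the H-alpha SFR."""
    sfr_corrected = uv_sfr(l_nu_1500) * 10.0 ** (0.4 * meurer_a1600(beta))
    return halpha_luminosity_from_sfr(sfr_corrected)


# Illustrative inputs only; not values from the survey.
l_halpha = estimate_halpha_luminosity(1.0e29, -1.5)
```

Because both conversions are inverted with the same IMF assumption, the implied ratio of Hα to UV luminosity does not depend on the IMF normalization.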

The combined flux of emission lines that fall within the F160W bandpass at the redshift of each galaxy is added to the photometric bias corrected magnitude to obtain an estimate of the continuum magnitude . There are significant uncertainties associated with almost every step of our estimate of the nebular line-emission correction described above, not least of which is the strong variation in line flux ratios (e.g., ) with metallicity. We therefore conservatively estimate the uncertainty in the continuum magnitudes as . Typical values of this uncertainty are generally in the range mag, but values as high as 0.5 mag can occur in 10% of cases (and 1.0 mag in 1% of cases). Due to the downweighting of the WFC3 data point when , derived stellar masses in such cases differ by only 1% on average from stellar masses derived by omitting the WFC3 data point from the SED fit entirely.

## 3. Defining the Morphological Statistics

Many efforts have been made to quantify the morphologies of predominantly-irregular high redshift galaxies by using a combination of qualitative visual analyses, parametric Sersic model fits, and non-parametric numerical statistics (e.g., ‘CAS’; Conselice 2003). Here we explore all of these methods and discuss the physical inferences that can be gleaned from each.

We describe below our methods for visual classification, Sersic profile fitting, and calculating the Gini coefficient $G$, the second order moment of the light distribution $M_{20}$, the concentration $C$, asymmetry $A$, and multiplicity statistics. [Footnote 5: We do not calculate the smoothness parameter (i.e., the ‘S’ in ‘CAS’) because it is not robustly defined for galaxies as small and poorly resolved as those at (see discussion by Lotz et al. 2004).] In our discussion of the non-parametric numerical statistics we define $f_i$ as the fluxes of the individual pixels in the segmentation map (see §3.8) with physical location $(x_i, y_i)$, where $i$ ranges from 1 to $N$.

### 3.1. Visual Classification

Our first morphological classification groups galaxies visually based on the apparent nucleation of their light profiles and the number of distinct components. As illustrated in Figure 3, we group galaxies from Figure 2 into three general classes:

Type I

Single, nucleated source with no evidence for multiple luminous components or extended low surface brightness features. 127 galaxies in our sample.

Type II

Two or more distinct nucleated sources of comparable magnitude, with little to no evidence for extended low surface brightness features. 56 galaxies in our sample.

Type III

Highly irregular objects with evidence of non-axisymmetric, extended, low surface-brightness features. 123 galaxies in our sample.

Type I galaxies appear consistent with being regular and isolated systems, while Type II galaxies may represent either early-stage mergers between two such formerly isolated systems or intrinsically clumpy systems with little continuum emission between the clumps. Type III galaxies in contrast may represent later-stage mergers with bright tidally induced disturbances, or clumpy concentrations within a single extended system (e.g., Bournaud et al. 2007). Of course, there is significant overlap between the three classes, and degeneracy in the classes to which a given galaxy may be assigned. Galaxies with identical luminosity profile but different surface brightness may, for instance, be assigned to either Type I or Type III depending on whether the low surface brightness features are above or below the limiting surface brightness of the data, and the division between Types II and III is similarly unclear. The goal of these visual classifications, however, is not to provide decisive quantitative divisions but simply to offer a reference point for describing the general qualitative appearance of galaxies throughout the following discussion.

### 3.2. Sersic Profiles

In the local universe the surface brightness profiles of galaxies can often be well fit by Sersic (1963) models over a large dynamic range in luminosities (e.g., Kormendy et al. 2009). While regular ellipsoidal models are clearly an incomplete description of the irregular galaxy morphologies illustrated in Figure 2, such models nonetheless provide a useful description of the characteristic sizes and surface brightness profiles of the major individual clumps. We therefore use GALFIT 3.0 (Peng et al. 2002, 2010) to fit the galaxy sample with two-dimensional Sersic profiles described by the functional form

$$\Sigma(r) = \Sigma_e \exp\left[-\kappa\left(\left(\frac{r}{r_{1/2}}\right)^{1/n} - 1\right)\right] \tag{1}$$

convolved with the observational PSF. These models are characterized by the effective half-light radius $r_{1/2}$ and the radial index $n$ of the profile. GALFIT actually calculates the effective half-light radius along the semi-major axis; following a common practice in the literature (e.g., Shen et al. 2003; Trujillo et al. 2007; Toft et al. 2009) we convert this to a circularized effective radius by multiplying by $\sqrt{b/a}$, where $b/a$ is the minor/major axis ratio. As described in Peng et al. (2002, see their Figure 1), two of the most commonly observed values of the radial index in the nearby universe are $n = 1$, which corresponds to the exponential disk profile, and $n = 4$, which corresponds to a classical de Vaucouleurs profile with steep central core and relatively flat outer wings typical of elliptical galaxies and galactic bulges.
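For illustration, the Sersic profile of Eqn. 1 and the standard circularization of the semi-major axis radius can be sketched in a few lines of Python. This is not the GALFIT implementation: the function names are hypothetical, no PSF convolution is performed, and $\kappa$ is set by a widely used linear approximation (roughly valid for $0.5 < n < 10$) rather than solved for exactly.

```python
import math


def sersic_kappa(n):
    """Approximate kappa such that r_half encloses half of the total light.

    Linear approximation (assumed here, roughly valid for 0.5 < n < 10);
    an exact value would require inverting an incomplete gamma function.
    """
    return 1.9992 * n - 0.3271


def sersic_profile(r, sigma_e, r_half, n):
    """Surface brightness at radius r for a Sersic profile (Eqn. 1)."""
    kappa = sersic_kappa(n)
    return sigma_e * math.exp(-kappa * ((r / r_half) ** (1.0 / n) - 1.0))


def circularized_radius(r_semi_major, axis_ratio):
    """Circularized radius: semi-major axis radius times sqrt(b/a)."""
    return r_semi_major * math.sqrt(axis_ratio)


# Illustrative comparison of an exponential (n=1) and a de Vaucouleurs
# (n=4) profile at twice the half-light radius, in units of sigma_e.
exponential_outer = sersic_profile(2.0, 1.0, 1.0, 1.0)
de_vaucouleurs_outer = sersic_profile(2.0, 1.0, 1.0, 4.0)
```

By construction the profile equals `sigma_e` at `r = r_half` for any index, which provides a quick sanity check on any implementation.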

We use a median-combined stack of isolated, bright ( AB), unsaturated stars from across our WFC3 imaging fields to define the PSF model. While the structure of the PSF varies slightly across a given field, and from field-to-field with the HST-WFC3 roll angle, we find the details of our PSF model have little effect on the derived physical properties of our faint and extended galaxies (see also discussion by Szomoru et al. 2010). Since GALFIT convolves physical models with the observational PSF it is able to determine effective radii down to extremely small spatial scales. Following the method described by Toft et al. (2007), we use a variety of stellar point sources as PSF models to fit Sersic models to 11 stars in our WFC3 imaging fields, finding that the mean estimated size of known point sources is pixels. We therefore adopt a limit for unresolved point sources of pixels, or 0.073 arcsec, corresponding to 0.62 kpc at redshift .

Our procedure for fitting Sersic models to individual galaxies is as follows. We used the Source Extractor segmentation map to mask out all objects not associated with the target galaxies, replacing these pixels with Gaussian random noise matched to the noise characteristics of the image. We then cut out a arcsec region surrounding each galaxy and subtracted from it a ‘local sky’ estimated from the median of pixels excluded from the segmentation map. GALFIT is then used to fit the minimum number of axisymmetric (we do not introduce bending or Fourier modes) components required to satisfactorily reproduce the observed light distribution. For the majority of galaxies shown in Figure 2 we use a single component, unless there are clearly multiple spatially distinct clumps or significant asymmetry in the light distribution. All GALFIT models were inspected by two of us (DRL & SRN) in order to verify that a consistent approach was taken throughout the galaxy sample.

Unlike the non-parametric morphological statistics (which represent an integrated quantity over the entire light distribution of a galaxy), the effective radius and Sersic index can formally be multi-valued for galaxies fit by multiple Sersic components (i.e., Type II and some Type III galaxies). We adopt the convention of describing such multi-component galaxies by the effective radius and Sersic index of the brightest individual component; as we discuss in §5.4, this assumption does not significantly bias our conclusions. For the few cases for which a reasonable model cannot be obtained with Sersic components (e.g., Q1009-MD28, which is in close physical proximity to the bright Q1009 QSO and turns out to be a Lyα blob based on recent narrowband imaging) we consider the effective radius and Sersic index to be undefined.

### 3.3. Gini coefficient G

The Gini coefficient ($G$; Gini 1912) was introduced into the astronomical literature by Abraham et al. (2003) and further developed by Lotz et al. (2004). $G$ measures the cumulative flux distribution of a “population” of pixels and is insensitive to the actual spatial distribution of the individual pixels.

Formally $G$ is defined (Glasser 1962) in the range $[0, 1]$ as

$$G = \frac{1}{\bar{f} N (N-1)} \sum_{i=1}^{N} (2i - N - 1) f_i \tag{2}$$

where $\bar{f}$ is the average flux and the pixel fluxes $f_i$ are sorted in increasing order before the summation over all $N$ pixels in the segmentation map. High values of $G$ represent the majority of the total flux being concentrated in a small number of pixels, while low values represent a more uniform distribution of flux.
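Eqn. 2 translates directly into code. The sketch below (with a hypothetical function name `gini`) takes a flat list of pixel fluxes from a segmentation map; it follows the equation as written and does not apply any absolute-value or noise corrections that other implementations may include.

```python
def gini(fluxes):
    """Gini coefficient of a list of pixel fluxes, following Eqn. 2."""
    sorted_fluxes = sorted(fluxes)  # increasing order, as Eqn. 2 requires
    n = len(sorted_fluxes)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean_flux = sum(sorted_fluxes) / n
    weighted_sum = sum(
        (2 * i - n - 1) * f for i, f in enumerate(sorted_fluxes, start=1)
    )
    return weighted_sum / (mean_flux * n * (n - 1))


# Uniform flux gives G = 0; all flux in one pixel gives G = 1.
g_uniform = gini([1.0, 1.0, 1.0, 1.0])
g_single = gini([0.0, 0.0, 0.0, 1.0])
```

The two limiting cases illustrate the interpretation given above: flux spread evenly over the pixels drives $G$ toward 0, while flux concentrated in a single pixel drives it to 1.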

### 3.4. Second order moment M20

The spatial distribution of the light may be quantified via the second order moment of the light distribution, $M_{20}$, introduced in this context by Lotz et al. (2004). $M_{20}$ is defined as the second order moment of the brightest pixels that constitute 20% of the total flux in the segmentation map, normalized by the second order moment of all of the pixels in the segmentation map. Mathematically,

$$M_{20} = \log\left(\frac{\sum_i M_i}{M_{\rm tot}}\right), \quad \text{while} \quad \sum_i f_i < 0.2 f_{\rm tot} \tag{3}$$

where

$$M_{\rm tot} = \sum_i^N M_i = \sum_i^N f_i \left[(x_i - x_c)^2 + (y_i - y_c)^2\right] \tag{4}$$

Following Lotz et al. (2004, 2006) we adopt the position that minimizes $M_{\rm tot}$ as the center ($x_c, y_c$) of the light distribution.

Typical values of $M_{20}$ range from (most irregular, often with multiple clumps) to (most regular).
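A minimal sketch of Eqns. 3 and 4 follows. It relies on the fact that, for positive total flux, the flux-weighted centroid is the position that minimizes $M_{\rm tot}$, so no numerical minimization is needed. The function name `m20`, the use of a base-10 logarithm, and the treatment of the 20% boundary (pixels are added only while the running flux stays below $0.2 f_{\rm tot}$) are assumptions of this sketch.

```python
import math


def m20(pixels):
    """Second order moment statistic for pixels given as (x, y, flux)."""
    f_tot = sum(f for _, _, f in pixels)
    # The flux-weighted centroid minimizes M_tot (Eqn. 4).
    xc = sum(x * f for x, _, f in pixels) / f_tot
    yc = sum(y * f for _, y, f in pixels) / f_tot
    # Pair each flux with its second order moment, brightest pixel first.
    ranked = sorted(
        ((f, f * ((x - xc) ** 2 + (y - yc) ** 2)) for x, y, f in pixels),
        key=lambda pair: pair[0],
        reverse=True,
    )
    m_tot = sum(moment for _, moment in ranked)
    flux_sum = 0.0
    m_bright = 0.0
    for flux, moment in ranked:
        if flux_sum + flux >= 0.2 * f_tot:  # enforce sum f_i < 0.2 f_tot
            break
        flux_sum += flux
        m_bright += moment
    if m_bright <= 0.0 or m_tot <= 0.0:
        raise ValueError("M20 is undefined for this pixel set")
    return math.log10(m_bright / m_tot)
```

For very small segmentation maps a single bright pixel can exceed 20% of the total flux, in which case this sketch raises an error rather than returning a value.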

### 3.5. Concentration C

The concentration index $C$ (Kent 1985; Abraham et al. 1994; Bershady et al. 2000; Conselice 2003) measures the concentration of flux about a central point in the galaxy. While slightly different versions have been introduced by various authors, we adopt the ‘’ standard:

$$C = 5 \log\left(\frac{r_{80}}{r_{20}}\right) \tag{5}$$

where $r_{20}$ and $r_{80}$ are the circular radii containing 20% and 80%, respectively, of the total galaxy flux within the segmentation map. Following Conselice et al. (2008), we adopt the flux-weighted centroid of the segmentation map as the center for the concentration calculation. While in many cases this corresponds naturally to a peak in the flux distribution, it is not necessarily the case for extremely irregular galaxies without well-defined central flux concentrations.

Typical concentration values range from (least compact) to (most compact). We note, however, that galaxies with two or more clumps (e.g., Type II galaxies) that are each individually compact are not generally compact in a global sense.
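A rough sketch of Eq. 5 follows, using the flux-weighted centroid as the center. The discrete curve of growth (no sub-pixel interpolation), the function name `concentration`, and the `(x, y, flux)` input layout are simplifying assumptions made for illustration.

```python
import math

def concentration(pixels):
    """Concentration C (Eq. 5) from a list of (x, y, flux) tuples."""
    f_tot = sum(f for _, _, f in pixels)
    xc = sum(x * f for x, _, f in pixels) / f_tot
    yc = sum(y * f for _, y, f in pixels) / f_tot
    # Sort pixels by distance from the flux-weighted centroid.
    by_radius = sorted((math.hypot(x - xc, y - yc), f) for x, y, f in pixels)

    def radius_enclosing(fraction):
        # Smallest pixel radius at which the enclosed flux reaches the target.
        running = 0.0
        for r, f in by_radius:
            running += f
            if running >= fraction * f_tot:
                return r
        return by_radius[-1][0]

    r20 = radius_enclosing(0.2)
    r80 = radius_enclosing(0.8)
    if r20 <= 0.0:
        raise ValueError("inner 20% of the flux is unresolved; C is undefined")
    return 5.0 * math.log10(r80 / r20)
```

The explicit error for an unresolved inner radius mirrors the practical difficulty of measuring $r_{20}$ for very small sources.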

### 3.6. Asymmetry A

The asymmetry $A$ (Schade et al. 1995; Conselice et al. 2000) quantifies the rotational asymmetry of a galaxy. [Footnote 6: We note that Schade et al. (1995) and Conselice et al. (2000) included a factor of two in the denominator of Eqn. 6, while more recent work by Lotz et al. (2004) and Conselice et al. (2008) does not. We follow the convention of the more recent literature by neglecting this factor.] Mathematically, $A$ is calculated by differencing the original galaxy image with a copy rotated by 180°:

$$A = \min\left(\frac{\sum |f_{0,i} - f_{180,i}|}{\sum |f_{0,i}|}\right) - \min\left(\frac{\sum |B_{0,i} - B_{180,i}|}{\sum |f_{0,i}|}\right) \tag{6}$$

where $f_{0,i}$ represents flux in the original image pixels and $f_{180,i}$ flux in the rotated image pixels. Following Conselice et al. (2000, 2008) we determine the rotation center iteratively by allowing it to walk about an adaptively spaced grid with 0.1 pixel resolution until converging on the point that minimizes $A$. The $B_{0,i}$ and $B_{180,i}$ terms represent fluxes in nearby background pixels to which we have applied an identical segmentation map, and are included to subtract the contribution of noise to the total galaxy asymmetry. As discussed by Conselice et al. (2008), the background sum is minimized similarly to the original image.

Typical values of $A$ range from 0 for the most symmetric galaxies to 1 for galaxies with the strongest rotational asymmetry.
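The core of Eq. 6 can be illustrated with a simplified sketch that rotates a small image by 180° about the array center. It deliberately omits the iterative search for the rotation center that minimizes the asymmetry, so it evaluates only a single trial center. The function names and the list-of-lists image format are assumptions for illustration.

```python
def rotate180(image):
    """Rotate a 2D list-of-lists image by 180 degrees about its center."""
    return [row[::-1] for row in image[::-1]]

def abs_difference_sum(image):
    """Sum of |f_0 - f_180| over all pixels."""
    rotated = rotate180(image)
    return sum(abs(v - w)
               for row, rot_row in zip(image, rotated)
               for v, w in zip(row, rot_row))

def asymmetry(image, background=None):
    """Asymmetry for one fixed rotation center (no minimization step)."""
    norm = sum(abs(v) for row in image for v in row)
    a = abs_difference_sum(image) / norm
    if background is not None:
        # Subtract the noise contribution measured on a blank-sky cutout
        # of the same shape, normalized by the galaxy flux as in Eq. 6.
        a -= abs_difference_sum(background) / norm
    return a
```

A full implementation would wrap `asymmetry` in a search over sub-pixel centers, which requires image interpolation and is beyond this sketch.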

### 3.7. Multiplicity Ψ

The multiplicity coefficient $\Psi$, introduced by Law et al. (2007b), calculates the effective “potential energy” of the light distribution

$$\psi_{\rm actual} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{f_i f_j}{r_{ij}} \tag{7}$$

where $f_i$ and $f_j$ are the fluxes in pixels $i$ and $j$ respectively, $r_{ij}$ is the separation between pixels $i$ and $j$, and where the sum runs over all of the pixel pairs. [Footnote 7: Note that $\psi_{\rm actual}$ and $\psi_{\rm compact}$ were defined incorrectly in Law et al. (2007b), with the sum double-counting each pixel pair. These factors of 2 would, however, cancel out upon constructing the final statistic from the ratio of the two.] This is compared to the most compact possible rearrangement of pixel fluxes, which by analogy with a gravitational system would require the most “work” to pull apart. This compact map is constructed by rearranging the positions of all galaxy pixels so that the brightest pixel is located in the center of the distribution, and the surrounding pixel fluxes decrease monotonically with increasing radius. Calling $r'_{ij}$ the distance between pixels $i$ and $j$ in this compact map,

$$\psi_{\rm compact} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{f_i f_j}{r'_{ij}} \tag{8}$$

The multiplicity coefficient measures the degree to which the actual distribution of pixel fluxes differs from the most compact possible arrangement, i.e.

$$\Psi = 100 \times \log_{10}\left(\frac{\psi_{\rm compact}}{\psi_{\rm actual}}\right) \tag{9}$$

As discussed by Law et al. (2007b), values for can range from 0 (i.e., for which the galaxy pixels are already in the most compact possible arrangement) to for extremely irregular sources. Generally, we find that isolated, regular galaxies in our sample may be described by , galaxies with some morphological irregularities by , and galaxies with strong morphological irregularities or multiple components by .
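The construction in Eqs. 7 to 9 can be sketched as follows. The way the compact map is built here (filling integer lattice sites in order of increasing distance from the origin) is one plausible reading of the description and should be treated as an assumption, as should the function names and the `(x, y, flux)` input layout.

```python
import math

def psi(pixels):
    """Pairwise 'potential energy' sum over all unique pixel pairs."""
    total = 0.0
    for i in range(len(pixels)):
        xi, yi, fi = pixels[i]
        for j in range(i + 1, len(pixels)):
            xj, yj, fj = pixels[j]
            total += fi * fj / math.hypot(xi - xj, yi - yj)
    return total

def compact_map(pixels):
    """Rearrange fluxes so they decrease monotonically with radius."""
    n = len(pixels)
    half = math.isqrt(n) + 1  # lattice half-width large enough for n sites
    sites = [(x, y) for x in range(-half, half + 1)
             for y in range(-half, half + 1)]
    # Order lattice sites by distance from the origin (ties broken by position).
    sites.sort(key=lambda p: (p[0] ** 2 + p[1] ** 2, p))
    fluxes = sorted((f for _, _, f in pixels), reverse=True)
    return [(x, y, f) for (x, y), f in zip(sites[:n], fluxes)]

def multiplicity(pixels):
    """Multiplicity coefficient of Eq. 9."""
    return 100.0 * math.log10(psi(compact_map(pixels)) / psi(pixels))
```

The double loop scales as the square of the number of pixels, which is acceptable for the small segmentation maps of compact sources but would need vectorizing for large ones.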

### 3.8. Detailed Segmentation Maps

The preliminary segmentation maps constructed in §2.3 above assign pixels to a given galaxy based on a constant surface brightness threshold tied to the noise characteristics of the WFC3 data. While such a segmentation map is sufficient for estimating total source magnitudes, it is inadequate for calculating quantitative morphologies using the nonparametric statistics defined in §3.3-3.7 since surface-brightness based pixel selection produces results that vary with total source luminosity, redshift, and limiting survey magnitude. Multiple methods have been adopted in the literature for defining robust segmentation maps; in the Appendix we discuss four such methods (Conselice et al. 2000; Lotz et al. 2004; Abraham et al. 2007; Law et al. 2007b) and calculate values for $G$, $M_{20}$, $C$, $A$, and $\Psi$ in each.

In part, this Appendix is provided so that our results can be directly translated to the reader's preferred choice of segmentation map, but it is also instructive to consider how the calculated values of the morphological parameters depend upon this choice. While we find that the values of $G$, $M_{20}$, $C$, $A$, and $\Psi$ are well-correlated between different segmentation maps, there can be significant systematic offsets in dynamic range (particularly for $G$; see also Lisker 2008) between the systems. We discuss the implications of such offsets in Section 6 below.

Throughout the following analysis, we choose to calculate our baseline morphologies using the Abraham et al. (2007) quasi-Petrosian method with isophotal threshold as this method is arguably most well suited to the irregular morphologies of our target galaxies. Using the transformation relations presented in the Appendix however, we convert our values to the Lotz et al. (2004, 2006) systematic reference frame in order to compare to both recent observational results and numerical simulations (e.g., Lotz et al. 2010ab).

### 3.9. Robustness

The robustness of our morphological indices has been discussed in the literature many times before (e.g., Bershady et al. 2000; Lotz et al. 2006; Lisker 2008; Gray et al. 2009). Generally speaking, such work suggests that morphological statistics are relatively robust for large, bright galaxies but that they can become unreliable at faint magnitudes and for galaxies that are small with respect to the observational PSF. Most of these previous studies, however, are tailored to the analysis of deep HST/ACS imaging in public survey fields, and in order to understand the effects of systematic biases on our WFC3 imaging data (and on our specific galaxies) it is necessary to perform many robustness tests anew.

The details of our analysis exploring the robustness of each of the five quantitative morphological statistics, and of the Sersic effective radius and index, to total source magnitude, the size of the observational PSF, and our choice of pixel scale are presented in the Appendix. In brief, we find that:

1. The derived values of six of the seven indices are fairly robust for galaxies with magnitudes (roughly corresponding to total S/N ), but become less reliable at fainter magnitudes. The exception is the concentration parameter $C$, for which the small sizes of many of our galaxies cause the inner 20% flux isophote to be unresolved at all magnitudes and therefore to be unreliable (see also Bershady et al. 2000). We therefore omit $C$ from detailed discussion, and restrict our analyses in the following sections (except where indicated) to the subsample of galaxies with , resulting in a sample of 206 galaxies, 59/95/52 in the , , and redshift bins respectively. The physical implications of this self-imposed apparent magnitude limit, and of systematic variations with the observational PSF, are discussed in the relevant sections below.

2. Six of the seven indices (except ) are robust to our choice of a 0.08 arcsec pixel scale; our conclusions would be unchanged if we had drizzled our data to 0.06 arcsec or 0.1 arcsec pixels instead.

3. Given the small size of many of our galaxies, the nonparametric statistics $G$, $M_{20}$, $C$, $A$, and $\Psi$ can vary systematically with the observational PSF as morphological features become more or less well-resolved (see also discussion by Lotz et al. 2004, 2008b). In particular, these five statistics will have less dynamic range to their values than in high-resolution imaging as it becomes progressively more difficult to distinguish the galaxies from point sources. This complicates quantitative comparisons to data obtained at different wavelengths or local comparison samples, but is less significant for comparisons within the population. In contrast, the Sersic effective radius and index are relatively robust to the PSF because the modeling process convolves theoretical models with the observational PSF.

4. The uncertainty in each of the seven indices is calculated via Monte Carlo simulations placing GALFIT model galaxies atop different blank-field regions of the WFC3 footprint in order to compare different realizations of the noise statistics. This uncertainty varies as a function of both source magnitude and morphological type; averaged over these considerations, typical uncertainties are 3% in , 4% in , 11% in , 22% in , 21% in , 2% in , and 15% in .

## 4. Basic Morphological Characteristics

In Figure 4 we plot histograms of the five nonparametric morphological statistics ($G$, $M_{20}$, $C$, $A$, and $\Psi$) and the Sersic index $n$ divided according to redshift (we discuss the evolution of the characteristic effective radius in detail in §5). The typical star forming galaxy is best represented by a Sersic profile of index , , , , , and . Despite the range of rest wavelengths probed by the F160W filter across the redshift range of our sample ( Å), there is no evidence to suggest systematic variation with redshift across our sample (whether due to evolution or to a variable morphological k-correction). Applying a Kolmogorov-Smirnov (KS) test suggests that all six indices are consistent at % confidence with the null hypothesis that they are drawn from the same distribution at all redshifts.

In contrast, there is significant difference between many of the morphological indices when divided according to their apparent visual morphology (Figure 5), indicating the underlying correlation between visual and numerical classification techniques. In particular, we note the strong correlation between visual type and the ‘irregularity’ statistics , , and . Broadly speaking, Type I galaxies have , Type III galaxies have , and Type II galaxies have . While is the statistic most strongly correlated with visual estimates of irregularity, qualitatively similar results are apparent for both and . As expected from our definition of Type III galaxies, this galaxy sample also has significantly lower mean values of , and slightly shallower Sersic indices.

### 4.1. Composite Luminosity Profile

As illustrated by Figures 4 and 5, typical galaxies are best described by an Sersic profile. Indeed, only six galaxies have , of which four have estimated radii less than our resolution estimate, suggesting that these galaxies are simply too small to robustly determine their structure. Focusing our attention on the galaxies with we find a mean with standard deviation of 0.39, corresponding to flat inner regions intermediate between a Gaussian () and an exponential profile (), and a steeply declining profile at larger radii (similar to previous results by, e.g., Ravindranath et al. 2006; Conselice et al. 2011b; Förster Schreiber et al. 2011).

In order to investigate the faint extended structure of our star forming galaxy sample we create a composite stack of galaxies (irrespective of redshift and total magnitude). We cut out arcsec regions around each galaxy, align all of the flux-weighted image centroids using sub-pixel bilinear interpolation, and stack the individual images together using a σ-clipped mean algorithm. The resulting stack for our 127 galaxies of morphological type I (i.e., those galaxies whose morphologies are most regular and well-defined) reaches a limiting surface brightness of 27.8 AB arcsec$^{-2}$ (31.2 AB for a detection in a 0.2 arcsec radius aperture).
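The stacking step can be illustrated with a short sketch of a per-pixel clipped mean. It assumes the cutouts are already aligned (the sub-pixel interpolation is omitted), and the clipping threshold `n_sigma=3.0` and iteration limit are placeholder values rather than the settings used for the survey.

```python
import statistics

def sigma_clipped_mean(values, n_sigma=3.0, max_iter=5):
    """Mean after iteratively rejecting points beyond n_sigma deviations."""
    vals = list(values)
    for _ in range(max_iter):
        if len(vals) < 3:
            break
        mu = statistics.fmean(vals)
        sd = statistics.pstdev(vals)
        kept = [v for v in vals if abs(v - mu) <= n_sigma * sd]
        if len(kept) == len(vals):
            break  # nothing rejected, so the clipping has converged
        vals = kept
    return statistics.fmean(vals)

def stack(images, n_sigma=3.0):
    """Per-pixel clipped mean of equally sized, pre-aligned 2D cutouts."""
    n_rows, n_cols = len(images[0]), len(images[0][0])
    return [[sigma_clipped_mean([img[r][c] for img in images], n_sigma)
             for c in range(n_cols)]
            for r in range(n_rows)]
```

Clipping keeps a single bright neighbor or cosmic-ray residual in one cutout from biasing the faint outskirts of the composite.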

As illustrated in Figure 6 the stacked radial profile is well described by an Sersic model with effective radius kpc. The profile is a good match to the Sersic model out to at least 6 kpc () and deviates only moderately from the model out to the detection limit at kpc. As expected, including the Type III galaxies (which, by definition, are more extended) in the stack results in a slightly larger characteristic effective radius kpc but a similarly good match to an Sersic model.

We caution, however, that while Figure 6 confirms that models are a fairly good representation of star forming galaxies, the stacked profile does not account for variability in the size or orientation of its component galaxies. By effectively discarding information about the projected ellipticity the stack overestimates the mean effective radius of the sample by a factor . We discuss the characteristic sizes of the star forming galaxies in detail in §5.

### 4.2. Distribution of Axial Ratios

In Figure 7 we plot a histogram of $b/a$ for galaxies with , , and both major and minor axis lengths well resolved (a total sample of 164 galaxies). [Footnote 8: Since the K-S test indicates a greater than 50% likelihood of the null hypothesis that the , , and samples are drawn from the same distribution we simply combine these three subsamples. Statistically indistinguishable results are obtained if we exclude Type II galaxies from our analysis, or include galaxies with resolved major but unresolved minor axes.] The distribution is strongly peaked about with tails extending to both extremes and . [Footnote 9: We assess the reliability of our $b/a$ measurements using Monte Carlo simulations. Artificial galaxies with magnitude, radius, Sersic index, and position angle drawn at random from the observed distributions and $b/a$ uniformly distributed in the range are created using GALFIT and placed within our WFC3 fields. We find that the mean error for values of and for values of .] As we demonstrate below, such a distribution is strongly inconsistent with a population of thick exponential disks as is commonly assumed in the literature (e.g., Genzel et al. 2008) and much more consistent with a population of triaxial ellipsoids.

As discussed by Padilla & Strauss (2008, and references therein) for a large sample of local galaxies drawn from the Sloan Digital Sky Survey (SDSS), a population of spiral galaxies with random orientations defines a distribution in $b/a$ that is relatively flat above some minimum value corresponding to the edge-on thickness of the disks. Taking $i$ to be the inclination of such a disk to the line of sight (where $i = 0^\circ$ represents a disk viewed face-on), the observed axial ratio ($b/a$) of a flattened axisymmetric system is given by (see, e.g., Hubble 1926; Tully & Fisher 1977):

$$\cos^2 i = \frac{(b/a)^2 - r_0^2}{1 - r_0^2} \tag{10}$$

where $r_0$ is the intrinsic minor/major axis ratio for a perfectly edge-on system. In the thin-disk approximation $r_0 = 0$ and Equation 10 reduces to the familiar $\cos i = b/a$. In the local universe, typical values for $r_0$ range from for Sa-type to for Sd-type galaxies (Guthrie 1992; Ryden 2006), although variations can also occur with wavelength (e.g., Dalcanton & Bernstein 2002). At redshifts however, star forming galaxies are known to have significant vertical velocity dispersion (e.g., Law et al. 2009; Förster Schreiber et al. 2009), and analysis of five of the most disk-like objects (based on velocity maps derived from integral field spectroscopy) indicates that the median (Genzel et al. 2008). For more typical dispersion-dominated galaxies () $r_0$ might be expected to be even larger.

We perform Monte Carlo tests in which we artificially observe a sample of flattened axisymmetric disks from a random distribution of inclinations. [Footnote 10: Strictly, we observe a single model galaxy from random viewing angles in a spherical polar coordinate system, where the random viewing positions are distributed uniformly in the azimuthal coordinate and in the cosine of the polar coordinate, thereby uniformly covering the sky as seen from the perspective of the model galaxy.] Formally, we quantify the difference between the observational data and the model by the statistic

$$\chi^2 = \frac{1}{\nu} \sum_{i=1}^{B} \frac{(N_{\rm model} - N_{\rm obs})^2}{N_{\rm obs}} \tag{11}$$

where $N_{\rm obs}$ is the number of galaxies observed in each of our $B$ bins in $b/a$, $\nu$ is the number of degrees of freedom, and $N_{\rm model}$ is the number of galaxies expected in each bin according to the assumed model. We overplot the distribution of $b/a$ obtained using such flattened axisymmetric disk models on the observational data in Figure 7. Regardless of the value of $r_0$ adopted, it is not possible to satisfactorily explain the observed distribution of $b/a$; , 0.2, and 0.4 models have and 16.9 respectively.
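The Monte Carlo test for axisymmetric disks can be sketched with the standard library alone. Random orientations are drawn by sampling $\cos i$ uniformly, Eq. 10 is inverted to obtain the projected $b/a$, and the binned model is compared with observed counts using Eq. 11. The bin count, number of draws, function names, and the choice to skip empty observed bins are all assumptions made for this illustration; `nu` is supplied by the caller.

```python
import math
import random

def disk_axis_ratio(cos_i, r0):
    """Projected b/a of a disk with edge-on thickness r0 (Eq. 10 inverted)."""
    return math.sqrt(cos_i ** 2 * (1.0 - r0 ** 2) + r0 ** 2)

def simulate_disk_counts(r0, n_gal, n_bins=10, n_draws=100000, seed=0):
    """Expected number of galaxies per b/a bin for randomly oriented disks."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_draws):
        # Random orientations correspond to cos(i) uniform on [0, 1).
        q = disk_axis_ratio(rng.random(), r0)
        counts[min(int(q * n_bins), n_bins - 1)] += 1
    # Rescale the simulated histogram to the size of the observed sample.
    return [c * n_gal / n_draws for c in counts]

def reduced_chi2(n_model, n_obs, nu):
    """Eq. 11; bins with no observed galaxies are skipped (an assumption)."""
    total = sum((m - o) ** 2 / o for m, o in zip(n_model, n_obs) if o > 0)
    return total / nu
```

In the thin-disk limit the simulated histogram is flat, and for a thick disk it is empty below `r0`, which is the behavior the observed peaked distribution fails to match.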

In contrast, the peaked distribution of $b/a$ is exactly the form expected for a population of randomly oriented triaxial ellipsoids such as that found by van den Bergh (1988) for a sample of local irregular galaxies. We therefore repeat our Monte Carlo analysis assuming that the galaxies can be characterized as triaxial ellipsoids with axis lengths . Calculating the projected minor/major axis ratio of a triaxial ellipsoidal surface viewed in an arbitrary orientation is an interesting problem in its own right, and we discuss the details of this calculation in Appendix B. Since we are only interested in axial ratios rather than the absolute lengths we set and consider a grid of values in the range .

In Figure 8 we show a surface plot of as a function of and . The best agreement between model and observations clearly occurs for a well-defined region around (). The expected distribution of $b/a$ for () is shown in Figure 7. While this is clearly a better description of the observations than the axisymmetric disk model (particularly in the expected number of systems with ) it is still imperfect (; ), predicting no galaxies with and a large excess with . These remaining imperfections likely reflect the intrinsic range of morphologies within the galaxy sample: rather than every galaxy having an identical shape there is undoubtedly some range about these values. Permitting a more realistic distribution of axis ratios (i.e., picking the intermediate/major and minor/major axis ratios at random from Gaussian distributions with mean 0.7 and 0.3, and width 0.1 and 0.2 respectively), it is possible to reproduce the observed distribution of $b/a$ extremely well (solid red line in Figure 7; with ).

At present, it is meaningless to distinguish between minor/major axis ratios of 0.2 versus 0.3, or to state with certainty that a Gaussian distribution of intrinsic axis ratios is appropriate. Our fundamental conclusion, however, is that the majority of star forming galaxies are best represented by triaxial systems rather than geometrically thick disks (as previously discussed by Ravindranath et al. 2006 and Elmegreen et al. 2005) and it is worth asking what this means in a physical sense.

Given the overall similarity between the rest-UV and rest-optical morphology (§4.3) it may simply be that light from clumpy (and asymmetrically distributed) star forming regions within galaxies (e.g., Bournaud et al. 2009) dominates the emergent flux at both UV and optical wavelengths. Alternatively, since we derived for the brightest subcomponent of each galaxy we may be measuring the intrinsic shape distribution of individual giant star forming clumps. However, the peaked distribution of persists if we restrict our attention to the most regular single-component systems (i.e., Type I galaxies) suggesting that we are observing galaxy scale structures with characteristic radii kpc. Similarly, the distribution of persists for galaxies with stellar masses greater than in which stellar continuum emission should be well detected in the WFC3 imaging data, suggesting that the stellar mass distribution itself is strongly asymmetric.

Combining our morphological results with observations (e.g., Law et al. 2007a, 2009; Förster Schreiber et al. 2009) that typical star forming galaxies have large gas fractions, high velocity dispersions km s$^{-1}$, and velocity fields that are in many cases inconsistent with rotationally-supported disk models (especially at lower stellar masses; see discussion by Law et al. 2009), we suggest that the distribution of stars and gas in these rapidly star-forming galaxies may be inherently triaxial rather than residing largely in a geometrically thick disk. Such a distribution of gas would be gravitationally unstable, suggesting that the life cycle of star forming galaxies may be continually passing in and out of dynamical equilibrium (e.g., Ceverino et al. 2010). In such a scenario, gas disks may be only short-lived and continuously forming from recently accreted gas (whether acquired from mergers or hot/cold mode accretion; e.g., Dekel et al. 2009a, Kereš et al. 2009), rapidly becoming disrupted, and reforming again until the triaxial stellar component (perhaps a precursor of modern-day bulges) acquires sufficient mass to stabilize the growth of a long-lived and extended gas disk (e.g., Martig et al. 2010). We discuss additional observational support for such a scenario based on low-ionization gas phase kinematics in a companion paper (Law et al. 2012, in preparation).

We note that both our results and conclusions are qualitatively consistent with those of Ravindranath et al. (2006) [Footnote 11: More recently, see also Yuma et al. (2011).], who used HST/ACS imaging in the GOODS fields to demonstrate that the rest-UV morphologies of star-forming galaxies at also have a peaked distribution of ellipticities. While we found the distribution for rest-optical morphologies to be peaked about however, Ravindranath et al. (2006) found /0.3 for galaxies at respectively as seen in the rest-UV. This difference may be explained in part by the difference in rest-frame wavelength probed by the two studies; it is perhaps unsurprising that the ellipticity of star forming galaxies in the young universe changes slightly from rest-frame Å (tracing the regions of most recent star formation) to rest-frame Å (tracing the older stellar population). In contrast, van der Wel et al. (2011) observed a relatively flat distribution of (above ) for a sample of 14 massive () compact quiescent galaxies at . While this may represent a fundamental structural difference between the star-forming and quiescent galaxy samples, we caution that the quiescent galaxies are ten times more massive than the typical star forming galaxy in our survey, and note that if increasing stellar mass stabilizes the formation of disks then star forming galaxies of similarly high mass may prove to have similarly disk-like ellipticities.

### 4.3. Rest-Optical vs Rest-UV Morphologies

One of our fields (Q1700+64) was imaged previously using HST/ACS with the F814W filter (GO-10581, PI: Shapley). This filter ( Å) traces rest-UV wavelengths ranging from 2000-3000 Å, depending on the redshift of the target galaxy. The detailed morphologies resulting from this rest-UV imaging program have already been discussed elsewhere (Peter et al. 2007). Here we compare the rest-optical and rest-UV morphologies of galaxies overlapping with our WFC3/IR imaging. For consistency we re-reduce the raw observational data from GO-10581, drizzling them to a 0.08 arcsec pixel scale and smoothing them to a FWHM of 0.18 arcsec in order to match the observational characteristics of our WFC3/IR imaging data. We calculate that the F814W image reaches a limiting depth of 28.7 AB for a detection within a 0.2 arcsec radius aperture, or 1 mag deeper than our WFC3/IR imaging data.

We show the morphologies of the 18 star forming galaxies that overlap between the two samples in Figure 9. Qualitatively, we note that the morphologies of most galaxies are similar in both rest-UV and rest-optical bandpasses; morphological irregularities or multiple components visible in one bandpass are similarly visible in the other, resulting in a small morphological k-correction (see discussion by Conselice et al. 2011b). The smallest variation is exhibited by galaxies of low stellar mass (for which the light from young stars might reasonably be expected to dominate both the rest-UV and rest-optical light of the galaxy), while high-mass galaxies exhibit greater differences consistent with the establishment of an evolved stellar population. In particular, the galaxies that were observed to be extremely low surface-brightness, red ( AB), ‘wispy’ systems in the rest-UV tend to be high-mass systems that are much brighter and well-nucleated in the rest-optical (e.g., Q1700-MD103, Q1700-BX767). This result is similar to that found by Toft et al. (2005) for a population of red star forming galaxies.

We quantify this morphological difference by calculating the internal color dispersion $\xi$ (Papovich et al. 2005) after carefully aligning the ACS and WFC3 images using the measured centroids of 10 stars:

$$\xi(I_1, I_2) = \frac{\sum (I_2 - \alpha I_1 - \beta)^2 - \sum (B_2 - \alpha B_1)^2}{\sum (I_2 - \beta)^2 - \sum (B_2 - \alpha B_1)^2} \tag{12}$$

where $I_1$ and $I_2$ are the pixel fluxes in the F814W and F160W bandpasses, $\alpha$ is a scaling factor describing the overall color of the galaxy, $\beta$ adjusts for the variable background level, and $B_1$ and $B_2$ represent blank background sky regions in each image. is set by minimizing the sum , i.e. . The sum is performed over all pixels in the F160W segmentation map. The background sums were done by adopting the mean from the calculation performed on the segmentation map grafted onto 1000 different regions of blank sky.
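Eq. 12 translates almost directly into code. The sketch below takes the scaling factor and background offset as inputs rather than fitting them, and uses a single blank-sky region where the text averages over 1000; the function name and flat-list inputs are illustrative assumptions.

```python
def color_dispersion(i1, i2, b1, b2, alpha, beta):
    """Internal color dispersion (Eq. 12) for matched flat pixel lists.

    i1, i2: galaxy pixel fluxes in the two bandpasses.
    b1, b2: blank-sky pixel fluxes under the same segmentation map.
    alpha, beta: color scaling and background offset, supplied by the caller.
    """
    # Noise term, common to the numerator and the denominator.
    background = sum((y - alpha * x) ** 2 for x, y in zip(b1, b2))
    numerator = sum((y - alpha * x - beta) ** 2 for x, y in zip(i1, i2)) - background
    denominator = sum((y - beta) ** 2 for y in i2) - background
    return numerator / denominator
```

When the second image is an exact scaled copy of the first the statistic is zero, and it grows as the two bandpasses show increasingly different structure.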

Values for calculated for each galaxy are quoted in Figure 9, and confirm our visual impression that the UV and optical morphologies differ more greatly for high-mass (alternatively, red) galaxies. At the low-mass end () , while for galaxies with we find , peaking at for the highest-mass galaxy Q1700-BX767 which displays a red core with a surrounding blue ring. Similar trends were noted by Labbé et al. (2003), who found significant rest-UV to rest-optical morphological differences for a sample of six -bright disk galaxies, and Papovich et al. (2005), who noted that their galaxies with the highest values of were those with the reddest colors. We caution that there are relatively few galaxies in our sample however, and recent work by Bond et al. (2011) looking at the rest-optical vs rest-UV morphologies of 117 () star forming galaxies in the GOODS-S field found a similar mean but no evidence for a correlation with galaxy color. In the near future we anticipate that the relation between rest-UV and rest-optical morphology will be greatly refined by the large-area and multi-band CANDELS survey (Grogin et al. 2011; Koekemoer et al. 2011).

It is also possible to compare the effective radii derived in each of the two bandpasses. Similarly to Dutton et al. (2010) and Barden et al. (2005), we find (Figure 10) that the rest-UV sizes of these galaxies are % larger on average than their optical sizes, although there is increasing scatter in the relation at large radii (i.e., large mass) in part because these galaxies are red ( AB) and poorly defined in the F814W data. This relation is largely unchanged if the Sersic index of the radial profile is kept fixed between the F160W and F814W data.

## 5. The Stellar Mass Radius Relation

### 5.1. Observed Relation

In Figure 11 we plot the effective circularized radius as a function of stellar mass for all galaxies with and Sersic index , constituting a sample of 59/93/50 galaxies in the redshift ranges respectively. Of these 202 galaxies, 9 (%) have effective radii consistent with an unresolved point source, and may represent either the compact end of the galaxy distribution or faint AGN (albeit with no obvious signature in the UV spectra or broadband SED out to Å rest frame).

Figure 11 indicates that galaxies occupy a large range of effective radii at all redshifts and stellar masses with the standard deviation of the distribution dex comparable to the scatter in the local star forming galaxy relation (e.g., Shen et al. 2003). Despite the large width of the distribution in , however, there is a mean mass-radius relation in place as early as that evolves with decreasing redshift. Binning our sample by redshift and stellar mass we calculate that () kpc for galaxies in the mass range ( respectively at redshift , increasing with cosmic time to () kpc by , and to () kpc by (see summary in Table 3). [Footnote 12: Values represent the σ-clipped mean.] These results are consistent with the early values calculated for a subset of our sample by Nagy et al. (2011) to within the estimated uncertainty; the strongest evolution in effective radius with redshift occurs for higher mass galaxies . Parameterizing the stellar mass-radius relation as we find that the best-fit value of the power-law index , , and for the redshift , , and intervals respectively.

As indicated by the top histogram in Figure 11 the three redshift samples each probe galaxies with a slightly different range of stellar masses, and it is therefore useful to calculate a normalized quantity for each galaxy, where as a function of (solid black line in Figure 11) is the mean effective circularized radius for late-type (i.e., ) low-redshift galaxies in the SDSS (Shen et al. 2003). As indicated by Figure 12, typical star forming galaxies at fixed stellar mass were significantly smaller at than in the nearby universe, with , , and for the , , and samples respectively. [Footnote 13: Since our galaxies were selected from unresolved ground-based imaging data we do not expect intrinsic size to have an effect on our selection function.] If galaxies at fixed stellar mass in the range can be assumed to grow with redshift as , a linear least squares fit to the data indicates that between and (solid line in Figure 12). This is consistent with similar determinations and found for massive star forming galaxies by van Dokkum et al. (2010) and Mosleh et al. (2011) respectively. Extrapolation of this power law suggests that actively star forming galaxies in the young universe may evolve onto the local late-type mass-radius relation by (although see §5.4), consistent with recent evidence that the mass-radius relation for star-forming galaxies evolves only weakly in the redshift interval (Barden et al. 2005).
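A linear least squares fit of a power law reduces to fitting a straight line in log-log space. The sketch below assumes the growth law is a pure power law in $(1+z)$, an assumption made for illustration because the functional form is not reproduced above, and fits unweighted points. The function name and inputs are hypothetical.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of y = amplitude * x**slope in log-log space."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    amplitude = 10.0 ** (mean_y - slope * mean_x)
    return amplitude, slope
```

For example, passing a list of $(1+z)$ values and the corresponding normalized radii returns the amplitude and the power-law index of the size evolution.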

Individual star forming galaxies, however, grow in both stellar mass and radius simultaneously and eventually evolve into typical galaxies by the present day as indicated by clustering analyses (e.g., Conroy et al. 2008). Given the shallow observed mass-radius relation for star forming galaxies at , it is clearly not possible for individual galaxies to evolve along this relation to match the local sample. Rather, galaxies need to add mass at large radii via steeper growth of the form or as illustrated in Figure 13 (see also Figure 8 of van Dokkum et al. 2010). Such growth may be consistent with expectations for major and minor mergers respectively (e.g., Bezanson et al. 2009, Naab et al. 2009 for early-type galaxies).

### 5.2. Comparison with Previous Results

In Figure 13 we plot the best-fit power law model of the stellar mass-radius relation for our galaxy sample against a variety of previous observational samples available in the literature. [Footnote 14: Where necessary, results have been converted to a Chabrier IMF and circularized effective half-light radii.] Our results are generally consistent at the level with previous studies that, due to observational limitations, have typically been conducted for galaxies with high stellar masses (e.g., Franx et al. 2008; Toft et al. 2009; Williams et al. 2010; Targett et al. 2011) and extend these previous results down to .

The most direct comparison can be made to Mosleh et al. (2011), who used deep ground-based -band imaging across the GOODS-N field to measure the characteristic sizes of 41 massive () BM/BX star-forming galaxies in the redshift range and 4 LBGs in the redshift range for which spectroscopic redshifts have been made publicly available by Reddy et al. (2006). Although these target galaxies were selected and spectroscopically confirmed in a manner identical to our own sample, we find significant disagreement with respect to the mean as a function of stellar mass. As given in their Table 3, the Mosleh et al. (2011) BM/BX galaxy sample has a median mass of and median radius of kpc, and the LBG sample a median mass of and median radius of kpc. Within the same ranges of redshift and stellar mass, our BM/BX and LBG samples have σ-clipped mean radii of kpc and kpc respectively. Although our WFC3 imaging data are significantly deeper ( mag) and better resolved ( arcsec vs. arcsec) than the ground-based -band imaging, our experience with the robustness of (see §A) and tests degrading our images to the quality of the ground-based data do not suggest an obvious instrumental reason for the large difference in the mean values. [Footnote 15: Likewise, using the median instead of the sigma-clipped mean makes a negligible difference to our calculations.]

A particularly valuable comparison can also be made to Förster Schreiber et al. (2011), who used HST/NICMOS F160W (PSF FWHM arcsec) to study the rest-optical morphologies of 6 massive star-forming galaxies at selected from the SINS Hα integral-field kinematics survey (Förster Schreiber et al. 2009). Each of these six galaxies is plotted as a colored circle in Figure 13; we convert their measurements to circularized effective radii by multiplying by as tabulated in their Table 4. [Footnote 16: We plot the of the brightest component from their two-component fit to the galaxy Q1623-BX528 for consistency with our procedure described in §3.2.] Two of these six galaxies were also observed as part of our HST/WFC3 imaging program: Q1623-BX528 and Q2343-BX389. While our measured effective radii for Q1623-BX528 differ by % due to a different number of morphological components used to fit the complicated light distribution (Förster Schreiber et al. 2011 used two components, while we used three), our radii for the single-component Q2343-BX389 agree to within 1%, suggesting that there is negligible systematic difference between the radii calculated by the two surveys. Except for the multicomponent Q1623-BX528 (which Förster Schreiber et al. 2009 classify as a merger on the basis of kinematic data and multi-component rest-frame optical continuum morphology), all of the galaxies studied by Förster Schreiber et al. (2011) have radii roughly twice the mean size of their parent color-selected, spectroscopically confirmed galaxy population at a given stellar mass and lie in the top 5% of the distribution for our observed sample of BM/BX galaxies at . This suggests that the subset of galaxies observed by Förster Schreiber et al. (2011), the majority of which were selected to be the most disk-like within the SINS sample, falls among the high extreme of the galaxy population in the stellar mass range (see also discussion by Law et al. 2009; Dutton et al. 2010), while following some of the general trends observed at this redshift between size, specific SFR, and stellar mass surface density (Franx et al. 2008). We expand upon this discussion by relating the morphologies of these galaxies (plus 12 additional galaxies from the OSIRIS and/or SINS kinematic surveys that fell within our WFC3 imaging fields) to their ionized-gas kinematics in a forthcoming contribution (Law et al. 2012, in preparation).

### 5.3. Comparison with Theoretical Simulations

Although theoretical simulations of star forming galaxies are still in their infancy, the sizes predicted by such simulations are in rough agreement with our observed values. In Figure 13 (dotted and dashed lines) we illustrate the results of two such models from Sales et al. (2010) and Dutton et al. (2010) respectively.

Sales et al. (2010) use cosmological -body/SPH simulations to model the growth of baryonic structures in galaxies for four different feedback prescriptions. Of these four prescriptions, their “WF2Dec” model most closely matches both our observations and our physical understanding of these galaxies; in this model relatively strong feedback from star forming regions results in the efficient removal of gas from galaxies via an outflowing wind with velocity km s. Such peak outflow velocities are generally consistent with observations for our BM/BX/LBG galaxy sample (see, e.g., Steidel et al. 2010). As discussed by Sales et al. (2010), as feedback strength increases it suppresses star formation so that galaxies of a given stellar mass tend to inhabit larger haloes and can thus have correspondingly larger characteristic sizes. Assuming that their stellar half-mass radii roughly correspond to visible-band half-light radii, and converting to circularized values by multiplying by , we plot their predicted stellar mass — radius relation in Figure 13 (dotted line). The model is generally consistent with our observations, although it slightly underpredicts the typical galaxy size by dex. In contrast, in models with no feedback the majority of stars form in dense systems at early times, resulting in mean circularized half-light radii kpc at that disagree strongly with our observations.

Dutton et al. (2010) also study the evolution of scaling relationships with redshift using a series of semi-analytic models that roughly reproduce the velocity-mass-radius relations at . In particular, they focus on the evolution of the zero-point calibration of these relations, predicting that the evolution from to shifts the mass-radius relation upwards in radius by 0.3 dex. As illustrated in Figure 13 (dashed line), the magnitude of this zeropoint shift is consistent with our observations at (i.e., the mass at which the models also overlap the observed local relation). [Footnote 17: The slope of the Dutton et al. (2010) relation is too steep to match the observational data, but this is simply because their study was not intended to address the mass dependence of the galaxy mass vs. halo mass fraction.]

### 5.4. Caveats

We close by discussing a few of the caveats and complications that can affect the mass-radius relation that we have derived.

First, the galaxies in the subsample have fainter magnitudes than galaxies in the lower redshift bins (Figure 1), and Figure 22 demonstrated (see discussion in §A.2) that the recovered value of can vary as a function of total source magnitude. However, this effect does not significantly influence our conclusions. First, is extremely stable for galaxies with isolated morphologies and small radii characteristic of much of the observational sample. While is less robust to magnitude for larger and more irregular galaxies, the majority of the variation occurs for magnitudes which we deliberately exclude from our analysis. The mean observed magnitudes of our , , and samples are respectively. Across such a small range mag the change in radius for all morphological types is %, comparable to the statistical uncertainty in the quoted . Indeed, even were we to include faint galaxies with in our analysis we find that the mean values of change by .

It is also possible that our results may be biased due to our assumption that the radius of a multi-component system may be characterized by the radius of the brightest individual component, while our stellar masses (derived from seeing-limited ground-based photometry and similarly confused Spitzer/IRAC photometry) represent the integral over the light of all of the components. If we repeat our previous analyses instead assuming that the stellar mass of these systems is proportional to the fraction of the flux in the primary component, or simply omitting galaxies with multiple well-defined individual components from our analysis, we find that values for in each of the three redshift bins are consistent with their previously calculated values to within . We are therefore confident that our results are not significantly affected by our assumption of how to define for multi-component systems.

Some of the apparent evolution in characteristic radius at fixed stellar mass from to may also be due to the variable -correction in our fixed observational bandpass. With an effective wavelength of Å, the F160W filter probes rest frame 5548, 4758, and 4044 Å emission at the mean redshifts of the three samples (). However, we note that the effective radii derived for our galaxies in the Q1700+64 field varied by only % from rest-frame 5000Å to rest-frame 2500Å; linear interpolation suggests that the change from 5000Å to 4000Å would be much smaller, %. Similarly, Dutton et al. (2010) make theoretical predictions for the difference in effective radius between a variety of optical/NIR bandpasses; interpolating their results suggests that we might expect a systematic increase of dex in log() from the lowest to highest redshift sample (i.e., sizes measured at longer wavelengths are smaller than those measured at shorter wavelengths, corresponding to inside-out disk growth) due to such bandshifting. This is comparable to the formal uncertainty on our measured in each of the 3 redshift bins, and would represent only a minor correction. Likewise, the results of Barden et al. (2005; see their Fig. 2) suggest that the correction factor would be %, which is much smaller than our % uncertainty on in each of our redshift bins.
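The bandshifting argument rests on two simple relations: the rest-frame wavelength probed by a fixed filter, and a linear interpolation of a measured size change over a narrower wavelength interval. The sketch below illustrates both. The function names and the example numbers in the comments are hypothetical and are not the filter's actual effective wavelength or the measured size change.

```python
def rest_wavelength(observed_angstrom, redshift):
    """Rest-frame wavelength (Angstrom) probed by an observed-frame wavelength."""
    return observed_angstrom / (1.0 + redshift)


def interpolated_change(total_change, wave_start, wave_end, wave_target):
    """Linearly scale a change measured between wave_start and wave_end
    to the shorter interval between wave_start and wave_target."""
    fraction = (wave_start - wave_target) / (wave_start - wave_end)
    return total_change * fraction


# Hypothetical example: a 16000 A observation at z = 3 probes rest-frame 4000 A.
# rest_wavelength(16000.0, 3.0)
# A change measured from 5000 A to 2500 A, scaled to the 5000-4000 A interval,
# is 40% of the full change:
# interpolated_change(total_change, 5000.0, 2500.0, 4000.0)
```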

Finally, we caution that deriving precise values for the size evolution of galaxies compared to their low-redshift counterparts at similar stellar mass is complicated by uncertainties in the local relation. Although we adopted the Shen et al. (2003) estimate of the local mass-radius relation for late type galaxies, we note that numerous authors (e.g., Barden et al. 2005; Trujillo et al. 2006; Guo et al. 2009) find that Shen et al. (2003) underestimate their effective radii. This discrepancy is due in part to systematic differences in analysis techniques (GALFIT modeling vs 1-dimensional radial profile fitting), definition of early vs late-type galaxies ( versus ), and effective wavelength ( vs band) of the observations. Although the measured discrepancy among radii is less pronounced for low Sersic indices similar to those of our galaxy sample (Guo et al. 2009), these varied effects may considerably complicate interpretations of the evolution of the high-redshift mass-radius relation to the present day.

## 6. Quantifying Mergers in the Star Forming Galaxy Sample

While the irregular and clumpy morphologies of galaxies at may be interpreted as arising from dynamical instabilities within gas-rich systems (e.g., Bournaud et al. 2008; Dekel et al. 2009b; Genzel et al. 2011), they have also commonly been taken as indicators of ongoing mergers by numerous authors (e.g., Conselice et al. 2011b; Lotz et al. 2008a; and references therein). In this section, we discuss the properties of galaxies that can be identified as mergers via three common morphological criteria (the quantitative statistics and , and the observed fraction of close pairs) and assess how their relative abundance evolves throughout the redshift range . Additionally, we discuss the association of putative mergers with physical quantities such as stellar mass, SFR, and gas-phase kinematics, finding (similar to Law et al. 2007b) that whether or not a galaxy looks like a merger makes little difference to many of its physical properties.

Since our apparent magnitude cut (adopted to ensure robustness of the morphological statistics) introduces a redshift-dependent bias in the absolute magnitudes of our galaxies, all numerical values for the merger fraction (and/or merger rate) are calculated for a mass-limited subsample of galaxies with for which of galaxies at all redshifts also fulfill the criterion.

### 6.1. Defining the Mergers

#### 6.1.1 Quantitative Morphologies

One common way of identifying mergers is to use their morphological asymmetry , as discussed extensively in the literature by (e.g.) Conselice et al. (2000, 2003, 2008, 2009), Lotz et al. (2008b, 2010ab), Papovich et al. (2005), and Scarlata et al. (2007). In Figure 14 we plot versus for our magnitude-limited sample of galaxies (, left panels) and for a mass-limited subsample (, right panels). At low redshifts ongoing mergers have typically been identified by the criterion (e.g., Conselice et al. 2003), although for less well resolved, lower surface-brightness galaxies similar to those of our sample Lotz et al. (2008b) find that is more appropriate. Adopting the criterion, we find that the merger fraction (for ) is in the , , and samples respectively. [Footnote 18: Uncertainties are estimated by a Monte Carlo technique randomizing the individual values of based on a Gaussian probability distribution about the measured values. The width of this distribution combines the uncertainty in the measured value of and the scatter about the mean relation in our transformation to the Lotz et al. (2006) reference frame.]
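The Monte Carlo uncertainty estimate described for the asymmetry-based merger fraction can be sketched as follows. This is only an illustration of the general technique: the function name, the number of trials, the seed, and the `threshold` argument are assumptions, and the asymmetry cut must be supplied by the caller.

```python
import random
import statistics


def merger_fraction_mc(asymmetries, sigmas, threshold, n_trials=2000, seed=0):
    """Estimate the merger fraction and its scatter by perturbing each
    asymmetry value with Gaussian noise of the given width.

    threshold is the asymmetry cut above which a galaxy counts as a merger;
    n_trials and seed are illustrative choices.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_trials):
        # Draw one randomized realization of every galaxy's asymmetry.
        perturbed = [rng.gauss(a, s) for a, s in zip(asymmetries, sigmas)]
        n_merger = sum(1 for a in perturbed if a > threshold)
        fractions.append(n_merger / len(perturbed))
    return statistics.fmean(fractions), statistics.pstdev(fractions)
```

The returned standard deviation reflects only the propagated measurement scatter; it does not include the binomial sampling uncertainty from the finite number of galaxies.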

Another common method of identifying mergers is by their location in space, as originally defined by Lotz et al. (2004, 2006). In Figure 15 we plot versus for our magnitude-limited sample of galaxies (, left panels) and for a mass-limited subsample (, right panels). The merger criterion defined by Lotz et al. (2008b) for high-redshift galaxies [Footnote 19: There is no merger criterion tailored specifically to our galaxy sample and angular resolution of WFC3/IR; we adopt the Lotz et al. (2008b) definition as an approximation given its popularity in the literature.]

$$G > -0.14\,M_{20} + 0.33 \qquad (13)$$

gives a merger fraction of at , , and respectively.
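Equation 13 is a linear cut in the plane of the Gini coefficient G and the second-order moment M20, so applying it to a catalog is a one-line test per galaxy. A minimal sketch follows; the function names and the input format are assumptions made for illustration.

```python
def is_gm20_merger(gini, m20):
    """Apply the Eq. 13 cut: merger candidates satisfy G > -0.14 * M20 + 0.33."""
    return gini > -0.14 * m20 + 0.33


def gm20_merger_fraction(measurements):
    """Fraction of (gini, m20) pairs that satisfy the Eq. 13 merger criterion."""
    if not measurements:
        raise ValueError("need at least one (gini, m20) measurement")
    n_merger = sum(1 for gini, m20 in measurements if is_gm20_merger(gini, m20))
    return n_merger / len(measurements)
```

Because the cut is sharp, galaxies close to the dividing line can change class with small shifts in either statistic, which is why the choice of segmentation map matters for the derived fraction.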

Figures 14 and 15 demonstrate the necessity for caution when estimating the merger fraction using different segmentation maps: If we had calculated the morphological statistics using a segmentation map modeled on the methods of Conselice et al. (2009), typical points in these figures would be offset in the direction indicated by the green arrows. While the effect in the plane is fairly minimal, values of can change drastically, pushing a large number of points over the merger/non-merger dividing line and resulting in a wildly different derived merger fraction if the merger/non-merger division is not made appropriately. As discussed by Lisker (2008), this offset may in large part account for the discrepancy in the number of mergers identified in similar observational samples at using the technique by Lotz et al. (2008a; see their Figure 10) and Conselice et al. (2008; see their Figure 8).

#### 6.1.2 Nearby pairs

Another method of identifying mergers is to count the number of systems with close physical pairs. In practice, we consider systems with multiple distinct clumps of comparable flux ( 3:1 - 1:1) in their light profiles, colors consistent with the rest-UV selection criteria, and well-defined separations in the range 5 16 kpc (i.e., are classified as Type II galaxies) as physical pair candidates. [Footnote 20: Of course, not all pairs at redshifts will be in our spectroscopic sample, but we do not expect this to bias the derived pair fraction because the spectroscopic targets were chosen independently of whether or not they appeared to be in angular pairs.] For galaxies with and we find that the fraction of pairs is , , and at , , and respectively. [Footnote 21: Uncertainties are estimated using Bayesian binomial confidence intervals (see discussion by Cameron 2011).] Some fraction of these candidates, however, will not be physical pairs but simply projected angular pairs of galaxies with different redshifts and no physical association.
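A Bayesian binomial interval of the kind referred to for the pair fractions can be approximated with the standard library alone. The sketch below assumes a uniform prior, so that the posterior for the fraction is a Beta(k+1, n-k+1) distribution, and reads the equal-tailed interval off Monte Carlo draws. The prior, the 68.3% level, the sample count, and the function name are all assumptions; an exact beta quantile function would normally be preferred where available.

```python
import random


def binomial_credible_interval(k, n, level=0.683, n_samples=20000, seed=0):
    """Approximate equal-tailed credible interval for a binomial fraction.

    Assumes a uniform prior, giving a Beta(k + 1, n - k + 1) posterior,
    and estimates its quantiles from random draws.
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(k + 1, n - k + 1) for _ in range(n_samples))
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * n_samples)]
    upper = draws[min(n_samples - 1, int((1.0 - tail) * n_samples))]
    return lower, upper
```

Unlike a simple Poisson or normal approximation, this interval stays inside [0, 1] and remains well behaved when k is 0 or equal to n.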

One effort to constrain the incidence rate of false pairs can be made by extrapolating the false pair fraction observed at larger distances for which spectroscopic redshifts can be obtained for individual objects. Considering the 2874 galaxies (across 19 different fields) in our catalog with spectroscopic redshifts in the range , we count the number of distinct angular pairs as a function of separation in comparison to the number of genuine physical pairs whose spectroscopic redshifts lie within of each other. As illustrated by Figure 16, extrapolation of this relation to the radii probed by our WFC3 data suggests that % of our observed pairs should correspond to genuine physical pairs.
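The comparison between angular pairs and genuine physical pairs amounts to counting, for a given maximum separation, how many close pairs also agree in redshift. A minimal sketch under simplifying assumptions: positions are flat-sky offsets in arbitrary angular units, and `max_dz` is a hypothetical redshift tolerance standing in for the criterion used in the text.

```python
import math


def pair_statistics(objects, max_sep, max_dz):
    """Count angular pairs and the subset that are also close in redshift.

    objects: list of (x, y, z) tuples, with x and y as flat-sky offsets and
    z as spectroscopic redshift. max_sep and max_dz are illustrative
    thresholds supplied by the caller.
    """
    n_angular = 0
    n_physical = 0
    for i in range(len(objects)):
        xi, yi, zi = objects[i]
        for j in range(i + 1, len(objects)):
            xj, yj, zj = objects[j]
            if math.hypot(xi - xj, yi - yj) <= max_sep:
                n_angular += 1
                if abs(zi - zj) <= max_dz:
                    n_physical += 1
    return n_angular, n_physical
```

Evaluating the ratio `n_physical / n_angular` over a series of `max_sep` values gives the kind of relation that can then be extrapolated to smaller separations.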

Alternatively, we can also estimate the false pair fraction based on the statistical distribution of objects in the WFC3 imaging fields. Using our Source Extractor catalogs, we evaluate the number of unique pairs with primary magnitudes in the range and secondary magnitudes within 1 magnitude of the primary as a function of their separation radius. Assuming that the majority of such pairs in the WFC3 fields are false pairs, we estimate that % of galaxies have false pairs within kpc. Subtracting this 0.07 false pair fraction from the angular pair fraction calculated above, we obtain the true physical pair fractions , , and at , , and respectively for separations in the range 5 kpc 16 kpc. We note that these values are consistent to within observational uncertainty with what would be derived had we simply assumed that 50% of angular pairs were false pairs.

### 6.2. Evolution with Redshift

As detailed above, estimates of the merger fraction derived from all three methods are roughly constant across our three redshift ranges, albeit with mild evidence (at the level) for a decline in the merger fraction at (see Figure 17, left-hand panel). In order to construct the merger rate from the merger fraction it is necessary to combine the merger fractions with the estimated timescale for visibility and the comoving space density of the target sample (see, e.g., Lotz et al. 2008a):

$$N_{\rm merg} = \frac{n(z)\, f_{\rm merg}}{T} \qquad (14)$$

Estimating the comoving space densities by integrating the mass functions for the star forming galaxy sample given by Reddy & Steidel (2009) above , and adopting Gyr, Gyr, and Gyr (see discussion in §6.3), we obtain estimates of the merger rate as shown in Figure 17 (right-hand panel). Clearly the actual merger rate of our galaxies is highly uncertain, and for the small number of galaxies observed in the present sample it is not possible to comment meaningfully on the evolution of the merger fraction with redshift (although our results are consistent with those derived for similar populations of galaxies in other studies; see, e.g., Conselice et al. 2011a and references therein). Even for significantly larger galaxy samples (e.g., Faber et al. 2011) it may prove difficult to constrain the merger rate given the large uncertainty in observability timescales that require numerical simulations to constrain.
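Equation 14 combines three inputs, each with its own uncertainty: the comoving number density, the merger fraction, and the observability timescale. The sketch below simply evaluates the relation; the function name, the units noted in the docstring, and any numbers passed to it are assumptions for illustration.

```python
def merger_rate(number_density, merger_fraction, timescale_gyr):
    """Evaluate Eq. 14: N_merg = n(z) * f_merg / T.

    number_density: comoving space density (e.g. per Mpc^3)
    merger_fraction: fraction of galaxies classified as mergers (0 to 1)
    timescale_gyr: observability timescale in Gyr
    Returns mergers per unit volume per Gyr.
    """
    if timescale_gyr <= 0.0:
        raise ValueError("timescale must be positive")
    if not 0.0 <= merger_fraction <= 1.0:
        raise ValueError("merger fraction must lie between 0 and 1")
    return number_density * merger_fraction / timescale_gyr
```

Since the rate scales inversely with the timescale, a factor-of-two uncertainty in the observability timescale translates directly into a factor-of-two uncertainty in the rate.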

### 6.3. Physical Properties of the Mergers

It is not obvious whether it is physically meaningful to identify galaxies as mergers on the basis of their rest-frame optical morphology. As argued by some authors (e.g., Bournaud et al. 2008; Genzel et al. 2011), irregular morphologies may instead arise from dynamical instabilities within gas-rich systems. Additionally, as we demonstrated in Law et al. (2007b) and expand upon below, merger-like morphologies are poorly correlated with other physical observables.

There are significant differences between the subsamples of galaxies from our survey selected as mergers at according to different criteria. % of such galaxies are identified as mergers on the basis of their morphological asymmetry, while % are identified using the selection criterion, and % using pair statistics. As illustrated in Figures 14 and 15, 76% of galaxies selected as mergers according to are also selected as mergers using , but only 39% of galaxies selected as mergers using are also selected as mergers according to . Similarly, 59% (45%) of mergers identified by (asymmetry) are also identified as mergers based on the presence of a nearby angular pair. Clearly, while there is a significant overlap between the galaxy samples, there are also a significant number of galaxies uniquely selected by each technique.

This difference is unsurprising given that the various morphological selection criteria may isolate mergers with different mass ratios and in a different range of evolutionary phases. Lotz et al. (2008b, 2010ab) performed a series of hydrodynamic simulations to explore the timescales and visibility of disk galaxy mergers as a function of morphological selection criterion, mass ratio, and gas content. Dividing their mergers into six stages (pre-merger, first-pass, maximal separation, final merger, post merger, remnant) these authors found that the observability of mergers at rest-frame 4686 Å can vary dramatically from stage to stage. For the ‘G3gf1’ model [Footnote 22: Stellar and gas masses .], Lotz et al. (2010b) find that and pair criteria tend to have observability timescales Gyr and Gyr (predominantly identifying first-passage mergers) while the criterion has a longer observability timescale (identifying both first-passage and final mergers; see also Conselice et al. 2006). The greater fraction of galaxies that we identify as mergers based on their asymmetry than by the other two methods (Figure 17) may therefore simply reflect this large difference in observability timescales. Further, Lotz et al. (2010a) find that while is most sensitive to major mergers like those identified using our pair selection criteria ( flux ratio), detects both major and minor mergers, potentially explaining why we identify more mergers using than with pair selection.

In Figure 18 we plot histograms of various physical properties for galaxies classified as mergers/non-mergers according to the , , and pair criteria and use a KS test to evaluate the significance of the null hypothesis that both sets of galaxies (mergers and non-mergers) were drawn from the same distribution. We conclude that for almost all physical parameters (stellar mass, SFR, rest-frame color [Footnote 23: Estimated from the best-fit SED.], etc.) there is no significant difference (confidence in the null hypothesis %) between putative mergers and non-mergers. Similarly, there is no obvious difference in the gas-phase kinematics between mergers and non-mergers, although our sample size of 35 galaxies with systemic Hα redshifts and high-quality UV spectra is too small to conclusively rule out association. The one notable exception is that galaxies identified as mergers via the or pair classification schemes have significantly smaller radii and correspondingly higher than non-mergers. This may suggest either that peaks around the first-passage during a major merger event, or that the and pair classification schemes are simply effective at finding galaxies with small radii.
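The two-sample KS comparison between mergers and non-mergers can be sketched without external libraries. The version below computes the maximum distance between the two empirical cumulative distributions and converts it to an approximate p-value with the standard asymptotic Kolmogorov series. It is an illustrative implementation under those assumptions, not necessarily the routine used for Figure 18, and the small-sample correction factor is the commonly quoted approximation rather than an exact result.

```python
import math


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the maximum distance between the empirical CDFs; p is the
    asymptotic probability of a distance at least this large if both
    samples share one parent distribution.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x.
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d = max(d, abs(i / n_a - j / n_b))
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return d, 1.0  # the series converges poorly here; p is effectively 1
    p = 0.0
    for k in range(1, 101):
        p += 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return d, min(1.0, max(0.0, p))
```

A large p-value here means only that the test cannot distinguish the two samples, which for small subsamples is a weak statement about the underlying populations.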

The lack of correlation observed between morphology and these physical observables may be unsurprising in light of both numerical uncertainties in our morphologies (i.e., exactly where the dividing line between mergers and non-mergers lies) and expectations (e.g., Lotz et al. 2010ab) that star formation may typically peak after the major morphological disturbances have subsided. Regardless, it is unclear whether it is physically meaningful to classify galaxies as mergers on the basis of morphology alone given that there appears to be little to distinguish these systems (whether observed in the rest-optical or the rest-UV; see discussion by Law et al. 2007b, see also Swinbank et al. 2010 for a similar discussion of submillimeter galaxies) from their non-merging counterparts. Rather, it may simply be that most star forming galaxies are dynamically unstable systems driven by the accretion of large quantities of gas, whether this gas is acquired through mergers, cold-mode, or hot-mode accretion processes.

## 7. Summary

We have presented rest-optical morphologies for a sample of 306 spectroscopically confirmed star forming galaxies with stellar masses in the range . Since these galaxies were distributed among 10 different fields widely separated on the sky, the effects of sample variance are expected to be greatly reduced compared to surveys over contiguous regions of similar total area. We summarize our principal scientific conclusions as follows:

1. Typical star forming galaxies have circularized effective radii kpc and a projected exponential surface brightness profile that extends out to in stacked galaxy images. The observed sizes are consistent with previous observational estimates (e.g., Buitrago et al. 2008; Kriek et al. 2009) for high-mass galaxy populations and with numerical simulations (e.g., Sales et al. 2010) that assume strong stellar feedback.

2. A stellar mass - radius relation for star forming galaxies is observed to exist as early as ; at fixed mass typical sizes evolve with redshift as in the interval to . These galaxies must grow at least as fast as in order to evolve onto the local late-type galaxy relation by the present day.

3. The distribution of axis ratios is strongly inconsistent with a population of axisymmetric thick exponential disks and more consistent with a population of triaxial ellipsoids with intrinsic minor/major and intermediate/major axis ratios and respectively. The typical ellipticity is qualitatively similar to that previously found by Ravindranath et al. (2006), but there may be mild evidence for evolution with wavelength. The ellipsoidal nature of these galaxies indicates at minimum that the distribution of stellar mass within them is markedly asymmetric, and (in combination with their high gas fractions and velocity dispersions) may further suggest that they are not in stable dynamical equilibrium with short-lived gas disks (e.g., Ceverino et al. 2010) continually forming and re-forming from recently accreted gas until stabilized (e.g., Martig et al. 2010) by a sufficiently massive triaxial stellar component.

4. Consistent with previous studies (e.g., Dickinson 2000; Papovich et al. 2005), rest-optical (Å) and rest-UV ( Å) morphology for star forming galaxies is generally similar with typical color dispersion (although rest-UV radii are larger by