
# Neutral hydrogen optical depth near star-forming galaxies at z≈2.4 in the Keck Baryonic Structure Survey\*

\* Based on data obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California, and NASA, and was made possible by the generous financial support of the W.M. Keck Foundation.

Olivera Rakic and Joop Schaye (Leiden Observatory, Leiden University, P.O. Box 9513, 2300 RA Leiden, The Netherlands); Charles C. Steidel and Gwen C. Rudie (California Institute of Technology, MS 249-17, Pasadena, CA 91125, USA)
###### Abstract

We study the interface between galaxies and the intergalactic medium by measuring the absorption by neutral hydrogen in the vicinity of star-forming galaxies at . Our sample consists of 679 rest-frame-UV selected galaxies with spectroscopic redshifts that have impact parameters (proper) Mpc to the line of sight of one of 15 bright, background QSOs and that fall within the redshift range of its Ly forest. We present the first 2-D maps of the absorption around galaxies, plotting the median Ly pixel optical depth as a function of transverse and line of sight separation from galaxies. The Ly optical depths are measured using an automatic algorithm that takes advantage of all available Lyman series lines. The median optical depth, and hence the median density of atomic hydrogen, drops by more than an order of magnitude around 100 kpc, which is similar to the virial radius of the halos thought to host the galaxies. The median remains enhanced, at the level, out to at least 2.8 Mpc (i.e.  comoving Mpc), but the scatter at a given distance is large compared with the median excess optical depth, suggesting that the gas is clumpy. Within 100 (200) kpc, and over , the covering fraction of gas with Ly optical depth greater than unity is (). Absorbers with are typically closer to galaxies than random. The mean galaxy overdensity around absorbers increases with the optical depth and also as the length scale over which the galaxy overdensity is evaluated is decreased. Absorbers with reside in regions where the galaxy number density is close to the cosmic mean on scales Mpc. We clearly detect two types of redshift space anisotropies. On scales , or Mpc, the absorption is stronger along the line of sight than in the transverse direction. This “finger of God” effect may be due to redshift errors, but is probably dominated by gas motions within or very close to the halos. 
On the other hand, on scales of 1.4 - 2.0 Mpc the absorption is compressed along the line of sight (with significance), an effect that we attribute to large-scale infall (i.e. the Kaiser effect).

galaxies: formation — galaxies: halos — galaxies: high-redshift — intergalactic medium — quasars: absorption lines — large-scale structure of Universe

## 1. Introduction

Gas accretion and galactic winds are two of the most important and poorly understood ingredients of models for the formation and evolution of galaxies. One way to constrain how galaxies get their gas, and to learn about the extent of galactic feedback, is to study the intergalactic medium (IGM) in the galaxies’ vicinity. The interface between galaxies and the IGM can be studied either in emission (e.g. Bland & Tully, 1988; Lehnert et al., 1999; Ryan-Weber, 2006; Borthakur et al., 2010; Steidel et al., 2011) or in absorption against the continuum of background objects such as QSOs (e.g., Lanzetta & Bowen, 1990; Bergeron & Boissé, 1991; Steidel & Sargent, 1992; Steidel et al., 1994; Lanzetta et al., 1995; Steidel et al., 1997; Chen et al., 1998, 2001; Bowen et al., 2002; Penton et al., 2002; Frank et al., 2003; Adelberger et al., 2003, 2005; Pieri et al., 2006; Simcoe et al., 2006; Steidel et al., 2010; Crighton et al., 2011; Prochaska et al., 2011; Kacprzak et al., 2011; Bouche et al., 2011) or galaxies (e.g. Adelberger et al., 2005; Rubin et al., 2010; Steidel et al., 2010; Bordoloi et al., 2011).

Emission from intergalactic gas is very faint and observations are currently mostly limited to low redshifts. At Ly emission was recently seen out to kpc from star-forming galaxies in a stacking analysis by Steidel et al. (2011), and the origin of this light seems to be radiation of the central object scattered by galactic halo gas. However, these observations are limited to the immediate vicinity of galaxies, i.e. , and are currently feasible only for the H i Ly transition. On the other hand, studying this gas in absorption is viable at all redshifts as long as there are sufficiently bright background objects. Most importantly, absorption studies are sensitive to gas with several orders of magnitude lower density than emission studies.

The background sources used for absorption line probes of the IGM have traditionally been QSOs, but one may also use background galaxies or gamma-ray bursts. While the surface density of background galaxies is much higher than that of QSOs, the quality of the individual spectra is much lower, since at the redshifts discussed in this paper the typical background galaxy is times fainter than the brightest QSOs. Studies using galaxies as background sources are therefore confined to analyzing strong absorption lines and generally require stacking many lines of sight. On the other hand, QSOs sufficiently bright for high-resolution, high S/N spectroscopy using 8m-class telescopes are exceedingly rare, but the information obtained from a single line of sight is of exceptional quality, well beyond what could be obtained with even the very brightest galaxy at comparable redshift. We note also that galaxies and QSOs do not provide identical information, as the former are much more extended than the latter. This is particularly relevant for metal lines, which often arise in absorbers with sizes that are comparable to or smaller than the half-light radii of galaxies (e.g. Schaye et al., 2007).

In this paper we focus on studying the IGM near star-forming galaxies at using absorption spectra of background QSOs. At present we focus on H i Ly in the vicinity of galaxies, while a future paper will study the relation between metals and galaxies. Star-forming galaxies in this redshift range can be detected very efficiently based on their rest-frame UV colors (Steidel et al., 2003, 2004) and the same redshift range is ideal for studying many astrophysically interesting lines that lie in the rest-frame UV part of the spectrum (e.g.  Ly at 1215.67Å, CIV 1548.19,1550.77 Å, OVI 1031.93,1037.616 Å, where the CIV and OVI lines are doublets). At these redshifts the Ly forest is not as saturated as at , and at the same time the absorption systems are not as rare as they are at low redshifts. In addition, this redshift range is exceptionally well-suited for studying the galaxy-IGM interface given that this is when the universal star formation rate was at its peak (e.g. Reddy et al., 2008), and hence we expect the interaction between galaxies and their surroundings to be most vigorous.

The largest QSO-galaxy surveys at high redshift are those from Adelberger et al. (2003, 2005) and Crighton et al. (2011), and here we will briefly describe the former given that it is the most comparable in terms of the galaxy sampling density and data quality to the study presented in this paper. Adelberger et al. (2003) studied the IGM close to 431 Lyman Break Galaxies (LBGs) at in 8 QSO fields. They found enhanced Ly absorption within comoving Mpc ( proper Mpc) from star-forming galaxies. On the other hand, they found that the region within comoving Mpc ( proper Mpc) contains less neutral hydrogen than the global average. This last result was, however, based on only 3 galaxies. Adelberger et al. (2005) studied the IGM at with an enlarged sample: 23 QSOs in 12 fields containing 1044 galaxies. They confirmed the earlier result of enhanced absorption within comoving Mpc ( proper Mpc) and found that even though most galaxies show enhanced absorption within comoving Mpc, a third of their galaxies did not have significant associated Ly absorption.

In comparison with Adelberger et al. (2003, 2005) we use uniformly excellent QSO spectra taken with the HIRES echelle spectrograph, covering from 3100 Å  to at least the QSO’s CIV emission line. Adelberger et al. (2003) also used HIRES spectra, but at the surface density of galaxies with is times smaller than at , and the QSO spectra did not cover higher order Lyman lines, which are important for the recovery of optical depths in saturated Ly lines. Adelberger et al. (2005) used HIRES spectra as well as lower-resolution spectra of fainter QSOs obtained using LRIS and ESI.

This paper is organized as follows. We describe our data sample in Section 2. In Section 3 we discuss the so-called pixel optical depth method that we use to analyze the QSO spectra. The distribution of Ly absorption as a function of transverse and line of sight (LOS) separation from galaxies is presented in Section 4, while the distribution of galaxies around absorbers is described in Section 5. Finally, we conclude in Section 6.

Throughout this work we use , , , and (Komatsu et al., 2009). When referring to distances in proper (comoving) units we denote them as pMpc (cMpc).

## 2. Data

### 2.1. The Keck Baryonic Structure Survey

The Keck Baryonic Structure Survey (KBSS; Steidel et al 2012) is a new survey which combines high precision studies of the IGM with targeted galaxy redshift surveys of the surrounding volumes, expressly designed to establish the galaxy/IGM connection in the redshift range . The KBSS fields are centered on 15 background QSOs (see Table 1) that are among the most luminous known ( L) in the redshift range . The QSO redshifts were chosen to maximize the information content and redshift path sampled by their absorption line spectra. Toward that end, the QSO spectra used in the KBSS are of unprecedented quality, combining archival high resolution spectra from Keck/HIRES (and, in 3 cases, archival VLT/UVES spectra as well) with new Keck/HIRES observations to produce the final co-added spectra summarized in Table 1.

The mean flux in this sample of QSO spectra is 0.806. The formula for the effective Ly optical depth from Schaye et al. (2003) suggests that the mean flux at is 0.802(), which is consistent with our data, so we conclude that the IGM probed by QSO sightlines in our sample is representative for the considered redshift.

Within each KBSS field, UV color selection techniques (Steidel et al. 2003; Adelberger et al. 2004) were used to tune the galaxy redshift selection function so as to cover the same optimal range of redshifts probed by the QSO spectra. The spectroscopic follow-up using Keck/LRIS was carried out over relatively small solid angles surrounding the QSO sightlines, but typically 8 to 10 slit masks with nearly identical footprint were obtained in each field, leading to a high level of spectroscopic completeness and a very dense sampling of the survey volumes surrounding the QSO sightlines. The full KBSS galaxy sample contains 2188 spectroscopically identified galaxies in the redshift range , with . The spectroscopic sample was limited to those with apparent magnitude , where galaxies in the range and those within 1 arcminute of a QSO sightline were given highest priority in designing slit masks.

The effective survey area is (i.e.  0.2 square degrees), and the average surface density of galaxies with spectroscopic redshifts is (within 2.5 pMpc from QSO sightlines).

In this paper, we focus on a subset of 679 galaxies that satisfy the following criteria for redshift and projected distance from the relevant QSO sightline:

• their redshifts are within the Ly forest range, defined as:

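$$\frac{(1+z_{\rm QSO})\,\lambda_{\rm Ly\beta}}{\lambda_{\rm Ly\alpha}} - 1 \;\le\; z \;\le\; z_{\rm QSO} - (1+z_{\rm QSO})\,\frac{3000~{\rm km\,s^{-1}}}{c} \qquad (1)$$

where $z_{\rm QSO}$ is the QSO redshift, and $\lambda_{\rm Ly\alpha}$ and $\lambda_{\rm Ly\beta}$ are the rest-frame wavelengths of the hydrogen Lyα (1215.67 Å) and Lyβ (1025.72 Å) lines, respectively (for a discussion of the limits in this expression, see §2.2);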

• they are within 2 pMpc (′ or cMpc at ), the transverse distance to which there is significant coverage in all 15 KBSS fields.
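The two selection criteria above can be sketched in a few lines of Python. This is a minimal illustration, not the survey's actual selection code: the lower limit follows the Lyβ/Lyα wavelength ratio given in the text, while the form of the upper limit (excluding 3,000 km/s bluewards of the QSO redshift, as discussed in §2.2) is an assumption, and the function names and the example QSO redshift are hypothetical.

```python
C_KMS = 299792.458    # speed of light in km/s
LAMBDA_LYA = 1215.67  # rest wavelength of H I Lya in Angstrom (from the text)
LAMBDA_LYB = 1025.72  # rest wavelength of H I Lyb in Angstrom (from the text)


def lya_forest_range(z_qso, v_exclude_kms=3000.0):
    """Return (z_min, z_max) of the Lya forest window for a QSO at z_qso.

    Lower limit: Lya absorption that falls redwards of the QSO's Lyb emission.
    Upper limit (assumed form): exclude v_exclude_kms bluewards of the QSO.
    """
    z_min = (1.0 + z_qso) * LAMBDA_LYB / LAMBDA_LYA - 1.0
    z_max = z_qso - (1.0 + z_qso) * v_exclude_kms / C_KMS
    return z_min, z_max


def in_sample(z_gal, impact_pmpc, z_qso, max_impact_pmpc=2.0):
    """True if a galaxy passes both the redshift and impact-parameter cuts."""
    z_min, z_max = lya_forest_range(z_qso)
    return z_min <= z_gal <= z_max and impact_pmpc <= max_impact_pmpc


if __name__ == "__main__":
    # Hypothetical QSO redshift, chosen only for illustration.
    z_min, z_max = lya_forest_range(2.7)
    print(f"forest window: {z_min:.3f} <= z <= {z_max:.3f}")
```

Passing `v_exclude_kms=5000.0` gives the alternative upper limit used for the robustness check described in §2.2.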

Figure 1 shows the number of galaxies as a function of their (proper) distance from the QSO sightlines.

We note that 3 of the 15 KBSS fields (1623+268, 1700+64, 2343+12) were also included in the study by Adelberger et al. (2005). However, since 2005 the data have been increased in both quantity and quality (for both the QSO spectra and the galaxy surveys) in all 3 of these fields. Further details on the KBSS survey and its data products will be described by Steidel et al. (2012).

The smallest impact parameter in the present sample is 55 pkpc, with 29, 106, and 267 galaxies having impact parameters smaller than 200, 500, and 1000 pkpc, respectively.

#### 2.1.1 Redshifts

The redshifts of the vast majority (608 out of 679) of the galaxies in our sample are derived from interstellar absorption lines and/or the Ly emission line. These lines were measured from low-resolution (FWHM ) multislit spectra taken between 2000 and 2010 with LRIS-B on the Keck I and II telescopes. However, ideally one would want to measure the galaxy redshifts from the nebular emission lines ([O ii] , H, H, [O iii] ) since those originate in stellar H ii regions and are more likely to correspond to the galaxies’ systemic redshifts. Erb et al. (2006b) have used NIRSPEC, a near-IR instrument on Keck II, to obtain higher resolution (FWHM ) spectra than achieved by LRIS-B and have used these to measure nebular redshifts for 110 galaxies. Our sample contains 71 of those galaxies and we take their systemic redshifts to be equal to the nebular redshifts, i.e. . Marginally resolved lines have for LRIS-B and for NIRSPEC.

For the galaxies without near-IR observations we estimate the systemic redshifts using the empirical relations of Rakic et al. (2011). Using the same data as analyzed here, they calibrated galaxy redshifts measured from rest-frame UV lines (Lyα emission, $z_{\rm Ly\alpha}$, and interstellar absorption, $z_{\rm ISM}$) by utilizing the fact that the mean Lyα absorption profiles around the galaxies, as seen in spectra of background QSOs, must be symmetric with respect to the true galaxy redshifts if the galaxies are oriented randomly with respect to the lines of sight to the background objects. The following values represent their best fits to the data:

1. for galaxies without available interstellar absorption lines (70 objects):

   $$z_{\rm gal} = z_{\rm Ly\alpha} - 295^{+35}_{-35}~{\rm km\,s^{-1}} \qquad (2)$$

2. for galaxies without detected Lyα emission lines (346 objects):

   $$z_{\rm gal} = z_{\rm ISM} + 145^{+70}_{-35}~{\rm km\,s^{-1}} \qquad (3)$$

3. for galaxies for which both interstellar absorption and Lyα emission are detected (263 objects) we use the arithmetic mean of the above expressions.
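The three cases above can be written as a short routine. The sketch below is illustrative only: it takes the central values of the two offsets from the text, ignores their uncertainties, and assumes that a velocity offset is applied to a redshift to first order as Δz = (1 + z) Δv / c. The function and argument names are hypothetical.

```python
C_KMS = 299792.458  # speed of light in km/s

DV_LYA_KMS = -295.0  # central offset applied to Lya emission redshifts (text)
DV_ISM_KMS = +145.0  # central offset applied to interstellar absorption redshifts (text)


def shift_redshift(z_obs, dv_kms):
    """Shift a redshift by a velocity offset (assumed first-order relation)."""
    return z_obs + (1.0 + z_obs) * dv_kms / C_KMS


def systemic_redshift(z_lya=None, z_ism=None, z_neb=None):
    """Estimate the systemic redshift from whichever measurements exist.

    A nebular redshift, if available, is adopted directly. Otherwise the
    corrected UV-line redshifts are used, averaged when both are present.
    """
    if z_neb is not None:
        return z_neb
    estimates = []
    if z_lya is not None:
        estimates.append(shift_redshift(z_lya, DV_LYA_KMS))
    if z_ism is not None:
        estimates.append(shift_redshift(z_ism, DV_ISM_KMS))
    if not estimates:
        raise ValueError("no redshift measurement available")
    return sum(estimates) / len(estimates)
```

For example, `systemic_redshift(z_lya=2.400)` returns a value slightly below 2.400, reflecting the redshifted Lyα emission, while `systemic_redshift(z_ism=2.400)` returns a value slightly above it.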

These offsets yield redshift estimates free of velocity systematics (Rakic et al., 2011; Rudie et al., 2012). Repeated observations of the same galaxies suggest that the typical measurement uncertainties are for . Comparison of redshifts inferred from rest-frame UV lines with those from nebular lines for the subset of 89 galaxies that have been observed in the near IR, shows that the rest-frame UV inferred redshifts have a scatter of . In addition, Rudie et al. (2012) concluded that redshift errors in the KBSS sample are based on the velocity structure of the strongest H i absorbers around galaxy positions (first panel of their Figure 18), and Trainor & Steidel (2012) reached the same conclusion based on the small observed velocity dispersion of galaxies with respect to QSOs.

Figure 2 shows a histogram of the galaxy redshifts, both for the full sample and for the subsample with redshifts measured from near-IR lines. The median redshifts of the full and near-IR samples are 2.36 and 2.29, respectively. The mean and standard deviations of the redshift distributions for the two samples are and , respectively.

#### 2.1.2 Physical properties

These UV-selected star-forming galaxies have stellar masses (Shapley et al., 2005). The typical star formation rates (SFRs) are , where the SFRs of individual objects vary from to , and the mean SFR surface density is (Erb et al., 2006). These stellar mass and SFR estimates assume a Chabrier (2003) IMF. The galaxies show a correlation between their stellar mass and metallicity, but the relation is offset by 0.3 dex as compared to the local relation, with the same stellar mass galaxies having lower metallicity at (Erb et al., 2006c). Typical metallicities range from for galaxies with to for galaxies with .

As discussed in Section 2.1.1, ISM absorption lines are almost always blue-shifted with respect to the galaxy systemic redshift, and the Lyα emission line is always redshifted. These observed velocity offsets suggest that galaxy-scale outflows, with velocities of hundreds of km s$^{-1}$, are the norm in these star-forming galaxies.

Trainor & Steidel (2012) use the MultiDark simulation (Prada et al., 2011), together with a clustering analysis, to connect galaxies from KBSS to dark matter halos. They find that this type of galaxy resides in halos with masses above , with a median halo mass of (similar conclusions were reached by e.g. Adelberger et al., 2005b; Conroy et al., 2008). The corresponding virial radii are and , respectively, with circular velocities and .

### 2.2. QSO spectra

The typical resolution of the QSO spectra is R , and they were rebinned to pixels of . The spectra were reduced using T. Barlow’s MAKEE package and the continua were normalized using low-order spline fits. Further details about the QSO observations will be given in Steidel et al. (in preparation). The QSO redshift distribution can be seen in Figure 2 and a summary of the properties of the final spectra is given in Table 1.

The redshift range that we consider when studying Lyα absorption in the spectrum of a QSO at redshift $z_{\rm QSO}$ is given by Equation (1). The lower limit ensures that only Lyα absorption redwards of the QSO’s Lyβ emission line is considered, thus avoiding any confusion with the Lyβ forest from gas at higher redshifts. The upper limit is set to avoid contamination of the Lyα forest by material associated with the QSO and to avoid the QSO proximity effect. We verified that excluding 5,000 km s$^{-1}$ rather than 3,000 km s$^{-1}$ gives nearly identical results.

The median S/N in the Ly forest regions of the spectra ranges from 42 to 163 (Table 1).

A number of QSO spectra (5 out of 15) contain damped Lyman-α systems (DLAs) and sub-DLAs within the considered redshift range. We divided out the damping wings after fitting the Voigt profiles (see Rudie et al., 2012, for more details on this procedure). The saturated cores of the (sub-)DLAs were, however, flagged and excluded from the analysis, but this does not have a significant effect on our results.

## 3. Pixel Optical Depth

The goal of the pixel optical depth (POD) method is to use automatic algorithms (as opposed to manual fits) to find the best estimate of the optical depth of the H i Ly absorption line and those of various metal transitions as a function of redshift. Obtaining accurate pixel optical depths can be challenging due to the presence of noise, errors in the continuum fit, contamination and saturation.

It is more instructive to plot optical depths than normalized fluxes because the optical depth is proportional to the column density of the absorbing species, while the flux is exponentially sensitive to this density. Hence, a power-law column density profile translates into a power-law optical depth profile and a lognormal column density distribution becomes a lognormal optical depth distribution. The shapes of the profiles and distributions common in nature are therefore preserved, which is not the case if we work with the flux. Of course, if the spectral resolution is coarser than the typical line width, or if the wavelength coverage does not allow for the use of weaker components of multiplets to measure the optical depth of saturated lines, then the interpretation of the recovered optical depths is more complicated. Our spectral resolution is, however, sufficient to resolve the Ly lines and our wavelength coverage is sufficient to recover the optical depth of most saturated lines.

Another advantage of optical depths is that their dynamic range is unconstrained, whereas the normalized flux is confined to vary between zero and one. The large dynamic range does imply that one must use median rather than mean statistics, because the mean optical depth merely reflects the high optical depth tail of the distribution. While this problem could also be solved by using the mean logarithm of the optical depth, that would not resolve another issue resulting from the large intrinsic dynamic range: the observed optical depth distribution is cut off at the low absorption end by noise and at the high end by line saturation (and thus ultimately also by noise). However, errors in the tails of the distribution do not matter if we use median statistics (or other percentiles, provided they stay away from the noise).

The POD statistics method was introduced by Cowie & Songaila (1998) (see also Songaila, 1998), and further improved by Ellison et al. (2000), Schaye et al. (2000b), and Aguirre et al. (2002). The POD method used in this study is the one developed and tested by Aguirre et al. (2002), and is identical to that of Ellison et al. (2000) for the case of Ly.

The H i Lyα optical depth in each pixel is calculated from the normalized flux $F(\lambda)$,

$$\tau_{\rm Ly\alpha}(\lambda) = -\ln\left(F(\lambda)\right). \qquad (4)$$

Pixels with (which can occur in the presence of noise) are assigned an optical depth of , which is smaller than any recovered optical depth.

Pixels that have are considered saturated, where is the rms noise amplitude at the given pixel and is a parameter that we set to 3 (see Aguirre, Schaye & Theuns, 2002 for more details). One advantage of the POD method is that it is easy to recover a good estimate of the Ly optical depth in these saturated pixels by using the available higher order Lyman lines. The recovered Ly optical depth is then given by

$$\tau^{\rm rec}_{\rm Ly\alpha} = \min\left\{\frac{\tau_{{\rm Ly}n}\, f_{\rm Ly\alpha}\, \lambda_{\rm Ly\alpha}}{f_{{\rm Ly}n}\, \lambda_{{\rm Ly}n}}\right\}, \qquad (5)$$

where $f_{{\rm Ly}n}$ is the oscillator strength of the $n$th order Lyman line and $\lambda_{{\rm Ly}n}$ is its rest wavelength ($n=1$ corresponds to Lyα, $n=2$ to Lyβ, etc.). Taking the minimum optical depth (Equation 5) minimizes the effect of contamination by other lines. Higher order lines used for Lyα optical depth recovery are those that lie in the wavelength range of the spectrum, and for which , where is the noise at . If none of the available higher order lines satisfy this criterion, or if none of the higher order lines are available, then the pixel is set to a high optical depth (we use ). An example of the recovered optical depth in one QSO spectrum is shown in Figure 3, together with the positions of galaxies in the same field.
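The recovery procedure for a single Lyα pixel can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of Aguirre et al. (2002): the flag values `TAU_MIN` and `TAU_MAX` are placeholders, the saturation test (flux not exceeding three times the rms noise) and the acceptance test for higher order pixels are assumed forms, and the oscillator strengths are standard atomic data that do not appear in the text.

```python
import math

TAU_MIN = 1e-5  # assumed flag value for pixels with non-positive optical depth
TAU_MAX = 1e4   # assumed flag value for saturated pixels that cannot be recovered
N_SIGMA = 3.0   # saturation parameter; the text sets it to 3

# Standard atomic data (assumed here): oscillator strength and rest wavelength in Angstrom.
F_LYA, LAMBDA_LYA = 0.4164, 1215.67
F_LYB, LAMBDA_LYB = 0.07912, 1025.72


def scale_to_lya(f_n, lambda_n):
    """Factor converting an optical depth in Lyman line n to its Lya equivalent."""
    return (F_LYA * LAMBDA_LYA) / (f_n * lambda_n)


def recover_tau_lya(flux, sigma, higher_order=()):
    """Return the recovered Lya optical depth for one pixel.

    flux, sigma  : normalized flux and rms noise at the Lya pixel
    higher_order : iterable of (flux_n, sigma_n, scale_n), one entry per
                   available higher order Lyman line at the same redshift
    """
    if flux > N_SIGMA * sigma:
        # Unsaturated pixel: direct conversion, flagging non-positive values.
        tau = -math.log(flux)
        return tau if tau > 0.0 else TAU_MIN
    # Saturated pixel: use the minimum of the scaled higher order optical depths.
    candidates = []
    for flux_n, sigma_n, scale_n in higher_order:
        # Assumed acceptance test: the higher order pixel is itself neither
        # saturated nor consistent with zero absorption.
        if N_SIGMA * sigma_n < flux_n < 1.0 - N_SIGMA * sigma_n:
            candidates.append(-math.log(flux_n) * scale_n)
    return min(candidates) if candidates else TAU_MAX


if __name__ == "__main__":
    lyb = (0.5, 0.01, scale_to_lya(F_LYB, LAMBDA_LYB))
    print(recover_tau_lya(0.001, 0.01, [lyb]))  # saturated in Lya, recovered from Lyb
```

Taking the minimum over the candidates mirrors Equation (5): a higher order pixel contaminated by an unrelated line can only overestimate the optical depth, so the smallest estimate is the least contaminated one.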

Setting pixels to and allows them to correctly influence the median and other percentiles. The actual values they are set to are unimportant as long as for the percentile of interest. In other words, as long as the or are on the same side of a given percentile as the true optical depth that they are replacing, their actual values are unimportant.
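A small numerical experiment illustrates why the flag values do not matter for the median. The sketch below draws mock optical depths from a lognormal distribution, replaces everything below an assumed noise floor and above an assumed saturation ceiling with flag values, and compares the median and mean before and after. All numbers (distribution parameters, limits, flag values) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def censor(taus, tau_floor, tau_ceiling, tau_min=1e-5, tau_max=1e4):
    """Replace values outside the measurable range by flag values."""
    censored = []
    for tau in taus:
        if tau < tau_floor:
            censored.append(tau_min)   # lost in the noise
        elif tau > tau_ceiling:
            censored.append(tau_max)   # saturated and unrecoverable
        else:
            censored.append(tau)
    return censored


rng = random.Random(42)
# Mock "true" optical depths; an odd sample size makes the median a sample member.
true_tau = [rng.lognormvariate(-1.5, 1.5) for _ in range(10001)]
observed = censor(true_tau, tau_floor=0.01, tau_ceiling=4.0)

print("median:", statistics.median(true_tau), statistics.median(observed))
print("mean:  ", statistics.mean(true_tau), statistics.mean(observed))
```

Because every flagged value stays on the same side of the median as the value it replaces, the two medians agree, whereas the mean is dominated by the arbitrary choice of the upper flag value.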

We use the POD method because: a) It is fast, robust, and automatic, which means it can deal with large amounts of data, both observed and simulated; b) It makes it straightforward to exploit the full dynamic range measurable from our spectra; c) The fact that the optical depth is directly proportional to the column density of neutral hydrogen makes it easy to interpret the results (see §4.5). While the POD method has clear advantages, it is important to note that it does not replace the more traditional method of decomposing the absorption spectra into Voigt profile components. For example, in its simplest form, the POD method does not make direct use of information about the line widths, which are clearly of interest as they contain important information about the temperature and the small-scale velocity structure of the absorbing gas. An analysis of the KBSS data based on Voigt profile decompositions is presented in Rudie et al. (2012).

## 4. Lyα absorption near galaxies

After recovering the optical depths in each pixel of a QSO spectrum, we can build a map of the galaxies’ average surroundings, as a function of the transverse distance from the line of sight (LOS), i.e. the impact parameter, and the velocity separation from the galaxy along the LOS. We sometimes convert velocity separation into distance by assuming that it is due to the Hubble flow. However, as we will show in Sections 4.1 and 4.2, this is a poor approximation at small velocity differences from galaxies. We will refer to distances computed under the assumption of zero peculiar velocities (and zero redshift errors) as “Hubble distances”.

Different galaxies and QSO fields are combined, meaning that the median POD for a given impact parameter and velocity difference is estimated over all galaxies irrespective of which field they come from, without applying any weighting. Each galaxy therefore provides an array of pixels with varying velocity difference but fixed impact parameter.
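The galaxy-centered stacking described above can be sketched as follows. This is a simplified illustration, not the analysis code: the cosmological parameters are assumed round values used only to convert velocities to Hubble distances, the data layout (`galaxies`, `sightlines`) is hypothetical, and only absolute LOS separations are considered.

```python
import math
import statistics
from collections import defaultdict

C_KMS = 299792.458
# Assumed, illustrative flat LCDM parameters (not taken from the text).
H0_KMS_MPC, OMEGA_M, OMEGA_L = 72.0, 0.26, 0.74


def hubble_kms_per_pmpc(z):
    """Hubble parameter at redshift z in km/s per proper Mpc."""
    return H0_KMS_MPC * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def los_hubble_distance_pmpc(z_pix, z_gal):
    """Absolute LOS separation assuming pure Hubble flow (no peculiar velocities)."""
    dv_kms = C_KMS * (z_pix - z_gal) / (1.0 + z_gal)
    return abs(dv_kms) / hubble_kms_per_pmpc(z_gal)


def median_pod_map(galaxies, sightlines, bin_pmpc=0.2, max_pmpc=2.0):
    """Unweighted median optical depth in (transverse, LOS) distance bins.

    galaxies   : list of (field, z_gal, impact_pmpc)
    sightlines : dict mapping field -> list of (z_pix, tau)
    Returns a dict mapping (i_transverse, i_los) -> median tau.
    """
    bins = defaultdict(list)
    for field, z_gal, impact in galaxies:
        if impact >= max_pmpc:
            continue
        i_trans = int(impact // bin_pmpc)
        # Each galaxy contributes pixels at fixed impact parameter.
        for z_pix, tau in sightlines[field]:
            d_los = los_hubble_distance_pmpc(z_pix, z_gal)
            if d_los < max_pmpc:
                bins[(i_trans, int(d_los // bin_pmpc))].append(tau)
    return {key: statistics.median(values) for key, values in bins.items()}
```

All pixels from all fields are pooled per bin before the median is taken, so a pixel is reused once for every galaxy that lies close enough to it, as in the text.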

For reference, we note that 0.2, 2.1, and 9.9 percent of the pixels in our sample are at 3-D Hubble distances smaller than 200, 500, and 1000 pkpc, respectively.

### 4.1. 2-D Maps of the median Lyα absorption

The left panel of Figure 4 shows the logarithm of the median as a function of the transverse and the LOS separations from galaxies. The distance bin size in this plot is 200 by 200 pkpc (which corresponds to along the LOS), and the image has been smoothed with a 2-D Gaussian with FWHM equal to the bin size. Note that we take a galaxy-centered approach in constructing this image. Each galaxy contributes a column of pixels whose position along the -axis corresponds to the galaxy’s impact parameter. A single pixel can be used multiple times: once for each galaxy whose separation from the pixel falls within the range plotted. This is the first published 2-D absorption map around galaxies. Studies by e.g. Adelberger et al. (2003); Ryan-Weber (2006); Wilman et al. (2007) and Shone et al. (2010) show cross-correlation maps of galaxies with Ly absorption systems, which is not equivalent. The map shows a strong correlation between the Ly absorption strength and the distance to the galaxies.

The right panel of Figure 4 shows the same data, after randomizing the galaxy redshifts. We randomize the galaxy redshifts within the Ly forest region of each QSO spectrum while keeping their impact parameters unchanged in order to preserve the number of pixels per galaxy. In this way we can estimate the magnitude of the fluctuations in the absence of correlations between the locations of galaxies and absorbers. We can see that the signal is lost, which implies that the features seen in the left panel are real. Finally, we note that because a single galaxy contributes a full column of pixels at its impact parameter, bins along the LOS are somewhat correlated. On the other hand, bins in the transverse direction are independent. In Appendix B we demonstrate that along the LOS the errors are significantly correlated for scales .

In Figure 4 we kept the positive and negative velocity differences between absorbers and galaxies separated, which gives insight into the amount of noise and sample variance. However, given that the Universe is statistically isotropic$^{1}$ (and our observations are consistent with this assumption), we can increase the S/N by considering only absolute velocity differences. Figure 5 shows a map of gas around galaxies by taking into account only absolute velocity separations. We use logarithmic axes, which brings out the small-scale anisotropy more clearly.

$^{1}$ The Universe is denser at higher redshift and therefore we expect more absorption at positive velocity differences in Figure 4 than at negative velocities, but this effect is negligible on the scales that we consider. The mean transmission varies by over at (e.g. Schaye et al., 2003), which is much smaller than the excess absorption in the vicinity of galaxies.

The most prominent feature is the region of strongly enhanced absorption that extends to pkpc in the transverse direction, but to pMpc () along the LOS. This redshift space distortion (called the “finger of God” effect in galaxy redshift surveys) could be due to redshift errors and/or peculiar velocities. A more subtle anisotropy is visible on large scales. In the transverse direction the correlation between absorption strength and galaxy position persists out to the maximum impact parameter we consider, 2 pMpc, whereas it becomes very weak beyond pMpc () along the LOS, as is most clearly visible in Figure 4. If real, such a feature would imply infall of gas on large scales, i.e. the Kaiser (1987) effect. Other scenarios are unlikely, as the signal distortion along the LOS can only be caused by redshift errors and peculiar velocities. However, redshift errors would act to spread the signal along the LOS rather than compressing it, which leaves peculiar velocities as the only other possibility. We will examine the significance of these anisotropies in the next section.

### 4.2. Redshift space distortions

Figure 6 shows “cuts” along the first 9 rows and columns of the 2-D map shown in Figure 5, spanning 0–2 pMpc. The red circles and black squares show the profiles along the transverse and LOS directions, respectively. As indicated in the insets, these cuts correspond, respectively, to the horizontal and vertical strips in Fig. 5. Observe that the transverse and LOS directions are identical for the data point in the panel, which corresponds to the intersection of the horizontal and vertical strips shown in the insets. In other words, where the horizontal and vertical strips meet, the data point is replicated as both a red and a black symbol. Note also that the red circle (black square) of the panel is identical to the black square (red circle) of the panel. The figure thus contains redundant information. For example, the LOS direction (i.e. black squares) in the panel shows all the data points appearing in the transverse direction (i.e. red circles) of the other panels. Similarly, the panel shows all the transverse data points, etc.

As LOS separations have been computed under the assumption of pure Hubble flow, any significant difference between the red and black curves must be due to redshift space distortions (assuming the Universe is isotropic in a statistical sense). By comparing the two curves in each panel, we can therefore identify redshift space distortions and assess their significance.

The error bars in Figure 6, as well as those in subsequent plots, were computed by bootstrapping the galaxy sample 1,000 times. That is, for each bootstrap realization we randomly select galaxies, where each galaxy can be selected multiple times, until the number of galaxies in the original set is reached. We calculate the results for each realization and the errors show the 1σ confidence interval. As demonstrated in Appendix B, along the LOS the errors are correlated for separations (i.e. black squares in each panel are correlated on these scales), but not in the transverse direction (i.e. red circles in each panel are independent).
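The bootstrap procedure can be sketched as below for a single bin of the map. This is a minimal illustration with hypothetical names: it resamples whole galaxies with replacement, so that all pixels contributed by one galaxy enter or leave a realization together, and it assumes that the 1σ interval is taken as the central 68.3 percent of the bootstrap medians.

```python
import random
import statistics


def bootstrap_median(per_galaxy_pixels, n_boot=1000, seed=0):
    """Median optical depth in one bin with a bootstrap confidence interval.

    per_galaxy_pixels : list of non-empty lists; entry i holds the optical
                        depths that galaxy i contributes to this bin
    Returns (median, lower, upper), where lower and upper bracket the
    central 68.3% of the bootstrap medians (assumed definition of 1 sigma).
    """
    rng = random.Random(seed)
    n_gal = len(per_galaxy_pixels)
    best = statistics.median([t for pix in per_galaxy_pixels for t in pix])

    medians = []
    for _ in range(n_boot):
        sample = []
        # Draw galaxies with replacement until the original number is reached.
        for _ in range(n_gal):
            sample.extend(per_galaxy_pixels[rng.randrange(n_gal)])
        medians.append(statistics.median(sample))

    medians.sort()
    lower = medians[int(0.1587 * (n_boot - 1))]
    upper = medians[int(round(0.8413 * (n_boot - 1)))]
    return best, lower, upper
```

Resampling galaxies rather than individual pixels matters because pixels contributed by the same galaxy are correlated, and resampling pixels independently would underestimate the errors.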

The number of galaxies per transverse bin depends slightly on the velocity difference, because galaxies separated by from the Ly forest region (as defined in section 2.2) will still contribute pixels to bins with velocity separations greater than . From small to large impact parameters, the number of galaxies contributing to the () velocity bin is 14 (16), 8 (8), 11 (12), 22 (23), 47 (48), 58 (62), 96 (99), 175 (180), 202 (211). Thus, the black squares in the first panel, as well as the first red circle in all panels, are based on 14 – 16 galaxies. Similarly, the black squares in the panel, as well as the last red circle in all panels, are based on 202 – 211 galaxies.

The first panel, which shows cuts near the axes of Figure 5, clearly shows that out to pMpc, which translates into along the LOS (see the top -axis), the absorption is stronger along the LOS (black squares) than in the transverse direction (red circles). In panels 2 – 7, which correspond to strips offset by 0.13 – 1.00 pMpc from the axes in Figure 5, the smearing in the LOS direction manifests itself as enhanced absorption at small impact parameters relative to the signal at small LOS separations. The confidence level associated with the detected discrepancy between the two directions, which we estimate as the fraction of bootstrap realizations in which the sign of the discrepancy is reversed, is at least 99% for each data point out to 1 pMpc (i.e. points 2–7 of the first panel).
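The confidence estimate described above can be expressed compactly. The sketch below is one possible reading of the procedure, not the authors' code: it assumes that the LOS and transverse medians are available for each bootstrap realization as paired lists, and that the confidence level is one minus the fraction of realizations in which the sign of the difference is opposite to the observed one. The function name and inputs are hypothetical.

```python
def discrepancy_confidence(observed_los, observed_trans, boot_los, boot_trans):
    """Confidence that the LOS and transverse signals differ in the observed sense.

    observed_los, observed_trans : medians measured from the full sample
    boot_los, boot_trans         : medians from each bootstrap realization
                                   (paired lists of equal, non-zero length)
    """
    if not boot_los or len(boot_los) != len(boot_trans):
        raise ValueError("bootstrap lists must be non-empty and of equal length")
    observed_diff = observed_los - observed_trans
    # Count realizations in which the sign of the discrepancy is reversed.
    reversed_count = sum(
        1 for los, trans in zip(boot_los, boot_trans)
        if (los - trans) * observed_diff < 0
    )
    return 1.0 - reversed_count / len(boot_los)
```

With 1,000 realizations, a confidence level of at least 99% corresponds to at most 10 realizations with a reversed sign.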

The differences on scales pMpc () can be explained by two effects. Firstly, we expect gas in and around galaxy halos to have peculiar velocities comparable to the circular velocity of the halos, i.e.  (and possibly significantly higher for outflowing gas). Secondly, as discussed in Section 2.1.1, there are random errors in the redshift measurements of for LRIS redshifts, and for the NIRSPEC subsample. These two effects smooth the signal in velocity space on the scale that is a result of the combination of these velocities. In Appendix A we show that redshift errors may be able to account for the observed smoothing. However, this does not prove the observed elongation along the LOS is due to redshift errors rather than peculiar velocities. In fact, since the elongation is more extended than the typical redshift errors ( vs. ), but similar to the expected circular velocities (), it is likely that the finger of God effect is dominated by small-scale peculiar velocity gradients due to virial motions, infall, and/or outflows. This would not be surprising considering the observations of large velocity fields in the gas in star-forming galaxies (e.g. Steidel et al., 2010). Rudie et al. (2012) use the KBSS sample to study Ly absorption near galaxies by decomposing the forest lines into Voigt profiles, and they also detect the finger of God effect at small impact parameters. After presenting a more detailed analysis of the impact of redshift errors, they conclude that redshift errors alone cannot account for the detected elongation of the absorption signal along the LOS, which strengthens our conclusion. It is also worth noting that even if redshift errors were solely responsible for the finger of God effect, we can still safely conclude that the detected peculiar motions of Ly absorbing gas near galaxies do not exceed . 
However, this does not mean that such gas does not reach higher velocities very close to galaxies, as the smallest impact parameter in the KBSS sample is pkpc.

On the other hand, at distances  pMpc the situation is reversed: the absorption is compressed in the LOS direction. For impact parameters 1.42 – 2 pMpc and LOS separations 0– 0.71 pMpc (0 – 165 ; (i.e. last) red circles in panels 1–6) the absorption is stronger than for impact parameters 0– 0.71 pMpc and LOS separations 1.42 – 2 pMpc ( black square in panels 1–6). The same information is collected in the last panel, which corresponds to strips separated by 1.42 – 2.00 pMpc from the axes. The first six black squares are higher than the corresponding red circles, which implies that the absorption is enhanced along the LOS relative to the direction transverse to the LOS. Observe that this enhancement is absent from all other panels. The confidence level with which this discrepancy is detected is, from small to large scales, 99.4%, 77.9%, 83.2%, %, 91.6% and 82.6% for points 1–6, respectively. As mentioned above, these confidence levels represent the fraction of bootstrap realizations in which the sign of the discrepancy is the same as for the original sample.

In order to estimate the significance of the compression along the LOS, we combine different measurements of the difference between pairs of points. For each pair of points i (points 1–6 in the last panel of Figure 6), we first compute the difference between median optical depths, , where and are the observed values, and the associated error , where and are errors on measurements and . After that we compute the weighted average difference for the 6 pairs of points, , and the corresponding error, . The significance is then (assuming Gaussian errors). This procedure suggests that the compression along the LOS is detected at 3.5.
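A minimal Python sketch of this combination is given below. It assumes the standard choices: the error on each difference is the quadrature sum of the two measurement errors, the average is inverse-variance weighted, and the error on the weighted average is the inverse square root of the summed weights. The function and argument names are hypothetical.

```python
import math


def combined_significance(a, a_err, b, b_err):
    """Combine pairwise differences a_i - b_i into one weighted mean.

    Assumptions (for illustration): independent Gaussian errors,
    sigma_i^2 = a_err_i^2 + b_err_i^2, weights w_i = 1 / sigma_i^2.
    Returns (weighted mean difference, its error, significance in sigma).
    """
    diffs = [x - y for x, y in zip(a, b)]
    errs = [math.hypot(ex, ey) for ex, ey in zip(a_err, b_err)]
    weights = [1.0 / e ** 2 for e in errs]
    w_sum = sum(weights)
    mean = sum(w * d for w, d in zip(weights, diffs)) / w_sum
    err = 1.0 / math.sqrt(w_sum)
    return mean, err, mean / err
```

If the pairs are correlated, as neighbouring LOS bins can be, this estimate of the significance would be optimistic.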

This compression along the LOS can be explained by large-scale infall, i.e. the Kaiser effect. As the absorbing gas must be cool, i.e. , to be visible in HI, this form of gas accretion could be called “cold accretion” (e.g. Kereš et al., 2005), although we note that this term is most often used in the context of cold streams within the virial radii of halos hosting galaxies. We conclude that we have an unambiguous and highly significant detection of cool gas falling towards star-forming galaxies at . The large-scale infall of gas onto haloes and its impact on the absorption signal has been a topic of several theoretical studies (e.g. Kollmeier et al., 2003; Barkana, 2004; Kim & Croft, 2008), and is also investigated in Rakic et al. (in preparation) where we will compare the observational results with cosmological hydrodynamical simulations.

The dashed, horizontal lines in Figure 6 indicate the median Ly optical depth of all pixels in the Ly forest regions of the QSO spectra. In the transverse direction we do not probe sufficiently large distances to see the signal disappear: for all but the last panel the red curves stay above the dashed lines out to impact parameters of 2 pMpc. In the LOS direction (black curves) we do see convergence for separations  pMpc. The last black square in the first panel is an outlier, but note that it is based on only 16 galaxies, whereas the same points in panels 5-9 are based on 49–217 galaxies. Indeed, according to the redshift randomization method described in the next section, the last black square in the first panel is only a 1.8 outlier, whereas the last red circles of panels 1-6 (or, equivalently, the first 6 black squares in the last panel) represent detections of excess absorption with significance varying between 2.6 and .

The median log() and 1 confidence intervals from Figure 5 are tabulated in Table 2.

### 4.3. Lyα absorption as a function of 3-D Hubble distance

Figure 7 shows the median optical depth in radial bins around the galaxy positions, where we assumed that velocity differences between absorbers and galaxies are entirely due to the Hubble flow. We emphasize that Figures 5 and 6 show that this is not a good approximation, particularly for distances pMpc. It does, however, provide us with a compact way to present a lot of information.

The 3-D Hubble distance, as we call it, is therefore just , where is the galaxy’s impact parameter, is the Hubble parameter, and is the velocity separation between an absorber and the galaxy. As mentioned in Section 2.1, we use only galaxies with impact parameters smaller than 2 pMpc, even when making Figure 7. Hence, distances pMpc reflect mostly LOS separations.
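For concreteness, the following Python sketch computes such a distance, assuming it is the quadrature sum of the impact parameter and the LOS Hubble distance (velocity separation divided by the Hubble parameter). The value of the Hubble parameter is not taken from a cosmological calculation: it is inferred from the correspondence between 0.71 pMpc and a velocity of 165 used elsewhere in the text, with the velocity unit assumed to be km/s. All names are hypothetical.

```python
import math

# Assumed conversion: a velocity of 165 km/s corresponds to 0.71 pMpc.
H_KMS_PER_PMPC = 165.0 / 0.71


def hubble_distance_3d(b_pmpc, dv_kms, hubble=H_KMS_PER_PMPC):
    """3-D Hubble distance in pMpc, assuming pure Hubble flow along the LOS."""
    los_pmpc = abs(dv_kms) / hubble
    return math.hypot(b_pmpc, los_pmpc)
```

For a galaxy at an impact parameter of 0.71 pMpc, a pixel offset by 165 km/s would lie at about 1.0 pMpc in this measure.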

The horizontal dashed line shows the median level of absorption in the Ly forest pixels. The significance of the excess absorption can be estimated by comparing the error bars, which indicate the confidence intervals determined by bootstrap resampling the galaxies, to the difference between the data points and the horizontal dashed line. More precisely, we can estimate the confidence level associated with the detection of excess absorption as , where is the fraction of 1,000 bootstrap realizations for which the data point falls below the dashed line. This method indicates that within 2.8 pMpc excess absorption is detected with greater than 99.7% significance (i.e. ). For 2.8 – 4.0 pMpc the significance is 87% (i.e. ), while the absorption is consistent with random beyond 4 pMpc.

The dotted curve and the right -axis show the number of galaxies contributing to each bin. Since the inner few bins contain only a few tens of galaxies each (14 for the first bin), the bootstrap errors may not be reliable for these bins. The significance of the excess of absorption can be estimated more robustly by making use of the fact that each QSO spectrum provides many independent spectral regions at the impact parameter of each galaxy. We can do this by comparing the excess absorption to the grey region, which indicates the detection threshold and which was determined by re-measuring the median after randomizing the galaxy redshifts (while keeping the impact parameters fixed). The grey shaded region shows the confidence interval obtained after doing this 1,000 times. For each distance bin, the confidence level of the detection is then given by , where is the fraction of realizations resulting in a median optical depth that is higher than actually observed. In agreement with the errors estimated by bootstrap resampling the galaxies, we find that the significance is % within 2.8 pMpc and that there is no evidence for excess absorption beyond 4 pMpc. For 2.8–4.0 pMpc the significance of the detection is, however, larger than before: 99.2% (i.e. ). We conclude that the absorption is significantly enhanced out to at least 3 pMpc proper, which is  cMpc.
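The redshift randomization test can be illustrated with the simplified Python sketch below. It treats a single spectrum and a single distance bin, draws random redshifts uniformly over the pixel redshift range, and takes the confidence level to be one minus the fraction of realizations with a higher median than observed. The data layout and all names are assumptions made for the example.

```python
import random
import statistics


def median_tau_near(z_gals, z_pix, tau_pix, half_width):
    """Median optical depth of pixels within half_width (in redshift) of any galaxy."""
    selected = [
        t for z, t in zip(z_pix, tau_pix)
        if any(abs(z - zg) <= half_width for zg in z_gals)
    ]
    return statistics.median(selected) if selected else float("nan")


def randomization_confidence(z_gals, z_pix, tau_pix, half_width,
                             n_real=1000, seed=0):
    """Confidence level of excess absorption from randomized galaxy redshifts."""
    rng = random.Random(seed)
    observed = median_tau_near(z_gals, z_pix, tau_pix, half_width)
    z_min, z_max = min(z_pix), max(z_pix)
    n_higher = 0
    for _ in range(n_real):
        # Impact parameters are untouched; only the redshifts are redrawn.
        z_rand = [rng.uniform(z_min, z_max) for _ in z_gals]
        if median_tau_near(z_rand, z_pix, tau_pix, half_width) > observed:
            n_higher += 1
    return 1.0 - n_higher / n_real
```

Because every random redshift samples a different part of the same spectra, the test does not depend on how many galaxies fall in the bin, which is why it is more robust than bootstrapping for sparsely populated bins.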

The fact that the absorption is enhanced out to several pMpc is in good agreement with Adelberger et al. (2005), who measured the mean flux as a function of 3-D Hubble distance. However, the profile measured by Adelberger et al. (2005) is much flatter. Converting their data into optical depths, they measure in their innermost bin, which extends to about 200 pkpc. This is about an order of magnitude lower than our median recovered optical depth at this distance. Conversely, at large distances their mean flux asymptotes to 0.765, or , which is much higher than our asymptotic median optical depth of , even though we measure a similar mean flux of 0.806, or . Thus, our dynamic range is about two orders of magnitude larger than that of Adelberger et al. (2005). This difference arises because we use median optical depth rather than mean flux statistics and because we use higher order Lyman lines to recover the optical depth in saturated lines.

The dashed curves show the 15.9% and 84.1% percentiles, indicating the scatter in the PODs (which is obviously much larger than the error in the median). It is important to note that, except on the smallest scales ( pkpc), the scatter is similar to or larger than the median excess absorption. Hence, there will be a wide range of PODs for all separations probed here.

Finally, the blue line shows the best-fit power-law through the data points,

$$\mathrm{median}(\log_{10}\tau_{\rm Ly\alpha}) = (0.32 \pm 0.08)\, d_{\rm 3D}^{\,-0.92 \pm 0.17} - 1.27, \qquad (6)$$

where we required the fit to asymptote to the median of all pixels (horizontal, dashed line in Figure 7).
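The best-fit relation of Equation (6) is simple to evaluate numerically. The sketch below uses the central values only and assumes that the 3-D Hubble distance is expressed in pMpc; the function name is hypothetical.

```python
def median_log10_tau_fit(d3d_pmpc, amplitude=0.32, slope=-0.92, floor=-1.27):
    """Central best-fit value of median(log10 tau) from Equation (6).

    d3d_pmpc: 3-D Hubble distance, assumed here to be in pMpc.
    floor: asymptotic value, i.e. the median over all forest pixels.
    """
    return amplitude * d3d_pmpc ** slope + floor
```

At 1 pMpc the fit gives a median log optical depth of -0.95, and at very large distances it approaches the floor of -1.27 by construction.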

#### 4.3.1 Testing the robustness

In this section we will show that the results we presented (2-D and 3-D absorption profiles) are robust to changes in the S/N ratio of the QSO spectra, to the omission of NIRSPEC or non-NIRSPEC redshifts, to the exact redshift calibration, and, for optical depths , to the use of higher order Lyman lines. We have chosen to demonstrate this using the plots of median absorption versus 3-D Hubble distance, because these offer a compact summary of the data.

In the left panel of Figure 8 we compare the median Ly absorption as a function of 3-D Hubble distance for the lower and higher S/N subsamples of the QSO spectra (with median S/N ratios of and respectively). It appears that better data yields slightly more absorption at 3-D Hubble distances of  pMpc, but the differences are not significant.

The middle panel of Figure 8 compares the subset of 71 galaxies for which redshifts have been measured from nebular emission lines using the NIRSPEC instrument (red circles) with the default sample (grey curve) as well as with the result obtained when we ignore the NIRSPEC redshifts and instead use Lyα emission and/or interstellar absorption redshifts measured from LRIS spectra for all galaxies (black squares). As discussed in Section 2.1.1 and Appendix A, the redshifts estimated from the NIRSPEC spectra have errors of , while the redshifts estimated from LRIS spectra typically have . In Figure 8 we see that the signal appears to drop slightly more steeply for the NIRSPEC subsample, as would be expected given the smaller redshift errors, but both the NIRSPEC and pure-LRIS samples are consistent with the default sample.

We note that for Ly emission and interstellar absorption we also tried using the redshift calibrations from Adelberger et al. (2005) and Steidel et al. (2010) instead of the one from Rakic et al. (2011). For Rakic et al. (2011) the signal tends to be slightly stronger and the bootstrap errors slightly smaller, but the differences are small compared with the errors (not shown).

One of the advantages of the POD method is the possibility to recover the optical depth in the saturated Ly pixels by using higher order Lyman lines. The right panel of Figure 8 shows the effect of omitting this feature of the POD method. The two curves are nearly identical except for the first two bins,  pMpc, where the median optical depths increase from when we make use of higher order lines to when we do not. The latter value is not meaningful as it is the optical depth that we assign to pixels for which saturation prevents recovery of the optical depth (see §3). Without higher order lines, we cannot constrain the flux to be much smaller than the S/N ratio, which in our case corresponds to optical depths of about 4–5.

Hence, measuring the median optical depth in the circumgalactic region requires the use of higher order lines, but the recovery of the optical depth in saturated pixels appears to be unimportant at large distances. This is, however, only true if we restrict ourselves to median statistics. As we will see in Sections 4.6.2 and 4.6.3, higher order lines are also crucial at large distances if we are interested in the PDF of pixel optical depths.

### 4.4. Lyα absorption as a function of transverse distance

Figure 5 shows that the absorption signal is smeared out over  pMpc (i.e. ) in the LOS direction. Indeed, for all panels of Figure 6 the first 6 bins along the LOS (black squares), which correspond to velocities , are consistent with each other at the level. This velocity is slightly larger than the expected redshift errors and similar to the expected circular velocities of the halos hosting our galaxies. Because there is no evidence for structure in the velocity direction on scales and because the errors are strongly correlated for smaller velocity differences (see Appendix B), it makes sense to group these first 6 LOS bins together and measure the absorption as a function of transverse distance. The result is shown as the blue squares in Figure 9 (all data from this figure are tabulated in Table 3), which also shows the 3-D Hubble distance results for comparison (grey circles).

Estimating the errors by bootstrapping galaxies, we find that excess absorption is detected with confidence over the full range of impact parameters. Using the more robust method of randomizing galaxy redshifts, we find that the significance is at least for all but the second point (0.13–0.18 pMpc), which is based on only 8 galaxies and for which the significance is only .

As expected, the Hubble and transverse results converge at large distances, where redshift space distortions are small (note that in the last bin the absorption is slightly stronger in the transverse direction; this is a weaker version of the distortion that we attributed to the Kaiser effect when we compared the transverse and LOS directions, see Fig. 6). They also agree at very small distances (first bin;  pMpc), because a small 3-D Hubble distance implies that the transverse distance must also be small (note that the reverse is not true since galaxies with small impact parameters contribute pixels with large LOS separations). On intermediate scales, however, the two measures of distance yield significantly different results, with the absorption at fixed 3-D Hubble distance being stronger than that at the same fixed transverse distance. In particular, while the transverse direction shows a rapid fall off at pMpc followed by a constant excess absorption out to 2 pMpc, the absorption decreases smoothly with 3-D Hubble distance.

Because of redshift space distortions, it is not possible to measure the absorption as a function of the true, 3-D distance. However, we expect the true result to be in between the transverse and Hubble results shown in Figure 9. The 3-D Hubble distance may overestimate the true distance on intermediate scales, because it assumes all velocity differences to be due to the Hubble flow, whereas in reality redshift errors and peculiar velocities will contribute. Note, however, that infall of the right magnitude could lead the 3-D Hubble distances to underestimate the true distances. On the other hand, because we group LOS separations of , the transverse distance will typically underestimate the true distance on intermediate scales because it implicitly assumes that the contribution of the Hubble flow is negligible up to velocity differences of .

However, we chose to average over this velocity difference with good reason: within the signal is independent of velocity (see Fig. 6) and the errors are strongly correlated (Appendix B), which suggests that the velocities are not dominated by the Hubble flow. We therefore expect the transverse results to be closer to the truth, i.e. to better represent the real space absorption profile, and the excess absorption for 3-D Hubble distances  pkpc to be due mostly to the inclusion of absorption around galaxies with smaller impact parameters. Indeed, we find that around these distances the scatter in the PODs is much greater if we bin in terms of 3-D Hubble distance than if we bin in terms of transverse distance (even though a velocity interval of corresponds to a relatively large Hubble distance of 710 pkpc).

#### 4.4.1 Scatter

We have so far focused on the median absorption. However, the dashed curves in Figure 7 demonstrate that there is a large degree of scatter in the distribution of PODs at a fixed distance. In this section we will therefore investigate how the median absorption around individual galaxies varies from galaxy to galaxy. (We could also have studied the distribution of individual PODs, but we prefer to consider only one data point per galaxy (i.e. the median POD within 165 ) because galaxies are independent, whereas pixels are not.)

Figure 10 shows histograms of the distribution of the median optical depth in velocity intervals of centered on galaxies. The different panels correspond to different impact parameters. The -axis has as its lower limit. This indicates pixels set to , because they had (see Section 3 for details). We expect the true optical depths of these pixels to be similar to the inverse of the S/N ratio, which is . The label “saturated” at the high-absorption end of the -axis is for saturated pixels whose optical depth could not be recovered using higher order Lyman lines (because they are either also saturated or unavailable). These pixels do not necessarily have optical depths that are higher than the highest recovered values (), because higher order lines can be either contaminated or unavailable. In that case we cannot constrain the flux to be smaller than the S/N ratio, which corresponds to optical depths of about 4–5.

Comparing the different panels with each other, it seems that the distribution is shifted to higher optical depths for  pMpc, but that the results are similar for impact parameters  pMpc. Indeed, Kolmogorov-Smirnov (K-S) tests show that, except for the first panel, panels 2–8 are consistent with the last panel (at the level). For  pMpc, however, the distribution of median optical depths differs at the 99.98% confidence level (). These results are consistent with our findings for the medians of the distributions (Fig. 9, blue squares).

Ignoring the outliers for which the median optical depth is saturated or , the scatter is about 0.75 dex. The median absorption near galaxies with a fixed impact parameter is thus highly variable, suggesting that the gas is clumpy. This is true at all impact parameters and therefore not a distinguishing feature of the circumgalactic medium.

The blue curve, which is repeated in every panel, indicates the distribution after randomizing the galaxy redshifts. It therefore gives the distribution that we would expect to measure if the absorption were uncorrelated with galaxy positions. Using the K-S test to compare each histogram with the blue curve, we find that except for the second panel (which contains only 8 galaxies), all are discrepant at the 2 to 6 level. For  pMpc the excess absorption appears small, but the difference is in each case significant at the level. For the second panel ( pMpc) the significance of the detection of excess absorption is . These results are in near perfect agreement with those based on median statistics. Thus, examination of the full distribution of median optical depths around galaxies confirms the result obtained for the medians. There is a sharp drop in the absorption around  pkpc, which is similar to the virial radius, followed by a near constant, small, but highly significant, excess absorption out to transverse separations of at least 2 pMpc.

### 4.5. Interpreting PODs

The central optical depth is related to the absorbing gas column density, , through the following approximate relation:

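$$\tau_0 \approx \left(\frac{N}{3.43\times10^{13}\,{\rm cm}^{-2}}\right)\left(\frac{f}{0.4164}\right)\left(\frac{\lambda_0}{1215.67\,\mathrm{\AA}}\right)\left(\frac{b_{\rm D}}{26\,{\rm km\,s^{-1}}}\right)^{-1} \qquad (7)$$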

where is the oscillator strength, is the transition’s rest wavelength, is the line width, and is the line of sight velocity dispersion (e.g. Padmanabhan, 2002, §9.5). In our study we use statistics based on either median optical depths or (in §4.6.3) on the maximum optical depth within a given distance range from a galaxy. Measurements based on the maximum optical depth in a given region are relatively easy to interpret using Equation (7). However, given that most of our analysis is based on median optical depths, all the conversions that we make to column densities of the absorbing gas are likely to be underestimates if most of the lines are not blended, and could be overestimates in regions where the lines are highly blended. For example, if there is a single line in a given velocity interval, the column density estimated from the median optical depth would be lower than the real column density of such a system. If, on the other hand, there is one strong line and a number of weaker lines that are blended with it, then the column density inferred from the median optical depth could be higher than the median column density of individual absorption systems.
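Equation (7) can be inverted to turn a central optical depth into a column density. The Python sketch below does this with the Lyα normalizations of Equation (7) as defaults; the function and parameter names are hypothetical, and the caveats about applying the relation to median optical depths apply unchanged.

```python
def column_density_from_tau0(tau0, b_kms=26.0, f_osc=0.4164, lambda0_A=1215.67):
    """Column density in cm^-2 from a central optical depth, inverting Equation (7).

    Defaults are the Lya normalizations that appear in Equation (7).
    """
    scale = (f_osc / 0.4164) * (lambda0_A / 1215.67) / (b_kms / 26.0)
    return 3.43e13 * tau0 / scale
```

With the default values, a central optical depth of unity corresponds to a column density of 3.43e13 cm^-2, and doubling the line width doubles the inferred column density at fixed optical depth.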

The left panel of Figure 11 shows the median neutral hydrogen column density as a function of both transverse distance for LOS velocities (solid curve) and 3-D Hubble distance (dashed curve), which we obtained from Figure 9 and equation (7), using the typical line width of measured for our sample by Rudie et al. (2012). The median column density decreases from at  pkpc to at  pMpc, which is in excellent agreement with results from Rudie et al. (2012) based on Voigt profile decompositions. Note that if we had selected the strongest system within a given distance of each galaxy, instead of taking into account all pixels within that distance, the value of the median column density (i.e., the median of the maximum column density) would have been significantly higher.

To gain intuition about what the observed absorption represents, we will convert the optical depths into overdensities using two approximations. Combining Equation (7) with Equation (10) of Schaye (2001), who treats Ly absorbers as gravitationally confined gas clouds with sizes of order the local Jeans length, we obtain:

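$$\Delta \approx 2.1\,\tau_{0,{\rm Ly}\alpha}^{2/3}\,\Gamma_{12}^{2/3}\left(\frac{1+z}{3.36}\right)^{-3}\left(\frac{T}{2\times10^{4}\,{\rm K}}\right)^{0.17}\left(\frac{f_{\rm g}}{0.162}\right)^{-1/3}\left(\frac{b}{26\,{\rm km\,s^{-1}}}\right)^{2/3} \qquad (8)$$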

where is the density of gas in units of the mean baryon density of the Universe, is the photo-ionization rate in units of , and denotes the fraction of the cloud mass in gas. We have assumed a temperature typical for the moderately overdense IGM (e.g. Schaye et al., 2000a; Lidz et al., 2010; Becker et al., 2011), a line width consistent with the median value measured by Rudie et al. (2012) for our data, and a photo-ionization rate appropriate for ionization by the ultraviolet background radiation (e.g. Bolton et al., 2005; Faucher-Giguère et al., 2008). In collapsed gas clouds could be close to unity, but far away from galaxies gas will not be in dense clumps and should be close to its universal value of . Note that these densities are effectively evaluated on the local Jeans scale, which is typically for the densities of interest here (Schaye, 2001).

The above expression is a good approximation for overdense absorbers. Absorbers with densities around or below the cosmic mean have not had sufficient time to reach local hydrostatic equilibrium (Schaye, 2001) and will be better described by the fluctuating Gunn-Peterson approximation (e.g. Rauch et al., 1997), which assumes smoothly varying density fluctuations and pure Hubble flow, yielding

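$$\Delta \approx 2.02\,\tau_{\rm Ly\alpha}^{1/2}\,\Gamma_{12}^{1/2}\left(\frac{1+z}{3.36}\right)^{-9/4}\left(\frac{T}{2\times10^{4}\,{\rm K}}\right)^{0.38}. \qquad (9)$$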

Both equations (8) and (9) assume primordial abundances, highly ionized gas, and photo-ionization equilibrium. However, close to galaxies, UV radiation from local sources may dominate over the background and the gas may be shock-heated to temperatures sufficiently high for collisional ionization to dominate. Both of these effects would cause us to underestimate the gas density, possibly by a large factor.
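The two conversions can be compared directly with a short Python sketch of Equations (8) and (9). The default arguments are simply the pivot values at which each bracket in the equations equals unity (including a photo-ionization rate of one in the adopted units), so they are a convenience for the example rather than the parameter values adopted in the analysis. Function names are hypothetical.

```python
def overdensity_jeans(tau0, gamma12=1.0, z=2.36, temp_k=2.0e4,
                      f_gas=0.162, b_kms=26.0):
    """Overdensity from Equation (8), the Jeans-scale approximation."""
    return (2.1 * tau0 ** (2.0 / 3.0) * gamma12 ** (2.0 / 3.0)
            * ((1.0 + z) / 3.36) ** -3
            * (temp_k / 2.0e4) ** 0.17
            * (f_gas / 0.162) ** (-1.0 / 3.0)
            * (b_kms / 26.0) ** (2.0 / 3.0))


def overdensity_fgpa(tau, gamma12=1.0, z=2.36, temp_k=2.0e4):
    """Overdensity from Equation (9), the fluctuating Gunn-Peterson approximation."""
    return (2.02 * tau ** 0.5 * gamma12 ** 0.5
            * ((1.0 + z) / 3.36) ** -2.25
            * (temp_k / 2.0e4) ** 0.38)
```

At the pivot values the two approximations nearly coincide for an optical depth of unity (2.1 versus 2.02), while the steeper dependence on optical depth in Equation (8) makes the Jeans estimate larger for strongly absorbing gas.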

The middle and right panels of Figure 11 show the median overdensity profiles obtained after applying the above equations to the median optical depth measured as a function of 3-D Hubble and transverse distance (for LOS separations ), respectively. The “Jeans” and fluctuating Gunn-Peterson approximation are generally in good agreement, although the former yields steeper density profiles at small distances, where the gas is highly overdense. However, in this regime we do not expect the fluctuating Gunn-Peterson approximation to hold. The dashed curve shows the result if we convert the recovered optical depths into overdensities using the fit to hydrodynamical simulations given in Aguirre et al. (2002) for . The relation provided by those authors was obtained by producing mock Ly absorption spectra for sight lines through a hydrodynamical simulation, scaling the simulated spectra to fit the observed mean flux decrement, and then fitting the relation between the Ly optical depth-weighted overdensity of gas responsible for the absorption in each pixel and the recovered Ly POD in that pixel. The agreement between the hydrodynamical simulation and the Jeans method is clearly excellent.

We conclude that typical gas overdensities decrease from at distances  pkpc, which is similar to the virial radii of the halos hosting our galaxies, to at  pMpc. The steepness of the drop at intermediate scales is uncertain due to redshift space distortions, but it is likely to be bracketed by the middle and right panels of Figure 11. Observe that on large scales the overdensity asymptotes to values smaller than unity. This is expected, as the median optical depth is a volume-weighted measure rather than a mass-weighted measure. While the mass-weighted mean overdensity is by definition unity, the volume-weighted mean overdensity is smaller because underdense regions dominate the volume. As most of the pathlength in the Lyα forest is across voids, the median density inferred from the Lyα forest is less than 1. The difference between volume and mass-weighted quantities probably also accounts for the relatively low overdensity measured around the virial radius ( pMpc).

### 4.6. Circumgalactic matter

The smallest impact parameter in our sample is pkpc. It is difficult to probe scales smaller than this with QSO-galaxy pairs owing to the small number of bright QSOs and the rarity of close pairs, as well as to the difficulty of observing objects right in front of a bright QSO. For such small scales it is therefore more efficient to resort to using galaxies as background objects (e.g. Adelberger et al., 2005; Rubin et al., 2010; Steidel et al., 2010). Nevertheless, in this section we will use our sample of background QSOs to study the gas within transverse distances of 200 pkpc of galaxies. But before doing so, we will compare our results with the small-scale data from Steidel et al. (2010).

#### 4.6.1 Comparison with results for galaxy-galaxy pairs

Steidel et al. (2010) measured Ly rest-frame equivalent width (EW) as a function of distance from the same sample of galaxies that we study here. Since they used background galaxy spectra for probing foreground galaxies’ circumgalactic matter, they were able to study smaller scales than we can probe with the QSO spectra. While they could measure the EW in a stack of background galaxy spectra, the spectral resolution of the galaxy spectra was too low to measure column densities or resolved pixel optical depths and the S/N was too low to obtain EW measurements for individual galaxies. The red circles in Figure 12 show their measurements at small impact parameters and the blue circles show their measurements using QSO spectra, together with our EW measurements from QSO spectra at larger impact parameters (black squares). Although we used the same data, their technique for measuring EW differs from ours (see Steidel et al., 2010, for more details).

We computed the EWs as follows. We shifted the spectra into the rest-frame of each galaxy within a given impact parameter bin, found the mean flux profile, divided it by the mean flux level of all pixels in the spectra (i.e. 0.804) in order to mimic the effect of continuum fitting low-quality spectra, and integrated the flux decrement over centered on the galaxies’ positions. We verified that using a larger velocity interval () gives consistent results.
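The last two steps of this recipe (renormalization and integration of the flux decrement) can be sketched in Python as follows. The stacked mean flux profile is assumed to be given on a sorted velocity grid relative to the galaxy redshift; the integration window is left as a parameter, the trapezoidal rule is an implementation choice made for the example, and all names are hypothetical.

```python
C_KMS = 299792.458   # speed of light in km/s
LYA_REST_A = 1215.67  # Lya rest wavelength in Angstrom


def equivalent_width(v_kms, mean_flux, half_window_kms, continuum=0.804):
    """Rest-frame equivalent width in Angstrom from a stacked flux profile.

    v_kms: velocities relative to the galaxies, sorted in increasing order.
    mean_flux: stacked mean flux at each velocity.
    continuum: mean flux of all pixels, used as a pseudo-continuum.
    """
    points = [
        (v, 1.0 - f / continuum)
        for v, f in zip(v_kms, mean_flux)
        if abs(v) <= half_window_kms
    ]
    # Trapezoidal integral of the flux decrement over velocity.
    ew_kms = sum(
        0.5 * (d0 + d1) * (v1 - v0)
        for (v0, d0), (v1, d1) in zip(points, points[1:])
    )
    # Convert from a velocity width to a rest-frame wavelength width.
    return ew_kms * LYA_REST_A / C_KMS
```

A profile equal to the pseudo-continuum everywhere gives zero equivalent width, so the measurement quantifies absorption in excess of the mean forest.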

It appears that the absorption strength falls off according to a power law, out to pkpc. Beyond this impact parameter the relation flattens off at EW Å.

#### 4.6.2 Scatter

Adelberger et al. (2005) found that the absorption near galaxies follows a bimodal distribution. While the flux decrement is usually large, of galaxies are not associated with strong Ly absorption within a 3-D Hubble distance of 1  pMpc (i.e. ).

As shown in Figure 10, we also find that the absorption varies strongly from galaxy to galaxy. However, we showed that this is not only true near galaxies, but also if we look in random places of the spectrum. Moreover, the distribution of median optical depths is not bimodal.

As Adelberger et al. (2005) did not measure optical depths, we show the distribution of the median flux decrement for impact parameters and velocity differences from galaxies in the left panel of Figure 13. For comparison, the right panel shows the corresponding median optical depths. As before, we chose this velocity interval because there is little structure in the median profiles for velocity separations smaller than this value (see Figs. 5 and 6), which is not surprising given that is similar to both the redshift errors and the circular velocities of the halos thought to host the galaxies. The blue curves show the results for randomized galaxy redshifts, where we assign random redshifts (within the Ly forest redshift range) to the galaxies in our sample. The curves are the result of estimating a histogram for randomized redshifts 1,000 times, and taking the mean median in each bin. A K-S test shows that the two distributions are discrepant at the level, so the absorption is clearly enhanced near galaxies.

It can be seen in the left panel that galaxies either show very little transmission, or relatively high flux. The flux distribution is thus bimodal, in agreement with Adelberger et al. (2005). However, the right panel shows that there is no evidence for bimodality of the optical depth distribution. Hence, the bimodality seen in the left panel is a consequence of the mapping which bunches the low (high) optical depth tail of the distribution together at a flux decrement of zero (one). Note that the shape of the optical depth distribution (which appears to be approximately lognormal) is physically more relevant than that of the flux, because the former is proportional to the neutral hydrogen column density.
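The effect of the mapping between optical depth and flux decrement can be reproduced with a toy calculation. The Python sketch below assumes the usual relation, flux decrement equal to one minus exp(-τ), and a lognormal optical depth distribution whose parameters (median of unity, width of 1 dex) are purely illustrative and not measured values. It computes analytically the fraction of the distribution that falls in each flux decrement bin.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def decrement_histogram(mu_log_tau, sigma_log_tau, n_bins=10):
    """Fraction of a lognormal tau distribution in equal-width flux decrement bins.

    Assumes D = 1 - exp(-tau), so a bin edge D maps to tau = -ln(1 - D).
    mu_log_tau, sigma_log_tau: mean and width of log10(tau) (illustrative values).
    """
    cdf_at_edges = []
    for i in range(n_bins + 1):
        d = i / n_bins
        if d <= 0.0:
            cdf_at_edges.append(0.0)   # tau -> 0
        elif d >= 1.0:
            cdf_at_edges.append(1.0)   # tau -> infinity
        else:
            log_tau = math.log10(-math.log(1.0 - d))
            cdf_at_edges.append(normal_cdf((log_tau - mu_log_tau) / sigma_log_tau))
    return [cdf_at_edges[i + 1] - cdf_at_edges[i] for i in range(n_bins)]
```

For these illustrative parameters the first and last bins hold far more of the distribution than the central bins, even though the underlying distribution of log optical depth has a single peak.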

We will demonstrate next that the scatter is not random as the optical depth is strongly anti-correlated with the impact parameter.

#### 4.6.3 Cold flows

In recent years cosmological simulations and theoretical models have converged on the idea that gas accretion is bimodal. While the gas accreting onto massive halos is shock-heated to the virial temperature, in lower mass halos most of the gas falls in cold (K). The cold mode feeds the galaxy through filaments (cold streams), whereas the hot mode results in the formation of a hydrostatic halo which fuels the galaxy through a cooling flow (e.g. Kereš et al., 2005; Ocvirk et al., 2008; Dekel et al., 2009; Crain et al., 2010; van de Voort et al., 2011a, b; Faucher-Giguère et al., 2011b). Halos with masses of , i.e. similar to those hosting our galaxies (Trainor et al. 2011, in preparation), are especially interesting in that they mark the transition between the cold- and hot-mode accretion regimes. For example, van de Voort et al. (2011a) predict that the hot mode contributes on average and to the growth of halos and their central galaxies, respectively. The simulations predict that individual galaxies in such halos are fed simultaneously through both modes, with cold streams penetrating hot, hydrostatic halos.

Despite the theoretical consensus, there is as yet no direct observational evidence for this picture of galaxy formation. Absorption by neutral hydrogen is a promising way to observe the cold streams. Based on a high-resolution simulation of a single halo with mass at , Faucher-Giguère & Kereš (2011a) predict that the covering fraction of cold flows with Lyman limit system (LLS; ) and DLA column densities (), within the virial radius (2 virial radii) is 10–15% (4%) and 3–4% (1–2%), respectively, where they quote 74 pkpc for the virial radius. Kimm et al. (2011) use cosmological simulations to predict that the covering fraction of systems with within 100 pkpc of halos of at is %. Stewart et al. (2011) simulate two halos with masses of at and predict covering fractions for of about 15% at 33 pkpc. Fumagalli et al. (2011) study the covering fraction of cold streams around 7 halos. For two halos with virial masses of and at , which are typical for star-forming galaxies from our sample, they predict a covering fraction for DLAs (LLSs) within two virial radii of 2.22 (6.89) and 1.54 (9.11) per cent, respectively.

The left panel of Figure 14 shows the median optical depth within from galaxies, as a function of the impact parameter. The results are insensitive to the exact velocity interval chosen, although a galaxy with pkpc would have been a DLA system if we had used a maximum velocity separation . Each circle represents a galaxy. Sub-DLA systems are shown as star symbols. In two non-DLA cases the median optical depth could not be measured because more than half of the pixels were saturated and there were insufficient higher order lines to recover the optical depth. For those cases we plot the largest recovered optical depth and show it as an upwards-pointing arrow to indicate that it is a lower limit. Note, however, that there is no evidence for DLA absorption associated with any of the galaxies plotted as lower limits. The vertical dashed line shows the virial radius for halos with mass (Trainor et al. 2012). To compare with predictions from the literature, it is more appropriate to look at the maximum pixel optical depth in the interval around each galaxy, because we can convert this into a neutral hydrogen column density using equation (7). This results in significantly more lower limits, as can be seen in the middle panel of Figure 14. Finally, the right panel shows the minimum flux.
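The conversion between a line-centre optical depth and a neutral hydrogen column density can be sketched as follows. Equation (7) is not reproduced in this section, so the sketch uses the standard expression for the central optical depth of a Doppler-broadened Lyα line, τ₀ = √π e² f λ N / (mₑ c b), in CGS units. The Doppler parameter `b_kms` is a free input, and the value of 26 km/s used in the example is an illustrative assumption.

```python
import math

# Physical constants in CGS units
E_CHARGE = 4.80320425e-10   # electron charge [esu]
M_ELECTRON = 9.1093837e-28  # electron mass [g]
C_LIGHT = 2.99792458e10     # speed of light [cm/s]
F_LYA = 0.4164              # Lyman-alpha oscillator strength
LAMBDA_LYA = 1215.67e-8     # Lyman-alpha rest wavelength [cm]

# Prefactor sqrt(pi) e^2 / (m_e c), shared by both conversions below
_PREFACTOR = math.sqrt(math.pi) * E_CHARGE**2 / (M_ELECTRON * C_LIGHT)


def central_tau(n_hi, b_kms):
    """Line-centre optical depth for column density n_hi [cm^-2] and
    Doppler parameter b_kms [km/s], for a single Doppler-broadened line."""
    return _PREFACTOR * F_LYA * LAMBDA_LYA * n_hi / (b_kms * 1e5)


def column_density(tau0, b_kms):
    """Inverse of central_tau: column density [cm^-2] for a given tau0."""
    return tau0 * (b_kms * 1e5) / (_PREFACTOR * F_LYA * LAMBDA_LYA)


# Example with an assumed Doppler parameter of 26 km/s
print(f"N(tau0 = 1) = {column_density(1.0, 26.0):.3e} cm^-2")
```

Because a maximum pixel optical depth is a property of the blended absorption rather than of a single fitted line, a column density derived this way should be read as an estimate only.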

Figure 14 shows a clear trend of increasing absorption strength with decreasing distance. While the scatter in the optical depth is large at all impact parameters, in terms of flux there is actually little variation within 100 pkpc. In particular, it is striking that all 10 galaxies with impact parameters smaller than 100 pkpc are associated with saturated absorbers ().

Within 100 pkpc 10 out of 10 galaxies have median , which corresponds to a covering fraction of . Within 200 pkpc this decreases to . If we use the maximum optical depth then the covering fraction of absorbers within 200 pkpc increases to . Note that the central optical depth is unity for lines with column density (see equation 7).
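A covering fraction of this kind is a binomial proportion, and for small samples its uncertainty is asymmetric. The sketch below counts galaxies above an optical depth threshold and attaches a Wilson score interval. The choice of the Wilson interval and of `z = 1` (roughly a 1σ-equivalent interval) are assumptions made for illustration; the method used for the uncertainties quoted in the text is not specified here.

```python
import math


def count_above(values, threshold=1.0):
    """Return (number of values above threshold, total number of values)."""
    return sum(v > threshold for v in values), len(values)


def wilson_interval(k, n, z=1.0):
    """Wilson score interval for k successes out of n trials.

    z is the normal quantile; z = 1 is an assumed, roughly 1-sigma choice.
    """
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Example: 10 of 10 sightlines above the threshold
lower, upper = wilson_interval(10, 10)
print(f"covering fraction = 1.00, interval = [{lower:.3f}, {upper:.3f}]")
```

Even when every galaxy in the bin exceeds the threshold, the lower bound stays below unity, which is the main point when only 10 galaxies are available.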

It appears that 1–5 out of 6 galaxies with impact parameters smaller than the virial radius of their host halos are associated with LLSs (the horizontal dashed line shows the central optical depth for a absorber assuming a line width of ), i.e. there is one secure LLS and 4 possible LLSs, given that we have only lower limits on the optical depth of these absorbers. In addition, 1 out of 6 galaxies has a sub-DLA within . Within two virial radii, 1–15 out of 21 galaxies are associated with LLSs and 1 out of 21 galaxies is associated with a sub-DLA (in addition there would have been 1 DLA if we had used a maximum velocity separation ). These numbers are higher than the predictions from Faucher-Giguère & Kereš (2011a) and Kimm et al. (2011), and also higher than those of Fumagalli et al. (2011) for DLAs (they are consistent with their results for LLSs), but the errors are too large for the discrepancy to be significant. We note that for measuring the column densities of strongly saturated systems (e.g. Figure 14, where our method yields a number of lower limits), it would be more appropriate to use Voigt profile decompositions, as presented by Rudie et al. (2012). That method gives results fully consistent with those presented here.

Even though these observations are consistent with what is seen in simulations, it is unclear whether the observed absorbing gas is inflowing or outflowing. Steidel et al. (2010) found, using 500 galaxy pairs sampling angular scales of 1–15″, that the circumgalactic medium within pkpc shows strong H i and low-ion metallic absorption, which is consistent with the findings in this paper. However, they also found that galaxy spectra exhibit kinematics consistent with radial flows with velocity increasing outward.

## 5. The distribution of galaxies around absorbers

In the previous sections we investigated the typical IGM environment of the galaxies in our sample. Here we address the complementary question: what is the galaxy environment for a pixel of a given optical depth?

Figure 15 shows the median 3-D Hubble distance to the nearest galaxy in our sample as a function of a pixel’s Lyα optical depth, normalized by the median 3-D Hubble distance to the nearest galaxy for all pixels ( pMpc), irrespective of their optical depth (the unnormalized distance is shown on the right y-axis). Note that because we divided by the median distance for all pixels, the result is not directly dependent on the completeness of our sample (the actual, unnormalized distances are, however, highly sensitive to the completeness). The error bars were determined by bootstrap resampling the QSO spectra using chunks of 500 pixels. The grey shaded area shows the confidence interval of results obtained for randomized galaxy samples. It shows what would be expected if absorbers and galaxies were distributed randomly with respect to each other.
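Resampling in chunks rather than pixel by pixel preserves the correlations between neighbouring pixels. A minimal sketch of such a block bootstrap for the median is given below. The input `values` stands for any per-pixel quantity along a spectrum (for example the distance to the nearest galaxy); the number of bootstrap realizations and the use of the 16th and 84th percentiles as the error interval are assumptions for illustration.

```python
import random
import statistics


def block_bootstrap_median(values, chunk=500, n_boot=1000, seed=0):
    """Percentile interval for the median from a block bootstrap.

    The sequence is cut into consecutive chunks of `chunk` pixels, and each
    realization draws as many chunks as there are, with replacement.
    Returns the (16th, 84th) percentiles of the bootstrap medians, an
    assumed stand-in for a 1-sigma error bar.
    """
    rng = random.Random(seed)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    medians = []
    for _ in range(n_boot):
        resampled = []
        for _ in range(len(chunks)):
            resampled.extend(rng.choice(chunks))
        medians.append(statistics.median(resampled))
    medians.sort()
    lo = medians[min(int(0.16 * n_boot), n_boot - 1)]
    hi = medians[min(int(0.84 * n_boot), n_boot - 1)]
    return lo, hi
```

In practice the same resampled chunks would be used for every optical depth bin, so that the errors on different bins are derived from consistent realizations.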

We cut the plot at the low optical depth end at the value corresponding to a detection of absorption for an S/N of 50. Noise prevents us from distinguishing lower optical depths from each other, causing the trend to flatten. At the high optical depth end we cut the plot at . Although we can recover higher optical depths, we find that pixels with are biased: while the median redshift stays close to 2.36 for optical depths in the plotted range, it increases to about 2.5 for higher optical depths. Because of our selection function, the galaxy number density is lower at such high redshifts, causing an upturn of the distance to the nearest galaxy for .

The optical depth is strongly anti-correlated with the distance to the nearest galaxy. Pixels with are closer to galaxies than a random place in the Universe is, while pixels with are on average farther from galaxies than a random location is.

To provide some physical interpretation of this observed trend we use the Jeans approximation described in Section 4.5 to convert optical depths into estimates for the gas overdensities (implicitly smoothed on the scale of the absorbers, i.e.  pkpc) (top x-axis). Gas at the mean density of the Universe would produce log and is thus typically closer to galaxies than a random place in the Universe. Although it is unclear whether we can estimate the overdensity sufficiently accurately for this statement to be reliable, it would in fact not be a surprising result. Because voids take up most of the volume, a random place in the Universe will be underdense.

If we cube the normalized distance to the nearest galaxy, then we get something close to the inverse of the overdensity of galaxies. However, in that case the scale over which this number density is measured would be the distance to the nearest galaxy and would thus vary with the density itself. It is more instructive to measure the overdensity of galaxies on a fixed length scale, which is something we can also measure from our data.

In Figure 16 we show the mean overdensity of galaxies in our sample as a function of pixel optical depth, evaluated on (3-D Hubble) scales of 0.25, 0.5, 1, 2, 4, and 8 pMpc around them (tabulated in Table 4). Note that these galaxy number densities cannot be directly compared with the gas overdensities that we can estimate from the optical depths because they are evaluated over different length scales.
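Measuring the galaxy overdensity on a fixed scale amounts to counting galaxies within a sphere of given radius around each pixel and comparing with the count expected at a typical location. The sketch below does this by brute force. Positions are assumed to be 3-D coordinates in pMpc (with the line-of-sight coordinate a Hubble distance), and the reference count is taken from a hypothetical set of `random_points` drawn from the same survey volume; both the normalization by random points and the definition of overdensity as a ratio are assumptions, not the paper's stated procedure.

```python
import math


def mean_count_within(points, galaxies, radius):
    """Mean number of galaxies within `radius` of each point."""
    total = 0
    for p in points:
        total += sum(math.dist(p, g) <= radius for g in galaxies)
    return total / len(points)


def galaxy_overdensity(pixels, random_points, galaxies, radius):
    """Ratio of the mean galaxy count around pixels to that around random points.

    A value of 1 corresponds to the mean density of the sample; subtract 1
    to obtain a density contrast.
    """
    reference = mean_count_within(random_points, galaxies, radius)
    if reference == 0:
        raise ValueError("no galaxies found around the random points")
    return mean_count_within(pixels, galaxies, radius) / reference


# Toy example on two of the scales used in the text
galaxies = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
pixels = [(0.0, 0.0, 0.0)]
random_points = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
for radius in (0.25, 8.0):
    print(radius, galaxy_overdensity(pixels, random_points, galaxies, radius))
```

Evaluating the same function for pixels grouped by optical depth, and for each radius in turn, gives one curve per scale.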

The galaxy number density increases with increasing optical depth, and the density contrast is stronger when it is measured on smaller scales. For example, pixels with (at ) see on average a galaxy overdensity within 8 pMpc, but an overdensity within 0.25 pMpc. Pixels with typically reside in regions where the galaxy number density is close to the cosmic mean on scales  pMpc. Because galaxies are clustered, this implies that for such pixels the distance to the nearest galaxy will typically be smaller than it is for a random place in the Universe (see Fig. 15).

Given that we have both measurements of the galaxy number density and estimates of the gas overdensities, we could attempt to estimate the bias of these two components relative to each other by computing the ratio of the two densities. However, the length scale over which gas densities corresponding to Lyα absorbers are implicitly smoothed (i.e. the local Jeans scale in the case of overdense gas; Schaye, 2001) varies with the density and hence the optical depth, while galaxy and gas density have to be evaluated on the same scale in order to calculate the relative bias. We could in principle estimate the galaxy overdensity on scales that vary with the optical depth, but for overdense gas the relevant scales are somewhat too small to get a robust estimate of the galaxy number density.