The Spatial Range of Conformity
Key words. large–scale structure of Universe – Galaxies: statistics – Galaxies: fundamental parameters – Galaxies: formation
Abstract
Context: Properties of galaxies like their absolute magnitude and their stellar mass content are correlated. These correlations are tighter for close pairs of galaxies, which is called galactic conformity. In hierarchical structure formation scenarios, galaxies form within dark matter halos. To explain the amplitude and the spatial range of galactic conformity, two–halo terms or assembly bias become important.
Aims: With the scale dependent correlation coefficients, the amplitude and the spatial range of conformity are determined from galaxy and halo samples.
Methods: The scale dependent correlation coefficients are introduced as a new descriptive statistic to quantify the correlations between properties of galaxies or halos, depending on the distances to other galaxies or halos. These scale dependent correlation coefficients can be applied to the galaxy distribution directly. Neither a splitting of the sample into subsamples, nor an a priori clustering is needed.
Results: This new descriptive statistic is applied to galaxy catalogues derived from the Sloan Digital Sky Survey III and to halo catalogues from the MultiDark simulations. In the galaxy sample the correlations between absolute magnitude, velocity dispersion, ellipticity, and stellar mass content are investigated. The correlations of mass, spin, and ellipticity are explored in the halo samples. Both for galaxies and halos a scale dependent conformity is confirmed. Moreover, the scale dependent correlation coefficients reveal a signal of conformity out to 40 Mpc and beyond. The halo and galaxy samples show a differing amplitude and range of conformity.
Conclusions:
1 Introduction
The clustering of galaxies in space is an important observational constraint for models of structure formation in the Universe. Often galaxies are treated as points in space and one compares the clustering properties of this point distribution to models of structure formation. However galaxies are extended objects and come in different flavours. Their properties are categorised and quantified. One considers the luminosity, the shape, the substructure or spectroscopic features of a galaxy, to name only a few. As an extension, galaxies are still treated as points, but the properties of the galaxies are assigned to the points as marks. This establishes at each position of a galaxy a multidimensional space. Depending on the physical problem, different methods for the analysis of such a marked point set have been devised:
The concept of bias was developed to account for the stronger clustering of galaxy–clusters compared to the clustering of galaxies themselves (Kaiser, 1984; Bardeen et al., 1986). Currently bias is often used to describe the differences between the clustering of luminous and dark matter (see Desjacques et al. 2016 for a recent review).
With luminosity– and morphology–segregation one describes the differences in the spatial clustering of dim versus luminous galaxies, of early–type (e.g. ellipticals) versus late–type galaxies (e.g. spirals), or of red versus blue galaxies, etc. (Ostriker & Turner, 1979; Hamilton, 1988; Willmer et al., 1998). In most cases the ratios of the two–point correlation functions, determined from sub–samples of the galaxy distribution, are used to quantify these segregation effects (see e.g. Zehavi et al. 2011).
The morphology density relation indicates that early type galaxies tend to reside in more dense environments compared to late type galaxies. There are numerous observations confirming this (Dressler, 1980; Postman & Geller, 1984; Andreon et al., 1997; van der Wel et al., 2010). Effects of the morphology density relation are typically confined to groups and clusters of galaxies (see however Binggeli et al. 1990).
Conformity is an expression from sociology: it is the act of matching attitudes and behaviours to group norms. With galactic conformity one investigates how strongly the properties of galaxies conform with each other if they are located in a group around a bright dominating galaxy or in a dark matter halo (Weinmann et al., 2006). Galactic conformity is typically quantified by first determining the central galaxy within a group of galaxies. Then e.g. the fraction of late type galaxies in the cluster is plotted against the mass of the group, depending on the type of the central galaxy. Hence, galactic conformity is an extension of the morphology density relation, with the focus on the bright central galaxy as the determinant for the galactic properties. Not only the types but also the colours, the star formation rates, or other properties of the galaxies are being used. Kauffmann et al. (2013) plotted the fraction of star forming galaxies against the (projected) distance from the central galaxy, showing that conformity is scale dependent, at least on small scales. In hierarchical structure formation scenarios, galaxies form within dark matter halos. To explain the amplitude and the spatial range of galactic conformity, two–halo terms or assembly bias become important. Using the halo model, Hearin et al. (2015) were able to model such a scale dependence using 2–halo conformity from assembly bias. A comparison of semi–analytic models reveals different patterns in the scale dependence of halo conformity between the models (see the discussion in Lacerna et al. 2017 and the references therein). Quantitative scale dependent methods are needed to discriminate between these different approaches. This is especially important if one wishes to quantify the influence of large–scale structures on the conformity. Then one needs measures of conformity which are also sensitive on large scales.
As a new descriptive statistic based on mark correlation functions, the scale dependent correlation coefficients are introduced to quantify dependencies between properties of galaxies (or halos). The scale dependent correlation coefficients measure the strength of the correlations between the intrinsic properties of a galaxy and how these correlations on one galaxy depend on the presence of another galaxy at a distance of $r$ (similarly for halos). Hence they allow a scale dependent measurement of the conformity. To estimate the scale dependent correlation coefficients, suitably weighted pair counts of all the galaxies are used. Conceptually this is a major benefit: all pairs are counted. The galaxy sample is not split into several parts (e.g. early type and late type), nor is any grouping into clusters necessary. No new (nuisance) parameters are introduced into the analysis.
In section 2 the scale dependent correlation coefficients are defined. They are used in section 3 to analyse galaxy samples from the Sloan Digital Sky Survey (SDSS), and in section 4 for halo samples from the MultiDark dark matter simulations. A summary and conclusion is given in section 5. In Appendix A the construction of the galaxy and halo samples is detailed, and a simple toy model is presented in Appendix B.
2 The method
The well known definitions of covariance and correlation coefficient are reviewed in the next subsection. This discussion serves as a blue–print for the definition of the scale dependent correlation coefficient in subsection 2.2. The definitions are given explicitly for galaxies with absolute r–magnitude $M$ and ellipticity $e$. For the SDSS galaxies and the halo samples from the MultiDark simulations, other properties are also used as marks in the analysis below (see Appendix A for details). In the following the positions of the galaxies together with their properties are interpreted as a realisation of a marked point process (Beisbart et al., 2002). The two–point theory of marked point processes was developed by Stoyan (1984) and is nicely reviewed in Stoyan & Stoyan (1994). First applications of mark correlation functions to galaxy samples are discussed in Beisbart & Kerscher (2000); Szapudi et al. (2000); Beisbart et al. (2002) and to halo simulations in Gottlöber et al. (2002); Faltenbacher et al. (2002); Sheth & Tormen (2004).
2.1 Correlations between properties of galaxies or halos
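In this subsection only the intrinsic properties of galaxies or halos are of interest, irrespective of their position in space. The joint probability density provides a suitable tool to describe the statistics of the galaxy (or halo) properties. $\mathcal{M}_1(M,e)$ is the probability density of finding a galaxy with absolute r–magnitude $M$ and with ellipticity $e$ in our sample. Marginalising $\mathcal{M}_1(M,e)$, one obtains the probability density of the ellipticity, $\mathcal{M}_1(e)=\int \mathcal{M}_1(M,e)\,\mathrm{d}M$, and similarly the probability density of the absolute r–magnitude, $\mathcal{M}_1(M)$. The moments are defined in the usual way. E.g. the $k$th moment of the ellipticity distribution is $\overline{e^k}=\int e^k\,\mathcal{M}_1(e)\,\mathrm{d}e$, with the mean ellipticity $\overline{e}$ and the variance $\sigma_e^2=\overline{e^2}-\overline{e}^2$. If $M$ and $e$ are independent, $\mathcal{M}_1(M,e)=\mathcal{M}_1(M)\,\mathcal{M}_1(e)$; in general, however, this is not the case. To quantify the dependency, the covariance and the correlation coefficient of $M$ and $e$ are used. The covariance is defined as
$$\mathrm{cov}(M,e) = \int\!\!\int (M-\overline{M})\,(e-\overline{e})\;\mathcal{M}_1(M,e)\,\mathrm{d}M\,\mathrm{d}e = \overline{M e}-\overline{M}\,\overline{e}. \tag{1}$$
Suitably normalised one obtains the well known correlation coefficient
$$\mathrm{cor}(M,e) = \frac{\mathrm{cov}(M,e)}{\sigma_M\,\sigma_e}. \tag{2}$$
By definition $-1\le\mathrm{cor}(M,e)\le 1$. The larger the modulus of $\mathrm{cor}(M,e)$, the stronger the (anti) correlation between $M$ and $e$.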
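As a minimal numerical sketch of Eqs. (1) and (2), assuming the two marks of a sample are available as plain arrays, the sample covariance and correlation coefficient can be computed as follows. The function names are illustrative choices and not part of the analysis described here.

```python
import numpy as np

def covariance(m, e):
    """Sample version of Eq. (1): mean of (M - mean(M)) * (e - mean(e))."""
    m = np.asarray(m, dtype=float)
    e = np.asarray(e, dtype=float)
    return np.mean((m - m.mean()) * (e - e.mean()))

def correlation_coefficient(m, e):
    """Sample version of Eq. (2): covariance normalised by both standard deviations."""
    m = np.asarray(m, dtype=float)
    e = np.asarray(e, dtype=float)
    # np.std uses the same 1/N normalisation as np.mean above
    return covariance(m, e) / (m.std() * e.std())
```

Because both the covariance and the standard deviations use the 1/N normalisation, the result agrees with the Pearson coefficient returned by `np.corrcoef`.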
2.2 The scale dependent correlation coefficient
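Calculating the above defined correlation coefficients under the condition that another galaxy is at a distance of $r$, one arrives at the desired statistic describing scale dependent correlations. To define these scale dependent correlation coefficients the flexible framework of mark correlation functions is used (Stoyan, 1984; Beisbart & Kerscher, 2000).
$\varrho_1^{\mathcal{M}}(\mathbf{x},M,e)$ is the probability density of finding a galaxy at $\mathbf{x}$ with an absolute magnitude $M$ and an ellipticity $e$. For a homogeneous point distribution this splits into $\varrho_1^{\mathcal{M}}(\mathbf{x},M,e)=\varrho\,\mathcal{M}_1(M,e)$, where $\varrho$ denotes the mean number density of galaxies in space and $\mathcal{M}_1(M,e)$ is the already defined probability density of finding a galaxy with absolute r–magnitude $M$ and ellipticity $e$. Slightly extending the notation from above, $M_1$ and $e_1$ are the absolute r–magnitude and ellipticity of the galaxy at the position $\mathbf{x}_1$. Accordingly, $\varrho_2^{\mathcal{M}}\big((\mathbf{x}_1,M_1,e_1),(\mathbf{x}_2,M_2,e_2)\big)$ quantifies the probability density of finding two galaxies at $\mathbf{x}_1$ and $\mathbf{x}_2$ with the absolute magnitudes $M_1$, $M_2$ and the ellipticities $e_1$, $e_2$, respectively. For an isotropic and homogeneous point set, $\varrho_2^{\mathcal{M}}$ depends on the positions only through the separation $r=|\mathbf{x}_1-\mathbf{x}_2|$, and the spatial product density is then given by $\varrho_2(r)=\varrho^2\,(1+\xi(r))$, with the well known two–point correlation function $\xi(r)$.
It is useful to consider the conditional mark probability density defined as
$$\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r) = \frac{\varrho_2^{\mathcal{M}}\big((\mathbf{x}_1,M_1,e_1),(\mathbf{x}_2,M_2,e_2)\big)}{\varrho_2(r)}. \tag{3}$$
$\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)$ is the probability density of the absolute magnitudes $M_1$, $M_2$ and ellipticities $e_1$, $e_2$ under the condition that this pair of galaxies is separated by $r$. We speak of mark–independent clustering if $\mathcal{M}_2$ factorises, $\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)=\mathcal{M}_1(M_1,e_1)\,\mathcal{M}_1(M_2,e_2)$, and does not depend on the pair separation $r$. In such a case the absolute magnitudes and the ellipticities of galaxy pairs with a separation $r$ are not different from those of any other pair of galaxies. On the contrary, mark–dependent clustering or mark segregation implies that the marks on certain galaxy pairs show deviations from the global mark distribution.
The conditional probability density $\mathcal{M}_2$ is used to calculate the scale–dependent correlation coefficient:
$$\mathrm{cor}(M,e;r) = \frac{1}{\sigma_M\,\sigma_e}\int (M_1-\overline{M})\,(e_1-\overline{e})\;\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)\,\mathrm{d}\mu, \tag{4}$$
with the abbreviation $\mathrm{d}\mu=\mathrm{d}M_1\,\mathrm{d}e_1\,\mathrm{d}M_2\,\mathrm{d}e_2$.
Only the correlation coefficient between $M_1$ and $e_1$ on galaxy 1 is calculated; the marks on galaxy 2 are integrated out. One should compare this definition with Eqs. (1) and (2) to see the close analogy. $\mathrm{cor}(M,e;r)$ quantifies the correlation between the absolute magnitude and the ellipticity on one galaxy under the condition that another galaxy is at a distance of $r$. If there is an environmental dependency one expects $\mathrm{cor}(M,e;r)\neq\mathrm{cor}(M,e)$. For large separations the environmental dependency has to vanish and one gets $\mathrm{cor}(M,e;r)\to\mathrm{cor}(M,e)$ for $r\to\infty$.
Similar to Eq. (4) one can define the scale dependent mean
$$\overline{M}(r) = \int M_1\;\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)\,\mathrm{d}\mu, \tag{5}$$
and, using $\overline{M}(r)$, the scale dependent variance
$$\sigma_M^2(r) = \int \big(M_1-\overline{M}(r)\big)^2\;\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)\,\mathrm{d}\mu. \tag{6}$$
The scale dependent means $\overline{M}(r)$ and $\overline{e}(r)$ and the scale dependent variances $\sigma_M^2(r)$ and $\sigma_e^2(r)$ correspond to the mark correlation functions for the mean mark and the mark variance, $\mathrm{var}(r)$, as defined in Beisbart & Kerscher (2000). The scale dependent mean and variance allow the definition of an alternative scale dependent correlation coefficient
$$\widetilde{\mathrm{cor}}(M,e;r) = \frac{1}{\sigma_M(r)\,\sigma_e(r)}\int \big(M_1-\overline{M}(r)\big)\,\big(e_1-\overline{e}(r)\big)\;\mathcal{M}_2(M_1,e_1,M_2,e_2\,|\,r)\,\mathrm{d}\mu. \tag{7}$$
This defines the correlation coefficient relative to the mean and variance of galaxies with another galaxy at a distance of $r$ (cf. Eq. (4)). In Appendix B both $\mathrm{cor}(M,e;r)$ and $\widetilde{\mathrm{cor}}(M,e;r)$ are calculated for a simple toy model with a built–in scale. With $\mathrm{cor}(M,e;r)$ the scale can be detected easily from the samples, whereas $\widetilde{\mathrm{cor}}(M,e;r)$ does not depend on the built–in scale. As another example consider the scale dependent correlation coefficient evaluated at a fixed separation $r$ (given in Mpc), i.e. the correlation coefficient between two marks of all the galaxies with another galaxy at that distance. Its difference from the overall correlation coefficient then quantifies the deviation from the corresponding correlation coefficient of all galaxies, as visible in Fig. 1 below.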
It is straightforward to estimate mark correlation functions like the scale dependent correlation coefficient $\mathrm{cor}(M,e;r)$ from a galaxy catalogue. The basic idea derives from Eqs. (3) and (4): one adds up every pair of galaxies separated by $r$, weighted by $(M_1-\overline{M})\,(e_1-\overline{e})$. Then one divides by the number of pairs with separation $r$. Suitably normalised, i.e. divided by $\sigma_M\,\sigma_e$, one obtains an estimate of $\mathrm{cor}(M,e;r)$. Analogous ideas apply for the estimation of the alternative coefficient of Eq. (7). A more detailed discussion and a comparison of several estimators for mark correlation functions are given in the Appendix of Beisbart & Kerscher (2000).
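The weighted pair count described above can be sketched as follows. This is a brute–force illustration under simplifying assumptions, not the estimator used for the results in this article: all ordered pairs are formed in memory (feasible only for small samples), separations are binned in $r$, and no edge corrections are applied. The function name `scale_dependent_cor` and its arguments are hypothetical.

```python
import numpy as np

def scale_dependent_cor(pos, m, e, bins):
    """Estimate the scale dependent correlation coefficient and its
    alternative variant (mean/variance taken per separation bin).

    pos  : (N, 3) positions; m, e : (N,) marks; bins : increasing bin edges in r.
    Returns two arrays of length len(bins) - 1 (NaN for empty bins).
    """
    pos = np.asarray(pos, dtype=float)
    m = np.asarray(m, dtype=float)
    e = np.asarray(e, dtype=float)
    bins = np.asarray(bins, dtype=float)
    nb = len(bins) - 1

    # All ordered pairs (i, j) with i != j: galaxy i carries the marks,
    # galaxy j only acts as the neighbour at distance r.
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.nonzero(~np.eye(len(pos), dtype=bool))
    k = np.digitize(dist[i, j], bins) - 1          # separation bin of each pair
    inside = (k >= 0) & (k < nb)
    i, k = i[inside], k[inside]
    n_pairs = np.bincount(k, minlength=nb).astype(float)

    def pair_mean(w):
        # mean of the weight w (evaluated on galaxy i) over all pairs in a bin
        with np.errstate(invalid="ignore", divide="ignore"):
            return np.bincount(k, weights=w[i], minlength=nb) / n_pairs

    # Deviations from the global means, normalised by the global sigmas
    cor_r = pair_mean((m - m.mean()) * (e - e.mean())) / (m.std() * e.std())

    # Alternative variant: deviations from the bin-wise means and variances
    m_r, e_r = pair_mean(m), pair_mean(e)
    cov_r = pair_mean(m * e) - m_r * e_r
    var_m = pair_mean(m * m) - m_r ** 2
    var_e = pair_mean(e * e) - e_r ** 2
    with np.errstate(invalid="ignore", divide="ignore"):
        cor_alt = cov_r / np.sqrt(var_m * var_e)
    return cor_r, cor_alt
```

With a single bin that contains all pairs, every galaxy enters equally often, so both estimates reduce to the overall correlation coefficient; this provides a simple consistency check of the sketch.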
The procedure offers a built–in significance test (Beisbart & Kerscher, 2000; Grabarnik et al., 2011). One can redistribute the galaxy properties within the sample randomly, holding the galaxy positions fixed. In that way one mimics a galaxy distribution with the same spatial clustering and the same one–point correlations, but without any environmental dependency of these correlations. Given the original data set, such samples with mark–independent clustering can be simulated easily and the fluctuations around the overall correlation coefficient can be quantified.
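The randomisation test can be sketched as below. The helper `mark_shuffle_envelope` is a hypothetical name, and `estimator` stands for any user–supplied function of positions and two mark arrays. The sketch assumes that the marks of each galaxy are permuted jointly, so that the one–point correlation between the marks is preserved while any dependence on the positions is destroyed.

```python
import numpy as np

def mark_shuffle_envelope(pos, m, e, estimator, n_random=99, seed=0):
    """Fluctuations of `estimator` under mark-independent clustering.

    Positions stay fixed; the mark pairs (m_i, e_i) are reassigned to the
    positions with one common random permutation per realisation.
    Returns (mean, standard deviation, all realisations).
    """
    rng = np.random.default_rng(seed)
    m = np.asarray(m)
    e = np.asarray(e)
    samples = []
    for _ in range(n_random):
        perm = rng.permutation(len(m))          # same permutation for both marks
        samples.append(estimator(pos, m[perm], e[perm]))
    samples = np.asarray(samples)
    return samples.mean(axis=0), samples.std(axis=0), samples
```

A measured scale dependent correlation coefficient that lies well outside the spread of these realisations at a given separation is then taken as a signal of mark–dependent clustering at that scale.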
3 Scale dependent correlation coefficients of galaxies from the SDSS DR12
The Sloan Digital Sky Survey (SDSS), data release 12 (DR12), includes a magnitude limited sample of galaxies, the main galaxy catalogue (Alam et al., 2015; Eisenstein et al., 2011). For these galaxies photometric and spectroscopic, as well as derived properties are available from the SDSS database. The scale dependent correlation coefficients are estimated from volume limited samples constructed from the main galaxy catalogue. The extinction and K–corrected absolute magnitude $M$, the two dimensional ellipticity $e$ on the sky, the spectrally determined velocity dispersion $\sigma_v$, and the logarithmic stellar mass $m_*$ are assigned to each of the galaxies as marks. The construction of the volume limited samples and details on the estimation and normalisation of the marks $M$, $e$, $\sigma_v$, and $m_*$ are given in Appendix A.1.
Besides introducing the scale dependent correlation coefficients as a descriptive statistic for measuring conformity, the focus in this article is on the spatial range of conformity, i.e. from how far out the correlations between properties on one galaxy are influenced. The absolute magnitude, the stellar mass content, the velocity dispersion and the ellipticity have been chosen as marks, because they show appreciable correlations already for the whole sample (see Table 1). The legitimate expectation is that a scale dependence of conformity can be resolved easily for these marks. With the absolute magnitude, the velocity dispersion and the stellar mass content, different aspects of the unobservable overall mass of the galaxy are investigated. The ellipticity is used as a tracer of the shape of the galaxy. In the halo samples below analogous parameters were chosen as marks.
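Table 1: Correlation coefficients cor between the four marks of the galaxy sample (upper triangle).
cor
1     0.15  0.84  0.49
      1     0.05  0.2
            1     0.65
                  1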
The (one–point) correlation coefficients (Eq. (2)) between the four marks, the absolute magnitude $M$, the ellipticity $e$, the velocity dispersion $\sigma_v$, and the logarithmic stellar mass $m_*$, in the volume limited galaxy sample with 600 Mpc depth are shown in Table 1. These sometimes strong (anti) correlations are expected. E.g. the absolute magnitude is, up to a positive factor and an offset, the negative logarithm of the luminosity, hence a strong anti–correlation with the logarithmic stellar mass is anticipated.
This strong anti–correlation between the absolute magnitude $M$ and the logarithmic stellar mass $m_*$ is also clearly visible from the 2d–histogram in Fig. 1. Moreover, galaxies in close pairs show an even stronger anti–correlation between $M$ and $m_*$, as seen from the tighter histogram for the close pairs. Exactly this visual impression is quantified with the scale dependent correlation coefficient. In Fig. 2 the scale dependent correlation coefficient of $M$ and $m_*$ shows the tightened correlation for close pairs (small $r$), whereas it approaches the overall average for large $r$. This increased correlation of the scale dependent coefficient compared to the overall correlation coefficient is the scale dependent signal of galactic conformity.
The scale dependent correlation coefficients are shown in Fig. 2 for the six combinations of the four marks $M$, $e$, $\sigma_v$, and $m_*$. In all cases the modulus of the scale dependent correlation coefficient is significantly larger than the modulus of the overall correlation coefficient on small scales. On large scales the scale dependent coefficients approach the overall values, as expected. Randomising the marks, but keeping the positions fixed, allows us to quantify the fluctuations around the case of mark independent clustering. For smaller distances $r$, the scale dependent correlation coefficients are well outside the fluctuations of the randomised samples – a clear signal of galactic conformity. This signal extends out to large scales, becoming consistent with mark independent clustering beyond 40 Mpc – a long range of galactic conformity.
3.1 Determining the range
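Table 2: Scale parameters (in Mpc) obtained from the exponential and Lorentz fits for the six mark combinations.
8.89  9.35  13.0  11.7  16.9  9.76
7.58  7.82  10.5  9.65  13.9  8.2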
To quantify the range of conformity, an exponential, a Lorentz function, and a power–law are fitted
(8) 
The exponential– and the Lorentz–model each contain a scale parameter, whereas the power–law is scale invariant. As can be seen from Fig. 2, in all six cases the exponential– and the Lorentz–fit perform similarly well, whereas the scale invariant power–law fit is significantly off. Quantitatively this can be seen from the summed residuals: for the exponential– and Lorentz–fit they are comparable in size, whereas for the power–law fit they are larger by an order of magnitude. The scale parameters determined from the fits range from 8 Mpc to 17 Mpc (see Table 2). This quantifies the visual impression from Fig. 2 that the range of conformity depends on the galactic properties under investigation. An exponential– or a Lorentz–distribution function allows signals on scales larger than its scale parameter. And indeed significant scale dependent correlation coefficients are seen up to 40 Mpc and beyond (cf. Fig. 2). The toy model in Appendix B further illustrates that a built–in scale in the correlation pattern of the mark distribution can be determined unambiguously with the scale dependent correlation coefficients.
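A comparison of such fits via their summed squared residuals can be sketched as follows. The functional forms below are assumptions made for illustration (an amplitude times $\exp(-r/s)$, $1/(1+(r/s)^2)$, or $r^{-s}$) and need not coincide with the exact parametrisation of Eq. (8); the grid scan over the shape parameter with a linear solve for the amplitude is likewise a simple stand–in for whatever fitting procedure was actually used. All names are hypothetical.

```python
import numpy as np

# Assumed model shapes; s is a scale parameter for the first two models
# and the (scale-free) exponent for the power law.
MODELS = {
    "exponential": lambda r, s: np.exp(-r / s),
    "lorentz": lambda r, s: 1.0 / (1.0 + (r / s) ** 2),
    "power_law": lambda r, s: r ** (-s),
}

def fit_model(r, y, shape, grid):
    """Least-squares fit of y ~ a * shape(r, s).

    The shape parameter s is scanned over `grid`; for each s the best
    amplitude a follows from a one-parameter linear least-squares solve.
    Returns (a, s, summed squared residuals) of the best grid point.
    """
    r = np.asarray(r, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for s in grid:
        f = shape(r, s)
        a = np.dot(f, y) / np.dot(f, f)
        res = np.sum((y - a * f) ** 2)
        if best is None or res < best[2]:
            best = (a, s, res)
    return best
```

Here `y` would be the excess of the scale dependent correlation coefficient over the overall correlation coefficient, sampled at the separations `r`; comparing the third return value between the models mirrors the comparison of summed residuals described above.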
3.2 Alternative scale dependent correlation coefficients
In Fig. 2 also the results for the alternative definition of the scale dependent correlation coefficients (Eq. (7)) are shown. For four of the six mark combinations the alternative coefficients show a reduced amplitude compared to the scale dependent correlation coefficients of Eq. (4). With the alternative definition one is measuring the scale dependent correlation coefficients with respect to the mean and variance of the galaxies with another galaxy at a distance of $r$ (see Eqs. (5) and (6)). With the definition of Eq. (4) the correlations are calculated with respect to the mean and variance of all galaxies. It is well known that for galaxies the scale dependent mean and variance are larger than the overall mean and variance for small distances $r$ (see e.g. Beisbart & Kerscher 2000). Hence a reduced amplitude should be expected from Eq. (7). Still the remaining signal traced by the alternative coefficients shows a similar long range of conformity outside the fluctuations. For the remaining two combinations there is no significant deviation between the two definitions, both confirming the long range of conformity.
3.3 Systematics
The results discussed in the preceding sections were obtained from a volume limited galaxy sample from the SDSS DR12 with a limiting depth of 600 Mpc. In Fig. 3 similar patterns can be observed for the scale dependent correlation coefficients from samples with 300 Mpc and 900 Mpc depth. A more detailed look shows that the inclusion of less luminous galaxies in the 300 Mpc sample leads to a smaller amplitude of the scale dependent correlation coefficient and also a smaller estimate for the scale parameters, whereas an increased amplitude is observed for the more luminous galaxies in the 900 Mpc sample. The amplitude and range of conformity are not universal; they depend on the galactic properties considered and on the luminosity cut used for the construction of the sample.
The absolute magnitude $M$ is used as a mark but also in the construction of the volume limited samples. Hence it is important to investigate how systematic changes in the calculation of $M$ influence the results. The analysis was repeated for absolute magnitudes derived from the model magnitudes with no extinction correction (dereddening) and/or without employing a K–correction. As can be seen from Fig. 3, the results are very similar; only the results from samples with neither extinction correction nor K–correction show a significantly enhanced amplitude and an even longer range of conformity. To check for a special kind of Malmquist–bias (see Beisbart & Kerscher 2000, section 4.5), the analysis was repeated for galaxies with a distance up to 580 Mpc, selected from the volume limited sample with limiting depth of 600 Mpc, and no significant deviations were seen.
The ratios of luminosities in different filters are called colours. It is well known that colours are correlated with the morphological type and other properties of the galaxy. Hence colours should be natural candidates in the analysis presented above. However, the scale dependent correlation coefficients for colours are sensitive to the extinction correction and the K–correction. Differences on small scales and residual correlations on large scales can be seen for the colour and the absolute magnitude in Fig. 4. The amplitudes of the scale dependent correlation coefficients between the absolute magnitude, the ellipticity, the velocity dispersion, and the stellar mass, obtained from samples with different magnitude estimates, differ slightly, but a consistent picture for the conformity on large scales appears. Hence, the main result of this article, the long range of conformity, is not affected, as can be seen from Fig. 2 and Fig. 3. Moreover, the sample using the extinction and K–corrected magnitudes gives the most conservative estimates for the scale dependent correlation coefficients, with the lowest amplitude and the smallest range of conformity. Unfortunately this is not the case for the scale dependent correlation coefficients of colour and absolute magnitudes (see Fig. 4). It is not clear whether the extinction correction, the K–correction, or other currently unknown issues are responsible for these residuals, and therefore colours are not considered any further in this work.
Instead of colours the spectral properties of the galaxies can be used directly. E.g. from the observed line–widths one estimates the velocity dispersion in a galaxy. Also the stellar mass estimates rely heavily on spectral properties of the galaxies, and one may think of the stellar mass estimate as a concise summary of the spectral properties of the galaxy. As briefly discussed in Appendix A.1, different methods employing different spectral libraries can be used to estimate the stellar mass content. Repeating the analysis for the three different stellar mass estimates from the SDSS database leads to very similar results.
4 Scale dependent correlation coefficients of halos from the MultiDark simulations
Dark matter simulations can be used to model the large scale distribution of matter in the universe. The dark matter concentrations in these simulations are called halos. A direct comparison of the results for galaxies to the results from halos is complicated by the fact that no luminous matter is included in the simulations. Still, analogous properties of the halos can be used and the scale dependent correlation coefficients calculated from dark matter halos can be qualitatively compared to the results from the galaxies. The focus is on the range of these scale dependent correlation coefficients. A related motivation for investigating halo catalogues comes from the observation in the galaxy catalogue that there are residuals in the scale dependent correlation coefficients for colours which are not well understood (see Sect. 3.3). The scale dependent correlation coefficients for the other galactic properties do not show these residuals, but still one wishes for an at least qualitative cross check. Halo catalogues from dark matter simulations offer such clean, well defined samples without observational biases.
From the MultiDark Simulations (MDPL2, Prada et al. 2012; Klypin et al. 2016) dark matter halos are identified using the Rockstar halo–finder (Behroozi et al., 2013). Halos with a virial mass above the mass cut (hence with at least 662 dark matter particles per halo) are selected from the MDPL2 simulations. The Rockstar halo–finder is able to determine sub–halos within halos. However, in this analysis only distinct halos, i.e. halos which are not a sub–halo of any other halo, are used. See Behroozi et al. (2013), Sect. 3.4 for a detailed description of how the substructure membership is determined. The virial mass, the dimensionless spin parameter, and the ratio of the smallest axis to the largest axis of the mass ellipsoid are used as marks (for details see Appendix A.2). No direct comparison of the scale dependent correlation coefficients from the dark matter halos and the galaxy distribution is attempted, but analogous quantities are used as marks: for the dark matter halos from the simulations the mass is directly accessible, whereas for galaxies the absolute magnitude and the stellar mass content are biased tracers of the overall mass. The internal dynamical state is reflected in the spin of the halo and in the velocity dispersion of the galaxy. The shape of the halo is quantified from the 3d–mass ellipsoid, and the shape of a galaxy from the 2d–ellipticity obtained from the image of the galaxy.
Table 3: Correlation coefficients cor between the three marks of the halo sample (upper triangle).
cor
1     0.03  0.12
      1     0.30
            1
Table 3 summarises the correlation coefficients between the mass, the spin, and the axis ratio in the halo sample. Such correlations are expected. For a detailed study of these one–point correlations see e.g. Knebe & Power (2008) and Vega-Ferrero et al. (2017).
Table 4: Fitted scale parameters for the three mark combinations of the halo sample.
8.6  11.6  19.5
6.1  9.0  16.8
Fig. 5 shows the corresponding scale dependent correlation coefficients. The overall appearance is similar to the scale dependent correlation coefficients observed in the galaxy distribution (Fig. 2), with some exceptions. The amplitude of the scale dependent correlation coefficients on small scales is stronger for two of the three mark combinations than in any of the results from the galaxy distribution. Also, the range of conformity is larger for the halos than for the galaxies; see the fitted scale parameters of the halo sample in Table 4 compared to the scale parameters of the galaxy sample in Table 2. Similar to the galaxy distribution, the alternative scale dependent correlation coefficients show a reduced amplitude. Still, one of them shows long range correlations out to 30 Mpc, but the signal in the other two is confined to scales below 10 and 15 Mpc.
4.1 Systematics
To investigate the dependence on the mass cut, samples with three different mass cuts have been analysed. The scale dependent correlation coefficients show a similar shape and in most cases a similar amplitude across the halo samples. As can be seen in Fig. 6, the amplitude and range of conformity increase with the mass cut between the two less massive samples. A similar behaviour can be observed in the galaxy samples including more luminous galaxies (see Fig. 3). The most massive sample shows a dip in the scale dependent correlation coefficient on scales below 5 Mpc, but on large scales its results are very similar to those of a sample with a lower mass cut. Also the scale dependent correlation coefficients of halo samples from the BigMDPL simulations (box–size 2.5 Gpc/h) show a similar long range of conformity.
The Rockstar halo–finder is able to determine a halo hierarchy. In the analysis for Fig. 5 only distinct halos, i.e. halos which are not marked as a sub–halo, are used. The scale dependent correlation coefficients calculated from all the halos, including sub–halos and their parent halos, show a reduced amplitude, as can be seen in Fig. 6. The Rockstar halo–finder uses phase–space information and an elaborate unbinding strategy to define the halos. The three–dimensional friends–of–friends (FoF) halo–finder operates only in position space to identify halos as linked particle over–densities (Riebe et al., 2013). The analysis with the scale dependent correlation coefficients is repeated for such FoF halo samples from the same MDPL2 simulation. Again the mass, the spin, and the axis ratios of the ellipsoidal shape are used as marks (see Riebe et al. 2013 for details). Comparing the corresponding scale dependent correlation coefficients of the Rockstar and FoF halo samples, an increased amplitude can be seen in Fig. 6. Although the amplitude of the scale dependent correlation coefficients differs between “all halos”, “distinct halos”, and “FoF halos”, the signal of a long range of conformity is clearly visible in all the samples.
5 Summary and Outlook
Properties of galaxies show scale dependent correlation coefficients out to large scales. Properties like mass and luminosity are significantly more strongly (anti) correlated for close pairs than in the overall sample, a clear signal of conformity. The analysis was carried out with a new descriptive statistic, the scale dependent correlation coefficients. They quantify how the correlation coefficients between galactic properties vary under the condition that another galaxy (or halo) is at a distance of $r$. This signal of galactic conformity extends to large scales, in several cases becoming consistent with mark independent clustering only beyond 40 Mpc. Several tests for systematic effects confirm the long range of conformity. Halo samples from dark matter simulations show a larger amplitude and an even longer range of conformity. The scale dependent correlation coefficients between e.g. mass and shape clearly deviate from the overall correlation coefficient beyond 40 Mpc. No universal range of conformity is found. The range varies for different properties under investigation and also depends on the luminosity– and mass–cut used in the construction of the samples. Such a long range of conformity goes well with the investigations of Faltenbacher et al. (2002), who found alignment correlations for cluster sized halos out to separations of 100 Mpc/h. The focus of the present investigation was on the introduction of the scale dependent correlation coefficients and on the detection of a long range of conformity. On small scales more complicated patterns are expected, and further investigations of the scale dependent conformity should be accompanied by a detailed modelling.
Pure dark matter simulations capture only the gravitational part but allow for a large number of halos and convincing statistics. As shown by Gottlöber & Yepes (2007) and Teklu et al. (2015) there exists a complex interplay between spin, mass and morphology of the dark matter and the gas component within halos. It will be highly interesting to investigate the environmental dependence of such halos using the scale dependent correlation coefficients.
Empirical relations, like the Tully–Fisher or the fundamental plane relation, are special correlations between the properties of a galaxy (see e.g. Kelson et al. 2000; Saulder et al. 2013 and references therein). These empirical relations, like the fundamental plane, depend on the amount of substructure in the objects (see Fritsch & Buchert 1999 for galaxy clusters). Hence one can expect that an extended version of the scale dependent correlation coefficients could be used to investigate the spatial scale dependence of such empirical relations.
As already mentioned a detailed modelling of this signal of conformity is the next step. Purely geometric models, like the toy–model in Appendix B help us to appreciate the method, but often do not promote a physical understanding. Hence clearly more physically motivated models are needed.
Inspired by the ideas of hierarchical structure formation in dark matter models, the halo model was designed to explain the clustering of galaxies (see Cooray & Sheth 2002 for a review). The halo model is able to reproduce the signal from the mark weighted correlation function out to 20 Mpc (Skibba et al. 2006, see also Paranjape et al. 2015; Pahwa & Paranjape 2016 for a more detailed model of galactic conformity). Within these models the contribution from the so-called 2–halo term seems necessary to explain conformity on large scales. A physical explanation of galactic conformity from structure formation, also called assembly bias, is given by Hearin et al. (2016). Their explanation is elaborated for pair distances below 10 Mpc, but possibly their arguments could be extended to larger scales, too.
Another approach is based on the peak theory (Bardeen et al., 1986). Recently Verde et al. (2014) calculated the Lagrangian (formation) bias for a Gaussian density field. The matter density field can be approximated more reliably using a logarithmic transformation (Falck et al., 2012), which could serve as an improved starting point for such a bias calculation. Closely related to the log–normal density field, the log–normal model for the galaxy distribution (Coles & Jones, 1991; Møller et al., 1998) can be used as a stochastic model for the point and mark distribution. For such an intensity marked point process, the mark correlation functions can be calculated explicitly (Ho & Stoyan, 2008; Myllymäki & Penttinen, 2009). The adaptation to the galaxy distribution will reveal whether a natural parametrisation is possible within this model.
Acknowledgements.
I would like to thank Stefan Gottlöber for the hospitality and the discussions at the AIP. I am grateful to Kristin Riebe and Ben Hoyle for support and information on using the CosmoSim and the SDSS database, respectively. Special thanks to Claus Beisbart and Alex Szalay; some of the ideas for this analysis emerged from discussions now more than ten years ago. For comments on the manuscript I would like to thank Thomas Buchert, Stefan Gottlöber and Volker Müller. I appreciate very much the comments by Simon White, who suggested the alternative definition of the scale dependent correlation coefficient in Eq. (7) to me. I would like to thank the anonymous referee for his constructive and helpful comments. Funding for SDSS-III has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, and the U.S. Department of Energy Office of Science. The SDSS-III web site is http://www.sdss3.org/. SDSS-III is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS-III Collaboration including the University of Arizona, the Brazilian Participation Group, Brookhaven National Laboratory, Carnegie Mellon University, University of Florida, the French Participation Group, the German Participation Group, Harvard University, the Instituto de Astrofisica de Canarias, the Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins University, Lawrence Berkeley National Laboratory, Max Planck Institute for Astrophysics, Max Planck Institute for Extraterrestrial Physics, New Mexico State University, New York University, Ohio State University, Pennsylvania State University, University of Portsmouth, Princeton University, the Spanish Participation Group, University of Tokyo, University of Utah, Vanderbilt University, University of Virginia, University of Washington, and Yale University.
This research made use of the “K-corrections calculator” service, especially the Python code, available at http://kcor.sai.msu.ru/. The CosmoSim database used in this paper is a service by the Leibniz-Institute for Astrophysics Potsdam (AIP). The MultiDark database was developed in cooperation with the Spanish MultiDark Consolider Project CSD2009-00064. The Bolshoi and MultiDark simulations have been performed within the Bolshoi project of the University of California High-Performance AstroComputing Center (UC-HiPACC) and were run at the NASA Ames Research Center. The MultiDark Planck (MDPL) and the BigMD simulation suite have been performed on the SuperMUC supercomputer at LRZ using time granted by PRACE. Python with scipy has been used for the numerical analysis and R with ggplot2 for the plotting (Jones et al., 2001–2017; R Core Team, 2015; Wickham, 2009).
Appendix A Samples
A.1 Galaxy catalogues from the SDSS III, DR12
In the SDSS DR12 data release (Alam et al. 2015; Eisenstein et al. 2011) each galaxy comes with a wealth of properties. The galaxy samples for the analysis are built in two stages. First, a basic galaxy sample is obtained from the SDSS database, then derived quantities are calculated and the volume limited samples are constructed. Our basic galaxy sample was extracted from the SDSS database, as provided via CasJobs: http://skyserver.sdss.org/CasJobs/, using the SQL script shown in Fig. 7. The query starts with the view SpecPhoto and joins it with Galaxy and SpecObj to gain access to further photometric and spectroscopic parameters. The joins with the tables stellarMassPCAWiscM11, stellarMassPCAWiscBC03, and stellarMassStarformingPort are used to obtain the stellar mass estimates. In the joins with the stellar mass tables some galaxies could not be matched and 0.11% of the galaxies are lost. The function fCosmoDl provided in the SDSS database is used to calculate the luminosity distance from the redshift, using a Planck–like cosmology consistent with the MultiDark simulations, see Appendix A.2. The selection in the where clause is mostly the original selection as used for the SDSS main galaxy sample (Strauss et al. 2002). From this basic sample the following parameters are calculated for each galaxy.

Absolute magnitudes: The absolute magnitude in the –band is calculated from the extinction corrected (dereddened) model magnitude using , with the distance modulus and the luminosity distance in pc. The absolute magnitude is K-corrected using the Python code from http://kcor.sai.msu.ru/, version 2012, implementing the methods described in Chilingarian et al. (2010) and Chilingarian & Zolotukhin (2012). See also the comparison of several K-corrections in O’Mill et al. (2011).

Ellipticities: The ellipticities of the galaxies are calculated from the Stokes parameters and using . The Stokes parameters and have been estimated from the intensity profile of the galaxies in the –band using the adaptive moments and respectively (Bernstein & Jarvis 2002). This ellipticity is an estimate of the observed 2D–ellipticity on the sky. No attempt is made to derive a 3D/de–projected ellipticity.

Stellar mass content: Using the photometry and the spectra one can estimate the stellar mass content of a galaxy. The following three mass estimates can be retrieved from the SDSS database. They use different stellar population synthesis models and different methods: The table stellarMassPCAWiscM11 provides stellar mass estimates using the method of Chen et al. (2012) with the stellar population synthesis models of Maraston & Strömbäck (2011). These are the stellar mass estimates used for the plots in Fig. 2.
The table stellarMassPCAWiscBC03 provides stellar mass estimates using the method of Chen et al. (2012) with the stellar population synthesis models of Bruzual & Charlot (2003), and the table stellarMassStarformingPort provides stellar mass estimates using the method of Maraston et al. (2006), see also Maraston et al. (2013). Irrespective of the method, is used as a mark in the analysis. is the stellar mass content and the solar mass. Both and the magnitude are logarithmic in mass and luminosity respectively.

Velocity Dispersion: The velocity dispersion inside the galaxy is estimated from the spectra as described in Bolton et al. (2012) and is directly read from the database view SpecObj.
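The derived marks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the absolute magnitude uses the standard distance-modulus relation with the luminosity distance in pc, and the ellipticity is taken as the modulus of the two Stokes parameters, which is a common convention but is not confirmed by the text as the exact formula used. The K-correction is passed in as a plain number; it is not computed here.

```python
import numpy as np

def absolute_magnitude(m_app, d_lum_pc, k_corr=0.0):
    """Absolute magnitude from a dereddened apparent magnitude.

    Assumes the standard distance modulus DM = 5 log10(d_L / pc) - 5
    and subtracts an externally supplied K-correction (assumption).
    """
    d_lum_pc = np.asarray(d_lum_pc, dtype=float)
    dist_mod = 5.0 * np.log10(d_lum_pc) - 5.0
    return np.asarray(m_app, dtype=float) - dist_mod - k_corr

def ellipticity(stokes_q, stokes_u):
    """2D ellipticity as the modulus of the Stokes parameters (assumed form)."""
    return np.hypot(stokes_q, stokes_u)

# Example: a galaxy with m = 15 at d_L = 100 Mpc = 1e8 pc
M = absolute_magnitude(15.0, 1.0e8)
e = ellipticity(0.3, 0.4)
```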
The volume limited samples comprise galaxies with luminosity distance and absolute magnitude . The limiting absolute magnitude is with the (conservative) limiting magnitude and the limiting distance modulus ; also galaxies close by with luminosity distance Mpc are discarded. Mainly, the volume limited sample with and 201722 galaxies is used, but also samples with Mpc and are considered.
A.2 Halo samples from the MultiDark simulations
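The construction of a volume limited sample can be sketched as follows. This is a minimal sketch, assuming that the limiting absolute magnitude is the limiting apparent magnitude minus the distance modulus at the limiting luminosity distance, and ignoring K-corrections in the limit. The function name, the argument names, and the numerical values in the example are hypothetical and are not the cuts used in the paper.

```python
import numpy as np

def volume_limited_mask(abs_mag, d_lum_mpc, m_lim, d_max_mpc, d_min_mpc=0.0):
    """Boolean mask selecting a volume limited sample.

    abs_mag    : absolute magnitudes of the galaxies
    d_lum_mpc  : luminosity distances in Mpc
    m_lim      : limiting apparent magnitude of the survey (assumed input)
    d_max_mpc  : limiting luminosity distance in Mpc
    d_min_mpc  : nearby galaxies below this distance are discarded
    """
    abs_mag = np.asarray(abs_mag, dtype=float)
    d_lum_mpc = np.asarray(d_lum_mpc, dtype=float)
    # distance modulus at the limiting distance (distance converted to pc)
    dm_lim = 5.0 * np.log10(d_max_mpc * 1.0e6) - 5.0
    abs_mag_lim = m_lim - dm_lim
    return (
        (d_lum_mpc > d_min_mpc)
        & (d_lum_mpc < d_max_mpc)
        & (abs_mag < abs_mag_lim)  # brighter means more negative
    )

# Illustrative values only
mask = volume_limited_mask([-18.0, -17.0, -19.0], [50.0, 50.0, 150.0],
                           m_lim=17.5, d_max_mpc=100.0)
```

Every galaxy that passes the mask would be observable throughout the selected volume, which is the property that makes the sample volume limited.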
The halo catalogues are constructed from the so-called MultiDark simulations, dark matter simulations as described in Prada et al. (2012) and Klypin et al. (2016). The MDPL2 and BigMDPL simulations have a box size of 1 Gpc/h and 2.5 Gpc/h respectively, with Planck–like cosmology , , , , . The dark matter halos were identified using the Rockstar halo finder (Behroozi et al. 2013). These halo samples can be downloaded from the CosmoSim database https://www.cosmosim.org/ as described in Riebe et al. (2013).
Fig. 8 shows the SQL–code used to extract one of the desired halo samples from the CosmoSim database. About distinct halos with a virial mass are selected. With snapnum=125 we select the samples and with pId=1 we ask for distinct halos only. The virial mass , the spin , and the shape are used as marks (see below). They can be accessed directly from the database. To facilitate the calculations of the scale dependent correlation functions a random subsample comprising 25% of the halos is used (about halos). A comparison with the results from 10% and 50% subsampling shows that the results for the scale dependent correlation coefficients clearly stabilise for 25% subsampling.
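The random subsampling mentioned above could be done along the following lines. This is a sketch only: the function name and the fixed seed are assumptions, and the catalogue size in the example is arbitrary.

```python
import numpy as np

def random_subsample(n_total, fraction, seed=0):
    """Indices of a random subsample drawn without replacement."""
    rng = np.random.default_rng(seed)
    n_keep = int(round(fraction * n_total))
    return np.sort(rng.choice(n_total, size=n_keep, replace=False))

# Index sets for the three subsampling fractions compared in the text
subsamples = {f: random_subsample(1000, f, seed=42) for f in (0.10, 0.25, 0.50)}
```

Repeating the analysis on each index set and comparing the resulting curves is one way to check that the estimates have stabilised.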

Mass: The mass within the virial radius is calculated from the number of bound particles in the halo. The major task of this phase–space halo finder is to reliably assign the dark matter particles to a halo, using several steps as detailed in Behroozi et al. (2013).

Spin: The dimensionless spin parameter is used to quantify the rotation of galactic systems (see e.g. Fall & Efstathiou 1980).

Shape: The axial ratios of the mass ellipsoid are determined according to the method of Allgood et al. (2006) using the eigenvalues of the (reduced) inertia tensor of the halo. The ratio of the smallest ellipsoid axis to the largest ellipsoid axis is then used as an overall shape parameter .
To investigate systematic effects, halo samples with mass cuts and have also been extracted from the MDPL2 and BigMDPL simulations. In addition, the mass, the spin, and the axis ratios of the ellipsoidal shape determined from FoF–halos have been used (see Riebe et al. 2013 and https://www.cosmosim.org/ for details).
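How axis ratios follow from the eigenvalues of a second-moment tensor can be illustrated with a short sketch. Note the simplification: this uses the plain, unweighted tensor of the particle positions in a single pass, not the iterative reduced inertia tensor of Allgood et al. (2006), and the shape values in the paper are read from the database rather than computed this way. All names are hypothetical.

```python
import numpy as np

def axis_ratios(positions):
    """Axis ratios (b/a, c/a) from the unweighted second-moment tensor.

    positions : array of shape (N, 3) with particle coordinates.
    The ellipsoid axes are taken proportional to the square roots of
    the eigenvalues of S_ij = <x_i x_j> about the centre of mass.
    """
    x = np.asarray(positions, dtype=float)
    x = x - x.mean(axis=0)               # centre on the centre of mass
    tensor = x.T @ x / len(x)            # 3x3 symmetric second-moment tensor
    eigvals = np.linalg.eigvalsh(tensor) # ascending order
    c, b, a = np.sqrt(eigvals)           # smallest, middle, largest axis
    return b / a, c / a

# Synthetic triaxial particle cloud with axes 1 : 0.8 : 0.5
rng = np.random.default_rng(3)
cloud = rng.normal(size=(20000, 3)) * np.array([1.0, 0.8, 0.5])
q, s = axis_ratios(cloud)
```

Here `s` plays the role of the overall shape parameter, the ratio of the smallest to the largest axis.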
Appendix B A toy model
The following model is a straightforward extension of the marked Poisson process discussed by Wälder & Stoyan (1996). This model will serve as an illustration that one is able to unambiguously extract a scale from a marked point distribution using the scale dependent correlation coefficient . It is not meant to be a viable model for the galaxy distribution.
One starts with a Poisson process, i.e. randomly distributed points with number density . As suggested by Wälder & Stoyan (1996), one assigns to each point the number of other points within a radius as a mark . This mark is a Poisson random variable with the mean mark and the variance . Therefore the probability of observing points in a sphere with radius is .
As an extension of this model, the second mark on a point is slaved to its first mark by . This construction leads to the covariance and perfect overall correlation . For the Poisson point process, one can calculate the desired scale dependent correlation coefficients easily. The point is marked with as described above. If a second point at is more distant than , the number of points inside the sphere around point is independent from the point at and . Under the condition that the second point at is closer than , at least one point is always in the sphere around the point at . Considering a Poisson point process, all the other points are still independent from this point at . Now the probability of observing points in the sphere around is and . This allows us to calculate
for . is the expectation with respect to the probabilities . Joining the results from above one obtains
(9) 
A similar reasoning allows the calculation of : If the second point at is farther away than , we get and and therefore . If the second point is closer than , one obtains and , and
Putting everything together for all radii . The scale cannot be resolved with .
In Fig. 9 the estimated for the marked Poisson process is compared to the theoretical expectation, showing perfect agreement. The jump in is resolved, marking the built-in scale. As it should be, is approximately 1 on all scales. This simple model illustrates that a built-in scale in the correlation pattern of the marks can be resolved unambiguously with , whereas the alternative definition does not allow this.
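The key step of the argument above, that a neighbour closer than the marking radius always adds one point to the count while more distant neighbours leave it unchanged, can be checked numerically. The sketch below is not the code used for Fig. 9. It only simulates the first mark (the neighbour count) in a periodic box and compares its pair-conditioned mean on both sides of the built-in scale. The function names, the box size, the radius, and the number of points are assumptions chosen for illustration, and a fixed number of points stands in for a true Poisson process.

```python
import numpy as np
from scipy.spatial import cKDTree

def marked_poisson(n_points, box, radius, seed=0):
    """Random points in a periodic box, marked with their neighbour counts."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, box, size=(n_points, 3))
    tree = cKDTree(pos, boxsize=box)  # periodic boundaries avoid edge effects
    # number of OTHER points within `radius` (subtract the point itself)
    marks = tree.query_ball_point(pos, r=radius, return_length=True) - 1
    return pos, marks, tree

def conditional_mean_mark(pos, marks, tree, box, r_lo, r_hi):
    """Mean mark of points that have a partner at distance in [r_lo, r_hi)."""
    pairs = tree.query_pairs(r_hi, output_type="ndarray")
    diff = pos[pairs[:, 0]] - pos[pairs[:, 1]]
    diff -= box * np.round(diff / box)  # minimum image convention
    dist = np.linalg.norm(diff, axis=1)
    sel = pairs[(dist >= r_lo) & (dist < r_hi)]
    return marks[sel.ravel()].mean()

n, box, R = 4000, 1.0, 0.05
pos, marks, tree = marked_poisson(n, box, R, seed=1)
mean_mark = n * 4.0 / 3.0 * np.pi * R**3 / box**3  # expected count in a sphere
close = conditional_mean_mark(pos, marks, tree, box, 0.0, R)
far = conditional_mean_mark(pos, marks, tree, box, R, 2.0 * R)
```

For pairs closer than the radius the conditional mean should lie near the mean mark plus one, and for more distant pairs near the mean mark itself, which is the jump at the built-in scale.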
Footnotes
 email: martin.kerscher@lmu.de
 This alternative definition of was suggested to me by Simon White after the first submission of this article.
 The weighted least-squares fit is performed with the function nls from the statistics package R using the inverse variance of the randomised sample as weights.
References
 Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
 Allgood, B., Flores, R. A., Primack, J. R., et al. 2006, MNRAS, 367, 1781
 Andreon, S., Davoust, E., & Heim, T. 1997, A&A, 323, 337
 Bardeen, J. M., Bond, J. R., Kaiser, N., & Szalay, A. S. 1986, ApJ, 304, 15
 Behroozi, P. S., Wechsler, R. H., & Wu, H.-Y. 2013, ApJ, 762, 109
 Beisbart, C. & Kerscher, M. 2000, ApJ, 545, 6
 Beisbart, C., Kerscher, M., & Mecke, K. 2002, in Morphology of Condensed Matter Physics and Geometry of Spatially Complex Systems, ed. K. R. Mecke & D. Stoyan, Lecture Notes in Physics No. 600 (Berlin: Springer Verlag), 358–390
 Bernstein, G. M. & Jarvis, M. 2002, AJ, 123, 583
 Binggeli, B., Tarenghi, M., & Sandage, A. 1990, A&A, 228, 42
 Bolton, A. S., Schlegel, D. J., Aubourg, É., et al. 2012, AJ, 144, 144
 Bruzual, G. & Charlot, S. 2003, MNRAS, 344, 1000
 Chen, Y.-M., Kauffmann, G., Tremonti, C. A., et al. 2012, MNRAS, 421, 314
 Chilingarian, I. V., Melchior, A.-L., & Zolotukhin, I. Y. 2010, MNRAS, 405, 1409
 Chilingarian, I. V. & Zolotukhin, I. Y. 2012, MNRAS, 419, 1727
 Coles, P. & Jones, B. 1991, MNRAS, 248, 1
 Cooray, A. & Sheth, R. 2002, Phys. Rep., 372, 1
 Desjacques, V., Jeong, D., & Schmidt, F. 2016, ArXiv e-prints, arXiv:1611.09787
 Dressler, A. 1980, ApJ, 236, 351
 Eisenstein, D. J., Weinberg, D. H., Agol, E., et al. 2011, AJ, 142, 72
 Falck, B. L., Neyrinck, M. C., Aragon-Calvo, M. A., Lavaux, G., & Szalay, A. S. 2012, ApJ, 745, 17
 Fall, S. M. & Efstathiou, G. 1980, MNRAS, 193, 189
 Faltenbacher, A., Kerscher, M., Gottlöber, S., & Müller, V. 2002, A&A, 395, 1
 Fritsch, C. & Buchert, T. 1999, A&A, 344, 749
 Gottlöber, S., Kerscher, M., Kravtsov, A. V., et al. 2002, A&A, 387, 778
 Gottlöber, S. & Yepes, G. 2007, ApJ, 664, 117
 Grabarnik, P., Myllymäki, M., & Stoyan, D. 2011, Ecological Modelling, 2324, 3888–3894
 Hamilton, A. J. S. 1988, ApJ, 331, L59
 Hearin, A. P., Behroozi, P. S., & van den Bosch, F. C. 2016, MNRAS, 461, 2135
 Hearin, A. P., Watson, D. F., & van den Bosch, F. C. 2015, MNRAS, 452, 1958
 Ho, L. P. & Stoyan, D. 2008, Statistics & Probability Letters, 78(10), 1194–1199
 Jones, E., Oliphant, T., Peterson, P., et al. 2001–2017, SciPy: Open source scientific tools for Python, [Online; accessed April 22, 2017]
 Kaiser, N. 1984, ApJ, 284, L9
 Kauffmann, G., Li, C., Zhang, W., & Weinmann, S. 2013, MNRAS, 430, 1447
 Kelson, D. D., Illingworth, G. D., Tonry, J. L., et al. 2000, ApJ, 529, 768
 Klypin, A., Yepes, G., Gottlöber, S., Prada, F., & Heß, S. 2016, MNRAS, 457, 4340
 Knebe, A. & Power, C. 2008, ApJ, 678, 621
 Lacerna, I., Contreras, S., González, R. E., Padilla, N., & Gonzalez-Perez, V. 2017, ArXiv e-prints, arXiv:1703.10175
 Maraston, C., Daddi, E., Renzini, A., et al. 2006, ApJ, 652, 85
 Maraston, C., Pforr, J., Henriques, B. M., et al. 2013, MNRAS, 435, 2764
 Maraston, C. & Strömbäck, G. 2011, MNRAS, 418, 2785
 Møller, J., Syversveen, A. R., & Waagepetersen, R. P. 1998, Scand. J. Statist., 25, 451
 Myllymäki, M. & Penttinen, A. 2009, Statistica Neerlandica, 63(4), 450
 O’Mill, A. L., Duplancic, F., García Lambas, D., & Sodré, Jr., L. 2011, MNRAS, 413, 1395
 Ostriker, J. P. & Turner, E. L. 1979, ApJ, 234, 785
 Pahwa, I. & Paranjape, A. 2016, ArXiv e-prints, arXiv:1612.00464
 Paranjape, A., Kovač, K., Hartley, W. G., & Pahwa, I. 2015, MNRAS, 454, 3030
 Postman, M. & Geller, M. 1984, ApJ, 281, 95
 Prada, F., Klypin, A. A., Cuesta, A. J., Betancort-Rijo, J. E., & Primack, J. 2012, MNRAS, 423, 3018
 R Core Team. 2015, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria
 Riebe, K., Partl, A. M., Enke, H., et al. 2013, Astronomische Nachrichten, 334, 691
 Saulder, C., Mieske, S., Zeilinger, W. W., & Chilingarian, I. 2013, A&A, 557, A21
 Sheth, R. K. & Tormen, G. 2004, MNRAS, 350, 1385
 Skibba, R., Sheth, R. K., Connolly, A. J., & Scranton, R. 2006, MNRAS, 369, 68
 Stoyan, D. 1984, Math. Nachr., 116, 197
 Stoyan, D. & Stoyan, H. 1994, Fractals, Random Shapes and Point Fields (Chichester: John Wiley & Sons)
 Strauss, M. A., Weinberg, D. H., Lupton, R. H., et al. 2002, AJ, 124, 1810
 Szapudi, I., Branchini, E., Frenk, C., Maddox, S., & Saunders, W. 2000, MNRAS, 319, L45
 Teklu, A. F., Remus, R.-S., Dolag, K., et al. 2015, ApJ, 812, 29
 van der Wel, A., Bell, E. F., Holden, B. P., Skibba, R. A., & Rix, H.-W. 2010, ApJ, 714, 1779
 Vega-Ferrero, J., Yepes, G., & Gottlöber, S. 2017, MNRAS, 467, 3226
 Verde, L., Jimenez, R., Simpson, F., et al. 2014, MNRAS, 443, 122
 Wälder, O. & Stoyan, D. 1996, Biom. J., 38, 895
 Weinmann, S. M., van den Bosch, F. C., Yang, X., & Mo, H. J. 2006, MNRAS, 366, 2
 Wickham, H. 2009, ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag New York)
 Willmer, C., da Costa, L. N., & Pellegrini, P. 1998, AJ, 115, 869
 Zehavi, I., Zheng, Z., Weinberg, D. H., et al. 2011, ApJ, 736, 59