
# Spectral classification and composites of galaxies in LAMOST DR4

Li-Li Wang, A-Li Luo, Shi-Yin Shen, Wen Hou, Xiao Kong, Yi-Han Song, Jian-Nan Zhang, Wu Hong, Zi-Huang Cao, Yong-Hui Hou, Yue-Fei Wang, Yong Zhang, and Yong-Heng Zhao
Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
School of Information Management, Dezhou University, Dezhou 253023, China
University of Chinese Academy of Sciences, Beijing 100049, China
Key Laboratory for Research in Galaxies and Cosmology, Shanghai Astronomical Observatory, Chinese Academy of Sciences,
Shanghai 200030, China
Nanjing Institute of Astronomical Optics & Technology, National Astronomical Observatories, Chinese Academy of Sciences,
Nanjing 210042, China
E-mail: lal@nao.cas.cn
###### Abstract

We study the classification and composite spectra of galaxies in the fourth data release (DR4) of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). We select 40,182 galaxy spectra from LAMOST DR4 that have photometric information but no spectroscopic observations in the Sloan Digital Sky Survey (SDSS). These newly observed spectra are re-calibrated and classified into six classes, i.e. passive, Hα-weak, star-forming, composite, LINER and Seyfert, using the intensities of four lines (Hβ, [OIII]λ5007, Hα and [NII]λ6585). We also study the correlation between spectral classes and morphological types through three parameters: concentration index, (u − r) color and D4000 index. We calculate composite spectra of high signal-to-noise ratio (S/N) for the six spectral classes, and from these composites we pick out features that differentiate the classes effectively, including Balmer lines, Fe5015, HK and an Mg band. In addition, we compare our composite spectra with the SDSS ones and analyse their differences. A galaxy catalogue of the 40,182 newly observed spectra (36,601 targets) and the composite spectra of the six classes are available online.

###### keywords:
techniques: spectroscopic – methods: data analysis – galaxies: statistics – catalogs

## 1 Introduction

Galaxy formation and evolution involve varied and complicated phenomena. One of the major goals of extragalactic astronomy is to understand the nature of the diverse galaxy population. The first step towards this goal is to classify galaxies by some criteria and compare the properties of the resulting classes. Frequently used criteria are morphology, color and spectral features. The morphological classes include elliptical, lenticular, spiral, barred spiral and irregular galaxies (Hubble 1936, Lintott et al. 2008, Shimasaku et al. 2001). The classes by color are blue, green-valley and red (Morgan et al. 1957, Strateva et al. 2001, Martin et al. 2007). Based on emission-line features, galaxies are mainly divided into star-forming galaxies and AGNs (Baldwin, Phillips & Terlevich 1981, Kewley et al. 2001, Kauffmann et al. 2003a, Brinchmann et al. 2004, Kewley et al. 2006, Stasinska et al. 2006, Cid Fernandes et al. 2010). Some authors use more than one criterion: Lee et al. (2008) classified galaxies into 16 classes by morphology, color and spectral features, and Dobos et al. (2012) presented a refined classification using both color and spectral features. In this paper, we classify galaxies from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) survey using spectral features.

Classifications driven by spectral features focus on emission-line galaxies. The classical scheme pioneered by Baldwin, Phillips & Terlevich (1981), dubbed the BPT diagram, has been widely used over the last three decades. Based on the BPT diagnostic diagram, which uses the [OIII]λ5007/Hβ and [NII]λ6585/Hα line ratios, emission-line galaxies are classified into star-forming galaxies, composite galaxies, LINERs and Seyferts. There are several empirical segregation curves on the BPT diagram, such as those of Kewley et al. (2001, hereafter K01), Kauffmann et al. (2003a, hereafter K03) and Kewley et al. (2006, hereafter K06). The curves defined by K03 and K01 mark the borders between star-forming galaxies and AGNs, and the K06 criterion separates Seyferts from LINERs. There are also alternative diagnostic diagrams. For example, Stasinska et al. (2006) proposed the DEW diagram, which uses the D4000 index versus max(EW[OII], EW[NeIII]) to distinguish star-forming galaxies from AGNs. Cid Fernandes et al. (2010) introduced the WHAN diagram, based on the equivalent width of Hα versus the [NII]/Hα ratio, which is able to cope with the large population of weak-line galaxies in SDSS.

Galaxies with no or weak emission are referred to as passive galaxies. Lee et al. (2008) selected passive galaxies as those with no or insufficient Hα emission. Dobos et al. (2012) divided passive galaxies into two further classes: completely passive, with no detectable emission lines, and passive with weak Hα emission. In this paper, we build a classification scheme for LAMOST spectra following Dobos et al. (2012).

Composite spectra have been widely applied in studies of extragalactic objects (Vanden Berk et al. 2001, Eisenstein et al. 2003, Dobos et al. 2012). These high signal-to-noise ratio (S/N) composites reveal variations in the general continuum and weak emission features that are rarely detectable in individual spectra. There are two methods for stacking spectra, the mean spectrum and the median spectrum, which respectively give optimal measurements of the continuum and of the emission lines (Vanden Berk et al. 2001). The mean can be either geometric or algebraic. The geometric mean is suitable for averaging the continuum of power-law spectra such as those of quasars, and the algebraic mean is appropriate for galaxy spectra, whose continuum is essentially a linear superposition of the spectra of various stellar populations. However, the mean composite is not guaranteed to preserve the real emission-line ratios when the line intensities vary significantly within spectral bins (Stasinska et al. 2015) or when some emission lines are noisy. The median spectrum is robust against noise and preserves the relative fluxes of the emission features, but it may yield non-physical continua because it treats the spectral bins independently. In addition to these methods, Yip et al. (2004) and Dobos et al. (2012) compute composites using principal component analysis, which deals simultaneously with noisy and gappy data in which certain spectral bins are masked out because of bad observations or for other reasons. The idea of the gap correction is to reconstruct the missing regions of a spectrum from its principal components.
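The difference between the stacking methods described above can be illustrated with a minimal sketch. It assumes the spectra are already normalised and sampled on a common rest-frame wavelength grid; the function name `stack` and the toy fluxes are hypothetical and are not taken from the paper's pipeline.

```python
import math
import statistics

def stack(spectra, method="median"):
    """Combine spectra bin by bin (hypothetical helper, not the paper's code).

    spectra: list of equal-length flux lists on a common wavelength grid.
    Returns one composite flux value per wavelength bin.
    """
    composite = []
    for column in zip(*spectra):  # one wavelength bin at a time
        if method == "median":
            composite.append(statistics.median(column))
        elif method == "mean":  # algebraic (arithmetic) mean
            composite.append(sum(column) / len(column))
        elif method == "geometric":  # requires strictly positive fluxes
            log_sum = sum(math.log(f) for f in column)
            composite.append(math.exp(log_sum / len(column)))
        else:
            raise ValueError(f"unknown method: {method}")
    return composite

# Toy data: the third spectrum has a noise spike in its last bin.
spectra = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0], [1.0, 2.0, 40.0]]
median_stack = stack(spectra, "median")  # unaffected by the spike
mean_stack = stack(spectra, "mean")      # pulled up by the spike
```

In this toy case the median composite ignores the outlier while the algebraic mean does not, which is the robustness property the text attributes to the median spectrum.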

The ongoing LAMOST spectroscopic survey provides a large data set for studying galaxies in the nearby universe (Luo et al. 2015). Several well-known surveys have been carried out, among which the Sloan Digital Sky Survey (SDSS; York et al. 2000) has observed the largest number of extragalactic targets. LAMOST DR4 contains tens of thousands of galaxy spectra that have photometric information but no spectroscopic observations in SDSS Data Release 13 (DR13; Albareti et al. 2016). These newly observed spectra are the objects of study in this paper. Our primary goals are to (1) classify these spectra using spectral line features (Hβ, [OIII]λ5007, Hα and [NII]λ6585) and present a catalogue with classification information and more accurate flux measurements of the nebular emission lines, and (2) compute composite spectra of the different classes to extract typical spectral features, and compare them with similar composite spectra from SDSS DR7 (Dobos et al. 2012).

The outline of this paper is as follows. Section 2 describes the data set. Section 3 re-calibrates the fluxes of the galaxy spectra and presents the line measurements. In Section 4 we describe our classification scheme for LAMOST galaxies and study the correlation between spectral classes and morphological types. In Section 5 we compute the composite spectra of the different classes to analyze their global spectral properties, and compare our composites with the SDSS ones. A summary is given in Section 6. We assume the cosmological parameters H0 = 100 km s⁻¹ Mpc⁻¹, ΩM = 0.3 and ΩΛ = 0.7.

## 2 Data

### 2.1 Galaxies in LAMOST DR4

LAMOST is dedicated to a spectroscopic survey of celestial objects over the entire available northern sky. The telescope combines a large field of view with a large aperture: it has an effective aperture of 3.6–4.9 m and 4,000 fibers mounted on its focal plane. Its spectra cover 3800 Å to 9000 Å at a resolution of about R = 1800 (Cui et al. 2012, Zhao et al. 2012). The fourth data release (DR4) of LAMOST includes the pilot survey (2011 October to 2012 June) and the regular survey (2012 September to 2016 June), and will be publicly released in June 2018. The LAMOST spectroscopic classification system classifies spectra as STAR, GALAXY, QSO or UNKNOWN. Most broad-line AGNs are classified as 'QSO', and some as 'GALAXY'. To avoid contaminating our sample with broad-line AGNs, we remove spectra that contain any strong broad line with FWHM > 1000 km s⁻¹ (Vanden Berk et al. 2006). As a result, a small number (1%) of galaxy spectra are excluded from our analysis.

The footprints of all galaxies in LAMOST DR4 are shown in Galactic coordinates in Figure 1. The extragalactic survey has two main regions (Luo et al. 2015). One is the Northern Galactic Cap region, where 77,154 spectra were observed. The other is the Southern Galactic Cap region, with 32,453 spectra. In the remaining region, a small number of spectra (8,899) were obtained. These statistics are listed in Table 1.

### 2.2 Sample selection

Of the galaxy spectra in LAMOST DR4, about 65% have spectroscopic counterparts in SDSS DR13, while 40,182 spectra of 36,601 targets have photometric data but no spectroscopic observations in SDSS DR13. We select these galaxy spectra, newly observed by LAMOST, as the sample analysed in this work and call it the 'Main' sample. In Figure 1, the red points mark the footprints of these spectra, and their numbers are detailed in Table 1. We classify these galaxies, archive them in a catalogue, and calculate composite spectra for the different classes for comparison with those from SDSS spectra.

The target selection of LAMOST galaxies is unique in both spatial distribution and specific science aims. The general principle is to observe targets that SDSS fibers did not visit. Thus a great many objects in the LAMOST input catalogue lie in the Southern Galactic Cap region, while the objects in the Northern Galactic Cap region are those rejected by SDSS because of target density.

In the Northern Galactic Cap region of our input catalogue, some targets in a specific area are given the highest priority; all of these galaxies have r-band Petrosian magnitudes (corrected for Galactic reddening) brighter than r = 17.77 (Shen et al. 2016). These targets are candidate members of galaxy pairs. A galaxy pair is typically defined from the projected distance and the recessional velocity difference of two neighbouring galaxies, and pairs are useful for probing galaxy interactions and mergers (Barton et al. 2000, Nikolic et al. 2004). A large number of galaxy pairs have been identified from the SDSS main galaxy sample (Ellison et al. 2008, 2011). However, a small fraction (< 10%) of the SDSS main galaxy sample has not been targeted with spectroscopy because of fiber collisions. These missed galaxies have a very high probability of being in galaxy pairs. To obtain more galaxy pair candidates, the missed galaxies have been compiled into the LAMOST input catalogue. In our Main sample, 2,859 spectra come from this input catalogue of galaxy pair candidates missed by SDSS.

In the Southern Galactic Cap region there is a special survey strategy, the LAMOST Complete Spectroscopic Survey of Pointing Area (LCSSPA; Lam et al. 2015). It is designed to complete the spectroscopic observations of all Galactic and extragalactic sources in two selected fields of 20 square degrees. The sources are selected from SDSS Data Release 9 (DR9; Ahn et al. 2012) with r-band PSF magnitudes and Petrosian magnitudes, respectively, in the range 14.0 < r < 18.1. The central coordinates of the two fields are (α, δ) = (37.88150939, 3.43934500) and (21.525988792, −2.200949833). Our Main sample contains 4,493 galaxy spectra targeted from the LCSSPA input catalogue. The science cases for the observed galaxies include studies of galaxies, clusters of galaxies and luminous infrared galaxies.

### 2.3 Properties of the Main Sample

Figure 2 displays the properties of our Main sample: the distributions of Petrosian magnitudes in the g, r and i bands, the redshift distribution, and the S/N as a function of wavelength. The median magnitudes in the g, r and i bands are 18.0, 17.2 and 16.7, respectively. The mean redshift is 0.091 and about 56.9% of the galaxies have redshift z < 0.091, which is a consequence of the LAMOST target selection. The third panel presents the variation of S/N with wavelength for three r-band magnitude bins: [15.5,16.5), [16.5,17.5) and [17.5,18.5). The median S/N values in the r band for the three bins are 17, 13 and 7, respectively.

## 3 Line intensity measurements

Precise line intensity measurement plays a key role in the classification that follows, since our method is based on spectral line features. Galaxies display a very rich stellar absorption-line spectrum. Although late-type galaxies tend to have stronger emission lines and spectra dominated by hotter, more featureless stars, stellar Balmer absorption can still be substantial (2–4 Å; Kauffmann et al. 2003b). The emission lines should therefore be measured after the stellar absorption features have been subtracted. To address this problem, we can use the population synthesis models of Bruzual & Charlot (2003, hereafter BC03) to fit the continuum with a non-negative linear least-squares routine. The fitting procedure automatically accounts for weak metal absorption under the forbidden lines and for Balmer absorption (Bruzual & Charlot 2003).

Population synthesis methods require accurate galaxy continua. However, the current method of relative flux calibration introduces some uncertainty in the continuum shape of LAMOST spectra (Luo et al. 2015), which may lead to imprecise continuum fitting. In the flux calibration currently used for LAMOST spectra, F-type dwarfs with high-quality spectra are chosen as standards to derive the total response curve, but the reddening of these standard stars is uncertain. The extinction uncertainty of the standard stars can therefore introduce errors into the derived response curve and hence into the flux calibration. This in turn may bias the continuum shapes of the resulting spectra, affecting spectra at low Galactic latitude most severely. We check the uncertainty of the flux calibration by comparing synthetic magnitudes of LAMOST spectra with SDSS photometric magnitudes in the g, r and i bands, after correcting for Galactic extinction (Schlegel et al. 1998). Figure 3 shows the color differences between the two as a function of Galactic latitude. The median differences in the two colors are −0.1 and −0.08 at high latitude, and −0.16 and −0.11 at low latitude. The differences in both colors suggest that LAMOST spectra are bluer than SDSS photometry, and the difference grows as the Galactic latitude decreases. The most probable cause is that the uncertain reddening of the standard stars propagates into the response curves (Du et al. 2016), and the lower the latitude of the spectra, the more uncertain the response curves are. To obtain more accurate line intensity measurements, we therefore re-calibrate the LAMOST galaxy spectra to correct the shape of their continua.

### 3.1 Re-calibration of LAMOST spectra

We re-calibrate LAMOST fluxes using the g, r and i fiber magnitudes obtained by cross-matching our sample with the SDSS DR13 photometric catalogue. The original fluxes of the released LAMOST spectra are corrected with a low-order polynomial so that they correspond to the photometric magnitudes. To quantify the accuracy of the re-calibration, we compare 5,000 randomly selected re-calibrated LAMOST spectra with their SDSS spectral counterparts in the observed frame. Before the comparison, both the LAMOST and the SDSS spectra are normalized by the median flux in the 4600–4800 Å region, which is chosen to be devoid of strong emission lines, and rebinned to 1 Å. The difference is measured by the ratio between the fluxes of the LAMOST spectra and those of the corresponding SDSS spectra. In Figure 4, the upper panel compares the fluxes of LAMOST and SDSS, and the lower panel compares their median flux errors. In the upper panel, the mean ratio (red solid line) is around 1.0 over the whole wavelength coverage, except in some regions of sky emission lines and telluric bands, where the deviations are attributed to uncertainties in flat-fielding and sky subtraction. The standard deviation (red dashed curve) is less than 7% from 4,500 Å to 8,000 Å and increases to 9% below 4,500 Å because of the rapid decline of the instrumental throughput. In the lower panel, the median errors of the LAMOST flux in each wavelength bin are greater than those of SDSS, showing that LAMOST flux errors contribute more to the scatter of the flux ratio.
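The comparison metric described above (normalise each spectrum by its median flux in 4600–4800 Å, then take the flux ratio) can be sketched as follows. This is a minimal illustration that assumes both spectra have already been rebinned onto the same wavelength grid; the function names and the toy numbers are hypothetical.

```python
import statistics

def normalise(wave, flux, lo=4600.0, hi=4800.0):
    """Divide a spectrum by its median flux inside the [lo, hi] window (in Å)."""
    window = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    scale = statistics.median(window)
    return [f / scale for f in flux]

def flux_ratio(wave, flux_a, flux_b):
    """Per-bin ratio of two normalised spectra sharing one wavelength grid."""
    a = normalise(wave, flux_a)
    b = normalise(wave, flux_b)
    return [x / y for x, y in zip(a, b)]

# Toy spectra on a shared grid (values are illustrative only).
wave = [4500.0, 4650.0, 4700.0, 4750.0, 5000.0]
spec_a = [2.0, 4.0, 4.0, 4.0, 6.0]
spec_b = [1.0, 2.0, 2.0, 2.0, 2.0]
ratio = flux_ratio(wave, spec_a, spec_b)
```

Averaging such ratios over many spectrum pairs, bin by bin, gives the kind of mean-ratio and standard-deviation curves the text describes for Figure 4.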

### 3.2 Line measurements

After the re-calibration, we correct for Galactic extinction using the reddening maps of Schlegel et al. (1998) and the extinction law of Cardelli, Clayton & Mathis (1989). The spectra are then shifted into the rest frame.

We measure the line intensities as follows.

(i) Fit the stellar absorption features and continua of the galaxies. We adopt the basic assumption that any galaxy star formation history can be approximated as a sum of discrete bursts, so that the spectrum of a galaxy is a linear combination of individual stellar spectra of various types taken from a comprehensive library. We use the stellar population synthesis program STARLIGHT (Cid Fernandes et al. 2005), which includes a library of template spectra with different ages and metallicities from the evolutionary synthesis models of BC03. We use three sets of templates with three metallicities (0.5, 1 and 2.5 Z⊙). Each set includes 11 ages (0.005, 0.025, 0.1, 0.2, 0.6, 0.9, 1.4, 2.5, 5, 10 and 13 Gyr). We fit each galaxy with templates of a single metallicity (and different ages) and take the best-fitting model spectrum as the one that yields the minimum χ².

(ii) Subtract the best-fitting stellar population model. The residual spectrum consists of three components (Beck et al. 2016): the emission lines, the noise and a slowly varying background that originates from imperfections in the models. Since the emission lines and the noise are high-frequency components, the background can be easily eliminated by a high-pass filter. We remove any remaining background with a sliding 200-pixel median.

(iii) Fit the lines Hβ, [OIII]λ5007, Hα and [NII]λ6585, which are the lines mainly used in the classification, on the residual spectrum from step (ii). The lines are fitted simultaneously with Gaussians whose centers and widths are adjusted automatically, to avoid deviations caused by the redshift measurement. Hβ and [OIII]λ5007 are each fitted with a single Gaussian. Hα and [NII]λ6585 are fitted with three Gaussians because the three lines [NII]λ6549, Hα and [NII]λ6585 are blended in some galaxies, especially in LINER and Seyfert galaxies. Extensive visual inspection suggests that our line fitting method works well.
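The background removal in step (ii) can be sketched as a sliding-median high-pass filter. This is a minimal illustration, not the authors' code: the function name is hypothetical, and truncating the window at the ends of the spectrum is an assumption, since the text does not say how the edges are treated.

```python
import statistics

def remove_background(residual, window=200):
    """Subtract a sliding median from a residual spectrum.

    residual: flux values after subtracting the stellar population model.
    window:   width of the median window in pixels (200 in the text).
    The window is truncated at both ends of the array (an assumption).
    """
    half = window // 2
    cleaned = []
    for i, value in enumerate(residual):
        lo = max(0, i - half)
        hi = min(len(residual), i + half + 1)
        background = statistics.median(residual[lo:hi])
        cleaned.append(value - background)
    return cleaned

# Toy residual: a constant background of 0.5 with one narrow emission spike.
residual = [0.5] * 50
residual[25] = 10.5
cleaned = remove_background(residual, window=11)
```

Because an emission line occupies only a few pixels of each window, the median tracks the slowly varying background and leaves the line flux nearly untouched.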

The MPA/JHU group provides a public catalogue of flux measurements of the nebular emission lines of SDSS galaxies, derived with an approach (Tremonti et al. 2004) similar to ours. To test the reliability of our line intensity measurements, we compare the nebular emission-line fluxes from the MPA/JHU catalogue with our own estimates for SDSS galaxies. Figure 5 compares the fluxes of Hβ, [OIII]λ5007, Hα and [NII]λ6585 measured by our code with those from the MPA/JHU catalogue. The differences indicate good agreement for all four lines. The fluxes measured by our method are slightly larger than those of the MPA/JHU catalogue (the median offset is 1–2 per cent), and the largest scatter (8 per cent) is found for a Balmer line, probably because of the different stellar population models used to estimate the underlying stellar absorption.

In addition, we compare the line intensities of LAMOST spectra reduced by our code with those of their SDSS counterparts from the MPA/JHU catalogue. Table 2 shows the relative differences in the line intensities of Hβ, [OIII]λ5007, Hα and [NII]λ6585. The table shows no significant differences between LAMOST and SDSS, which means that the LAMOST measurements and reduction process reproduce the SDSS results apart from small differences.

## 4 Classification of galaxies

The Hα line is the best quantitative indicator of the star formation rate in galaxies (Kennicutt 1992). Based on Hα together with three other lines, Hβ, [OIII]λ5007 and [NII]λ6585, we classify the galaxies into six classes: passive, Hα-weak, star-forming, composite, LINER and Seyfert. This is similar to the work of Dobos et al. (2012), but with different class boundaries. The passive and Hα-weak galaxies have no detectable emission lines or only weak Hα emission, while the other classes are emission-line galaxies.

### 4.1 Classification schema

First, we select emission-line galaxies with the criterion that Hα emission is detected at greater than 3σ. We then classify these emission-line galaxies on the BPT diagram using Hβ, [OIII]λ5007, Hα and [NII]λ6585. Note that we apply the 3σ cut only to Hα rather than to all four lines. Cid Fernandes et al. (2010) showed that with a uniform 3σ cut on all four lines, many weak-line galaxies with Hβ and/or [OIII] below the threshold are lost from the BPT diagram. We therefore keep only the cut on Hα in our classification schema.

The emission-line galaxies are divided into star-forming galaxies, composite galaxies, LINERs and Seyferts based on the BPT diagram. There are several widely used empirical segregation curves on BPT diagrams for classification, such as K01, K03 and K06. We use K03 and K01 as the border lines of star-forming galaxies and AGNs.

The empirical segregation curve of K03 is given in Equation (1). Star-forming galaxies lie below this curve.

 log10([OIII]/Hβ)=0.61/[log10([NII]/Hα)−0.05]+1.3. (1)

The empirical segregation curve of K01 is given in Equation (2). Pure AGNs lie above this curve, and composite galaxies lie between the K03 and K01 curves.

 log10([OIII]/Hβ)=0.61/[log10([NII]/Hα)−0.47]+1.19. (2)

To distinguish Seyferts from LINERs, we use an alternative to the K06 dividing line. K06 uses the emission-line ratios [OIII]/Hβ versus [OI]/Hα, or [OIII]/Hβ versus [SII]/Hα, to define Seyferts and LINERs. This requires the extra lines [OI] or [SII] in addition to Hα, Hβ, [OIII] and [NII], which may cause more emission-line galaxies to be lost (Cid Fernandes et al. 2010). Cid Fernandes et al. (2010) transformed the K06 classification scheme into a simpler, more economical criterion (hereafter CF10) that separates Seyferts from LINERs using only Hα, Hβ, [OIII] and [NII]. The border line is given in Equation (3).

 log10([OIII]/Hβ)=1.01∗log10([NII]/Hα)+0.48. (3)

In our work, we use Equation (3) to separate Seyferts from LINERs. This criterion is not only more economical than K06, but also avoids the ambiguous classifications that arise in K06 when more than one diagnostic diagram is involved (Cid Fernandes et al. 2010).

Emission-line galaxies in our Main sample are classified into star-forming galaxies, composite galaxies, LINERs and Seyferts using the BPT diagram shown in Figure 6. The density plot shows the number density of galaxies in our Main sample. The red solid line, blue dashed line and green dot-dashed line represent Equations (1), (2) and (3), respectively. The red stars mark the loci of the composite spectra of our Main sample described in Section 5.3. The cyan triangles mark the loci of the SDSS composite spectra, discussed in detail in Section 5.4.

Passive galaxies and Hα-weak galaxies are the galaxies that do not pass the 3σ cut on Hα. We consider passive galaxies to be those with no Hα emission at all. The Hα-weak galaxies have measurable but weak Hα emission.

In brief, we classify our sample into six classes.

(1) Passive galaxies: no measurable Hα emission.

(2) Hα-weak galaxies: Hα emission is measurable but weak, not strong enough for classification with the BPT diagram.

(3) Star-forming galaxies: below the red line defined by Equation (1) in Figure 6. Galaxies in this class show evident star formation signatures but no active nucleus.

(4) Composite galaxies: between the red line defined by Equation (1) and the blue line defined by Equation (2) in Figure 6. Spectra of this class mix the features of AGNs and star-forming galaxies.

(5) LINER galaxies: above the blue line defined by Equation (2) and below the green line defined by Equation (3) in Figure 6. Galaxies in this class are AGNs with low-ionization nuclear emission-line regions.

(6) Seyfert galaxies: above the blue line defined by Equation (2) and above the green line defined by Equation (3) in Figure 6. Galaxies in this class are Seyfert II galaxies with narrow emission lines.
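The six-class scheme can be written as a short decision function. The sketch below uses the coefficients of Equations (1)–(3) and the 3σ cut on Hα; everything else is an assumption made for illustration: the function names, the ASCII class labels, treating a non-positive Hα flux as "no measurable emission", and returning `None` when a ratio cannot be formed.

```python
import math

def classify_ratios(x, y):
    """Classify an emission-line galaxy on the BPT diagram.

    x = log10([NII]/Halpha), y = log10([OIII]/Hbeta).
    """
    # Equation (1), K03: the curve is only defined for x < 0.05.
    if x < 0.05 and y < 0.61 / (x - 0.05) + 1.3:
        return "star-forming"
    # Equation (2), K01: the curve is only defined for x < 0.47.
    if x < 0.47 and y < 0.61 / (x - 0.47) + 1.19:
        return "composite"
    # Equation (3), CF10: Seyferts lie above the line, LINERs below.
    if y > 1.01 * x + 0.48:
        return "Seyfert"
    return "LINER"

def classify(ha, ha_err, hb, oiii, nii, snr_cut=3.0):
    """Assign one of the six classes from line fluxes (ha_err must be > 0)."""
    if ha <= 0:                    # assumption: no measurable Halpha
        return "passive"
    if ha / ha_err <= snr_cut:     # measurable but below the 3-sigma cut
        return "Halpha-weak"
    if min(hb, oiii, nii) <= 0:    # assumption: ratios undefined, no class
        return None
    return classify_ratios(math.log10(nii / ha), math.log10(oiii / hb))

# Illustrative fluxes in arbitrary units.
example = classify(ha=100.0, ha_err=5.0, hb=30.0, oiii=15.0, nii=30.0)
```

Testing the K03 curve before the K01 curve matters, because every point below K03 is also below K01; the ordering is what confines the composite class to the strip between the two curves.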

### 4.2 Classification results

The number and percentage of galaxies in each class obtained with the above scheme are shown in Table 3. In our Main sample, Hα-weak galaxies are the most numerous and LINER galaxies the least. The percentages differ somewhat from those of the SDSS galaxy classes in Dobos et al. (2012). The percentages of emission-line galaxies (star-forming galaxies, composite galaxies, LINERs and Seyferts) in our classification are higher, because we select emission-line galaxies with a 3σ cut on Hα only, while Dobos et al. (2012) apply a 3σ cut to all four lines. The ratio of Seyferts to LINERs is smaller than in Dobos et al. (2012), which may be caused by the different methods of Seyfert/LINER separation. Dobos et al. (2012) used the separation presented by K06, [OIII]/Hβ versus [OI]/Hα. If we separate the AGNs in our sample into Seyferts and LINERs with the same method as Dobos et al. (2012), the ratio of Seyferts to LINERs is similar to theirs.

### 4.3 Correlation with morphological class

The Hubble type of a galaxy correlates closely with its spectrum (Kennicutt 1992). Kennicutt (1992) provided spectra of a set of 55 nearby galaxies with known Hubble types, covering all types from giant ellipticals to dwarf irregulars. He showed that elliptical galaxies are dominated by absorption features, with nebular emission lines absent or weak, while in Sbc–Sc galaxy spectra the principal nebular emission lines are apparent, with intensities characteristic of star-forming HII regions. In this section, we make a rough quantitative correlation between our spectral classes and morphological classes.

Several frequently used parameters are sensitive to morphology, such as the concentration index (Shimasaku et al. 2001, Strateva et al. 2001, Park & Choi 2005), color (Morgan et al. 1957, Strateva et al. 2001) and the D4000 index (Kauffmann et al. 2003c, Mateus et al. 2006). The concentration index is defined as the ratio of the radii containing 90 and 50 percent of the Petrosian flux in the r band (C = R90/R50). Shimasaku et al. (2001) found a strong correlation between the concentration index and morphological type and suggested a value of 3 as a criterion for the morphological classification of galaxies. However, they also noted that it is difficult to construct a pure early-type galaxy sample based only on the concentration index, since the resulting sample has 20 percent contamination by late-type galaxies. Strateva et al. (2001) showed that there is a good correspondence between the concentration index and Hubble type: elliptical galaxies have values around 3 and disc-dominated galaxies have values around 2–2.5, with 2.6 marking the boundary between early types (elliptical and lenticular) and late types (spiral and irregular). They also analyzed the (u − r) color, a more conventional estimator of galaxy type, and found that (u − r) = 2.22 clearly separates early and late morphological types. The 4000 Å break is another parameter that can be used to separate early- and late-type galaxies: it is small for galaxies with younger stellar populations and large for older galaxies (Kauffmann et al. 2003b, Mateus et al. 2006). Here we compute these three indicators for the spectra in each class and analyze the correlation between spectral classes and morphological types.

We compute the concentration index of the spectra in each class from the r-band values of petroR90 and petroR50, obtained by cross-matching with the SDSS photometric catalogue. The u- and r-band magnitudes are also taken from SDSS to compute the (u − r) color. For the 4000 Å break, we use the narrow definition of D4000 introduced by Balogh et al. (1999), which is the ratio of the average flux density in the 4000–4100 Å band to that in the 3850–3950 Å band. In Figure 7, we show the distributions of concentration index, (u − r) color and D4000 index for each spectral class in our galaxy sample: passive, Hα-weak, star-forming, composite, LINER and Seyfert. Table 4 lists the median values of the three parameters for each spectral class. Figure 7 and Table 4 show that star-forming galaxies are bluer, less concentrated and have smaller D4000 indices, whereas passive galaxies are redder, more concentrated and have larger D4000 indices. The other four classes occupy an intermediate locus in the distributions of color, concentration index and D4000 index, indicating a mix of morphological classes.
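Two of the three morphology indicators are simple ratios and can be sketched directly. The example below assumes the usual convention for the narrow 4000 Å break (red band 4000–4100 Å divided by blue band 3850–3950 Å), a rest-frame spectrum on a uniform wavelength grid so that an unweighted mean is adequate, and flux densities already expressed in the units in which the index is defined. The function names are hypothetical.

```python
def mean_flux(wave, flux, lo, hi):
    """Unweighted mean flux density of the pixels with lo <= wavelength <= hi."""
    band = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    return sum(band) / len(band)

def d4000_narrow(wave, flux):
    """Narrow 4000 Å break: red-band mean divided by blue-band mean."""
    red = mean_flux(wave, flux, 4000.0, 4100.0)
    blue = mean_flux(wave, flux, 3850.0, 3950.0)
    return red / blue

def concentration_index(petro_r90, petro_r50):
    """Ratio of the radii enclosing 90 and 50 percent of the Petrosian flux."""
    return petro_r90 / petro_r50

# Toy rest-frame spectrum with a step at 3975 Å (illustrative values only).
wave = [float(w) for w in range(3800, 4201, 10)]
flux = [1.0 if w < 3975.0 else 2.0 for w in wave]
break_strength = d4000_narrow(wave, flux)
```

A spectrum with a strong break, as in the toy step above, gives a large index, consistent with the older stellar populations the text associates with passive galaxies.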

In Figure 7, the black dashed line marks the optimal value that separates the two extreme classes, star-forming and passive galaxies. Following Mateus et al. (2006), we define two parameters, reliability and completeness, and find the optimal value that maximizes their product. The reliability is the fraction of galaxies from a given spectral class that are correctly classified using the optimal value, and the completeness is the fraction of all galaxies of a given spectral class that are actually selected. Both are computed for star-forming galaxies and for passive galaxies. We find that a concentration index of 2.55, (u − r) = 2.3 and D4000 = 1.5 are the optimal separators between star-forming and passive galaxies, close to the values obtained by Mateus et al. (2006). The reliability and completeness of these separators are listed in Table 5. These three boundary values are also close to those given in the literature for separating early and late types (Strateva et al. 2001, Kauffmann et al. 2003a,b).
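The search for an optimal separator can be sketched as a scan over candidate thresholds. The sketch rests on several assumptions that the text does not spell out: star-forming galaxies are selected below the threshold and passive galaxies at or above it, reliability is computed as the fraction of selected galaxies that truly belong to the class, and the quantity maximised is the product of all four numbers (reliability and completeness for both classes). The function names and toy values are hypothetical.

```python
def score(threshold, sf_values, passive_values):
    """Product of completeness and reliability for both classes."""
    sf_below = sum(v < threshold for v in sf_values)
    passive_below = sum(v < threshold for v in passive_values)
    sf_above = len(sf_values) - sf_below
    passive_above = len(passive_values) - passive_below

    def ratio(numerator, denominator):
        # An empty selection counts as zero rather than raising an error.
        return numerator / denominator if denominator else 0.0

    completeness_sf = ratio(sf_below, len(sf_values))
    reliability_sf = ratio(sf_below, sf_below + passive_below)
    completeness_p = ratio(passive_above, len(passive_values))
    reliability_p = ratio(passive_above, passive_above + sf_above)
    return completeness_sf * reliability_sf * completeness_p * reliability_p

def best_threshold(candidates, sf_values, passive_values):
    """Return the candidate threshold with the highest score."""
    return max(candidates, key=lambda t: score(t, sf_values, passive_values))

# Toy parameter values (e.g. a concentration index) for the two classes.
sf = [1.0, 2.0, 3.0]
passive = [4.0, 5.0, 6.0]
optimum = best_threshold([2.5, 3.5, 4.5], sf, passive)
```

For a parameter that is larger in passive galaxies, such as the three indicators used here, the same scan applies unchanged; only the list of candidate thresholds differs.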

### 4.4 Galaxy catalogue

One aim of our work is to provide a catalogue of the galaxies in our Main sample, i.e. the galaxies newly observed by LAMOST that have photometric information but no spectroscopic observations in SDSS. The catalogue gives the fluxes of the nebular emission lines and the classification information. It contains 40,182 entries (36,601 targets), and Table 6 lists a part of it. Each entry in Table 6 contains five kinds of information: basic information retrieved from the LAMOST catalogue; Petrosian magnitudes cross-matched with the photometric catalogue of SDSS; three parameters that correlate with morphological type; the line fluxes computed in Section 3; and the spectral class from our classification scheme. Note that the morphological types are not given directly; instead, we provide three parameters related to them: concentration index, (u - r) color and D4000 index. Records are marked with ‘’ at the upper right corner of the designation if the spectra are from a subsample in the Northern Galactic Cap, which includes 2,859 spectra complementary to the SDSS main galaxy sample in the study of galaxy pairs. Records marked with ‘’ are spectra targeted by the LAMOST Complete Spectroscopic Survey of Pointing Area at Southern Galactic Cap, which includes 4,493 spectra. The complete catalogue is available at http://sciwiki.lamost.org/downloads/wll.

## 5 Composite spectra

Composite spectra have high S/N, which makes them suitable for detecting spectral features, and they can also be used as templates for galaxy classification. In this section, we calculate the composite spectra of the different classes to analyze the global continuum and the spectral features, and to explore which spectral features are sensitive to the class. We use the median to generate the composite spectra, because the median spectrum preserves the relative fluxes of the emission features and does not alter the emission-line ratios (Vanden Berk et al. 2001), which is very important for classification.

### 5.1 Constructing the composite spectra

We select spectra from our Main sample using redshift and absolute magnitude cuts in order to remove the effects of evolution and Malmquist bias on the composite spectra. The redshift and magnitude ranges for each class in our volume-limited sample are listed in Table 7. These ranges are the same as those of Dobos et al. (2012).

We pre-process the spectra before calculating the composites of the different classes. First, the spectra are normalized by the median flux in the 4600-4800 Å region, in which there are no strong emission lines. The spectra are then shifted to the rest frame and rebinned to 1 Å. The final median composite spectra of the different classes are plotted in Figure 8. In each panel, the yellow shaded spectra are randomly selected spectra from the class and the red spectrum is the composite spectrum of the galaxies in this class. The composite spectrum of passive galaxies shows no emission lines at all, and the composite spectrum of Hα-weak galaxies has weak Hα emission and no Hβ, [OIII] or [NII] emission lines. The other four composite spectra, those of the emission-line classes, show emission lines of different intensities. In general, the composite spectrum of LINERs is similar to that of Hα-weak galaxies, except for the presence of [OIII] emission in the LINER composite. In fact, some ‘retired galaxies’ (RGs), i.e. galaxies that have stopped forming stars and are ionized by their hot low-mass evolved stars, lie in the LINER region of the BPT diagram (Cid Fernandes et al. 2011). Cid Fernandes et al. (2011) argued that RGs and passive galaxies are very similar objects.
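The pre-processing and stacking steps can be sketched as follows. This is a minimal illustration, not the pipeline used in the paper: all function names are hypothetical, the 4600-4800 Å normalization window is assumed to refer to rest-frame wavelengths (so the sketch de-redshifts before normalizing), and plain linear interpolation stands in for whatever rebinning scheme was actually applied.

```python
from bisect import bisect_right
from statistics import median


def to_rest_grid(wave_obs, flux, z, grid):
    """Shift to the rest frame and linearly interpolate onto `grid`.

    `wave_obs` must be strictly increasing. Grid points outside the
    spectrum's rest-frame coverage are returned as None.
    """
    wave_rest = [w / (1.0 + z) for w in wave_obs]
    out = []
    for g in grid:
        if g < wave_rest[0] or g > wave_rest[-1]:
            out.append(None)
            continue
        i = min(bisect_right(wave_rest, g), len(wave_rest) - 1)
        w0, w1 = wave_rest[i - 1], wave_rest[i]
        t = (g - w0) / (w1 - w0)
        out.append(flux[i - 1] + t * (flux[i] - flux[i - 1]))
    return out


def normalise(grid, flux, lo=4600.0, hi=4800.0):
    """Divide by the median flux in the (assumed rest-frame) window."""
    window = [f for g, f in zip(grid, flux) if f is not None and lo <= g <= hi]
    scale = median(window)
    return [None if f is None else f / scale for f in flux]


def median_composite(spectra, grid):
    """Median stack of (wave_obs, flux, z) tuples on a common rest grid."""
    rows = [normalise(grid, to_rest_grid(w, f, z, grid)) for w, f, z in spectra]
    composite = []
    for column in zip(*rows):
        values = [v for v in column if v is not None]
        composite.append(median(values) if values else None)
    return composite
```

A 1 Å grid is obtained with, for example, `grid = [4500.0 + i for i in range(401)]`; each grid point uses only the spectra that cover it.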

The error of the median composite spectrum is calculated using the method of Vanden Berk et al. (2001), i.e. by dividing the 68% semi-interquantile range of the flux densities by the square root of the number of spectra contributing to each bin. The wavelength, flux and flux uncertainty of the median spectra are available online.
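The uncertainty estimate for a single wavelength bin can be sketched as below. The sketch interprets the 68% semi-interquantile range as half the difference between the 84th and 16th percentiles of the contributing flux values. That interpretation, the linear percentile interpolation and the function names are assumptions of this illustration.

```python
import math


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of sorted data."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def median_error(fluxes):
    """Half of the central 68% range divided by sqrt(number of spectra)."""
    values = sorted(fluxes)
    half_range = 0.5 * (percentile(values, 84.0) - percentile(values, 16.0))
    return half_range / math.sqrt(len(values))
```

The function is applied bin by bin to the normalized flux values that enter the median at that wavelength.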

### 5.2 The continuum

We fit the continua of the composite spectra with the population synthesis models of BC03, using the method described in Section 3.2. For each composite spectrum, the best-fitting model is determined as the best linear combination of models. Using the best-fitting models, we extend the wavelength coverage of our composite spectra, which is otherwise limited by the redshift range of our sample. These extended composites allow us to explore more global features, and they can also be used as templates for spectral classification in LAMOST.

### 5.3 Emission and absorption lines

The high S/N of the composites allows us to detect emission and absorption features, and most emission and absorption lines are identified in the composite spectra. We present the emission-line equivalent widths of all composites in Table 8, which are measured using the method described in Section 3.2 and the continuum fits of Section 5.2. The positions of the composites of star-forming galaxies, composite galaxies, LINERs and Seyferts on the BPT diagram are shown in Figure 6 with red star markers. The four markers fall within the regions of their corresponding classes, which suggests that the median method for composites does not alter the line ratios.

We calculate absorption-line indices of the composite spectra, which are summarized in Table 9. These indices are adopted from the Lick indices (Worthey et al. 1994; Worthey & Ottaviani 1997), BH indices (Huchra et al. 1996), DTT indices (Diaz et al. 1989) and D4000 (Balogh et al. 1999). The values of the atomic indices are expressed in angstroms of equivalent width, while those of the molecular indices are expressed in magnitudes.
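The distinction between atomic indices (equivalent width in Å) and molecular indices (magnitudes) can be illustrated with a generic Lick-style measurement. The sketch below is an assumption-laden illustration rather than the measurement code used for Table 9: the pseudo-continuum is a straight line through the mean fluxes of the two side bands, the band average is a plain mean over uniformly sampled pixels, and the passband limits are supplied by the caller (the names and signatures are hypothetical).

```python
import math


def band_mean(wave, flux, lo, hi):
    """Plain average of the flux samples with lo <= wavelength <= hi."""
    values = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    return sum(values) / len(values)


def lick_index(wave, flux, blue, centre, red, molecular=False):
    """Lick-style index from three (lo, hi) passbands in angstroms.

    Atomic indices are returned as an equivalent width in angstroms,
    molecular indices in magnitudes. Assumes uniform wavelength sampling.
    """
    f_blue = band_mean(wave, flux, *blue)
    f_red = band_mean(wave, flux, *red)
    w_blue = 0.5 * (blue[0] + blue[1])
    w_red = 0.5 * (red[0] + red[1])
    slope = (f_red - f_blue) / (w_red - w_blue)
    # Ratio of observed flux to the linear pseudo-continuum in the central band.
    ratios = [f / (f_blue + slope * (w - w_blue))
              for w, f in zip(wave, flux) if centre[0] <= w <= centre[1]]
    mean_ratio = sum(ratios) / len(ratios)
    if molecular:
        return -2.5 * math.log10(mean_ratio)
    return (centre[1] - centre[0]) * (1.0 - mean_ratio)
```

With this sign convention, absorption gives positive values and emission filling the central band drives the index towards negative values.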

Using the absorption-line indices, we can explore which features, besides the four emission lines used in our classification scheme, can separate the different classes of galaxies. Figure 9 shows the capability of the line indices to differentiate the spectral classes. It illustrates that some indices are sensitive to the galaxy class: G4300, Fe4383, Hβ, Fe5015, Mgb, HδA and HγA, which are measured as atomic absorption lines, and CN1, CN2, Mg1, Mg2, CNB, HK, G and MH, which are molecular bands. On the one hand, these features are good class tracers and can be taken into account in spectral classification. On the other hand, they can be used to understand the physical parameters of the different classes, such as age and metallicity. Fe5015, Mgb, Mg1 and Mg2 are metal-sensitive indicators according to the definitions in Bruzual & Charlot (2003) and Thomas et al. (2003b), where three metal-sensitive composite indices are defined: [MgFe]′, [Mg1Fe] and [Mg2Fe]. We calculate these three indices for the six composite spectra; they suggest that the metallicity of the star-forming composite is the lowest and that of the passive composite is the highest. The age-sensitive indices Hβ, HδA and HγA (Bruzual & Charlot 2003; Gallazzi et al. 2003) differ significantly among the six galaxy classes in Figure 9. The composite spectrum of star-forming galaxies has the lowest value of the H index, while that of passive galaxies has the highest, which suggests that the age of galaxies is older as the H index is larger. Besides these indicators, some other features (G band, CN1, CN2, CNB, HK, MH) differ clearly among the classes in Figure 9 and are potential indicators for the study of stellar populations.
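As an illustration of how a composite metal-sensitive index is built from individual Lick indices, the sketch below evaluates the [MgFe]′ index in the form usually quoted from Thomas et al. (2003b), [MgFe]′ = sqrt(Mgb × (0.72 Fe5270 + 0.28 Fe5335)). The function name is hypothetical, and the sketch assumes that the Fe5270 and Fe5335 indices have been measured in Å alongside Mgb.

```python
import math


def mgfe_prime(mgb, fe5270, fe5335):
    """[MgFe]' from Mgb, Fe5270 and Fe5335, all in angstroms."""
    return math.sqrt(mgb * (0.72 * fe5270 + 0.28 * fe5335))
```

Larger values correspond to stronger metal lines, so the index is expected to increase from the star-forming composite to the passive one.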

### 5.4 Comparison with composite spectra of SDSS

Dobos et al. (2012) classified the spectra of galaxies in SDSS DR7 into six classes similar to ours, and presented a set of composite spectra for the different classes. In this section, we compare our composite spectra with theirs. The comparison of the LAMOST and SDSS composites is shown in Figure 10, where the black spectra are from LAMOST and the red ones are from SDSS. The figure shows that the continua and spectral lines of the two sets of composites differ slightly.

The continua of LAMOST composites can be compared with the SDSS ones by color-color diagram. We compute the synthetic magnitudes of the composites in SDSS , , bands. Figure 11 displays the comparison of the and color-color loci of LAMOST composites and SDSS composites. The composites of LAMOST spectra are bluer than those of SDSS by about 0.02 mag in color, while in color the LAMOST composites are redder than SDSS ones by about 0.01 mag. These color differences mainly come from calibration effects.

The relative differences in the equivalent widths of the main emission lines and in the absorption-line indices between the LAMOST and SDSS composites are summarized in Table 10. The emission-line equivalent widths of the LAMOST composites are smaller than those of SDSS, with differences within 40%, and the absorption-line indices of the LAMOST composites are smaller than those of SDSS, with differences within 10%. These differences are consistent with the comparison between average and median composites in Dobos et al. (2012) (see Figures 15 and 16 in their paper). In addition, we plot the SDSS composites of four classes (star-forming, composite, LINER and Seyfert) on the BPT diagram in Figure 6 with cyan triangle markers. The differences in line ratios are small except for the LINER composites, which might be due to the different methods used to separate Seyferts from LINERs when selecting the spectra for the composites.

## 6 Summary

One goal of this paper is to provide a spectral classification of galaxies in LAMOST DR4 according to their spectral line features, and to present a catalogue with this classification information and more accurate flux measurements of the nebular emission lines. From all spectra in LAMOST DR4, we focus on 40,182 spectra of 36,601 targets that have photometric information but no spectroscopic observations in SDSS DR13. Emission lines are key discriminators between galaxy classes, and accurate measurement of emission-line intensities requires subtracting the best-fitting stellar population model. To reduce the error in the population synthesis caused by the uncertainty of the continua, we re-calibrate the galaxy spectra in our sample using SDSS DR13 photometry. We then classify the galaxies into six classes (passive, Hα-weak, star-forming, composite, LINER and Seyfert) based on four well-measured lines: Hβ, [OIII]5007, Hα and [NII]6585. We carry out a preliminary analysis of the classification results, including the statistical distributions and the correlation with morphological types through three parameters: concentration index (C), (u - r) color and D4000 index. A galaxy catalogue with the classification information is provided. From the catalogue, one can also obtain the spectra of two special subsamples: a subsample in the Northern Galactic Cap, which includes 2,859 spectra complementary to the SDSS main galaxy sample in the study of galaxy pairs, and a subsample of 4,493 spectra targeted by the LAMOST Complete Spectroscopic Survey of Pointing Area at Southern Galactic Cap. In this work, we have presented a first look at the classification of galaxies by spectral lines; future work will bring more insight into the physical properties of these classes.

The other goal is to create a set of composite spectra for the various galaxy classes from LAMOST spectra. The continua of the composite spectra are fitted with stellar population synthesis models to extend the wavelength coverage. From the spectral features of the composites, we extract some features that are sensitive to the class, such as Hβ, Fe5015, HγA, HK and the Mg2 band, and investigate the correlation of some features with the age and metallicity of each class. The comparison of our composite spectra with the SDSS ones (Dobos et al. 2012) indicates that they are roughly in agreement except in the emission-line regions.

## Acknowledgements

We thank the anonymous referees for valuable suggestions and comments. We thank Du Bing for helpful discussions on the flux re-calibration and for drawing Figure 4, and Budavári et al. for help with the code of Budavári et al. (2009). This work is supported by the National Key Basic Research Program of China (Grant No. 2014CB845700), and the National Natural Science Foundation of China (Grant Nos. 11390371 and 11233004).

The Guo Shou Jing Telescope (the Large Sky Area Multi-Object Fiber Spectroscopic Telescope, LAMOST) is a National Major Scientific Project built by the Chinese Academy of Sciences. Funding for the project has been provided by the National Development and Reform Commission. LAMOST is operated and managed by National Astronomical Observatories, Chinese Academy of Sciences.

## References

• Albareti et al. (2016) Albareti F.D., Prieto C.A., Almeida A., et al., 2016, arXiv:1608.02013
• Abazajian et al. (2009) Abazajian K. N., Adelman-McCarthy J. K., Agüeros M. A., et al., 2009, ApJS, 182, 543
• Ahn et al. (2012) Ahn C. P., Alexandroff R., Allende Prieto C., et al., 2012, ApJS, 203, 21
• Baldwin, Phillips & Terlevich (1981) Baldwin J. A., Phillips M. M., Terlevich R., 1981, PASP, 93, 5
• Balogh et al. (1999) Balogh M. L., Morris S. L., Yee H. K. C., Carlberg R. G., Ellingson E., 1999, ApJ, 527, 54
• Barton et al. (2000) Barton E. J., Geller M. J., Kenyon S. J., 2000, ApJ, 530, 660
• Beck et al. (2016) Beck R., Dobos L., Yip C.-W., Szalay A. S., Csabai I., 2016, MNRAS, 457, 362
• Brinchmann et al. (2004) Brinchmann J., Charlot S., White S. D. M., et al., 2004, MNRAS, 351, 1151
• Bruzual & Charlot (2003) Bruzual G., Charlot S., 2003, MNRAS, 344, 1000
• Budavári et al. (2009) Budavári T., Wild V., Szalay A.S., Dobos L., Yip C.W., 2009, MNRAS, 394, 1496
• Cardelli, Clayton & Mathis (1989) Cardelli J. A., Clayton G. C., Mathis J. S., 1989, ApJ, 345, 245
• Cid Fernandes et al. (2005) Cid Fernandes R., Mateus A., Sodré L., Stasińska G., Gomes J. M., 2005, MNRAS, 358, 363
• Cid Fernandes et al. (2011) Cid Fernandes R., Stasińska G., Mateus A., Vale Asari N., 2011, MNRAS, 413, 1687
• Cid Fernandes et al. (2010) Cid Fernandes R., Stasińska G., Schlickmann M. S., et al., 2010, MNRAS, 403, 1036
• Cui et al. (2012) Cui X.Q., Zhao Y.H., Chu Y.Q., et al., 2012, RAA (Research in Astronomy and Astrophysics), 12, 1197
• Diaz et al. (1989) Diaz A. I., Terlevich E., Terlevich R., 1989, MNRAS, 239, 325
• Dobos et al. (2012) Dobos L., Csabai I., Yip C.W., et al., 2012, MNRAS, 420, 1217
• Dressler et al. (2004) Dressler A., Oemler A., Jr., Poggianti B. M. et al., 2004, ApJ, 617, 867
• Du et al. (2016) Du B., et al., 2016, ApJS, 227, 27
• Eisenstein et al. (2003) Eisenstein D. J., Hogg D. W., Fukugita M., et al., 2003, ApJ, 585, 694
• Ellison et al. (2008) Ellison S. L., Patton D. R., Simard L., McConnachie A. W., 2008, AJ, 135, 1877
• Ellison et al. (2011) Ellison S. L., Patton D. R., Mendel J. T., Scudder J. M., 2011, MNRAS, 418, 2043
• Gallazzi et al. (2003) Gallazzi A., Charlot S., Brinchmann J., White S. D. M., Tremonti C. A., 2005, MNRAS, 362, 41
• Hubble (1936) Hubble E. P., 1936, in Realm of the Nebulae ( New Haven: Yale Univ. Press)
• Huchra et al. (1996) Huchra J. P., Brodie J. P., Caldwell N., Christian C., Schommer R., 1996, ApJS, 102, 29
• Kauffmann et al. (2003a) Kauffmann G., Heckman T. M., Tremonti C. A., et al., 2003, MNRAS, 346, 1055
• Kauffmann et al. (2003b) Kauffmann G., Heckman T. M., White S.D.M., et al., 2003, MNRAS, 341, 33
• Kauffmann et al. (2003c) Kauffmann G., Heckman T. M., White S.D.M., et al., 2003, MNRAS, 341, 54
• Kennicutt (1992) Kennicutt R. C., Jr., 1992, ApJS, 79, 255
• Kewley et al. (2001) Kewley L. J., Dopita M. A., Sutherland R. S., Heisler C. A., Trevena J., 2001, ApJ, 556, 121
• Kewley et al. (2006) Kewley L. J., Groves B., Kauffmann G., Heckman T., 2006, MNRAS, 372, 961
• Lam et al. (2015) Lam M.I., Wu H., Yang M., et al., 2015, RAA (Research in Astronomy and Astrophysics), 15, 1424
• Lee et al. (2008) Lee J. H., Lee M. G., Park C., Choi Y.Y., 2008, MNRAS, 389, 1791
• Lintott et al. (2008) Lintott C. J., Schawinski K., Slosar A., et al., 2008, MNRAS, 389, 1179
• Luo et al. (2015) Luo A.L., Zhao Y.H., Zhao G., et al., 2015, RAA (Research in Astronomy and Astrophysics), 15, 1095
• Martin et al. (2007) Martin D.C., Wyder T.K., Schiminovich D., et al., 2007, ApJS, 173, 342
• Mateus et al. (2006) Mateus A., Sodré L., Cid Fernandes R., et al., 2006, MNRAS, 370, 721
• Morgan et al. (1957) Morgan W. W., Mayall N. U., 1957, PASP, 69, 291
• Nikolic et al. (2004) Nikolic B., Cullen H., & Alexander P., 2004, MNRAS, 355, 874
• Park & Choi (2005) Park C., & Choi Y.Y., 2005, ApJ, 635, L29
• Pieri et al. (2014) Pieri M. M., Mortonson M. J., Frank S., et al., 2014, MNRAS, 441, 1718
• Schlegel et al. (1998) Schlegel D. J., Finkbeiner D. P., Davis M., 1998, ApJ, 500, 525
• Shen et al. (2016) Shen S.Y., Argudo-Fernández M., Chen L., et al., 2016, RAA (Research in Astronomy and Astrophysics), 16, 43
• Shimasaku et al. (2001) Shimasaku K., Fukugita M., Doi M., et al., 2001, AJ, 122, 1238
• Stasinska et al. (2006) Stasińska G., Cid Fernandes R., Mateus A., Sodré L., Asari N. V., 2006, MNRAS, 371, 972
• Stasinska et al. (2015) Stasińska G., Costa-Duarte M. V., Vale Asari N., Cid Fernandes R., Sodré L., 2015, MNRAS, 449, 559
• Strateva et al. (2001) Strateva I., Ivezić Ž., Knapp G. R., et al., 2001, AJ, 122, 1861
• Thomas et al. (2003b) Thomas D., Maraston C., Bender R., 2003b, MNRAS, 339, 897
• Tremonti et al. (2004) Tremonti C. A., Heckman T. M., Kauffmann G., et al., 2004, ApJ, 613, 898
• Vanden Berk et al. (2001) Vanden Berk D. E., Richards G. T., Bauer A., et al., 2001, AJ, 122, 549
• Vanden Berk et al. (2006) Vanden Berk D. E., Shen J., Yip C. W., et al., 2006, AJ, 131, 84
• Worthey et al. (1994) Worthey G., Faber S. M., Gonzalez J. J., Burstein D., 1994, ApJS, 94, 687
• Worthey & Ottaviani (1997) Worthey G., Ottaviani D. L., 1997, ApJS, 111, 377
• Yip et al. (2004) Yip C. W., Connolly A. J., Vanden Berk D. E., et al. 2004, AJ, 128, 585
• York et al. (2000) York D. G., et al. 2000, AJ, 120, 1579
• Zhao et al. (2012) Zhao G., Zhao Y.H., Chu Y.Q., et al., 2012, RAA (Research in Astronomy and Astrophysics), 12, 723