RAVE 2nd data release

# The Radial Velocity Experiment (RAVE): second data release

T. Zwitter$^{1}$, A. Siebert$^{2,3}$, U. Munari$^{4}$, K. C. Freeman$^{5}$, A. Siviero$^{4}$, F. G. Watson$^{6}$, J. P. Fulbright$^{7}$, R. F. G. Wyse$^{7}$, R. Campbell$^{8,2}$, G. M. Seabroke$^{9,10}$, M. Williams$^{5,2}$, M. Steinmetz$^{2}$, O. Bienaymé$^{3}$, G. Gilmore$^{9}$, E. K. Grebel$^{11}$, A. Helmi$^{12}$, J. F. Navarro$^{13}$, B. Anguiano$^{2}$, C. Boeche$^{2}$, D. Burton$^{6}$, P. Cass$^{6}$, J. Dawe$^{6,\dagger}$, K. Fiegert$^{6}$, M. Hartley$^{6}$, K. Russell$^{6}$, L. Veltz$^{3,2}$, J. Bailin$^{14}$, J. Binney$^{15}$, J. Bland-Hawthorn$^{16}$, A. Brown$^{17}$, W. Dehnen$^{18}$, N. W. Evans$^{9}$, P. Re Fiorentin$^{1}$, M. Fiorucci$^{4}$, O. Gerhard$^{19}$, B. Gibson$^{20}$, A. Kelz$^{2}$, K. Kujken$^{12}$, G. Matijevič$^{1}$, I. Minchev$^{21}$, Q. A. Parker$^{8}$, J. Peñarrubia$^{13}$, A. Quillen$^{21}$, M. A. Read$^{22}$, W. Reid$^{8}$, S. Roeser$^{11}$, G. Ruchti$^{7}$, R.-D. Scholz$^{2}$, M. C. Smith$^{9}$, R. Sordo$^{4}$, E. Tolstoi$^{12}$, L. Tomasella$^{4}$, S. Vidrih$^{11,9,1}$, E. Wylie de Boer$^{5}$

1. University of Ljubljana, Faculty of Mathematics and Physics, Ljubljana, Slovenia
2. Astrophysikalisches Institut Potsdam, Potsdam, Germany
3. Observatoire de Strasbourg, Strasbourg, France
4. INAF, Osservatorio Astronomico di Padova, Sede di Asiago, Italy
5. RSAA, Australian National University, Canberra, Australia
6. Anglo Australian Observatory, Sydney, Australia
7. Johns Hopkins University, Baltimore MD, USA
8. Macquarie University, Sydney, Australia
9. Institute of Astronomy, University of Cambridge, UK
10. e2v Centre for Electronic Imaging, School of Engineering and Design, Brunel University, Uxbridge, UK
11. Astronomisches Rechen-Institut, Center for Astronomy of the University of Heidelberg, Heidelberg, Germany
12. Kapteyn Astronomical Institute, University of Groningen, Groningen, the Netherlands
13. University of Victoria, Victoria, Canada
14. Centre for Astrophysics and Supercomputing, Swinburne University of Technology, Hawthorn, Australia
15. Rudolf Pierls Center for Theoretical Physics, University of Oxford, UK
16. Institute of Astronomy, School of Physics, University of Sydney, NSW 2006, Australia
17. Sterrewacht Leiden, University of Leiden, Leiden, the Netherlands
18. University of Leicester, Leicester, UK
19. MPI fuer extraterrestrische Physik, Garching, Germany
20. University of Central Lancashire, Preston, UK
21. University of Rochester, Rochester NY, USA
22. University of Edinburgh, Edinburgh, UK

###### Abstract

We present the second data release of the Radial Velocity Experiment (RAVE), an ambitious spectroscopic survey to measure radial velocities and stellar atmosphere parameters (temperature, metallicity, surface gravity, and rotational velocity) of up to one million stars using the 6dF multi-object spectrograph on the 1.2-m UK Schmidt Telescope of the Anglo-Australian Observatory (AAO). The RAVE program started in 2003, obtaining medium resolution spectra (median R=7,500) in the Ca-triplet region (8,410–8,795 Å) for southern hemisphere stars drawn from the Tycho-2 and SuperCOSMOS catalogues, in the magnitude range . Following the first data release (Steinmetz et al., 2006), the current release doubles the sample of published radial velocities, now containing 51,829 radial velocities for 49,327 individual stars observed on 141 nights between April 11 2003 and March 31 2005. Comparison with external data sets shows that the new data collected since April 3 2004 have a standard deviation of 1.3 km s$^{-1}$, about a factor of two better than for the first data release. For the first time this data release contains values of stellar parameters from 22,407 spectra of 21,121 individual stars. They were derived by a penalized $\chi^2$ method using an extensive grid of synthetic spectra calculated from the latest version of Kurucz stellar atmosphere models. From comparison with external data sets, our conservative estimates of errors of the stellar parameters for a spectrum with an average signal to noise ratio of are 400 K in temperature, 0.5 dex in gravity, and 0.2 dex in metallicity. We note however that, for all three stellar parameters, the internal errors estimated from repeat RAVE observations of 822 stars are at least a factor 2 smaller. We demonstrate that the results show no systematic offsets if compared to values derived from photometry or complementary spectroscopic analyses. The data release includes proper motions from Starnet2, Tycho-2, and UCAC2 catalogs and photometric measurements from Tycho-2, USNO-B, DENIS and 2MASS. The data release can be accessed via the RAVE webpage: http://www.rave-survey.org and through CDS.

Keywords: catalogs, surveys, stars: fundamental parameters

$\dagger$ deceased

## 1 Introduction

This paper presents the second data release from the Radial Velocity Experiment (RAVE), an ambitious spectroscopic survey of the southern sky which has already observed over 200,000 stars away from the plane of the Milky Way ($|b| > 25^\circ$) and with apparent magnitudes . The paper follows the first data release, described in Steinmetz et al. (2006), hereafter Paper I. It doubles the number of published radial velocities. For the first time it also uses spectroscopic analysis to provide information on values of stellar parameters: temperature, gravity, and metallicity. Note that the latter in general differs from iron abundance, because metallicity is the proportion of matter made up of all chemical elements other than hydrogen and helium in the stellar atmosphere. Stellar parameters are given for the majority of the newly published stars. This information is supplemented by additional data from the literature: stellar position, proper motion, and photometric measurements from the DENIS, 2MASS and Tycho surveys.

Scientific uses of such a data set were described in Steinmetz (2003). They include the identification and study of the current structure of the Galaxy and of remnants of its formation, recent accretion events, as well as discovery of individual peculiar objects and spectroscopic binary stars. Kinematic information derived from the RAVE dataset has been used (Smith et al., 2007) to constrain the Galactic escape speed at the Solar radius to (90 percent confidence). The fact that $v_{\rm esc}^2$ is significantly greater than $2 v_{\rm circ}^2$ (where $v_{\rm circ}$ is the local circular velocity) is a model-independent confirmation that there must be a significant amount of mass exterior to the Solar circle, i.e. it convincingly demonstrates the presence of a dark halo in the Galaxy. A model-dependent estimate yields the virial mass of the Galaxy of  M$_\odot$ and the virial radius of  kpc (90 percent confidence). Veltz et al. (2008) discussed kinematics towards the Galactic poles and identified discontinuities that separate thin disk, thick disk and a hotter component. Seabroke et al. (2008) searched for in-falling stellar streams on to the local Milky Way disc and found that it is devoid of any vertically coherent streams containing hundreds of stars. The passage of the leading tidal stream of the disrupting Sagittarius dwarf galaxy through the Solar neighborhood is therefore ruled out. Additional ongoing studies have been listed in Paper I.

The structure of this paper is as follows: Section 2 is a description of the observations, which is followed by a section on data reduction and processing. Data quality is discussed in Section 4, with a particular emphasis on a comparison of the derived values of stellar parameters with results from an analysis of external data sets. Section 5 is a presentation of the data product, followed by concluding remarks on the results in the context of current large spectroscopic surveys.

## 2 Observations

RAVE is a magnitude limited spectroscopic survey. For this reason it avoids any kinematic bias in the target selection. The wavelength range of 8410 to 8795 Å overlaps with the photometric Cousins $I$ band. However, the DENIS and 2MASS catalogs were not yet available at the time of planning of the observations we present here, so this data release uses the same input catalog as Paper I: the bright stars were selected using $I$ magnitudes estimated from the Tycho-2 $V_T$ and $B_T$ magnitudes (Høg et al., 2000), and the faint ones were chosen by their $I$ magnitudes in the SuperCOSMOS Sky Survey (Hambly et al., 2001), hereafter SSS. Transformations to derive the $I$ magnitude and its relation to the DENIS $I$ magnitude values are discussed in Paper I. There we also comment on the fact that SuperCOSMOS photographic $I$ magnitudes show an offset with respect to DENIS $I$ magnitudes (Fig. 1). So, although the initial magnitude limit of the survey was planned to be 12.0, the actual limit is up to one magnitude fainter.

Although the survey spans a limited range in apparent magnitude, it probes both the nearby and the more distant Galaxy. Typical distances for K0 dwarfs are between 50 and 250 pc, while the K0 giants are located at distances of 0.7 to 3 kpc.

The instrumental setup is similar to the one used in Paper I. Two field plates with robotically positioned fibers are used in turn in the focus of the UK Schmidt telescope at the Anglo-Australian Observatory. A field plate covers a field of view and feeds light to up to 150 fibers, each with an angular diameter of 6.7″ on the sky. With such wide fibers one must take care to avoid chance superpositions with target stars. As a precaution we avoid regions close to the Galactic plane ($|b| < 25^\circ$) or dense stellar clusters. Also, prior to observing, all candidate stars are visually checked for possible contamination using the 1-arcmin SSS thumbnails from the on-line SSS R-band data.

Each field plate contains 150 science fibers, with additional bundles used for guiding. A robot positioner configures the plate for each field by moving each fiber end to the desired position. The associated mechanical stress occasionally causes a fiber to break, so it needs to be repaired. On average a fiber breaks after about 2 years of use and is repaired within the following 8 months. Figure 2 shows the number of fibers which were used successfully to collect star light for each of the 517 pointings. The number varies with time: a period of decline is followed by a sharp rise after the repair of broken fibers on the corresponding field plate. Each pointing was typically used to successfully observe 106 stars. An additional 9 or 10 fibers were used to monitor the sky background.
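As a rough consistency check on these figures, the following back-of-the-envelope sketch estimates how many fibers per plate would be usable at any time. It assumes, as a simplification that is not stated in the text, that every fiber cycles between about 24 months in service and about 8 months awaiting repair; the function name is illustrative only.

```python
def working_fraction(months_in_service: float, months_in_repair: float) -> float:
    """Steady-state fraction of time a fiber is usable (assumed simple duty cycle)."""
    return months_in_service / (months_in_service + months_in_repair)


FIBERS_PER_PLATE = 150

fraction = working_fraction(24.0, 8.0)
expected_working = FIBERS_PER_PLATE * fraction

# Fibers in use per pointing quoted in the text: 106 stars plus 9 or 10 sky fibers.
observed_in_use = 106 + 9.5

print(f"working fraction:           {fraction:.2f}")
print(f"expected working fibers:    {expected_working:.1f}")
print(f"fibers in use per pointing: {observed_in_use:.1f}")
```

Under this assumption about three quarters of the 150 fibers would be available, which is within a few fibers of the number actually used per pointing.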

The light is dispersed by a bench-mounted Schmidt-type spectrograph to produce spectra with a resolving power of $R \sim 7500$. The main improvement introduced since the first data release is the use of a blue light blocking filter (Schott OG531) which blocks the second order spectrum. This allows for an unambiguous placement of the continuum level and so permits the derivation of values of stellar parameters, in addition to the radial velocity. The introduction of the blocking filter lowers the number of collected photons by only %, so we decided to keep the same observing routine as described in Paper I. The observation of a given field consists of 5 consecutive 10-minute exposures, which are accompanied by flat-field and Neon arc calibration frames.

Note that we use two field plates on an alternating basis (fibers from one fiber plate are being configured while we observe with the other field plate). So fibers from a given field plate are mounted to the spectrograph slit prior to observation of each field. To do this the cover of the spectrograph needs to be removed, so its temperature may change abruptly. The associated thermal stress implies that it is best to use the flatfield and Neon arc lamp exposures obtained immediately after the set of scientific exposures when the spectrograph is largely thermally stabilized. For all data new to this data release we ensured that such flatfield and arc lamp exposures have been obtained and used in the data reduction.

Observations were obtained between April 11 2003 and March 31 2005. The observations obtained since April 3 2004 yielded data which were not published in Paper I, so they are new to this data release. Statistics on the number of useful nights, of field centers and of stellar spectra are given in Table 1. These numbers make the present, second data release about twice as large as the one presented in Paper I. Stars were mostly observed only once, but 75 stars from the field centered on R.A. = 16 07, Dec.  were deliberately observed 8 times to study their variability.

Observations are limited to the southern hemisphere and have a distance of at least 25 degrees from the Galactic plane (except for a few test fields). Their distribution is plotted in Figure 21. The unvisited area is concentrated around the Galactic plane and in the direction of the Magellanic clouds.

## 3 Data reduction and processing

The data reduction is performed in several steps:

1. Quality control of the acquired data.

2. Spectra reduction.

3. Radial velocity determination and estimation of physical stellar parameters.

In the first step the RAVEdr software package and plotting tools are used to make a preliminary estimate of data quality in terms of signal levels, focus quality and of possible interference patterns. This serves two goals: to quickly determine which observations need to be repeated because of unsatisfactory data quality, and to exclude any problematic data from further reduction steps. For the first data release 17% of all pointings were classified as problematic, while in this data release the overall dropout rate fell to 13%. Problematic data are kept separately and are not part of this data release. The next two steps of the data reduction process are described below.

### 3.1 Spectra reduction

We use a custom set of IRAF routines which have been described in detail in Paper I. Here we highlight only the improvements introduced for reduction of data new to this data release.

The use of the blue light blocking filter permits a more accurate flatfielding of the data. The spectra have a length of 1031 pixels, and are found to cover a wavelength interval of  Å. The resolving power is the same as estimated in Paper I; we use the value of $R = 7500$ throughout. The camera of the spectrograph has a very fast focal ratio (F/1). The associated optical aberrations at large off-axis angles imply that the central wavelength of the spectrograph is not constant, but depends on the fiber number (Figure 3). This means that the wavelengths covered by a spectrum depend on its fiber number. Also, any residual cross-talk between the spectra in adjacent fibers is generally shifted in wavelength. This makes an iterative procedure to remove illumination from adjacent fibers even more important (see Paper I for details). The peak of central wavelengths around the half-point of their distribution shows that our instrumental setup remained quite stable during the year in which the data new to this data release were obtained.

The determination of radial velocity and stellar parameters is based on the 788 pixels of the central part of the wavelength range only ( Å  Å). This avoids telluric absorption lines and a ghost image caused by internal reflections of non-dispersed light at the borders of the wavelength range which are occasionally present and could jeopardize the results, as described in Paper I. The edges of the spectral interval are avoided also because of a poorer focus, lower resolving power and a lower quality of the wavelength calibration.

Figure 4 plots the average ADU count level of the central part of the final 1-D spectrum, and per one hour of exposure time, as a function of DENIS $I$ magnitude. Only data new to this data release are plotted. The line follows the relation

$$
N_{\rm counts} = 10^{-0.4\,(I_{\rm DENIS} - 20.25)} \tag{1}
$$

where the constant term is the mode of the magnitude corrected count distribution. These count levels are 0.25 mag below those in Paper I. The difference is due to the 2nd order blocking filter. Note however that the filter allowed for a more accurate flatfielding, and so better determined count levels. This information has been used in data quality control.
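To make Eq. 1 concrete, the short sketch below evaluates the expected count level for a few magnitudes and converts the quoted 0.25 mag difference into a fractional loss of counts. The function names are illustrative; only the relation itself and the zero point of 20.25 come from the text.

```python
def counts_per_hour(i_denis: float, zero_point: float = 20.25) -> float:
    """Eq. 1: typical ADU count level per hour of exposure for a given I_DENIS."""
    return 10.0 ** (-0.4 * (i_denis - zero_point))


def flux_ratio(delta_mag: float) -> float:
    """Flux ratio corresponding to a magnitude difference delta_mag."""
    return 10.0 ** (-0.4 * delta_mag)


for mag in (9.0, 10.0, 11.0, 12.0):
    print(f"I_DENIS = {mag:4.1f}: {counts_per_hour(mag):8.0f} ADU per hour")

# Count levels 0.25 mag below those of Paper I correspond to this fractional loss.
loss = 1.0 - flux_ratio(0.25)
print(f"0.25 mag corresponds to {100 * loss:.0f}% fewer counts")
```

A difference of 0.25 mag therefore corresponds to roughly a fifth fewer counts.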

### 3.2 Radial velocity determination

The general routine stayed the same as described in detail in Paper I. Radial velocities are computed from sky-subtracted normalized spectra, while sky unsubtracted spectra are used to compute the zero-point correction. The latter is needed because of thermal variations of the spectrograph which cause a shift of the order of one tenth of a pixel or 1.5 km s$^{-1}$. Radial velocities are computed from cross-correlation with an extensive library of synthetic spectra. A set of 57,943 spectra from Munari et al. (2005), degraded to the resolving power of RAVE, is used. It is based on the latest generation of Kurucz models. It covers all loci of non-degenerate stars in the H-R diagram, with metallicities in the range of . Most spectra have a microturbulent velocity of 2 km s$^{-1}$ (with additional entries for 1 and 4 km s$^{-1}$), while the $\alpha$ enhancements of and are used. The use of the blue blocking filter simplifies the computations, as no contribution from the 2nd order spectrum needs to be considered. Both the observed spectra and theoretical templates are normalized prior to the radial velocity measurement. We use IRAF's task `continuum` with a two-piece cubic spline. The rejection criteria used in 10 consecutive iterations of the continuum level are asymmetric (1.5$\sigma$ low and 3$\sigma$ high).
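The idea of measuring a radial velocity by cross-correlation against a template can be illustrated with a minimal sketch. It is not the pipeline's routine: it assumes a toy template made of three Gaussian absorption lines at approximate Ca II triplet wavelengths, a brute-force grid of trial velocities, and parabolic interpolation of the correlation peak. All function names, line shapes and noise levels are illustrative assumptions.

```python
import numpy as np

C_KMS = 299792.458                          # speed of light in km/s
CA_TRIPLET = (8498.02, 8542.09, 8662.14)    # approximate Ca II line centres in Angstrom


def synthetic_spectrum(wave, velocity_kms=0.0, depth=0.5, sigma=0.6):
    """Normalized toy spectrum with three Gaussian lines, Doppler-shifted by velocity_kms."""
    flux = np.ones_like(wave)
    for centre in CA_TRIPLET:
        shifted = centre * (1.0 + velocity_kms / C_KMS)
        flux -= depth * np.exp(-0.5 * ((wave - shifted) / sigma) ** 2)
    return flux


def radial_velocity(wave, observed, trial_velocities):
    """Velocity maximizing the normalized cross-correlation with the shifted template."""
    obs = observed - observed.mean()
    ccf = np.empty(len(trial_velocities))
    for k, v in enumerate(trial_velocities):
        tpl = synthetic_spectrum(wave, v)
        tpl = tpl - tpl.mean()
        ccf[k] = np.sum(obs * tpl) / np.sqrt(np.sum(obs**2) * np.sum(tpl**2))
    k = int(np.argmax(ccf))
    k = min(max(k, 1), len(ccf) - 2)        # keep the three-point stencil inside the grid
    y0, y1, y2 = ccf[k - 1], ccf[k], ccf[k + 1]
    offset = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)   # vertex of the fitted parabola
    step = trial_velocities[1] - trial_velocities[0]
    return trial_velocities[k] + offset * step


wave = np.linspace(8450.0, 8750.0, 788)
rng = np.random.default_rng(1)
observed = synthetic_spectrum(wave, velocity_kms=37.0) + rng.normal(0.0, 0.025, wave.size)
trials = np.arange(-200.0, 200.0 + 1e-9, 5.0)
v_fit = radial_velocity(wave, observed, trials)
print(f"recovered radial velocity: {v_fit:.1f} km/s (input 37.0 km/s)")
```

In practice the template would be selected from the synthetic library rather than built from Gaussians, and the zero-point correction from sky lines would be applied afterwards.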

Kurucz synthetic spectra used in cross-correlation do not include corrections of radial velocity due to convective motions in the stellar atmosphere or due to a gravitational redshift of light leaving the star (F. Castelli, private communication). The combined shift is in the range of $-0.4$ km s$^{-1}$ for F dwarfs to $+0.4$ km s$^{-1}$ for K dwarfs (Gullberg & Lindegren, 2002), while the near absence of gravitational redshift in giants causes a  km s$^{-1}$ shift between giants and dwarfs. The exact value of these corrections is difficult to calculate, so we follow the Resolution C1 of the IAU General Assembly in Manchester (Rickman, 2002) and report the heliocentric radial velocities without corrections for gravitational or convective shifts in the stellar atmosphere. Note however that these values may be different from the line-of-sight component of the velocity of the stellar center of mass (Lindegren, 1999; Latham, 2001).

In the final data product we report the heliocentric radial velocity and its error, together with the value of the applied zero-point velocity correction, the radial velocity of sky lines and their correlation properties. A detailed description of the data release is given in Section 5.

### 3.3 Stellar parameter determination

The name of the survey suggests that RAVE is predominantly a radial velocity survey. However, the spectral type of the survey stars is generally not known and the input catalog does not use any color criterion, so RAVE stars are expected to include all evolutionary stages and a wide range of masses in the H-R diagram. The properties of the stellar spectra in the wavelength interval used by RAVE strongly depend on the values of the stellar parameters (Munari et al., 2001). While the Ca II IR triplet is almost always present, the occurrence and strength of Paschen, metallic and molecular lines depend on temperature, gravity and metallicity (see e.g. Figure 4 in Zwitter et al. (2004)). So we cannot adopt the common practice of using a small number of spectral templates to derive the radial velocity alone, as is commonly done at, e.g., the ELODIE spectrograph at OHP. We therefore construct the best matching template from a large library of synthetic Kurucz spectra (see Sec. 3.2). The parameters of the best matching spectrum are assumed to represent the true physical parameters in the stellar atmosphere.

Two comments are in order before we outline the template spectrum construction method. First, the template library only covers normal stars. So peculiar objects cannot be classified correctly. Such objects include double lined spectroscopic binaries and emission line objects. Sometimes a peculiar nature of the spectrum can be inferred from a poor match of the templates, despite a high S/N ratio of the observed spectrum.

The second important point concerns the non-orthogonality of the physical parameters we use. This is demonstrated in Figure 5: the wavelength ranges with flux levels sensitive to a change in temperature overlap with those sensitive to metallicity and the rotational velocity. On the other hand, sensitivity to changes in both gravity and temperature depends on spectral type and class. The intermittent lines in Figure 5 mark wavelengths where the normalized flux level changes by at least 3% if the value of one of the parameters is modified by a given amount (temperature by 500 K, gravity or metallicity by 0.5 dex, or rotational velocity by 30 km s$^{-1}$). We note that a 3% change is marginally detectable in a typical RAVE spectrum with S/N , but the non-orthogonality of individual parameters can present a serious problem (see also Figure 1 in Zwitter (2002)). If the temperature or gravity were known a priori, the ambiguities would be largely resolved. An obvious idea is to use photometric colors to constrain the value of stellar temperature. Unfortunately the errors of current photometric surveys are too large: a change of 0.03 mag in corresponds to a shift of 230 K in temperature in a mid-G main sequence star. Also, stellar colors may be seriously compromised by interstellar extinction or by stellar binarity. We therefore decided not to use any outside information but to base our estimates of stellar parameters exclusively on spectral matching. This may change in the future when results of multicolor and multi-epoch all-sky photometric surveys such as SkyMapper (Keller et al., 2007) become available.

Our parameter estimation procedure makes use of a full set of theoretical templates. They span a grid in 6 parameters: temperature, gravity, metallicity, $\alpha$ enhancement, microturbulent and rotational velocity. The sampling in gravity, metallicity, and temperature is very good, with tabulated values for the former two and even more for the temperature. On the other hand the current synthetic library contains only one non-Solar $\alpha$ enhancement value () and only up to 3 values of microturbulent velocity (1, 2, 4 km s$^{-1}$, but only 2 km s$^{-1}$ is available for the whole grid). So we decided to publish values of temperature, gravity and metallicity. The $\alpha$ enhancement values are also listed but they should be interpreted with caution, as they are derived from 2 grid values only. These two values may not span the whole range of $\alpha$ enhancement which is present in nature. Also the error of the $\alpha$ enhancement can be comparable to the whole range of the grid in this parameter (see Sec. 3.3.5). Microturbulent velocity values are not published, because their errors are typically much larger than the range of microturbulent velocities in the grid. Similarly, the rather low resolving power of RAVE spectra does not allow the determination of rotational velocities ($V_{\rm rot}$) for slow rotators which represent the vast majority of RAVE stars. Hence the rotational velocity is not published, but fast rotators will be discussed in a separate paper. So we aim at the estimation of three stellar parameters: effective temperature ($T_{\rm eff}$), gravity ($\log g$), and metallicity ([M/H]). The adopted reference system of these parameters is the latest set of Kurucz template spectra. Next we describe the inverse method used to derive values of stellar parameters.

#### 3.3.1 Method

To derive the stellar parameters, we use a penalized $\chi^2$ technique to construct a synthetic spectrum matching the observed spectrum (for other uses of similar methods see for example Pichon et al. (2002), Ocvirk et al. (2006a)). The observed spectrum is modelled as a weighted sum of template spectra with known parameters and it is assumed that the stellar parameters follow the same weight relation. The continuous problem is therefore written as

$$
\begin{cases}
F_{\mathbf{P}'}(\lambda) = \int \tilde{w}(\mathbf{P})\, S(\lambda, \mathbf{P})\, \mathrm{d}^6\mathbf{P} \\
\mathbf{P}' = \int \tilde{w}(\mathbf{P})\, \mathbf{P}\, \mathrm{d}^6\mathbf{P},
\end{cases} \tag{2}
$$

where $F_{\mathbf{P}'}(\lambda)$ is the spectrum we want the stellar parameters for, $S(\lambda, \mathbf{P})$ are the template spectra with known stellar parameters $\mathbf{P}$, $\mathbf{P}'$ is the stellar parameter set we want to measure and $\tilde{w}(\mathbf{P})$ is the weight function we try to recover. In the perfect case, where we have an infinite number of template spectra and the observed spectrum depends only on the stellar parameters (perfect match between the observed and model spectra), $\tilde{w}(\mathbf{P}) = \delta(\mathbf{P} - \mathbf{P}')$. In a real case where noise plays an important role and a real spectrum cannot be perfectly reproduced, $\tilde{w}(\mathbf{P})$ is not a Dirac function but a smooth function which is non-zero on a limited range. Also, we have the additional constraint $\int \tilde{w}(\mathbf{P})\, \mathrm{d}^6\mathbf{P} = 1$.

In the more general case, we have access to a limited number of templates and the problem becomes discrete. The problem then can be rewritten as

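$$
\begin{cases}
S_{\mathbf{P}}(\lambda) = \sum_i w_i \cdot S_{\mathbf{P}_i}(\lambda) \\
\mathbf{P} = \sum_i w_i \cdot \mathbf{P}_i,
\end{cases} \tag{3}
$$

where $w_i$ is the discrete form of $\tilde{w}(\mathbf{P})$.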

This problem is ill-conditioned, the number of template spectra being larger than the number of pixels, and the information contained in a spectrum being largely redundant. Therefore, we make use of penalization terms to regularize the solution. Also, the recovered weights must be positive to have a physical meaning, which changes the problem from linear to non-linear. The following paragraphs will present briefly the linear problem which has a well defined solution before entering the realm of the non-linear problem. For a full discussion and description of the method, the reader is referred to Pichon et al. (2002), Ocvirk et al. (2006a, b) and references therein.

#### 3.3.2 Linear inverse problem

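The discrete problem of Eq. 3 can be written in a matrix form. Calling $\tilde{\mathbf{y}}$ the observed spectrum, $\mathbf{x}$ the array of weights, $\mathbf{a}$ the library of template spectra and $\mathbf{b}$ the array of parameters, the problem then reads

$$
\begin{cases}
\tilde{\mathbf{y}} = \mathbf{a} \cdot \mathbf{x} + \mathbf{e} \\
\mathbf{P} = \mathbf{b} \cdot \mathbf{x},
\end{cases} \tag{4}
$$

where $\mathbf{e}$ accounts for the noise in the observed spectrum. $\mathbf{a}$ is also referred to as the model matrix or kernel.

Using Bayes' theorem, solving equation 3 or 4 is equivalent to maximizing the a posteriori conditional probability density $f_{\rm post}(\mathbf{x}|\tilde{\mathbf{y}})$ defined as

$$
f_{\rm post}(\mathbf{x}|\tilde{\mathbf{y}}) = \mathcal{L}(\tilde{\mathbf{y}}|\mathbf{x})\, f_{\rm prior}(\mathbf{x}). \tag{5}
$$

Here $f_{\rm prior}(\mathbf{x})$ is our prior on the stellar parameters and $\mathcal{L}(\tilde{\mathbf{y}}|\mathbf{x})$ is the likelihood of the data given the model.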

In the case of Gaussian errors, the likelihood is

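$$
\mathcal{L}(\tilde{\mathbf{y}}|\mathbf{x}) \propto \exp\left(-\frac{1}{2}\,(\tilde{\mathbf{y}} - \mathbf{a}\cdot\mathbf{x})^\top \cdot \mathbf{W} \cdot (\tilde{\mathbf{y}} - \mathbf{a}\cdot\mathbf{x})\right), \tag{6}
$$

where the quadratic form in the exponent is the $\chi^2$ operator

$$
\chi^2(\tilde{\mathbf{y}}|\mathbf{x}) = (\tilde{\mathbf{y}} - \mathbf{a}\cdot\mathbf{x})^\top \cdot \mathbf{W} \cdot (\tilde{\mathbf{y}} - \mathbf{a}\cdot\mathbf{x}), \tag{7}
$$

and $\mathbf{W}$ is the inverse of the covariance matrix of the noise; $\mathbf{W} = \mathrm{cov}(\mathbf{e})^{-1}$. Maximizing $f_{\rm post}$ is equivalent to minimizing the penalty operator $Q(\mathbf{x})$ given by

$$
\begin{aligned}
Q(\mathbf{x}) &= \chi^2(\tilde{\mathbf{y}}|\mathbf{x}) - 2\log\left(f_{\rm prior}(\mathbf{x})\right) && (8) \\
&= \chi^2(\tilde{\mathbf{y}}|\mathbf{x}) + \lambda R(\mathbf{x}), && (9)
\end{aligned}
$$

where in the second form, the a priori probability density has been rewritten as a penalization or regularization operator $R$ and $\lambda$ is a Lagrange multiplier.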

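When $R$ is a quadratic function, e.g. $R(\mathbf{x}) = \mathbf{x}^\top \cdot \mathbf{K} \cdot \mathbf{x}$, the problem has a well defined solution

$$
\mathbf{x} = \left(\mathbf{a}^\top \cdot \mathbf{W} \cdot \mathbf{a} + \lambda \mathbf{K}\right)^{-1} \cdot \mathbf{a}^\top \cdot \mathbf{W} \cdot \tilde{\mathbf{y}}, \tag{10}
$$

and the optimal $\lambda$ is given by the generalized cross validation (GCV): , where .

Using equation 10, the weights $x_i$ are unconstrained and can take negative values. Negative weights have no physical meaning and will result in non-physical solutions. We therefore require that $x_i \geq 0$, which leads to the non linear problem discussed below.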
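The closed-form solution of Eq. 10 can be sketched in a few lines of NumPy. This is a toy illustration rather than the pipeline code: the template library is random, the parameter array holds a single parameter (temperature), and the choices of an identity matrix for $\mathbf{K}$ and $\lambda = 1$ are arbitrary assumptions.

```python
import numpy as np


def linear_weights(a, y, W, K, lam):
    """Eq. 10: x = (a^T W a + lam K)^(-1) a^T W y, solved without explicit inversion."""
    lhs = a.T @ W @ a + lam * K
    rhs = a.T @ W @ y
    return np.linalg.solve(lhs, rhs)


rng = np.random.default_rng(0)
n_pix, n_tpl = 50, 4
a = rng.uniform(0.5, 1.0, size=(n_pix, n_tpl))     # toy template library (one template per column)
b = np.array([[4500.0, 5000.0, 5500.0, 6000.0]])   # toy parameter array: temperature only
x_true = np.array([0.0, 0.7, 0.3, 0.0])
sigma = 0.02
y = a @ x_true + rng.normal(0.0, sigma, n_pix)     # noisy "observed" spectrum
W = np.eye(n_pix) / sigma**2                       # inverse noise covariance
K = np.eye(n_tpl)                                  # simplest quadratic penalty (assumption)
lam = 1.0                                          # arbitrary Lagrange multiplier

x = linear_weights(a, y, W, K, lam)
print("weights:", np.round(x, 3))
print("parameter estimate P = b . x:", (b @ x).round(0))
print("any negative weight:", bool((x < 0).any()))
```

Nothing in this solution constrains the sign of the weights, which is the motivation for the positivity requirement.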

#### 3.3.3 Non-linear extension

Unfortunately, there is no simple extension from the analytic linear problem to the non linear case, and there is no analytic solution for the minimum of $Q$. In the non linear regime, the minimum of $Q$ must be obtained using efficient minimization algorithms, which can be computationally intensive.

Nevertheless, as stressed by Ocvirk et al. (2006a), solving the non linear case also has advantages. First, we obtain a physically motivated solution (with positive or null weights everywhere). Second, imposing positivity significantly reduces the allowed parameter space and the level of Gibbs phenomenon (or ringing artifacts) in the solution. This comes at the price of a higher computing time and asymmetric (non Gaussian) errors.

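To ensure that the weights are positive, we pose $\mathbf{x} = \exp(\boldsymbol{\alpha})$ and solve Eq. 3 for $\boldsymbol{\alpha}$. The exponential transform has the property that $\exp(\alpha_i) \in (0, +\infty)$ while $\alpha_i \in (-\infty, +\infty)$, which ensures that the weights are strictly positive. Equation 9 can be rewritten as

$$
Q(\boldsymbol{\alpha}) = (\tilde{\mathbf{y}} - \mathbf{a}\cdot\exp\boldsymbol{\alpha})^\top \cdot \mathbf{W} \cdot (\tilde{\mathbf{y}} - \mathbf{a}\cdot\exp\boldsymbol{\alpha}) + \lambda_1 P_1(\boldsymbol{\alpha}) + \lambda_2 P_2(\boldsymbol{\alpha}) + \ldots, \tag{11}
$$

and the problem now is to find the minimum of $Q(\boldsymbol{\alpha})$ with respect to $\boldsymbol{\alpha}$. Note that in the last equation, the regularization operator has been split in a set of regularization operators $P_j$, each with its own Lagrange parameter $\lambda_j$. The penalization operators will be discussed in the next section.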

We mentioned that in the linear case the GCV provides an optimal value for the Lagrange parameter; however, in the non-linear case this definition is no longer valid. Also, no method is known that allows a quick estimate of the optimal $\lambda$s for the non linear problem. In our case, we estimate the proper Lagrange parameter values by means of numerical simulations using synthetic spectra and Gaussian noise. The $\lambda$s used in the pipeline were chosen to optimize the computation time and the accuracy (highest possible accuracy in a minimal computation time). It must be stressed here that the Lagrange parameters, obtained from numerical simulations, may not be optimal as the simulations cannot cover all the parameter space and as the idealized simulations do not incorporate all the ingredients of a real spectrum. Nevertheless, the simulations allow us to find a solution for the Lagrange parameters matching predefined requirements.

Finally, using the exponential transform can cause the solution to be unbound. For example, we expect the weights of spectra far away from the true solution to be zero. In this case, for , and the solution is unbound. This problem can be solved using an additional term in the regularization, penalizing solutions where becomes lower than a predefined threshold. For example, in the case of continuum subtracted spectra, the threshold can be set from to , being the number of spectra in the library, ensuring that the contribution of a template spectrum away from the solution is negligible.

#### 3.3.4 Penalization

The problem of determining the stellar parameters from a RAVE spectrum is ill-conditioned and requires regularization in order to recover a physically meaningful solution. Also, the size of the synthetic spectra library we are using is too large to enable us to process a RAVE spectrum within a realistic time frame considering the number of spectra to process.

Our first operation reduces the size of the parameter space by selecting templates according to a $\chi^2$ criterion. We use the transform

$$\exp(\alpha'_i) = \exp(\alpha_i)\,\theta(P_i), \tag{12}$$

and solve Eq. 11 replacing $\alpha$ by $\alpha'$. $\theta(P_i)$ is a gate function in the 6D stellar parameter space. In the 1D case it reads

 θ(Pi)= {1if\,\,−12

At each point on the grid defined by the library, the derivative is 0. Therefore, solving Eq. 11 for is equivalent to solving the same equation for but on a reduced subset of matching the condition, and we shall drop the prime in the following.

We choose to use a $\chi^2$ criterion to select the subset in order to include local minima with a $\chi^2$ value close to the minimum $\chi^2$. This selection criterion avoids potential problems where noise, ghosts or cosmic rays create spurious minima that could bias the estimated stellar parameters. Care must be taken when selecting the $\chi^2$ limit: if the number of spectra in the subset of templates is not large enough, biases can be introduced in the solution. The limit was chosen according to numerical simulations using the synthetic template library. Simulations have shown that, using Eq. 11, at least the 150 template spectra with the lowest $\chi^2$ must be used to minimize the reconstruction errors and avoid biases. As those simulations were run using idealized spectra, in practice the limit is set to retain the 300 templates with the lowest $\chi^2$ for a given spectrum. This leads to a subsample of the library containing between 2 and 4 values per parameter, depending on the location in the parameter space. This number is lower than $3^6 = 729$, which would be the number of spectra used for a quadratic interpolation on a complete 6D grid, and is due to the fact that the stellar parameter space is not evenly covered by the library. The average number of direct neighbors (on a grid point next to a given parameter) is 85, varying between 1 and 314.
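The template preselection described above can be sketched as follows. This is only an illustration under stated assumptions: the selection statistic is taken to be a plain $\chi^2$ with a single, uniform pixel uncertainty `sigma`, and the function names `chi2` and `select_templates` are hypothetical. Only the default of 300 retained templates comes from the text.

```python
def chi2(observed, template, sigma):
    """Chi-square between two spectra, assuming the same noise sigma per pixel."""
    return sum(((o - t) / sigma) ** 2 for o, t in zip(observed, template))

def select_templates(observed, library, sigma, n_keep=300):
    """Return the indices of the n_keep templates with the lowest chi-square.

    library : list of template spectra sampled on the same grid as `observed`
    n_keep  : number of templates retained (300 in the text; 150 is the
              minimum suggested by the idealized simulations)
    """
    scores = [(chi2(observed, tpl, sigma), idx) for idx, tpl in enumerate(library)]
    scores.sort()  # ascending chi-square, best match first
    return [idx for _, idx in scores[:n_keep]]

# Toy usage with a four-template library of 3-pixel spectra.
library = [[1.0, 0.9, 1.0], [1.0, 0.6, 0.95], [1.0, 0.7, 0.95], [0.9, 0.9, 0.9]]
observed = [1.0, 0.7, 0.95]
subset = select_templates(observed, library, sigma=0.05, n_keep=2)
```

The minimization of Eq. 11 would then run only over the weights of the returned subset, which is what keeps the computing time manageable.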

Reducing the number of template spectra does not solve the ill-conditioned nature of the problem, even if the number of templates becomes lower than the number of pixels. This is due to the fact that the pixel values are not independent and the information on effective temperature, gravity etc. is redundant in a spectrum. To regularize the problem we use the property that, in the idealized continuous case, the solution is expected to be close to a Gaussian function centered on the true solution. Therefore, we expect the discrete solution to follow the same behaviour and we require the solution to be smooth in the parameter space. Nevertheless, as in the real case the solution might have local minima because of the noise or features in the spectrum, we do not impose any particular shape for the solution and we keep the method non parametric (in the sense that no functional form is imposed for the array of parameters). We only require that the variation of the weights in the parameter space is locally smooth. We define the penalization operator as

 (14)

where

$$L_{i,j} \propto \begin{cases} -\dfrac{1}{N_i} & i \neq j,\ i \in N_i \\ 1 & i = j \\ 0 & \text{otherwise.} \end{cases} \tag{15}$$

$d(P_i,P_j)$ is the distance in the parameter space defined as

$$d(P_i,P_j) = \sqrt{\sum_k \frac{(P_{i,k}-P_{j,k})^2}{\sigma_k^2}}, \tag{16}$$

being an index over the dimensions of the stellar parameter space, the mean distance over a fixed neighborhood of the point defined by the index in the parameter space, and the dispersion in the stellar parameter . In practice, the neighborhood is set to the 40 closest points in which is approximately half the average number of neighbors. The fact that not the entire neighborhood is used to compute the average distance does not introduce errors as the operator is local and all the templates will contribute as the operator is applied over the entire set of templates. Note here that is always lower than 1. With this definition, will be large for , negative in the surrounding of in the parameter space and 0 outside. is then large when a large value for a given template is not balanced by its neighborhood, penalizing strong local variations like peaks of width lower than the of the library.
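The scaled distance of Eq. 16 and the fixed neighborhood of the 40 closest grid points can be illustrated with a short sketch. It assumes that each library entry is a tuple of stellar parameters and that `sigma` holds one dispersion per parameter. The function names and the toy grid are hypothetical, and the sketch does not build the operator of Eqs. 14 and 15 itself.

```python
import math

def scaled_distance(p_i, p_j, sigma):
    """Distance of Eq. 16: Euclidean distance with each axis scaled by sigma_k."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(p_i, p_j, sigma)))

def nearest_neighbors(index, grid, sigma, n_neighbors=40):
    """Return (distance, grid index) pairs for the closest points to grid[index]."""
    dists = [(scaled_distance(grid[index], p, sigma), j)
             for j, p in enumerate(grid) if j != index]
    dists.sort()
    return dists[:n_neighbors]

def mean_neighbor_distance(index, grid, sigma, n_neighbors=40):
    """Mean scaled distance over the fixed neighborhood of grid[index]."""
    neighbors = nearest_neighbors(index, grid, sigma, n_neighbors)
    if not neighbors:
        raise ValueError("the grid needs at least two points")
    return sum(d for d, _ in neighbors) / len(neighbors)

# Toy 2D grid in (temperature [K], log g) with dispersions of 250 K and 0.5 dex.
grid = [(5000.0, 4.0), (5250.0, 4.0), (5000.0, 4.5), (6000.0, 2.0)]
sigma = (250.0, 0.5)
closest = nearest_neighbors(0, grid, sigma, n_neighbors=2)
mean_d = mean_neighbor_distance(0, grid, sigma, n_neighbors=2)
```

Scaling each axis by its dispersion makes steps in temperature, gravity and the other parameters comparable, so that a single neighborhood size can be used across the whole library.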

To derive the stellar parameters, the method presented above is applied to continuum subtracted spectra, and to recover the proper continuum level we have the additional constraint that $\sum_i \exp(\alpha_i) = 1$. Therefore, we add a third penalization term to ensure that the sum of the weights is one. For clean spectra this last penalization can be omitted. In the case of RAVE, however, a ghost can affect the blue part of some spectra and there may be residuals of cosmic ray strikes, so imposing the continuum level enables us to avoid potential problems in automatic processing. The operator is defined as

$$P_3(\alpha) = 1 - \exp\left(-\frac{\left(1 - \sum \exp(\alpha)\right)^2}{2\sigma_3^2}\right). \tag{17}$$

This operator has an inverted Gaussian behaviour around $\sum \exp(\alpha) = 1$, with $P_3 = 0$ for $\sum \exp(\alpha) = 1$ and $P_3 \to 1$ away from this value. To estimate the stellar parameters, we use $\sigma_3 = 0.01$, or 1% of the continuum value.
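The behaviour of the normalization penalty of Eq. 17 is easy to check numerically. The sketch below is a direct transcription of that equation. The function name `p3` is hypothetical, and the default width of 0.01 is an assumption that reads the quoted "1% of the continuum value" as the value of $\sigma_3$.

```python
import math

def p3(alpha, sigma3=0.01):
    """Penalty of Eq. 17: 0 when the weights exp(alpha_i) sum to one,
    approaching 1 when the sum departs from one by much more than sigma3."""
    total = sum(math.exp(a) for a in alpha)
    return 1.0 - math.exp(-((1.0 - total) ** 2) / (2.0 * sigma3 ** 2))

# Four equal weights of 0.25 satisfy the constraint; a total of 1.5 does not.
on_target = p3([math.log(0.25)] * 4)
off_target = p3([math.log(0.75), math.log(0.75)])
```

Because the penalty saturates at 1, it cannot dominate the objective without bound. Its effect is concentrated within a few $\sigma_3$ of the desired normalization.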

#### 3.3.5 Validation of the method

To establish the validity of our approach to recover the stellar parameters in the RAVE regime, we tested the algorithm on a series of 20,000 synthetic spectra built using the same template library. As the accuracy of the method depends on the resolution and wavelength interval (the Lagrange parameters must be defined separately for each instrument and library), we do not try to validate the method outside of our observational regime and this section will focus on an idealized case mimicking the RAVE spectra. A complete discussion of error estimates and zero point offsets, comparing our measurements to other sources, is presented in Sec. 4.2.2.

The synthetic spectra are randomly generated from a linear interpolation of the library and three ingredients are added:

• a Gaussian white noise with SNR in the range [10..40]

• a RV mismatch up to 5 km/s

• continuum structures amounting to up to 5% of the continuum level.

These three ingredients were added to mimic observed and expected features in the RAVE spectra: a random noise level typical of the RAVE spectra, a mean internal RV error of 5 km/s after the first RV estimation (the spectra used for parameter estimation are RV corrected after a first RV estimation using a reduced set of templates, see Paper I; a better template with proper parameters is then generated using the algorithm and used for the final RV calculation), and residual continuum features that can be left after data reduction. The residual continuum features are included using 1 to 5 cosine functions, each with arbitrary phase, frequency (between 0.5 and 5 periods over the wavelength interval) and normalization (between 0 and 5% of the continuum level).
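Two of the three ingredients, the white noise and the residual continuum structure, can be sketched as below. This is not the simulation code used for the tests. It assumes a continuum-normalized input spectrum, so that the noise dispersion is 1/SNR, draws each cosine amplitude independently between 0 and 5% without rescaling their sum, and leaves out the RV mismatch, which would require resampling. The function and argument names are illustrative.

```python
import math
import random

def add_noise_and_continuum(flux, snr, rng, max_amp=0.05):
    """Perturb a continuum-normalized spectrum with noise and cosine ripples.

    flux    : list of normalized flux values (continuum assumed equal to 1)
    snr     : signal to noise ratio; noise sigma is taken as 1 / snr
    rng     : a random.Random instance, so that runs are reproducible
    max_amp : upper limit of each cosine amplitude (5% of the continuum)
    """
    n = len(flux)
    if n < 2:
        raise ValueError("need at least two pixels")
    # 1 to 5 cosines: amplitude in [0, max_amp], 0.5 to 5 periods, random phase.
    terms = [(rng.uniform(0.0, max_amp),
              rng.uniform(0.5, 5.0),
              rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(rng.randint(1, 5))]
    out = []
    for i, f in enumerate(flux):
        x = i / (n - 1)  # position along the wavelength interval, 0 to 1
        ripple = sum(a * math.cos(2.0 * math.pi * freq * x + phase)
                     for a, freq, phase in terms)
        out.append(f + ripple + rng.gauss(0.0, 1.0 / snr))
    return out

# Toy usage: a flat spectrum degraded to an SNR drawn from the range [10, 40].
rng = random.Random(42)
simulated = add_noise_and_continuum([1.0] * 1000, snr=rng.uniform(10.0, 40.0), rng=rng)
```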

Figure 6 presents the reconstruction error (RAVE-true) as a function of the various parameters released in DR2 for the 20,000 simulated spectra. As mentioned before, the rotational velocity will be discussed in another paper, and microturbulence cannot be recovered in the RAVE regime. Therefore, these two parameters are not presented here. Nevertheless, we stress that all 6 parameters were used in the simulations, the same as was done in the standard pipeline on observed spectra. The left panel represents the spectra with effective temperature below 8000 K (CaII lines dominated) while the right panel presents the hotter spectra that are dominated by the hydrogen Paschen lines. The number of simulated spectra in the left panel is 17,500 while the right panel contains the remaining 2,500 simulated spectra. This is an effect of the template library, the cool part of the library having a denser grid of spectra than the hot side. Also, a smoothing was applied to the right panel for the visualization, to lower the effect of the noise.

These simulations enable us to assess the expected dependencies of our errors as a function of the various stellar parameters. The main characteristics we observe are:

• Below 8000 K, there is little dependence of the recovered parameters on $T_\mathrm{eff}$, except for $T_\mathrm{eff}$ itself, which is overestimated by an amount that increases with effective temperature.

• [M/H] is the main driver for the errors in the low metallicity regime ([M/H]) with all parameters but being overestimated, while is underestimated. This indicates that for the metallicity, the true metal content can only be recovered when both [M/H] and are considered (see discussion Sec. 4.2.2).

• $[\alpha/\mathrm{Fe}]$, as expected, is not properly recovered in the RAVE regime, as shown by the upper right panels.

• is better constrained in hot spectra than in cool spectra.

• is systematically underestimated for hot spectra.

The overall accuracy we can expect for the stellar parameters in the RAVE regime then ranges from 200 K to 500 K for $T_\mathrm{eff}$, 0.2 to 0.5 dex for $\log g$ and 0.1 to 0.4 dex for [M/H] (depending on the value of the $\alpha$ enhancement), while $[\alpha/\mathrm{Fe}]$ alone is not recovered.

A better understanding of the relations and mutual influences of the errors on the stellar parameters is gained from the correlations between the reconstruction errors. These are presented in Fig. 7 where the different behavior of the hot stars and of the cool stars is apparent. The upper triangle presents the correlations between the errors for the cool stars while the lower triangle shows the correlation for the hot spectra (the lower triangle has been smoothed for visual rendering).

It is clear in this figure that in the cool spectra regime, the errors on the parameter reconstruction are strongly correlated, which indicates that an error on one parameter results in errors on the other parameters. There is however an exception for $[\alpha/\mathrm{Fe}]$, which is only anti-correlated with [M/H] and not correlated with the other parameters. This further indicates that only a combination of [M/H] and $[\alpha/\mathrm{Fe}]$ is recovered, and that these two quantities cannot be uniquely separated.

The situation is different for the hot stars, where the only visible correlation is between and [M/H] and only for large errors on [M/H]. Otherwise, no correlation is seen indicating that the system is better constrained. Nevertheless, typical errors for hot stars are larger than for cool stars with similar noise levels.

Overall, the method presented allows us to recover the stellar parameters with good accuracy, given that our wavelength interval is small and our resolution is limited ($R \approx 7500$). The expected correlations between the reconstruction errors for the different parameters are well behaved (simple one mode correlations) if one is able to distinguish a posteriori the two cases, hot and cool stars.

### 3.4 Estimate of the ratio of signal to noise

The initial estimate of the signal to noise (S/N) comes from comparison of 1-D spectra derived from typically 5 subexposures of a given field (see Paper I for details). This estimate is model independent and readily available for the calculation of $\chi^2$ for the radial velocity and stellar parameter determination routines. However, any change of observing conditions during the observing run may contribute to differences between subexposure spectra and therefore render the value of S/N too low. We therefore wrote a procedure which calculates the S/N from the final spectrum only. We refer to it as the S2N value in the data release, while the one calculated from subexposure variation is labeled SNR.

Line-free regions in observed spectra are very scarce. Moreover, the spectra are quite noisy, so one does not know a priori if an apparently line-free region does not hide weak absorption lines. So it seems obvious that suitable regions should be chosen by comparison of the observed spectrum to the best matching template.

The procedure is as follows:

1. The normalized final observed spectrum (shifted and resampled to the rest frame) is compared with the synthetic library template with the best correlation. The two spectra are not identical for two reasons: noise in the observed spectrum and systematic deviations (due to observational or theoretical computation deficiencies). We want to avoid the latter. The difference between the observed and theoretical spectrum often alternates in sign between consecutive wavelength pixels if it is due to noise. But systematics usually affect several adjacent wavelength bins, so the sign of the difference does not vary so frequently. We therefore decided to use only those pixels for which the difference changes sign from the previous or towards the next adjacent pixel. This selection scheme retains 75% of all pixels if the reason for variation is just noise. This seems a reasonable price to pay in order to avoid systematics. Note that we impose restrictions only on the sign of the difference, not on its absolute value, so noise properties are not affected.

2. Regions of strong spectral lines are prone to systematic errors. So we discard any pixel for which the flux of the template would be less than 0.9 of the continuum flux. Strong spectral lines span a small fraction of the entire spectral range, except in high temperature objects. The derived S/N estimate is representative of the continuum S/N, but the value generally does not differ by more than 5-10% from the S/N averaged over the whole spectrum.

3. Next we calculate the difference between the observed and theoretical spectrum, and divide it by the theoretical spectrum flux. The final S2N estimate is an inverse of its standard deviation, of course only using the pixels retained in the steps above.

4. The observed spectra we used for the three steps above are shifted to the rest frame and resampled with respect to the original ones given in observed wavelengths. This is important, as resampling damps the pixel to pixel variation and therefore artificially increases the measured value of the signal to noise. So we need to take it into account. The resampled and the original spectrum have the same number of pixels, so resampling can be characterized by its average fractional-pixel shift. A zero shift obviously does nothing, but a shift of half a pixel means that the S/N estimate measured in previous steps needs to be multiplied by a factor of . If the shift is a fraction of the pixel separation, the expression for the damping factor . The S/N value calculated by the 3 steps above needs to be multiplied by this damping factor to obtain the final S2N estimate of the observed spectrum. In the case of RAVE data the fractional shift of pixels at both edges of the spectrum is zero, while the pixels in-between are resampled from an observed non-linear to a linear increase of wavelength with pixel number. Because this non-linearity is always very similar also the resulting damping factor turns out to be well constrained: . This value is actually very close to 0.805 obtained for a uniform distribution of fractional pixel shifts in the interval.

The first two points limit the fraction of pixels used in the S2N estimate to . This is true also for hot stars, so the selection outlined above does not seem to be too constraining. Note that step 4 means that the S2N values are lower than the ones calculated by e.g. the splot package of IRAF, because the latter does not take into account the effects of resampling. The SNR estimate is very sensitive to variations of atmosphere transparency and instrumental effects during the observing sequence while the S2N is not. So S2N values are similar to the SNR ones, with the average value of S2N being % higher. We propose to use the S2N as the final S/N estimate for the spectrum. So the quantity S/N below always refers to the S2N value.
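The first three steps of the procedure, followed by the multiplication of step 4, can be sketched as follows. This is a minimal illustration rather than the pipeline routine. It assumes that the observed spectrum and its best matching template are continuum normalized and sampled on the same rest-frame grid, and it takes the resampling damping factor as an input argument instead of deriving it. All names are hypothetical.

```python
import math
import random

def select_pixels(observed, template, line_threshold=0.9):
    """Indices of the pixels retained by steps 1 and 2."""
    diff = [o - t for o, t in zip(observed, template)]
    n = len(diff)
    kept = []
    for i in range(n):
        # Step 1: the residual must change sign from the previous pixel
        # or towards the next one.
        flips_prev = i > 0 and diff[i] * diff[i - 1] < 0
        flips_next = i < n - 1 and diff[i] * diff[i + 1] < 0
        # Step 2: discard pixels where the template is below 0.9 of the continuum.
        if (flips_prev or flips_next) and template[i] >= line_threshold:
            kept.append(i)
    return kept

def estimate_s2n(observed, template, damping=1.0, line_threshold=0.9):
    """Step 3 (inverse standard deviation of the relative residuals),
    multiplied by the damping factor of step 4."""
    kept = select_pixels(observed, template, line_threshold)
    if len(kept) < 2:
        raise ValueError("too few usable pixels")
    rel = [(observed[i] - template[i]) / template[i] for i in kept]
    mean = sum(rel) / len(rel)
    var = sum((r - mean) ** 2 for r in rel) / (len(rel) - 1)
    return damping / math.sqrt(var)

# Toy check on a flat template with pure Gaussian noise of sigma = 0.05.
rng = random.Random(0)
template = [1.0] * 20000
observed = [t + rng.gauss(0.0, 0.05) for t in template]
kept_fraction = len(select_pixels(observed, template)) / len(template)
s2n = estimate_s2n(observed, template)
```

For this pure-noise case the retained fraction should come out near the 75% quoted in step 1, since a pixel is rejected only when its residual has the same sign as both neighbours, which happens with probability 1/4 for independent noise. The undamped estimate should be near 1/0.05 = 20, because the selection acts on signs only and leaves the amplitude distribution untouched.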

Fig. 8 plots the signal to noise ratio (S2N) as a function of the Denis magnitude and average number of counts per pixel. The latter was calculated in the central part of the spectrum (). The straight line in the magnitude graph (Fig. 8a) follows the relation

$$\mathrm{S/N} = 10^{-0.2\,(I_\mathrm{DENIS} - 19.1)} \tag{18}$$

while the one in Fig. 8b is obtained from combining it with equation (1). The constant term in eq. 18 is the mode of the magnitude corrected S/N distribution. The magnitude graph shows that the signal to noise can be predicted from DENIS magnitude with an average error of %. The dependence of S/N on the count level is much better determined, with a dispersion of the central ridge of only %. The difference is due to an uneven transparency of the Earth’s atmosphere and of optical fibers which have a stronger effect on the magnitude graph. Sky background as well as light scattered within the spectrograph are of increasing relative importance for faint objects. They cause the deviation from a straight line seen in both panels at faint count or magnitude levels.
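As a quick illustration of Eq. 18, the relation can be written as a one-line function. The function name is hypothetical. The result is only the typical value along the straight line in the magnitude graph and ignores the scatter and the faint-end deviations discussed above.

```python
def predicted_s2n(i_denis):
    """Typical S/N predicted by Eq. 18 from the DENIS I magnitude."""
    return 10.0 ** (-0.2 * (i_denis - 19.1))

# Predicted values for a few magnitudes, one magnitude apart.
predictions = {mag: predicted_s2n(mag) for mag in (9.0, 10.0, 11.0, 12.0)}
```

The slope of $-0.2$ means that the S/N drops by a factor of 10 for every 5 magnitudes, i.e. for a factor of 100 in flux, which is the scaling expected when S/N grows as the square root of the collected flux.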

## 4 Data quality

The distribution of the internal radial velocity errors is presented in Fig. 9. These are the estimated uncertainties of fitting a parabola to the top of the correlation peak (Paper I). The top panel shows the histogram of the radial velocity error in 0.1 km s$^{-1}$ bins, while the bottom panel is the cumulative distribution. Results for the spectra new to this data release and for the ones from Paper I are shown separately. In the former case we also add the results for spectra for which we are publishing the values of stellar parameters (see below). These are spectra of sufficiently high quality and without peculiarities (binarity, emission lines etc.). Table 2 summarizes the values of the most probable and average internal velocity errors.

The blue light blocking filter, which cuts the second order light and was used for data new to this release, clearly improves the match between theoretical templates and observed spectra. This is mostly a consequence of the more accurate flatfielding of a rather narrow spectral range of the first order light, compared to a mix of relative contributions of the first and the second order spectra which emphasizes any differences in the color temperature between the star and the calibration lamp. Also the level of the continuum is much easier to determine if the blocking filter is used. The most probable value of the internal velocity error is 0.9 km s$^{-1}$ for the data new to this data release, compared to 1.7 km s$^{-1}$ in Paper I. On the other hand the possibility of a better match also increases the chances to identify any types of peculiarities. So there is a rather notable tail of large internal velocity errors if we consider all data new to this data release (dashed line in Fig. 9). If only normal stars are plotted (solid line in Fig. 9) large velocity errors are much less common. This is reflected also in the average errors reported in Table 2.

Internal velocity errors are useful, but they do not include possible systematic effects. As mentioned in Sec. 3.2 the reported radial velocities do not allow for shifts due to non-vanishing convective motions in the stellar atmosphere and for gravitational redshift of the light leaving the stellar surface. This is the case also with other spectroscopically determined radial velocities. We compared RAVE radial velocities with those obtained from external datasets. 255 stars from 4 different datasets were used to assess the accuracy of radial velocities of stars new to this data release. From these, 213 stars turn out to have normal spectra without emission lines, strong stellar activity or stellar multiplicity and have radial velocity errors smaller than 5 km s$^{-1}$, so that they are retained for further analysis. They include 144 stars from the Geneva Copenhagen Survey (GCS), and three datasets observed specifically to check RAVE radial velocities: 33 stars were observed with the Sophie and 15 with the Elodie spectrograph at the Observatoire de Haute Provence, and 21 stars with the echelle spectrograph at the Asiago observatory. Stars observed in Asiago span the whole range of colors, while most other datasets and especially GCS focus on yellow dwarfs. The whole survey includes a larger number of red stars (Fig. 10) which are mostly giants, as can be seen from temperature-gravity distributions derived by RAVE for the whole survey (Fig. 22). A smaller fraction of giants in the reference datasets does not present a real problem, as radial velocities for giants tend to be more accurate than for dwarfs.

A comparison of radial velocities obtained by RAVE and by the reference datasets is presented in Figure 11 and summarized in Table 3. is the number of objects in each dataset, is the mean of differences between RAVE and reference measurements and is their standard deviation. We note that mean zero point offsets are non-zero and of different size and sign for separate datasets. So the difference in zero point is likely due to a different zero point calibration of each instrument. Most of the reference stars are taken from the GCS. So the large number of dwarfs from the GCS drive also the final value of the zero point offset and its dispersion. If one omits those the mean zero point difference is only 0.1 km s and the dispersion () is 1.30 km s. These estimates neglect the intrinsic measurement errors of each reference dataset. A typical error of 0.7 km s and a zero point offset of 0.3 km s for the GCS suggest that the RV error of RAVE is  km s. This is also the value derived from the other datasets. Figure 12 shows that the standard deviation stays within  km s even at low ratios of signal to noise. This value decreases to  km s if one omits the stars from the GCS dataset.

Most of the stars in external datasets are dwarfs with a metallicity close to the Solar one. The midpoint of stays at  km s  for temperatures lower than 5800 K, increasing to  km s for stars with 6800 K. There is no significant variation of the radial velocity difference with metallicity in the range covered by the external datasets.

One can also use repeated observations of RAVE stars to assess the internal consistency of the measurements. Section 4.4 shows that radial velocity from a pair of measurements of a single star differ by  km s in 68.2% of the cases. This corresponds to an error of 1.3 km s for a single measurement.

We conclude that the typical RV error for data new to this data release is  km s. For the measurements with a high value of S/N the error is only 1.3 km s with a negligible zero point error.

### 4.2 Accuracy of stellar parameters

For the vast majority of the stars in this data release, there is no prior spectroscopic information available. Some photometric information is available (see Section 5.2) but after a detailed investigation we concluded that this external information is not of sufficient quality to be used as a prior on any of the stellar parameters. Unknown extinction presents a further problem. This situation is expected to continue until high quality multi-epoch photometry becomes available for the Southern sky from the SkyMapper project (Keller et al., 2007). RAVE is therefore the first large spectroscopic survey to use only spectroscopic data to derive the values of stellar parameters. So it is appropriate to make a detailed check of the results with external datasets coming both from the literature and our own custom observations.

#### 4.2.1 External datasets

RAVE stars are generally too faint to have data available in the literature, so we obtained a separate set of RAVE observations of stars from three reference sets in the literature. In addition we obtained custom observations of regular RAVE targets with two Northern hemisphere telescopes, at the Asiago Observatory and at Apache Point Observatory. In the coming months we plan to expand the comparison using the observing time allocated at UCLES at Siding Spring and at ESO. Here we describe the presently available datasets, which contain altogether 331 stars. In all cases the corresponding RAVE observations were obtained and processed in the same way as for the other stars in this data release.

The ARC echelle spectrograph at Apache Point Observatory (APO) 3.5-m telescope was used to observe 45 RAVE stars. These spectra cover the entire optical wavelength range (3,500 – 10,000 Å) in 107 orders with an effective resolving power of about 35,000. The spectra were extracted using standard IRAF routines incorporating bias and scattered light removal plus flat-field corrections. The wavelength calibration was obtained using ThAr hollow cathode lamps. Temperature and gravity were derived by a least squares fit to the same library of stellar spectra (Munari et al., 2005) as used for RAVE catalog, but using a procedure (Munari et al., 2005a) independent from the RAVE method described in Sec. 3.3. The results are consistent with those resulting from the analysis using the excitation temperature and equivalent widths of Fe I and Fe II lines to derive iron abundance, temperature and gravity (Fulbright et al. 2006, 2007). The metallicity was derived by both the least square fit of the whole spectrum and by the method based on equivalent widths of the Fe lines. The latter yields an iron abundance, but the metallicity can be calculated assuming that the ratio, which influences the even-Z elements between O and Ti, increases linearly from zero for stars with [Fe/H] = 0 to for stars with [Fe/H] , and stays constant outside these ranges. In Table 10 we list the temperature and gravity as derived by the least square fit method, and metallicity from the Fe line method.

The echelle spectrograph at the 1.8-m telescope, operated by INAF Osservatorio Astronomico di Padova on top of Mt. Ekar in Asiago was used to observe 24 RAVE stars. These spectra cover the range from 3,300 to 7,300 Å, but the analysis was limited to the three echelle orders around 5,200 Å with the highest signal. The resolving power was around 20,000. The spectra were carefully treated for scattered light, bias and flat-field and reduced using standard IRAF routines. They were analyzed with the same least square procedure as the APO data. The results are given in Table 11.

The RAVE spectrograph was used to observe three additional sets of stars with parameters known from the literature. We observed 60 stars from the Soubiran & Girard (2005) catalog and obtained 49 spectra useful to check the metallicity and the temperature values. The reported gravity values were not used for checks as the catalog does not estimate their accuracy. Soubiran & Girard (2005) do not report metallicity, so its value was derived from a weighted sum of quoted element abundances of Fe, O, Na, Mg, Al, Si, Ca, Ti, and Ni, assuming Solar abundance ratios from Anders & Grevesse (1989), in accordance with classical Kurucz models. The choice of a reference Solar abundance model is not critical. Newer Solar abundance scales introduce only a small shift in the mean metallicity of the Soubiran & Girard (2005) stars if compared to typical errors of RAVE observations: for Solar abundances given by Grevesse & Sauval (1998) and for Asplund et al. (2006) Solar abundances. The standard deviation of metallicities, derived from new compared to classical abundances, is 0.005 for Grevesse & Sauval (1998) and 0.012 for Asplund et al. (2006). The parameter values as derived from the literature and from RAVE spectra are listed in Table 12.

We also observed 12 members of the M 67 cluster (Table 13) for which we adopted the metallicity of . This value of metallicity is a weighted sum of its modern metallicity determinations (Randich et al. (2006) and references therein). Finally Table 14 reports on the comparison of temperatures for 201 stars from the Geneva Copenhagen Survey (Nordström, et al., 2004). This catalog does not include metallicities but only iron abundances. The two values are not identical, so a comparison on a star by star basis could not be made (but see below for a general comparison of the two values).

Table 4 summarizes the properties of individual datasets. is the number of stars in a given dataset and the sign marks parameters that could be checked. Temperatures, gravities and metallicities of stars in these datasets are plotted in Fig. 13. The values are those determined from RAVE spectra, as some parameter values are not known for the datasets from the literature. The distributions of external dataset objects in the temperature–gravity–metallicity space can be compared to the ones of the whole data release (Fig. 22).

#### 4.2.2 Comparison of external and RAVE parameter values

The first property to check is the consistency of values derived by the RAVE pipeline with those from the reference datasets. Table 5 lists mean offsets and dispersions around the mean for individual stellar parameters.

The temperature shows no offsets when the reference sets of echelle observations at APO and Asiago are used, together with our observations of Soubiran & Girard (2005) stars. However if the GCS dataset is included the RAVE temperatures appear too hot on average, and also the dispersion is increased. We believe this is a consequence of somewhat larger errors introduced by the photometric determination of the temperature in the GCS and not a consequence of errors of the RAVE pipeline. Gravity shows a negligible offset and a dispersion of 0.4 dex. However the metallicity as derived by the RAVE pipeline (in Table 5 we refer to it as ’uncalibrated’) appears to have a significant offset. The values derived by the RAVE pipeline are generally more metal poor than those obtained by measurement of equivalent widths of absorption lines in APO observations, as derived from the Soubiran & Girard (2005) catalog, and also compared to the metallicity of M67. So it seems worthwhile to explore the possibility of a calibration that would make metallicities derived by RAVE consistent with the values in these reference datasets.

#### 4.2.3 Calibrating metallicity

The RAVE pipeline derives metallicity as any other parameter, i.e. by a penalized technique finding an optimal match between the observed spectrum and the one constructed from a library of pre-computed synthetic spectra. The results match even for the metallicity if a similar analysis method is used. This is demonstrated by Figure 14. The results of the analysis using an independent procedure (Munari et al., 2005a) yield metallicities which are entirely consistent with the RAVE pipeline results (mean offset of dex and a standard deviation of dex). RAVE metallicities as derived from the RAVE pipeline are part of a self-consistent native RAVE system of stellar parameters which is tied to a analysis using a library of Kurucz template spectra. The system is unlikely to change in the future. So metallicities, as derived by the pipeline, are also a part of the final data release.

However other spectral methods, which derive metallicities from the strengths of individual spectral lines and not from a match of synthetic and observed spectra, do not yield results so consistent with those of the RAVE pipeline. Figure 15 shows some obvious trends:

• the difference between the RAVE and the reference metallicity increases with increasing $\alpha$ enhancement, in the sense that RAVE values become too metal poor;

• the difference is also larger at lower metallicities;

• the difference is larger for giants than for main sequence stars, though the variation is much weaker than for $\alpha$ enhancement or metallicity;

• the difference does not seem to depend on temperature.

The aim of this section is to provide a calibration relation that transforms the uncalibrated metallicities, derived by the $\chi^2$ method, to the calibrated ones, which are in line with the metallicity system of the above mentioned datasets. The trends can be represented with a linear relationship; there is no indication of quadratic terms. So we assume that the calibrated metallicity is given by the relation

$$[\mathrm{M/H}] = c_0\,[\mathrm{m/H}] + c_1\,[\alpha/\mathrm{Fe}] + c_2 \log g + c_3\,T_{\mathrm{eff}} + c_4 \qquad (19)$$

where all parameters on the right refer to the values derived by the RAVE pipeline (Section 3.3) and $c_0, \dots, c_4$ are constants. Figure 15 contains a few outliers, so there is a danger that the fit is driven by these points and not by general trends. The fit is therefore performed twice: after the first fit we reject the 5% most deviating points and then repeat the fit. Such clipping does not decrease the number of calibration points significantly, yet it effectively removes outliers.
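The two-pass procedure can be sketched as follows. This is a minimal illustration only: it assumes an ordinary unweighted least-squares fit (the fitting algorithm itself is not specified in the text), and the function names `least_squares`, `predict` and `clipped_fit` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, targets):
    """Unweighted linear fit via the normal equations.

    Each row holds the pipeline values used as predictors; a constant
    term is appended, so the last returned coefficient is the offset.
    """
    design = [list(r) + [1.0] for r in rows]
    p = len(design[0])
    ata = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    atb = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear(ata, atb)


def predict(coeffs, row):
    """Evaluate the linear relation for one star."""
    return sum(c * v for c, v in zip(coeffs, row)) + coeffs[-1]


def clipped_fit(rows, targets, reject_fraction=0.05):
    """Fit once, reject the most deviating fraction of points, fit again."""
    first = least_squares(rows, targets)
    resid = [abs(predict(first, r) - t) for r, t in zip(rows, targets)]
    n_keep = len(rows) - int(round(reject_fraction * len(rows)))
    keep = sorted(range(len(rows)), key=lambda i: resid[i])[:n_keep]
    return least_squares([rows[i] for i in keep], [targets[i] for i in keep])
```

Dropping a column from `rows` corresponds to testing a solution with fewer free parameters, as done when deciding which terms of equation 19 to retain.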

It is not obvious whether all parameters in equation 19 need to be used. So we tested a range of solutions, using between 1 and 5 free parameters. It turns out that the main parameters are metallicity, α enhancement, and gravity, while for the temperature parameter ($c_3$) the improvement of the goodness of fit is not significant. Also, the calibrating datasets cover a limited range in temperature, so this parameter is not sampled over its whole physical span. So we decided not to use temperature for the calibration of the metallicity. The final form of the calibration relation is

$$[\mathrm{M/H}] = 0.938\,[\mathrm{m/H}] + 0.767\,[\alpha/\mathrm{Fe}] - 0.064 \log g + 0.404 \qquad (20)$$

where [M/H] and [m/H] denote the calibrated and the uncalibrated metallicities, respectively. This convention is used throughout the paper. The calibration removes the trends mentioned before. Note that the gravity term nearly cancels the constant offset for main sequence stars. Its inclusion in relation 20 is further justified by the fact that larger discrepancies in metallicity are confined to lower gravities.
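As a worked illustration, relation 20 can be applied directly to the pipeline values. The function name below is hypothetical; the coefficients are those of equation 20.

```python
def calibrate_metallicity(m_h, alpha_fe, logg):
    """Calibrated [M/H] from the pipeline [m/H], [alpha/Fe] and log g (equation 20)."""
    return 0.938 * m_h + 0.767 * alpha_fe - 0.064 * logg + 0.404


# A dwarf with Solar pipeline values: the gravity term (-0.288 at log g = 4.5)
# offsets most of the constant term (+0.404).
dwarf = calibrate_metallicity(0.0, 0.0, 4.5)

# A metal poor, alpha-enhanced giant: the enhancement term adds about 0.3 dex.
giant = calibrate_metallicity(-1.0, 0.4, 2.0)
```

For the assumed input values these evaluate to about 0.12 dex for the dwarf and about −0.36 dex for the giant.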

Inclusion of α enhancement ([α/Fe]) in the calibration relation may seem a bit problematic. Its value is not known a priori, and we noted in Sec. 3.3.5 that it cannot be accurately recovered by the RAVE pipeline (see the upper right panel of Fig. 6). A typical recovery error of up to 0.15 dex makes the [α/Fe] values derived by RAVE hardly useful for deciding whether a certain star has an enhanced abundance of elements produced by the capture of α particles. The reason is that the whole range of this parameter amounts to only 0.4 dex, i.e. not much larger than the recovery error. On the other hand, the values derived by RAVE are not random, so they statistically improve the accuracy of the derived metallicity. A factor of 0.767 implies that they increase it by up to 0.3 dex in extremely enhanced stars. So, even though an accurate value of [α/Fe] cannot be derived by RAVE, we know that its value changes from star to star. In fact the enhancement of α elements is the first improvement on the abundance modelling of stars which reaches past the uniform scaling of Solar abundances. RAVE stars are expected to show considerable variation in this parameter, as we are covering a wide range of stars from local dwarfs to rather distant supergiants well above the Galactic plane. This is also the reason we included variation of α enhancement in the method used to determine stellar parameters. If the value of [α/Fe] were held fixed, or if it were calculated by some arbitrary relation, the resulting metallicity would be biased, with values shifted by up to 0.3 dex. We try to avoid such biases, so [α/Fe] is part of the spectral processing, even though it cannot be accurately recovered.

The need for a metallicity calibration may also be partly due to our choice of the wavelength range. The largest contributors of strong absorption lines in RAVE spectra (for stars dominating the observed stellar population) are Ca II, Si I, Mg I, Ti I, and Fe I. All but the last one are produced by the capture of α particles. For the spectral type K0 III we have 54 prominent spectral lines of 3 α-elements (Si I, Mg I, and Ti I) and 60 Fe I lines of similar strength. So α-elements produce a similar number of lines as iron, not counting the very strong lines of the α-element Ca II which actually dominate any fit. So, when the RAVE pipeline tries to match the metallic content, the fits pointing to an enhanced α abundance or an increased metallicity are similar. As a result the pipeline may split the effect of metallicity in two parts, in the sense that it partly modifies the metallicity and partly adjusts the α enhancement. This may explain the large correlation between the α enhancement and metallicity, reflected in a large value of the [α/Fe] coefficient in the calibration relation (eq. 20). The ambiguity could be broken only by higher S/N spectra covering a wider spectral range. This is also the reason why analysis methods involving equivalent widths of individual lines could not be used on the vast majority of RAVE spectra. The method described in Sec. 3.3 was chosen because it uses the whole spectrum and so makes the best use of the available information.

Figure 16 shows the situation after application of the calibration relation (20). All trends and offsets in the metallicity values have disappeared and the scatter between the derived and the reference metallicity is reduced from 0.37 to 0.18 dex (Table 5).

We used Soubiran stars, APO observations and M67 members to derive the calibration relation. The GCS stars can be used to check the result. The GCS does not report metallicity ([M/H]) but only iron abundance ([Fe/H]). As mentioned before, the two are not identical. A substantial scatter in the metallicity vs. iron abundance relation (as demonstrated in Fig. 17 for the Soubiran stars) prevents us from deriving a unique iron abundance to metallicity relation in the absence of additional information, as is the case with the GCS. In Figure 17 we therefore plot RAVE metallicity vs. iron abundance from the GCS catalog. The uncalibrated RAVE metallicities (top panel) make the Soubiran and GCS stars occupy different regions of the metallicity/iron abundance diagram, but the calibrated RAVE metallicities (bottom panel) provide an excellent match. As said before, the GCS stars were not used in the derivation of the calibration relation. The match is therefore further evidence that relation (20) can be trusted.

The calibrated metallicity can also be checked against predictions of semi-empirical models. Figure 18.a plots the distribution of the calibrated metallicity determined from RAVE spectra, while 18.b is an empirical prediction of the distribution of iron abundance. The latter was calculated using the Besançon Galactic model (Robin et al., 2003) with the apparent magnitude distribution of RAVE stars and a random sample of objects more than from the Galactic plane, except for the inaccessible region . The observed distribution in metallicity is more symmetric than its theoretical iron-abundance counterpart. The reason lies in the differences between the two quantities. Figure 17 shows that the metallicity is usually higher than the iron abundance due to an enhanced presence of α elements. APO observations of RAVE stars (Table 10) yield both iron abundance and metallicity, so they allow us to fit a statistical relation between metallicity and iron abundance

$$[\mathrm{M/H}] = [\mathrm{Fe/H}] + 0.11 \left[ 1 \pm \left( 1 - e^{-3.6\,\left|[\mathrm{Fe/H}] + 0.55\right|} \right) \right] \qquad (21)$$

where the plus sign applies for [Fe/H] < −0.55 and the minus sign otherwise. The relation is plotted with a dashed line in Fig. 17. It makes the metallicity 0.22 dex larger than the iron abundance for very metal poor stars with , while the difference vanishes when approaching the Solar metallicity. The relation is very similar to the one of Salaris et al. (1993). If this relation, together with metallicity errors typical for the RAVE observations (equation 22 and figure 19), is used, the resulting histograms (Fig. 18.c) are very similar to the observed ones (Fig. 18.a). Peaks of the histograms match to within 0.06 dex, while the width is  % larger in the model compared to the observations. A somewhat larger width of the model histograms suggests that the error estimates for the RAVE metallicity are conservative. Note however that the Besançon model predicts a smaller fraction of low gravity stars () than observed.
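Relation 21 can be evaluated as in the short sketch below. The function name is hypothetical, and the branch point of the sign is an assumption: it is placed at [Fe/H] = −0.55, which is the choice that makes the relation continuous and reproduces the two limits quoted in the text.

```python
import math


def metallicity_from_iron(fe_h):
    """Statistical [M/H] for a given [Fe/H], following equation 21.

    Assumption: the plus sign is used for [Fe/H] < -0.55 and the
    minus sign otherwise.
    """
    sign = 1.0 if fe_h < -0.55 else -1.0
    bracket = 1.0 + sign * (1.0 - math.exp(-3.6 * abs(fe_h + 0.55)))
    return fe_h + 0.11 * bracket


# Offsets [M/H] - [Fe/H] at a few illustrative iron abundances.
offsets = {fe_h: metallicity_from_iron(fe_h) - fe_h for fe_h in (-3.0, -0.55, 0.0)}
```

Under this assumption the offset tends to 0.22 dex for very metal poor stars, equals 0.11 dex at the branch point, and drops below 0.02 dex at Solar iron abundance.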

The description of stellar chemical composition by metallicity and α-enhancement values is a simplification. Generally, the individual stellar elemental abundances (including those of the α elements) do not scale linearly or in a constant ratio with those of the Sun, and spectral lines of some elements are not present in the RAVE wavelength range. Individual element abundances frequently scatter by 0.2 or 0.3 dex if compared to the iron abundance (Soubiran & Girard, 2005). This fact of nature is also the cause of the large scatter of metallicity vs. iron abundance in the Soubiran sample, depicted by grey points in Figure 17. The metallicity change of 0.2–0.3 dex, as introduced by the calibration relation, is therefore comparable to the intrinsic scatter of individual element abundances in stars. So it would be very difficult to provide a detailed physical explanation for the calibration relation between the metallicities derived by equivalent width or photometry methods and those obtained by a χ² analysis. Equation 21 therefore reflects only approximate general trends. Nevertheless it allows us to check that the distribution of the calibrated metallicities derived by RAVE is consistent with the predictions of the Besançon Galactic model.

#### 4.2.4 Method for stellar parameter error estimation

Errors associated with a given stellar parameter depend on the S/N ratio of the spectrum and on the spectral properties of the star. We discuss them in turn. The calibration data have very different values of S/N, in general higher than typical RAVE survey data. The average S/N ratio for the survey stars for which we publish values of stellar parameters is 41. So we choose S/N = 40 as the reference value of S/N. The error estimate below therefore refers to a star with S/N = 40. Extensive Monte Carlo simulations show that the error for a stellar parameter has the following scaling with the S/N of the observed spectrum:

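$$\sigma = r^{k}\,\sigma_{40} \qquad (22)$$

where

$$r = \begin{cases} (\mathrm{S/N})/40, & \text{if } \mathrm{S/N} < 80; \\ 80/40, & \text{otherwise,} \end{cases} \qquad (23)$$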

and the coefficient k has the value of for temperature, for gravity and for metallicity. The simulations used 63 high S/N spectra observed by RAVE for which high resolution echelle spectroscopy has also been obtained in Asiago or at the APO. We assumed that the analysis of echelle spectroscopy yields the true values of the parameters for these stars and studied how the values derived by the RAVE pipeline would worsen if additional Gaussian noise was added to the RAVE spectra. We found that the offsets in mean values of stellar parameters appear only at (an offset in temperature at S/N=6 is 100 K) and disappear at higher S/N ratios. Gaussian noise is not the only source of problems with weak signal spectra. Systematic effects due to scattered light, fiber crosstalk, and incomplete removal of flatfield interference patterns prevent a reliable parameter determination in a large fraction of spectra with . So we decided to publish radial velocities down to , while stellar parameter values are published only for spectra with . The latter decision influences % of RAVE spectra which have .

Simulations also show that the errors on the parameters do not continue to improve for stars with S/N > 80, because systematic errors tend to dominate over statistical noise in such low-noise cases. So we flatten out the error decrease for S/N > 80 in equation (23).

The choice of the reference signal to noise ratio of 40 means that the errors discussed below should be about twice larger for the noisiest spectra with published parameters, and about twice smaller for spectra with the largest ratio of signal to noise.
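Equations 22 and 23 amount to a simple rescaling of the S/N = 40 error, sketched below. The function name is hypothetical, and the exponent `k` is left as an argument; the value k = −1 used in the example is a placeholder chosen purely for illustration, not one of the fitted coefficients.

```python
def scaled_error(sigma_40, snr, k):
    """Error of a stellar parameter at a given S/N (equations 22 and 23).

    sigma_40 is the error at the reference S/N of 40 and k is the
    (negative) exponent for the parameter in question.
    """
    r = min(snr, 80.0) / 40.0  # the scaling is held constant above S/N = 80
    return sigma_40 * r ** k


# Illustration with a placeholder exponent k = -1 and a 400 K reference error.
errors = [scaled_error(400.0, snr, -1.0) for snr in (20.0, 40.0, 80.0, 160.0)]
```

With this placeholder exponent the error doubles at S/N = 20, halves at S/N = 80, and does not decrease further at S/N = 160.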

The calibration datasets (Table 4) contain only stars hotter than 4000 K and cooler than 7500 K. The majority of these stars are on or close to the main sequence with a metallicity similar to the Solar value. Many of the RAVE program stars are of this type, but not all. For example one cannot judge the errors of hot stars or very metal poor stars from these datasets. So we need to use simulations to estimate the value of in Eq. 22, i.e. how the error depends on the type of star that is observed. Relative errors are estimated from a theoretical grid of Kurucz models, but the observed calibration datasets are used for the scaling of the relative to absolute error values and for verifying the results.

We start with a theoretical normalized spectrum from the pre-computed Kurucz grid and investigate the increase of the root mean square difference (RMS) when we compare it with grid-point spectra in its vicinity in the 5-dimensional space of , , , , and . If we denote the values of five parameters for the initial spectrum as (), and if () denote their values at a grid point in its vicinity, the estimate of an error of parameter for the initial spectrum can be obtained from the minimum of . We assume that an increase of RMS has a similar effect on the parameter estimation as an increase of a noise level. So is inversely proportional to the S/N of the normalized spectrum, but the dependence of the error of the parameter on the S/N ratio is given by the value of the coefficient in the equation 22. The only remaining factor is the proportionality constant. It is derived by the assumption that 68.2% of all calibration spectra should have the value of the parameter determined by the RAVE pipeline within of the reference value.

The scheme allows us to estimate errors in all corners of the parameter space covered by Kurucz models, i.e. even in parts where we lack any calibration spectra. Calibration spectra are used exclusively for scaling of the value of a given stellar parameter in eq. 22. This scaling was done assuming that 68.2% of RAVE calibration objects should have a given parameter within one standard deviation of the true value obtained from high resolution observations. So we can check whether the relative number of calibration objects within e.g. 0.5 or 2 standard deviations conforms to the normal distribution. A positive answer would support the results. Next we discuss the accuracy of each stellar parameter in turn.
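The consistency check described above can be sketched as a comparison of observed and expected fractions. The function names are hypothetical, and the inputs (deviations of the pipeline values from the reference values, and the corresponding error estimates) are assumed to be available as plain lists.

```python
import math


def expected_fraction(n_sigma):
    """Fraction of a normal distribution lying within n_sigma standard deviations."""
    return math.erf(n_sigma / math.sqrt(2.0))


def observed_fraction(deviations, sigmas, n_sigma):
    """Fraction of calibration objects whose deviation from the reference
    value is within n_sigma times their estimated error."""
    inside = sum(1 for d, s in zip(deviations, sigmas) if abs(d) <= n_sigma * s)
    return inside / len(deviations)
```

For a normal distribution the expected fractions are about 38.3%, 68.3% and 95.4% within 0.5, 1 and 2 standard deviations, respectively. Only the 1 sigma fraction is fixed by the scaling, so the other two provide an independent test.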

#### 4.2.5 Temperature accuracy

The top panel of Figure 19 plots the standard deviation of temperature as a function of temperature for stars with S/N = 40. The value of the standard deviation is divided by temperature. So an ordinate value of 0.05 at 6000 K denotes a standard deviation of 300 K. The nine curves are errors for three values of gravity and three values of metallicity. Light grey curves are for supergiants (), grey ones for subgiants (), and black ones for MS stars (). Solid lines are for Solar metallicity, while long dashed ones are for and short dashed ones for .

Typical errors for stars cooler than 9000 K are around 400 K. The errors are the smallest for supergiants. Their atmospheres are the most transparent ones, so that a wealth of spectral lines arising at different optical depths can improve the temperature accuracy. Understandably the errors for metal poor stars are larger than for their Solar counterparts. The errors get considerably worse for hot stars ( K) where most metal lines are missing and the spectrum is largely dominated by hydrogen lines. All these trends can be seen from Figure 5 where wavelength intervals affected by temperature change are marked by red lines.

These error estimates are rather conservative because we assumed that any discrepancy arises only because of RAVE errors, i.e. that the calibration datasets are error free. As mentioned already in the discussion on zero point offsets (Sec. 4.2.2) this is not always the case. In particular, the errors in temperature would be 20% smaller if we did not use the GCS stars in error estimation.

Figure 20a plots the cumulative distribution of errors for the calibration stars used to derive the temperature errors. Line types and greyscale tones are the same as in Figure 19. We see that 68% of our stars have their error within one sigma, a condition we used for scaling. But also the distribution of stars along the error curve closely follows the normal distribution. This supports the error estimates given in Fig. 19.

#### 4.2.6 Gravity accuracy

The middle panel of Figure 19 plots errors in gravity as a function of temperature. Strong wings of hydrogen lines, which are sensitive to gravity, allow small gravity errors in hot stars (see the blue marks in Figure 5, which mark gravity-sensitive regions). On the other hand, the rather narrow metallic lines in the RAVE wavelength range, including those of Ca II, do not allow an accurate determination of gravity in cool stars. The gravity error in cool stars has a strong gravity dependence: in dwarfs it is large, but the rather transparent atmospheres of giant stars still allow for a reasonably accurate gravity determination. In any case the errors in gravity do not exceed 0.8 dex, which still allows determination of a luminosity class.

Figure 20b is similar to Figure 20a in the sense that it plots the errors of calibration stars. Again we have 68% of the stars with errors smaller than the standard deviation, a condition used to calibrate the errors in Figure 19. Departures from the normal distribution of errors can be explained by a rather small number of spectra used to determine the gravity errors.

#### 4.2.7 Metallicity accuracy

The bottom panel of Figure 19 plots standard deviations of the calibrated metallicity ([M/H]). The typical error for stars cooler than 7000 K is 0.2 dex. The error for hotter stars is understandably much larger, as these stars lack most of the metallic lines in their spectra (note the lack of green marks in hot spectra in Figure 5). Figure 20c shows that the distribution of errors is very close to the normal one.

#### 4.2.8 Errors on other parameters

The rotational velocity will be the topic of a separate paper which will discuss fast rotating stars, so we do not estimate its error here. The α enhancement value is part of this data release, but given the fact that the Kurucz grid covers only two values (0.0 and 0.4) it is very hard to estimate its error. We note that our metallicity has a typical error of 0.2 dex, so it seems likely that the statistical error on α enhancement is larger. Note that this is comparable to the value of [α/Fe] reached in typical metal poor stars. So although the [α/Fe] parameter is useful to improve the accuracy of derived metallicities (eq. 20), its value is not accurate enough to be trusted for individual stars.

### 4.3 Detection of peculiar and problematic spectra

Errors on temperature, gravity and metallicity have been presented for a range of normal stars. We estimate that these errors are statistically accurate to %. Errors for other normal stars could be derived by linear interpolation. But not all stars have normal spectra. RAVE observed a number of binaries, emission type objects and other peculiar stars, while occasionally a spectrum of a normal star is jeopardized by systematic errors. So it is vital to identify such objects.

Radial velocity information is present in all spectral lines. Still, a very noisy spectrum, too uncertain a wavelength solution or other systematic errors could lead to unreliable results. The simulations showed that the radial velocity is not systematically affected by noise if the S/N ratio is larger than 6. At lower S/N ratios the best template identified by our matching method would be systematically offset (by 100 K or more in temperature), thereby affecting the RV accuracy. This effect is not present at higher S/N ratios. So we calculated the S/N ratio for each spectrum and visually checked whether the calcium lines (and at higher S/N ratios also others) show a mutually consistent radial velocity. 698 spectra were rejected, mostly because their , and are not part of this data release.

The measurement of stellar parameters requires a higher S/N ratio. We adopted a limit of as the minimum value. Note that this limit is still quite conservative as it corresponds to % error in the flux of each pixel. So the published values of stellar parameters are statistically correct, but parameters for individual stars with should be considered as preliminary. This data release contains 3411 such relatively noisy spectra.

All spectra new to this release were visually checked. The goal is to avoid systematic errors, as well as to identify types of objects which are not properly covered by our grid of theoretical models. In the latter case large and arbitrary errors in values of stellar parameters could result. Such objects include double lined spectroscopic binary stars, emission type objects and other peculiar stars. We do not publish values of stellar parameters for such objects, but only the values of their radial velocity which is calculated in the same way as for normal stars. So we are consistent with the first data release. We also avoid arbitrary decisions in cases of undetected or marginally detected binaries. Their published radial velocity is somewhere between the instantaneous velocities of the two components and does not correspond necessarily to the barycentric one. The physical analysis of detected double lined spectroscopic binaries will be presented in a separate paper. But Seabroke et al. (2008) showed that they do not affect statistical kinematic Galactic studies significantly.

The first data release contained 26,079 spectra for which we published radial velocities but no stellar parameters. Also in the data new to this data release there are 3,343 spectra without published stellar parameters. From these there are 140 emission type spectra, 135 double lined binary spectra and 86 spectra of peculiar stars. Other spectra without published parameters have the or are affected by systematic problems. Table 6 summarizes the results. The last column quotes the number of different objects with a given classification. Some stars occasionally show normal spectra and we publish the values of their stellar parameters, but in other occasions they show some kind of peculiarity or systematic problem, so that their parameter values are not published. So the first number in the last column is not an exact sum of the two numbers below it.

### 4.4 Repeated observations

Most stars are observed by RAVE only once, but some observations are repeated for calibration purposes. 1,893 objects in the present data release have more than one spectrum. Table 6 explains that the present release contains 51,829 spectra of 49,327 different stars. Note that the latter number is smaller than the number of stars of individual types. This is a consequence of the fact that a spectrum of a star may appear as a double lined binary star in one spectrum, and as an entirely normal single star in another one (taken close to conjunction). So the star would be counted as a member of two types. A definite classification of all stars in this data release is beyond the scope of this paper. We plan to pursue follow-up studies for particular types of objects, like spectroscopic binaries, and present them in separate papers.

Repeated observations allow a comparison of the measured properties of these stars. If we assume that values for a given star do not change with time, the scatter can be used to estimate errors on radial velocity and the values of the stellar parameters. This assumption may not always be true, for example in the case of binaries or intrinsically variable stars. So we assumed that the sigma of a parameter is the value which comprises 68.2% of the differences between the measured values of a parameter and its average value for a given star. This way we minimize the effect of large deviations of (rare) variable objects and measure an effective standard deviation of a given parameter.
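A minimal sketch of this estimate is given below, assuming the repeated measurements of one quantity are grouped per star. The function name and the percentile convention (the smallest difference that is not exceeded by at least the requested fraction of the sample) are assumptions made for the illustration.

```python
import math


def effective_sigma(values_by_star, fraction=0.682):
    """Value comprising the given fraction of the absolute differences
    between each measurement and the mean for its star."""
    diffs = []
    for values in values_by_star:
        if len(values) < 2:
            continue  # a single measurement carries no scatter information
        mean = sum(values) / len(values)
        diffs.extend(abs(v - mean) for v in values)
    if not diffs:
        raise ValueError("no star has repeated measurements")
    diffs.sort()
    index = max(0, math.ceil(fraction * len(diffs)) - 1)
    return diffs[index]


# Three well-behaved stars and one strongly variable one (hypothetical values).
sample = [[10.0, 12.0], [20.0, 20.5], [5.0, 5.0, 5.0], [0.0, 100.0]]
sigma = effective_sigma(sample)
```

For this hypothetical sample the result is 1.0: the variable star contributes two differences of 50, but they fall outside the 68.2% limit and so do not inflate the estimate, which is the point of using a percentile rather than an rms.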

The data release contains 1,893 objects with 2 or more measurements of radial velocity. The dispersion of measurements for a particular object is smaller than 1.80 km s$^{-1}$ in 68.2% of the cases, and smaller than 7.9 km s$^{-1}$ in 95% of the cases.

For 822 objects we also have 2 or more spectra with published stellar parameters. In this case the dispersion of velocities is within 1.66 km s$^{-1}$ (68.2% of objects) and 6.1 km s$^{-1}$ (95% of cases). The corresponding scatter in the temperature is 135 K (68.2%) and 393 K (95%), for gravity 0.2 dex (68.2%) and 0.5 dex (95%), and for the calibrated metallicity 0.1 dex (68.2%) and 0.2 dex (95%).

Spectra of repeated objects share the same distribution of S/N ratio as all RAVE stars. Their typical S/N ratio of 40 is smaller than for the reference datasets (see Fig. 12), still the above quoted value for the dispersion of radial velocities is similar to the errors of the reference datasets (Table 3). Also the dispersions of stellar parameter values as derived from the repeated observations are smaller than the dispersions for the reference datasets (Table 5). One expects a higher internal consistency of the repeated observations, as these are free from zero point errors. But the zero point errors are very small for both radial velocity and stellar parameters (Tables 3 and 5). Note that our error estimates of radial velocity and stellar parameter values are derived assuming that the reference values from the external datasets are error free, and this may not always be the case. We conclude that the error estimates on radial velocity (Sec. 4.1) and stellar parameters (Sec. 4.2) are quite conservative.

## 5 Second data release

### 5.1 Global properties

The second public data release of the RAVE data (RAVE DR2) is accessible online. It can be queried or retrieved from the Vizier database at the CDS, as well as from the RAVE collaboration website (www.rave-survey.org). Table 9 describes its column entries. The tools to query and extract information are described in Paper I.

The results of the RAVE survey are radial velocities and values of stellar parameters (temperature, gravity and metallicity). Metallicity is given twice: as coming from the data reduction pipeline ([m/H]) and after application of calibration equation 20 ([M/H], see Section 4.2.3 for details). The latter also includes the value of α enhancement, so the catalog also includes the estimated values of [α/Fe]. As explained in Section 4.2.8, this is provided mainly for calibration purposes and is not intended to infer properties of individual objects.

Figure 21 plots the general pattern of (heliocentric) radial velocities. The dipole distribution is due to Solar motion with respect to the Local Standard of Rest. Spatial coverage away from the Galactic plane is rather good, with the exception of stars at small Galactic longitudes. These areas have already been observed and will be part of the next data release.

The investigation of properties of the stellar parameters and their links to Galactic dynamics and formation history are beyond the scope of this paper. To illustrate the situation we outline just two plots. Figure 22 shows the location of all spectra on the temperature–gravity–metallicity wedge. Note the main sequence and giant groups, their relative frequency and metallicity distribution for three bands in Galactic latitude. Figure 23 plots histograms of the parameters, again for different bands in Galactic latitude. The fraction of main sequence stars increases with the distance from the Galactic plane. This can be understood by the fact that the RAVE targets have rather similar apparent magnitudes (Figure 1). Giants therefore trace a more distant population, and those at high latitudes would be already members of the (scarcely populated) Galactic halo.

### 5.2 Photometry

The data release includes cross-identification with optical and near-IR catalogs (USNO-B, DENIS, 2MASS), where the nearest neighbor criterion was used for matching. As in the first data release, we provide the distance to the nearest neighbor and a quality flag on the reliability of the match. Note that this is important as RAVE uses optical fibers with a projected diameter of 6.7 arc sec on the sky. Table 7 shows that nearly all stars were successfully matched for the 2MASS and USNO-B catalogs, while only about 3/4 of the stars lie in the sky area covered by the DENIS catalog. For the matched stars we include USNO-B B1, R1, B2, R2 and I magnitudes, DENIS I, J, and K magnitudes, and 2MASS J, H, and K magnitudes. As mentioned, our wavelength range is best represented by the I filter. With the publication of the second release of the DENIS catalog we decided to use the DENIS I magnitude as our reference in planning future observations.

We note here that the DENIS I magnitudes appear to be affected by saturation for stars with . Following a comment from a member of the DENIS team, we compared the DENIS and 2MASS magnitude scales. 2MASS does not provide an I magnitude. However the transformation

$$I_{\mathrm{2MASS}} = J_{\mathrm{2MASS}} + 1.103\,(J - K)_{\mathrm{2MASS}} + 0.07 \qquad (24)$$

gives an approximate I magnitude on the DENIS system from the 2MASS photometry for giants and dwarfs with . First we confirmed that the colors are consistent with the temperature derived by RAVE for all objects. We then compared the DENIS and 2MASS magnitudes for all stars in the current data release having errors in both of these magnitudes. For most stars with , the magnitudes agree within the expected errors. However we note that (1) the relation between the two magnitudes becomes non-linear for the % of the brightest stars with , and (2) about 8% of the fainter stars with apparently well-determined magnitudes from both catalogs have differences . Some stars have differences greater than magnitudes. We therefore propose to avoid using DENIS I magnitudes when the condition is not met. Figure 4 follows this advice and avoids the scatter due to some problematic magnitude values.
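The comparison of the two magnitude scales can be sketched as below. The function names are hypothetical, and the color range over which equation 24 is valid is not encoded, so inputs are assumed to lie within it.

```python
def i_from_2mass(j_mag, k_mag):
    """Approximate I magnitude on the DENIS system from 2MASS J and K (equation 24)."""
    return j_mag + 1.103 * (j_mag - k_mag) + 0.07


def denis_minus_2mass(i_denis, j_mag, k_mag):
    """Difference between the catalogued DENIS I magnitude and the value
    predicted from 2MASS photometry; large absolute values flag suspect
    DENIS magnitudes."""
    return i_denis - i_from_2mass(j_mag, k_mag)


# Hypothetical star with J = 10.0 and K = 9.5.
predicted_i = i_from_2mass(10.0, 9.5)
```

For the hypothetical star above, the predicted I magnitude is 10.0 + 1.103 × 0.5 + 0.07 = 10.6215.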

### 5.3 Proper motions

Similarly to the first data release, the proper motions are taken from the Starnet 2.0 and Tycho-2 catalogs (see Paper I for a complete discussion). These values are however not available for of the spectra, and in Paper I we bridged the gap with proper motions from the SSS catalog. The SSS catalog suffers from substantial uncertainties, so we now attempted a cross-identification with the UCAC2 catalog (Zacharias et al., 2004). RAVE coordinates were used to search for the nearest two neighbors in the UCAC2 catalog. It turned out that it suffices to use the data for the nearest neighbor, as there were no cases where the matching distance to the first neighbor was less than 3 arcsec while that to the second one was less than 6 arcsec. A UCAC2 counterpart within the 3 arcsec search radius was identified for 94% of the spectra; many of the remaining objects have large errors in reported proper motion. Note that UCAC2 values are systematically offset from the Starnet 2 measurements. The difference is  mas yr$^{-1}$ in right ascension and  mas yr$^{-1}$ in declination (with the UCAC2 values being smaller than the Starnet 2 ones). The final catalog therefore includes the UCAC2 proper motion if the Starnet 2 or Tycho-2 values are not available (% of cases). The source of proper motions is flagged, so the systematic differences can be taken into account. Table 8 gives details on the use of proper motion catalogs in the present data release and their reported average and 90 percentile errors. In all cases this data release includes the proper motion from the source with the best value of reported accuracy.

## 6 Conclusions

This second data release reports radial velocities of 51,829 spectra of 49,327 different stars, randomly selected in the magnitude range of and located more than away from the Galactic plane (except for a few test observations). It covers an area of square degrees. These numbers approximately double the sample reported in Paper I. Moreover, this data release is the first to include values of stellar parameters as determined from stellar spectra. We report temperature, gravity and metallicity for 21,121 normal stars, all observed after the first data release. Stars with a high rotational velocity or peculiar type (e.g. binary stars and emission stars) will be discussed separately.

Radial velocities for stars new in this data release are more accurate than before, with typical errors between 1.3 and 1.7 km s$^{-1}$. These values are confirmed both by repeated observations and by external datasets and have only a weak dependence on the S/N ratio. We used five separate external datasets to check values of stellar parameters derived from RAVE spectra. These included observations with different instruments at different resolving powers and in different wavelength regimes, as well as data from the literature. The uncertainty of stellar parameter values strongly depends on stellar type. Despite considerable effort our calibration observations do not (yet) cover the entire parameter space. We plan to improve on this using dedicated calibration observations with at least 4 telescopes. For this data release we had to resort to extensive simulations which are however tuned by calibration observations. A typical RAVE star has an uncertainty of 400 K in temperature, 0.5 dex in gravity, and 0.2 dex in metallicity. The error depends on the signal to noise ratio and can be times better/worse for stars at extremes of the noise range. Repeated observations show that these error estimates are rather conservative, possibly due to intrinsic variability of the observed stars and/or non-negligible errors of reference values from the calibration datasets.

Future data releases will follow on an approximately yearly basis. They will benefit from our considerable and ongoing effort to obtain calibration datasets using other telescopes and similar or complementary observing techniques. Notably, we expect that SkyMapper (Keller et al., 2007), an all-southern-sky survey just starting at the Siding Spring Observatory, will provide accurate photometry and temporal variability information for all RAVE stars.

RAVE plans to obtain up to a million spectra of stars away from the Galactic plane. It represents an unprecedented sample of stellar kinematics and physical properties in the range of magnitudes probing scales between the very local surveys (the Geneva-Copenhagen Survey and Famaey et al. 2005) and more distant ones (SDSS-II/SEGUE), complementing the planned AAOmega efforts closer to the Galactic plane. It thus helps to complete our picture of the Milky Way, paving the way for the endeavors of the next decade, such as Gaia.

Acknowledgments We are most grateful to our referee, Prof. David W. Latham, for his detailed and very relevant comments, which improved the quality of the presentation of the paper. Funding for RAVE has been provided by the Anglo-Australian Observatory, the Astrophysical Institute Potsdam, the Australian Research Council, the German Research Foundation, the National Institute for Astrophysics at Padova, The Johns Hopkins University, the Netherlands Research School for Astronomy, the Natural Sciences and Engineering Research Council of Canada, the Slovenian Research Agency, the Swiss National Science Foundation, the National Science Foundation of the USA (AST-0508996), the Netherlands Organisation for Scientific Research, the Particle Physics and Astronomy Research Council of the UK, Opticon, Strasbourg Observatory, and by the Universities of Basel, Cambridge, and Groningen. The RAVE web site is at www.rave-survey.org. KCF, QAP, BG, RC, WR and ECW acknowledge support from Australian Research Council grants DP0451045 and DP0772283. A. Siviero and EKG are supported by the Swiss National Science Foundation under the grants 200020-105260 and 200020-113697. JPF acknowledges support from the Keck Foundation, through a grant to JHU. RFGW acknowledges seed money from the School of Arts and Sciences at JHU, plus NSF grant AST-0508996. GMS was funded by a Particle Physics and Astronomy Research Council PhD Studentship. OB acknowledges financial support from the CNRS/INSU/PNG. PRF is supported by the European Marie Curie RTN ELSA, contract MRTN-CT-2006-033481. This research has made use of the VizieR catalogue access tool, CDS, Strasbourg, France. This publication makes use of data products from the Two Micron All Sky Survey, which is a joint project of the University of Massachusetts and the IPAC/Caltech, funded by NASA and NSF.
The results are based partly on observations obtained at the Asiago 1.82-m telescope (Italy) and at the Observatoire de Haute Provence (OHP, France), which is operated by the French CNRS. The cross-identification of the RAVE data release with the UCAC2 catalogue was done using the electronic version of the UCAC2 kindly provided by Norbert Zacharias.

## References

• Anders & Grevesse (1989) Anders, E. & Grevesse, N. 1989, Geochim. Cosmochim. Acta, 53, 197
• Asplund et al. (2006) Asplund, M., Grevesse, N. & Sauval, J. 2006, Comm. in Asteroseismology, 147, 76
• Famaey et al. (2005) Famaey, B., Jorissen, A., Luri, X., Mayor, M., Udry, S., Dejonghe, H. & Turon, C. 2005, A&A, 430, 165
• Fan et al. (1996) Fan, X., Burstein, D., Chen, J. S., et al. 1996, AJ, 112, 628
• Fagerholm (1906) Fagerholm, E. 1906, PhD Thesis, Uppsala Univ.
• Fulbright et al. (2006) Fulbright, J. P., McWilliam, A. & Rich, R. M. 2006, ApJ, 636, 821
• Fulbright et al. (2007) Fulbright, J. P., McWilliam, A. & Rich, R. M. 2007, ApJ, 661, 1152
• Grevesse & Sauval (1998) Grevesse, N. & Sauval, A. J. 1998, Space Sci. Rev., 85, 161
• Gullberg & Lindegren (2002) Gullberg, D. & Lindegren, L. 2002, A&A, 390, 383
• Hambly et al. (2001) Hambly, N. C., MacGillivray, H. T., Read, M. A., Tritton, S. B., Thomson, E. B., Kelly, B. D., Morgan, D. H., Smith, R. E., Driver, S. P., Williamson, J., Parker, Q. A., Hawkins, M. R. S., Williams, P. M. & Lawrence, A. 2001, MNRAS, 326, 1279
• Høg et al. (2000) Høg, E., Fabricius, C., Makarov, V. V., Urban, S., Corbin, T., Wycoff, G., Bastian, U., Schwekendiek, P. & Wicenec, A. 2000, A&A, 355, L27
• Keller et al. (2007) Keller, S. C., Schmidt, B. P., Bessell, M. S., Conroy, P. G., Francis, P., Granlund, A., Kowald, E., Oates, A. P., Martin-Jones, T., Preston, T., Tisserand, P., Vaccarella, A. & Waterson, M. F. 2007, Publ. Astron. Soc. of Australia, 24, 1
• Latham (2001) Latham, D. 2001, Radial Velocities, in Encyclopedia of Astron. and Astrophys., P. Murdin (ed.), IOP: Bristol, article 1864
• Lindegren (1999) Lindegren, L. 1999, in Precise Stellar Radial Velocities, J. B. Hearnshaw and C. D. Scarfe (eds.), ASP Conf. Ser., 185, 73
• Montgomery et al. (1993) Montgomery, K. A., Marschall, L. A. & Janes, K. A. 1993, AJ, 106, 181
• Munari et al. (2001) Munari, U., Agnolin, P. & Tomasella, A. 2001, BaltA, 10, 613
• Munari et al. (2005) Munari, U., Sordo, R., Castelli, F. & Zwitter, T. 2005, A&A, 442, 615
• Munari et al. (2005a) Munari, U., Fiorucci, M. and the RAVE collaboration 2005a, BAAS, 37, 1367
• Nordström et al. (2004) Nordström, B., Mayor, M., Andersen, J., Holmberg, J., Pont, F., Jorgensen, B. R., Olsen, E. H., Udry, S. & Mowlavi, N. 2004, A&A, 418, 989
• Ocvirk et al. (2006a) Ocvirk, P., Pichon, C., Lançon, A. & Thiébaut, E. 2006, MNRAS, 365, 46
• Ocvirk et al. (2006b) Ocvirk, P., Pichon, C., Lançon, A. & Thiébaut, E. 2006, MNRAS, 365, 74
• Pichon et al. (2002) Pichon, C., Siebert, A. & Bienaymé, O. 2002, MNRAS, 329, 181
• Randich et al. (2006) Randich, S., Sestito, P., Primas, F., Pallavicini, R. & Pasquini, L. 2006, A&A, 450, 557
• Rickman (2002) Rickman, H. 2002, IAU Inf. Bull., 91
• Robin et al. (2003) Robin, A. C., Reylé, C., Derrière, S. & Picaud, S. 2003, A&A, 409, 523
• Salaris et al. (1993) Salaris, M., Chieffi, A. & Straniero, O. 1993, ApJ, 414, 580
• Seabroke et al. (2008) Seabroke, G. M., Gilmore, G., Siebert, A., et al. 2008, MNRAS, 384, 11
• Smith et al. (2007) Smith, M. C., Ruchti, G. R., Helmi, A., et al. 2007, MNRAS, 379, 755
• Soubiran & Girard (2005) Soubiran, C. & Girard, P. 2005, A&A, 438, 139
• Steinmetz (2003) Steinmetz, M. 2003, ASP Conf. Ser., 298, 381
• Steinmetz et al. (2006) Steinmetz, M., Zwitter, T., Siebert, A., et al. 2006, AJ, 132, 1645 (Paper I)
• Veltz et al. (2008) Veltz, L., Bienaymé, O., Freeman, K. C., et al. 2008, A&A, 480, 753
• Zacharias et al. (2004) Zacharias, N., Urban, S. E., Zacharias, M. I., Wycoff, G. L., Hall, D. M., Germain, M. E., Holdenried, E. R. & Winter, L. 2004, AJ, 127, 3043
• Zwitter (2002) Zwitter, T. 2002, A&A, 386, 748
• Zwitter et al. (2004) Zwitter, T., Castelli, F. & Munari, U. 2004, A&A, 417, 1055

## Appendix A

Table 9 describes the contents of individual columns of the second data release catalog. The catalog is accessible online at www.rave-survey.org and via the Strasbourg Astronomical Data Center (CDS) services.

## Appendix B: External data

Tables 10–14 compare the results of RAVE observations with those from the external datasets. The latter are discussed in Sec. 4.2.1.