1712.0
ABSTRACT
We define and characterise a sample of 1.3 million galaxies extracted from the first year of
Dark Energy Survey data, optimised to measure Baryon Acoustic Oscillations in the presence
of significant redshift uncertainties. The sample is dominated by luminous red galaxies located
at redshifts z ≳ 0.6. We define the exact selection using color and magnitude cuts that balance
the need for high number densities and small photometric redshift uncertainties, using the
corresponding forecasted BAO distance error as a figure-of-merit in the process. The typical
photo-z uncertainty varies from 2.3% to 3.6% (in units of 1+z) from z = 0.6 to 1, with number
densities from 200 to 130 galaxies per deg² in tomographic bins of width ∆z = 0.1. Next we
summarise the validation of the photometric redshift estimation. We characterise and mitigate
observational systematics including stellar contamination, and show that the clustering on
large scales is robust against those contaminants. We show that the clustering signal in the
auto-correlations and cross-correlations is generally consistent with theoretical models, which
serves as an additional test of the redshift distributions.
Key words: cosmology: observations - (cosmology:) large-scale structure of Universe
[Figure 1: sky map of the sample in RA and DEC, color-coded by projected density n_g in arcmin⁻²]
Figure 1. Angular distribution and projected density of the DES-Y1 red galaxy sample described in this paper, and subsequently used for BAO measurements.
The unmasked footprint comprises the two largest compact regions of the dataset: one in the southern hemisphere of 1203 deg², overlapping South Pole
Telescope observations (SPT; Carlstrom et al. 2011), and 115 deg² near the celestial equator, overlapping with Stripe 82 (S82, Annis et al. 2014). The sample
consists of about 1.3 million galaxies with photometric redshifts in the range [0.6 − 1.0] and constitutes the baseline for our DES Y1 BAO analysis.
Table 1. Complete description of the selection performed to obtain a sample dominated by red galaxies with a good compromise of photo-z accuracy and
number density, optimal for the BAO measurement presented in DES-BAO-MAIN. The redshifts of the resulting catalogue are then computed using different
codes (BPZ and DNF) as described in Sec 2. Therefore, any subsequent photo-z selection can be done either with zphoto from BPZ or DNF.
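As an illustration of how such a selection can be applied, the following minimal sketch combines the flux, star-galaxy and redshift-dependent magnitude cuts quoted in Section 3 into a single boolean test. The `Galaxy` record and the function name are hypothetical, introduced only for this example. The flag-based masks, the color-outlier cuts and the color cut of Eq. (2) are not included, so this is not the full Table 1 selection.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    """Hypothetical catalogue record holding only the columns used below."""
    i_auto: float             # i-band MAG_AUTO magnitude
    spread_model_i: float     # SExtractor morphology estimator (i band)
    spreaderr_model_i: float  # error associated with spread_model_i
    z_photo: float            # photometric redshift point estimate


def passes_selection(g: Galaxy) -> bool:
    """Apply the subset of cuts whose numerical values are quoted in Section 3."""
    # Global flux limits: faint limit of Eq. (1) plus removal of very bright objects.
    flux_ok = 17.5 < g.i_auto < 22.0
    # Star-galaxy separation with the threshold adopted for the BAO sample.
    galaxy_like = g.spread_model_i + (5.0 / 3.0) * g.spreaderr_model_i > 0.007
    # Redshift-dependent magnitude limit of Eq. (3) with a3 = 19, a4 = 3.
    bright_enough = g.i_auto < 19.0 + 3.0 * g.z_photo
    # Photometric redshift range of the BAO sample.
    in_z_range = 0.6 <= g.z_photo <= 1.0
    return flux_ok and galaxy_like and bright_enough and in_z_range
```

For instance, an extended object with i_auto = 21.0 passes at z_photo = 0.8 (limit 21.4) but fails at z_photo = 0.6 (limit 20.8).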
avoid imaging artifacts and pernicious regions with foreground objects using the cuts on flags_badregion and flags_gold described therein. In the rest of this section we go into finer details on the flux, color and star-galaxy separation selection.

In Table 1, we summarise this sample selection, including references to the sections where these cuts are explained.

3.1 Completeness and color outliers cuts

The overall flux-limit of the sample is set as

i_auto < 22. (1)

Additionally, we remove the most luminous objects by making the cut i_auto > 17.5. The cut of Eq. (1) is chosen as a compromise between survey area, given that we need to achieve a homogeneous depth, and the number of galaxies in that area. For a given overall flux limit of the galaxy sample (e.g. all galaxies with i ≤ 22) we select the regions of the survey that are deeper than that limit (e.g. i-band 10σ limit depth > 22) and mask everything brighter. In this way the sample selection should be complete over such a footprint. Clearly, for fainter selections more objects are incorporated into the sample but the area of the survey reaching that depth homogeneously is also smaller. Hence there is a compromise between area and number of objects. In Fig. 2 we show the normalized counts as
Figure 2. Measurement of the trade off between area and number of objects as a function of magnitude limit and sample flux limit in Y1GOLD and SV. For a given i_auto-band “threshold” value we select all regions which have a deeper limiting magnitude than this value (10σ depth limit > “threshold”) and count the galaxies brighter than the “threshold” value over those regions. These should be complete samples at each threshold value. Number counts are shown normalized to their maximum in the figure.

[Figure 3: stellar contamination versus z_photo from 0.60 to 1.00. Legend: MODEST_CLASS (default classifier in Y1GOLD); BAO classifier > 0.005 (extra 3% galaxies vs MODEST); BAO classifier > 0.007 (loss of 3% galaxies vs MODEST)]

Figure 3. Contamination of galaxy sample from stars as a function of redshift and star-galaxy separation threshold, as measured using galaxy density vs stellar density plots (from a pure stellar sample). The MODEST classifier is defined in Drlica-Wagner et al. (2017) as the default star galaxy classifier (based on spread_model and wavg_spread_model). ‘BAO classifier’ stands for a cut in spread_model_i + (5.0/3.0) spreaderr_model_i. A threshold of 0.007 provides an important decrease of contamination with a minor adjustment in the number of galaxies, which becomes significantly more severe at higher thresholds for a very similar purity. The redshift binning here uses z_BPZ−AUTO.
a function of the magnitude limit cut. For comparison we include the same quantity in Science Verification Data, which is deeper than Y1 but has much smaller area, see Crocce et al. (2016). We would like to select a sample and footprint that are at once homogeneous and with the highest possible number of galaxies. The curve shows a plateau in the range 22 ≲ i_auto ≲ 22.3 where the number counts are maximized, with variations of about 5%. But the figure does not account for photo-z performance, which degrades rapidly for fainter objects (particularly at high redshift) and is of key relevance for BAO measurements, as shown below in Sec. 3.4. Therefore we decided to stay at the bright end of this range (i_auto = 22) as an overall flux limit of the sample.

Color outliers which are either unphysical or from special samples (Solar System objects, high redshift quasars) are removed as well, to avoid extraneous photo-z populations in the sample (see Table 1).

[Figure 4: histogram of z_photo from 0.60 to 1.00 for stars selected morphologically]

Figure 4. Photometric redshift distribution of stars selected morphologically and passing the same cuts described in Table 1. The redshift value z_phot is the mean from the pdf of z_BPZ−AUTO, which was used for the overall sample selection in Section 3.

3.2 Star-Galaxy Separation
Removing stars from the galaxy sample is an essential step to avoid the dampening of the BAO signal-to-noise (Carnero et al. 2012) or the introduction of spurious power on large scales (Ross et al. 2011a). Stellar contamination affects the broad shape of the measurement and so we want to minimise it to be able to fit the BAO template properly. However, it does not appreciably affect the location of the BAO feature, so we do not need to push for 100% purity. Any residual contamination is then taken care of by using the weighting scheme detailed in Section 6.

In this work we have used the default star-galaxy classification scheme described in detail in Sevilla-Noarbe et al. (2018), see also Drlica-Wagner et al. (2017), which is based on the i-band coadd magnitude spread_model_i and its associated error spreaderr_model_i, from SExtractor. This classifier was developed using as truth tables data from COSMOS (Leauthaud et al. 2007), GOOD-S (Giavalisco et al. 2004) and VVDS (Le Fèvre et al. 2005) overlapping Y1GOLD, and subsequently tested against CFHTLenS (Erben et al. 2013). The combination spread_model_i + (5.0/3.0) spreaderr_model_i > 0.005 is suggested for high-confidence galaxies as a baseline for Y1GOLD.

A detailed follow up analysis of star-galaxy separation is given in Sevilla-Noarbe et al. (2018). Here instead we decided to modify slightly this proposed cut in order to increase the purity of the sample (from 95% to 97 − 98%), at the cost of losing approximately 3% of the objects, by making the following selection:

spread_model_i + (5.0/3.0) spreaderr_model_i > 0.007.

In Fig. 3 we show the estimated star sample contamination for different thresholds of this cut, using the relation between galaxy density and a map of stellar density built from Y1GOLD (a methodology that is described in detail in section 6). The error bars displayed are the fitting errors obtained for the intercept when parametrizing the contamination level using a linear relationship between the galaxy density as a function of stellar density. Note that a threshold of 0.007 reduces the contamination level to less than 5% across the redshift range of interest. In Table 3 we report a consistent or smaller level of stellar contamination, using a
[Figure 5: three color-color panels with x-axes g − r, r − i and r − z; the inset crosses mark the typical color error and the labelled points mark z = 0.6]

Figure 5. Evolution of BPZ templates in color-color space. Each dot corresponds to a different redshift in steps of 0.1, ranging from z = 0.0 to z = 2.0. The shadowed region in the central panel is excluded from the sample. The black dots indicate the position of z = 0.6 (triangles), and z = 1.0 (squares) for the two reddest templates. Also shown, for reference, is the stellar locus as a purple dashed line. The inset crosses indicate an estimate of the error in the colors, arising from photometric errors, from a sub-sample of DES Y1 galaxies selected in the range 21 < i_auto < 22 (see text for more details).
similar estimation, in the catalogues with MOF photometry, both for BPZ and DNF (see Sec. 6). In Fig. 5 we also include in the middle figure the track from the stellar locus, which showcases the reason why the first two redshift bins are more affected by stellar contamination, as it crosses the elliptical templates at these redshifts. To further illustrate this, in Fig. 4 we show the distribution of the mean photometric redshifts for stars (selected using the criterion |wavg_spread_model_i| < 0.002, a more accurate variant of spread_model_i using single-epoch data, suitable for moderate to bright magnitude ranges) showcasing how they will contaminate preferentially the second redshift bin, following the same trend as shown in Table 3.

3.3 Selecting Red Luminous Galaxies

Next we want to select from Y1GOLD a sample dominated by luminous red galaxies, because their typical photo-z estimates are more accurate than for the average galaxy population, thanks to the 4000 Å Balmer break in their spectra. This feature makes redshift determination easier even with broad-band photometry (Padmanabhan et al. 2005). In addition we want our BAO sample to cover redshifts larger than 0.6 as there are already very precise BAO measurements for z < 0.6, see e.g. Cuesta et al. (2016); Ross et al. (2017b); Beutler et al. (2017).

We have tested that, while a very stringent selection can be done to yield minimal photo-z errors, e.g. with the redMaGiC algorithm (Rozo et al. 2016), it does not lead to optimal BAO constraints because the sample ends up being very sparse, with ∼ 200,000 galaxies in Y1GOLD at z > 0.6 (Elvin-Poole et al. 2017). Instead we will follow an alternative path and apply a standard selection in color-color space to isolate red galaxies at high redshift, balancing photo-z accuracy and number density with a BAO figure-of-merit in mind.

In Figure 5 we show the evolution in redshift of the eight spectral templates used in BPZ, which includes one typical red elliptical galaxy, two spirals and five blue irregulars/starbursts (color coded) based on Coleman, Wu & Weedman (1980) and Kinney et al. (1996). We compute the expected observed DES broad-band magnitudes for these templates as a function of redshift and show them in different color-color combinations. The tracks are evolved from z = 0 to z = 2.0 in steps of 0.1 (marked with dots). We will use them to define cuts in color-color space intended to isolate the red templates.

In real data galaxy colors have an uncertainty due to photometric errors, which effectively thicken those tracks. In order to provide an estimate for this we computed the errors in the colors for a sub-sample of Y1GOLD galaxies with 21 < i_auto < 22 (the typical range of magnitudes that we explore below to define the BAO sample). For each galaxy we estimate the color error adding in quadrature the corresponding magnitude errors¹. The average error in each corresponding color is shown with a cross at the bottom right inset label of the three panels of Fig. 5. Their values are 0.128, 0.073, 0.067, 0.076 for (g−r, r−i, i−z, r−z) respectively.

¹ In turn computed as m_err = −2.5 (Flux_err/Flux)/log(10)

In addition, a model of a red elliptical galaxy spectrum is shown in Figure 6, redshifted to z = 0.4, 0.8, 1.15, where the notable 4000 Å break crosses from g → r, r → i and i → z. This suggests that for z > 0.6 the strongest evolution in color will be for i−z and r−i, and hence we will focus on these color combinations in what follows (which moreover have the smallest errors).

Note how the transition of the 4000 Å break from one band to another abruptly bends the color-color tracks in Figure 5. However, this applies mainly to elliptical templates, and recent star formation will dampen this effect.

3.4 Optimization of the color and magnitude cuts for BAO

Optimizing the actual sample selection for the measurement of BAO in imaging data is considerably different than doing so for spectroscopic data. In the latter case one basically needs to maximize the area (or volume) provided that n̄P > 1 (where n̄ is the galaxy density and P the power spectrum). For imaging data the photometric redshift accuracy plays a vital role. Worse photo-z error degrades the signal as the galaxy radial separations are smeared out (this also complicates the definition of survey volume). In turn, the best photo-z’s are typically obtained for very bright, and low density, samples. Therefore there is a non-trivial interplay to maximise BAO signal to noise.

In DES-BAO-s⊥-METHOD we discussed in detail how to
fold in the photo-z accuracy into an effective n̄_eff². However computing n̄_eff is cumbersome and as complicated as doing an actual BAO forecasting. Therefore we decided to follow this latter path and rely on the Fisher matrix forecast formalism described in Seo & Eisenstein (2007). Provided with a concrete set of color-magnitude cuts we measure in the data the number density and redshift uncertainty in several tomographic bins within 0.6 ≤ photo-z ≤ 1.0, and assume a clustering amplitude. We then use the formulae from Seo & Eisenstein (2007) to predict the precision that one can achieve with that set of galaxy data properties. We repeat this process for a different set of cuts until an optimal BAO distance error is achieved.

[Figure 6: elliptical template spectrum with the DES g, r, i, z filter responses, plotted against wavelength from 4000 to 10000 Å]

Figure 6. Elliptical model spectrum used in template-based fitting code BPZ. Overplotted are the DES response filters g,r,i,z. The template has been redshifted to z = 0.4, 0.8, 1.15, where the notable 4000 Å break crosses from g → r, r → i and i → z.

Through this process we fix the clustering amplitude, assuming a galaxy bias of b = 1.6 for all calculations. This is the bias found in Crocce et al. (2016) for a flux limited sample (i < 22.5) at redshifts z ∼ 0.9, selected from DES Science Verification (SV) data. Since that redshift and magnitude are compatible with what we expect in this paper, we consider b = 1.6 a representative value. More precise measurements are expected for more biased samples, but the galaxy bias for any given sample is not known a priori and the redshift uncertainty and number density are the more dominant factors.

For illustrative purposes we show in Table 2 the variation in BAO distance error achieved by changing the number density and photo-z accuracy away from those at the optimal cuts described below. We also include the variation with survey area. As pointed out before, BAO distance errors are very sensitive to photo-z accuracy.

[Table 2: columns are property, variation, and forecasted BAO distance error]

3.4.1 Optimization of the color cut

Thus, in order to maximize the signal-to-noise of the BAO forecasted measurement, a color cut is applied to the sample in the […] Sec. 3.3 (see Fig. 5), as it allows us to preferentially select the reddest galaxies, which are the ones with lower uncertainties in their photometric redshift determination, and still present a high enough number density.

Samples were produced across a grid of a1 and a2 values, calculating the number of galaxies N_gal and a mean width of the photo-z distribution σ_z/(1 + z) for each sample, after splitting the galaxies into tomographic bins. For BPZ we estimated σ_z averaging in each tomographic bin the width of the individual redshift posterior distributions (PDFs) provided per galaxy.

The BAO forecast using the algorithm of Seo & Eisenstein (2007) is then run for the N_gal and σ_z/(1 + z) of each sample and final values of a1 and a2 are selected to minimise the forecasted BAO uncertainty, finding a balance between galaxy number density and redshift uncertainty. In order to give a sense for the sensitivity of such process, we note there is a slight degeneracy when increasing a1 and a2 simultaneously, resulting in similar forecasted BAO uncertainties. However deviations from this degeneracy direction lead to significant degradation in the forecasted error. For example, doubling a1 leads to a degradation of the forecasted error by approximately 0.01 (from 5% to 6% roughly). The values used in this analysis are a1 = 2.0, a2 = 1.7. Figure 5 shows the color cut in the central panel, where the shadowed region is excluded from the sample.

3.4.2 Optimization of the magnitude cut

To further minimize the forecasted BAO uncertainty, an additional, redshift dependent magnitude cut is applied to the sample as a second step. This applies a cut to i_auto at low redshift which is stricter than the global i_auto < 22 cut (at lower redshift the sample is sufficiently abundant that one can still select brighter galaxies, with better photo-z, and still be sample variance dominated). The cut is in the form,

i_auto < a3 + a4 z. (3)

As with the color cut in Eq. 2, this is designed to find a sample that balances redshift uncertainty with number density, to minimise the forecasted BAO error. The BAO forecast error was minimised at the values a3 = 19 and a4 = 3 and this cut was applied to the sample. We find that the forecasted error improves by ∼ 15% when introducing the redshift dependent flux limit as opposed to a global i_auto < 22 cut.
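As a quick worked check of Eq. (3), the quoted values a3 = 19 and a4 = 3 can be evaluated at the edges of the tomographic bins. The symbol i_lim is introduced here only for illustration, and z is assumed to be the photometric redshift point estimate of each galaxy.

```latex
\begin{align*}
i_{\rm lim}(z) &= a_3 + a_4 z = 19 + 3z, \\
i_{\rm lim}(0.6) &= 20.8, \quad
i_{\rm lim}(0.7) = 21.1, \quad
i_{\rm lim}(0.8) = 21.4, \quad
i_{\rm lim}(0.9) = 21.7, \quad
i_{\rm lim}(1.0) = 22.0 .
\end{align*}
```

With these values the redshift-dependent limit is 1.2 magnitudes brighter than the global limit at z = 0.6 and coincides with i_auto < 22 at z = 1.0, so the cut only tightens the selection at the low-redshift end.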
² Photometric redshift errors lead to n̄_eff P < 1 in all cases explored.

The final forecasted uncertainty on angular diameter distance
combining all the tomographic bins is ∼ 4.7%. Note that the discussion in this section only has as a goal the definition of the sample. The real data analysis with the sample defined here, and the final BAO error achieved, will of course depend on many other variables that were not considered up to this point, such as the quality of photometric redshift errors, analysis and mitigation of systematics, use of the full covariance and optimized BAO extraction methods.

Nonetheless we stress that the forecasted error obtained in this section matches the one from the analysis of mock simulations, see e.g. DES-BAO-θ-METHOD, and is in fact quite close to the final BAO error obtained in DES-BAO-MAIN. In the following sections we discuss the various components that will enter the real data analysis, starting with the validation of photometric redshift errors and the estimate of redshift distributions.

4 PHOTOMETRIC REDSHIFTS

The photometric redshifts used for redshift binning and transverse distance computations in our fiducial analyses are derived using the Directional Neighborhood Fitting (DNF) algorithm (De Vicente, Sánchez & Sevilla-Noarbe 2016), which is trained with public spectroscopic samples as detailed in Hoyle et al. (2017). For comparison we also discuss below the Bayesian Photometric Redshift (BPZ) (Benítez 2000) which we find slightly less performant in terms of the error with respect to “true” redshift values (see below). In both cases we use MOF photometry which provides ∼ 10−20% more accurate photo-z estimates with respect to the equivalent estimates using SExtractor MAG_AUTO quantities from coadd photometry. In this section we summarise the steps taken to arrive at these choices, based on a validation against data over the COSMOS field.

We recall that throughout this work we use the individual object’s mean photo-z from BPZ (not to be confused with the mean value z̄ = ⟨z⟩ of the sample) and the predicted value in the fitted hyper-plane from the DNF code, as our point estimate for galaxy redshifts. As for the estimates of the N(z) from the photo-z codes, for comparison with our fiducial choice based on the COSMOS narrow band p(z), we will use the stacking of Monte Carlo realisations of the posterior redshift distributions p(z) for the BPZ estimates, or the stacking from the nearest neighbour redshifts from the training sample, in the case of DNF (henceforth we’ll call these stack N(z)). Figure 7 shows the stack N(z) (yellow histograms) in all 4 redshift bins for our fiducial DNF photo-z analysis.

4.1 COSMOS Validation

As detailed in DES-BAO-PHOTOZ, we check the performance of each code by using redshifts in the COSMOS field (which are not part of the training set in the case of DNF), following the procedure outlined in Hoyle et al. (2017). These redshifts are either spectroscopic or accurate (σ68 < 0.01) 30-band photo-z estimates from Laigle et al. (2016). Both validation samples give consistent results in our case because the samples under study are relatively bright.

The COSMOS field is not part of the DES survey. However a few select exposures were done by DECam which were processed by DESDM using the main survey pipeline. We call this sample DES-COSMOS. Because the COSMOS area is small (2 square degrees) and DECam COSMOS images were deeper and not taken as part of the main DES-Y1 Survey, we need to first resample the DES-COSMOS photometry to make it representative of the full DES Y1 samples that we select in our BAO analysis. Hence we add noise to the fluxes in the DES-COSMOS catalog to match the noise properties of the fluxes in the DES-Y1 BAO sample; this is what we refer to as resampled photometry. Then for each galaxy in the DES-Y1 BAO sample, we select the galaxy in DES-COSMOS whose resampled flux returns a minimum χ² when compared to the DES-Y1 BAO flux (the χ² combines all bands, g, r, i and z). This is done for every galaxy in the DES-Y1 BAO sample to make up the ‘COSMOS-Validation’ catalog, which by construction has colors matching those in the DES-Y1 BAO sample. The “true” redshift is retrieved from the spectroscopic/30-band photo-z of this match.

We then run the DNF photo-z code over the COSMOS-Validation catalog to select 4 redshift bin samples in the same way as we did for the full DES-Y1 BAO sample. We use the “true” redshifts from the COSMOS-Validation catalogs to estimate the N(z) in each redshift bin by normalising the histogram of these true redshifts.

Results are shown as histograms in Figure 7, which are compared to the stack N(z) from the photo-z code, for reference. The black histograms show large fluctuations which are caused by real individual large scale structures in the COSMOS field. This can be seen by visual inspection of the maps. This sampling variance comes from the relatively small size of the COSMOS validation region. There is also a shot-noise component, indicated by the error bars over the black dots, but it is smaller. In the next section, we briefly describe the methodology to correct for this to be able to make use of this validation sample effectively.

4.2 Sample variance correction

As detailed in DES-BAO-PHOTOZ we apply a sampling variance correction (SVC) to the data and test this method with the Halogen mocks described in DES-BAO-MOCKS. In what follows we provide a summary of such process and its main results.

We use the VIPERS catalog (Scodeggio et al. 2016), which spans 24 square degrees to i < 22.5, to estimate the sampling variance effects in the above COSMOS validation. After correcting VIPERS for target, color and spectroscopic incompleteness we select galaxies in a similar way as done in section 3. We then use the VIPERS redshifts to estimate the true N(z) distribution of the parent DES-COSMOS sample (before we select in photometric redshifts). The ratio of the N(z) in the DES-COSMOS sample to the one in VIPERS gives a sample variance correction that needs to be applied to the N(z) in each of the tomographic bins.

Figure 7 shows the SVC-corrected version of the raw COSMOS catalog in magenta. As shown in this figure the resulting distribution is much smoother than the original raw measurements (black histograms). This by itself indicates that SVC is working well. Tests in simulations show that this SVC method is unbiased and reduces the errors in the mean and variance of the N(z) distribution by up to a factor of two. Similar results are found for different binnings in redshift.

Notably, the distributions obtained from the stacked N(z) and the ones from COSMOS SVC match well overall, although some discrepancies can be seen, e.g. for the second and fourth bin. More quantitative statements are provided below, but in DES-BAO-MAIN (Table 5, entry denoted “w(θ) z uncal”) we show these have no impact on our cosmological results. The difference in angular diameter distance measurements when using either of these two sets of redshift distributions is less than ∼ 0.25σ.

[Figure 7: four panels of N(z) (normalized dN/dz) versus true redshift z, one per MOF-DNF bin. Panel labels:]

Bin          ⟨z_stack⟩   COSMOS SVC (W68, ∆z, σ68)   DES-Y1 stack (W68, σ68)   COSMOS raw (W68, ∆z, σ68)
0.6 < z < 0.7   0.657    0.046, 0.005, 0.023         0.048, 0.022              0.047, 0.0, 0.022
0.7 < z < 0.8   0.758    0.057, 0.019, 0.028         0.061, 0.028              0.068, 0.019, 0.029
0.8 < z < 0.9   0.852    0.057, 0.008, 0.029         0.064, 0.03               0.06, 0.002, 0.027
0.9 < z < 1.0   0.934    0.072, 0.008, 0.036         0.078, 0.037              0.067, 0.009, 0.033

Figure 7. Normalised redshift distributions for our different tomographic bins of DNF-MOF photo-z. Stack N(z) are shown for the full DES-Y1 BAO sample (yellow histograms). The black histogram (with Poisson error bars) shows the raw 30-band photo-z from the COSMOS-DES validation sample. Magenta lines show the same sample corrected by sample variance cancellation (SVC, see text), which is our fiducial estimate. The labels show the values of W68, σ68 and ∆z = ⟨z_stack⟩ − ⟨z⟩ in each case, see also Table 3.

4.3 Photo-z validation results

In Table 3 we show the values of σ68, which corresponds to the 68% interval of values in the distribution of (z_photo − z_true)/(1 + z_true) around its median value, where z_photo is the photo-z from DNF (z_mean above) and z_true is the redshift from the COSMOS validation sample corrected by SVC. We also show W68 and z̄ which are the 68% interval and mean redshift in the z_true distribution for each redshift bin. The corresponding values for the stack N(z) and raw N(z) are also shown in the labels of Figure 7. ∆z in the label inset shows the difference ∆z = ⟨z_stack⟩ − ⟨z⟩, where ⟨z_stack⟩ is the mean stack redshift for DES-Y1, shown in the top label.

We have performed an extensive comparison of the quantities shown in Table 3 computed with different validation sets: DES-COSMOS with and without SVC, using N(z) from DNF stacks, using the COSMOS subsample with spectroscopic redshifts (as opposed to that with 30-band photo-z). We have also compared these N(z) to the one predicted by the subset of galaxies that have spectra within the BAO sample over the full DES-Y1 footprint. Furthermore we have performed a validation using a larger spectroscopic sample in the VIPERS/W4 field (∼ 4 square degrees) which was

Table 3. Characteristics of the DES Y1 BAO sample, as a function of redshift. Results are shown for a selection of the sample in bins according to the DNF photo-z (z_phot) estimate at the top of the table and BPZ at the bottom, both with MOF photometry. Here z̄ = ⟨z_true⟩ is the mean true redshift, σ68 and W68 are the 68% confidence widths of (z_phot − z_true)/(1 + z_true) and z_true respectively, all estimated from COSMOS-DES validation with SVC correction, as detailed in Sec. 4 and Fig. 7. f_star is the estimated stellar contamination fraction, see Sec. 6.

DNF          N_gal    bias          z̄       σ68     W68     f_star
0.6 − 0.7    386057   1.81 ± 0.05   0.652   0.023   0.047   0.004
0.7 − 0.8    353789   1.77 ± 0.05   0.739   0.028   0.068   0.037
0.8 − 0.9    330959   1.78 ± 0.05   0.844   0.029   0.060   0.012
0.9 − 1.0    229395   2.05 ± 0.06   0.936   0.036   0.067   0.015

BPZ          N_gal    bias          z̄       σ68     W68     f_star
0.6 − 0.7    332242   1.90 ± 0.05   0.656   0.027   0.049   0.018
0.7 − 0.8    429366   1.79 ± 0.05   0.746   0.031   0.076   0.042
0.8 − 0.9    380059   1.81 ± 0.06   0.866   0.034   0.060   0.015
0.9 − 1.0    180560   2.05 ± 0.07   0.948   0.039   0.068   0.006
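The σ68 and W68 statistics defined in Section 4.3 can be illustrated with a short sketch. It assumes that both quantities are the half-width of the interval between the 16th and 84th percentiles, a common convention for 68% widths; the text does not spell out the exact convention, and the helper names below are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty sequence."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def half_width_68(values):
    """Half the distance between the 16th and 84th percentiles (assumed convention)."""
    return 0.5 * (percentile(values, 84.0) - percentile(values, 16.0))


def sigma68(z_photo, z_true):
    """68% width of the scaled photo-z residual (z_photo - z_true) / (1 + z_true)."""
    scaled = [(zp - zt) / (1.0 + zt) for zp, zt in zip(z_photo, z_true)]
    return half_width_68(scaled)


def w68(z_true):
    """68% width of the true-redshift distribution within a tomographic bin."""
    return half_width_68(z_true)
```

Using percentiles rather than a standard deviation keeps both statistics insensitive to the tails of the distribution, such as catastrophic photo-z outliers.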
[Figure 8, top and middle panels: n_gal/⟨n_gal⟩ versus stellar density (arcmin⁻²); the middle panel is labelled 0.6 < z < 1.0]

We found the most important systematic effect, in terms of its impact on the measured clustering, to be the stellar density. In the top panel of Fig. 8 we find positive trends when comparing the number density of our ‘galaxy’ sample as a function of the stellar number density (n_star). Our interpretation is that there are stars in our sample. Assuming these contaminating stars follow the same spatial distribution as the stars we use to create our stellar density map, this stellar contamination will produce a linear relationship between the density of our galaxy sample and the stellar density. In this scenario, the value of the best-fit trend where the number density of stars, n_star, is 0 is then the purity of the sample. We find the results are indeed consistent with a linear relationship, as illustrated in the top panel of Fig. 8. The stellar contamination, f_star, that can be determined from these plots is listed in Table 3. The stellar contamination varies significantly with redshift, as expected given the proximity of the stellar locus to the red sequence as a function of redshift. Thus, we measure the stellar contamination in ∆z = 0.05 bin widths and use a cubic spline interpolation in order to obtain the stellar contamination at any given redshift. This allows us to assign a weight to each galaxy given by,

$$ w(f_{\rm star}(z)) = \left[ (1 - f_{\rm star}(z)) + n_{\rm star}\, f_{\rm star}(z) / \langle n_{\rm star} \rangle \right]^{-1}, \qquad (4) $$
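The intercept-based estimate of the stellar contamination and the weight of Eq. (4) can be sketched as follows. This is a minimal illustration that assumes an unweighted least-squares fit (the fits quoted in the text report χ² values, so they presumably account for the measurement errors) and a single redshift bin, so no spline interpolation in redshift is shown. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def stellar_fraction(n_star, rel_galaxy_density):
    """Contamination f_star = 1 - purity, where the purity is the value of the
    fitted n_gal/<n_gal> trend extrapolated to zero stellar density."""
    intercept, _ = fit_line(n_star, rel_galaxy_density)
    return 1.0 - intercept


def stellar_weight(n_star, mean_n_star, f_star):
    """Per-galaxy weight of Eq. (4) at the local stellar density n_star."""
    return 1.0 / ((1.0 - f_star) + n_star * f_star / mean_n_star)
```

By construction the weight equals one where the local stellar density matches its mean, is below one in star-rich regions, and is above one in star-poor regions, which removes the linear trend.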
the mean i-band PSF FWHM (seeing, which we denote as si ) and 1.00
the g-band depth (dg ). For the seeing, we do not find a strong de- 0.95
pendence on redshift and thus use the full sample to define the see-
ing dependent weight 0.90
−1
w(si ) = (As + Bs si ) , (5) 0.85
where As and Bs are simply the intercept and slope of the best- 0.80
fit linear relationship, shown in the middle panel of Fig. 8. The 22.00 22.25 22.50 22.75 23.00 23.25 23.50 23.75 24.00
coefficients we use are Ai = 0.782 and Bi = 0.0625. For the 10σ g-band depth limit (magnitudes)
g-band depth, we fit linear relationships in redshift bins ∆z = 0.1
and again use a cubic spline interpolation in order to obtain a weight Figure 8. The galaxy density vs. potential systematic relationship used to
at any redshift define weights that we apply to clustering measurements. Top panel: The
galaxy density versus stellar density in four photometric redshift bins. The
w(dg , z) = (C(z) + dg (1 − C(z))/hdg i)−1 , (6)
linear fits are used to determine the stellar contamination. The χ2 values for
where C(z) is the interpolated result for the value of the linear- the fits are 9.7, 10.0, 3.5, and 14.3 (8 degrees of freedom). Middle panel:
fit where dg = 0. The relationships as a function of redshift and The galaxy density versus the mean i-band seeing for our full sample. The
the linear best-fit models are shown in the bottom panel of Fig. 8. inverse linear fit is used to define weights applied to clustering measure-
ments. The χ2 is 7.7 (8 degrees of freedom) and the coefficients are 0.788
The total systematic weight, wsys , is thus multiplication of the three
and 0.0618. Bottom panel: The galaxy density versus g-band depth in four
weights
photometric redshift bins. The coefficients are interpolated as a function of
wsys = w(fstar (z))w(si )w(dg , z). (7) redshift and used to define weights to be used in the clustering measure-
ments. The χ2 values for the fits, given 8 degrees of freedom, are 7.7, 8.9,
The dependencies we find are purely empirical as we lack any 12.7, and 6.1. The slopes are (-0.0256, 0.0320, 0.103, 0.0609).
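Equations (4) to (7) can be evaluated per galaxy as in the minimal Python sketch below. It assumes that $f_{\rm star}(z)$ and $C(z)$ have already been interpolated to the galaxy redshift (the cubic spline step is not reproduced here). The function names and the example input values are hypothetical; only the seeing coefficients are taken from the text.

```python
def stellar_weight(f_star, n_star, mean_n_star):
    """Eq. (4): inverse of the density modulation caused by stellar contamination."""
    return 1.0 / ((1.0 - f_star) + n_star * f_star / mean_n_star)


def seeing_weight(s_i, a_s=0.782, b_s=0.0625):
    """Eq. (5): inverse of the linear fit of galaxy density versus i-band seeing.

    s_i must be in the units used for the fit (not stated in this section).
    """
    return 1.0 / (a_s + b_s * s_i)


def depth_weight(d_g, c_z, mean_d_g):
    """Eq. (6): inverse of the linear fit of galaxy density versus g-band depth.

    c_z is C(z), the fitted intercept interpolated to the galaxy redshift.
    """
    return 1.0 / (c_z + d_g * (1.0 - c_z) / mean_d_g)


def systematic_weight(f_star, n_star, mean_n_star, s_i, d_g, c_z, mean_d_g):
    """Eq. (7): product of the three individual weights."""
    return (stellar_weight(f_star, n_star, mean_n_star)
            * seeing_weight(s_i)
            * depth_weight(d_g, c_z, mean_d_g))


# Hypothetical inputs for one galaxy, for illustration only.
w_sys = systematic_weight(f_star=0.04, n_star=1.0, mean_n_star=0.8,
                          s_i=3.5, d_g=23.2, c_z=0.9, mean_d_g=23.0)
print(w_sys)
```

By construction the stellar and depth weights equal one where the local stellar density or depth equals its mean, so the weights only rescale galaxies in regions that depart from average conditions.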
7 TWO-POINT CLUSTERING
In this section we describe the basic two-point clustering properties
of the samples previously defined. We concentrate on large scales,
where the BAO signal resides, and on the sample defined with the $z_{\rm DNF-MOF}$
photometric redshifts, which is the default one used in DES-BAO-MAIN.
We compute the angular correlation function w(θ) of the sam-
ple, split into four redshift bins, using the standard Landy-Szalay
estimator (Landy & Szalay 1993),
$$w(\theta) = \frac{DD(\theta) - 2DR(\theta) + RR(\theta)}{RR(\theta)} \tag{8}$$

as implemented in the CUTE software$^5$ (Alonso 2012), where $DD(\theta)$, $DR(\theta)$ and $RR(\theta)$ refer to normalized pair-counts of Data (D) and Random (R) points, separated by an angular aperture $\theta$. Random points are uniformly distributed across the footprint defined by our mask (albeit downsampled following the fractional coverage of each pixel, described in Sec. 5), with an abundance twenty times larger than that of the data in each given bin. For the fits and $\chi^2$ values quoted in this section we always consider 16 angular bins linearly spaced between $\theta = 0.45$ deg and $\theta = 4.95$ deg, matching the scale cuts in the BAO analysis using $w(\theta)$ of DES-BAO-MAIN. We compute pair-counts in angular aperture bins of width 0.3 deg in order to reduce the covariance between the measurements. The covariance matrix is derived from 1800 Halogen mocks, described in detail in DES-BAO-MOCKS.

The expected noise in the inverse covariance from the finite number of realisations (Hartlap, Simon & Schneider 2007) and the translation of that into the variance of derived parameters (Dodelson & Schneider 2013) is negligible given the size of our data vector (16 angular measurements per tomographic redshift bin) and the number of model parameters (one bias per bin). For instance, the increased error in derived best-fit biases in any given bin would be sub-percent. The change in the full $\chi^2$ is $\sim 3.7\%$ ($16 \times 4$ data points, see the discussion below). We therefore neglect these corrections in this section.

Figure 9 shows the impact of the systematic weights on the measured angular clustering in terms of the difference $\Delta w$ between the pre-weighted correlation function $w$ and the post-weighted one $w_{\rm weighted}$, relative to the statistical error $\sigma_w$ (i.e. neglecting all covariance). To compare this against the expected amplitude of the BAO feature at these scales we also display, as a thick solid black line, the difference between the theoretical angular correlation functions with and without BAO, for the second tomographic bin for concreteness, relative to the statistical errors. The corrections are all at the same level as (or smaller than) the expected BAO signal.

The weights have the largest impact in terms of clustering amplitude for the redshift bin $0.7 < z < 0.8$, which is the redshift range with the largest stellar contamination ($\sim 4\%$, see Table 3), although never exceeding one $\sigma_w$. For the remaining bins the change in the correlation functions is within 1/4 of $\sigma_w$. We can assess quantitatively the total potential impact of the weights by calculating $\chi^2_{\rm sys} = \Delta w(\theta)^{t} C^{-1} \Delta w(\theta)$; the square root of this number is an upper bound on the impact, in terms of number of $\sigma$'s, that the weights could have on the determination of any model parameter. In the range $0.45$ deg $< \theta < 4.95$ deg, with 16 data points, we find $\chi^2_{\rm sys} = 0.1$, 1.35, 0.2 and 0.5 respectively for each tomographic bin separately (showing that, for example, the best-fit bias derived solely from the 2nd tomographic bin can be shifted by more than one sigma if the weights are not applied). More interestingly, for the four bins combined and including the full covariance matrix, we find $\chi^2_{\rm sys} = 1.35$. This implies a maximum impact of $1.16\sigma$ in

Figure 9. Top panel shows the impact of the systematic weights on each redshift bin, shown by the differential angular correlations, with and without weights applied, relative to the uncertainty. One can see that the weights make the biggest difference for the $0.7 < z < 0.8$ bin, which is the redshift range with the greatest stellar contamination. The thick solid line displays the BAO feature in similar units, $(w_{\rm BAO} - w_{\rm no\,BAO})/\sigma_w$, for the second tomographic bin as an example (different bins show similar BAO strength but displaced slightly in the angular coordinate). The systematic weights only modify the underlying smooth shape, and do not have a sharp feature at BAO scales. Bottom panel shows the ratio of correlations for each bin, which provides additional information on the absolute size of the corrections (in this case we only plot up to the scales where $w$ has no zero crossings).

$^5$ https://github.com/damonge/CUTE
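The main quantities used in this section, namely the estimator of equation (8), the finite-mock correction to the inverse covariance, and $\chi^2_{\rm sys}$, can be sketched in Python as follows. This is an illustrative sketch and not the CUTE implementation: the function names are hypothetical, the toy inputs are invented, and the correction is assumed to take the standard form of Hartlap, Simon & Schneider (2007), which the text cites but does not write out.

```python
import numpy as np


def landy_szalay(dd, dr, rr):
    """Equation (8), evaluated per angular bin from normalized pair counts."""
    dd, dr, rr = (np.asarray(x, dtype=float) for x in (dd, dr, rr))
    return (dd - 2.0 * dr + rr) / rr


def hartlap_factor(n_mocks, n_data):
    """Factor multiplying an inverse covariance estimated from n_mocks mocks
    (standard Hartlap et al. 2007 form; assumed, not written out in the text)."""
    return (n_mocks - n_data - 2.0) / (n_mocks - 1.0)


def chi2_sys(w_unweighted, w_weighted, cov):
    """Delta_w^T C^{-1} Delta_w for the unweighted minus weighted difference."""
    delta = (np.asarray(w_unweighted, dtype=float)
             - np.asarray(w_weighted, dtype=float))
    return float(delta @ np.linalg.solve(np.asarray(cov, dtype=float), delta))


# 16 bin centres between 0.45 and 4.95 deg, i.e. a spacing of 0.3 deg.
theta = np.linspace(0.45, 4.95, 16)

# Hypothetical toy inputs, for illustration only.
w_raw = 1e-3 / theta
w_corr = 0.98 * w_raw
cov = np.diag((2e-4 / theta) ** 2)

# 1800 mocks and 16 x 4 data points, as quoted in the text.
inflation = 1.0 / hartlap_factor(1800, 16 * 4)   # about 1.037
print(chi2_sys(w_raw, w_corr, cov), inflation)
```

With 1800 mocks and 64 data points the inverse of this factor is about 1.037, the same size as the $\sim 3.7\%$ change quoted for the full $\chi^2$, and the square root of $\chi^2_{\rm sys} = 1.35$ is about 1.16.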
Figure 10. Angular correlation function in four redshift bins, for galaxies selected with $z_{\rm DNF-MOF}$. Symbols with error bars show the clustering of the galaxy sample corrected for the most relevant systematics. The dashed line displays a model using linear theory with an extra damping of the BAO feature due to nonlinearities, and a linear bias fitted to the data (whose best-fit value is reported in the inset labels). We consider 16 data points and one fitting parameter in each case (dof = 15). Note that the points are highly covariant, which might explain the visual mismatch in the first tomographic bin, which nonetheless retains a good $\chi^2$/dof.
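A one-parameter bias fit of the kind described in the caption of Fig. 10 can be sketched as below. The sketch assumes that the model is a fixed template scaled by $b^2$ (the usual linear-bias scaling of an auto-correlation) and uses the closed-form least-squares solution for that amplitude. The function name and the toy inputs are hypothetical, and the fitting procedure actually used for the figure is not specified in this section.

```python
import numpy as np


def fit_linear_bias(w_data, w_template, cov):
    """Fit w_data = b^2 * w_template with covariance cov.

    Returns (b, chi2, dof). The amplitude b^2 minimising
    chi2 = r^T C^{-1} r, with r = w_data - b^2 * w_template, is
    (t^T C^{-1} d) / (t^T C^{-1} t) for a symmetric covariance.
    """
    d = np.asarray(w_data, dtype=float)
    t = np.asarray(w_template, dtype=float)
    c = np.asarray(cov, dtype=float)

    cinv_t = np.linalg.solve(c, t)
    b2 = (cinv_t @ d) / (cinv_t @ t)

    resid = d - b2 * t
    chi2 = float(resid @ np.linalg.solve(c, resid))
    dof = d.size - 1          # one fitted parameter
    return float(np.sqrt(b2)), chi2, dof


# Hypothetical toy example: a template scaled by b = 2, with unit covariance.
template = np.array([1.0, 0.5, 0.25])
data = 4.0 * template
print(fit_linear_bias(data, template, np.eye(3)))
```

With 16 angular bins and one free parameter this gives dof = 15, as in the caption. A negative best-fit amplitude would make the square root undefined, so a real analysis would need to guard against that case.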
Figure 11. Angular cross-correlation functions of the four tomographic bins in $0.6 < z_{\rm photo} < 1.0$ (see Fig. 10), for galaxies selected according to $z_{\rm DNF-MOF}$. The model prediction, shown with dashed lines, assumes a bias equal to the geometric mean of the auto-correlation fits, i.e. $b_{ij} = \sqrt{b_i b_j}$, and is basically proportional to the overlap of the redshift distributions, which are shown in the bottom right panel.
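The geometric-mean bias used for the cross-correlation model in Fig. 11 can be written out as a short Python sketch. The function names and the bias values in the example are hypothetical placeholders, not the fitted values from Fig. 10.

```python
import math


def cross_bias(b_i, b_j):
    """Geometric mean of two auto-correlation biases: b_ij = sqrt(b_i * b_j)."""
    return math.sqrt(b_i * b_j)


def cross_bias_matrix(biases):
    """Matrix of b_ij for all pairs of tomographic bins."""
    return [[cross_bias(b_i, b_j) for b_j in biases] for b_i in biases]


# Hypothetical biases for four tomographic bins, for illustration only.
example_biases = [1.5, 1.6, 1.7, 1.8]
for row in cross_bias_matrix(example_biases):
    print(["%.3f" % value for value in row])
```

The diagonal of the matrix returns the input auto-correlation biases, so the same matrix can be used for both auto- and cross-correlation model amplitudes.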