Search for the Higgs boson in the \mathrm{b\bar{b}} decay channel using the CMS detector

Abstract

A search for the standard model Higgs boson in the \bbbar decay channel has been carried out with the CMS detector at the LHC. The production modes considered are associated VH production, VBF production, and production in association with top-quark pairs (ttH). The analyses are based on pp collision data collected at centre-of-mass energies of 7 and 8 TeV, corresponding to integrated luminosities of 5 fb$^{-1}$ and 20 fb$^{-1}$, respectively. The strategy and results of the searches are reported.

keywords:
Higgs, b-quark, b-tagging, jet, regression, energy resolution

Journal: Nuclear Physics B Proceedings Supplement

1 Motivations

At a mass ($m_H$) of 125 GeV, the decay of the standard model (SM) Higgs boson into a bottom quark-antiquark pair (\bbbar) dominates the total width (1). While the decay of the Higgs boson to vector bosons has been observed in different channels (ZZ, γγ, WW) (2); (3); (4), the direct couplings of the Higgs boson to fermions, and in particular to down-type quarks, remain to be firmly established (5); (6).

The current measurements constrain the coupling to the up-type top quark indirectly, since the dominant Higgs boson production mechanism is gluon fusion induced by a top-quark loop. The measurement of the H → \bbbar decay represents a direct test of whether the observed boson interacts as expected with the quark sector, and it provides the unique final test of the direct coupling of the Higgs boson to down-type quarks, an essential aspect of the nature of the newly discovered boson. To date, the most precise constraints on the couplings to down-type quarks are provided by the CMS experiment (7), exploiting the results from the VH(\bbbar) search (8).

2 Challenges

Despite having the largest expected branching fraction, the H → \bbbar final state is considerably more challenging to measure than the cleaner signatures provided by the decays to ZZ (and then to four leptons) or to two photons, which led to the Higgs boson discovery at the LHC.

Besides the poorer invariant mass resolution, the \bbbar final state is also characterized by a smaller signal-to-background ratio. The dominant production mechanism is gluon fusion ( at LHC), but when paired with the \bbbar decay mode, the resulting irreducible background from QCD production of b quarks is overwhelming, roughly seven orders of magnitude larger than the signal. The signal topology of other production mechanisms is therefore exploited to increase the sensitivity to the H → \bbbar signal. The most sensitive channel at the LHC is the search for the SM decay H → \bbbar in events where the Higgs boson is produced in association with a W or Z boson decaying leptonically. After event selection, VH(\bbbar) signal events are one order of magnitude more numerous than the H → γγ events; however, they are spread over a mass interval that is ten times larger, with a signal-to-background ratio a factor of 40 smaller.

3 b-jet Identification

The identification or “b-tagging” of jets resulting from the fragmentation and hadronization of bottom quarks plays a fundamental role in reducing the otherwise overwhelming background to these signal signatures from processes involving jets from gluons (g), light-flavor quarks (u, d, s), and c-quark fragmentation.

A useful property of B hadrons in this respect is their lifetime, with $c\tau \approx 500$ µm. Therefore a B hadron with a momentum of 50 GeV will fly on average almost half a centimeter before decaying. As a consequence, the secondary decay particles have a sizable impact parameter (IP) with respect to the B hadron production point.
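As a rough cross-check of the flight distance quoted above, the mean decay length in the laboratory frame is L = βγ cτ = (p/m) cτ. The sketch below takes cτ ≈ 500 µm and assumes a B-hadron mass of about 5.28 GeV (an approximate B-meson mass that is not stated in the text); the function name is purely illustrative.

```python
# Mean lab-frame flight distance of a B hadron: L = beta*gamma * c*tau = (p / m) * c*tau.
# Assumptions (not from the text): B-hadron mass of ~5.28 GeV; illustrative function name.

B_MASS_GEV = 5.28   # approximate B-meson mass (assumption)
C_TAU_UM = 500.0    # proper decay length in micrometres


def mean_flight_distance_mm(p_gev, mass_gev=B_MASS_GEV, c_tau_um=C_TAU_UM):
    """Return the mean decay length in millimetres for a momentum p in GeV."""
    beta_gamma = p_gev / mass_gev
    return beta_gamma * c_tau_um / 1000.0


flight_mm = mean_flight_distance_mm(50.0)
print(f"{flight_mm:.1f} mm")  # prints "4.7 mm"
```

Under these assumptions a 50 GeV B hadron travels about 4.7 mm on average, in line with the "almost half a centimeter" estimate.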

B hadrons are much more massive than their decay products, so the decay products carry a few GeV of momentum in the B rest frame. This effect is particularly evident for leptons from B decays, which have a \pt of order 1 GeV relative to the B flight direction, while leptons in generic jets (from decays in flight of π's or K's) tend to be more closely aligned with the jet. Another characteristic property of B hadron decays is the relatively high rate of lepton production from semi-leptonic decays (about 35%); indeed, the presence of leptons is a good signature of the presence of B hadrons in a jet. A variety of reconstructed objects (tracks, vertices and identified leptons) are used to build observables, which are then combined into a single discriminating variable that separates b jets from light-flavored jets. The main observables used as input to the b-tagging algorithms are related to the B hadron lifetime and the presence of a secondary vertex.

The IP of a track with respect to the primary vertex can be used to distinguish decay products of a b hadron from prompt tracks. The IP is calculated in three dimensions, taking advantage of the excellent resolution of the pixel detector. For transverse momenta above a few GeV, where multiple scattering is less relevant, the resolution on the two-dimensional IP is independent of and approaches 30 µm (10).

The presence of a secondary decay vertex and kinematic variables associated with this vertex can be used to discriminate between b and non-b jets. These variables include the flight distance and direction, i.e. the vector between primary and secondary vertex, and various properties of the system of associated secondary tracks such as the multiplicity, the invariant mass or the energy.

The Combined Secondary Vertex (CSV) b-tagging algorithm (11) is used to identify b jets in the CMS searches reported here. This algorithm efficiently combines the information about track IP and secondary vertices, and it provides discrimination even when no secondary vertex is found.

The efficiency to tag b jets and the rate of misidentification of non-b jets depend on the requirement on the discriminant, and on the transverse momentum and pseudorapidity of the jets. Several working points for the CSV output discriminant are used in the analyses. The loose (CSVL), medium (CSVM), and tight (CSVT) operating points are defined as the CSV values such that the misidentification probability for light-parton jets is close to 10%, 1%, and 0.1%, respectively, at a jet \pt of about 50 GeV and . For the CSVT requirement the efficiencies to tag b quarks and c quarks are approximately 50% and 6%, respectively. The corresponding efficiencies for CSVM are approximately 65% and 15%, while for CSVL they are 81% and 32%.
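To make the trade-off between the operating points concrete, the following sketch tabulates the approximate per-jet efficiencies quoted above and evaluates the probability that both jets of a pair pass a given working point. The dictionary layout and the function name are illustrative and do not correspond to CMS software, and treating the two tagging decisions as independent is a simplifying assumption.

```python
# Approximate per-jet tagging efficiencies quoted in the text for each
# CSV operating point (b jets, c jets, light-parton jets).
# Layout and names are illustrative only, not CMS software.

WORKING_POINTS = {
    "CSVL": {"b": 0.81, "c": 0.32, "light": 0.10},
    "CSVM": {"b": 0.65, "c": 0.15, "light": 0.01},
    "CSVT": {"b": 0.50, "c": 0.06, "light": 0.001},
}


def double_tag_probability(wp, flavour_1, flavour_2):
    """Probability that both jets pass the working point, assuming the two
    tagging decisions are independent (a simplification)."""
    eff = WORKING_POINTS[wp]
    return eff[flavour_1] * eff[flavour_2]


for wp in WORKING_POINTS:
    p_bb = double_tag_probability(wp, "b", "b")
    p_ll = double_tag_probability(wp, "light", "light")
    print(f"{wp}: P(bb tagged) = {p_bb:.3f}, P(light-light tagged) = {p_ll:.1e}")
```

Tightening the requirement costs signal efficiency (from about 0.66 to 0.25 for a \bbbar pair in this toy estimate) but suppresses light-flavor pairs by several orders of magnitude.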

4 Specific Corrections for b-jet Transverse Momentum

The invariant mass resolution plays a fundamental role in improving the discrimination of the H → \bbbar signal against the non-peaking background and in achieving a better separation of the Higgs boson peak from the Z boson peak.
The jet energy calibration in CMS is performed as a function of the jet \pt and η, taking into account the pile-up activity of the event. The calibration is quite accurate; however, it is derived from a QCD dijet sample composed mainly of gluon-initiated jets and does not take into account the additional details of the jet reconstruction.
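The link between jet calibration and the reconstructed mass can be illustrated with a minimal sketch that builds the dijet invariant mass from two jets given as (\pt, η, φ) and shows how a common shift of the jet energy scale propagates to the mass. The helper names and the toy jet values are assumptions made for illustration only.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz), all in GeV."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet_mass(jet_1, jet_2):
    """Invariant mass of the two-jet system; jets are (pt, eta, phi) tuples."""
    e1, px1, py1, pz1 = four_vector(*jet_1)
    e2, px2, py2, pz2 = four_vector(*jet_2)
    m2 = (e1 + e2) ** 2 - (px1 + px2) ** 2 - (py1 + py2) ** 2 - (pz1 + pz2) ** 2
    return math.sqrt(max(m2, 0.0))


# Toy event: two back-to-back massless jets at eta = 0, so m = 2 * pt
jets = ((62.5, 0.0, 0.0), (62.5, 0.0, math.pi))
mass_nominal = dijet_mass(*jets)

# A common 10% shift of the jet pt scale shifts the mass by the same 10%
scaled = tuple((1.1 * pt, eta, phi) for pt, eta, phi in jets)
mass_scaled = dijet_mass(*scaled)
print(f"nominal {mass_nominal:.1f} GeV, scaled {mass_scaled:.1f} GeV")
```

Because the mass is linear in a common jet energy scale factor, a biased or poorly resolved b-jet response translates directly into a shifted or broadened mass peak.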

Typical values for the jet energy resolution in CMS are about 15% for \pt ≈ 30 GeV (12). The presence of neutrinos in more than 35% of the B hadron decay chains results in a lower response for b jets than for the light-quark and gluon jets used in the standard calibration.

In CMS, the H → \bbbar searches (both VH and VBF) have developed a specific strategy to correct the b-jet energy and thereby improve the invariant mass resolution of the reconstructed Higgs boson. By applying multivariate regression techniques similar to those used by the CDF experiment (13), the method attempts to recalibrate the jet energy to the true b-jet energy. Thanks to the particle-flow jet reconstruction (14), one can access various properties of each jet and adjust the calibration accordingly. The regression is essentially a multi-dimensional calibration to the particle level - including neutrinos - which exploits the main b-jet properties. A specialized BDT (15) is trained on simulated signal events and provides for each jet a correction factor that improves both the b-jet energy scale and its resolution.

Inputs are chosen among variables that are correlated with the b-quark energy and well measured. They include detailed jet structure information about tracks and jet constituents, which differs from that of light-flavor quark and gluon jets. Information from B-hadron decays, namely the reconstructed secondary vertices and the soft lepton (SL) from semi-leptonic decays, is used to provide an independent estimate of the b-quark \pt. The information carried by the variables related to the \MET vector is also exploited for the channels where no real missing transverse energy is expected (Z(ℓℓ)H, VBF), acting as a kinematic constraint for the momentum balance in the transverse plane.

The most discriminating variables across all modes are kinematic, and this is due to the fact that most of the power of the regression derives from the neutrinos involved in the semi-leptonic B decays.

Table 1 lists the variables used as input to the regression in the VH search. The regression performance depends in a non-negligible way on the phase space used to train the BDT, so separate trainings are performed for each channel, with very similar sets of inputs, to deal with the different kinematic properties.

Variable
jet \pt before calibration
jet \pt after the default calibration
jet \eta after the default calibration
jet transverse mass after the default calibration
uncertainty on the JEC
transverse momentum of the leading track in the jet
secondary vertex decay length
error on jet secondary vertex decay length
jet secondary vertex mass
jet secondary vertex \pt
total number of jet constituents
transverse momentum of the SL candidate relative to the jet \pt
transverse momentum of the SL candidate in the jet
distance in η-φ space of the SL candidate with respect to the jet axis
event total missing transverse energy (\MET)
azimuthal opening between the \MET and the jet directions
Table 1: Set of variables used as input to the regression for the VH channel.
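The idea behind the correction can be illustrated with a deliberately simple toy: instead of a BDT trained on the variables of Table 1, the sketch below derives one mean correction factor per category (jets with and without a soft lepton) and checks that it improves the relative resolution. All numerical choices (the 10-30% energy fraction lost to the neutrino, the 10% detector smearing, the \pt range) are invented for illustration, and only the 35% semi-leptonic fraction is taken from the text. This is a stand-in for the technique, not the regression used in the analysis.

```python
import random
import statistics

random.seed(12345)  # fixed seed so the toy is reproducible


def simulate_jet():
    """Return (true_pt, reco_pt, has_soft_lepton) for one toy b jet."""
    true_pt = random.uniform(30.0, 150.0)
    has_soft_lepton = random.random() < 0.35  # semi-leptonic fraction from the text
    # Toy assumption: the neutrino removes 10-30% of the jet pt in semi-leptonic decays
    lost_fraction = random.uniform(0.10, 0.30) if has_soft_lepton else 0.0
    smearing = random.gauss(1.0, 0.10)  # toy detector resolution
    reco_pt = true_pt * (1.0 - lost_fraction) * smearing
    return true_pt, reco_pt, has_soft_lepton


def train_correction(jets):
    """Mean of true/reco per category: the simplest possible 'regression'."""
    factors = {}
    for category in (False, True):
        ratios = [t / r for t, r, sl in jets if sl == category]
        factors[category] = statistics.fmean(ratios)
    return factors


def relative_resolution(jets, factors=None):
    """Spread of reco/true divided by its mean, optionally after correction."""
    responses = []
    for true_pt, reco_pt, sl in jets:
        scale = factors[sl] if factors else 1.0
        responses.append(scale * reco_pt / true_pt)
    return statistics.pstdev(responses) / statistics.fmean(responses)


training = [simulate_jet() for _ in range(20000)]
testing = [simulate_jet() for _ in range(20000)]
factors = train_correction(training)
raw = relative_resolution(testing)
corrected = relative_resolution(testing, factors)
print(f"relative resolution: raw {raw:.3f}, corrected {corrected:.3f}")
```

Even this one-variable correction narrows the response distribution, because it removes the offset between the semi-leptonic and the remaining jets; a multivariate regression exploits many more correlated inputs in the same spirit.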

The average improvement in the mass resolution when the corrected jet energies are used, measured on simulated signal samples, is 15–25%, resulting in an increase in the analysis sensitivity of 10–20%, depending on the specific phase space. This improvement is shown in Fig. 1 for simulated samples of  events. A better separation of the VZ and VH signals is also achieved by applying these corrections, as reported in Fig. 2 for ZH and ZZ simulated events.
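A rough scaling argument, given here only as an illustration under simplifying assumptions and not as the method used in the analysis, shows why a better mass resolution improves the sensitivity. For a counting experiment in a mass window whose width is proportional to the resolution $\sigma_m$, over a locally flat background, the signal yield $S$ stays fixed while the background yield $B$ scales with $\sigma_m$. Denoting by $r$ the fractional improvement in resolution (a symbol introduced here):

```latex
\frac{S}{\sqrt{B}} \propto \frac{1}{\sqrt{\sigma_m}}
\quad\Longrightarrow\quad
\frac{(S/\sqrt{B})_{\text{corrected}}}{(S/\sqrt{B})_{\text{default}}}
  = \frac{1}{\sqrt{1-r}},
\qquad
r = 0.15 \;\to\; 1.08, \qquad r = 0.25 \;\to\; 1.15 .
```

This simple estimate gives gains of roughly 8–15%, of the same order as the 10–20% obtained in the full analysis, where the mass also enters a multivariate discriminant.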

The validation of the regression technique in data has been performed by studying the \pt balance in Z(ℓℓ) events and the reconstructed top-quark mass distribution in \ttbar-enriched samples targeting the lepton+jets final state.

Figure 1: Dijet invariant mass distribution for simulated samples of  events before (red) and after (blue) the energy correction from the regression procedure is applied.
Figure 2: Mass difference between ZH and ZZ simulated processes before and after the regression is applied.

5 VH(\bbbar) Searches

The most sensitive channel for the search for the SM decay H → \bbbar at the LHC is associated production with a W or Z boson, when the Higgs boson recoils with large momentum transverse to the beam line ().

The CMS experiment reported an excess of events above the expected background, the first 2σ indication of the H → \bbbar decay at the LHC. The result combines the analysis of the data samples corresponding to integrated luminosities of up to 5 fb$^{-1}$ at $\sqrt{s}$ = 7 TeV (20) with the analysis of the full 8 TeV data sample corresponding to a luminosity of up to 20 fb$^{-1}$.

The search analyzes six channels separately, one for each V decay mode: W(μν)H, W(eν)H, W(τν)H, Z(μμ)H, Z(ee)H, and Z(νν)H, all with the Higgs boson decaying to \bbbar. For each channel, different \pt(V) boost regions are selected. Because of the different signal and background content, each \pt(V) region has a different sensitivity and the analysis is performed separately in each region, resulting in a total of 14 channels.

The presence of a vector boson decaying leptonically in the final state strongly suppresses the QCD background, while also providing an efficient trigger path. The requirement that the Higgs boson is produced with a large boost provides several additional advantages: it exploits the harder \pt spectrum of the signal, thus further reducing the large backgrounds from W and Z boson production in association with jets; it helps to reduce the large background from top-quark production in the signal channels including neutrinos; it makes the Z(νν)H channel accessible at trigger level via large \MET; and it generally improves the mass resolution of the reconstructed Higgs boson candidates.

The background processes to VH production with H → \bbbar are the production of vector bosons in association with one or more jets (V+jets), \ttbar production, single-top-quark production, diboson production (VV), and QCD multijet production. Except for dibosons, these processes have production cross sections that are several orders of magnitude larger than that of Higgs boson production.

Variable


100, 130, 130
40-250, 250
\MET see
CSV 0.50, 0.244
CSV
2, –, –
0.7, 0.7, 0.5
\MET significance 3, –, –
Table 2: Selection criteria that define the signal region. The values listed for kinematic variables are in units of GeV, and for angles in units of radians.

A key point of the analysis strategy is the use of the regression to improve the invariant mass resolution, as described in Sec. 4. The reconstructed dijet mass resolution achieved is about 10%.

Control regions in data are selected to adjust the event yields from simulation for the main background processes in order to estimate their contribution in the boosted phase space defined by the signal region.

The signal region is defined by the requirements listed in Tab. 2. Fig. 3 shows the dijet invariant mass distribution after the subtraction of the non-resonant backgrounds. The VH signal is visible with a yield compatible with the SM expectation and a significance of 1.1.

Figure 3: Dijet invariant mass distribution combined for all channels after the subtraction of the non-resonant backgrounds.

To better separate signal from background under different Higgs boson mass hypotheses, an event BDT discriminant is trained separately at each mass value using simulated samples for signal and all background processes. The training of this BDT is performed with all events in the signal region. Among the most discriminating variables for all channels are the dijet invariant mass, the number of additional jets, the value of CSV for the Higgs boson daughter with the second largest CSV value, and the angular distance between the Higgs boson daughters. By making use of correlations between discriminating variables in signal and background events, the BDT yields a new variable that discriminates the signal more effectively than the dijet invariant mass information alone.

Results are obtained from combined signal and background binned likelihood fits to the shape of the output distribution of the BDT discriminants trained separately for each channel. In total 14 BDT distributions are considered, one for each channel and each boost category. In the simultaneous fit to all channels, in all boost regions, the BDT shape and normalization for signal and for each background component are allowed to vary within the systematic and statistical uncertainties. Figure 4 combines the BDT outputs of all channels where the events are sorted in bins of similar expected signal-to-background ratio, as given by the value of the output of their corresponding BDT discriminant. The observed excess of events in the bins with the largest signal-to-background ratio is consistent with what is expected from the production of the SM Higgs boson.
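The principle of a binned likelihood fit for the signal strength can be sketched as follows. This toy keeps only the Poisson terms and scans a single parameter μ on a grid; the nuisance parameters that let shapes and normalizations vary within their uncertainties are omitted, and the templates are invented numbers. Function names are illustrative.

```python
import math


def negative_log_likelihood(mu, observed, signal, background):
    """Binned Poisson NLL (constant terms dropped) for expectation mu*s + b."""
    nll = 0.0
    for n, s, b in zip(observed, signal, background):
        expected = mu * s + b
        nll += expected - n * math.log(expected)
    return nll


def fit_signal_strength(observed, signal, background, lo=0.0, hi=5.0, steps=5000):
    """Scan mu on a uniform grid and return the value with the smallest NLL."""
    best_mu, best_nll = lo, float("inf")
    for i in range(steps + 1):
        mu = lo + (hi - lo) * i / steps
        nll = negative_log_likelihood(mu, observed, signal, background)
        if nll < best_nll:
            best_mu, best_nll = mu, nll
    return best_mu


# Toy templates (invented numbers), ordered from low to high BDT score
signal = [1.0, 2.0, 4.0, 6.0]
background = [400.0, 120.0, 40.0, 10.0]
# Pseudo-data built as background + 1 x signal, so the fit should return mu close to 1
observed = [b + s for s, b in zip(signal, background)]
mu_hat = fit_signal_strength(observed, signal, background)
print(f"fitted signal strength: {mu_hat:.2f}")
```

In this toy most of the constraining power comes from the last bins, where the signal-to-background ratio is largest, which is also why the events in Fig. 4 are ordered by that ratio.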

Figure 4: Events are sorted in bins of similar expected signal-to-background as given by the BDT discriminant output value.

The combined effect of the systematic uncertainties results in a reduction of 15% on the expected significance of an observation when the Higgs boson is present in the data at the predicted SM rate.

The analysis sets a 95% confidence level upper limit of 1.89 times the SM production cross section for a Higgs boson mass of 125 GeV. The expected limit in the absence of Higgs boson production is 0.95. For $m_H$ = 125 GeV, the excess of observed events corresponds to a local significance of 2.1σ away from the background-only hypothesis, see Fig. 5. This is consistent with the 2.1σ expected when assuming the SM prediction for Higgs boson production. The signal strength corresponding to this excess, relative to the expected cross section for the SM Higgs boson, is μ = 1.0 ± 0.5 (8).
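For reference, a local significance expressed in standard deviations can be converted to a p-value, and back, using the one-sided tail of a standard normal distribution. The one-sided convention is the usual one in Higgs boson searches, but it is stated here as an assumption; the function names are illustrative.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_value_from_significance(z):
    """One-sided tail probability of a standard normal beyond z sigma."""
    return 1.0 - _STANDARD_NORMAL.cdf(z)


def significance_from_p_value(p):
    """Inverse conversion: number of sigma for a one-sided p-value."""
    return _STANDARD_NORMAL.inv_cdf(1.0 - p)


p_local = p_value_from_significance(2.1)
print(f"p = {p_local:.4f}")  # prints "p = 0.0179"
```

A 2.1 standard deviation excess therefore corresponds to a background-only p-value of a little under 2%.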

Figure 5: Local p-values and corresponding significance (measured in σ) for the background-only hypothesis.

The measured coupling, which quantifies the ratio of the measured Higgs boson partial width into \bbbar relative to the SM value, is consistent with the expectations from the SM, within uncertainties.
This result is the first 2σ indication of the H → \bbbar decay at the LHC. The sensitivity of this search, expressed as the expected significance, is the highest thus far, as reported in Tab. 3.

Experiment   Signal strength   Expected significance
CDF          2.5 ± 1.0         1.3σ (17)
D0           1.2 ± 1.1         1.5σ (18)
D0+CDF       1.95 ± 0.75       1.9σ (16)
ATLAS        0.2 ± 0.9         1.6σ (19)
CMS          1.0 ± 0.5         2.1σ (8)
Table 3: Signal strength and expected significance for the VH searches at Tevatron and LHC colliders.

The combination of the CMS result with the evidence for the Higgs boson decay to τ leptons (21) results in strong evidence (with an observed significance of 3.8σ, when 4.4σ is expected) for the direct coupling of the 125 GeV Higgs boson to down-type fermions, adding valuable information to the understanding of the newly discovered boson.

ZH production, like WH production, is dominated by quark-initiated subprocesses (qq → ZH), but there is also a sizable gluon-initiated contribution, gg → ZH, whose contribution to the total cross section is not negligible, 8%. NNLO QCD corrections to qq → ZH are included in the VH result reported in (8) both as inclusive and as \pt-differential corrections, while NLO corrections to gg → ZH are applied as a factor that is flat in \pt (9). The \pt spectrum of the gluon-initiated contribution to associated production has recently become available (22). It differs from the dominant quark-initiated contribution and peaks at \pt(H) ≈ 150 GeV. In the estimation of Higgs boson signals in the boosted regime, the resulting \pt-differential correction due to the gg → ZH contribution changes the cross section by up to 30% in the highest \pt category. Folding in the corrected ZH \pt spectrum and combining with WH as is, the overall VH theory prediction scales up by 10%, which translates into a roughly 10% decrease (increase) in μ (sensitivity).
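The effect of a larger theory prediction on the signal strength follows directly from the definition of the signal strength as the ratio of the fitted signal yield to the SM expectation. The yield symbols below are introduced here for illustration; the fitted yield is fixed by the data and does not change:

```latex
\mu = \frac{N_{\text{sig}}^{\text{fit}}}{N_{\text{sig}}^{\text{SM}}},
\qquad
N_{\text{sig}}^{\text{SM}} \to 1.10\, N_{\text{sig}}^{\text{SM}}
\;\Longrightarrow\;
\mu \to \frac{\mu}{1.10} \approx 0.91\,\mu .
```

A 10% larger expected yield thus lowers the signal strength by about 9%, that is roughly 10%, while the expected significance grows because more signal events are predicted over the same background.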

5.1 Diboson Cross Section Measurement

The Z boson is the purest \bbbar resonance candle, which allows the b-tagging performance and the b-jet specific energy corrections to be tested. The production cross section for the VZ process, where Z → \bbbar, is only a few times larger than the VH production cross section. Given the nearly identical final state, this process provides a benchmark against which the Higgs boson search strategy can be tested. The Z peak, whose resolution is improved by using the aforementioned b-jet specific energy corrections, is also used to measure the consistency of the diboson rate with the expectation from the SM. This important background to VH production with H → \bbbar is measured in the phase space relevant for the Higgs boson search.

The study of VZ diboson production in proton-proton collisions is interesting in its own right because it provides an important test of the electroweak sector of the SM.

Using the BDT analysis, the VZ process is observed with a statistical significance of 6.3σ (5.9σ expected). This corresponds to a signal strength relative to the SM of . In Fig. 3 the VZ signal is clearly visible with a yield compatible with the SM expectation and a significance of 4.1. The cross sections extracted for the individual channels are compatible with each other and with the SM expectation.

The 6.3σ first observation of the VZ(\bbbar) SM candle at a hadron collider provides a very strong validation of the full analysis strategy designed to reconstruct the H → \bbbar decay mode in the VH search at CMS (23).

6 VBF Production Mechanism

In the search for H → \bbbar signals the CMS experiment has also exploited the VBF production mechanism (24), characterized by a very particular event topology. In the VBF process a valence quark from each of the colliding protons radiates a W or Z boson, and the two bosons subsequently interact or “fuse”. Each valence quark carries on average 1/6 of the proton energy, and in the radiation of the weak boson a t-channel four-momentum with Q is exchanged. In this way the two valence quarks are typically scattered away from the beam line and inside the detector acceptance.

The prominent signature of VBF is therefore the presence of two energetic hadronic jets, roughly in the forward and backward directions with respect to the proton beam line. As a result, in the case of VBF Higgs boson production, the signal final state features a central b-quark pair (from the Higgs boson decay) and a light-quark pair (u- or d-type), one quark from each of the colliding protons, in the forward and backward regions. Another important property of signal events is that, the Higgs boson being produced in a VBF process, no color is exchanged in the production. Thus, in the most probable color evolution of these events, the tagging VBF light-quark jets connect to the proton remnants in the beam line directions, while the two b quarks connect to each other, as the decay products of the color-neutral Higgs boson. In this way very little additional QCD radiation and hadronic activity is expected in the space outside the color-connected regions, in particular in the whole rapidity interval (rapidity gap) between the two tagging jets, with the exception of the Higgs boson decay products.
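The topology described above can be quantified with a few simple observables of the two tagging jets: their pseudorapidity separation, their invariant mass, and whether the b jets fall in the interval between them. The sketch below uses massless jets given as (\pt, η, φ) tuples; the event values and function names are invented for illustration, and no selection thresholds from the analysis are implied.

```python
import math


def delta_eta(jet_1, jet_2):
    """Absolute pseudorapidity separation; jets are (pt, eta, phi) tuples."""
    return abs(jet_1[1] - jet_2[1])


def massless_dijet_mass(jet_1, jet_2):
    """m^2 = 2 pt1 pt2 (cosh(d_eta) - cos(d_phi)) for massless jets."""
    pt1, eta1, phi1 = jet_1
    pt2, eta2, phi2 = jet_2
    return math.sqrt(2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2)))


def is_between(jet, tag_1, tag_2):
    """True if the jet lies in the pseudorapidity interval spanned by the tag jets."""
    lo, hi = sorted((tag_1[1], tag_2[1]))
    return lo < jet[1] < hi


# Toy VBF-like event (invented numbers): forward/backward tag jets, central b jets
tag_jets = ((80.0, 2.5, 0.3), (60.0, -2.2, 2.9))
b_jets = ((90.0, 0.4, -1.5), (70.0, -0.3, 1.2))

d_eta_tags = delta_eta(*tag_jets)
m_tags = massless_dijet_mass(*tag_jets)
b_jets_central = all(is_between(jet, *tag_jets) for jet in b_jets)
print(f"delta eta = {d_eta_tags:.1f}, m_jj = {m_tags:.0f} GeV, b jets central: {b_jets_central}")
```

A large pseudorapidity separation drives the tag-jet invariant mass to several hundred GeV even for moderate jet \pt, which is what distinguishes this topology from typical QCD multijet events.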

Although the predicted production cross section is larger than that of VH, this search is less sensitive because the all-hadronic final state is challenging to trigger on and the QCD multijet background dominates the selected data sample. Other relevant backgrounds arise from hadronic decays of Z or W bosons in association with additional jets, hadronic or semi-leptonic decays of top-quark pairs, and hadronic decays of single top quarks.

Upper limits on the production cross section times the H → \bbbar branching ratio, with respect to the expectations for a SM Higgs boson, are derived for a Higgs boson in the mass range 115–135 GeV and shown in Fig. 6. In this range, the expected 95% confidence level upper limits in the absence of a signal vary from 2.4 to 4.1 times the SM prediction, while the corresponding observed upper limits vary from 2.4 to 5.2. At a Higgs boson mass of 125 GeV the expected limit is 3.0 and the observed limit is 3.6. For a 125 GeV Higgs boson signal the observed (expected) significance is 0.5 (0.7), and the fitted signal strength is . Because of the small signal-to-background ratio, the results are dominated by the statistical uncertainty in the background.

Figure 6: Expected and observed 95% confidence level limits on the signal cross section in units of the SM expected cross section.

7 ttH Production

The ttH production paired with the H → \bbbar decay mode is an interesting process, whose rate depends on the largest of the fermionic couplings to the Higgs boson, those of the top and bottom quarks, which are two key couplings to probe the consistency of the Higgs boson with SM expectations. Although ttH production contributes little to the total Higgs boson production cross section (0.6%), this signature provides a probe that is complementary to the VH channel: both provide information about the interaction between the bottom quark and the Higgs boson, but the dominant backgrounds are very different, \ttbar+jets production instead of W+jets production.

The ttH vertex is the most challenging one to probe directly, but it represents the only way to probe the coupling of the Higgs boson to the top quark in a model-independent manner. Since the expected SM rates in this channel are very small, a sizable excess would be clear evidence for new physics (25).

The final state has a large multijet background that is suppressed to a tiny level by requiring at least one charged lepton (electron or muon) in the event. The resulting signature is still dwarfed by the QCD production of a \ttbar pair plus additional jets. Thanks to the good experimental efficiency to tag jets that originate from b quarks while rejecting light-flavored jets, the background is greatly reduced by requiring at least two jets in the final state to be b-tagged. Still, the \ttbar+\bbbar background subprocess remains irreducible since it has the same final state as the signal. It poses experimental challenges since its cross section is much larger than that of the signal, and its rate and shape also have large theoretical uncertainties (26). Recently, CMS has reported new results (27) on this search, which exploits the Matrix Element Method (28) to simultaneously mitigate the combinatorial background and maximize the separation between the signal and the irreducible background. This analysis reported as observed limit , corresponding to a best-fit value of , Fig. 7.

Figure 7: The best-fit value of the signal strength modifier μ, broken up by category (27).

8 Summary

Searches for the standard model Higgs boson decaying to \bbbar, exploiting the VH, VBF and ttH production modes, are reported. Data collected with the CMS experiment corresponding to integrated luminosities of up to 5 fb$^{-1}$ at 7 TeV and up to 20 fb$^{-1}$ at 8 TeV are analysed. The VH channel, the most sensitive search, reports an excess of events above the expected background with a local significance of 2.1 standard deviations, compatible with a Higgs boson mass of 125 GeV. The signal strength corresponding to this excess is μ = 1.0 ± 0.5. The fitted signal strengths for the VBF and ttH searches are respectively and .

References

  1. LHC Higgs Cross Section Working Group CERN-2011-002
  2. CMS Collaboration, Phys. Rev. D 89 (2014) 092007
  3. CMS Collaboration, CMS-HIG-13-001 (2014)
  4. CMS Collaboration, JHEP 1401 (2014) 096
  5. CMS Collaboration, Nature Phys. 10 (2014) 3005
  6. ATLAS Collaboration, (2014) ATLAS-CONF-2014-009
  7. CMS Collaboration, (2008) JINST 3 S08004
  8. CMS Collaboration, Phys.Rev. D89 (2014) 012003
  9. L. Altenkamp and others JHEP 1302 (2013) 078
  10. CMS Collaboration, (2014) arXiv:1405.6569
  11. CMS Collaboration, JINST 8 (2013) P04013
  12. CMS Collaboration, (2010) CMS-PAS-JME-10-014
  13. Aaltonen, T. and others (2011), arXiv:1107.3026
  14. CMS Collaboration, (2009) CMS-PAS-PFT-09-001
  15. Hoecker, A. and others PoS ACAT (2007) 040
  16. CDF and D0 Collaborations, Phys. Rev. Lett. 109 (2012) 071804
  17. CDF Collaboration, Phys. Rev. Lett. 109 (2012) 111802
  18. D0 Collaboration, Phys. Rev. Lett. 109 (2012) 121802
  19. ATLAS Collaboration, (2013) ATLAS-CONF-2013-079
  20. CMS Collaboration, JHEP 06 (2013) 081, arXiv:1303.4571
  21. CMS Collaboration, JHEP 1405 (2014) 104, arXiv:1401.5041
  22. Englert, Christoph and others, Phys. Rev. D89 (2014) 013013
  23. CMS Collaboration, Eur.Phys.J.C 74 (2014) 2973
  24. CMS Collaboration, CMS-PAS-HIG-13-011 (2013)
  25. Degrande, C. and others JHEP 1207 (2012) 036
  26. A. Bredenstein and others, Phys. Rev. Lett. 103 (2009) 012002
  27. CMS Collaboration, (2014) CMS-PAS-HIG-14-010
  28. D0 Collaboration, Nature Phys. (2004) 429