1 Introduction

The study of WZ and ZZ (referred to collectively as VZ) diboson production in proton-proton collisions provides an important test of the gauge sector of the standard model (SM). In \(\mathrm {p}\mathrm {p}\) collisions at \(\sqrt{s} = 8\,\text {TeV} \), the predicted cross sections are \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z})= 22.3 \pm 1.1\,\text {pb} \) and \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z})= 7.7 \pm 0.4\,\text {pb} \) at next-to-leading order (NLO) in quantum chromodynamics (QCD) [1]. A significant deviation from these theoretical values would indicate contributions from physics beyond the SM. Both processes constitute important backgrounds to the associated production of V and standard model Higgs (H) bosons, especially in those channels involving \({\hbox {H}} \rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) decays. The production rate of two vector bosons in \(\mathrm {p}\mathrm {p}\) collisions at the Large Hadron Collider (LHC) has been measured by the ATLAS and Compact Muon Solenoid (CMS) Collaborations in all-leptonic WZ and ZZ decay modes [2–5].

We present a measurement of the VZ production cross sections in the \(\mathrm {V} \mathrm {Z}\rightarrow \mathrm {V} {\mathrm {b}}\overline{{\mathrm {b}}} \) decay mode, where the V decays leptonically: \(\mathrm {Z}\rightarrow {\nu }\overline{\nu }\), \(\mathrm {W}^{\pm }\rightarrow \ell ^{\pm }{\nu }\), and \(\mathrm {Z}\rightarrow \ell ^{+}\ell ^{-}\), with \(\ell \) corresponding to either \(\mathrm {e}\) or \(\mu \). Contributions from \(\mathrm {W}\rightarrow \mathrm {\tau }{\nu }\) with leptonic \(\tau \) decays are included in the \(\mathrm {W}^{\pm }\rightarrow \ell ^{\pm }{\nu }\) channels. The analysis uses final states with no charged leptons (0-lepton), one charged lepton (1-lepton), or two charged leptons (2-lepton), with the electron and muon channels analyzed separately. The Z boson decays to \({\mathrm {b}}\) quarks are selected by requiring the presence of two b-tagged jets. The results are based on data corresponding to an integrated luminosity of 18.9 fb\(^{-1}\) collected with the CMS detector at the LHC. Two methods are used in the analysis: one involves a fit to the output of a multivariate discriminant, and the other a fit to the two-jet mass (\(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\)) distribution. The cross sections are calculated simultaneously for WZ and ZZ production at transverse momenta of the accompanying V of \(p_{\mathrm {T}} ^{\mathrm {V}}> 100\,\text {GeV} \), for Z boson masses falling within the window \(60<M_{\mathrm {Z}}<120\,\text {GeV} \). The latter requirement ensures a uniform treatment of interference with background processes. Approximately 15 % of the WZ and 14 % of the ZZ total inclusive cross sections are contained within their respective regions of acceptance for \(p_{\mathrm {T}} ^{\mathrm {V}}> 100\,\text {GeV} \), as calculated using several event generators discussed in the following section. The 1-lepton channel is sensitive almost exclusively to WZ production, while the 2-lepton modes are restricted to the ZZ process. The channel with no charged leptons is sensitive to both production modes, with ZZ and WZ channels contributing 70 % and 30 %, respectively, to these events. The 0-lepton WZ events contribute primarily when the lepton from \(\mathrm {W}^{\pm }\rightarrow \ell ^{\pm }{\nu }\) falls outside of acceptance.

2 CMS detector, triggering, object reconstruction and event simulation

A description of the CMS detector can be found in Ref. [6]. Particles produced in \(\mathrm {p}\mathrm {p}\) collisions are detected in the pseudorapidity range \(|\eta |< 5\), where \(\eta = -\ln [\tan (\theta /2)]\), and \(\theta \) is the polar angle relative to the direction of the counterclockwise circulating proton beam. The CMS detector comprises a superconducting solenoid, providing a uniform axial magnetic field of 3.8\(\,\text {T}\) over a cylindrical region that is 12.5\(\,\text {m}\) long and 6\(\,\text {m}\) in diameter. The magnetic volume contains a silicon pixel and strip tracking system (\(|\eta | < 2.5\)), surrounded by a lead tungstate crystal electromagnetic calorimeter (ECAL) and a brass/scintillator hadronic calorimeter (HCAL) at \(|\eta | < 3.0\). A steel/quartz-fiber Cherenkov calorimeter extends the coverage to \( |\eta | = 5\). The steel flux-return yoke outside the solenoid is instrumented with gas-ionization detectors used to identify muons at \( |\eta | < 2.4\).
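As a small illustration of the pseudorapidity definition above, the following Python sketch converts between the polar angle \(\theta \) and \(\eta \). The function names are illustrative choices made here and are not taken from any CMS software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln[tan(theta/2)] for a polar angle theta in radians (0 < theta < pi)."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta: float) -> float:
    """Invert the definition above: theta = 2 * atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


# Polar angles corresponding to the |eta| coverage limits quoted in the text.
for eta_edge in (2.4, 2.5, 3.0, 5.0):
    theta_deg = math.degrees(polar_angle(eta_edge))
    print(f"|eta| = {eta_edge}: theta = {theta_deg:.2f} deg from the beam axis")
```

A particle emitted perpendicular to the beam has \(\eta = 0\), and larger \(|\eta |\) corresponds to directions closer to the beam line.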

The 1-lepton channels rely on several single-lepton triggers with \(p_{\mathrm {T}}\) thresholds between 17 and \(30\,\text {GeV} \) and restrictive lepton identification. The 2-lepton channels use the same single-muon triggers for selecting the \(\mathrm {Z}\rightarrow \mathrm {\mu ^+}\mathrm {\mu ^-} \) events and 2-electron triggers with \(p_{\mathrm {T}}\) thresholds of 17 and 8\(\,\text {GeV}\) for the electron of higher and lower \(p_{\mathrm {T}}\), respectively, and with more restrictive isolation requirements for selecting the \(\mathrm {Z}\rightarrow \mathrm {e}^+\mathrm {e}^- \) events.

A combination of several triggers is used for the events without charged leptons: all triggers require \(E_{\mathrm {T}}^{\text {miss}}\) to be above a given threshold, such that the trigger efficiency ranges from 70 to 99 % for \(E_{\mathrm {T}}^{\text {miss}} =100\,\text {GeV} \) to \(170\,\text {GeV} \), respectively.

Electron reconstruction requires a match of a cluster in the ECAL to a track reconstructed in the silicon tracker [7–9]. Electron identification relies on a multivariate technique that combines observables sensitive to the amount of bremsstrahlung emitted along the electron trajectory, the match in position and energy of the electron trajectory with the associated cluster, as well as the energy distribution in the cluster. Additional requirements are imposed to minimize background from electrons produced through photons converting into \(\mathrm {e}^+\mathrm {e}^-\) pairs while traversing the tracker material. Electron candidates are considered if observed in the pseudorapidity range \(|\eta | < 2.5\) but excluding the transition regions at \(1.44 < |\eta |< 1.57\) between the ECAL barrel and endcaps.

Muons are reconstructed using two algorithms [10]: one in which tracks in the silicon tracker are matched to signals in the muon chambers, and another in which a global fit is performed to the track that is seeded by signals detected in the outer muon system. The muon candidates are required to be reconstructed by both algorithms. Additional identification criteria are imposed on muon candidates to reduce the fraction of tracks misidentified as muons. These include the number of hits reconstructed in the tracker and in the muon system, the quality of the global fit to a muon trajectory, and its consistency with originating from the primary vertex. Muon candidates are finally required to fall in the \(|\eta | < 2.4\) range.

Jets are reconstructed from particle-flow [11, 12] objects using the anti-\(k_{\mathrm {T}}\) jet clustering algorithm [13], with a distance parameter of 0.5, as implemented in the fastjet package [14, 15]. Each jet is required to lie within \(|\eta | < 2.5\) and have \(p_{\mathrm {T}} > 20\,\text {GeV} \). Jet energy corrections are applied as a function of \(\eta \) and \(p_{\mathrm {T}} \) of the jet [16]. The imbalance in transverse momentum (often referred to as “missing transverse energy vector”) is calculated as the negative of the vectorial sum of the \({\varvec{p}}_{\mathrm {T}} \) of all particle-flow objects identified in the event, and the magnitude of this vector is referred to as \(E_{\mathrm {T}}^{\text {miss}}\). The procedures of Ref. [17] are applied on an event-by-event basis to mitigate the effects of multiple interactions per beam crossing (pileup).
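The definition of \(E_{\mathrm {T}}^{\text {miss}}\) as the magnitude of the negative vectorial sum of transverse momenta can be sketched as follows. This is a minimal illustration only: the `(pt, phi)` input format and the toy event are assumptions made here, and no pileup mitigation or energy corrections are applied.

```python
import math
from typing import Iterable, Tuple


def missing_transverse_momentum(objects: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (magnitude, azimuth) of the negative vector sum of the objects' pT.

    Each object is given as (pt, phi), with pt in GeV and phi in radians.
    """
    px = 0.0
    py = 0.0
    for pt, phi in objects:
        px -= pt * math.cos(phi)
        py -= pt * math.sin(phi)
    return math.hypot(px, py), math.atan2(py, px)


# Hypothetical toy event: two back-to-back objects with unequal pT.
met, met_phi = missing_transverse_momentum([(120.0, 0.0), (70.0, math.pi)])
print(f"ETmiss = {met:.1f} GeV")
```

In the toy event the imbalance points opposite to the harder object, as expected for momentum carried away by undetected particles.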

The CMS combined secondary-vertex (CSV) b-tagging algorithm [18] is used to identify jets that are likely to originate from the hadronization of b quarks. This algorithm combines the information about track impact parameters and secondary vertices in a discriminant that distinguishes \({\mathrm {b}}\) jets from jets originating from light quarks, gluons, or \(\mathrm {c}\) quarks. The output of the CSV algorithm is a continuous discriminator with a value in the range 0 to 1, where typical thresholds for \({\mathrm {b}}\) jet selection range from loose (\(\approx \)0.2) to tight (\(\approx \)0.9). Depending on the chosen CSV threshold, the efficiencies for tagging jets originating from \({\mathrm {b}}\) quarks range from 50 % (tight) to 75 % (loose), while the misidentification rates for \(\mathrm {c}\) quarks range from 5 % (tight) to 25 % (loose) and for light quarks or gluons range from 0.2 % (tight) to 3 % (loose).

The b-jet energy resolution is improved by applying multivariate regression techniques similar to those used in the CDF experiment [19]. An additional correction, beyond the standard CMS jet energy corrections, is derived from simulated events to recalibrate each b-tagged jet with the generated \({\mathrm {b}}\) quark energy. This involves a specialized boosted decision tree (BDT) [20, 21] trained on simulated signal events, with inputs that include information on jet structure, such as information about individual tracks, jet constituents, information on semileptonic b-hadron decays, and the presence of any low-\(p_{\mathrm {T}}\) leptons. The BDT correction, identical to that used in Ref. [17], improves the resolution on the mass of the \({\mathrm {b}}\overline{{\mathrm {b}}}\) system by \({\approx }\)15 %, resulting in an increase in the sensitivity of the analysis of 10–20 %, depending on the specific channel. The \(\mathrm {Z}\rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) invariant mass resolution after this correction is \({\approx }\)10 %.

Simulated samples of events are produced using several event generators, and the response of the CMS detector is modeled using the Geant4 program [22]. The MadGraph 5.1 [23] generator is used to generate the diboson signals, as well as the background from W+jets, Z+jets, and \(\mathrm {t}\overline{\mathrm {t}}\) events. The single-top-quark samples are generated with powheg [24–27], and generic multijet samples using pythia 6.4 [28]. VH event samples with a SM H boson mass of \(m_{{\hbox {H}}}= 125\,\text {GeV} \) are also produced using the powheg [29] event generator interfaced to herwig++ [30] for parton showering and hadronization. The NLO MSTW2008 set [31] of parton distribution functions (PDF) is used to produce the NLO powheg samples, while the leading-order (LO) CTEQ6L1 set [32] is used for the events that correspond to LO calculations. The Z2Star tune [33] is used to parametrize the underlying event. Corrections to account for differences in efficiencies between data and simulation are measured in data using a tag-and-probe technique [34], and applied as individual weights to each of the simulated events.

3 Event selection

We use the analysis techniques developed in the CMS VH studies of Ref. [17]. Event selection is based on the reconstruction of a vector boson that decays leptonically in association with the Z boson that decays into two b-tagged jets. Dominant backgrounds to VZ production include V+\({\mathrm {b}}\) jets, V+light flavor (LF = \(\mathrm {u}\), \(\mathrm {d}\), \(\mathrm {s}\), \(\mathrm {c}\) quark or gluon) jets, \(\mathrm {t}\overline{\mathrm {t}}\), single-top-quark, generic multijet, and H boson production. In general, b-tagging reduces the contributions from LF events, and counting additional jet activity is used to reduce background from \(\mathrm {t}\overline{\mathrm {t}}\) and single-top-quark events. Finally, the value of \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) provides a way to distinguish VZ from V+\({\mathrm {b}}\) and SM VH production, as discussed below.

The reconstruction of a \(\mathrm {Z}\rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) decay proceeds by selecting two central jets from the primary vertex with \(|\eta |<2.5\), each with a \(p_{\mathrm {T}}\) above some chosen threshold, and defining the \({\mathrm {b}}\overline{{\mathrm {b}}}\) candidate as the jet pair with largest vectorial sum of transverse momenta (\({p_{\mathrm {T}}}^{{\mathrm {b}}\overline{{\mathrm {b}}}}\)). This combination is very efficient for \(p_{\mathrm {T}} ^{\mathrm {V}}>100\,\text {GeV} \) without biasing the differential distribution of the background, and also defines the two-jet mass \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\), which is required to be \(<250\,\text {GeV} \). The two selected jets are also required to be tagged as \({\mathrm {b}}\) jets, with a value of the CSV discriminator that depends on the specific nature of the event.
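The pairing rule described above (choose the central jet pair with the largest vectorial sum of transverse momenta, then compute its invariant mass) can be sketched in Python as follows. The `Jet` container, the function names, and the default `pt_min` of 30 GeV are assumptions made for illustration; the actual thresholds depend on the channel, and b-tagging is not modeled here.

```python
import math
from itertools import combinations
from typing import NamedTuple, Optional, Sequence, Tuple


class Jet(NamedTuple):
    pt: float           # transverse momentum in GeV
    eta: float          # pseudorapidity
    phi: float          # azimuthal angle in radians
    mass: float = 0.0   # jet mass in GeV (massless by default)


def _p4(jet: Jet) -> Tuple[float, float, float, float]:
    """Return the four-momentum (E, px, py, pz) of a jet."""
    px = jet.pt * math.cos(jet.phi)
    py = jet.pt * math.sin(jet.phi)
    pz = jet.pt * math.sinh(jet.eta)
    energy = math.sqrt(px * px + py * py + pz * pz + jet.mass ** 2)
    return energy, px, py, pz


def pair_pt(a: Jet, b: Jet) -> float:
    """Magnitude of the vectorial sum of the two jets' transverse momenta."""
    _, ax, ay, _ = _p4(a)
    _, bx, by, _ = _p4(b)
    return math.hypot(ax + bx, ay + by)


def pair_mass(a: Jet, b: Jet) -> float:
    """Invariant mass of the two-jet system."""
    e, px, py, pz = (x + y for x, y in zip(_p4(a), _p4(b)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def select_bb_candidate(
    jets: Sequence[Jet],
    pt_min: float = 30.0,    # placeholder threshold (assumption)
    eta_max: float = 2.5,
    mass_max: float = 250.0,
) -> Optional[Tuple[Jet, Jet]]:
    """Pick the central jet pair with the largest pair pT, or None if no pair passes."""
    central = [j for j in jets if j.pt > pt_min and abs(j.eta) < eta_max]
    if len(central) < 2:
        return None
    best = max(combinations(central, 2), key=lambda pair: pair_pt(*pair))
    if pair_mass(*best) >= mass_max:
        return None
    return best
```

Selecting on the pair \(p_{\mathrm {T}}\) rather than on the mass avoids sculpting the \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) distribution, which is the point made in the text about not biasing the background shape.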

Candidate \(\mathrm {W}^{\pm }\rightarrow \ell ^{\pm }{\nu }\) decays in WZ events are identified through the presence of a single isolated lepton and significant \(E_{\mathrm {T}}^{\text {miss}}\). Electrons and muons are required to have \(p_{\mathrm {T}} >30\,\text {GeV} \) and \(p_{\mathrm {T}} >20\,\text {GeV} \), respectively. To reduce contamination from generic multijet processes, the \(E_{\mathrm {T}}^{\text {miss}}\) is required to be \(>45\,\text {GeV} \). In addition, the azimuthal angle (\(\phi \)) between the \(E_{\mathrm {T}}^{\text {miss}}\) vector and the lepton is required to be \(<\pi /2\). At least two jets with \(p_{\mathrm {T}} > 30\,\text {GeV} \) and a moderate CSV discriminator value are required to define the \(\mathrm {Z}\rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) candidate.

Candidate \(\mathrm {Z}\rightarrow \ell ^{+}\ell ^{-}\) decays in ZZ events are reconstructed by combining isolated, oppositely charged pairs of electrons or muons, with a dilepton invariant mass of \(75<m_{\ell \ell }<105\,\text {GeV} \). The \(p_{\mathrm {T}}\) of each lepton is required to be \(>20\,\text {GeV} \). The two jets of the \(\mathrm {Z}\rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) candidate must pass a loose CSV discriminator value, which is optimized in simulated events for increasing the sensitivity of the analysis.

The identification of \(\mathrm {Z}\rightarrow {\nu }\overline{\nu }\) decays in ZZ events requires \(E_{\mathrm {T}}^{\text {miss}} >100\,\text {GeV} \) in the event, with one of the \({\mathrm {b}}\) jets having \(p_{\mathrm {T}} > 60\,\text {GeV} \) and the other \(p_{\mathrm {T}} > 30\,\text {GeV} \) to form a \(\mathrm {Z}\rightarrow {\mathrm {b}}\overline{{\mathrm {b}}} \) candidate. Moderate CSV requirements are applied on both jets. Two additional event requirements are imposed to reduce the multijet background in which \(E_{\mathrm {T}}^{\text {miss}}\) originates from mismeasured jet energies. First, a \(\Delta \phi (E_{\mathrm {T}}^{\text {miss}},\text {jet})>0.5\) radians requirement is applied on the azimuthal angle between the direction of \(E_{\mathrm {T}}^{\text {miss}}\) and the \(p_{\mathrm {T}}\) of the jet closest in \(\phi \) that satisfies \(|\eta |<2.5\) and \(p_{\mathrm {T}} >25\,\text {GeV} \). The second requirement is that the azimuthal angle between the direction of \({E_{\mathrm {T}}^{\text {miss}}}^\text {(trks)}\), as calculated from only the charged tracks that satisfy \(p_{\mathrm {T}} >0.5\,\text {GeV} \) and \(| \eta |<2.5\), and the direction of the full \(E_{\mathrm {T}}^{\text {miss}}\) has \(\Delta \phi (E_{\mathrm {T}}^{\text {miss}},{E_{\mathrm {T}}^{\text {miss}}}^\text {(trks)})<0.5\) radians. Finally, to reduce background from \(\mathrm {t}\overline{\mathrm {t}}\) events in the 1-lepton and 0-lepton channels, events that contain any additional isolated leptons with \(p_{\mathrm {T}} >20\,\text {GeV} \) are rejected.
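The two azimuthal requirements used to suppress multijet background in the 0-lepton channel can be sketched as below. The function names and the `(pt, eta, phi)` jet format are assumptions made for illustration; the numerical thresholds are those quoted in the text.

```python
import math
from typing import Sequence, Tuple


def delta_phi(phi1: float, phi2: float) -> float:
    """Absolute azimuthal separation, wrapped into the range [0, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return abs(dphi)


def passes_multijet_rejection(
    met_phi: float,
    track_met_phi: float,
    jets: Sequence[Tuple[float, float, float]],  # (pt, eta, phi) per jet
    min_dphi_jet: float = 0.5,
    max_dphi_tracks: float = 0.5,
) -> bool:
    """Apply the two delta-phi requirements described in the text."""
    # Only jets with pT > 25 GeV and |eta| < 2.5 are considered.
    jet_phis = [phi for pt, eta, phi in jets if pt > 25.0 and abs(eta) < 2.5]
    # Requirement 1: the jet closest in phi must be well separated from ETmiss.
    if jet_phis and min(delta_phi(met_phi, phi) for phi in jet_phis) <= min_dphi_jet:
        return False
    # Requirement 2: track-based and full ETmiss must point in similar directions.
    return delta_phi(met_phi, track_met_phi) < max_dphi_tracks
```

The wrap-around in `delta_phi` matters because \(\phi \) is periodic: directions at \(\phi = 3.0\) and \(\phi = -3.0\) are separated by about 0.28 radians, not 6.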

3.1 Multivariate analysis

The signal region is defined by events that satisfy the V and Z boson reconstruction criteria described above. To optimize the significance of the signal as well as the \({\mathrm {b}}\overline{{\mathrm {b}}}\) mass resolution, events are classified into different regions of the V boson transverse momentum. In particular, we define three regions for the 1-lepton channels: (i) \(100<p_{\mathrm {T}} ^{\mathrm {V}}<130\,\text {GeV} \), (ii)  \(130<p_{\mathrm {T}} ^{\mathrm {V}}<180\,\text {GeV} \), and (iii)  \(p_{\mathrm {T}} ^{\mathrm {V}}>180\,\text {GeV} \). A single inclusive region of \(p_{\mathrm {T}} ^{\mathrm {V}}>100\,\text {GeV} \) is defined for the 2-lepton channels. Three regions for the channel without charged leptons are defined by (i) \(100<p_{\mathrm {T}} ^{\mathrm {V}}<130\,\text {GeV} \), (ii) \(130<p_{\mathrm {T}} ^{\mathrm {V}}<170\,\text {GeV} \), and (iii) \(p_{\mathrm {T}} ^{\mathrm {V}}>170\,\text {GeV} \). For regions (i) and (ii), the requirement on \(\Delta \phi (E_{\mathrm {T}}^{\text {miss}},\text {jet})\) is tightened to \(\Delta \phi (E_{\mathrm {T}}^{\text {miss}},\text {jet})>0.7\) radians. To reduce background in the region of smallest \(p_{\mathrm {T}} ^{\mathrm {V}}\), the \(E_{\mathrm {T}}^{\text {miss}}\) significance (defined as the ratio of \(E_{\mathrm {T}}^{\text {miss}}\) to the square root of the total transverse energy deposited in the calorimeter) is required to be \({>}3 \sqrt{\text {GeV}}\).
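The \(p_{\mathrm {T}} ^{\mathrm {V}}\) categorization for the multivariate analysis can be written compactly as a lookup of bin edges. This is a sketch under stated assumptions: the dictionary keys and function names are chosen here for illustration, and the treatment of events exactly on a bin edge (assigned to the higher region) is an assumption, since the text does not specify it.

```python
import math
from bisect import bisect_right
from typing import Optional

# Lower edges (GeV) of the pT(V) regions quoted in the text, per channel.
PTV_LOWER_EDGES = {
    "0-lepton": [100.0, 130.0, 170.0],
    "1-lepton": [100.0, 130.0, 180.0],
    "2-lepton": [100.0],
}


def ptv_region(channel: str, ptv: float) -> Optional[int]:
    """Return the 1-based region index (1 = region (i)), or None below 100 GeV."""
    index = bisect_right(PTV_LOWER_EDGES[channel], ptv)
    return index if index > 0 else None


def met_significance(met: float, sum_et: float) -> float:
    """ETmiss divided by the square root of the total transverse energy, in sqrt(GeV)."""
    return met / math.sqrt(sum_et)


print(ptv_region("1-lepton", 175.0), ptv_region("0-lepton", 175.0))
```

The same \(p_{\mathrm {T}} ^{\mathrm {V}}\) value can fall into different regions depending on the channel, because the upper boundary of region (ii) differs between the 0-lepton and 1-lepton channels.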

To better discriminate between signals and background, the final stage of the analysis introduces a BDT discriminant trained on simulated samples for signal and all background processes. The set of input variables is identical to the one used in Ref. [17], and includes the mass of the \({\mathrm {b}}\overline{{\mathrm {b}}}\) system, the number of additional jets beyond the \({\mathrm {b}}\) and \(\overline{{\mathrm {b}}}\) candidates (\(N_{\mathrm {aj}}\)), the value of CSV for the \({\mathrm {b}}\overline{{\mathrm {b}}}\) jets with \(\mathrm {CSV}_{\text {min}}\) specifying the smaller value and \(\mathrm {CSV}_{\text {max}}\) the larger one, and the distance in \(\eta \)-\(\phi \) between the \({\mathrm {b}}\) and \(\overline{{\mathrm {b}}}\) jet axes, \(\Delta R({\mathrm {b}}\overline{{\mathrm {b}}})= \sqrt{{\left( \Delta \phi \right) ^2+\left( \Delta \eta \right) ^2}}\).
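For concreteness, the angular distance \(\Delta R({\mathrm {b}}\overline{{\mathrm {b}}})\) defined above can be computed as in this short sketch. The function name and argument order are illustrative assumptions; the only subtlety is that \(\Delta \phi \) must be wrapped into \([-\pi , \pi )\) before it is squared.

```python
import math


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Distance in the eta-phi plane: sqrt(delta_phi^2 + delta_eta^2)."""
    # Wrap the azimuthal difference into [-pi, pi) to respect the periodicity of phi.
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


# Hypothetical jet axes, for illustration only.
print(f"{delta_r(0.3, 0.0, -0.1, 0.3):.2f}")
```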

Figure 1(a) displays the combined differential distribution for events from all channels as a function of the logarithm of the signal-to-background (S/B) ratio of the values of the output of the corresponding S and B contributions to the BDT discriminants of each event. Panel (b) gives the ratio of the data (black points) to the SM expectation (histogram) relative to the background-only hypothesis, while panel (c) gives the ratio to the expectation from the SM, including the VZ contribution. The excess observed in bins with largest S/B is clearly consistent with what is expected for VZ production in the SM.

Fig. 1

(a) Combined distribution for all channels in the value of the logarithm of the ratio of signal to background (S/B) discriminants in data and in Monte Carlo (MC) simulations, based on the outputs of the S and B BDT discriminants for each event. The two bottom panels display (b) the ratio of the data and of the SM expectation relative to the background-only hypothesis, and (c) data relative to the expected sum of background and VZ signal. The error bars and the cross-hatched regions reflect total uncertainties at 68 % confidence level

3.2 Two-jet mass analysis

As a cross-check of the multivariate analysis, we perform a simpler analysis based on the \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) distribution of the reconstructed \({\mathrm {b}}\overline{{\mathrm {b}}}\) jets of the hypothesized Z boson. The signal region is defined by events that satisfy the V and Z boson reconstruction criteria used in the multivariate analysis. Events are again classified according to \(p_{\mathrm {T}} ^{\mathrm {V}}\), and, in addition, more restrictive selections are introduced than in the multivariate analysis, because the single variable \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) is not a sufficiently sensitive discriminant.

In the 0-lepton and 1-lepton channels, the b-tagging requirements are tightened, respectively, to a tight \(\mathrm {CSV}_{\text {max}}\) and a medium \(\mathrm {CSV}_{\text {min}}\). A veto is also imposed on any additional jets, and \(\Delta \phi (\mathrm {V},\mathrm {Z})\) is required to be \({>}2.95\) radians. The regions of \(100<p_{\mathrm {T}} ^{\mathrm {V}}<130\,\text {GeV} \), \(130<p_{\mathrm {T}} ^{\mathrm {V}}<180\,\text {GeV} \), and \(p_{\mathrm {T}} ^{\mathrm {V}}>180\,\text {GeV} \) are used to analyze the 1-muon channel, and the regions for the 1-electron channel are defined as \(100<p_{\mathrm {T}} ^{\mathrm {V}}<150\,\text {GeV} \) and \(p_{\mathrm {T}} ^{\mathrm {V}}>150\,\text {GeV} \). The selected regions for the 0-lepton channel are identical in \(p_{\mathrm {T}} ^{\mathrm {V}}\) to the requirements used in the multivariate analysis, but we define ranges of \({p_{\mathrm {T}}}^{{\mathrm {b}}\overline{{\mathrm {b}}}}>110\,\text {GeV} \), \({p_{\mathrm {T}}}^{{\mathrm {b}}\overline{{\mathrm {b}}}}>140\,\text {GeV} \), and \({p_{\mathrm {T}}}^{{\mathrm {b}}\overline{{\mathrm {b}}}}>190\,\text {GeV} \), and impose an additional threshold for the jet of highest \(p_{\mathrm {T}}\) of \({>}80\,\text {GeV} \) for the region of \({p_{\mathrm {T}}}^{{\mathrm {b}}\overline{{\mathrm {b}}}}>140\,\text {GeV} \). For the 2-lepton channels, the \(p_{\mathrm {T}} ^{\mathrm {V}}\) ranges are defined by \(100<p_{\mathrm {T}} ^{\mathrm {V}}<150\,\text {GeV} \) and \(p_{\mathrm {T}} ^{\mathrm {V}}>150\,\text {GeV} \), and, in addition, we require medium \(\mathrm {CSV}_{\text {max}}\) and moderate \(\mathrm {CSV}_{\text {min}}\) thresholds, and \(E_{\mathrm {T}}^{\text {miss}} < 60\,\text {GeV} \).

Figure 2(a) combines events from all channels into a single \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) distribution, which is compared to expectations from the SM. Figure 2(b) shows the same distribution, but after subtracting all SM contributions except for the VZ signals and VH backgrounds. The VZ signal is clearly visible, with a yield compatible with that expected in the SM.

Fig. 2

(a) The combined \({\mathrm {b}}\overline{{\mathrm {b}}}\) invariant mass distribution for all channels, compared to MC simulation of SM contributions. (b) Same distribution as in (a), but with all backgrounds to VZ production, except for the VH contribution, subtracted. The contributions from backgrounds and signal are summed cumulatively. The expectations for the sum of VZ signal and background from VH production are also shown superimposed. The error bars and cross-hatched regions reflect statistical uncertainties at 68 % confidence level

4 Background calibration regions and systematic uncertainties

Calibration regions in data are used to validate the simulated distributions used to build the BDT discriminants, as well as to correct normalizations of the major background contributions from W and Z bosons produced in association with jets (LF or \({\mathrm {b}}\) quarks) and \(\mathrm {t}\overline{\mathrm {t}}\) production. These calibration regions are identical to those of Ref. [17], and typically involve inversion of b-tag selection criteria and two-jet mass sidebands around the signal region. A set of simultaneous fits is then performed to distributions of discriminating variables in the calibration regions, separately for each channel, to obtain consistent scale factors that are used to adjust the yields from simulated events. These scale factors account not only for discrepancies between predicted cross sections and data, but also for any residual differences in the selection of physical objects. Separate scale factors are consequently applied for each of the background processes in the different channels. For the backgrounds from V+jets, the calibration regions are enriched in either \({\mathrm {b}}\) or LF jets. Uncertainties in the scale factors include statistical components arising from the fits to the discriminant (affected by the finite size of the data and MC samples), and systematic uncertainties originating from \({\mathrm {b}}\) tagging, jet energy scale, and jet energy resolution. The numerical values of the scale factors are close to unity and their uncertainties (3–50 %) are identical to those of Ref. [17].

The systematic uncertainties considered in the measurement of the cross section using the multivariate analysis are summarized in Table 1. The two columns give the uncertainty in the “signal strength” \(\mu \) for the WZ and ZZ processes, which corresponds to the ratio of the observed yield relative to the yield expected from the SM. Each systematic uncertainty is represented by a nuisance parameter and profiled in the combined fit. To evaluate the impact of individual uncertainties, a fit to a simulated pseudo-dataset is performed with individual nuisance parameters removed.

Table 1 Sources of systematic uncertainty, including whether they affect the distribution (dist) or normalization (norm) of the BDT output, and their relative contributions to the expected uncertainty in the signal strengths \(\mu _{\mathrm {W}\mathrm {Z}}\) and \(\mu _{\mathrm {Z}\mathrm {Z}}\) after fitting the model

Theoretical uncertainties in the acceptances are evaluated using the mcfm [1] generator by changing the QCD factorization and renormalization scales up and down by a factor of two relative to the default scales of \(\mu _R = \mu _F = m_Z\). The impact of uncertainties in PDF and \(\alpha _s\) on the cross section and acceptance of the VZ signal are evaluated following the PDF4LHC prescription [35, 36], using the CT10 [37], MSTW08 [31], and NNPDF2.0 [38] PDF sets, and the combined uncertainty is found to be 5 % for both WZ and ZZ production. Because of the large \(p_{\mathrm {T}} ^{\mathrm {V}}\) values required in this analysis, the results are sensitive to electroweak (EW) and next-to-next-to-leading-order (NNLO) QCD corrections, both of which can be significant. Since the exact corrections for the VZ process are not available, we use the NLO EW [39–41] and NNLO QCD [42] corrections to VH production, and apply these to the VZ channel, because they are expected to be similar for the two processes. Based on the size of the correction, an additional 10 % uncertainty is assigned to the inclusive cross section to account for the extrapolation to the \(p_{\mathrm {T}} ^{\mathrm {V}}<100\,\text {GeV} \) region.

The uncertainty in CMS luminosity is estimated to be 2.6 % [43]. Muon and electron triggering, reconstruction, and identification efficiencies are determined in data from samples of \(\mathrm {Z}\rightarrow \ell ^{+}\ell ^{-}\) decays. The uncertainty in the lepton yields due to trigger inefficiency is 2 % per lepton, as is the uncertainty in lepton identification efficiency. The parameters describing the turn-on in the trigger efficiency in the 0-lepton channel are varied within their statistical uncertainties for different assumptions on the methods used to derive the efficiency. The estimated uncertainty is 3 %.

The jet energy scale is also varied within its uncertainty as a function of jet \(p_{\mathrm {T}}\) and \(\eta \), and the efficiency of the selections is then recomputed to assess the dependence on these variables. The effect of the uncertainty in the jet energy resolution is evaluated by smearing the jet energies according to their measured uncertainties, a process that affects both the normalization and distribution of events. An uncertainty of 3 % is assigned to the yields of all processes in the 0-lepton and 1-lepton channels due to uncertainties related to \(E_{\mathrm {T}}^{\text {miss}}\), such as its scale and resolution.

Scaling factors to normalize b-tagging in simulation to that in data (measured in \({\mathrm {b}}\) enhanced samples of jets that contain muons) are applied consistently to jets in simulated signal and background events. The measured uncertainties in b-tagging scale factors are 3 % per b-quark jet, 6 % per c-quark jet, and 15 % per mistagged jet (originating from a gluon or from a light quark) [18]. These translate into uncertainties in yields of 3–15 %, depending on channel and specific process. The BDT output is also affected by the distributions of the CSV output, and an uncertainty is therefore assigned according to \({\pm }1\) standard deviation (SD) variation in yield and shape of the CSV distributions.

Finally, the sizes of the simulated samples, as well as uncertainties in generator-level modeling of V+jets and \(\mathrm {t}\overline{\mathrm {t}}\) backgrounds, are taken into account to determine the total uncertainty in the signal strength \(\mu \).

5 Results

The total cross sections are determined from a simultaneous fit to all final states, constrained by the number of events observed in each category. The likelihood is written as a combination of individual channel likelihoods for the signal and background hypotheses. We extract the best-fit values of the signal strength assuming the SM expectation for the ratio of \(\sigma {(\mathrm {W}\mathrm {Z})}/\sigma {(\mathrm {Z}\mathrm {Z})}\) at NLO. Using the baseline multivariate analysis, the VZ process is observed with a statistical significance of 6.3 SD (5.9 SD expected). The measurement corresponds to a signal strength relative to the SM of \(\mu = 1.09 {}_{-0.21}^{+0.24}\). The cross-check analysis based on \(m_{{\mathrm {b}}\overline{{\mathrm {b}}}}\) yields a significance of 4.1 SD (4.6 SD expected), which corresponds to \(\mu = 0.97 {}_{-0.29}^{+0.32}\). In the following, the interpretation refers to the more sensitive multivariate analysis.

The cross sections extracted from the individual channels are consistent with each other and with the SM predictions, as can be seen in Fig. 3(a). To extract the WZ and ZZ cross sections, a simultaneous fit is performed floating independently the WZ and ZZ contributions, with results displayed in Fig. 3(b). The most likely values are \(\mu _{\mathrm {W}\mathrm {Z}} = 1.37 {}_{-0.37}^{+0.42}\) and \(\mu _{\mathrm {Z}\mathrm {Z}} = 0.85 {}_{-0.31}^{+0.34}\).

The values for the signal strength are extrapolated to the mass window \(60<M_{\mathrm {Z}}<120\,\text {GeV} \) for both the \({\mathrm {b}}\overline{{\mathrm {b}}}\) and lepton pair invariant masses. The resulting cross section for inclusive WZ production is \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z}) = 30.7 \pm 9.3\,\text {(stat.)} \pm 7.1\,\text {(syst.)} \pm 4.1\,\text {(th.)} \pm 1.0\,\text {(lum.)} \,\text {pb} \), compared to the theoretical value of \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z})= 22.3 \pm 1.1\,\text {pb} \), calculated with mcfm using the MSTW2008 PDF. The ZZ cross section is \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z}) = 6.5 \pm 1.7\,\text {(stat.)} \pm 1.0\,\text {(syst.)} \pm 0.9\,\text {(th.)} \pm 0.2\,\text {(lum.)} \,\text {pb} \), for the same Z-mass window, which can be compared to the theoretical value of \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z})= 7.7 \pm 0.4\,\text {pb} \), also calculated with mcfm using the MSTW2008 PDF. The uncertainties in both theoretical values include uncertainties in the PDF and \(\alpha _s\), and those originating from the uncertainty in renormalization and factorization scales. The ZZ cross section is in agreement with the CMS measurements using all-leptonic V decays of Ref. [5], which are more precise than this analysis.
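The relation between a signal strength and a cross section can be illustrated with simple arithmetic: since \(\mu \) is the ratio of the observed to the expected SM yield, scaling the NLO prediction by the best-fit \(\mu \) gives an approximate measured cross section. The sketch below uses only numbers quoted in the text. It is not the procedure used in the analysis, and because the quoted results come from the full fit and extrapolation, the products reproduce them only approximately.

```python
# NLO SM predictions (pb) and best-fit signal strengths, as quoted in the text.
SM_NLO_PB = {"WZ": 22.3, "ZZ": 7.7}
MU_BEST_FIT = {"WZ": 1.37, "ZZ": 0.85}
QUOTED_PB = {"WZ": 30.7, "ZZ": 6.5}


def scaled_cross_section(process: str) -> float:
    """Approximate measured cross section as mu times the SM prediction."""
    return MU_BEST_FIT[process] * SM_NLO_PB[process]


for process in ("WZ", "ZZ"):
    approx = scaled_cross_section(process)
    print(f"{process}: mu * sigma_SM = {approx:.2f} pb (quoted: {QUOTED_PB[process]} pb)")
```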

Fig. 3

(a) Best-fit values of the ratios of the VZ production cross sections, relative to SM predictions for individual channels, and for all channels combined (hatched band). (b) Contours of 68 and 95 % confidence level for WZ and ZZ production cross sections. The large cross indicates the best-fit value including its 68 % statistical uncertainty, and the light small cross shows the result for the MCFM NLO calculation

The cross sections for \(p_{\mathrm {T}} ^{\mathrm {V}}> 100\,\text {GeV} \) and for Z bosons produced in the mass region \(60<M_{\mathrm {Z}}<120\,\text {GeV} \) are determined to be \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z}) = 4.8 \pm 1.4\,\text {(stat.)} \pm 1.1\,\text {(syst.)} \,\text {pb} \) and \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z}) = 0.90 \pm 0.23\,\text {(stat.)} \pm 0.16\,\text {(syst.)} \,\text {pb} \). The acceptance for this \(p_{\mathrm {T}} \) region has a smaller theoretical uncertainty, estimated as 1 % using MC signal simulation. The measurements are found to be in agreement with the NLO mcfm calculations, which yield \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z})= 3.39 \pm 0.17\,\text {pb} \) and \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z})= 1.03 \pm 0.05\,\text {pb} \).
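One rough way to quantify the agreement between the measured and predicted cross sections for \(p_{\mathrm {T}} ^{\mathrm {V}}> 100\,\text {GeV} \) is to express their difference in units of the combined uncertainty. The sketch below is an illustration, not part of the analysis: it assumes Gaussian, uncorrelated uncertainties added in quadrature, and it ignores the 1 % acceptance uncertainty.

```python
import math


def pull(measured: float, predicted: float, *uncertainties: float) -> float:
    """(measured - predicted) divided by the quadrature sum of the uncertainties."""
    total = math.sqrt(sum(u * u for u in uncertainties))
    return (measured - predicted) / total


# Values in pb, as quoted in the text: measurement, prediction, then the
# statistical, systematic, and theory-prediction uncertainties.
pull_wz = pull(4.8, 3.39, 1.4, 1.1, 0.17)
pull_zz = pull(0.90, 1.03, 0.23, 0.16, 0.05)

print(f"WZ pull: {pull_wz:+.2f} SD, ZZ pull: {pull_zz:+.2f} SD")
```

Under these simplifying assumptions both differences are below one standard deviation, consistent with the statement that the measurements agree with the NLO calculations.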

6 Summary

We presented measurements of the inclusive \(\mathrm {p}\mathrm {p}\rightarrow \mathrm {V} \mathrm {Z}\) (where V denotes W or Z) cross sections in data recorded by the CMS experiment at the LHC at \(\sqrt{s} =8\,\text {TeV} \), corresponding to an integrated luminosity of 18.9 fb\(^{-1}\). The measurements are based on \(\mathrm {V} \mathrm {Z}\rightarrow \mathrm {V} {\mathrm {b}}\overline{{\mathrm {b}}} \) final states. The decay modes \(\mathrm {Z}\rightarrow {\nu }\overline{\nu }\), \(\mathrm {W}^{\pm }\rightarrow \ell ^{\pm }{\nu }\), and \(\mathrm {Z}\rightarrow \ell ^{+}\ell ^{-}\) (\(\ell = \mathrm {e}, \mathrm {\mu }\)) are used to identify the accompanying V. We observe \(\mathrm {V} \mathrm {Z}\rightarrow \mathrm {V} {\mathrm {b}}\overline{{\mathrm {b}}} \) production with a combined significance of 6.3 standard deviations. The total cross sections, defined for \(60<M_{\mathrm {Z}}<120\,\text {GeV} \), are found to be \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z}) = 30.7 \pm 9.3\,\text {(stat.)} \pm 7.1\,\text {(syst.)} \pm 4.1\,\text {(th.)} \pm 1.0\,\text {(lum.)} \,\text {pb} \) and \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z}) = 6.5 \pm 1.7\,\text {(stat.)} \pm 1.0\,\text {(syst.)} \pm 0.9\,\text {(th.)} \pm 0.2\,\text {(lum.)} \,\text {pb} \). These values are consistent with the predictions \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {W}\mathrm {Z})= 22.3 \pm 1.1\,\text {pb} \) and \(\sigma (\mathrm {p}\mathrm {p}\rightarrow \mathrm {Z}\mathrm {Z})= 7.7 \pm 0.4\,\text {pb} \) of the standard model at next-to-leading order.