Propagate necessary weights for estimating MC closure #27
This is sufficient when estimating fakes but not for MC closure, where we
want to apply the FR weights based on the lepton flavor. This distinction
is currently not handled.
On Mon, Oct 10, 2022, 4:05 PM saswatinandan wrote:
The Fake_Rate weight is already included in the event weight calculation here
<https://github.com/HEP-KBFI/TallinnNtupleProducer/blob/main/EvtWeightTools/src/EvtWeightRecorder.cc#L61>
and here
<https://github.com/HEP-KBFI/TallinnNtupleProducer/blob/main/EvtWeightTools/src/EvtWeightRecorder.cc#L1149-L1176>
all leptons are looped over: the fake weight is estimated only for those
leptons that fail the tight selection, and it is 1 for those that pass it.
So I don't think we need any additional branch for the Fake_Rate weight.
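The per-lepton logic described above can be sketched roughly as follows. This is only an illustration, not a transcription of EvtWeightRecorder.cc: the function name, the `(is_tight, fake_rate)` tuple layout, the f/(1-f) transfer factor and the alternating sign are assumptions based on the conventional fake-factor method.

```python
def fake_rate_weight(leptons):
    """Return an event-level fake-rate weight.

    leptons: list of (is_tight, fake_rate) tuples, with 0 <= fake_rate < 1.
    Tight leptons contribute a factor of 1, as described in the comment above.
    """
    weight = 1.0
    n_failing = 0
    for is_tight, fake_rate in leptons:
        if is_tight:
            continue  # tight lepton: factor of 1
        # Assumed transfer factor f / (1 - f) for a fakeable-but-not-tight lepton.
        weight *= fake_rate / (1.0 - fake_rate)
        n_failing += 1
    if n_failing == 0:
        return 1.0
    # Assumed sign convention (-1)^(n+1), which avoids double counting
    # events with more than one lepton failing the tight selection.
    sign = 1.0 if n_failing % 2 == 1 else -1.0
    return sign * weight
```

With this layout an event where every lepton is tight gets a weight of exactly 1, so the weight alone does not distinguish between lepton flavors.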
|
Suppose we are in the electron+muon channel and want to derive the electron closure shape correction. Then we consider only those events with an electron that is fakeable but not tight and a tight muon. We can check this with the isTight and isFakeable branches, and the lepton flavour can be checked from pdgId. The fake weights are obtained from here
<https://github.com/HEP-KBFI/TallinnNtupleProducer/blob/main/EvtWeightTools/src/EvtWeightRecorder.cc#L57>
where the fake weight is already included. Isn't that sufficient, or is something missing? |
OK, so it could work as you described, but the event selection string grows
exponentially with the multiplicity of leptons. I find it easier to
understand and simpler to implement if we just require all particles to
pass the fakeable selection and undo the FR for a given lepton flavor.
However, since the current approach (of not adding any new branches to the
Ntuple) is more favorable in terms of file size (which is quite a problem
for us -- although there's been zero effort to solve it), I'm open to
the idea that we implement the complicated event selection string that
takes all possible permutations of lepton flavors and tightness conditions
into account. In that case we need some piece of code that generates the
selection string (when creating the cfg files for analysis jobs), given the
lepton multiplicity and the lepton flavor for which we want to derive the MC
closure.
|
Hi, I think Saswati's idea to handle the MC closure via custom event selection strings is a good one. If I recall correctly, we compute the fake MC closure systematics separately for electrons (Clos_e) and muons (Clos_m). When we compute the Clos_e systematic, we relax the electron selection to fakeable, while keeping the muon selection tight, and apply the FR weights to the fakeable electron. So, I believe we need a custom event selection string anyway, to relax the lepton selection from tight to fakeable only for electrons and not for muons. The FR weights are already correctly included in the evtWeight (except that we need to switch from FR measured in data to FR obtained in MC). A similar reasoning applies for computing the Clos_m systematic. Or am I missing something? |
AFAICT you're both correct. I did not consider the case where the event has
both fake electrons and muons but neither of which are tight -- such events
end up in fake AR but not in MC closure -- so the selection string has to
differ wrt the fake AR. If that weren't the case, then it'd have been
easier to just save the extra weights imo.
Thus, we need a Python function that generates:
* "(lep1_isTight && lep2_isTight && ... && lepN_isTight)" for the SR;
* "(lep1_isFake && lep2_isFake && ... && lepN_isFake) && ! (lep1_isTight &&
lep2_isTight && ... && lepN_isTight)" for the fake AR;
* Permutations of "(lepA_isFake && ! lepA_isTight && (lepA_pdgId == 11 ||
lepA_pdgId == -11))" and "(lepB_isTight && (lepB_pdgId == 13 || lepB_pdgId ==
-13))" to get MC closure for electrons (and swap 11 and 13 to get MC closure
for muons);
when creating cfg files for analysis jobs, and apply the nominal event
weight in all cases. It could take the number of leptons and the type of
analysis region (SR, fake AR, MC closure for e/mu) as input and return one
of the strings given above. For taus we can have a separate function and
ignore the PDG ID info. I think it can be implemented in our job
distribution framework (right?). Or what do you think?
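A minimal sketch of such a generator is given below, assuming the `lepN_isTight` / `lepN_isFake` / `lepN_pdgId` branch naming used in the strings above. The function name and the region labels are hypothetical. Instead of spelling out every permutation, the closure case uses the logically equivalent per-lepton form (each lepton is either a relaxed lepton of the chosen flavor or a tight lepton of the other flavor) and adds a veto on the all-tight combination. That veto is an assumption made here to avoid overlap with the SR.

```python
def lepton_selection(n_leptons, region):
    """Build an event selection string for a given analysis region.

    region: "SR", "fakeAR", "closure_e" or "closure_m" (hypothetical labels).
    Branch names follow the lepN_* pattern quoted in the discussion and are
    not checked against the actual Ntuple.
    """
    if n_leptons < 1:
        raise ValueError("need at least one lepton")
    idx = range(1, n_leptons + 1)
    all_tight = "(" + " && ".join(f"lep{i}_isTight" for i in idx) + ")"
    all_fake = "(" + " && ".join(f"lep{i}_isFake" for i in idx) + ")"
    if region == "SR":
        return all_tight
    if region == "fakeAR":
        return f"{all_fake} && !{all_tight}"
    if region in ("closure_e", "closure_m"):
        # PDG IDs: 11 = electron, 13 = muon; "relaxed" is the closure flavor.
        relaxed, kept = (11, 13) if region == "closure_e" else (13, 11)
        terms = []
        for i in idx:
            lep = f"lep{i}"
            relaxed_term = (f"({lep}_isFake && !{lep}_isTight && "
                            f"({lep}_pdgId == {relaxed} || {lep}_pdgId == -{relaxed}))")
            tight_term = (f"({lep}_isTight && "
                          f"({lep}_pdgId == {kept} || {lep}_pdgId == -{kept}))")
            terms.append(f"({relaxed_term} || {tight_term})")
        # Assumption: veto the all-tight combination to stay orthogonal to the SR.
        return "(" + " && ".join(terms) + f") && !{all_tight}"
    raise ValueError(f"unknown region: {region}")
```

Because the per-lepton terms are joined with `&&`, the string grows linearly with the lepton multiplicity rather than with the number of permutations.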
|
Hi, I have an alternative proposal. We could add 2 flags:
to the RecoLepton class and 2 corresponding branches to the "plain" Ntuple (this would be an addition to the C++ code, which would work in the same way for all channels, as far as I can see). These extra branches would allow us to simply
in the event selection string when running the analysis code to compute the Clos_e systematics. I think this would work (and would require the least amount of coding to replace the event selection string). What do you think? |
I guess that for the application region at least one lepton should be fakeable and not tight, and at most all leptons can be fakeable but not tight, i.e.
|
It's the same thing. |
Right. Then we actually don't need this (lep1_isFake && lep2_isFake && ... && lepN_isFake), as all leptons are fakeable https://github.com/HEP-KBFI/TallinnNtupleProducer/blob/main/Writers/plugins/RecoLeptonWriter.cc#L132 so only |
Hi Christian, shouldn't it be
? In any case, the code that produces the selection string would be the same at the end of the day, i.e. it doesn't matter if we concatenate "isTight" flags with "&&" delimiters to obtain events in the SR, or "isClose_e" / "isTight || (is_electron && isFakeable && ! isTight)" flags to get events in the Clos_e region. It basically comes down to whether we want to keep the number of branches to an absolute minimum or make the selection string slightly more readable. I have no preference here. Saswati, I see no point in being so nitpicky. If we want to relax the lepton selection for the purpose of training an ML model, for example, then we would probably need to save loose leptons instead of fakeable ones. |
Hi Karl, I think
is wrong. My understanding is that at least one of the leptons must not pass the tight lepton selection (to avoid overlap with the signal region). It is wrong to demand that all leptons fail the tight selection. Think about an event with 2 electrons in the 2lss channel. For this event, the isClos_m systematic requires that both electrons pass the tight selection, while the isClos_e systematic allows for either 2 fakeable or 1 fakeable + 1 tight electron, mimicking the selection that we apply in the fake background control region for the real data. I realized that I made a mistake too: It is wrong to replace "ntightlep = 2" -> "ntightlep <= 1" for Clos_e as well as for Clos_m.
in order to reproduce the behaviour in |
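The 2lss example above can be checked with a small sketch. The `Lepton` class and the `passes_closure` helper are hypothetical names introduced only for illustration. The per-lepton condition is the "isTight || (is_electron && isFakeable && ! isTight)" expression quoted earlier in the thread, combined with the requirement that at least one lepton fails the tight selection.

```python
from dataclasses import dataclass


@dataclass
class Lepton:
    pdg_id: int        # +-11 for electrons, +-13 for muons
    is_fakeable: bool
    is_tight: bool


def passes_closure(leptons, flavor_pdg_id):
    """Check the MC closure selection; flavor_pdg_id is 11 (Clos_e) or 13 (Clos_m)."""
    def lepton_ok(lep):
        # Leptons of the closure flavor may be relaxed to fakeable;
        # all other leptons must pass the tight selection.
        relaxed = abs(lep.pdg_id) == flavor_pdg_id and lep.is_fakeable
        return lep.is_tight or relaxed

    all_ok = all(lepton_ok(lep) for lep in leptons)
    # At least one lepton must fail tight, to avoid overlap with the SR.
    any_not_tight = any(not lep.is_tight for lep in leptons)
    return all_ok and any_not_tight


tight_e = Lepton(pdg_id=11, is_fakeable=True, is_tight=True)
fakeable_e = Lepton(pdg_id=-11, is_fakeable=True, is_tight=False)

print(passes_closure([tight_e, fakeable_e], 11))     # True: 1 fakeable + 1 tight electron
print(passes_closure([fakeable_e, fakeable_e], 11))  # True: 2 fakeable electrons
print(passes_closure([tight_e, fakeable_e], 13))     # False: Clos_m needs both electrons tight
print(passes_closure([tight_e, tight_e], 11))        # False: all tight, i.e. SR
```

This sketch ignores the gen-matching status of the leptons.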
So I now had more time to focus on this topic. Sorry for the very long post but I don't see any other way than to explicitly spell it all out.

I almost agree with your proposal, Christian, if it weren't for this one caveat that we haven't discussed at all: the gen-matching status of the selected leptons and taus. The second line that you highlighted in your latest reply says that a given event is vetoed in MC closure region for electron/muons if all selected objects are tight but at least one of those objects is a non-prompt electron/muon. Note that the functions

It follows that the proposed selection fails to consider 2lss events as coming from MC closure for electrons if a prompt electron and a non-prompt muon in the event both pass the tight cuts. In the past I've tried to argue for removal of such contributions from the MC closure region (see this thread and the PDF file linked within), but unfortunately it introduced too large a discrepancy between MC closure and fakes MC. Thus, if we want to replicate the same prescription as before, we then need to replace the proposed "

For the sake of completeness I'll discuss all possible use-cases we might encounter in our analysis as to how the phase space can be partitioned based on the promptness and tightness criteria of selected leptons and taus. I'll focus only on the leptons in the following to keep things simple (mostly because taus have different gen-matching codes and no pdgId attributes). At the very end you'll find some tables that demonstrate how the partitioning is supposed to work.

First, we have data events, which do not have any generator-level information available for obvious reasons:
In MC we have several ways of dividing the phase space:
In my opinion, the most efficient way to implement it all would still be at the analysis level. In this case we can drop the redundant

Another issue is that we have used different (i.e. QCD-driven) fake rates (FR) in MC regions compared to FMC or data fakes, where we use data-driven FR. If we want to support this feature (which I think is the case because it allows us to encapsulate potential differences in flavor composition between the measurement and application regions), then the easiest way to accomplish it is to:
edit: forgot to explain what each letter in the following tables stands for:
Single lepton channel:
Dilepton channel:
2l+1tau channel (assuming that taus are required to be gen-matched in the SR):
Of course, if taus are not required to be gen-matched (and data/MC SF is applied instead, as was the case with 2lss+1tau and 3l+1tau channels in ttH multilepton analysis), then the phase space partitioning is identical to the dilepton case. |
At the moment we include the FR weight in the nominal event weight. However, when estimating the contribution of MC in fake closure regions, where leptons of one flavor are kept tight while the leptons of the opposite flavor are relaxed to fakeable but not tight, we have to apply the FR weights only to the leptons of the relaxed flavor.
I believe the easiest way to implement this is to factor out the FR weights that correspond to the flavor of the tight leptons at the analysis level. In other words, we need two additional branches: FR weights for electrons (e.g. frWeight_e) and FR weights for muons (frWeight_m). When we estimate the MC closure contribution for electrons and muons, we just divide the nominal event weight by the FR weight of muons and electrons, respectively, and fill the histograms.
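The division described above can be sketched as follows. The sketch assumes that the nominal event weight factorizes as (everything else) x frWeight_e x frWeight_m. The function name and the flavor labels are hypothetical, and only the branch names frWeight_e and frWeight_m come from the proposal.

```python
def closure_weight(evt_weight, fr_weight_e, fr_weight_m, flavor):
    """Undo the FR weight of the flavor that is kept tight.

    flavor: "e" for the electron MC closure (divide out the muon FR weight),
            "m" for the muon MC closure (divide out the electron FR weight).
    Assumes evt_weight already contains the product fr_weight_e * fr_weight_m.
    """
    if flavor == "e":
        divisor = fr_weight_m
    elif flavor == "m":
        divisor = fr_weight_e
    else:
        raise ValueError(f"unknown flavor: {flavor}")
    if divisor == 0.0:
        raise ZeroDivisionError("FR weight to be factored out is zero")
    return evt_weight / divisor


# Example: base weight 2.0, electron FR weight 0.25, muon FR weight 0.5
nominal = 2.0 * 0.25 * 0.5
print(closure_weight(nominal, 0.25, 0.5, "e"))  # 0.5, i.e. base weight times electron FR weight
print(closure_weight(nominal, 0.25, 0.5, "m"))  # 1.0, i.e. base weight times muon FR weight
```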