1
- ---
2
- title : " Results from CERN Summer school 2025: Supporting Automatic
3
- Differentiation in CMS Combine profile likelihood scans "
1
+ title: "Results from CERN Summer School 2025: Supporting Automatic
2
+ Differentiation in CMS Combine profile likelihood scans"
4
3
layout: post
5
4
excerpt: "A CERN Summer Student 2025 project aiming at the support of
6
5
automatic differentiation (AD) for likelihood scans in the CMS Combine
7
- tool to accelerate statistical inference by leveraging RooFit's
8
- AD support and LLVM-based gradient generation."
6
+ tool to accelerate statistical inference by leveraging RooFit's AD
7
+ support and LLVM-based gradient generation."
9
8
sitemap: false
10
9
author: Galin Bistrev
11
10
permalink: blogs/2025_galin_bistrev_results_blog/
@@ -15,140 +14,131 @@ tags: cern cms root combine c++ RooFit automatic-differentiation
15
14
---
16
15
17
16
### **Introduction**
18
- Greetings! I’m Galin Bistrev, a fourth-year student specializing in Nuclear
19
- and Particle Physics at the University of Sofia "St. Kliment Ohridski".
20
- As part of the CERN Summer Student Programme 2025, I was working on a project
21
- that aimed to provide support for Automatic Differentiation (AD)
22
- into the CMS Combine tool profile likelihood scans.
23
-
17
+ Greetings! I’m Galin Bistrev, a fourth-year student specializing in
18
+ Nuclear and Particle Physics at the University of Sofia "St. Kliment Ohridski."
19
+ As part of the CERN Summer Student Programme 2025, I worked on a
20
+ project to add support for Automatic Differentiation (AD) to
21
+ profile likelihood scans in the CMS Combine tool.
24
22
25
23
Mentors: Jonas Rembser, Vassil Vassilev, David Lange
26
24
27
25
### **Description of the Project**
28
26
29
27
This project aims to enhance support for Automatic Differentiation (AD)
30
28
in likelihood scans within the CMS Combine framework, the primary
31
- statistical analysis tool of the CMS experiment at CERN.
32
- Combine is built on top of RooFit, which has recently introduced AD to
33
- improve minimization techniques.By providing computationally efficient
29
+ statistical analysis tool of the CMS experiment at CERN. Combine is
30
+ built on top of RooFit, which has recently introduced AD to improve
31
+ minimization techniques. By providing computationally efficient
34
32
gradients through AD, RooFit achieves substantial performance
35
- improvements. In RooFit ,Clad converts internal likelihood representations
36
- into standalone C++ code, from which gradient routines for AD
37
- are generated.This strategy not only speeds up the fitting process but
38
- also increases the portability and shareability of likelihood models,
39
- making them usable even by those without detailed knowledge
40
- of RooFit or Combine internals.
41
-
42
-
33
+ improvements. In RooFit, Clad converts internal likelihood
34
+ representations into standalone C++ code, from which gradient
35
+ routines for AD are generated. This strategy not only speeds up the
36
+ fitting process but also increases the portability and shareability
37
+ of likelihood models, making them usable even by those without
38
+ detailed knowledge of RooFit or Combine internals.
43
39
44
40
### **Brief overview of the CMS Combine engine**
45
- Combine is a statistical analysis framework that compares models
46
- of expected observations with real data. It is widely used for tasks
47
- such as searching for new particles or processes, setting limits on
48
- potential new physics, and measuring physical quantities like cross sections.
49
- Although developed with High Energy Physics (HEP) applications in mind,
50
- Combine contains no intrinsic physics assumptions, making it fully general
51
- and independent of any specific analysis. This flexibility allows it
52
- to be applied across a broad range of statistical problems.
41
+ Combine is a statistical analysis framework that compares models of
42
+ expected observations with real data. It is widely used for tasks such
43
+ as searching for new particles or processes, setting limits on
44
+ potential new physics, and measuring physical quantities like cross-sections.
45
+ Although developed with High Energy Physics (HEP)
46
+ applications in mind, Combine contains no intrinsic physics assumptions,
47
+ making it fully general and independent of any specific analysis.
48
+ This flexibility allows it to be applied across a broad range of
49
+ statistical problems.
53
50
54
51
Roughly, Combine performs three main functions:
55
52
56
53
- Builds a statistical model of expected observations.
57
-
58
54
- Runs statistical tests comparing the model with observed data.
59
-
60
55
- Provides tools for validating, inspecting, and understanding both the
61
56
model and the results of the statistical tests.
62
57
63
- ### Project goals
64
-
65
- In order for AD to be supported in Combine likelihood scans ,
66
- a number of goals needed to be achieved:
58
+ ### **Project goals**
67
59
68
- - Refactoring some of Combine's logic into RooFit , so that Combine can
69
- reuse the AD-enabled minimization algorithm already present there.
60
+ Supporting AD in Combine likelihood scans required achieving several goals:
70
61
62
+ - Refactor some of Combine's logic into RooFit, so that Combine can
63
+ reuse the AD-enabled minimization algorithm already present there.
71
64
- Integrate gradient computation into likelihood scans, ensuring that
72
- derivatives are correctly propagated for efficient and accurate minimization.
73
-
74
- - Validate correctness and performance, confirming that the AD-based scans
75
- produce results consistent with traditional
76
- methods while offering improved performance.
65
+ derivatives are correctly propagated for efficient and accurate minimization.
66
+ - Validate correctness and performance, confirming that the AD-based
67
+ scans produce results consistent with traditional methods while
68
+ offering improved performance.
77
69
78
70
## **Overview of Completed Work**
79
- Over the course of the project, several major tasks were completed to
80
- achieve the stated objectives:
71
+ Over the course of the project, several major tasks were completed to achieve the stated objectives:
81
72
82
73
- Imported the `RooMultiPdf` class into RooFit from Combine, enabling
83
- switching between multiple PDF-s, applying statistical penalties, and
84
- supporting code generation for AD.
85
-
86
- - The implementation of the new class was made to be supported by ` codegen `
87
- in RooFit by adding a new function in ` MathFunc.h ` and extending
88
- ` CodegenImpl.cxx ` to generate code for models making use of it.
89
-
90
- - Imported three pieces of code from Combine that handle the minimization
91
- procedures within the framework in RooFit's ` RooMinimizer.cxx ` .
92
- The first is a class imported by Jonas Rembser called ` FreezeDisconnectedParametersRAII ` ,
93
- which automatically freezes and unfreezes parameters disconnected from
94
- the likelihood graph. The second is the function ` generateOrthogonalCombinations ` ,
95
- which generates a list of index combinations by initializing a base configuration
96
- with all indices set to zero and then varying one category at a time.
97
- The third and final piece of code is a function called ` reorderCombinations ` ,
98
- which takes the set of indices produced by ` generateOrthogonalCombinations ` and adjusts
99
- each combination by adding the corresponding base values modulo the
100
- maximum allowed index, effectively shifting the combinations relative
101
- to the current best indices.
74
+ switching between multiple PDFs, applying statistical penalties,
75
+ and supporting code generation for AD.
102
76
103
- - Using the above stated functions , the discrete profiling algorithm,
104
- which is the main minimization algorithm in Combine, was imported in
105
- ` RooMinimizer .cxx` .
77
+ - Added `codegen` support for the new class in RooFit by adding a new
78
+ function in `MathFunc.h` and extending `CodegenImpl.cxx` to generate
79
+ code for models that use it.
106
80
107
- - A [ tutorial] ( https://root.cern/doc/master/rf619__discrete__profiling_8py.html ) was created
108
- along with a [ benchmark] ( https://github.com/vgvassilev/clad/issues/1521 ) ,made by Jonas Rembser,
109
- demonstrating discrete profiling with RooMultiPdf objects and evaluating
110
- the performance of AD in the likelihood scans.
81
+ - Imported into RooFit's `RooMinimizer.cxx` three pieces of code from
82
+ Combine that handle its minimization procedures.
83
+ The first is a class imported by Jonas Rembser
84
+ called `FreezeDisconnectedParametersRAII`, which automatically
85
+ freezes and unfreezes parameters disconnected from the likelihood graph.
86
+ The second is the function `generateOrthogonalCombinations`, which
87
+ generates a list of index combinations by initializing a base
88
+ configuration with all indices set to zero and then varying one category at a time.
89
+ The third and final piece of code is a function called `reorderCombinations`,
90
+ which takes the set of indices produced by `generateOrthogonalCombinations`
91
+ and adjusts each combination by adding the corresponding base values
92
+ modulo the maximum allowed index, effectively shifting the combinations
93
+ relative to the current best indices.
94
+
95
+ - Using these functions, the discrete profiling algorithm,
96
+ which is the main minimization algorithm in Combine, was imported
97
+ into `RooMinimizer.cxx`.
98
+ - A [tutorial](https://root.cern/doc/master/rf619__discrete__profiling_8py.html)
99
+ was created along with a [benchmark](https://github.com/vgvassilev/clad/issues/1521),
100
+ made by Jonas Rembser, demonstrating discrete profiling with `RooMultiPdf` objects
101
+ and evaluating the performance of AD in the likelihood scans.
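
To make the two combination helpers described above more concrete, here is a minimal standalone sketch of the idea. It is not the code that lives in `RooMinimizer.cxx`: the names `orthogonalCombinations` and `shiftCombinations`, their signatures, and the use of plain `std::vector<int>` are assumptions chosen for illustration, and "maximum allowed index" is interpreted here as the number of states of each category.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for generateOrthogonalCombinations.
// numStates[i] is the assumed number of states of discrete category i.
// Starts from the all-zero configuration, then varies one category at a time.
std::vector<std::vector<int>> orthogonalCombinations(const std::vector<int> &numStates)
{
   const std::size_t n = numStates.size();
   std::vector<std::vector<int>> result;
   result.push_back(std::vector<int>(n, 0)); // base configuration
   for (std::size_t i = 0; i < n; ++i) {
      for (int state = 1; state < numStates[i]; ++state) {
         std::vector<int> combination(n, 0);
         combination[i] = state; // only category i differs from the base
         result.push_back(combination);
      }
   }
   return result;
}

// Hypothetical stand-in for reorderCombinations.
// Adds the current best indices to every combination, modulo the number of
// states, so the scan is expressed relative to the current best point.
std::vector<std::vector<int>> shiftCombinations(const std::vector<std::vector<int>> &combinations,
                                                const std::vector<int> &numStates,
                                                const std::vector<int> &best)
{
   std::vector<std::vector<int>> result;
   result.reserve(combinations.size());
   for (const auto &combination : combinations) {
      std::vector<int> shifted(combination.size());
      for (std::size_t i = 0; i < combination.size(); ++i) {
         shifted[i] = (combination[i] + best[i]) % numStates[i];
      }
      result.push_back(shifted);
   }
   return result;
}
```

For two categories with 3 and 2 states, `orthogonalCombinations({3, 2})` yields `{0,0}`, `{1,0}`, `{2,0}`, `{0,1}`, and shifting by the best indices `{2, 1}` gives `{2,1}`, `{0,1}`, `{1,1}`, `{2,0}`. The number of configurations grows with the sum of the category sizes rather than their product, which is what keeps this kind of scan affordable.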
111
102
112
103
## **Results**
113
- With those objectives accomplished, RooFit now provides AD support for discrete profiling.
114
- However, the developed benchmark indicates that AD does not currently
115
- improve efficiency, as the gradient code generated by Clad introduces overhead.
116
- Further optimization in Clad is needed to achieve the potential performance
117
- gains for RooFit likelihood scans.More information regarding the issue can be
118
- found at [ #1521 ] ( https://github.com/vgvassilev/clad/issues/1521 ) .
104
+ With those objectives accomplished, RooFit now provides AD support for
105
+ discrete profiling. However, the developed benchmark indicates that AD
106
+ does not currently improve efficiency, as the gradient code generated by
107
+ Clad introduces overhead. Further optimization in Clad is needed to achieve
108
+ the potential performance gains for RooFit likelihood scans. More information
109
+ regarding the issue can be found at [#1521](https://github.com/vgvassilev/clad/issues/1521).
119
110
120
111
## **Conclusions**
121
112
Thanks to this project, RooFit now enables AD support for discrete profiling in Combine,
122
113
which, after addressing the current overhead in Clad, would allow for
123
114
significantly faster and more efficient likelihood scans while maintaining
124
115
accurate optimization of both discrete and continuous parameters.
125
116
126
- ## * Future work*
117
+ ## **Future work**
127
118
- Further benchmarking is required to quantify the potential performance
128
- gains from automatic differentiation.
129
-
130
- - Additional optimization of Clad is needed to eliminate unnecessary overhead
131
- in gradient generation.
132
-
133
- - The discrete profiling logic implemented in RooMinimizer should be tested across
134
- different models to evaluate the minimizer’s behavior and robustness.
119
+ gains from automatic differentiation.
120
+ - Additional optimization of Clad is needed to eliminate unnecessary
121
+ overhead in gradient generation.
122
+ - The discrete profiling logic implemented in RooMinimizer should be
123
+ tested across different models to evaluate the minimizer’s behavior and
124
+ robustness.
125
+ - Extend the Doxygen documentation of RooMinimizer to describe the treatment of discrete
126
+ parameters.
127
+ - Test whether the discrete profiling implementation also works inside CMS Combine,
128
+ replacing its implementation in `CascadeMinimizer.cxx`.
135
129
136
130
## **Acknowledgements**
137
- would like to express my sincere gratitude to the CERN Summer School for
138
- the opportunity to participate in such an inspiring project. I extend
139
- special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for
140
- their invaluable guidance and for providing continuous
141
- learning opportunities throughout this journey. I am also grateful to
142
- the ROOT team for welcoming me and supporting me throughout my stay at
143
- CERN.
144
-
145
-
146
-
131
+ I would like to express my sincere gratitude to the CERN Summer School
132
+ for the opportunity to participate in such an inspiring project.
133
+ I extend special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for
134
+ their invaluable guidance and for providing continuous learning opportunities throughout this journey.
135
+ I am also grateful to the ROOT team for welcoming me and supporting me throughout my stay at CERN.
147
136
148
137
## **Related Links**
149
- - [ CMS Combine GitHub page] https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/
150
- - [ ROOT official repository] https://github.com/root-project/root
151
- - [ My GitHub profile] https://github.com/GalinBistrev2
138
+ - [CMS Combine GitHub page](https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/)
139
+ - [ROOT official repository](https://github.com/root-project/root)
140
+ - [My GitHub profile](https://github.com/GalinBistrev2)
141
+ - [Presentation](/assets/presentations/CaaS_Weekly_25_09_2025_Galin_Bistrev_AD_in_CMS_Combine.pdf)
152
142
153
143
154
144