|
| 1 | +--- |
| 2 | +title: "Results from CERN Summer school 2025: Supporting Automatic |
| 3 | +Differentiation in CMS Combine profile likelihood scans" |
| 4 | +layout: post |
| 5 | +excerpt: "A CERN Summer Student 2025 project aiming at the support of |
| 6 | +automatic differentiation (AD) for likelihood scans in the CMS Combine |
| 7 | +tool to accelerate statistical inference by leveraging RooFit's |
| 8 | +AD support and LLVM-based gradient generation." |
| 9 | +sitemap: false |
| 10 | +author: Galin Bistrev |
| 11 | +permalink: blogs/2025_galin_bistrev_results_blog/ |
| 12 | +banner_image: /images/blog/banner-cern.jpg |
| 13 | +date: 2025-09-25 |
| 14 | +tags: cern cms root combine c++ rooFit automatic-differentiation |
| 15 | +--- |
| 16 | + |
| 17 | +### **Introduction** |
| 18 | +Greetings! I’m Galin Bistrev, a fourth-year student specializing in Nuclear |
| 19 | +and Particle Physics at the University of Sofia "St. Kliment Ohridski". |
| 20 | +As part of the CERN Summer Student Programme 2025, I worked on a project |
| 21 | +to add support for Automatic Differentiation (AD) to profile likelihood |
| 22 | +scans in the CMS Combine tool. |
| 23 | + |
| 24 | + |
| 25 | +Mentors: Jonas Rembser, Vassil Vassilev, David Lange |
| 26 | + |
| 27 | +### **Description of the Project** |
| 28 | + |
| 29 | +This project aims to enhance support for Automatic Differentiation (AD) |
| 30 | +in likelihood scans within the CMS Combine framework, the primary |
| 31 | +statistical analysis tool of the CMS experiment at CERN. |
| 32 | +Combine is built on top of RooFit, which has recently introduced AD to |
| 33 | +speed up likelihood minimization. By providing computationally efficient |
| 34 | +gradients through AD, RooFit achieves substantial performance |
| 35 | +improvements. In RooFit, the internal likelihood representation is converted |
| 36 | +into standalone C++ code, from which Clad generates the gradient |
| 37 | +routines. This strategy not only speeds up the fitting process but |
| 38 | +also increases the portability and shareability of likelihood models, |
| 39 | +making them usable even by those without detailed knowledge |
| 40 | +of RooFit or Combine internals. |
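
To illustrate why a ready-made gradient routine helps a minimizer, here is a minimal Python sketch. It is not RooFit or Clad code: it assumes a toy Gaussian negative log-likelihood with made-up data, the names `nll`, `grad_exact` and `grad_numeric` are hypothetical, and the hand-derived `grad_exact` merely stands in for the kind of gradient function an AD tool would generate. The numerical alternative, central finite differences, needs two extra likelihood evaluations per parameter.

```python
import math

# Toy observations, made up purely for illustration.
DATA = [4.8, 5.1, 5.3, 4.9, 5.4]


def nll(mu, sigma):
    """Gaussian negative log-likelihood, dropping the constant term."""
    return sum(0.5 * ((x - mu) / sigma) ** 2 + math.log(sigma) for x in DATA)


def grad_exact(mu, sigma):
    """Hand-derived gradient, standing in for AD-generated code."""
    d_mu = sum(-(x - mu) / sigma**2 for x in DATA)
    d_sigma = sum(-((x - mu) ** 2) / sigma**3 + 1.0 / sigma for x in DATA)
    return d_mu, d_sigma


def grad_numeric(mu, sigma, h=1e-6):
    """Central finite differences: two nll calls per parameter."""
    d_mu = (nll(mu + h, sigma) - nll(mu - h, sigma)) / (2 * h)
    d_sigma = (nll(mu, sigma + h) - nll(mu, sigma - h)) / (2 * h)
    return d_mu, d_sigma


exact = grad_exact(5.0, 0.3)
numeric = grad_numeric(5.0, 0.3)
print("exact:  ", exact)
print("numeric:", numeric)
```

Both gradients agree closely here, but the numerical one costs four calls to `nll` for two parameters, and that cost grows with the number of parameters in the model.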
| 41 | + |
| 42 | + |
| 43 | + |
| 44 | +### **Brief overview of the CMS Combine engine** |
| 45 | +Combine is a statistical analysis framework that compares models |
| 46 | +of expected observations with real data. It is widely used for tasks |
| 47 | +such as searching for new particles or processes, setting limits on |
| 48 | +potential new physics, and measuring physical quantities like cross sections. |
| 49 | +Although developed with High Energy Physics (HEP) applications in mind, |
| 50 | +Combine contains no intrinsic physics assumptions, making it fully general |
| 51 | +and independent of any specific analysis. This flexibility allows it |
| 52 | +to be applied across a broad range of statistical problems. |
| 53 | + |
| 54 | +Roughly, Combine performs three main functions: |
| 55 | + |
| 56 | +- Builds a statistical model of expected observations. |
| 57 | + |
| 58 | +- Runs statistical tests comparing the model with observed data. |
| 59 | + |
| 60 | +- Provides tools for validating, inspecting, and understanding both the |
| 61 | +model and the results of the statistical tests. |
| 62 | + |
| 63 | +### **Project goals** |
| 64 | + |
| 65 | +For AD to be supported in Combine likelihood scans, |
| 66 | +a number of goals needed to be achieved: |
| 67 | + |
| 68 | +- Refactor some of Combine's logic into RooFit, so that Combine can |
| 69 | + reuse the AD-enabled minimization algorithm already present there. |
| 70 | + |
| 71 | +- Integrate gradient computation into likelihood scans, ensuring that |
| 72 | + derivatives are correctly propagated for efficient and accurate minimization. |
| 73 | + |
| 74 | +- Validate correctness and performance, confirming that the AD-based scans |
| 75 | + produce results consistent with traditional |
| 76 | + methods while offering improved performance. |
| 77 | + |
| 78 | +## **Overview of Completed Work** |
| 79 | +Over the course of the project, several major tasks were completed to |
| 80 | +achieve the stated objectives: |
| 81 | + |
| 82 | +- Imported the `RooMultiPdf` class from Combine into RooFit, enabling |
| 83 | + switching between multiple PDFs, applying statistical penalties, and |
| 84 | + supporting code generation for AD. |
| 85 | + |
| 86 | +- Added support for the new class to RooFit's `codegen` by adding a new |
| 87 | + function to `MathFunc.h` and extending `CodegenImpl.cxx` to generate |
| 88 | + code for models that use it. |
| 89 | + |
| 90 | +- Imported three pieces of code that handle Combine's minimization |
| 91 | + procedures into RooFit's `RooMinimizer.cxx`. |
| 92 | + The first is the class `FreezeDisconnectedParametersRAII`, |
| 93 | + which automatically freezes and unfreezes parameters disconnected from |
| 94 | + the likelihood graph. The second is the function `generateOrthogonalCombinations`, |
| 95 | + which generates a list of index combinations by initializing a base configuration |
| 96 | + with all indices set to zero and then varying one category at a time. |
| 97 | + The third is the function `reorderCombinations`, which takes the |
| 98 | + set of indices produced by `generateOrthogonalCombinations` and reorders |
| 99 | + them so that combinations differing least from the current best |
| 100 | + configuration are evaluated first. |
| 101 | + |
| 102 | +- Using these pieces, imported the discrete profiling algorithm, |
| 103 | + which is the main minimization algorithm in Combine, into |
| 104 | + `RooMinimizer.cxx`. |
| 105 | + |
| 106 | +- Created a [tutorial](https://root.cern/doc/master/rf619__discrete__profiling_8py.html) |
| 107 | + and a [benchmark](https://github.com/vgvassilev/clad/issues/1521), |
| 108 | + demonstrating discrete profiling with RooMultiPdf objects and evaluating |
| 109 | + the performance of AD in the likelihood scans. |
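
The two helper functions described above can be sketched in a few lines of Python. This is a minimal illustration based only on the descriptions in this post, not the code that lives in `RooMinimizer.cxx`: the snake_case names, the signatures, and the choice to order by the number of differing indices are assumptions made for the example.

```python
from typing import List, Sequence


def generate_orthogonal_combinations(n_states: Sequence[int]) -> List[List[int]]:
    """Return the all-zero base configuration plus every configuration
    that differs from it in exactly one category.

    n_states[i] is the number of states of the i-th discrete category.
    """
    base = [0] * len(n_states)
    combos = [list(base)]
    for cat, n in enumerate(n_states):
        for state in range(1, n):
            combo = list(base)
            combo[cat] = state  # vary one category at a time
            combos.append(combo)
    return combos


def reorder_combinations(combos: List[List[int]], best: Sequence[int]) -> List[List[int]]:
    """Order combinations so that those differing least from `best` come first.

    Assumption: "differing least" is taken to mean the number of indices
    that differ; the sort is stable, so ties keep their original order.
    """
    def distance(combo: Sequence[int]) -> int:
        return sum(1 for a, b in zip(combo, best) if a != b)

    return sorted(combos, key=distance)


# Two categories: the first has 3 states, the second has 2.
combos = generate_orthogonal_combinations([3, 2])
ordered = reorder_combinations(combos, best=[2, 0])
print(combos)   # [[0, 0], [1, 0], [2, 0], [0, 1]]
print(ordered)  # [[2, 0], [0, 0], [1, 0], [0, 1]]
```

Varying one category at a time keeps the number of combinations at one plus the sum of (states - 1) over all categories, instead of the product of the state counts that a full grid would need.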
| 110 | + |
| 111 | +## **Results** |
| 112 | +With those objectives accomplished, RooFit now provides AD support for discrete profiling. |
| 113 | +However, the developed benchmark indicates that AD does not currently |
| 114 | +improve efficiency, as the gradient code generated by Clad introduces overhead. |
| 115 | +Further optimization in Clad is needed to achieve the potential performance |
| 116 | +gains for RooFit likelihood scans. More information regarding the issue can be |
| 117 | +found at [#1521](https://github.com/vgvassilev/clad/issues/1521). |
| 118 | + |
| 119 | +## **Conclusions** |
| 120 | +Thanks to this project, RooFit now supports AD for discrete profiling, the |
| 121 | +main minimization algorithm in Combine. Once the current overhead in Clad is |
| 122 | +addressed, this could allow for faster and more efficient likelihood scans while |
| 123 | +maintaining accurate optimization of both discrete and continuous parameters. |
| 124 | + |
| 125 | +## **Future work** |
| 126 | +- Further benchmarking is required to quantify the potential performance |
| 127 | + gains from automatic differentiation. |
| 128 | + |
| 129 | +- Additional optimization of Clad is needed to eliminate unnecessary overhead |
| 130 | + in gradient generation. |
| 131 | + |
| 132 | +- The discrete profiling logic implemented in `RooMinimizer` should be tested across |
| 133 | + different models to evaluate the minimizer’s behavior and robustness. |
| 134 | + |
| 135 | +## **Acknowledgements** |
| 136 | +I would like to express my sincere gratitude to the CERN Summer School for |
| 137 | +the opportunity to participate in such an inspiring project. I extend |
| 138 | +special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for |
| 139 | +their invaluable guidance and for providing continuous |
| 140 | +learning opportunities throughout this journey. I am also grateful to |
| 141 | +the ROOT team for welcoming me and supporting me throughout my stay at |
| 142 | +CERN. |
| 143 | + |
| 144 | + |
| 145 | + |
| 146 | + |
| 147 | +## **Related Links** |
| 149 | +- [CMS Combine GitHub page](https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/) |
| 150 | +- [ROOT official repository](https://github.com/root-project/root) |
| 151 | +- [My GitHub profile](https://github.com/GalinBistrev2) |
| 152 | + |
| 153 | + |
| 154 | + |