Clarification on integral caching in the new RooFit batchmode

elusian · August 22, 2022, 7:54pm

Hello

I’m implementing a custom pdf in RooFit and I’m trying to expose an analytical integral. My pdf is a wrapper around another pdf (think RooEffProd) and has an analytical integral if the underlying pdf has one.

In March, however, there was PR #10100 in RooFit (now in ROOT 6.26), whose description says that “the caching of normalization integrals doesn’t work with the new RooFit batch mode”.

What exactly can and cannot be done with integrals in the new batchmode?
Does this mean that other classes with integral caching (RooRealSumPdf for example) won’t work properly with the new batchmode?

Thank you in advance.

eguiraud · August 23, 2022, 11:28am

Hi @elusian ,

this is a question for @jonas , let’s ping him.

Cheers,
Enrico

elusian · September 1, 2022, 3:40pm

Hello, sorry for the ping, but is there any news?

eguiraud · September 8, 2022, 10:41am

Hi @elusian ,

thank you for the ping, we still need @jonas here. Maybe @moneta can also help.

Cheers,
Enrico

jonas · September 8, 2022, 10:50am

Hi @elusian,

sorry for the late reply!

About the RooEffProd:

The problem in the RooEffProd was that it used an additional member variable, _fixedNormSet, to fix the normalization for the input PDF when the RooEffProd is integrated. This is not how the integration of PDFs is usually handled in RooFit (see the source code of RooAbsPdf::getValV() if you are interested). The code that I removed from the RooEffProd in the linked PR was therefore redundant, and in the BatchMode it even interfered with how the PDFs are integrated and gave wrong results. That’s why I removed it.

About integrals in the new RooFit BatchMode:

At the interface level, nothing has changed in the BatchMode for integrals. However, the RooFit internal call stack in the implementation of the PDF evaluation looks very different. Hacks that introduce additional state to the PDF, like the one in the RooEffProd, were therefore problematic, because the result can then depend on the order in which the integral and the actual PDF evaluation are called. As for caching in particular: in the new BatchMode the integral object for a given PDF and a given normalization set is cached like before, just in a different place. This should not be relevant for users and developers of PDFs.
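To illustrate why extra state makes the result depend on call order, here is a minimal standalone sketch. It does not use RooFit and is not the actual RooEffProd code: `Shape`, `StatefulWrapper`, `StatelessWrapper`, and `fixNormRange` are hypothetical names, and a single upper bound stands in for a normalization set.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an unnormalized shape: f(x) = exp(-x) on [0, upper].
struct Shape {
    double value(double x) const { return std::exp(-x); }
    // Analytical integral of exp(-x) over [0, upper].
    double integral(double upper) const { return 1.0 - std::exp(-upper); }
};

// Problematic pattern: the normalization range is hidden, mutable state
// that some other call is expected to have set beforehand.
class StatefulWrapper {
public:
    void fixNormRange(double upper) const { fixedUpper_ = upper; }
    double normalizedValue(double x) const {
        return shape_.value(x) / shape_.integral(fixedUpper_);
    }
private:
    Shape shape_;
    mutable double fixedUpper_ = 1.0;  // whatever the last caller left here
};

// Stateless pattern: the normalization range is an explicit argument,
// so the result cannot depend on what was called before.
class StatelessWrapper {
public:
    double normalizedValue(double x, double upper) const {
        assert(upper > 0.0);
        return shape_.value(x) / shape_.integral(upper);
    }
private:
    Shape shape_;
};
```

With `StatefulWrapper`, calling `normalizedValue(0.5)` before and after `fixNormRange(5.0)` gives different numbers for the same `x`. A different internal call order, as in the BatchMode, can therefore change the result.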

On your plans to expose the analytical integral:

Just look at how it is done in any of the PDFs covered by the unit tests, like the ones in stressRooFit. You can look at the RooGaussian, for example. It is usually enough to override getAnalyticalIntegral and analyticalIntegral to support analytical integration in your RooAbsPdf-derived class.
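As a rough sketch of the two-step pattern behind getAnalyticalIntegral and analyticalIntegral, here is a standalone toy that compiles without ROOT. It is an assumption-laden simplification and not the RooFit API: `ToyGaussian` is a hypothetical class, and the real methods work with RooArgSet arguments and a range name rather than a boolean and explicit bounds.

```cpp
#include <cassert>
#include <cmath>

// Simplified stand-in for a Gaussian PDF in one observable x.
// NOT the RooFit API: signatures are reduced to plain C++ types.
class ToyGaussian {
public:
    ToyGaussian(double mean, double sigma) : mean_(mean), sigma_(sigma) {
        assert(sigma > 0.0);
    }

    // Unnormalized shape, analogous to evaluate().
    double evaluate(double x) const {
        const double z = (x - mean_) / sigma_;
        return std::exp(-0.5 * z * z);
    }

    // Step 1: advertise which integrals are available analytically.
    // Returns a nonzero code if integration over x is supported, 0 otherwise.
    int getAnalyticalIntegral(bool integrateOverX) const {
        return integrateOverX ? 1 : 0;
    }

    // Step 2: compute the integral identified by the code over [lo, hi].
    double analyticalIntegral(int code, double lo, double hi) const {
        assert(code == 1);
        const double root2 = std::sqrt(2.0);
        const double rootPiBy2 = std::sqrt(std::acos(-1.0) / 2.0);
        return sigma_ * rootPiBy2 *
               (std::erf((hi - mean_) / (root2 * sigma_)) -
                std::erf((lo - mean_) / (root2 * sigma_)));
    }

private:
    double mean_;
    double sigma_;
};
```

The caller first asks for a code and only calls `analyticalIntegral` when the code is nonzero. A code of 0 means "fall back to numeric integration".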

I hope this helps a bit, let me know which parts I should further clarify!

Jonas

elusian · September 15, 2022, 3:03pm

Hello @jonas , thank you for your answer and the explanation.

The problem with the analytical integral is that this function is a “composed” one, like the ones in RooFitCore, not like the “concrete” ones in RooFit, so looking at simple integrals like the one in RooGaussian won’t help. It’s essentially a specialized RooProduct, because using RooProduct in my model would cause the evaluation of a numeric integral for each event.

If the problem was _fixedNormSet and not the caching of integrals I’ll try to use it.

Thank you again!

jonas · September 15, 2022, 3:21pm

Yes, there is nothing special to consider with integrals for the BatchMode. There was just something fishy with the RooEffProd logic that worked in the non-BatchMode more by chance.

Another thing you can do to avoid the normalization integrals for PDFs is to override RooAbsPdf::selfNormalized():

https://github.com/root-project/root/blob/master/roofit/roofitcore/inc/RooAbsPdf.h#L253

    }
    virtual double getNorm(const RooArgSet* set=nullptr) const ;

    virtual void resetErrorCounters(Int_t resetValue=10) ;
    void setTraceCounter(Int_t value, bool allNodes=false) ;

    double analyticalIntegralWN(Int_t code, const RooArgSet* normSet, const char* rangeName=nullptr) const override ;

    /// Shows if a PDF is self-normalized, which means that no attempt is made to add a normalization term.
    /// Always returns false, unless a PDF overrides this function.
    virtual bool selfNormalized() const {
      return false ;
    }

    // Support for extended maximum likelihood, switched off by default
    enum ExtendMode { CanNotBeExtended, CanBeExtended, MustBeExtended } ;
    /// Returns ability of PDF to provide extended likelihood terms. Possible
    /// answers are in the enumerator RooAbsPdf::ExtendMode.
    /// This default implementation always returns CanNotBeExtended.
    virtual ExtendMode extendMode() const { return CanNotBeExtended; }
    /// If true, PDF can provide extended likelihood term.
Then, you just need to make sure that evaluate() returns something that is normalized, consistent with the normalization set and range you will use for fitting. If you don’t need the general integral caching mechanism that can cache the integrals over any subset of PDF inputs because you’ll have the same normset anyway, then I would go for a selfNormalized custom PDF.
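A minimal standalone sketch of what "self-normalized" means in practice, again without ROOT. `ToySelfNormalizedExp` and `numericIntegral` are hypothetical names, and only `selfNormalized()` and `evaluate()` mirror names from RooAbsPdf. The toy assumes a single observable and a fixed fit range [lo, hi].

```cpp
#include <cassert>
#include <cmath>

// Hypothetical toy PDF: exp(-slope * x), normalized on a fixed range [lo, hi].
class ToySelfNormalizedExp {
public:
    ToySelfNormalizedExp(double slope, double lo, double hi)
        : slope_(slope), lo_(lo), hi_(hi) {
        assert(slope > 0.0 && hi > lo);
    }

    // Mirrors the idea of RooAbsPdf::selfNormalized(): signal that no
    // additional normalization term should be applied by the caller.
    bool selfNormalized() const { return true; }

    // Returns a value that already integrates to 1 over [lo, hi].
    double evaluate(double x) const {
        const double norm =
            (std::exp(-slope_ * lo_) - std::exp(-slope_ * hi_)) / slope_;
        return std::exp(-slope_ * x) / norm;
    }

    double lo() const { return lo_; }
    double hi() const { return hi_; }

private:
    double slope_, lo_, hi_;
};

// Midpoint-rule cross-check that a PDF integrates to 1 over its own range.
template <typename Pdf>
double numericIntegral(const Pdf& pdf, int steps = 100000) {
    const double width = (pdf.hi() - pdf.lo()) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        sum += pdf.evaluate(pdf.lo() + (i + 0.5) * width);
    }
    return sum * width;
}
```

A numeric cross-check like `numericIntegral` is useful while developing such a PDF, because nothing else will catch an `evaluate()` that is normalized over a different range than the one used in the fit.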

system · September 29, 2022, 3:22pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.