Meaning of [Auto,Dirty] for RooAddPdf/RooHistPdf

Nabil · June 11, 2013, 3:03pm

Dear experts,
Does anyone know the meaning of the word "Dirty" that I get when I call Print("v") to see the contents of a RooAddPdf?
I could not find any hints about it in the RooFit documentation.
Thanks.

--- RooAbsArg ---
Value State: DIRTY
Shape State: DIRTY
Attributes:
Address: 0xbfd76ad8
Clients:
Servers:
(0xbfd76258,V-) RooHistPdf::fRHpdfS "signal PDF"
(0xbfd76698,V-) RooHistPdf::fRHPdfB "Background PDF"
(0xbfd75f78,V-) RooRealVar::fS "signal fraction"
Proxies:
!refCoefNorm ->
!pdfs ->
1) fRHpdfS
2) fRHPdfB
!coefficients ->
1) fS
--- RooAbsReal ---

Plot label is "fRAPdfSB"
--- RooAbsPdf ---
Cached value = 0

[#1] INFO:Plotting -- RooAbsPdf::plotOn(fRAPdfSB) indirectly selected PDF components: ()
0xbfccb3a8 RooAddPdf::fRAPdfSB = 0.000763515 [Auto,Dirty]
0xbfccab28/V- RooHistPdf::fRHpdfS = 0.000222916 [Auto,Dirty]
0xbfcca568/V- RooRealVar::mass = 400
0xbfccaf68/V- RooHistPdf::fRHPdfB = 0.000993153 [Auto,Dirty]
0xbfcca568/V- RooRealVar::mass = 400
0xbfcca848/V- RooRealVar::fS = 0.298139 +/- (-0.00911388,0.00910802)

Wouter_Verkerke · June 13, 2013, 9:44am

Hi Nabil,

This is indeed largely undocumented and relates mostly to the internal workings of RooFit's
calculation strategies. The first keyword (Auto in your case) defines the operation
mode for calculations for a given object; the second keyword (Dirty in your case) indicates the status
of the value cache.

In operation mode Auto (the default), a caching and lazy evaluation strategy is followed to calculate
the value of the object: once a value has been calculated (valid at the current parameter/observable values),
the result is stored in a cache, and the same value is returned in subsequent calculations as long
as the variable values have not changed. In that case the cache status is reported as ‘Clean’. Once a parameter
or observable of the function changes, it notifies the pdf, which sets its cache status to ‘Dirty’.
This signals that the value needs to be recalculated on the next call to getVal(); no upfront calculation happens when a variable changes.
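
The Auto behaviour described above can be illustrated with a small toy model in plain Python. This is only a sketch of the general "dirty flag plus lazy evaluation" idea, not RooFit's actual implementation: the class names `Variable` and `CachedFunction`, the `mark_dirty()` method and the `n_evaluations` counter are all hypothetical and exist only for this illustration.

```python
class Variable:
    """Toy stand-in for a parameter or observable (hypothetical, not a RooFit class)."""

    def __init__(self, value):
        self._value = value
        self._clients = []  # objects whose value depends on this variable

    def add_client(self, client):
        self._clients.append(client)

    def get(self):
        return self._value

    def set(self, value):
        # Only a real change of value invalidates the clients' caches.
        if value != self._value:
            self._value = value
            for client in self._clients:
                client.mark_dirty()


class CachedFunction:
    """Toy function object that caches its value and recalculates lazily."""

    def __init__(self, func, servers):
        self._func = func
        self._servers = servers
        self._cache = None
        self.dirty = True        # nothing has been calculated yet
        self.n_evaluations = 0   # counts real calculations, for illustration
        for server in servers:
            server.add_client(self)

    def mark_dirty(self):
        # Just flag the cache as stale; no calculation happens here.
        self.dirty = True

    def get_val(self):
        if self.dirty:
            self._cache = self._func(*[s.get() for s in self._servers])
            self.n_evaluations += 1
            self.dirty = False   # cache is now 'Clean'
        return self._cache
```

With `x = Variable(2.0)`, `a = Variable(3.0)` and `f = CachedFunction(lambda x, a: a * x, [x, a])`, repeated calls to `f.get_val()` trigger a single calculation; calling `a.set(4.0)` only flips `f.dirty` back to `True`, and the recalculation is deferred until the next `f.get_val()`.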

There are two other modes of operation, which apply only to pdf objects inside a likelihood object.
The first one is ADirty (for always dirty). In this mode no tracking of variable changes is done and the
value is explicitly recalculated every time. This makes sense for any function of the observables
x inside a likelihood, as each observable x is a priori known to change every time inside a likelihood calculation
(so any value-tracking algorithm just introduces overhead in this mode).

The other mode is AClean (for always clean). Objects in this mode never recalculate themselves. Inside
an optimized likelihood object, all functions that depend only on observables and constant parameters have this status.
The value of such an object is calculated once, when the likelihood object is calculated, and stored in a
‘cache dataset’ inside the likelihood, since it will never change. When an event is retrieved from the dataset
in the likelihood calculation loop, the cached values corresponding to the loaded event are written directly
into the value caches of these objects, so there is never a need to call evaluate() on them during likelihood
evaluations. Inside a typical likelihood all pdfs are either ADirty or AClean, depending
on the const-ness of their parameters, while their normalization integral objects will typically all still be
in Auto mode.
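
As a rough illustration of how the two modes work together, here is a toy likelihood in plain Python. It is a sketch under simplifying assumptions, not RooFit code: `ToyNll`, `gauss`, `flat` and the two counters are hypothetical names. The two components depend only on the observable x and constant parameters, so they play the AClean role and are calculated once per event into a cache dataset; the sum depends on the floating fraction and plays the ADirty role, recalculated for every event without any change tracking.

```python
import math


def gauss(x, mean=0.0, sigma=1.0):
    """Unit-normalized Gaussian with constant parameters (toy signal shape)."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def flat(x, lo=-5.0, hi=5.0):
    """Flat density on [lo, hi] (toy background shape)."""
    return 1.0 / (hi - lo)


class ToyNll:
    """Toy negative log-likelihood for frac * gauss(x) + (1 - frac) * flat(x)."""

    def __init__(self, data):
        self.n_component_evals = 0  # evaluations of the 'AClean' components
        self.n_sum_evals = 0        # evaluations of the 'ADirty' sum
        # 'AClean' role: evaluate each component once per event and keep
        # the results in a cache dataset, since they can never change.
        self.cache_dataset = [
            (self._evaluate(gauss, x), self._evaluate(flat, x)) for x in data
        ]

    def _evaluate(self, func, x):
        self.n_component_evals += 1
        return func(x)

    def __call__(self, frac):
        nll = 0.0
        for sig, bkg in self.cache_dataset:
            # 'ADirty' role: always recalculated, no change tracking at all.
            self.n_sum_evals += 1
            nll -= math.log(frac * sig + (1.0 - frac) * bkg)
        return nll
```

Calling `nll = ToyNll(data)` and then `nll(0.3)`, `nll(0.6)` for a list of events `data` evaluates the components only during construction; each later call only loops over the cached values and recomputes the sum.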

Hope that clarifies things a bit…

Wouter

Nabil · June 13, 2013, 10:09am

Wouter,
Many thanks for the clarification. I understand that I should not pay attention to these for my final result.
Thanks again.