Adding "PDFInterpol=logspline" to TMVA::Factory?

Since I do not know the underlying distribution of a TMVA response, I have been considering fitting the PDF histograms of the response with PDFInterpol=KDE instead of one of the PDFInterpol=Spline{0-5} options. However, one of my colleagues has argued that the KDE would produce tails that are too small when one considers the Rarity/CDF of the fitted PDF. This dilemma appears to have been raised elsewhere, where it was suggested that a logspline could yield the desired behavior of bigger tails. Figure 2 in these lecture notes has raised my hopes that a logspline fit to the histogram PDFs may be worth pursuing.
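To make the tail concern concrete, here is a minimal Python sketch of the CDF of a kernel density estimate. It assumes a Gaussian kernel with a single fixed bandwidth, which is not necessarily what TMVA does internally; the function names and the bandwidth argument are made up for illustration.

```python
import math

def gaussian_kde_cdf(sample, y, bandwidth):
    """CDF at y of a fixed-bandwidth Gaussian KDE built from `sample`.

    The KDE is an equal-weight mixture of normals centred on the sample
    points, so its CDF is the mean of the individual normal CDFs.
    """
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((y - x) / scale)) for x in sample) / len(sample)

def gaussian_kde_upper_tail(sample, y, bandwidth):
    """Probability mass above y; erfc avoids cancellation far in the tail."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * math.erfc((y - x) / scale) for x in sample) / len(sample)
```

Under these assumptions, the mass beyond the largest sample value falls off like a Gaussian of width equal to the bandwidth, so a small bandwidth gives very little probability outside the observed range. That would be one way to understand the objection about small tails.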

Has this been considered before? Could it be implemented?

@moneta @swunsch could you help here please? Thanks in advance!

As an interim solution, I looked into rebuilding ROOT with the R bindings enabled so that I could use R’s logspline library. My plan was to read the evaluated response values from the TrainTree in the file written for the single TMVA::Factory object that I create, collect the values with classID==0 into a vector “x”, and feed that vector into R to obtain a fit using fit <- logspline(x).
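The selection step described above could look roughly like the following Python sketch. It does not read a ROOT file: the rows are hypothetical (classID, response) pairs assumed to be already in memory, and both function names are invented for illustration.

```python
def select_response(rows, class_id=0):
    """Return the response values of the rows whose classID equals class_id."""
    return [response for cid, response in rows if cid == class_id]

def to_r_input(values):
    """Format values one per line, a layout that R's scan() can read."""
    return "\n".join(repr(float(v)) for v in values) + "\n"

# Hypothetical (classID, response) pairs standing in for TrainTree entries.
rows = [(0, 0.12), (1, -0.40), (0, 0.31)]
x = select_response(rows)
text = to_r_input(x)
```

Writing `text` to a file would let the R side do something like x <- scan("response.txt") followed by fit <- logspline(x), without needing the ROOT-R bindings at all.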

I found that this method won’t work, though, because the TrainTree object within the file seems to store the response values, their probabilities, their classID, and their className only for the first method booked. With a single Factory object I book a method of type kFisher for each of several TMVA::DataLoader objects, to try different combinations of signal and background samples while using the same overall data sample for each DataLoader object. A TDirectoryFile object is created for each training within the Method_Fisher TDirectoryFile object, though. Could additional leaves for each training be added to the TrainTree and TestTree?