Code to improve RooKeyPDF for large N

Dear all

I used RooKeysPdf in a slightly unusual situation, where the large
spread of event weights neccesitated the used of an AKE (rather than
a plain histogram) even when we have 10^5 to 10^6 events.
Unfortunately the CPU time of RooKeysPdf scales as o(N^2) where N is
the number of events.

However, RooKeysPDF is hardcoded to target an resolution on the
x-axis of 1000 points. I modified it to use a similar trick in
calculating the bandwidth parameters (_weights), which dramatically
improved the execution times in our use case from >1 hours to about
3 minutes. The basic scaling for large N is now o(N).

The code is available in:

I hope the maintainers will consider including a patch along this
lines into ROOT.


  • I did not optimize the switch between the two algorithms.
  • Need to check the interplay with mirroring - should the switch
    be based on _nEvents or on data.numEntries()?

Amnon Harel,
University of Rochester