Dear all

I used RooKeysPdf in a slightly unusual situation, where the large

spread of event weights neccesitated the used of an AKE (rather than

a plain histogram) even when we have 10^5 to 10^6 events.

Unfortunately the CPU time of RooKeysPdf scales as o(N^2) where N is

the number of events.

However, RooKeysPDF is hardcoded to target an resolution on the

x-axis of 1000 points. I modified it to use a similar trick in

calculating the bandwidth parameters (_weights), which dramatically

improved the execution times in our use case from >1 hours to about

3 minutes. The basic scaling for large N is now o(N).

The code is available in:

www-d0.fnal.gov/~aharel/AKE.h

www-d0.fnal.gov/~aharel/AKE.c

I hope the maintainers will consider including a patch along this

lines into ROOT.

Caveats:

- I did not optimize the switch between the two algorithms.
- Need to check the interplay with mirroring - should the switch

be based on _nEvents or on data.numEntries()?

cheers,

Amnon Harel,

University of Rochester