I noticed that my Python script using ROOT libraries runs much slower on a Mac (10.16.7, Catalina) than on RedHat7.
I found that it is connected with using the ROOT.Math library. I wrote a simple script to test it.
    import sys, ROOT, math
    from ROOT import gBenchmark
    for n in range(50000):
        # uncomment one of the two lines below for each measurement
        # 1 sqrt3=math.sqrt(2.)
        # 2 sqrt2=ROOT.Math.sqrt(2.)
        if n%10000 ==0:
            print('n # ',n)
There are two statements in the loop, and each of them just calculates sqrt(2.).
I made 3 timing measurements using gBenchmark.
1. Without both of them: 0.01 seconds.
2. Only sqrt3=math.sqrt(2.): very fast, ~0.01 seconds.
3. Only sqrt2=ROOT.Math.sqrt(2.): very slow, ~12 seconds, 1200 times slower than #2!
The result doesn’t change much whether I install ROOT from Homebrew, MacPorts, or directly from the ROOT website.
I have an old Mac laptop with ROOT 6.24/00 and Python 2.7.16, and there the same script runs very fast.
I contacted some IT experts. They couldn’t tell me much, except that there is definitely some problem.
Random guess, but as I expect math.sqrt and ROOT.Math.sqrt to both be optimized to death, I think what you might be measuring is the overhead of calling C++ functions via PyROOT compared to calling standard Python C extension modules (I don’t know why that might be slower on Mac than on RedHat).
Sorry for typo.
Yes, this helps, 0.06 seconds instead of 13 seconds.
This is an interesting suggestion and observation.
It was just an example. In the real program I am using other packages.
Do I need to import all of them one by one?
@valkuba Yes, in old ROOT.py, lookup results on the facade were cached, whereas in “new” PyROOT, successful top-level lookups no longer are, hence the difference you observe between those two ROOT versions. I.e. it’s a choice (bug) in the new approach, not a Mac-thingy (or even Python-thingy).
To get the old speed back, you only ever need to do ‘from ROOT import Math’ however, since ‘Math’ is bound by cppyy, which does cache successful lookups, so Math.sqrt will be fast (even as, yes, just sqrt will be even faster, but the same is true for use of sqrt from math.sqrt).
(Aside, on my Linux box, using **0.5 outperforms all and cppyy.gbl.std.sqrt has an additional slowdown of 25% b/c it’s a templated function, which results in an extra internal lookup.)
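For anyone who wants to reproduce these comparisons, here is a rough sketch using timeit, assuming Python 3. The `bench` helper and the `results` dict are names made up for this example, not part of ROOT or cppyy, and the PyROOT cases are simply skipped if ROOT cannot be imported. Absolute numbers will differ between machines and versions, so treat the output as relative only.

```python
import math
import timeit

N = 50000  # same loop count as the test script in the first post


def bench(label, stmt, namespace):
    """Time N evaluations of stmt and print the total in seconds."""
    seconds = timeit.timeit(stmt, globals=namespace, number=N)
    print('%-20s %.4f s' % (label, seconds))
    return seconds


results = {}
ns = {'math': math, 'x': 2.0}

results['math.sqrt'] = bench('math.sqrt(x)', 'math.sqrt(x)', ns)
# x is a variable so the compiler cannot fold the power into a constant
results['x**0.5'] = bench('x**0.5', 'x**0.5', ns)

try:
    import ROOT
    from ROOT import Math
except ImportError:
    ROOT = None  # ROOT is not installed, skip the PyROOT cases

if ROOT is not None:
    ns.update(ROOT=ROOT, Math=Math)
    # 'Math' is looked up on the ROOT module on every call
    results['ROOT.Math.sqrt'] = bench('ROOT.Math.sqrt(x)', 'ROOT.Math.sqrt(x)', ns)
    # 'Math' was bound once by the import, only the sqrt lookup remains
    results['Math.sqrt'] = bench('Math.sqrt(x)', 'Math.sqrt(x)', ns)
```

The first call of each PyROOT case includes one-time setup cost, so running the script twice or raising N gives a fairer picture.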
The remaining performance differences between Math.sqrt and math.sqrt you’re left with are:
- wrapper generation for cppyy on first call
- Math.sqrt being an overloaded function whereas math.sqrt only works on double
- For python3: math.sqrt benefiting from an optimized call API, which isn’t available (in alternate form) to cppyy until python3.8 and only actually used as of cppyy 2.1.0.
And yes, sqrt is my favorite function when trying to understand call overhead.
You mean to make old PyROOT equally slow by removing the caching? Personally, I would leave old ROOT.py well alone and instead opt to add caching to the new ROOT/_facade.py, as a way to equalize old and new behavior. Two orders of magnitude for a common use case is nothing to sneeze at and fixing ROOT/_facade.py is certainly easier than asking folks all over to change a decade and a half worth of legacy codes. Besides, recommendations in preference of from X import Y over import X; X.Y have varied over time. Personally, I used to like the former but now prefer the latter b/c of Jupyter notebooks.
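To illustrate what caching on a facade means, here is a toy sketch in plain Python. It is not the actual ROOT/_facade.py code: `SlowFacade`, `CachingFacade` and `expensive_lookup` are hypothetical names, and the "expensive" lookup is just a stand-in that forwards to the math module. The only point is that storing a successful lookup as an attribute means `__getattr__` is not called again for that name.

```python
import math
import types


def expensive_lookup(attr):
    # Hypothetical stand-in for a costly C++ name lookup
    return getattr(math, attr)


class SlowFacade(types.ModuleType):
    """Toy module facade that repeats the lookup on every access."""

    def __init__(self, name, lookup):
        super().__init__(name)
        self._lookup = lookup
        self.lookups = 0  # counts how often the lookup was performed

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails
        if attr.startswith('_'):
            raise AttributeError(attr)
        self.lookups += 1
        return self._lookup(attr)


class CachingFacade(SlowFacade):
    """Same facade, but successful lookups are stored on the module."""

    def __getattr__(self, attr):
        value = super().__getattr__(attr)
        setattr(self, attr, value)  # next access bypasses __getattr__
        return value


slow = SlowFacade('slow', expensive_lookup)
fast = CachingFacade('fast', expensive_lookup)
for _ in range(1000):
    slow.sqrt(2.0)
    fast.sqrt(2.0)

print(slow.lookups, fast.lookups)  # 1000 1
```

Failed lookups raise AttributeError before the setattr, so only successful ones end up cached.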