Just to start with, I'm not able to reproduce the LAPACK issue with just `import ROOT`. But I have evidence that there are other things, which could be `from ROOT import X`, or maybe `import ROOT.Y`, that load additional shared libraries.
We're using ROOT 6.28-something right now. This error occurs (only, so far!) in an Apptainer build (Ubuntu 22) which includes Python 3.10.12, ROOT 6.28, NumPy 2.0.2, SciPy 1.14.1, and of course lots of low-level stuff. NumPy, SciPy, etc. are all installed with `pip`.
The ROOT libraries are installed in `/usr/local/lib`, and I see `/usr/local/lib/ROOT/`, but I don't see submodules etc., so this is where my understanding stops.
I understand. In this way, it’s analogous to G4python. All the more reason to want something written down somewhere that says, “If you want the RooFit stuff, use this import command.” “If you want RDataFrame, use this.” And so on.
Oh, I think I see your point. The `from ROOT import X` form can use any individual ROOT class (like TTree, TChain) as `X`. You're quite right that re-documenting all of that is crazy. I'm more interested in the mapping between submodules and ROOT library names. For example, to get RDataFrame one uses `import ROOT.RDF`, I think, but the library is `libROOTDataFrame.so`.
Where would I find a list, table, whatever, of the possible `import ROOT.X` submodules?
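While waiting for official documentation, one partial answer is to enumerate the package's real on-disk submodules programmatically. This is a sketch using the standard-library `pkgutil` module; it's demonstrated here on a stdlib package since ROOT may not be importable everywhere, and note that it only sees real submodules under the package's `__path__`, not attributes ROOT injects lazily via cppyy.

```python
import pkgutil

def list_submodules(package):
    """Enumerate the submodules a package exposes on disk.

    Caveat: this only finds real modules under package.__path__;
    PyROOT also creates many attributes lazily, so treat the
    result as a starting point, not a complete list.
    """
    return sorted(m.name for m in pkgutil.iter_modules(package.__path__))

# Demonstrated on a stdlib package; for ROOT you would do:
#   import ROOT; print(list_submodules(ROOT))
import email
print(list_submodules(email))
```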
```
Traceback (most recent call last):
  File "/scratch/user/kelsey/CATs-LAPACK_problem/./fitter_test.py", line 68, in <module>
    params, _ = curve_fit(TESshape, bins[fitStart:fitEnd], trace[fitStart:fitEnd],
  File "/usr/local/lib/python3.10/dist-packages/scipy/optimize/_minpack_py.py", line 1033, in curve_fit
    _, s, VT = svd(res.jac, full_matrices=False)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/_decomp_svd.py", line 156, in svd
    lwork = _compute_lwork(gesXd_lwork, a1.shape[0], a1.shape[1],
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/lapack.py", line 1011, in _compute_lwork
    return _check_work_float(ret[0].real, dtype, int_dtype)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/lapack.py", line 1031, in _check_work_float
    raise ValueError("Too large work array required -- computation "
ValueError: Too large work array required -- computation cannot be performed with standard 32-bit LAPACK.
```
You can see that the actual error is deep inside SciPy. I trigger it with `curve_fit()`, but searching StackOverflow etc. shows that the ValueError can be raised by any number of fitting and other routines.
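For context, here is the shape of the call that hits that code path. `TESshape` is SuperCDMS-specific, so this sketch substitutes a hypothetical exponential-decay model on synthetic data; `curve_fit` computes an SVD of the Jacobian internally, which is exactly where the 32-bit LAPACK work-array check lives.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical stand-in for TESshape: a simple exponential decay.
def pulse(t, amp, tau):
    return amp * np.exp(-t / tau)

t = np.linspace(0.0, 10.0, 200)
trace = pulse(t, 2.0, 3.0)      # noise-free synthetic "trace"

# curve_fit calls scipy.linalg.svd on the Jacobian under the hood,
# the same path that raises the 32-bit LAPACK ValueError.
params, _ = curve_fit(pulse, t, trace, p0=(1.0, 1.0))
print(params)
```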
Reproducibility is "easy", but is currently SuperCDMS-specific. We have a Python-only superclass of RDataFrame, which is part of an analysis-tools package called "CATs" (CDMS Analysis Tools). The latter doesn't use TensorFlow, PyTorch, etc. If I `import cats` (anything from it, not specifically CDataFrame) before my two lines

```python
import numpy
from scipy.optimize import curve_fit
```

then I get the 32-bit LAPACK complaint above. If I don't import our CATs module at all, my fit works just as expected. If I import CATs after SciPy, my fit also works as expected.
I have used a find/grep pipe to collect every `import` and `from ... import` line from the CATs package. There are low-level Python things, there's NumPy, and there are ROOT-related things:
```python
from ROOT import gInterpreter, TChain, TTree
from ROOT import RDataFrame, TChain, TTree
from ROOT import Numba
from ROOT.RDF import AsRNode
import ROOT
```
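For anyone wanting to do the same survey on their own package, this is roughly the grep pipe I mean. The `cats` path is an assumption; point `PKG_DIR` at your actual checkout.

```shell
# Collect the unique import lines from a package source tree.
# "cats" is an assumed path -- set PKG_DIR to your real checkout.
PKG_DIR="${PKG_DIR:-cats}"
grep -rhE '^[[:space:]]*(import|from)[[:space:]]' "$PKG_DIR" --include='*.py' \
  | sed 's/^[[:space:]]*//' | sort -u
```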
What I’m struggling with now is that if I try any one of those individual ROOT-related imports, I cannot trigger the 32-bit LAPACK complaint.
So now I'm trying to identify the relevant `.so` libraries by using `lsof -p` to collect and diff the lists of what's been loaded. This is why I would really like a mapping between what I can import from ROOT in Python and which shared library each import triggers. I can get that mapping myself, provided I have documentation for "what I can import from ROOT". Hence my posted question.
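As an alternative to running `lsof -p` from outside, the same diff can be taken from inside the Python process itself. This is a Linux-only sketch that parses `/proc/self/maps` (the kernel's list of everything the dynamic loader has mapped), so it catches libraries pulled in as side effects of an import:

```python
def loaded_shared_libs():
    """Return the set of shared-object paths mapped into this process.

    Linux-only sketch: parses /proc/self/maps.  Each line ends with a
    pathname when the mapping is file-backed; we keep the .so entries.
    """
    libs = set()
    with open("/proc/self/maps") as maps:
        for line in maps:
            fields = line.split()
            path = fields[-1] if len(fields) >= 6 else ""
            if ".so" in path:
                libs.add(path)
    return libs

before = loaded_shared_libs()
import ctypes                   # stand-in for "import ROOT.X" etc.
after = loaded_shared_libs()
print(sorted(after - before))   # shared libraries that import loaded
```

Snapshotting before and after each candidate ROOT import and diffing the sets should give the same information as the `lsof -p` comparison, one import at a time.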