Cppyy.gbl and ROOT namespace


ROOT Version: master with PyROOT experimental
Platform: Not Provided
Compiler: Not Provided


Hello,

I’m just trying experimental PyROOT (master) and see that probably all ROOT classes are in the cppyy.gbl namespace (len(dir(cppyy.gbl)) == 2332). There is also cppyy.gbl.ROOT, but it contains far fewer entries (len(dir(cppyy.gbl.ROOT)) == 92) and some of them look very strange (e.g., ‘Detai’, ‘Experimenta’, ‘Fi’, ‘Mat’ … ).

  1. What is the difference between
    from ROOT import <ROOT_class>
    and
    from cppyy.gbl import <ROOT_class>
    ? Is it roughly the same from a user’s point of view?

  2. What is the purpose of the “mini” cppyy.gbl.ROOT?

  3. I find the current state quite confusing (maybe I misunderstood something). I would expect from ROOT import <ROOT_class>
    being more or less equivalent to
    from cppyy.gbl.ROOT import <ROOT_class>
    (i.e., all ROOT classes being accessible only from one dedicated ROOT namespace in cppyy.gbl, similarly as, e.g., std has its own namespace: cppyy.gbl.std). Wouldn’t it then be easier to use cppyy in connection with non-ROOT C++ code (to avoid name conflicts and improve modularity/separation of different libraries)?

Cheers,
Jiri

Hi Jiri,

The python module ROOT is just a module facade to cppyy.gbl plus some extra setup that we need in ROOT, so yes, it is roughly the same.

cppyy.gbl.ROOT is a Python proxy of the C++ ROOT namespace, so it is the same as ROOT.ROOT.

We are now adding a shortcut for the ROOT namespace, so that you can do e.g. ROOT.Math instead of having to type ROOT.ROOT.Math, which would be the canonical way. It should be in master soon.
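The "module facade" idea can be sketched in plain Python, without ROOT or cppyy installed. This is only an illustration under stated assumptions: FakeGlobalNamespace stands in for cppyy.gbl, FacadeModule is a hypothetical name, and nothing here is taken from the actual PyROOT sources.

```python
import types


class FakeGlobalNamespace:
    """Hypothetical stand-in for cppyy.gbl with a couple of entries."""

    def __init__(self):
        self.TH1F = type("TH1F", (), {})
        # Nested object playing the role of the C++ ROOT:: namespace.
        self.ROOT = types.SimpleNamespace(Math=types.SimpleNamespace())


class FacadeModule(types.ModuleType):
    """Module whose unknown attributes are looked up in a backend namespace."""

    def __init__(self, name, backend):
        super().__init__(name)
        self._backend = backend

    def __getattr__(self, name):
        # Called only when the normal module attribute lookup fails.
        try:
            return getattr(self._backend, name)
        except AttributeError:
            raise AttributeError(
                f"module {self.__name__!r} has no attribute {name!r}"
            ) from None


gbl = FakeGlobalNamespace()
ROOT = FacadeModule("ROOT", gbl)
```

With this stand-in, ROOT.TH1F and gbl.TH1F are the same object, and ROOT.ROOT is the same object as gbl.ROOT, which mirrors the statement that cppyy.gbl.ROOT is the same as ROOT.ROOT.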

Hi,
besides the ROOT vs ROOT.ROOT issue pointed out by @etejedor above, something else that is probably a source of confusion here: cppyy.gbl (as well as cppyy.gbl.ROOT) is populated dynamically:

>>> import ROOT
>>> dir(ROOT)
['PyConfig', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'gROOT', 'keeppolling', 'module']
>>> ROOT.TH1F
<class 'ROOT.TH1F'>
>>> dir(ROOT)
['AddressOf', 'MakeNullPointer', 'PyConfig', 'PyGUIThread', 'SetMemoryPolicy', 'SetOwnership', 'SetSignalPolicy', 'TH1F', 'TROOT', 'Template', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'gInterpreter', 'gROOT', 'gSystem', 'kMemoryHeuristics', 'kMemoryStrict', 'kSignalFast', 'kSignalSafe', 'keeppolling', 'module', 'std']
>>> ROOT.TTree
<class 'ROOT.TTree'>
>>> dir(ROOT)
['AddressOf', 'MakeNullPointer', 'PyConfig', 'PyGUIThread', 'SetMemoryPolicy', 'SetOwnership', 'SetSignalPolicy', 'TH1F', 'TROOT', 'TTree', 'Template', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'gInterpreter', 'gROOT', 'gSystem', 'kMemoryHeuristics', 'kMemoryStrict', 'kSignalFast', 'kSignalSafe', 'keeppolling', 'module', 'std']

and similarly for cppyy.gbl. So size and contents of dir(module) are not super well defined.

(Populating the modules dynamically lowers startup time because PyROOT does not have to parse as much and create as many proxies until the user actually asks for them)
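A minimal sketch of this kind of on-demand population, in plain Python. LazyNamespace and its "created" log are hypothetical names used for illustration; the real proxies are built by cppyy/PyROOT in a different and far more involved way.

```python
class LazyNamespace:
    """Creates a proxy for a name on first access, then caches it."""

    def __init__(self, known_names):
        self._known = set(known_names)  # names that could be created
        self.created = []               # log of proxies actually built

    def __getattr__(self, name):
        # Reached only if 'name' is not yet in the instance __dict__.
        if name.startswith("_") or name not in self._known:
            raise AttributeError(name)
        proxy = type(name, (), {})      # stand-in for an expensive binding
        self.created.append(name)
        setattr(self, name, proxy)      # cache: later lookups skip __getattr__
        return proxy


ns = LazyNamespace(["TH1F", "TTree"])
before = "TH1F" in vars(ns)   # nothing has been created yet
first = ns.TH1F               # the proxy is built here
after = "TH1F" in vars(ns)    # and is now part of the namespace
```

Only the names that are actually requested pay the creation cost, which is the start-up argument made above.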

Cheers,
Enrico

dir() only shows the current contents by default. For the use of tab-completion etc., dir(cppyy.gbl) (and similarly on other namespaces) now shows a listing of all that is currently known, incl. from the class table, rootmap files, etc.

Many thanks for your useful explanations! There are, however, still some points which are not clear to me:

Hmm, I don’t fully understand this comment. As you wrote something like ROOT.ROOT.Math is for accessing the Python proxy of the C++ ROOT namespace (“non-pythonized” automatically generated python binding for C++ ROOT, if I understand it correctly). Shouldn’t it be different from “pythonized” ROOT.Math? Or is this related to the difficulties that some classes/enums/…/variables in ROOT are in C++ ROOT:: namespace (such as Math being in ROOT::) and some are not (and the shortcut is effectively an attempt to “remove” the ROOT:: namespace)?

  1. Small example:
$ python                                                                        
>>> import cppyy
>>> dir(cppyy.gbl) # very very long list of mostly ROOT names
[ ... ,'RooAbsAnaConvPdf', ...,'TH1F', ..., 'gApplication', ... , 'kInfo', ... , 'std', ...]
>>> len(dir(cppyy.gbl))
2332

a) It looks to me that cppyy.gbl is evidently “non-dynamically” populated, even if python module ROOT is not imported at all (at least all the ROOT specific names exist in cppyy.gbl).

b) The list of ROOT names in cppyy.gbl is really long, everything that is non-ROOT is “lost” there, and, above all, there may be name conflicts if cppyy is used for other C++ libraries. The current state is:

cppyy.gbl.*    # roughly equivalent to the python module ROOT
cppyy.gbl.ROOT # the Python proxy of the C++ ROOT namespace

What about having a ROOT namespace in cppyy.gbl that contains everything that is ROOT-specific, i.e.:

cppyy.gbl.ROOT      # roughly equivalent to the python module ROOT
cppyy.gbl.ROOT.ROOT # the Python proxy of the C++ ROOT namespace

It would allow ROOT to be better separated from other libraries used in connection with cppyy. In my opinion it would improve modularity and maintenance. What do you think?

Cheers,
Jiri

I can’t comment authoritatively on your other questions, but

Yes, precisely. Some classes/functions can only be accessed as ROOT.ROOT.*, for others ROOT.* is enough, and in the python world this distinction is awkward – in the C++ world, the distinction is that legacy classes are not in the ROOT:: namespace, while newer classes are.

It is – you (or your setup) must be doing some work at startup. My lxplus home is pretty vanilla, and:

ssh eguiraud@lxplus
[eguiraud@lxplus793 ~]$ source /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-centos7-gcc62-opt/setup.sh
[eguiraud@lxplus793 ~]$ python
>>> import cppyy
>>> dir(cppyy.gbl)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'std']
>>> cppyy.gbl.TH1D
<class 'ROOT.TH1D'>
>>> dir(cppyy.gbl)
['ROOT', 'TH1D', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'std']

Cheers,
Enrico

But that would make PyROOT users have to type ROOT.ROOT always, which is something we would like to avoid.

If you setup lcg, you’re probably pulling in the cppyy.py from standard ROOT, i.e. the old PyCintex.py.

No, cppyy.gbl does not get “non-dynamically” populated … dir() shows what is available for dynamic creation, not what is there. On p3, you can use object.__dir__(cppyy.gbl) to see the actual content.
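The difference between what dir() advertises and what is actually stored can be reproduced with a small stand-in class. AdvertisingNamespace is a hypothetical name; the sketch only assumes that the real namespace overrides __dir__ in a comparable way, which is what the remark about object.__dir__ suggests.

```python
class AdvertisingNamespace:
    """dir() lists everything that could be created, not just what exists."""

    def __init__(self, available):
        self._available = set(available)

    def __dir__(self):
        # Merge the real attributes with the names available on demand.
        return sorted(set(object.__dir__(self)) | self._available)


ns = AdvertisingNamespace(["TH1F", "TTree"])
ns.TH1F = object()            # pretend this proxy has already been created

advertised = dir(ns)          # goes through the __dir__ override
actual = object.__dir__(ns)   # bypasses the override
```

Here "TTree" appears in advertised but not in actual, while "TH1F" appears in both because it really exists on the object.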

Experimental PyROOT does not use vanilla cppyy, but builds it against ROOT, so by definition all ROOT stuff is there. The former also still carries some ROOT code (in the cppyy-cling backend), but that is diminishing.

Clashes in names on the Python side mean clashes at the linker level on the C++ side, so that’s not important for most cases as even the C++ code won’t run. Further, ROOT has this ‘T’ namespacing going on. What is much, much worse is the insistence of ROOT/meta to put all of std:: in the global namespace. That gives no end of grief due to clashes and name mismatches. It’s deeply hard-wired and even in cppyy-cling, I have not been able to excise it. But it will be my priority once v6.18 is out.

In PyROOT, in RootWrapper.cxx, there is a lookup in ROOT:: if a name can not be found. You can mimic that behavior by changing cppyy.gbl.__getattr__.
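As a rough illustration of such a fallback, here is a pure-Python stand-in. GlobalNamespace is a hypothetical class, not cppyy.gbl itself, and the sketch makes no claim about how attribute lookup is wired up inside cppyy or RootWrapper.cxx; it only shows the "try globally, then try ROOT::" order.

```python
import types


class GlobalNamespace:
    """Stand-in for a global namespace that falls back to a nested 'ROOT'."""

    def __getattr__(self, name):
        # Reached only when 'name' is not found directly on the instance.
        if name != "ROOT":
            nested = self.__dict__.get("ROOT")
            if nested is not None and hasattr(nested, name):
                return getattr(nested, name)
        raise AttributeError(f"{name!r} not found globally or in ROOT::")


gbl = GlobalNamespace()
gbl.TH1F = type("TH1F", (), {})                # "legacy" class, global scope
gbl.ROOT = types.SimpleNamespace(
    Math=types.SimpleNamespace(),              # lives in ROOT:: only
    RDataFrame=type("RDataFrame", (), {}),
)
```

With this in place, gbl.Math resolves to gbl.ROOT.Math, gbl.TH1F is found directly, and a name that exists in neither place still raises AttributeError.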

Ah thank you @wlav , of course my example was for non-experimental PyROOT.

So with experimental PyROOT the list is actually filled but the proxies corresponding to those names are not actually there until they are used.

Many thanks again for your comments (especially to @wlav). They were very helpful.

Two remaining points:

  1. As I mentioned in my first post, cppyy.gbl.ROOT (or ROOT.ROOT) now contains, e.g., ‘Math’ and ‘Mat’. However, only ROOT::Math namespace exists on C++ side, there is no ROOT::Mat namespace (see also ‘Detai’, ‘Experimenta’, ‘Fi’, ‘RD’ … in cppyy.gbl.ROOT):
$ ipython                                                          
In [1]: import cppyy                                                                                         
In [2]: cppyy.gbl.ROOT.Math                                                                                  
Out[2]: <namespace cppyy.gbl.ROOT.Math at 0x55a8ce2ffcb8>
In [3]: cppyy.gbl.ROOT.Mat  
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-1934dfdf2136> in <module>
----> 1 cppyy.gbl.ROOT.Mat

AttributeError: <namespace cppyy.gbl.ROOT at 0x55a8ce2f4858> has no attribute 'Mat'. Full details:
  type object 'ROOT' has no attribute 'Mat'
  'ROOT::Mat' is not a known C++ class
  'Mat' is not a known C++ template
  'Mat' is not a known C++ enum

Looks to me like a bug, or is there any purpose of ‘Mat’ which somehow lost letter ‘h’?

  2. @etejedor

As to the having all ROOT classes/functions/enums/… in cppyy.gbl or in separate namespace cppyy.gbl.ROOT.

a) I agree that one should try to avoid “ROOT.ROOT”. One can avoid having cppyy.gbl.ROOT.ROOT (corresponding to the C++ ROOT:: namespace) using the “shortcut” you mentioned, i.e., making accessible everything from C++ ROOT:: in cppyy.gbl.ROOT and not in cppyy.gbl (if there are no name conflicts).

b)

Yes, this was one of the things which confused me. There is already cppyy.gbl.std (which is now accessible also from ROOT.std). One may imagine any other C++ library having its entry in cppyy.gbl - if I understand correctly this is how cppyy is intended to be used outside ROOT world, right?

In my opinion, if ROOT makes visible all its classes/functions/… from cppyy.gbl and not from cppyy.gbl.ROOT, then it makes using cppyy in connection to other non-ROOT C++ libraries more difficult and less intuitive.

Having cppyy.gbl.ROOT as a separate namespace for all ROOT classes/functions/… could help to make it more equivalent to the python module ROOT (if required). The extra setup you mentioned for the python module ROOT could perhaps be implemented in cppyy.gbl.ROOT, and the python module ROOT could be basically just a link to it. At the moment the usage of cppyy by ROOT looks to me too “ROOT-centric” and I don’t see the benefit.

Cheers,
Jiri

Yes, that looks like a bug. It seems that both ‘Mat’ and ‘Math’ appear when you run dir (the same goes for other cases like ‘RD’ and ‘RDF’), but in a lookup only the correct one is found. I just opened a ticket to follow the issue, thanks for reporting!

https://sft.its.cern.ch/jira/browse/ROOT-10163
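To find listed names that cannot be resolved, one can compare dir() against getattr. The helper below is a generic sketch; unresolvable_names and BuggyNamespace are hypothetical names, and BuggyNamespace merely imitates the symptom described here rather than its actual cause.

```python
def unresolvable_names(namespace):
    """Return the dir() entries of 'namespace' that getattr cannot resolve."""
    missing = []
    for name in dir(namespace):
        try:
            getattr(namespace, name)
        except AttributeError:
            missing.append(name)
    return missing


class BuggyNamespace:
    """Hypothetical namespace whose listing contains truncated names."""

    Math = object()
    RDF = object()

    def __dir__(self):
        return ["Math", "Mat", "RDF", "RD"]


broken = unresolvable_names(BuggyNamespace())
```

For this stand-in, broken contains 'Mat' and 'RD'. Applied to a real namespace such as cppyy.gbl.ROOT, the same loop would trigger the creation of every listed proxy, so it may be slow.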

Traditionally ROOT has played the role of the global namespace, not only in the experimental PyROOT, also in the current one. So the issue is that all the code that uses PyROOT now relies on that convention.

Thanks!

I’m not sure if I understand you correctly. Having everything related to ROOT in cppyy.gbl.ROOT instead of directly in cppyy.gbl should not affect the python module ROOT. Maybe only cppyy.gbl.std would need to be accessible also as cppyy.gbl.ROOT.std for backward compatibility, but that is not difficult to do. Or is there any more fundamental problem? Is cppyy already used somehow (partially?) by the current PyROOT or is it used only by the experimental one? If cppyy is used only by the experimental PyROOT then I don’t see the problem related to backward compatibility. The only “issue” I can imagine now is that, e.g., gROOT.ProcessLine (or similar commands) would put code into the cppyy.gbl.ROOT namespace by default while, e.g., cppyy.cppdef would put it in cppyy.gbl by default - and, therefore, it would not be recognized by PyROOT. This could, however, be an advantage rather than a disadvantage.

I know that ROOT is sometimes regarded as “the global namespace” also in a more general sense. However, not even all the ROOT classes/functions/… are in the C++ ROOT:: namespace. Then there is the “T” and “R” naming convention instead of proper namespaces/modules. If I look, e.g., at this list of C++ libraries https://en.cppreference.com/w/cpp/links/libs then I can’t find a single one which would use the ROOT:: namespace, as they use their own namespaces (e.g., boost::), and they don’t depend on ROOT. All these libraries, therefore, may coexist if they use proper C++ namespaces. However, if someone wants to pick any of these C++ libraries, use cppyy to generate python bindings and also use (Py)ROOT, then it gets complicated (especially if the library doesn’t have its own C++ namespace). Just installing ROOT in a given environment currently means that cppyy.gbl contains all the ROOT classes/functions/… available for dynamic creation - without even importing ROOT. It looks to me that the “traditional” role of ROOT as “the global namespace” is changing. I would, therefore, personally vote for steps leading to easier usage of (Py)ROOT with other libraries.

The current PyROOT also uses cppyy, an older version of it.

If I understand what you are proposing, you suggest that any ROOT class should be obtained by e.g. from ROOT import TTree, but any other class should be obtained with from cppyy.gbl import SomeClass? When I say ROOT is used in PyROOT as the global namespace, what I mean is that people expect that if they do from ROOT import MyClass, MyClass being in the global namespace, the lookup will work, for instance with a dynamically created class.

In addition to that, there is the ROOT C++ namespace, where some ROOT classes (the newest ones) are placed. That would be ROOT.ROOT or cppyy.gbl.ROOT. I am aware that many other classes in ROOT do not belong to that namespace, these are the older ones. See for instance new additions like RDataFrame, they go in the ROOT namespace.

I appreciate your comments and I take note of the issue with calling dir() on cppyy.gbl in PyROOT, which can indeed be annoying.

@jprochaz: yes, to your point 1, I presume that is a bug (which is why I first got interested in this thread). There is quite a bit of string manipulation going on in creating the list. Among others, dealing with std:: that ROOT/meta likes to remove, and which is my biggest headache at the moment. Not just here; that issue comes up again and again. In fact, you worry about ROOT classes in the global namespace, but at least ROOT classes do live in the global namespace. All of std:: does not and yet it’s shoe-horned into it.

No, the cppyy.py in ROOT is not an older version of cppyy. That is, in fact, the old PyCintex.py, which itself was based on libPyROOT. What happened was that as part of the move to ROOT6, I wanted users of PyCintex.py to move their codes to cppyy.gbl rather than ROOT.py. This for two reasons: 1) be ready for PyPy (which supports cppyy), and 2) allow ROOT to be as ROOTy as it wanted, w/o affecting Gaudi, Athena, COOL, etc. that were previous users of PyCintex. Think about e.g. the handling of graphics that ROOT.py does: building old PyCintex codes on top of ROOT.py would be too complex a tangle at run-time.

As for mixing PyROOT code and other cppyy-bound code: I don’t think that will be an issue per se. A priori, I don’t think that PyROOT should expose anything cppyy, and as you say: the rest of the world uses proper namespaces, thus allowing one bad player. What is already mixed (within HEP) has been mixed through PyROOT for years. And probably all of that will go away with ROOT7.

As for its use outside of HEP vs. its use being ‘too “ROOT-centric”’, remember the history here. I left HEP and decided to fork PyROOT since the code was quickly acquiring lots of bit rot. I thought that, after cleanup, here was a product ready to be used, since it could already handle large code bases (ROOT, Gaudi, and lots of experiment software). Pfah! I was so wrong. I have so far spent almost two years dealing with support for modern C++, portability, and plain old performance. It is that new cppyy that is being back-ported into ROOT and although all effort seems to be going towards backwards compatibility of ROOT.py, there is simply no comparison in performance and handling of modern C++ between cppyy and libPyROOT.

Then for packaging/exposure of bindings: I only recently started to look into it. First I find, again, that this ROOT stuff simply isn’t ready for large-scale deployment. I tried modules (promised since 2012), only to find that to crash and burn as well. So, there’s a lot of work still to be done.

That said, what the “outside” world wants are something like the cmake fragments that ship with cppyy: a simple way of dropping in some headers, specifying some libraries, some conventional names, and then get a Python module. There is, at that point, no direct use of cppyy.gbl. The way that that works, is by generating a Python package of the proper name and a list of C++ entities using the clang python module (rootcling being beyond fixing/saving), which are then prepped at run-time.

There are some examples linked from the main cppyy documentation (e.g. https://github.com/camillescott/cppyy-bbhash), so you can see how that works. But again, the first time this is used on something large (e.g. the Point Cloud Library), Cling performance (i.e. the lack of it) kills it, as do the modern C++ features still missing in ROOT/meta (e.g. anonymous unions). So, as said, working on that right now. After that, there may be a better option, or at least a clearer path, to make things more modular even when the C++ codes themselves are not cleanly structured.

@etejedor, I’m personally supporting any effort going into improving of modularization of ROOT.

@wlav, many thanks for all your comments and explanations!

Cheers,
Jiri
