Where are the PyROOT instructions? Why do they keep moving?

jfcaron · April 26, 2021, 9:32pm

I am giving a PyROOT tutorial and I want to link to the documentation. Last year I asked (Old PyROOT Instruction Pages) and was told to go here: https://root.cern.ch/how/how-write-ttree-python but now that’s broken too.

Why do these instructions keep getting deleted? Is there a permanent place for them? There is no equivalent at the official PyROOT manual section here: Python interface: PyROOT - ROOT

jalopezg · April 27, 2021, 7:31am

Hi @jfcaron,

You probably also want to bookmark the PyROOT tutorials link: ROOT: PyRoot tutorials. I am also inviting @etejedor, as he might provide other useful links.

Cheers,
J.

jfcaron · April 27, 2021, 5:23pm

None of the PyROOT tutorials seem to do what was previously explained on the how-write-ttree-python page. Bits and pieces are around (like using array.array and SetBranchAddress), but not in one coherent piece.

Are we not meant to do this anymore? Is the only official way to get data from e.g. a text file to use RDataFrame or TTree::ReadFile?

etejedor · April 28, 2021, 7:27am

Hello @jfcaron,

Since ROOT 6.22 we have a new PyROOT and we are also writing new documentation.

Part of it is the manual, which you already referenced.

The other piece of documentation we are writing is the pythonization docs, i.e. how to use ROOT C++ classes from Python. For example, related to your question, you have the TTree pythonization docs here (see the PyROOT box):

https://root.cern.ch/doc/master/classTTree.html

which explain what’s different when you use SetBranchAddress and Branch from Python.
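
For readers looking for the old write-a-TTree recipe, the pattern can be sketched roughly as follows. This is only an illustration, not the content of the deleted page or of the pythonization docs: the file name example.root, the tree name t, the branch names x and n, and the helper names parse_rows, write_tree and read_tree are all made up for the example, and the input is assumed to be text lines of the form "x n".

```python
from array import array


def parse_rows(lines):
    """Turn whitespace-separated 'x n' text lines into (float, int) tuples."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        x_str, n_str = line.split()
        rows.append((float(x_str), int(n_str)))
    return rows


def write_tree(rows, path="example.root", tree_name="t"):
    """Write the rows to a TTree (hypothetical file, tree and branch names)."""
    import ROOT  # imported here so parse_rows stays usable without ROOT

    out_file = ROOT.TFile.Open(path, "RECREATE")
    tree = ROOT.TTree(tree_name, "values read from a text file")

    # One-element arrays act as the buffers the branches read from.
    x = array("d", [0.0])  # 'd' matches the /D (double) leaf type
    n = array("i", [0])    # 'i' matches the /I (32-bit int) leaf type
    tree.Branch("x", x, "x/D")
    tree.Branch("n", n, "n/I")

    for x_val, n_val in rows:
        x[0] = x_val  # update the buffers in place, then record an entry
        n[0] = n_val
        tree.Fill()

    tree.Write()
    out_file.Close()


def read_tree(path="example.root", tree_name="t"):
    """Read branch x back through SetBranchAddress and return its values."""
    import ROOT

    in_file = ROOT.TFile.Open(path)
    tree = in_file.Get(tree_name)

    x = array("d", [0.0])
    tree.SetBranchAddress("x", x)

    values = []
    for i in range(tree.GetEntries()):
        tree.GetEntry(i)     # fills the x buffer with entry i
        values.append(x[0])  # copy the value out before it is overwritten
    in_file.Close()
    return values
```

With a ROOT installation available, write_tree(parse_rows(open("data.txt"))) followed by read_tree() should round-trip the x column; data.txt is again a placeholder name.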

That being said, the new way to process TTrees in ROOT that we are promoting is called RDataFrame:

https://root.cern/doc/master/classROOT_1_1RDataFrame.html

You can read existing datasets, transform them and obtain results, or create/extend datasets and write them back to ROOT files. We’d be happy to help you with it if you decide to use it instead of the more low-level TTree interface.
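
As a rough sketch of that read/transform/write cycle (the function name select_and_square, the branch name x, the new column x_squared and the default tree name t are assumptions made for this example only):

```python
def select_and_square(in_path, out_path, tree_name="t"):
    """Keep entries with x > 0, add x_squared, and write the result out.

    Returns the number of entries that passed the selection.
    """
    import ROOT  # imported lazily; requires a ROOT installation

    df = ROOT.RDataFrame(tree_name, in_path)

    # Filter and Define take C++ expressions as strings; nothing runs yet.
    selected = df.Filter("x > 0").Define("x_squared", "x * x")

    # Book the count first so it is computed in the same event loop
    # that Snapshot triggers below.
    count = selected.Count()

    # Snapshot runs the event loop and writes all columns to out_path.
    selected.Snapshot(tree_name, out_path)

    return count.GetValue()
```

The selection and the new column are evaluated in C++; Python only describes the computation.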

jfcaron · April 28, 2021, 7:45pm

Thanks, that section on the TTree page was what I was looking for. I would like an anchor so I could link to that section directly, but at least I can tell people to Ctrl+F for PyROOT.

I do mention RDataFrame in my tutorial, and I understand it’s the way of the future, but I’m not experienced enough myself to teach others. The concepts are different enough from traditional procedural ROOT data analysis that it’s hard to translate. I guess I’m working against the machinery of progress by teaching the old way.

etejedor · April 29, 2021, 7:16am

Hello,

We have some teaching material for RDataFrame (slides, notebooks) that you could reuse, if that makes things easier. I understand the transition from TTree to RDataFrame requires some effort, but it is a good investment in the long run. For example, RDataFrame has implicit parallelization: once enabled with ROOT::EnableImplicitMT(), it runs your analysis on all the cores of your machine transparently. Also, the programming model is more high-level and easier to explain.

eguiraud · April 29, 2021, 7:27am

I think the main issue is that a loop like

for entry in t:
    x = entry.branch_name
    ...

is terribly slow, being a native Python for loop with many Python calls performed at every event. RDataFrame pushes all that to C++.
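
To make the comparison concrete, here is a hedged side-by-side sketch of filling one histogram both ways. The branch name x, the binning, the histogram names and both function names are invented for the illustration; no timings are claimed here.

```python
def histogram_with_python_loop(tree):
    """Fill a histogram entry by entry from Python (the slow pattern)."""
    import ROOT  # imported lazily; requires a ROOT installation

    hist = ROOT.TH1D("h_loop", "x;x;entries", 50, 0.0, 1.0)
    for entry in tree:
        # Each attribute lookup and Fill call crosses the Python/C++ boundary.
        hist.Fill(entry.x)
    return hist


def histogram_with_rdataframe(path, tree_name="t"):
    """Book the same histogram with RDataFrame; the loop runs in C++."""
    import ROOT

    df = ROOT.RDataFrame(tree_name, path)
    # Histo1D is lazy: it returns a result handle, and the event loop
    # only runs when the histogram is first accessed.
    return df.Histo1D(("h_rdf", "x;x;entries", 50, 0.0, 1.0), "x")
```

The second function returns the lazy result handle rather than the histogram itself, so a caller can book several results before triggering a single event loop.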
