Pyroot: Branch seen as leaf, can't access actual leaves

I am trying to extract information from a root file using pyroot.
I tried using the usual GetBranch, GetLeaf, GetValue but it did not work as usual, so I tried alternatives.

I found equivalent C++ code that extracts the values just fine from my ROOT file, but when I apply the same thing in pyroot, I get an empty object as my “mcinfo” (see below).
I also tried displaying the branches and leaves with GetListOfBranches() and GetListOfLeaves(). The first gives me the proper list of branches, but MC.GetListOfLeaves() reports that the only leaf of MC is MC itself (it is the same with all branches). Yet I have several leaves (for instance energy) that I can access just fine with the C++ code, and directly with data.Scan("energy") too.

Does anyone have an idea of how to fix this? Sorry, I am not that familiar with ROOT yet. I assume it is a rather easy fix, but I don’t see what is wrong here.

C++ code:

    TFile *file = new TFile(fname);
    TTree *data = (TTree*)file->Get("data");
    TBranch * McinfoBranch;
    MCInfo* mcinfo = (MCInfo*)file->GetList()->FindObject("MC");
    data->SetBranchAddress("MC", &mcinfo, &McinfoBranch);

pyroot version:

    infile = ROOT.TFile(fname)
    data = infile.Get("data")
    mcinfo = infile.GetList().FindObject("MC")

which gives me this when printing mcinfo: <cppyy.gbl.TObject object at 0x(nil)>.

If data.Scan("energy") works, you should be able to access the leaves directly (i.e. you don’t have several leaves inside a branch); in c++:

    TFile *file = new TFile(fname);
    TTree *data = (TTree*)file->Get("data");
    double ene;
    data->SetBranchAddress("energy",&ene);
    // define histogram, etc...; then loop over the tree and fill the histogram:
    for (int i=0; i<data->GetEntries(); ++i) {
      data->GetEvent(i);
      h->Fill(ene);
    }
    // etc

With pyroot, maybe using RDataFrame is easier for just drawing:

    infile = ROOT.TFile(fname)
    df = ROOT.RDataFrame("data", infile)
    h = df.Histo1D("energy")
    h.Draw()
    # etc
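If the goal is to get the values themselves rather than a histogram, RDataFrame can also hand back a column as an array through AsNumpy. The following is only a sketch: it assumes that "energy" is visible to RDataFrame as a column of the tree, which may not hold for every file, and the helper names column_values and energies_from_file are made up for illustration.

```python
def column_values(df, column):
    """Return the values of one column of an RDataFrame-like object."""
    # AsNumpy runs the event loop and returns a dict: column name -> array.
    arrays = df.AsNumpy(columns=[column])
    return arrays[column]


def energies_from_file(fname, tree_name="data", column="energy"):
    """Open tree_name in fname with RDataFrame and read one column.

    Assumption: `column` is a readable column of the tree.
    """
    import ROOT  # imported here so column_values can be used without ROOT
    df = ROOT.RDataFrame(tree_name, fname)
    return column_values(df, column)
```

Calling energies_from_file(fname) would then give an array with one value per entry, if the column can be read at all.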

Hi,
Thank you for your reply. But, as I said, I am not able to access “energy” in any way. I tried your pyroot bit just in case as I did not try RDataFrame yet but it is not working either. And in any case I would like to retrieve the values themselves not just histogram them with ROOT.
I don’t know why data.Scan(“energy”) works but something like

    data = infile.data
    for event in data:
        Event = data.GetEvent(0)
        event.GetLeaf("energy").GetValue(0)

does not work. I know from the C++ code that “energy” is under “MC”, and there I can access the values too, so that is not the problem.
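Since data.Scan("energy") works, one possible workaround is to let TTree::Draw evaluate the same expression with graphics switched off and then read the values back with GetV1(). This is only a sketch under that assumption; values_via_draw is a made-up helper name, and it expects one value per selected row.

```python
def values_via_draw(tree, expression):
    """Evaluate `expression` the way Scan/Draw do and return the values as a list."""
    # Draw keeps at most GetEstimate() rows in memory, so raise the limit
    # to the number of entries in the tree.
    tree.SetEstimate(tree.GetEntries())
    # "goff" turns graphics off; Draw returns the number of selected rows
    # (negative on error).
    n = tree.Draw(expression, "", "goff")
    if n < 0:
        raise ValueError("could not evaluate expression: " + expression)
    buf = tree.GetV1()  # buffer holding the values of the first expression
    return [buf[i] for i in range(n)]
```

With the tree from the post this would be used as values_via_draw(data, "energy"). If "energy" is an array inside each entry, Draw selects more rows than there are entries, and the estimate would have to be raised accordingly.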

The C++ version of the code is “unusual”. I would have expected:

    TFile *file = new TFile(fname);
    TTree *data = (TTree*)file->Get("data");
    TBranch * McinfoBranch;
    MCInfo* mcinfo = nullptr;
    data->SetBranchAddress("MC", &mcinfo, &McinfoBranch);

to be sufficient. To understand a bit more what

    (MCInfo*)file->GetList()->FindObject("MC");

does, can you send the output of

    file->ls();

?

Thanks,
Philippe.

Here is the output of infile.ls()

  TFile**		/path/myrootfile.root	Root File
  TFile*		/path/myrootfile.root 	Root File
  KEY: TTree	data;1	        Root data Tree
  KEY: TTree	Secdata;1	Root Secondaries data Tree
  KEY: TTree	SKdata;1         SK data Tree

And for instance doing:

    data = infile.Get("data")

    liste = infile.GetList()
    for l in liste:
        print(l.GetName())

    mcinfo = liste.FindObject("MC")
    print("MC:", mcinfo)

returns

    data
    MC: <cppyy.gbl.TObject object at 0x(nil)>

I also run into problems with something much simpler like MC = data.GetBranch("MC") or MC = data.MC: data.GetLeaf("energy") returns “<cppyy.gbl.TLeaf object at 0x(nil)>” and MC.energy fails with “‘TBranchElement’ object has no attribute ‘energy’”. Yet I know it is there, as data.Scan("energy") works just fine.

The output of ls() indicates there is no top-level object named “MC” in the file. Consequently, in C++ (as in Python):

    file->GetList()->FindObject("MC");

returns a nullptr, and thus the C++ code in the original post is indeed functionally identical to mine.
The original Python can be corrected as:

    infile = ROOT.TFile(fname)
    data = infile.Get("data")
    mcinfo = ROOT.MCInfo() 

You probably need to use the full name of the branch, i.e. likely something along the lines of:

    "MC.somethingelse.energy"

See the result of tree->Print() or tree.GetBranch("MC").Print().
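To find the full name without reading through the whole Print() output, the leaves of the tree can also be listed from Python and filtered by their short name. A minimal sketch follows; leaf_names and matching_leaves are made-up helper names, and the dotted names in the comment are hypothetical.

```python
def leaf_names(tree):
    """Return (branch name, leaf name) pairs for every leaf of the tree."""
    return [(leaf.GetBranch().GetName(), leaf.GetName())
            for leaf in tree.GetListOfLeaves()]


def _matches(name, short_name):
    # True for "energy" itself and for dotted names such as "MC.x.energy".
    return name == short_name or name.endswith("." + short_name)


def matching_leaves(pairs, short_name):
    """Keep the pairs whose branch or leaf name ends in short_name."""
    return [(b, l) for (b, l) in pairs
            if _matches(b, short_name) or _matches(l, short_name)]
```

For example, matching_leaves(leaf_names(data), "energy") lists every candidate full name. If the result is empty, the branch is probably not split and the member cannot be reached through a leaf of its own.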

Thank you a lot for your help.

I tried mcinfo = ROOT.MCInfo(), however, and got “cannot instantiate incomplete class ‘MCInfo’”.

As for the output of tree.GetBranch("MC").Print(), I have this:

*Br    2 :MC        : MCInfo                                                 *
*Entries :       10 : Total  Size=      78114 bytes  File Size  =       1398 *
*Baskets :        1 : Basket Size=    8388608 bytes  Compression=  55.53     *
*............................................................................*

I tried “mcinfo = ROOT.MCInfo()” however and get “cannot instantiate incomplete class ‘MCInfo’”

Then try:

    mcinfo = ROOT.MakeNullPointer("MCInfo")

Assuming that is the full output, it means that this branch is not split (i.e. likely, when the file was originally written, the corresponding variable was a pointer to the abstract class MCInfo).

Sorry, I am quite new to this and I don’t get how this helps me access the leaves. I tried doing mcinfo = ROOT.MakeNullPointer("MCInfo") and then ROOT.MCInfo(), but I still get the same output.

And yes, this is the sole output of tree.GetBranch("MC").Print().

It would have to be followed by the Python equivalent of SetBranchAddress and GetEntry.
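In PyROOT that usually does not need an explicit SetBranchAddress: after GetEntry(i), accessing the branch as an attribute of the tree gives back the object for that entry. The sketch below relies on two assumptions that the thread does not confirm: that the dictionary for MCInfo has been loaded into the Python session (for instance from the library used by the C++ program), and that energy is a public data member of MCInfo. read_member is a made-up helper name.

```python
def read_member(tree, branch_name, member_name):
    """Loop over the tree and collect one data member of the branch object."""
    values = []
    for i in range(tree.GetEntries()):
        tree.GetEntry(i)                   # load entry i
        obj = getattr(tree, branch_name)   # e.g. tree.MC, the object of entry i
        values.append(getattr(obj, member_name))  # e.g. tree.MC.energy
    return values
```

With the names from this thread the call would be read_member(data, "MC", "energy"). Without the class dictionary the attribute access is expected to fail, in the same way as ROOT.MCInfo() does.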

But maybe we should try to get the RDataFrame version working.
What is the output of (here in C++ but you should be able to do something similar in python):

    TFile *file = new TFile(fname);
    TClass::GetClass("MCInfo")->GetStreamerInfos()->ls();
