Reading objects in a ROOT file with PyROOT

Dear experts,

Although I’m not a newbie, I’m stuck on what looks like a trivial issue.
I want to access elements of one of my branches (objects of a C++ class) through PyROOT.
I can see the whole structure in TBrowser, and I can draw the value of interest from the ROOT prompt with:

FPGATrackSimEventTree->Draw("m_optional.@m_OfflineClusters.size()")

However, when looping over events in pyROOT, navigation like

for entry in tree:
    opts = entry.m_optional
    offClus = opts.m_OfflineClusters

gives errors complaining that m_OfflineClusters is not a member of the corresponding “m_optional” object.

Unfortunately I couldn’t spot a related example for guidance.

Thanks for the support!
Best,
Yannis

PS: I’m using ROOT 6.30.02 on lxplus (Python 3.9.12)

Can you attach the output of tree.Print()?

root [2] FPGATrackSimEventTree->Print()
******************************************************************************
*Tree    :FPGATrackSimEventTree: data                                                   *
*Entries :     1000 : Total =         7740109 bytes  File  Size =    1750277 *
*        :          : Tree compression factor =   4.44                       *
******************************************************************************
*Branch  :FPGATrackSimEventInputHeader                                       *
*Entries :     1000 : BranchElement (see below)                              *
*............................................................................*
*Br    0 :TObject   : BASE                                                   *
*Entries :     1000 : Total  Size=      14794 bytes  File Size  =       2042 *
*Baskets :        3 : Basket Size=       6400 bytes  Compression=   7.00     *
*............................................................................*
*Br    1 :m_event   : FPGATrackSimEventInfo                                  *
*Entries :     1000 : Total  Size=      71745 bytes  File Size  =       7699 *
*Baskets :       12 : Basket Size=       6400 bytes  Compression=   9.25     *
*............................................................................*
*Br    2 :m_optional : FPGATrackSimOptionalEventInfo                         *
*Entries :     1000 : Total  Size=    3383083 bytes  File Size  =     970204 *
*Baskets :      659 : Basket Size=       6400 bytes  Compression=   3.47     *
*............................................................................*
*Br    3 :m_Hits    : vector<FPGATrackSimHit>                                *
*Entries :     1000 : Total  Size=    4269085 bytes  File Size  =     757566 *
*Baskets :      840 : Basket Size=       6400 bytes  Compression=   5.61     *
*............................................................................*

I get the same output in pyROOT

OK, then maybe @vpadulan can help

Can you try:

    offClus = entry.GetLeaf("m_optional.m_OfflineClusters").GetValue()

or

    offClus = entry.GetLeaf("m_optional", "m_OfflineClusters").GetValue()

Would this work?

I had a similar issue before

Hi @FoxWise
I saw your ticket while looking for a solution.
In my case both give the ReferenceError: attempt to access a null-pointer.

But the ROOT version I’m using is newer than the one mentioned in that ticket.
Whatever fix was implemented should be present in this release, right?


Hello again,

Any news on this?
By the way, the m_OfflineClusters object that I cannot access appears to be a TNonSplitBrowsable in ROOT terms (in case that helps).

Can you share a ROOT file, so it is easier to debug?

You can cut it to a few events with RDataFrame, if the size is too large

Hi @FoxWise ,

Here’s a public link with a sample:
https://cernbox.cern.ch/s/CfYZJRRw1F1Ftru

Thanks!

Unfortunately, I have tried to hack my way around this with no success, so I can’t help.
I hope the experts will comment on your issue.

The furthest I got is:

import ROOT

file = ROOT.TFile("test.root")
tree = file.FPGATrackSimEventTree

for event in tree:
    m_optional = event.GetBranch("m_optional")
    browsables = m_optional.GetBrowsables()
    m_OfflineClusters = browsables.FindObject("m_OfflineClusters")
    leaves = m_OfflineClusters.GetLeaves()
    s = leaves.FindObject("@size")

    print(m_OfflineClusters, type(m_OfflineClusters))
    print(s, type(s))
    break

Output:

Name: m_OfflineClusters Title:  <class cppyy.gbl.TNonSplitBrowsable at 0xc25cae0>
Name: @size Title: size of vector<FPGATrackSimCluster> of FPGATrackSimCluster <class cppyy.gbl.TCollectionPropertyBrowsable at 0xc28d690>

As you can see, I was able to find the branch, which is displayed in the TBrowser and which is accessed in the TTree::Draw().

However, the branch is of type TNonSplitBrowsable, and I don’t know how to access the data there.

With this keyword, you can even search the ROOT forum.
It shows 15 posts where people also struggle to access TNonSplitBrowsable.

However, I could not find a comprehensive answer that would provide information on how to access it event by event.
They all refer to TTree::Draw() or to the TTreeReader and RDataFrame

The only thing I can comment on from my experience:
I guess one should design a more straightforward structure for storing data in the future.
Ideally it would use only built-in C++ types, with no user-defined classes.
If one must use user-defined classes, one also needs to create the corresponding dictionaries. That was never trivial for me to understand, but it is doable after scrolling through tons of posts and documentation.

Maybe branches will become available if you have dictionaries for your classes and load them in your Python script. I don’t know.

cheers,
Bohdan

Thanks for looking into it, Bohdan!
Having a dictionary seems very reasonable, though in my case the C++ class depends on other classes, so it’s practically impossible to load the whole project just to read a ROOT file.

But I was hoping that since the values were available through TBrowser, there would be a feasible (if not even simpler) approach through PyRoot.

Anyway, let’s wait for the experts to reply.
Thanks again for your time!

Hello,

Any feedback on this?

Cheers,
Yannis

You have a few options to read unsplit objects.

One is to use:

file->MakeProject("userlib", "*", "recreate++");

which will create a bare-bones version (data members only) of all the classes used in the file, generate the dictionary for them, and load that library.
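For PyROOT users, the MakeProject route could look roughly like the sketch below. It is only an illustration under some assumptions: the file name `test.root` and the helper names are placeholders, the tree is fetched after MakeProject has loaded the generated library, and the generated classes are assumed to expose `m_optional` and `m_OfflineClusters` as plain attributes. None of this has been checked against the file in this thread.

```python
def open_with_generated_classes(path, tree_name, libdir="userlib"):
    """Open a ROOT file, generate and load bare-bones classes, return (file, tree)."""
    import ROOT  # PyROOT; imported here so the helpers load without ROOT installed

    f = ROOT.TFile.Open(path)
    if not f or f.IsZombie():
        raise OSError(f"cannot open {path}")
    # Write data-member-only classes for everything stored in the file into
    # `libdir`, compile their dictionary and load the library ("++").
    f.MakeProject(libdir, "*", "recreate++")
    # Fetch the tree only after the library is loaded (assumption: this lets
    # the branches pick up the freshly generated classes).
    tree = f.Get(tree_name)
    return f, tree


def cluster_counts(tree, max_entries=5):
    """Return the number of offline clusters for the first few entries."""
    counts = []
    for i, entry in enumerate(tree):
        if i >= max_entries:
            break
        counts.append(entry.m_optional.m_OfflineClusters.size())
    return counts


if __name__ == "__main__":
    # Keep a reference to the file so the tree stays valid.
    root_file, event_tree = open_with_generated_classes(
        "test.root", "FPGATrackSimEventTree"
    )
    print(cluster_counts(event_tree))
```

The generated classes carry no methods from the original project, so only the data members are reachable this way.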

The other options are indeed to use RDataFrame (which will allow your code to leverage many cores for ‘free’) or TTreeReader, which should give you access to the same information TTree::Draw is able to access. If neither of those works, you could also use TTreeFormula directly, which is what TTree::Draw itself uses.
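As a rough illustration of the TTreeFormula fallback from PyROOT, the sketch below evaluates the same expression that works in TTree::Draw, entry by entry. The file name `test.root`, the formula name `nOfflineClusters` and the helper names are placeholders, and the loop assumes a plain TTree (a TChain would additionally need `UpdateFormulaLeaves()` when the current file changes). It has not been run against the sample file.

```python
def eval_per_entry(tree, formula, max_entries=None):
    """Evaluate a TTreeFormula-like object for each entry.

    Returns one list of values per entry (one value per formula instance).
    """
    n = tree.GetEntries()
    if max_entries is not None:
        n = min(n, max_entries)
    results = []
    for i in range(n):
        # Position the tree on entry i; the formula then reads only the
        # branches it needs when it is evaluated.
        tree.LoadTree(i)
        ndata = formula.GetNdata()
        results.append([formula.EvalInstance(j) for j in range(ndata)])
    return results


def main(path="test.root"):
    import ROOT  # PyROOT; imported here so the helper loads without ROOT installed

    f = ROOT.TFile.Open(path)
    tree = f.Get("FPGATrackSimEventTree")
    # Same expression as in the TTree::Draw call that is known to work.
    formula = ROOT.TTreeFormula(
        "nOfflineClusters", "m_optional.@m_OfflineClusters.size()", tree
    )
    for i, values in enumerate(eval_per_entry(tree, formula, max_entries=5)):
        print(i, values)


if __name__ == "__main__":
    main()
```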

Dear @imaznas ,

On top of what @pcanal already wrote, I want to add that in RDataFrame (and probably also in TTreeReader or any other interface that requires full knowledge of the C++ types you want to operate on) you will need some definition of the FPGATrackSimOptionalEventInfo class, otherwise the tool cannot know how to read your data:


root [0] ROOT::RDataFrame df{"FPGATrackSimEventTree", "test.root"};
Warning in <TClass::Init>: no dictionary for class FPGATrackSimEventInputHeader is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimEventInfo is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimOptionalEventInfo is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimCluster is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimHit is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimMultiTruth is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimOfflineTrack is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimOfflineHit is available
Warning in <TClass::Init>: no dictionary for class FPGATrackSimTruthTrack is available
root [1] auto df1 = df.Define("clusters", "m_optional.m_OfflineClusters");
input_line_37:2:12: error: unknown type name 'FPGATrackSimOptionalEventInfo'
auto func0(FPGATrackSimOptionalEventInfo& var0){return var0.m_OfflineClusters
           ^
Error in <TRint::HandleTermInput()>: std::runtime_error caught: 
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

You need a dictionary as @pcanal suggested.

Cheers,
Vincenzo

Hi @vpadulan, @pcanal

In the end, using a dictionary built via CMake (along with environment variables pointing to the project) seems to have solved the issue easily: I now have full access to the objects’ C++ methods, so neither TTreeReader nor RDataFrame was needed.
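For anyone landing here later, loading such a dictionary library from PyROOT before touching the tree might look like the sketch below. The library name `libFPGATrackSimObjects` is purely hypothetical (the real name depends on the CMake setup), and the library directory is assumed to be on the loader path, for example via `LD_LIBRARY_PATH`.

```python
def load_dictionary(libname, gsystem=None):
    """Load a dictionary library through TSystem::Load and check the result.

    TSystem::Load returns 0 on success, 1 if the library was already loaded,
    and a negative value on failure.
    """
    if gsystem is None:
        import ROOT  # PyROOT; imported here so the helper loads without ROOT
        gsystem = ROOT.gSystem
    status = gsystem.Load(libname)
    if status < 0:
        raise RuntimeError(f"could not load {libname} (status {status})")
    return status


if __name__ == "__main__":
    import ROOT

    # Hypothetical library name; replace with the one your build produces.
    load_dictionary("libFPGATrackSimObjects")
    f = ROOT.TFile.Open("test.root")
    tree = f.Get("FPGATrackSimEventTree")
    print(tree.GetEntries())
```

Loading the library before opening the file means the classes are known by the time the branches are first read.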

Thanks for your time.