
How to access the information contained in branches in RDataFrame

Dear experts,
From tutorials, I know how to interact with the data contained in a tree in RDataFrame:

d = ROOT.RDataFrame(treeName, fileName)

However, I do not know how to proceed with a more complex tree structure. For instance, I do not know how to access the individual leaves of each branch of the tree shown in the attached figure.
For example, how can I access the hcalTPieta leaf?
Thank you for your time.
Cheers,
Nathanael
[Attached figure: screenshot of the tree structure]

Hi @Nathanael,
what does d.GetColumnNames() return for that TTree?

Cheers,
Enrico

Hi @eguiraud,
Thank you for your reply.
I tried this piece of code

import ROOT
fileName = "ZToMuMu_mc_off_MET_PU_mit_392SF.root"
treeName = "l1CaloTowerTree"
d = ROOT.RDataFrame(treeName, fileName)
d.GetColumnNames()

and I get a segmentation violation.
Please find attached the test ROOT file ZToMuMu_mc_off_MET_PU_mit_392SF.root (2.3 MB)

Uhm a segmentation violation seems like an overreaction on the part of the script :smiley:

What ROOT version are you on? Here’s what I get with v6.22:

$ python read_cols.py
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisEventDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisCaloTPDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisL1CaloTowerDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisL1UpgradeTfMuonDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisBMTFInputsDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisL1UpgradeDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class GlobalAlgBlk is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisL1HODataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisL1CaloClusterDataFormat is available
TClass::Init:0: RuntimeWarning: no dictionary for class L1Analysis::L1AnalysisGeneratorDataFormat is available
Error in <TChain::LoadTree>: Cannot find tree with name l1CaloTowerTree in file ZToMuMu_mc_off_MET_PU_mit_392SF.root
Traceback (most recent call last):
  File "read_cols.py", line 5, in <module>
    d.GetColumnNames()
cppyy.gbl.std.runtime_error: vector<string> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::GetColumnNames() =>
    runtime_error: GetBranchNames: error in opening the tree l1CaloTowerTree

So I inspected the file and indeed, like your screenshot above shows, l1CaloTowerTree is not a TTree but a directory (with a TTree called L1CaloTowerTree – with a capital L! – inside).
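When it is unclear whether a name refers to a TTree or to a directory, it can help to list every tree in the file together with its full path. The following is only a sketch: the helper names `find_trees` and `list_trees_in_file` are made up for illustration, and it assumes trees are stored with class name TTree or TNtuple and directories with a class name starting with TDirectory.

```python
def find_trees(directory, prefix=""):
    """Return 'dir/subdir/tree' paths for every tree found below `directory`.

    `directory` is any object with the TDirectory interface, e.g. an open TFile.
    """
    paths = []
    for key in directory.GetListOfKeys():
        name = key.GetName()
        class_name = key.GetClassName()
        path = prefix + name
        if class_name in ("TTree", "TNtuple"):
            # The same tree may be stored under several cycles: list it once.
            if path not in paths:
                paths.append(path)
        elif class_name.startswith("TDirectory"):
            # Descend into the sub-directory and prefix its contents.
            paths.extend(find_trees(key.ReadObj(), path + "/"))
    return paths


def list_trees_in_file(file_name):
    """Open `file_name` with ROOT and return the paths of the trees inside."""
    import ROOT  # imported here so find_trees itself does not need ROOT

    root_file = ROOT.TFile.Open(file_name)
    if not root_file or root_file.IsZombie():
        raise OSError("could not open " + file_name)
    try:
        return find_trees(root_file)
    finally:
        root_file.Close()
```

Calling `list_trees_in_file("ZToMuMu_mc_off_MET_PU_mit_392SF.root")` should then show paths such as l1CaloTowerTree/L1CaloTowerTree, which can be passed directly to RDataFrame as the tree name.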

This works:

import ROOT
fileName = "ZToMuMu_mc_off_MET_PU_mit_392SF.root"
treeName = "l1CaloTowerTree/L1CaloTowerTree"
print(ROOT.RDataFrame(treeName, fileName).GetColumnNames())

and prints

{ "CaloTP.nHCALTP", "CaloTP.hcalTPieta", "CaloTP.hcalTPiphi", "CaloTP.hcalTPCaliphi", "CaloTP.hcalTPet",
"CaloTP.hcalTPcompEt", "CaloTP.hcalTPfineGrain", "CaloTP.nECALTP", "CaloTP.ecalTPieta", "CaloTP.ecalTPiphi",
"CaloTP.ecalTPCaliphi", "CaloTP.ecalTPet", "CaloTP.ecalTPcompEt", "CaloTP.ecalTPfineGrain", "CaloTP",
"L1CaloTower.nTower", "L1CaloTower.ieta", "L1CaloTower.iphi", "L1CaloTower.iet", "L1CaloTower.iem",
"L1CaloTower.ihad", "L1CaloTower.iratio", "L1CaloTower.iqual", "L1CaloTower.et", "L1CaloTower.eta",
"L1CaloTower.phi", "L1CaloTower" }
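The column names follow a "branch.leaf" pattern, so a long list like the one above is easier to read when grouped by branch. Here is a small sketch in plain Python; the function name `group_columns_by_branch` is hypothetical and not part of ROOT.

```python
def group_columns_by_branch(column_names):
    """Group names of the form 'branch.leaf' into {branch: [leaf, ...]}.

    Names without a dot (the branch objects themselves, e.g. 'CaloTP')
    only make sure the branch shows up as a key.
    """
    groups = {}
    for name in column_names:
        branch, dot, leaf = str(name).partition(".")
        leaves = groups.setdefault(branch, [])
        if dot:
            leaves.append(leaf)
    return groups
```

With the data frame from this thread it could be used as `group_columns_by_branch(df.GetColumnNames())`, and a leaf is then addressed as the branch name, a dot, and the leaf name, for example "CaloTP.hcalTPieta".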

I tried to plot CaloTP.hcalTPieta. This also works:

import ROOT
fileName = "ZToMuMu_mc_off_MET_PU_mit_392SF.root"
treeName = "l1CaloTowerTree/L1CaloTowerTree"
df = ROOT.RDataFrame(treeName, fileName)

df.Histo1D("CaloTP.hcalTPieta").Draw()
input()  # to keep the process alive while the canvas is displayed

Cheers,
Enrico


Thank you for your reply Enrico!

This works. But when I run the plotting script above (saved as L1Tntuple.py), I get

Traceback (most recent call last):
  File "L1Tntuple.py", line 6, in <module>
    df.Histo1D("CaloTP.hcalTPieta").Draw()
TypeError: can not resolve method template call for 'Histo1D'

I think this is related to the way hcalTPieta is accessed because I have no problems with Histo1D in other scripts.

I am using ROOT 6.18/04.

Hi,
any chance you can upgrade to v6.22?
This last problem you are encountering is an old bug that has been fixed in recent ROOT versions.
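Since the behaviour depends on the ROOT version, a script can check the version up front and fail with a clear message. This is a sketch only: the helper names are hypothetical, and the 6.22 threshold is taken from this thread rather than from release notes.

```python
import re


def parse_root_version(version):
    """Turn a ROOT version string such as '6.18/04' into the tuple (6, 18, 4)."""
    match = re.fullmatch(r"(\d+)\.(\d+)[/.](\d+)", version.strip())
    if match is None:
        raise ValueError("unrecognised ROOT version string: " + repr(version))
    return tuple(int(part) for part in match.groups())


def is_at_least(version, minimum=(6, 22, 0)):
    """Return True if `version` is the same as or newer than `minimum`."""
    return parse_root_version(version) >= minimum
```

In a PyROOT script the installed version string is available as `ROOT.gROOT.GetVersion()`, so the check could read `is_at_least(ROOT.gROOT.GetVersion())`.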

Cheers,
Enrico

Hi Enrico,
Yes, upgrading ROOT to v6.22 via conda makes it work. Thanks!
I was wondering how to access “CaloTP.nHCALTP”. In general, I get an error every time I try to access quantities like “.n…”.
I see that here nMuon can be accessed like other RVec quantities.

This one looks like a bug, sorry about that! df.GetColumnType("CaloTP.nHCALTP") returns "L1Analysis::L1AnalysisCaloTPDataFormat" but it should just be Short_t, right?

Thank you for your prompt reply Enrico.
Yes, I get L1Analysis::L1AnalysisCaloTPDataFormat
while it should be something like Short_t or UInt_t

Yes, this is actually a consequence of a problem in TTree:

tree->GetLeaf("CaloTP.nHCALTP") and tree->GetLeaf("CaloTP", "nHCALTP") return a nullptr (while tree->GetLeaf("nHCALTP") returns the TLeaf).

The corresponding jira ticket is https://sft.its.cern.ch/jira/browse/ROOT-10942 and we consider it a critical bug, so it will be fixed in the next patch releases.

In ROOT master, the problem is partially fixed and, as a workaround, you can use df.Histo1D("nHCALTP") (just “nHCALTP”, without “CaloTP”) and it works.
To get a ROOT nightly build via conda you can use (very fresh and still undocumented):

conda create -n root-nightly -c conda-forge -c https://root.cern/download/conda-nightly/latest root-nightly
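The workaround of falling back to the bare leaf name can be sketched as a small helper. It relies on the symptom described above: on affected versions, GetColumnType for the leaf reports the type of the enclosing branch object. The function name `resolve_leaf_column` is hypothetical, and the sketch assumes that GetColumnType also works for the branch name itself (here "CaloTP").

```python
def resolve_leaf_column(df, column):
    """Pick a column name that RDataFrame resolves to the leaf itself.

    If 'branch.leaf' is reported with the same type as 'branch', the leaf
    was probably mis-resolved to the enclosing object, so return just 'leaf'.
    Otherwise return `column` unchanged.
    """
    branch, dot, leaf = column.partition(".")
    if not dot:
        return column  # no branch prefix, nothing to strip
    if str(df.GetColumnType(column)) == str(df.GetColumnType(branch)):
        return leaf
    return column
```

It would be used as `df.Histo1D(resolve_leaf_column(df, "CaloTP.nHCALTP"))`. On ROOT versions where the full name already resolves correctly, the helper returns the name unchanged.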

Cheers,
Enrico


Thank you very much @eguiraud
The workaround works!

Hi,
glad to hear that! FYI, yesterday I merged another patch in master that makes df.Histo1D("CaloTP.nHCALTP") also work.

If you re-install the conda ROOT nightly build you should get a ROOT version in which RDataFrame properly deals with all the cases discussed in this thread. If you have time to try this out, please let me know how it goes.

Cheers,
Enrico

Hi Enrico,
I just reinstalled the ROOT nightly build via conda and yes,

df.Histo1D("CaloTP.nHCALTP")

works without any problem. Thanks!
