How can I use multiple ROOT files in bulk for data analysis in Python code?

For example, in this code I can only open one file. How can I process the other files in a batch?

Dear Emirhan,

Thanks for the post, and welcome to the ROOT community!
You can certainly do that with RDataFrame. Taking your code as example:

import ROOT
filepath = 'JetNtuple_RunIISummer16_13TeV_MC_*.root' # assuming '104' was the file index :)
df = ROOT.RDataFrame('jetTree', filepath)
# Example plot of Transverse momenta of central Jets
h_jet_pt = df.Filter('fabs(jetEta) < 1.3').Histo1D('jetPt')
c = ROOT.TCanvas()
h_jet_pt.Draw()
c.Draw() # draws the canvas inline when running in a notebook
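
If the files do not fit a single wildcard, or you want to check which files are picked up before handing them to RDataFrame, an explicit file list is an alternative. The snippet below is only a sketch: the helper names `collect_root_files` and `make_dataframe` are made up for illustration, the default pattern and the tree name 'jetTree' are taken from the code above, and it assumes PyROOT is importable in your environment.

```python
import glob
import os


def collect_root_files(directory, pattern="JetNtuple_RunIISummer16_13TeV_MC_*.root"):
    """Return the sorted list of files in `directory` that match `pattern`.

    Hypothetical helper: the default pattern mirrors the wildcard used above.
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))


def make_dataframe(files, tree_name="jetTree"):
    """Build a single RDataFrame that reads `tree_name` from every file in `files`.

    Hypothetical helper: ROOT is imported here so that collect_root_files
    can be used (and checked) on a machine without ROOT.
    """
    import ROOT

    # An explicit std::vector<std::string> is used to stay independent of
    # how a given ROOT version converts Python lists.
    names = ROOT.std.vector("string")()
    for path in files:
        names.push_back(path)
    return ROOT.RDataFrame(tree_name, names)


# Example usage (needs ROOT and the ntuple files in the current directory):
# files = collect_root_files(".")
# print(len(files), "files found")
# df = make_dataframe(files)
# h_jet_pt = df.Filter("fabs(jetEta) < 1.3").Histo1D("jetPt")
```

Printing the list first makes it easy to spot a typo in the pattern, since an empty list means nothing matched.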

I hope this helps!

Cheers,
Danilo

PS
RunIISummer16 seems to be an old Monte Carlo campaign, so it may be worth having a look at the newer, high-fidelity Ultra Legacy samples.


Hello Danilo,

Thank you very much for your answer.

When I work in Jupyter Notebook I can’t import the ROOT library or install it with pip install, so the solution you suggested didn’t work.

For more information, you can find the full project I’m trying to run on github:

GitHub: cernopendata-datascience/QCDJetsMachineLearning

CERN Open Data Portal: JetNTuple_QCD_RunII_13TeV_MC sample with jet properties for jet flavour and other jet-related ML studies

Sorry I can’t add a link because I’m a new user.

Uproot → Getting started guide → Iterating over many files
Uproot → Getting started guide → Reading many files into big arrays
Scikit-HEP Tutorial
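
For a notebook without ROOT, the Uproot guides listed above cover the same task. Here is a rough sketch of what iterating over many files could look like. It assumes `uproot` and `numpy` have been installed with pip, that the tree is called 'jetTree', and that 'jetPt' and 'jetEta' are flat branches with one value per entry, as the RDataFrame example earlier in the thread suggests. The helper names `tree_specs` and `central_jet_pt` are invented for this example.

```python
def tree_specs(files, tree_name="jetTree"):
    """Turn file paths into the 'path:tree' strings that Uproot accepts.

    Hypothetical helper; 'jetTree' is the tree name used earlier in the thread.
    """
    return [f"{path}:{tree_name}" for path in files]


def central_jet_pt(files, tree_name="jetTree", eta_max=1.3, step_size="100 MB"):
    """Collect jetPt for jets with |jetEta| < eta_max across all files.

    Hypothetical helper. The files are read in chunks with uproot.iterate,
    so they never have to fit in memory all at once.
    """
    import numpy as np
    import uproot

    selected = []
    for batch in uproot.iterate(
        tree_specs(files, tree_name),
        ["jetPt", "jetEta"],
        step_size=step_size,
        library="np",  # each batch is a dict of NumPy arrays
    ):
        mask = np.abs(batch["jetEta"]) < eta_max
        selected.append(batch["jetPt"][mask])
    return np.concatenate(selected) if selected else np.empty(0)


# Example usage (needs uproot, numpy and the ntuple files):
# import glob
# files = sorted(glob.glob("JetNtuple_RunIISummer16_13TeV_MC_*.root"))
# pt = central_jet_pt(files)
# print(len(pt), "central jets")
```

If everything fits in memory, `uproot.concatenate` with the same file specifications returns all the arrays in one call instead of chunk by chunk.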
