I have code that chains 1001 ROOT files and applies an analysis cut. An energy histogram is then drawn and saved as a ROOT file, and the corresponding bin and count data are saved in a .txt file that I use for further analysis.
What I want to do: I want to get the event ID ‘ievt’ for the events that survive the analysis cut. Since the files are chained, each file has the same range of ievt, so there is no way to get a unique ID for a surviving event. Usually run numbers are used to separate different files, but I don’t have them in this case. Is there a way to remove the degeneracy of ievt?
I don’t know what else I should provide since my root file has:
ievt: event_id
energy: total energy from the event
no_of_events: number of events
edep_in_int: edep in a sub volume1
edep_in_ext: edep in a sub volume2
det_id: id of the detector that had non-zero edep
ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided
Hello @rushabhgala,
if the branches you listed are all that your ROOT files contain, the only unique identifier is a combination of the dataset name (e.g. the file name) and the event number inside the file.
Alternatively, though less reproducible, you can compute a “global” event number by counting all events, but it is only unique if you always read the files in the same order. I would go with the first option and assign each file a dataset ID, similar to a run number.
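As a rough illustration of the first option, here is a minimal sketch in plain C++ that packs a per-file dataset ID and the per-file ievt into a single 64-bit key. The names makeUniqueId and splitUniqueId are hypothetical, and the sketch assumes that both the dataset ID and ievt are non-negative and fit in 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: combine a per-file dataset ID (playing the role of a
// run number) with the per-file event number ievt into one 64-bit key.
// Assumption: both values are non-negative and fit in 32 bits.
std::uint64_t makeUniqueId(long long datasetId, long long ievt) {
    assert(datasetId >= 0 && datasetId <= 0xFFFFFFFFLL);
    assert(ievt >= 0 && ievt <= 0xFFFFFFFFLL);
    // Dataset ID goes in the upper 32 bits, ievt in the lower 32 bits.
    return (static_cast<std::uint64_t>(datasetId) << 32) |
           static_cast<std::uint64_t>(ievt);
}

// Hypothetical inverse: recover (datasetId, ievt) from the packed key.
std::pair<std::uint32_t, std::uint32_t> splitUniqueId(std::uint64_t id) {
    return {static_cast<std::uint32_t>(id >> 32),
            static_cast<std::uint32_t>(id & 0xFFFFFFFFULL)};
}
```

Two events with the same ievt in different files then get different keys, and the key can be split again to find the file and the event within it.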
There are a few more branches in my file, but they’re all similar to
edep_in_X: edep in a sub volumeX
so that won’t be of much help to me.
How can I assign each file a dataset ID? Should I write another branch for each file, or is there another way?
I tried using the ‘global event number’ but as you said it is not reproducible for me as I can’t guarantee reading the files in the same order every time.
Hello,
there are a few options:
- Are you using RDataFrame or a plain TChain to analyse the files? In RDataFrame, you can ask for file name and entry number. This is called DefinePerSample.
- In a TChain, you can list the files first, get them in a stable order, and create the chain. Now, by just counting every event before the cuts, you should get a “stable” global event number.
- If you don’t have a stable order, you can ask a TChain for the current file, see e.g. in this post. This can be used to derive a unique ID from the filename and entry number.
There might be more ways, but let’s see if one of the above can work for you.
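The second and third options can be sketched in plain C++ without touching ROOT. The helpers below are hypothetical: assignDatasetIds derives a dataset ID from the sorted list of file names, so the ID does not depend on the order in which the files were listed or read, and firstGlobalEntry computes the global number of the first entry in each file from the per-file entry counts. In an actual analysis the file name would presumably come from the chain's current file, and the chain would be built from the sorted list; that part is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: give each file a dataset ID equal to its position in
// the lexicographically sorted list of unique file names.
std::map<std::string, int> assignDatasetIds(std::vector<std::string> fileNames) {
    std::sort(fileNames.begin(), fileNames.end());
    fileNames.erase(std::unique(fileNames.begin(), fileNames.end()),
                    fileNames.end());
    std::map<std::string, int> ids;
    for (std::size_t i = 0; i < fileNames.size(); ++i) {
        ids[fileNames[i]] = static_cast<int>(i);
    }
    return ids;
}

// Hypothetical helper: given the number of entries in each file (in the same
// sorted order), return the global entry number of each file's first entry.
std::vector<long long> firstGlobalEntry(const std::vector<long long>& entriesPerFile) {
    std::vector<long long> offsets;
    long long running = 0;
    for (long long n : entriesPerFile) {
        assert(n >= 0);
        offsets.push_back(running);
        running += n;
    }
    return offsets;
}
```

Note that these IDs stay the same only as long as the set of files does not change; adding or removing a file shifts the IDs of the files sorted after it.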