Entries (GetEntries) vs instances (Scan) - One entry but 566 instances - What is an instance as opposed to an entry?

alandou · July 2, 2020, 12:16pm

Hi,

I am looking at a ROOT file that has 566 track objects stored inside. I’m not sure I can call them 566 entries, since .GetEntries() returns 1. When I .Scan() the tree, however, it lists 566 so-called instances. I think I am missing the difference between the two. Unfortunately, “instance” is not a great keyword to search for, as it can mean a lot of things.

Here is what this looks like (this continues until the 566th “instance”, so, as I expect, there is not just one entry):

I would like to loop over those instances. How do I do that? What I am used to using, .GetEntry(), doesn’t work here.
What is the difference between an entry and an instance? Knowing that should most likely help me figure out how to solve my issue.

ROOT Version: 6.20/02
Platform: macOS Catalina

Best wishes,
Aimeric

jblomer · July 2, 2020, 2:35pm

Hi Aimeric,

Welcome to the ROOT forum!

The ROOT file seems to contain a single event (i.e. just 1 entry) with all the tracks as a sub-collection. There are several ways the collection of tracks could be stored. The command trackTree->PrintInfo() should tell which one it is. Could you upload or privately send me a copy of the ROOT file? That would make it easier for me to give you a code snippet.

Btw, you can perhaps access the ROOT file more quickly from the interactive prompt if you run root -l /path/to/file.root, which lets you call events->GetEntry() directly in the prompt.

Cheers,
Jakob

jblomer · July 2, 2020, 2:41pm

I’m sorry, I got confused, it is trackTree->Print() that shows the data types stored in the ROOT file.

jblomer · July 2, 2020, 4:57pm

Actually, even without knowing exactly of what type the track collection is, you should be able to read the track properties using a TTreeReaderArray or an RDataFrame. In RDataFrame, you’ll see the track properties (tracks.mX etc.) as RVec.

Cheers,
Jakob

alandou · July 2, 2020, 5:02pm

Hi Jakob,
I think I see where I got confused: I’ve mostly looked at trees with several collision events in them, so it was natural to go through them using .GetEntry(), which lets me loop over the events of the tree. I just thought that, as my .root file here is a collection of tracks, the entries would now be tracks. Looks like entries are by convention mostly used for events?
I will try to look at RDataFrame to loop over the tracks. I just need to see if it saves the class structure of the tracks. I had had a look at it, but I mostly found ways to read the different properties of the track. I want to loop over the tracks and pass each one to another function. I must have missed what I was searching for.
If I get nowhere on my own I’ll make another post on this topic!

Thanks for your help,
Aimeric

alandou · July 2, 2020, 5:21pm

Ok so I’ve given it another look, also tried to look at RVec in a bit more detail, but most of what I see would be an alternative to the TTree.Draw() I’d use if I only wanted to look at float/int variables, like charge or even a small vector for the momentum.
Here I want to get the whole track object for each track stored. I want to loop twice over the tracks and feed each pair of tracks thus obtained to a function. That function takes objects of a specific class as input, and the tracks should be stored in my tree as objects of that class.

If you want to look at my data file, I attach it to this post: tpctracks.root (168.9 KB)

alandou · July 2, 2020, 5:57pm

I think I’ve figured out part of the problem: in the same configuration as the pic I posted in the first post, trackTree->Show() tells me the branch “Tracks” is a pointer to a vector. Could it be that the instances are the different rows of the vector? And that instead of being stored in different entries, the tracks are stored in the different rows of the only entry of the tree?

The issue though is that if I do TracksArray->size() where TracksArray is a pointer to the vector “Tracks”, I get a size of 0.

Here is a rough idea of what I am trying to do:

std::unique_ptr<TFile> trackFile(TFile::Open(("/pathToTheFileDirectory/tpctracks.root")));
std::unique_ptr<TTree> trackTree((TTree*)trackFile->FindObjectAny("events"));

std::vector<o2::tpc::TrackTPC>* TracksArray = nullptr;
trackTree->SetBranchAddress("Tracks", &TracksArray);

int nTracks = TracksArray->size();  //gives me a size of 0, even though trackTree->Scan() gives several "instances"

for (int itrack1 = 0; itrack1<nTracks; itrack1++){
        o2::tpc::TrackTPC track1 = TracksArray->at(itrack1); //get the itrack1-th entry of the vector?
        for (int itrack2 = itrack1+1; itrack2<nTracks; itrack2++){
            o2::tpc::TrackTPC track2 = TracksArray->at(itrack2); //get the itrack2-th entry of the vector?
            
            newVariable = myFunction(track1,track2);
            ...
        }
}

jblomer · July 3, 2020, 7:59am

I think what’s missing is a trackTree->GetEntry(0) in order to populate the TracksArray vector.
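
To illustrate why the vector stays empty until an entry is read, here is a minimal standalone sketch in plain C++. It does not use ROOT at all: ToyTree, ToyTrack and MakeSingleEventTree are hypothetical names invented for this example, and the toy SetBranchAddress takes a plain pointer rather than ROOT's pointer-to-pointer. It only mimics the idea that binding a buffer and filling it are two separate steps, and that one entry can hold many tracks (the "instances" that Scan lists).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for o2::tpc::TrackTPC; it exists only so the sketch compiles.
struct ToyTrack {
   float pt = 0.f;
};

// Toy model of a tree: every entry (event) holds a whole collection of tracks.
// This is NOT the ROOT API, only an imitation of the bind-then-read pattern.
class ToyTree {
public:
   explicit ToyTree(std::vector<std::vector<ToyTrack>> entries) : fEntries(std::move(entries)) {}

   // Number of entries (events), not the number of tracks.
   std::size_t GetEntries() const { return fEntries.size(); }

   // Only remembers where the caller wants the data; nothing is copied yet.
   void SetBranchAddress(std::vector<ToyTrack> *buffer) { fBuffer = buffer; }

   // Only this call copies the content of entry i into the caller's buffer.
   void GetEntry(std::size_t i)
   {
      assert(fBuffer != nullptr && i < fEntries.size());
      *fBuffer = fEntries[i];
   }

private:
   std::vector<std::vector<ToyTrack>> fEntries;
   std::vector<ToyTrack> *fBuffer = nullptr;
};

// Build a tree shaped like the one in this thread: 1 entry holding nTracks tracks.
inline ToyTree MakeSingleEventTree(std::size_t nTracks)
{
   std::vector<std::vector<ToyTrack>> entries;
   entries.push_back(std::vector<ToyTrack>(nTracks));
   return ToyTree(std::move(entries));
}
```

With MakeSingleEventTree(566), GetEntries() returns 1, a bound vector has size 0 right after SetBranchAddress, and it has size 566 only after GetEntry(0).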

You can also try the TTreeReader style access:

std::unique_ptr<TFile> trackFile(TFile::Open(("tpctracks.root")));
assert(trackFile && !trackFile->IsZombie());
    
TTreeReader treeReader("events", trackFile.get());
TTreeReaderArray<o2::tpc::TrackTPC> tracks(treeReader, "Tracks");
    
while (treeReader.Next()) {
   auto nTracks = tracks.GetSize();
   for (unsigned i = 0; i < nTracks; ++i) {
      auto track1 = tracks.At(i);
      for (unsigned j = i + 1; j < nTracks; ++j) {
         auto track2 = tracks.At(j);
         // ...
      }
   }
}
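
The nested loop pattern used here (inner index starting at i + 1) visits every unordered pair of distinct tracks exactly once. As a rough sketch in plain C++, independent of ROOT, the pattern can be wrapped in a small helper; ForEachPair is a hypothetical name chosen for this example, and the element type is left generic so it could be a track class or anything else.

```cpp
#include <cstddef>
#include <vector>

// Call func(a, b) once for every unordered pair of distinct elements
// and return how many pairs were visited.
template <typename T, typename F>
std::size_t ForEachPair(const std::vector<T> &items, F &&func)
{
   std::size_t nPairs = 0;
   for (std::size_t i = 0; i < items.size(); ++i) {
      // Starting at i + 1 skips self-pairs and avoids visiting (b, a) after (a, b).
      for (std::size_t j = i + 1; j < items.size(); ++j) {
         func(items[i], items[j]);
         ++nPairs;
      }
   }
   return nPairs;
}
```

For n elements this visits n(n-1)/2 pairs, so a collection of 566 tracks gives 159895 calls to the pair function.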

alandou · July 3, 2020, 5:07pm

Of course!
This tree structure messed with my head.
Problem solved.
Thank you for the help, and the RDataFrame/TTreeReader suggestions, I’ll definitely take some time to adopt one of those.