Read TRefArray with TTreeReader

Dear experts,
I hope to ask the recommended way to read TRefArray using TTreeReader. It is related to this thread. I am trying to read from the output file generated by Delphes. And the FatJet.Constituents branch stores the TRefArray which point to the objects I need. With TTreeReaderArray I can directly read out it but if I loop other branch before, the result of TTreeReader got distortion.
The test code and root file are listed here if helpful. The snippet is here:

#include "classes/DelphesClasses.h"
R__LOAD_LIBRARY(libDelphes)
bool test_consti(const TRefArray & constituents) { 
  if(constituents.GetEntries()<1) return false;
  std::cout<<"[0] UID "<<constituents.GetUID(0)<< " class "<<constituents.At(0)->ClassName()<<std::endl;
  if(TString(constituents.At(0)->ClassName())=="GenParticle") {
      auto const p = dynamic_cast<const GenParticle*>(constituents.At(0));
      std::cout << p->PT<<std::endl;
  }
  if(TString(constituents.At(0)->ClassName())=="ParticleFlowCandidate") {
      auto const p = dynamic_cast<const ParticleFlowCandidate*>(constituents.At(0));
      std::cout << p->PT<<std::endl;
  }
  if(TString(constituents.At(0)->ClassName())=="Muon") {
      auto const p = dynamic_cast<const Muon*>(constituents.At(0));
      std::cout << p->PT<<std::endl;
  }
  return true;
}
void tester(){
  TChain* chain = new TChain("Delphes");
  chain->Add(TString::Format("top.reco.root"));
  TTreeReader r(chain);
  TTreeReaderArray<int> rp(r, "Particle.PID");
  TTreeReaderArray<TRefArray> rv(r, "FatJet.Constituents");
  int count=0;
  while (r.Next()) {
    count++;
    if(count<158)
      continue;
    std::cout<< count-1 << std::endl;
    // for (const int & each : rp) {
    //   ;
    // }
    for (const TRefArray & each : rv) {
      test_consti(each);
    }
    if(count>158)
      break;
  }
}

Un-comment the 3 lines of codes would change the result of TTreeReaderArray<TRefArray>.

I am not sure whether I mis-use the TTreeReader and thank you for any comment or suggestion.

Best,
Qi


_ROOT Version:6.30/02
_Platform:x86_64-centos7-gcc12-opt (3.10.0-1127.el7.x86_64)
_Compiler:linuxx8664gcc


Hi Qi,

Could you share with us the input file too, in order for us to be able to reproduce?

Best,
D

Hi Danilo,
Yes I put the root file here and could you access it? (sorry the size of file is kind of large to upload here, as the issue happen quite randomly I need to produce some events to reproduce it. Let me know if I need to skim the files further)

Best,
Qibn

Hi Danilo,
I uploaded the root file and not sure whether it is enough to reproduce. Do I need to upload more events for the test?
Thank you!

Qi.

link to root file

Hi Qi,

Apologies that I did not reply to this thread, I missed your answer. Do you see the problem also without Delphes?

Cheers,
D

Hi Danilo,
I only tested with the Delphes output tree and not sure whether it is common for ROOT TRefArray. If there is root test file of TRefArray I can have try.

Best,
Q

Dear @qiqiqi ,

Thanks for reporting on our forum. I understand the boundaries of your problem, but if we had a ROOT-only reproducer, without the need for external dependencies, it would be much easier to debug. I understand this may not be trivial to create, so I will try to reproduce your environment with Delphes, note that this will take time.

Best,
Vincenzo

Hi @vpadulan ,
Thank you for your comment and time investigating on this. Apologize currently I only have the test code and root file based on the Delphes output (and only reproduce on specific events, thus to make more test cases might be tricky). I can help from my side to provide more info and test.
(the dependency I guess probably just the custom class definition which could be find here )
Best,
Q

On the surface the code looks okay. How is the uncommenting the code changing the result?

Hi, in the folder I attached two logs which show the difference
scripts and logs
============= with commented (directly loop the TRefArray)
157
[0] UID 3782 class Muon
79.9007
158
[0] UID 1758 class ParticleFlowCandidate
56.0664
============= loop other collection then loop TRefArray
157
[0] UID 3782 class Muon
79.9007
158
[0] UID 1758 class GenParticle
0.273852

I can reproduce your problem and it boils down to the GenParticle objects from the previous entries not being deleted and thus one of them with the 1758 uid being found instead of the ParticleFlowCandidate.
To work around the problem add (before the loop):

   TClonesArray *particles = nullptr;
   chain->SetBranchAddress("Particle", &particles);

and within the loop (right before its end):

      if (particles)
         particles->Delete(); // This deletes the contained objects.
1 Like

Thank you so much! Just another question is there an equivalent way to apply this when using RDataFrame (even in PyROOT)? Thank you!
Best,
Qibin
Assuming we have a function to process
Func(ROOT::VecOps::RVec<TRefArray>& arr){...};

Dear @qiqiqi ,

Using the suggestions from @pcanal I was able to create a minimal working example using RDF. Note that due to having to specify the types of the columns explicitly for this specific issue, this is best done in a strongly-typed fully-compiled environment:

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <classes/DelphesClasses.h>

void test_consti(const TRefArray &constituents) {

  if (constituents.GetEntries() < 1) {
    std::cout << "Found no constituents.\n";
    return;
  }

  std::cout << "[0] UID " << constituents.GetUID(0) << " class "
            << constituents.At(0)->ClassName() << std::endl;

  if (auto const p = dynamic_cast<const GenParticle *>(constituents.At(0))) {
    std::cout << p->PT << std::endl;
  } else if (auto const p = dynamic_cast<const ParticleFlowCandidate *>(
                 constituents.At(0))) {
    std::cout << p->PT << std::endl;
  } else if (auto const p = dynamic_cast<const Muon *>(constituents.At(0))) {
    std::cout << p->PT << std::endl;
  } else {
    std::cout << "Could not recognize class type of constituent.\n";
  }
}

// The TClonesArray & parameter is required to ensure proper cleanup of the
// contents from the previous entry
void test_fun(ULong64_t entry, const ROOT::RVecI &PIDs,
              const ROOT::RVec<TRefArray> &allConstituents,
              TClonesArray &clones) {
  std::cout << "Entry: " << entry << "\n";
  std::cout << "Number of PIDs: " << PIDs.size() << "\n";
  // Traverse the PIDs to trigger the issue from
  // https://root-forum.cern.ch/t/read-trefarray-with-ttreereader/60477
  for (auto &&pid : PIDs) {
  }
  for (auto &&each : allConstituents)
    test_consti(each);
  // Make sure the references are cleaned up, this solves the forum issue
  clones.Delete();
}

void testrdf() {
  ROOT::RDataFrame df{"Delphes", "top.reco.root"};
  df.Range(157, 159).Foreach(test_fun, {"rdfentry_", "Particle.PID",
                                        "FatJet.Constituents", "Particle"});
}

int main() { testrdf(); }

I installed libDelphes on my personal machine. Then I was able to compile the above example with

CPLUS_INCLUDE_PATH=../install/include:$CPLUS_INCLUDE_PATH LD_LIBRARY_PATH=../install/lib:$LD_LIBRARY_PATH g++ -O3 -o delphes_mwe_rdf.out delphes_mwe_rdf.cpp `root-config --cflags --glibs` -L../install/lib -lDelphes

And run it with

CPLUS_INCLUDE_PATH=../install/include:$CPLUS_INCLUDE_PATH LD_LIBRARY_PATH=../install/lib:$LD_LIBRARY_PATH ./delphes_mwe_rdf.out

I expect you won’t need to explicitly set all the env variables in your case since I assume you have the software stack already set up.

Note that the above example is just to reproduce your issue and to show the fix as suggested by @pcanal . If you want to build a more realistic RDF application I invite you to take a look at the docs and tutorials. In case you need more assistance I can surely help you develop the analysis further.

Cheers,
Vincenzo

1 Like