Get objects in a file in O(1) time?

Hello,

I have 1,000,000 TH2D objects stored directly in a .root file, and I am using TFile::GetObject() to retrieve them. The more TH2D objects I read, the longer it takes. Since TFile::GetObject() gets an object by name, I assume there is some searching in this method, so it is likely to take O(n) time to get each object. For 1,000,000 objects, the reading time has become unacceptable.

Is there any way to get objects from a file in O(1) time?

Hello,

I can imagine two things you could do to speed up access:

  1. You can organize the histograms in a tree of directories and subdirectories.
  2. You can store the histograms in a tree and access them by number.
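
For the first option, here is a minimal sketch of how a flat event index could be mapped to a nested directory path so that no directory holds more than a fixed number of entries. The helper name `bucketPath`, the `top…/sub…/h…` naming scheme, and the fan-out of 100 are all assumptions for illustration, not anything ROOT prescribes. The code is plain C++ and only builds the path string.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: map a flat histogram index to a two-level directory
// path. With fanout = 100 and 1,000,000 histograms, the top level holds 100
// directories, each holds 100 subdirectories, each holds 100 histograms.
std::string bucketPath(unsigned long index, unsigned long fanout = 100)
{
    assert(fanout > 0);
    unsigned long leaf = index / fanout;   // running number of the leaf directory
    unsigned long top  = leaf / fanout;    // top-level directory number
    unsigned long sub  = leaf % fanout;    // subdirectory number inside `top`
    char buf[96];
    // The naming scheme (top/sub/h + zero padding) is illustrative only.
    std::snprintf(buf, sizeof(buf), "top%03lu/sub%03lu/h%07lu", top, sub, index);
    return std::string(buf);
}
```

For example, index 123456 maps to `top012/sub034/h0123456`. A path like this could then be used when writing and reading the histograms, assuming the directories are created when the file is written; whether this actually helps depends on where the time is really spent.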

In general, 1M histograms in a ROOT file sounds like a lot and I’m wondering if you can share a bit more context about your use case. Also including @pcanal who might have additional comments.

Cheers,
Jakob

Thank you for the suggestions, jblomer.

These TH2D objects are PMT waveforms from multiple channels, and each of them represents one cosmic muon event. We had our detector running for more than 24 h and recorded about 1M muon events. Rewriting the data means redoing the experiment :upside_down_face:. Therefore I am hoping there is a way to retrieve these data without rewriting the file.

Hi @JINGYU,
you can probably rearrange the histograms into a new structure without redoing the experiment, but first it would be interesting to check what exactly is taking the time, e.g. with perf or other profiling tools:
it’s possible that the time is actually spent in slow clean-up operations, if the histograms end up in ROOT’s global “lists of objects”.

Also, a simple thing you can try, if it works for your use case, is calling the static method TH1::SetDirectory(nullptr) before you start retrieving the histograms, which detaches histograms from these lists; see the ROOT TH1 class reference.

Cheers,
Enrico

If you are retrieving (almost) all the objects, and if Enrico’s idea (calling SetDirectory) does not help enough, then another thing to do is to iterate over the TKeys:

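auto keys = mydirectory->GetListOfKeys();
TIter next(keys);
TKey *key = nullptr;
// TIter returns TObject*, so cast each entry to TKey*.
while ((key = (TKey *)next())) {
    // ReadObject<TH1>() returns nullptr if the key is not a TH1-derived object.
    TH1 *hist = key->ReadObject<TH1>();
    if (hist) {
        hist->SetDirectory(nullptr);
        // Record the histogram somewhere, otherwise it is lost.
    }
}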

Thank you eguiraud and pcanal, TKey::ReadObject() solves my problem perfectly.

TH1::SetDirectory(nullptr) seems to have no effect, perhaps because I deleted the TH2D I got from TFile::GetObject() to solve a memory leak problem. I had to look into the source code to locate this memory leak. Maybe a warning or an auto-cleaning feature could be added to the “global list of objects”?

OK, then the time was not spent in additions to or removals from the global list of objects. perf or other performance analysis tools can actually show where the time was spent, but I’m glad that Philippe’s suggestion was spot on :slight_smile:
