Looping over file contents

Maarten_Boonekamp · July 28, 2020, 3:02pm

ROOT Version: 6.18
Platform: Ubuntu
Compiler: 7.5.0

Hello,

When looping over the keys in a file, is there a safe and more efficient way to retrieve the name of the corresponding object than obj = key->ReadObj() followed by obj->GetName()?

I compared the standard (I think?) procedure:

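  TList* list = dir->GetListOfKeys();
  TIter next(list);
  vector<TH1*> vTemp;
  TKey* key = nullptr;

  while ( (key = (TKey*)next()) ) {
    // Read the whole object just to learn its name, then discard it.
    TObject* obj = key->ReadObj();
    string histoname = obj->GetName();
    delete obj;
    // Fetch the object again, this time by name.
    vTemp.push_back( (TH1*) f->Get( histoname.c_str() ) );
  }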

and this short-cut :

  TList* list = dir->GetListOfKeys();
  TIter next(list);
  vector<TH1*> vTemp;
  TKey* key = nullptr;

  while ( (key = (TKey*)next()) ) {
    // Assume the key name is also the object name.
    string histoname = key->GetName();
    vTemp.push_back( (TH1*) f->Get( histoname.c_str() ) );
  }

The latter is orders of magnitude faster, but only works when the key name matches the corresponding object name, which is not always guaranteed.
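For illustration, here is a minimal ROOT-free sketch of a third option: read each object once, keep it, and take the name from the object that was just read. ToyObject, ToyFile, read and loadAllOnce are hypothetical names invented for this model and are not part of ROOT. In ROOT terms this would presumably correspond to keeping the pointer returned by key->ReadObj() instead of deleting it and calling f->Get() again.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stored histogram: only the name matters here.
struct ToyObject {
    std::string name;
};

// Hypothetical stand-in for a file: key name -> stored object.
// readCount tracks how many times an object has been read back.
struct ToyFile {
    std::map<std::string, ToyObject> records;
    mutable int readCount = 0;

    // Models an expensive read: returns a fresh copy owned by the caller,
    // or a null pointer if no such key exists.
    std::unique_ptr<ToyObject> read(const std::string& keyName) const {
        auto it = records.find(keyName);
        if (it == records.end()) return nullptr;
        ++readCount;
        return std::unique_ptr<ToyObject>(new ToyObject(it->second));
    }
};

// Read every object exactly once and keep it. The object name is available
// from the returned object, so it does not have to match the key name.
std::vector<std::unique_ptr<ToyObject>> loadAllOnce(const ToyFile& file) {
    std::vector<std::unique_ptr<ToyObject>> out;
    for (const auto& entry : file.records) {
        if (auto obj = file.read(entry.first)) out.push_back(std::move(obj));
    }
    return out;
}
```

With N keys this model performs N reads, whereas the read, delete, read-again pattern performs 2N, and it does not rely on key and object names agreeing.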

Am I missing something? If not, wouldn’t it be useful to provide a function returning the list of objects in a file (bypassing the key list), or a way to access object name from the key without reading in the entire object?

Thanks
Maarten

vpadulan · July 28, 2020, 3:22pm

Hi @Maarten_Boonekamp,
Check out this answer and tell me if it helps.
Cheers,
Vincenzo

pcanal · July 28, 2020, 3:27pm

If the object name does not match the key name, by definition, the object name information is then only available inside the object stream itself. In order to avoid having to unbox, you would need to enforce the “rule” that those 2 names must be the same.
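To make the point concrete, here is a small ROOT-free model of that layout. ToyRecord, writeRecord, objectNameOf and listKeyNames are hypothetical names, and the newline-separated payload is an assumption made only for this sketch, not ROOT's actual on-disk format.

```cpp
#include <string>
#include <vector>

// Hypothetical on-disk record: the key name sits in a small header, while
// the object's own name is serialized inside the payload with its contents.
struct ToyRecord {
    std::string keyName;  // cheap to read: part of the directory listing
    std::string payload;  // "objectName\n" followed by the (large) contents
};

// Serialize an object: its name goes into the payload, not the header.
ToyRecord writeRecord(const std::string& keyName,
                      const std::string& objectName,
                      const std::string& contents) {
    return ToyRecord{keyName, objectName + "\n" + contents};
}

// Recovering the object name means decoding the payload ("unboxing").
// In a real file the payload may be compressed, so this step is not free.
std::string objectNameOf(const ToyRecord& rec) {
    return rec.payload.substr(0, rec.payload.find('\n'));
}

// Listing by key never touches the payload.
std::vector<std::string> listKeyNames(const std::vector<ToyRecord>& file) {
    std::vector<std::string> names;
    for (const ToyRecord& rec : file) names.push_back(rec.keyName);
    return names;
}
```

In this model listKeyNames only looks at headers, while objectNameOf has to open the payload, which is the cost the rule of matching names is meant to avoid.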

Cheers,
Philippe.

Maarten_Boonekamp · July 28, 2020, 4:43pm

Thanks for the answer!
I’m not sure… these examples just count keys and print their class names; I am comparing two methods to store a list of pointers to objects from a file, for more manipulation downstream. Sorry if I’m missing something…
Cheers,
Maarten

vpadulan · July 28, 2020, 4:54pm

Hi @Maarten_Boonekamp,
The link shows the recommended way to iterate over the TList, as also shown in the docs.
Still, the caveat written by @pcanal is very true: you need the key and the object to have the same name to avoid further operations.
Cheers,
Vincenzo

Maarten_Boonekamp · July 28, 2020, 5:04pm

Thanks Philippe.

So you confirm that there is no such function as file->GetListOfObjects() returning a TList of TObjects, and that there is no other way to access an object’s name from a key than reading it in entirely?
Can I ask why (sorry for my ignorance)? TKey always seems an unnecessary layer to me, for such admittedly simple applications.

In the examples above, I do some further simple manipulations on the retrieved histograms, but in the first case, the CPU time is completely dominated by the somewhat superfluous calls to ReadObj (again, only needed to get the name of the object). In the second case I access the histogram contents in the same way downstream, but they are never explicitly read in, and the execution is ~instant, but I will miss histograms that don’t have the same name as the key.

The problem is that when user1 writes out his histograms with h->Write(name), “name” is that of the key but the object preserves its initial name. When user2 later uses this, he gets into the sort of problems above.

So I think it would be useful to provide direct access to the list of objects in a file, or enable a more efficient access to an object name from a key. Another simple possibility would be to enforce that Write(name) also updates the name of the written object (this might pose other issues but I think this is what most people expect, actually).
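As a ROOT-free sketch of the two behaviours discussed here: writeAs models what h->Write(name) is described as doing above (the key gets the new name, the stored object keeps its own), and writeRenamed models the proposed convention where the stored copy is renamed to match the key. ToyHist, ToyStore and all three functions are hypothetical names for this model only.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a histogram: only the name matters here.
struct ToyHist {
    std::string name;
};

// Hypothetical stand-in for a file: key name -> stored copy of the object.
using ToyStore = std::map<std::string, ToyHist>;

// Models h->Write(name) as described in the thread: the key gets the
// given name, the stored object keeps its original name.
void writeAs(ToyStore& store, const ToyHist& h, const std::string& keyName) {
    store[keyName] = h;
}

// Models the proposed convention: the stored copy is renamed to the key
// name. h is taken by value, so the caller's object is left unchanged.
void writeRenamed(ToyStore& store, ToyHist h, const std::string& keyName) {
    h.name = keyName;
    store[keyName] = h;
}

// True if every key name equals the name of the object stored under it,
// i.e. if the key-name shortcut would find every object.
bool namesConsistent(const ToyStore& store) {
    for (const auto& entry : store) {
        if (entry.first != entry.second.name) return false;
    }
    return true;
}
```

A reader could run a check like namesConsistent once per file, at the cost of one full read, and then rely on key names afterwards; whether that trade-off is worthwhile depends on how often the file is reused.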

Cheers,
Maarten

Maarten_Boonekamp · July 28, 2020, 5:06pm

Thanks again Vincenzo! I’ll improve my syntax but this is not really the issue, which I tried to explain better above.
Cheers,
Maarten
