File with many histos -- ROOT without garbage collection

Dear ROOTers,

I was wondering how one could disable the filling of the various TLists of objects (or at least some of them) that ROOT creates when its libraries are linked into a C++ application. This garbage collection, which uses TLists and loops over them iteratively, is far too slow for my use case. Or perhaps I am using the wrong model.

I have a folder structure of depth 2x9x3x3, and each end folder contains ~7K histograms. When looping over my file to do some operations, like scaling those histograms and saving them in a new file, the first folders are processed normally, but as I proceed it gets slower and slower, even though I delete every histogram I get from TKey::ReadObj(). Should I be deleting the TKey as well or is this done automatically ?

When I interrupt the program in gdb, I see that most of the time is spent in RecursiveRemove() instead of doing any useful (to me) work. I understand the lists are useful for interactive ROOT, but I don’t understand why they should slow me down when I am only linking against the ROOT libraries.

I already tried using SetOwnership and calling Delete() on the TLists when I exit a folder, but that didn’t solve my problem.

I was also wondering how to add objects (histograms, canvases) to a current ROOT folder without causing infinite loops. I tried to clone the TList from TDirectory::GetListOfKeys() but without success. How should one go about this?

I tried the following, i.e. cloning the ListOfKeys() before starting, without success:

[code]#include "TDirectory.h"
#include "TKey.h"
#include "TList.h"

void ScaleHistos( TDirectory *dir ) {
	// Appending ->Clone() to the next line is what triggers the crash below.
	TList *listOfKeys = (TList*) dir->GetListOfKeys();//->Clone(); // Use of Clone() causes problems.

	TIter next(listOfKeys);
	while (TKey* key = (TKey*) next()) {
		TObject *obj = key->ReadObj(); //<-- Segmentation fault if listOfKeys is a clone.
	}
}[/code]

I also realized that cloning a TList does not change its name. Why is this intended ?

[code]root [0] f = new TFile("tmp.root","recreate");
root [1] f->mkdir("tmpDir");
root [2] f->GetListOfKeys()->Print()
Collection name='THashList', class='THashList', size=1
 TKey Name = tmpDir, Title = tmpDir, Cycle = 1
root [3] TList * tmpList = (TList*) f->GetListOfKeys()->Clone("tmpList");
root [4] tmpList->Print();
Collection name='THashList', class='THashList', size=1
 TKey Name = tmpDir, Title = tmpDir, Cycle = 1[/code]
I understand that this is because the TList doesn’t inherit from TNamed.

I also thought of storing my histograms in a TTree but I get stuck when defining my branch because all my histograms do not have the same binning, and hence I need to use a generic TH1F pointer, which will point to the current histogram I want to fill. But then I cannot instantiate the branch using TTree::Branch with a NULL pointer. Any suggestion on how to do that ? I looked at the tutorial about storing histograms inside a ROOT tree but this is not the same use case. Perhaps it is not compatible with what I intend.

Thanks a lot for any suggestion,
Karolos

[quote]Should I be deleting the TKey as well or is this done automatically ?[/quote]It is done automatically (when you close the file).

To reduce the time spent in the ‘garbage collection’ and assuming that you are 100% sure that no pointer to the contained object is shared and assuming you explicitly delete the TFile object, you can remove the TFile object from the list of files “gROOT->GetListOfFiles()->Remove(myfile);”

[quote] TList *listOfKeys = (TList*) dir->GetListOfKeys();//->Clone(); // Use of Clone() causes problems.[/quote]Do not clone the list of keys. I have no idea what the semantics of such a clone would be, but it is certainly a bad idea: the TKeys are not intended to be copied, they are ‘handles’ to the bits in the TFile. Instead you must copy/clone/move the objects they point to.

[quote]I was also wondering how to add objects (histograms, canvases) to a current ROOT folder without causing infinite loops[/quote]Hmm … I don’t know what the problem is. The following works fine:
[code]mylist->Add(myhisto);[/code]
so your use case must be more complex …

Why do you need to clone the list and/or the objects?

[quote]Any suggestion on how to do that ? I looked at the tutorial about storing histograms inside a ROOT tree but this is not the same use case. [/quote]Why? How does the example not work for you?

[quote]because all my histograms do not have the same binning, and hence I need to use a generic TH1F pointer[/quote]What’s the difference between a ‘regular TH1F pointer’ and a ‘generic TH1F pointer’? The binning information is held in regular data members and should not affect the storing.

[quote]I also realized that cloning a TList does not change its name. Why is this intended ?[/quote]It is fixed in the trunk (revision 37411).

Cheers,
Philippe.

[quote=“pcanal”][quote]Should I be deleting the TKey as well or is this done automatically ?[/quote]It is done automatically (when you close the file).

To reduce the time spent in the ‘garbage collection’ and assuming that you are 100% sure that no pointer to the contained object is shared and assuming you explicitly delete the TFile object, you can remove the TFile object from the list of files “gROOT->GetListOfFiles()->Remove(myfile);”

[/quote]

Unfortunately that won’t do it because I am dealing with several directories in the same file. Is there any equivalent for a given directory?

[quote=“pcanal”][quote] TList *listOfKeys = (TList*) dir->GetListOfKeys();//->Clone(); // Use of Clone() causes problems.[/quote]Do not clone the list of keys. I have no idea what the semantics of such a clone would be, but it is certainly a bad idea: the TKeys are not intended to be copied, they are ‘handles’ to the bits in the TFile. Instead you must copy/clone/move the objects they point to.

[quote]I was also wondering how to add objects (histograms, canvases) to a current ROOT folder without causing infinite loops[/quote]Hmm … I don’t know what the problem is. The following works fine:
[code]mylist->Add(myhisto);[/code]
so your use case must be more complex …

Why do you need to clone the list and/or the objects?

[/quote]

I use GetListOfKeys() to loop over all objects in the directory, and make a recursive call of my function in case of a subdirectory. I then select the operation I need to do on a histogram given its name. I was trying to clone the list to avoid an infinite loop: because I kept adding histograms to the file, the ListOfKeys would always get extra items in it. Probably I am not doing this correctly.

Regarding the TTree branch: this is probably because I was doing this wrong. I was defining the new branch in my main, using a NULL TH1F*, which would only be assigned later on for each histogram in my file. I’ll use a dummy TH1F to set the Branch.

Thanks,
Karolos

[quote]Unfortunately that won’t do it because I am dealing with several directories in the same file. Is there any equivalent for a given directory?[/quote]I don’t see the problem. As long as you do not have the same histogram objects shared amongst several directories, removing the TFile would be enough (as it implicitly also removes all its sub-directories).

[quote]I use GetListOfKeys() to loop over all objects in the directory, and make a recursive call of my function in case of a subdirectory. I then select the operation I need to do on a histogram given its name. I was trying to clone the list to avoid an infinite loop: because I kept adding histograms to the file, the ListOfKeys would always get extra items in it. Probably I am not doing this correctly.[/quote]I see. You needed to make a ‘shallow’ copy of the collection instead of a ‘deep’ copy. Clone is by definition a deep copy. To do what you need, simply create an empty list, then iterate through the TKey pointers and add each pointer to your list.

[quote]This is probably because I was doing this wrong. I was defining the new branch in my main, using a NULL TH1F*, which would only be assigned later on for each histogram in my file. I’ll use a dummy TH1F to set the Branch.[/quote]The following should work:
[code]TH1F *pointer = 0;
tree->Branch("name_of_the_branch","TH1F",&pointer);[/code]

Cheers,
Philippe.