Esoteric overwriting of pointer content

Hi all,
I am currently trying to run code, which can be found in

and I ran into a problem where I don’t know how to proceed anymore. In the past, this code has been running without problems.
The CMSSW version is 11_3_2, which apparently has ROOT 6.22/09 running.

Let me first describe what the problem is and then go through the debugging steps I did. In L344-346, TTrees are read from a file and saved into pointers. These TTrees do exist and contain what is expected. However, when trying to access branches in L408, these branches do not exist anymore.

I tried to find out what happened here and found that the branches do not exist anymore because the objects that the pointers point at are not the same anymore. These TTrees get overwritten in L371-374, when new TTrees are allocated and assigned to new pointers. Before this, the TTrees are accessible and contain what is expected.

To be clear, this means that the pointer sectorNameTree, which is initialized in L374, points to the same address as the pointer baselineTreeX, which was initialized in L344. For some reason, ROOT or C++ thought that this place in memory was free and put a new TTree there, deleting the old content.

But why do these objects get overwritten? I was not sure, but I suspected it might have something to do with directories. So I checked what is in the current directory by doing gDirectory->GetList() at various places in the code. I found out that in fact, the TTrees vanish from the current directory once a new file is opened in L357. To me, this is unexpected, because usually, if one opens a new file and thus a new directory, all objects will be carried over.

So my next idea was to prevent the objects from getting overwritten. I tried this in two ways: One was using SetDirectory(nullptr) on the TTrees. This does not change anything. Another option was to add the TTrees to the directory again after opening the file (SetDirectory(gDirectory)). Adding them succeeds in making them appear in gDirectory->GetList(), but does not prevent them from being overwritten anyway.

I am really confused by this behavior and don’t know how to prevent it. Is this intended behavior? Was anything about directories changed in recent versions of ROOT?

Cheers,
Marius

Hi Marius,

Thank you for your wonderful question - really excellent, easy to understand and all the info we need! (Love the title, too!)

The answer that you don’t want to hear: it’s because of PAW :) Let me explain:

ROOT (old ROOT, predating RDataFrame and RNTuple etc.) has the notion of a TDirectory, seen as a successful concept carried over from PAW: things have a name, and if a new thing with the same name is added to the directory, it replaces the old thing with that name. So you were on the right track, but you did it too late: the new TTree was already created and had already replaced the old one.

This might help:

  if (firstIter) {  // should be always true in setBaseline mode, since file is recreated
    gROOT->cd(); // Force subsequent trees to be in-memory trees
    if (!setBaseline) {

Now - these will produce in-memory trees, but according to line 881 that doesn’t seem to be what you want? As you know, TTrees are generally storage-backed, where the storage to be used is determined at creation time, not at Write() time, because Fill() can cause flushing to storage. But you cannot write to the same TFile with the same name and read the old TTree. How important is it that you update the existing file, rather than writing to a new one?

OK, more questions than answers, sorry about that :)

Hi Axel,
Thanks a lot for your quick answer. Unfortunately, I am not quite sure that this is the root (hehe) of the problem. Doing gROOT->cd() does not change the error I get; the tree is overwritten all the same. To be clear, I did the SetDirectory(nullptr) before opening the new file. Also, if the problem is related to new objects with the same name being created and overwriting the old objects, then why is a TTree with the internal name “iterTreeX” replaced by a TTree with the internal name “nameTree”? In this case, I would expect that iterTreeX gets replaced by iterTreeX, and so on.

Also, in that case, shouldn’t the problem be solvable by just renaming the trees read from the baseline file with “SetName()” so they don’t get overwritten? I tried this and it did help.

To make things more confusing, I managed to make the code work by backporting this file to code from a previous CMSSW version (10_6_X). However, there are no major changes in that portion of the code and if I undo the few changes in this code portion that there are, this does not help. I will investigate further soon.

Cheers,
Marius

Hi Marius,

Hmm then I misread the tree names? Very well possible of course.

You can certainly ask valgrind for help. I will be off until Mon, back on Tue - if you didn’t solve it yourself by then why don’t I try again. In that case a reproducer would certainly help, even if it’s with empty trees or just one entry or whatever…

Hi Axel,
Since I managed to make the older version of the code run, this is not high priority anymore. In the following days I will try to debug this problem by implementing the changes between the old and the new version of the code one by one and testing where the problem arises. I will come back with my findings then.

Thanks for your help so far. Reproducing might be quite hard, however, unless you have access to CMSSW and CERN AFS.

Cheers,
Marius

Sorry for the spam, but I found the lines of code that cause this issue. They are in ApeEstimatorSummary.cc in cms-sw/cmssw (master branch) on GitHub.
This part of the code runs before the function that gives the error (calculateApe()). In these lines, the tree pointers are deleted after the trees have been written to a file and the file has been closed.

I assume that the reason for the behavior is that closing the file already removes the trees that were in its directory, and that deleting the tree pointers afterwards then corrupts something in ROOT's memory management. I am not 100% sure if this is expected behavior; I would in fact expect a segmentation fault when trying to delete a pointer that is empty. Also, the file and trees removed here are not related to those opened in the next function. However, removing the lines makes the code run as expected again.
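
The ownership problem suspected above can be sketched with a toy model in plain C++. This is not ROOT code: ToyFile, ToyTree, makeTree(), detach() and close() are hypothetical names invented for illustration, and the sketch assumes that closing a file destroys whatever is still attached to it. In C++, deleting an object a second time is undefined behavior, so a segmentation fault is not guaranteed; a corrupted heap is one plausible way for a later allocation to land on the address of a live object. The sketch shows one way to avoid the situation: take ownership explicitly before closing, and never call delete by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a tree. This is NOT ROOT's TTree.
struct ToyTree {
    std::string name;
    explicit ToyTree(std::string n) : name(std::move(n)) {}
};

// Toy stand-in for a file that owns every tree attached to it.
class ToyFile {
public:
    // Creates a tree owned by this file and returns a non-owning pointer.
    ToyTree* makeTree(const std::string& name) {
        owned_.push_back(std::make_unique<ToyTree>(name));
        return owned_.back().get();
    }

    // Transfers ownership of one tree to the caller; returns nullptr if
    // the tree is not owned by this file.
    std::unique_ptr<ToyTree> detach(const ToyTree* tree) {
        auto it = std::find_if(owned_.begin(), owned_.end(),
                               [tree](const std::unique_ptr<ToyTree>& p) {
                                   return p.get() == tree;
                               });
        if (it == owned_.end()) return nullptr;
        std::unique_ptr<ToyTree> released = std::move(*it);
        owned_.erase(it);
        return released;
    }

    // Destroys every tree that is still attached.
    void close() { owned_.clear(); }

    std::size_t ownedCount() const { return owned_.size(); }

private:
    std::vector<std::unique_ptr<ToyTree>> owned_;
};

// Hypothetical usage: keep one tree alive past close() and let the file
// clean up the other. No manual delete appears anywhere.
std::string nameOfTreeKeptAfterClose() {
    ToyFile file;
    ToyTree* wanted = file.makeTree("iterTreeX");
    file.makeTree("nameTree");  // stays owned by the file
    std::unique_ptr<ToyTree> kept = file.detach(wanted);
    assert(kept != nullptr);
    file.close();               // destroys only "nameTree"
    return kept->name;          // still valid: the caller owns it
}
```

In this model, a raw pointer obtained from makeTree() must not be used or deleted after close() unless the tree was detached first. That matches the fix found here, where dropping the manual deletes after closing the file made the code behave again.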

Cheers,
Marius


Thanks for sharing the mystery’s resolution!
