Broken TDirectoryFile::FindObjectAny and TDirectoryFile::FindKeyAny

It seems to me that TDirectoryFile::FindObjectAny and TDirectoryFile::FindKeyAny misbehave.
Assume that you have several objects with the same “name” and different “cycle” (e.g. “name;1”, “name;2”).
Both these methods return valid objects for any 0 < “cycle” < 32768 (e.g. for “name;3”, “name;4”, …, “name;32767” they happily return the same as for “name;2” and for “name;32768” one may even get a “segmentation violation” sometimes).
Please fix it.
These methods should return “non-null” pointers only for “cycles” which really exist in the file (actually, the highest allowed “cycle” in a file should be 9998, because 9999 means a “memory resident” object).

Hi Pepe,

Indeed, there is a problem in the parsing of the cycle value (poor handling of values that don’t fit in a short …). I shall update this shortly.

Thanks,
Philippe.
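To illustrate the kind of problem described above (a cycle value that does not fit in a short), here is a minimal standalone sketch of a range-checked parser. It is not ROOT's actual parsing code: the helper name `ParseCycle` and its return convention are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <limits>
#include <optional>
#include <string>

// Hypothetical helper (NOT ROOT's parser): extract the cycle from a
// "name;cycle" string and reject anything that does not fit in a short.
std::optional<short> ParseCycle(const std::string &nameCycle) {
    const auto pos = nameCycle.rfind(';');
    if (pos == std::string::npos || pos + 1 == nameCycle.size())
        return std::nullopt;                 // no cycle part given

    const char *begin = nameCycle.c_str() + pos + 1;
    char *end = nullptr;
    errno = 0;
    const long value = std::strtol(begin, &end, 10);

    if (*end != '\0' || errno == ERANGE)
        return std::nullopt;                 // trailing junk, or too big even for a long
    if (value < 0 || value > std::numeric_limits<short>::max())
        return std::nullopt;                 // would be narrowed when stored in a short
    return static_cast<short>(value);
}
```

With such a check, a request like "hpx;32768" is rejected up front instead of being silently narrowed to some unrelated cycle number.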

Hi Pepe,

This is fixed in the master.

Thanks,
Philippe.

How about “v5-34-00-patches”? 8-[

done.

Philippe.

I compiled the new “v5-34-00-patches” and I still see this problem.

Hi,

Pepe, can you send me an explicit reproducer? I get

root [15] f->ls()
TFile**         hsimple.root    Demo ROOT file with histograms
 TFile*         hsimple.root    Demo ROOT file with histograms
  OBJ: TH1F     hpx     This is the px distribution : 0 at: 0x2d3d190
  KEY: TH1F     hpx;8   This is the px distribution
  KEY: TH1F     hpx;7   This is the px distribution
  KEY: TH1F     hpx;6   This is the px distribution
  KEY: TH1F     hpx;5   This is the px distribution
  KEY: TH1F     hpx;4   This is the px distribution
  KEY: TH1F     hpx;3   This is the px distribution
  KEY: TH1F     hpx;2   This is the px distribution
  KEY: TH1F     hpx;1   This is the px distribution
  KEY: TH2F     hpxpy;1 py vs px
  KEY: TProfile hprof;1 Profile of pz versus px
  KEY: TNtuple  ntuple;1        Demo ntuple
root [16] delete f
root [17] f = TFile::Open("hsimple.root","UPDATE")
(class TFile*)0x29a4ae0
root [18] f->FindObjectAny("hpx;9")
(const class TObject*)0x2d3d190
root [19] f->FindObjectAny("hpx;0")
(const class TObject*)0x0
root [20] f->FindObjectAny("hpx;32678")
(const class TObject*)0x2e640d0
root [21] f->FindObjectAny("hpx;32679")
(const class TObject*)0x2e5b870
root [22] f->FindObjectAny("hpx;33000")
(const class TObject*)0x0
root [23] f->FindObjectAny("hpx;32766")
(const class TObject*)0x2e630e0
root [24] f->FindObjectAny("hpx;32767")
(const class TObject*)0x0
root [25] f->FindObjectAny("hpx;32768")
(const class TObject*)0x0

Cheers,
Philippe.

Note that f->FindObjectAny("hpx;9") returned a valid pointer, even though f->ls() shows that there is no “hpx;9” in the file (the last one is “hpx;8”). I think you will get further valid pointers if you try f->FindObjectAny("hpx;10") or f->FindObjectAny("hpx;11") and so on.
That’s the problem. I expect that if a particular cycle does not exist in a file, I will get a null pointer. Note also that one could delete a cycle from the file: if, say, I delete “hpx;5”, then I expect a null pointer from f->FindObjectAny("hpx;5"), but valid pointers from f->FindObjectAny("hpx;4") and f->FindObjectAny("hpx;6").
Note also that you tried f->FindObjectAny("hpx;32678") twice: the first time you got a valid pointer and the second time a null pointer.

BTW, I don’t understand why you open the file in “UPDATE” mode. That’s not what I intend. I open the file in “READ” mode and I simply try to retrieve different “cycles” of a particular object (graph / histogram).

Hi Pepe,

This is an intentional functional particularity of the *Any functions. TDirectoryFile::Get, for example, behaves as you expect, while the *Any functions make an extra effort to find the requested object anywhere it might be. In this case, they must return the highest cycle less than or equal to the requested cycle.

Cheers,
Philippe.
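The difference between the two lookup behaviours described above can be sketched with a small standalone model. This is only a toy illustration, not ROOT code: the type `Cycles` and the functions `FindExact` and `FindAtMost` are hypothetical names, and the set simply stands for the cycles of one object name that exist in a file.

```cpp
#include <cassert>
#include <iterator>
#include <optional>
#include <set>

// Toy model (assumption): the cycles of a single name present in the file.
using Cycles = std::set<short>;

// Exact lookup: succeeds only if the requested cycle really exists.
std::optional<short> FindExact(const Cycles &cycles, short requested) {
    if (cycles.count(requested) != 0)
        return requested;
    return std::nullopt;
}

// "*Any"-style lookup as described in the thread: the highest existing
// cycle that is less than or equal to the requested one.
std::optional<short> FindAtMost(const Cycles &cycles, short requested) {
    auto it = cycles.upper_bound(requested);  // first cycle > requested
    if (it == cycles.begin())
        return std::nullopt;                  // every existing cycle is larger
    return *std::prev(it);
}
```

In this model, with cycles 1 to 8 present, a request for cycle 9 resolves to cycle 8 under `FindAtMost` but finds nothing under `FindExact`, which mirrors the “hpx;9” observation in the session above.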

I think I misunderstood old statements
I thought I needed to use “FindObjectAny” because …
I remember that if I try to retrieve objects with “GetObject”, then always only the very recently retrieved “cycle” stays in memory (any previously retrieved “cycle” is automatically deleted from memory).

Is there a guarantee that “Get” will always leave in memory all “cycles” that I retrieve?

Another question is … “Get” will actually load the object into memory. Is there any easy way to “test” that a specific “cycle” exists in the file without retrieving the object itself? Unfortunately, TDirectoryFile::FindKeyAny behaves like TDirectoryFile::FindObjectAny when trying to “find cycles”.

Hi Pepe,

TDirectoryFile::GetKey and FindKey also (somewhat surprisingly) behave like *Any. However, you can still do what you want with:

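TKey *key = file->GetKey(name, cycle); // Or FindKey(namewithcycle);
// The returned key may carry a lower cycle than requested, and it is null
// if nothing is found, so accept it only on an exact cycle match.
if (key && key->GetCycle() == cycle) {
    return key;
} else {
    return nullptr;
}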

I remember that if I try to retrieve objects with “GetObject”, then always only the very recently retrieved “cycle” stays in memory (any previously retrieved “cycle” is automatically deleted from memory).

Indeed, GetObject keeps only one cycle in memory … if and only if it manages the object lifetime (so this affects only TTree and TH* (and a few more), and those explicitly delegated by the user).

So to keep multiple cycles in memory and use GetObject you can do:

TObject *v1 = nullptr;
TObject *v2 = nullptr;
directory->GetObject(namecycle1, v1);
if (v1) directory->Remove(v1);
directory->GetObject(namecycle2, v2);
if (v2) directory->Remove(v2);

Cheers,
Philippe.

So, I now tried “Get” with some TH* objects and it unfortunately behaves exactly like “GetObject” (i.e. as I feared, it deletes any “memory resident” object when retrieving a new one from a file).

Well, current spree deals with some TGraph objects, so “GetObject” and “Get” are still fine, though.

So, I now tried “Get” with some TH* objects and it unfortunately behaves exactly like “GetObject”

Yes, this is the expectation (and the same ‘remove-ownership-from-TDirectory’ pattern would work).

Cheers,
Philippe.

Many thanks for all your help.

In the end, I think that, in more complex cases, one needs to go back to scanning of the list of keys of a file and utilize “TKey::ReadObj” (or “TKey::ReadObjAny”).

Hi,

if an object (with the desired “name”) exists before the file is opened, one would probably need to deal with it first,

By definition, it would not be in the just-opened file yet …

This makes me realize that I am not quite sure what your use case is (usually going back to old cycles is used only for debugging purposes).

Cheers,
Philippe.

Although it is not recommended, I saw many cases in which people store, in a single file, many histograms (or graphs, or even some “event” objects) which have the same “name”, but they differ in the “cycle” number (I did it myself, too). Each histogram (or graph) keeps a “sub-sample” of data. In the end, you may need to analyze/compare different “sub-samples”, or you may want to add them all (to get the “total result”).
One may argue that, in such cases, one should create a TTree with a branch that keeps these histograms (or graphs), but almost nobody does that.
It is easier to write histograms (or graphs) into a file multiple times without changing their “names”. Reading them back, however, can be a bit tricky, as one can see in this thread …

Two small source code examples which utilize “GetObject” and/or “Get”:

In more complex cases, looping through the list of keys of a file (plus “TKey::ReadObj” or “TKey::ReadObjAny”) should always be safe (see also another example: post 5 in “On using a TList to store TH1 / TGraph objects?”).
