TFile gets same pointer for a different key version

The documentation says that TFile stores several versions of an object if it is changed.

I wrote several trees into one file, but when I try to get them, only the second one is returned. How should I do that? Or is key versioning obsolete and discouraged in future ROOT versions?

This is from a python session:

>>> fil.ls()
TFile** file.root
TFile* file.root
OBJ: TTree tree another tree : 0 at: 0x562333e6c970
KEY: TTree tree;2 another tree
KEY: TTree tree;1 tree
KEY: TTree привет world;1 hello
>>> fil.Get("tree;2")
<cppyy.gbl.TTree object at 0x562333e6c970>
>>> fil.Get("tree;3")
<cppyy.gbl.TObject object at 0x(nil)>
>>> fil.Get("tree;1")
<cppyy.gbl.TTree object at 0x562333e6c970>
>>> fil.Get("tree")
<cppyy.gbl.TTree object at 0x562333e6c970>

(All non-null pointers are the same.) The ROOT interpreter behaves the same way:

root [2] _file0->ls()
TFile** file.root
TFile* file.root
KEY: TTree tree;2 another tree
KEY: TTree tree;1 tree
KEY: TTree привет world;1 hello
root [3] _file0->Get("tree;2")
(TObject *) 0x55596dc85120
root [4] _file0->Get("tree;1")
(TObject *) 0x55596dc85120

ROOT Version: 6.22/06
Platform: Arch Linux
Compiler: linuxx8664gcc


Hi @ynikitenko, could you provide a minimal working example that illustrates how you are generating the TFile?

Cheers,
J.

You seem to be misusing the concept of the “namecycle”.
When you create different trees, they should have different “names” (the “cycle” numbers are handled by the ROOT I/O automatically).

For completeness, I will invite @pcanal to this thread. I am sure he can provide more information on this.

Cheers.

Since the cycles are backup/fallback copies of older versions of the object, TDirectoryFile::Get, when asked to load a different cycle of an item it retains shared ownership of (i.e. TTree and TH1*), will first delete the previous one (because internally it has a map of name to address and supports only one item per name). In your use case, the new object just happens to be re-allocated at the same location (you should be able to see that by looking at the result of calling Print() on the objects). If you need both objects loaded at the same time, you need to use the TKey interfaces (GetKey) directly.

How can they be loaded at the same time?

>>> fil.GetKey("tree;2")
<cppyy.gbl.TKey object at 0x(nil)>
>>> fil.Get("tree;2")
<cppyy.gbl.TTree object at 0x55b418f568f0>

As I understand, to get that directory I call

>>> fil.GetKey("tree")
<cppyy.gbl.TKey object at 0x55b419046e90>

How can I get the needed version from there or even list contents of that?

I also note that file.GetListOfKeys outputs existing keys several times. Should I eliminate duplicate keys manually, or is there another method for that?

Thanks for all the replies! Sorry, my forum notifications went to my spam folder.

@jalopezg - I do nothing special. I create a tree and use File.Write or something like that. I filled it manually. Not sure what could be important here. I can reproduce that if really needed.

@ynikitenko Unless you have a reason you have not mentioned yet, just change the name of the second TTree when creating it. This will solve your problem and simplify everything (dealing with TKeys is straightforward but an unnecessary complication in your use case).

Unfortunately this is not the case. I’m writing a class in Python to read ROOT files (actually it’s already published, just for reference).

I want users of the class to be able to read all objects from a TFile: either only the most recent versions, or arbitrary specified versions. I should not make assumptions about the contents of their files, and I have to make sure that my class is always reliable (or at least warns users about special cases).

Fair enough. In this use case (writing a framework) it does then make sense.

You have two choices:

(a) Remove the created object from the TFile’s list (you then own it)

root_directory->GetList()->Remove(obj)

(note that the code you pointed at does not yet support sub-directories)

(b) Deal with the key directly.

# Instead of recording the key name and calling Get, do:
obj = key.ReadObj()
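
Option (b) might look roughly like the following in Python. This is a minimal sketch, not the code from the linked class: the function name read_all_cycles is hypothetical, and it assumes that iterating over GetListOfKeys() in PyROOT yields TKey objects exposing GetName(), GetCycle() and ReadObj().

```python
def read_all_cycles(directory):
    """Read every key of `directory`, keeping all cycles side by side.

    Hypothetical helper. `directory` is assumed to behave like a ROOT
    TDirectoryFile, i.e. to provide GetListOfKeys().
    Returns a dict mapping (name, cycle) to the object read from that key.
    """
    objects = {}
    for key in directory.GetListOfKeys():
        ident = (key.GetName(), key.GetCycle())
        if ident in objects:
            # Same name and cycle already read: skip the repeated entry.
            continue
        # Reading through the key (rather than Get with a name) means the
        # cycle of the returned object is known: it is key.GetCycle().
        objects[ident] = key.ReadObj()
    return objects
```

With an open file fil, read_all_cycles(fil).get(("tree", 1)) would then be the object stored under tree;1, or None if no such key exists.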

a) l = fil.GetList(); l.Remove(obj) - this works.
b) key.ReadObj() also works.
However, in both these cases I don’t know which object version I’m using.
I think I’ve found a more reliable way to get the needed object version:

TKey* TDirectoryFile::GetKey(const char* name, short cycle = 9999)

accepts the cycle as its argument!

I managed to create a key with the name “tree;3” in my file. Now it looks like this:

>>> fil.ls()
TFile** file.root
TFile* file.root
OBJ: TTree tree;3 : 0 at: 0x55d9b7a6f960
KEY: TTree tree;2 another tree
KEY: TTree tree;1 tree
KEY: TTree привет world;1 hello
KEY: TTree tree;3;1

When I use GetKey, it gives the key with name “tree;3”

>>> fil.GetKey("tree;3")
<cppyy.gbl.TKey object at 0x55d9b7e41450>
>>> fil.GetKey("tree;3", 1)
<cppyy.gbl.TKey object at 0x55d9b7e41450>
>>> fil.GetKey("tree", 3)
<cppyy.gbl.TKey object at 0x55d9b7af99e0>

(The last call returns the same key as fil.GetKey("tree", 2), i.e. <cppyy.gbl.TKey object at 0x55d9b7af99e0>, because "tree" with cycle 3 has not been written to the file yet.)
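
Because GetKey(name, cycle) can hand back an older cycle than the one requested when that cycle does not exist, a wrapper could compare the cycle of the returned key with the requested one. A minimal sketch: get_key_exact is a hypothetical helper name, and it relies on a null TKey pointer evaluating as false in PyROOT, which is an assumption worth checking on your ROOT version.

```python
def get_key_exact(directory, name, cycle):
    """Return the key `name` with exactly this `cycle`, or None.

    Hypothetical helper. `directory` is assumed to behave like a ROOT
    TDirectoryFile, i.e. to provide GetKey(name, cycle).
    """
    key = directory.GetKey(name, cycle)
    if not key:
        # No key with this name (null pointer, assumed falsy in PyROOT).
        return None
    if key.GetCycle() != cycle:
        # GetKey fell back to another cycle; treat that as "not found".
        return None
    return key
```

The object itself can then be read with key.ReadObj(), so the cycle that was actually read is never in doubt.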

So I think that syntax like “tree;3” is ambiguous at best and unreliable at worst. I think that the default ROOT behaviour of returning the most recent object version is the sanest.
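
One way a reader class could sidestep this ambiguity is to pass the name and the cycle separately, and to offer string parsing only as a convenience. The sketch below is a hypothetical helper (split_namecycle is not a ROOT function and does not claim to reproduce ROOT's own parsing rules): it treats only a trailing ";<digits>" as a cycle, so a key literally named "tree;3" can still be addressed as "tree;3;1".

```python
def split_namecycle(namecycle):
    """Split 'name;N' into (name, N); return (namecycle, None) otherwise.

    Hypothetical convention, not ROOT's: only a trailing ';<digits>'
    counts as a cycle, so names containing ';' remain usable.
    """
    name, sep, suffix = namecycle.rpartition(";")
    if sep and name and suffix.isdecimal():
        return name, int(suffix)
    # No separator, empty name, or a non-numeric suffix: no cycle given.
    return namecycle, None
```

For example, split_namecycle("tree;2") gives ("tree", 2), and split_namecycle("tree") gives ("tree", None), which a caller could interpret as "use the newest cycle".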

I also must add that when I used file.Get(“tree;1”) and file.Get(“tree;2”) today, I got really different results! I got very confused. It seems that this method gives some flaky behaviour.

Thanks for all your help!

And yes, it seems that I will have to check for duplicates in GetListOfKeys manually.

BTW, search the ROOT documentation for “cycle” and/or “namecycle”.


With the code I saw in the link you provided, in both cases you do know which cycle it is … i.e. the cycle you just requested. In particular for (b), it is key->GetCycle().

I am confused about why you would call GetKey, since you are already iterating over the keys.

Also, one important note: in the list of keys, the keys are ordered from newest to oldest, so the highest cycle is guaranteed to be encountered first.

I also must add that when I used file.Get(“tree;1”) and file.Get(“tree;2”) today, I got really different results! I got very confused.

I assume this is because of the (likely unfortunate) feature of returning an already-read object, if any.

In my view, direct cycle manipulation like that is intended to be rare (for example, for recovery of a partially broken/corrupted ROOT file).

On the other hand, a generic framework should either just “ignore” cycles (and only use the newest cycle) or deal directly with the list of keys (and then decide how to deal with/consider older cycles).
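
The "newest cycle only" strategy could be sketched as below. This is an illustration under assumptions, not an official recipe: newest_keys is a hypothetical helper name, and the code compares GetCycle() values explicitly instead of relying on the newest-first ordering of the key list, so it also copes with names that appear several times.

```python
def newest_keys(directory):
    """Map each key name to its key with the highest cycle.

    Hypothetical helper. `directory` is assumed to behave like a ROOT
    TDirectoryFile, i.e. to provide GetListOfKeys().
    """
    newest = {}
    for key in directory.GetListOfKeys():
        name = key.GetName()
        current = newest.get(name)
        # Keep the key only if it is the first one seen for this name
        # or if its cycle is higher than the one stored so far.
        if current is None or key.GetCycle() > current.GetCycle():
            newest[name] = key
    return newest
```

Calling ReadObj() on each value then reads exactly one object per name, and older cycles are left untouched unless the caller asks for them.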

Thank you! I think in this case I’ll allow reading all versions via a non-default option. If that is really used, it should not be prohibited (though it is discouraged).

Yes, I read the first of your links (there was little information on cycles though, nothing I didn’t know when I wrote this post). The other links point to User’s Guide, which is said to be outdated (but anyway I read that a long time ago, and probably will read that more).

@pcanal I think that I use GetKey because it returns an object of the correct type (I don’t have to change its type in Python code, and I don’t know how to do that).

Thanks for the info about cycle numbering!
