Reflex + schema evolution

Dear ROOT,

I have a class for which I use reflex I/O, which formerly contained a member:

std::vector<TrackConstraintOnSurface*> m_constraints;

I have replaced this std::vector with a custom object:

TrackConstraintList m_constraints;

I am providing conversion from the old type to the new type through a constructor and an operator=:

TrackConstraintList(const std::vector<MuonAlign::TrackConstraintOnSurface*>&); TrackConstraintList& operator= (const std::vector<MuonAlign::TrackConstraintOnSurface*>&);

However ROOT’s automatic schema evolution is confused by that when I try to read an old file:

What is the a proper way to tell ROOT schema evolution about this change of type?

As a side question, I’ve tried to implement the manual schema evolution documented here:

http://root.cern.ch/root/html532/io/DataModelEvolution.html

However, this needs me to set the version of my class in Reflex. Could you tell me how to do that? (i.e. how do I ClassDef with Reflex?)

Many thanks,

Pierre-François

[quote]I am providing conversion from the old type to the new type through a constructor and an operator=:[/quote]Currently ROOT I/O does not currently use this information to decide whether a something can be converted from one type to the other (and to some extent in the general case it is also the wrong granularity to provide the necessary information for an efficient conversion).

[quote] (i.e. how do I ClassDef with Reflex?)[/quote]The same way as for rootcint … by adding the macro “ClassDef(classname,version_number);” at the end of the class declaration.

In selection.xml you can also (instead of adding a ClassDef) specify a class version in the class tag:<class name="WithClassVersion" ClassVersion="42" />or in LinkDef file:#pragma link C++ options=version(3) class WithClassVersion+;

[quote]However, this needs me to set the version of my class[/quote]It is strongly recommended by not necessary. You can also use the class layout checksum instead of version number. To get the checksum of a class layout, open a file containing an object of that class in base root and call class->GetCheckSum():root [0] f = TFile::Open("Event.root") Warning in <TClass::TClass>: no dictionary for class Event is available Warning in <TClass::TClass>: no dictionary for class EventHeader is available Warning in <TClass::TClass>: no dictionary for class Track is available (class TFile*)0x2696e50 root [1] TClass::GetClass("Track")->GetCheckSum() (const unsigned int)11159

Something like:#pragma read sourceClass="TopClass" targetClass="TopClass" source="std::vector<TrackConstraintOnSurface*> m_constraints" version="[8]" target="m_constraints" \ code="{ m_constraints = onfile.m_constraints ; }"

Cheers,
Philippe.

Dear Philippe,

I’ve been trying to do what you describe, and here is what happens:

  • If I do direct I/O of my top-level object via TFile::WriteObject and TFile::ReadObject, the schema evolution works very well!

  • however if I try to do the same with objects written in a tree, then it does not work at all: I’ve observed two different behaviours depending on what I put in my top level object: either the schema evolution code in not called, or it is called, but there is a segmentation fault later on…

Trying to understand what is going on, I have put some verbose at the end of the method TStreamerInfo::BuildOld:

cout << "BuildOld verbose " << GetName() << " " << fClassVersion << endl; fElements->Print();

and among the output I get, I see the generated schema evolution:

BuildOld verbose MuonAlign::Track 1 Collection name='TObjArray', class='TObjArray', size=32 OBJ: TStreamerArtificial @@alloc OBJ: TStreamerObjectAnyPointer m_parameters OBJ: TStreamerObjectAnyPointer m_backTrackedParameters OBJ: TStreamerObjectAnyPointer m_innerDetParameters OBJ: TStreamerObjectAnyPointer m_innerDetExtrapolatedParameters OBJ: TStreamerSTL m_constraints OBJ: TStreamerSTL m_backTrackingConstraints OBJ: TStreamerSTL m_alignmentConstants OBJ: TStreamerBasicType m_fitStatus OBJ: TStreamerBasicType m_nstep OBJ: TStreamerBasicType m_chi2 OBJ: TStreamerBasicType m_ndf OBJ: TStreamerBasicType m_nOutliers OBJ: TStreamerBasicType m_nHoles OBJ: TStreamerBasicType m_eventNumber OBJ: TStreamerBasicType m_runNumber OBJ: TStreamerSTL m_segments (MuonAlign::TrackSegment) OBJ: TStreamerArtificial m_constraints OBJ: TStreamerArtificial @@dealloc

I see that to read an old abject, you have a special streamer for allocation at the beginning, and at the end, you have a TStreamerArtificial for m_constraints, which is the thing holding a function pointer to my code snippet, and then a @@dealloc.

When I read an object directly on file, adding verbose in TBufferFile, I see this sequence of operations is happening.

However, when reading an object in a tree, this sequence is modified in TBranchElement. More specifically the following line of code assumes a wrong structure for the above generated sequence of streamers:

http://root.cern.ch/root/html532/src/TBranchElement.cxx.html#1974

This loop means that it is expecting a TStreamerArtificial to follow immediately the TStreamerSTL m_constraints. As you can see above, this is not the case. And it also looks like the alloc/dealloc are ignored. Putting some verbose, I confirm that the vector fIDs does not contain the index of the TStreamerArtificial containing the schema evolution code.

So it seems that TBranchElement should be adapted for this special mechanism of schema evolution. Do you agree with my analysis of the problem? Do you know a way to solve this?

Many thanks,

Pierre-François

Hello,

I’ve also made a simple example program that demonstrates the feature. See attachment.

Go to directory v1, run the script run.sh. This will produce a file tmp.root, containing data written with version 1 of the test schema.

Go to directory v2, run the script run.sh. This will attempt to read the file tmp.root, attempting to convert to version 2 of the test schema. You should see the output:

[quote]12345
Direct IO: track has nHits = 1
Tree IO: track has nHits = 0
[/quote]

The last line should read nHits = 1 if the schema evolution had worked properly.

Cheers,

Pierre-François
testio.tgz (2.37 KB)

[quote]As you can see above, this is not the case. And it also looks like the alloc/dealloc are ignored. [/quote]They are intentionally ignored as they need to be executed only once per branch (i.e at the same time as the allocation of the data object itself) rather than for every entries …

Also it seems that it is ‘almost’ working, when I set gDebug = 7 around the GetEntry I see:

StreamerInfoAction, class:Track, name=m_hits, fType[1]=300, TStreamerSTL, bufpos=84, arr=0x2b33800, eoffset=0, Redirect=0x2b340e0 ReadBuffer, class:MdtHit, name=Hit, fType[0]=0, TStreamerBase, bufpos=119, arr=0x2ad6810, offset=0 ReadBuffer for class: Hit has read 6 bytes StreamerInfoAction, class:MdtHit, name=m_value, fType[1]=8, TStreamerBasicType, bufpos=129, arr=0x2ad6810, offset=8 ReadBuffer for class: MdtHit has read 24 bytes StreamerInfoAction, class:Track, name=m_a, fType[2]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=24 StreamerInfoAction, class:Track, name=m_b, fType[3]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=32 StreamerInfoAction, class:Track, name=m_c, fType[4]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=40 StreamerInfoAction, class:Track, name=m_d, fType[5]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=48 StreamerInfoAction, class:Track, name=m_e, fType[6]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=56 StreamerInfoAction, class:Track, name=m_f, fType[7]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=64 StreamerInfoAction, class:Track, name=m_g, fType[8]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=72 StreamerInfoAction, class:Track, name=m_h, fType[9]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=80 StreamerInfoAction, class:Track, name=m_i, fType[10]=8, TStreamerBasicType, bufpos=81, arr=0x2b33800, offset=88

And so what seems to be missing is (indeed as you pointed out) the running of the rule … I am investigating.

Cheers,
Philippe.

[quote]This loop means that it is expecting a TStreamerArtificial to follow immediately the TStreamerSTL m_constraints. As you can see above, this is not the case.[/quote]Yes, this is exactly the problem … more precisely the rule was supposed to be executed as part of the last data member … but in your case it happened to be a split collection and thus even though the rule is properly schedule as part of the node for the collection itself no rule is executed there (only on its sub-branches) …

Cheers,
Philippe.