
Strange problem with schema evolution, streamer info and class versions

Hello ROOT Forum,

I’m having trouble handling backwards compatibility for some of my classes after a recent evolution in their inheritance. I used to have two classes, Event and OtherEvent, with an inheritance like this (class version numbers in brackets):

Base(1) < Event(4) < OtherEvent(3)

I have a lot of data written with these class versions. Now I want to change the structure using a new templated base class with the following inheritance:

Base(1) < BaseEvent(1) < ETempl<T>(1)

and the inheritance of the two previous classes becomes (with new class versions):

ETempl<Nuc> < Event(5)
ETempl<OtherNuc> < OtherEvent(4)

When I try to read some old data using the new classes, I expect to need to manually tweak the Streamers if the automatic schema evolution fails (which is the case with my real world problem for OtherEvent, although not for Event). However, this relies on being able to read the class versions in a file and act accordingly, and this is not happening. The data is stored as objects of these classes in an unsplit branch of a TTree. If I read an old object which was simply written to a TFile (not in a TTree), the custom streamers in my new classes correctly recognize the class versions:

auto event = f.Get<Event>("event");
Info in <Event::Streamer>: On-file version = 4
auto oevent = f.Get<OtherEvent>("other_event");
Info in <OtherEvent::Streamer>: On-file version = 3

But not when reading the objects from the TTree written in the same file:

auto tree = f.Get<TTree>("tree");
tree->SetBranchAddress("events",&event);
tree->SetBranchAddress("other_events",&oevent);
tree->GetEntry(0);

Info in <Event::Streamer>: On-file version = 1
Error in <TClass::RegisterStreamerInfo>: Register StreamerInfo for Event on non-empty slot (5).
Error in <TBufferFile::CheckByteCount>: object of class ETempl<Nuc> read too few bytes: 6 instead of 20
Info in <OtherEvent::Streamer>: On-file version = 4
Error in <TBufferFile::CheckByteCount>: object of class EBase read too few bytes: 6 instead of 20

I have attached the simplest possible example to demonstrate. Instructions are included.
I would be grateful for any help!

Cheers
John

example.tgz (1.8 KB)



ROOT Version: 6.23/01 (master)
Platform: Ubuntu 20.10
Compiler: g++10.2.0


Hello @j.d.frankland,

I am inviting @jblomer to this thread, as he may know the answer to your previous question.

Cheers,
J.

I think @pcanal is the right person to help here.

If @pcanal can help, that would be great!

Just to note that I am looking at it …


The problem in the example provided is that (inadvertently) switching from the default streamer to a custom streamer does not work. Namely, in the default streamer case the version number is not stored inside the basket, whereas the custom streamer expects to find it there.
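To make this kind of mismatch concrete, here is a toy model in plain C++. It is not ROOT code: ToyBuffer, ToyEvent, ReadResult and the two-byte version header are all invented for illustration and do not reflect ROOT's actual on-disk format. It only shows what happens when a reader that expects a version header is handed bytes written without one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy byte buffer (hypothetical; NOT TBufferFile and not ROOT's format).
struct ToyBuffer {
    std::vector<std::uint8_t> bytes;
    std::size_t pos = 0;

    std::size_t remaining() const { return bytes.size() - pos; }

    void writeU16(std::uint16_t v) {
        bytes.push_back(static_cast<std::uint8_t>(v >> 8));
        bytes.push_back(static_cast<std::uint8_t>(v & 0xFF));
    }

    std::uint16_t readU16() {
        assert(remaining() >= 2);  // caller must check remaining() first
        std::uint16_t v =
            static_cast<std::uint16_t>((bytes[pos] << 8) | bytes[pos + 1]);
        pos += 2;
        return v;
    }
};

// Hypothetical payload with two 16-bit members.
struct ToyEvent {
    std::uint16_t nParticles = 0;
    std::uint16_t runNumber = 0;
};

// Writer A: payload only; the version is assumed to live elsewhere.
void writeWithoutVersion(ToyBuffer& b, const ToyEvent& e) {
    b.writeU16(e.nParticles);
    b.writeU16(e.runNumber);
}

// Writer B: version header followed by the payload.
void writeWithVersion(ToyBuffer& b, const ToyEvent& e, std::uint16_t version) {
    b.writeU16(version);
    b.writeU16(e.nParticles);
    b.writeU16(e.runNumber);
}

struct ReadResult {
    std::uint16_t version = 0;  // what the reader believes the version is
    bool complete = false;      // true only if header and payload were all read
};

// Reader that always expects "version header + payload" (6 bytes).
ReadResult readExpectingVersion(ToyBuffer& b, ToyEvent& e) {
    ReadResult r;
    if (b.remaining() < 6) {
        // Too few bytes: still report the first two bytes as the "version",
        // which is really the start of the payload.
        if (b.remaining() >= 2) r.version = b.readU16();
        return r;
    }
    r.version = b.readU16();
    e.nParticles = b.readU16();
    e.runNumber = b.readU16();
    r.complete = true;
    return r;
}
```

In this sketch, data produced by writeWithoutVersion and fed to readExpectingVersion yields a "version" that is really the first payload member, and the read comes up short. Data produced by writeWithVersion round-trips correctly.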

What led you to need a custom streamer in the new version? (I.e., could it be done using an I/O customization rule instead?)

Hi Philippe
The custom streamers in the new version are there in order to try to do whatever needs to be done in order to be able to read data written with the old version. Of course this begins with knowing which version you’re reading, hence my problem.

In the real world problem, the equivalent of ‘Event’ had a custom Streamer in its old version, while in the new version it has an automatic Streamer, and reading back old ‘Event’ objects from a TTree works fine.

However I have several real world equivalents of ‘OtherEvent’ which used to inherit from ‘Event’ but now (in the new version) don’t. Some of these had a custom Streamer, some had an automatic Streamer, but when I try to read old versions back with an automatic Streamer in the new ‘OtherEvent’, there are no errors, but the data is not read back. To be more precise, the old ‘Event’ class contained a TClonesArray of ‘Nuc’ objects, whereas the ‘OtherEvent’ class stored ‘OtherNuc’ objects in Event::TClonesArray. In the new version the TClonesArray is in the ‘EBase’ class from which the templated event classes derive.

I hope this additional information can give you some idea.

As for I/O customization rules, I don’t know what they are. Can you tell me where I can find the information?

Thanks a lot
John

I have noticed that replacing the automatic Streamer of one of my ‘OtherEvent’ classes with a custom Streamer which is an exact copy of the previously-generated automatic Streamer leads to errors when reading back objects of the same version previously written with the automatic Streamer:

Error in <TBufferFile::ReadClassBuffer>: Could not find the StreamerInfo for version 4 of the class KVBaseEvent, object skipped at offset 5851
Error in <TBufferFile::CheckByteCount>: object of class KVBaseEvent read too few bytes: 2 instead of 25
Error in <TBufferFile::CheckByteCount>: object of class kv::event<KVSimNucleus> read too few bytes: 31 instead of 2475

(translation compared to example classes: I am trying to read back KVSimEvent (=OtherEvent) objects, which inherit from kv::event<KVSimNucleus> (=ETempl<OtherNuc>), itself inheriting from KVBaseEvent (=BaseEvent)).
So putting a ‘+’ in the LinkDef.h totally changes the way objects are read back (from a TTree)?

PS. if I/O customization could help, could somebody please tell me where to look?

Dear @pcanal

It seems we have had a similar discussion before: Read object from TBranch using new custom streamer
There, as here, the problem was objects written in TTrees with automatic streamers with missing version information. At the time (2016), this was declared a bug and promptly fixed (in 5.34)! It seems there has been a regression? (I have checked that I have exactly the same problem with the tip of the 5.34 branch as with the master).

Meanwhile, I found a very short section on I/O customization in the Users Guide chapter on I/O, and I’m trying to use it. The good news is that setting a version="[-3]" flag works here, i.e. the rule prints its message (code="{ std::cout << "coucou" << std::endl; }") only for versions up to 3, so I should be able to handle the backwards compatibility. However, I can’t figure out how to do it.

I have objects on file like this:

class KVSimEvent : public KVEvent
{
   ClassDef(KVSimEvent, 3)
};
class KVEvent : public TObject
{
   TClonesArray* fParticles;//->
   KVNameValueList fParameters;

   ClassDef(KVEvent,4)
};

and I want to read them back into objects like this:

class KVSimEvent : public event<KVSimNucleus>
{
   ClassDef(KVSimEvent, 4)
};
template <typename Nuc>
class event : public KVBaseEvent {};
class KVBaseEvent : public TObject
{
   TClonesArray* fParticles;//->
   KVNameValueList fParameters;
};

So basically the size and layout hasn’t changed and when I see a KVSimEvent on disk with version<4 I should stream the TClonesArray and KVNameValueList objects from the buffer into KVBaseEvent::fParticles and KVBaseEvent::fParameters inside my new KVSimEvent object.

I have tried to do it like this:

#pragma readraw                              \
   sourceClass="KVSimEvent"                   \
   source=""                                  \
   version="[-3]"                             \
   targetClass="KVSimEvent"                   \
   target="TClonesArray* KVBaseEvent::fParticles; KVNameValueList KVBaseEvent::fParameters"  \
   code="\
{\
   KVBaseEvent::fParticles->Streamer(buffer);\
   KVBaseEvent::fParameters.Streamer(buffer);\
}";

but I just get warnings from rootcint/cling that

WARNING: IO rule for class KVSimEvent data member: TClonesArray* KVBaseEvent::fParticles; KVNameValueList KVBaseEvent::fParameters was specified as a target in the rule but doesn't seem to appear in target class

and no code is generated in the dictionary.

Can anybody help?

See https://root.cern.ch/root/html530/io/DataModelEvolution.html or for context: https://root.cern.ch/root/SchemaEvolution.pdf

Humm … The introduction of the new base does change things.

You might be able to just use:

#pragma read sourceClass="KVSimEvent" targetClass="KVBaseEvent";

which says that those two classes are “equivalent”.
If, when passing the address of the event<...> object, you also make sure to tell the TTree that it is actually the address of a KVBaseEvent, then it should work fine.
For example:

KVBaseEvent *ev = new event<....>(...);
....
tree->SetBranchAddress(branchname, &ev);


Good morning @pcanal
I have just done exactly as you said, but although there are no errors, the data is still not read back i.e. the TClonesArray is still empty.

I can’t understand why this is so difficult, especially as for KVEvent objects just having an automatic Streamer is sufficient for them to work. The old definition of KVEvent is given above (former base class of KVSimEvent), the new definition is

class KVEvent : public event<KVNucleus>
{
   ClassDef(KVEvent, 5)
};
template <typename Nuc>
class event : public KVBaseEvent {};
class KVBaseEvent : public TObject
{
   TClonesArray* fParticles;//->
   KVNameValueList fParameters;
};

Humm … I am confused (I still have to re-check the old report you pointed at to see if we have a regression). Are you still using a custom Streamer?

Maybe the easiest is to upload an updated version of your (failing) example?

Sorry Philippe, my last comments weren’t very clear.
I meant that the old KVEvent objects, originally written using a custom Streamer, can be read back with the new KVEvent definition which now has an automatic streamer, i.e. appears in LinkDef.h as

#pragma link C++ class KVEvent+;

But in fact I have now found a slightly different class architecture which gives me what I want (‘events’ as templated container classes for different types of ‘nuclei’) while still being able to read all data written up to now. The new layout for KVSimEvent is

class KVSimEvent : public KVTemplateEvent<KVSimNucleus>
{
   ClassDef(KVSimEvent, 5)
};
template <typename Nuc>
class KVTemplateEvent : public KVEvent {};
class KVEvent : public TObject
{
   TClonesArray* fParticles;//->
   KVNameValueList fParameters;

   ClassDef(KVEvent, 4)
};

I realised that in fact we (should) have no legacy data consisting of KVEvent objects written in TTrees, only objects which inherit from KVEvent (like KVSimEvent). Therefore KVEvent has become an abstract base class. The data is still stored in KVEvent (which therefore keeps the same custom streamer as always), while all the previous KVEvent functionality (iterators over the nuclei in the event, etc.) is now implemented in the template class.

I increased the class version of KVSimEvent by 2, because the previous version 3 objects (written by automatic Streamer, thus without/with wrong version info) are read back with apparent version 4, but any future data written with the new custom Streamer will have their version correctly written. As the version is read back wrong, R__b.ReadClassBuffer doesn’t work, even if I give it the right version number by hand. Instead, the following custom KVSimEvent::Streamer works fine:

if (R__b.IsReading()) {
   UInt_t R__s, R__c;
   UInt_t R__v = R__b.ReadVersion(&R__s, &R__c);
   if (R__v < 5) {
      TObject::Streamer(R__b);
      fParticles->Streamer(R__b);
      fParameters.Streamer(R__b);
   }
   else
      R__b.ReadClassBuffer(KVSimEvent::Class(), this, R__v, R__s, R__c);
} ...

which basically spells out explicitly how to read an old KVEvent object.

Thanks for all the help!
Cheers
John