Schema evolution for changed base class

tobi_s · December 18, 2014, 12:21pm

Hi,

I’ve been trying to figure out how to write schema evolution rules for the case when a base class changes. I have a somewhat convoluted situation because of some bad design in code that I’m working with. The core of my problem is, I think, this:

class A : public B {
  ClassDef(A, 1);
};

is changed to

class A : public C {
  ClassDef(A, 2);
};

and I want to have a schema evolution rule that reads version 1 and translates the contents of the base class B to the corresponding stuff in C, thus giving me an object of class A, version 2. I haven’t managed to figure out schema evolution rules for this. In principle, I could write a rule that evolves B to C, but B may have other users, so that is out of the question. The data members that I want to change are actually protected in B, so I could read them from A, but that doesn’t help: mentioning them as source in the schema evolution rule does not work (no code is generated), and from my understanding of the generated code in other cases it could not work anyway.

Help appreciated! As far as I understand how all this works, it would probably be sufficient if there were a name by which I could access B while A is read.

If you are interested: This situation comes about because both B and C inherit from a common base class, which defines a public interface. B contains stuff not actually suitable for A and thus disk space is wasted, which I want to fix.

ps I’m aware of the well-hidden, but very useful documentation in iopscience.iop.org/1742-6596/219 … 032004.pdf

pps on second thought, I actually couldn’t write a schema evolution rule that evolves B into C because A, B, and C live in different shared libraries and the dictionaries are generated in totally different places.

pcanal · January 5, 2015, 6:25pm

[quote]In principle, I could write a rule that evolves B to C, but B may have other users, so that is out of the question. [/quote]Why not? The rule would only apply/be used when attempting to read a B on file object into a C object.

[quote] I actually couldn’t write a schema evolution rule that evolves B into C because A, B, and C live in different shared libraries and the dictionaries are generated in totally different places.[/quote]Especially if we assume that you need this translation only for A, it is perfectly fine to place the rule alongside the dictionary for A. Note that the I/O rule only needs the compiled version of C, so you could technically also place it alongside the dictionary for C.
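A rough sketch of what such a rule could look like in the LinkDef file that sits next to the dictionary for A. Only the class names A, B and C come from this thread; the file name, the `#pragma link` line and the version range `[1-]` are assumptions made for illustration, and the rule is written on the assumption that the members of B and C can be matched by name.

```cpp
// A_LinkDef.h (hypothetical file name): dictionary input for the library holding A.
// The guard keeps an ordinary compiler from ever seeing the pragmas.
#if defined(__CINT__) || defined(__CLING__)

#pragma link C++ class A+;

// Assumed rule: an on-file B (class version 1 or later) may be read into an
// in-memory C. No source/target/code is given, so no custom conversion code
// is attached to the rule.
#pragma read sourceClass="B" targetClass="C" version="[1-]"

#endif
```

The same `#pragma read` line could equally live in the LinkDef for C, since the rule only refers to the in-memory class C.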

Cheers,
Philippe.

tobi_s · January 9, 2015, 12:41pm

[quote=“pcanal”][quote]In principle, I could write a rule that evolves B to C, but B may have other users, so that is out of the question. [/quote]Why not? The rule would only apply/be used when attempting to read a B on file object into a C object.[/quote]That’s what I thought. But how can I know that I’m reading a B that is stored as part of a C instead of an independent B? Thinking about the second paragraph of your reply, I think I see now how I could deal with this problem using a custom streamer (where I could control the reading of the base class). But I don’t see how I could do this without introducing these maintenance hazards.

[quote][quote] I actually couldn’t write a schema evolution rule that evolves B into C because A, B, and C live in different shared libraries and the dictionaries are generated in totally different places.[/quote]Especially if we assume that you need this translation only for A, it is perfectly fine to place the rule alongside the dictionary for A. Note that the I/O rule only needs the compiled version of C, so you could technically also place it alongside the dictionary for C.[/quote]Hm, I think I’m missing something. When building an object of a derived class, the base class is constructed first, and the schema evolution also appears to be applied first. The generated code for schema evolution uses some magic to access the object’s members (basically, black magic concerning the memory layout) and from reading the generated code it doesn’t appear to allow me to access anything in the base class. But I may well be dense.

ps sorry about the delay, I didn’t get a notification e-mail. I will check my settings.

pcanal · January 13, 2015, 6:22pm

Hi,

[quote]
But how can I know that I’m reading a B that is stored as part of a C instead of an independent B? Thinking about the second paragraph of your reply, [/quote]Well, you may not need to know. If the content of A and B is strictly the same (or has changes covered by the automatic schema evolution), what would happen (once you set a rule allowing the reading of an A into a B) is that the I/O engine would do something akin to:

time to read the base class of C
I see that I need to read a B but I have an A on disk
alright, this is fine, there is a rule saying A and B are equivalent
let me proceed as usual (i.e. as if the data on disk was part of A).
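The decision sequence above can be sketched as a small stand-alone toy model. This is not ROOT's implementation: `ToyRuleTable`, `ReadPlan` and all method names are hypothetical, and the only thing modelled is the lookup "same class, or is there a rule that allows this pair?".

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Possible outcomes when the engine has to read one (base-)class sub-object.
enum class ReadPlan { Direct, ViaEquivalenceRule, Unreadable };

// Toy stand-in for a table of "on-file class X may be read into class Y" rules.
class ToyRuleTable {
public:
    // Register a rule: data written as `from` may be read into an in-memory `to`.
    void allow(const std::string& from, const std::string& to) {
        assert(from != to);  // identical classes never need a rule
        rules_.emplace(from, to);
    }

    // Decide how to read data written as `onFile` into an object of `inMemory`.
    ReadPlan plan(const std::string& onFile, const std::string& inMemory) const {
        if (onFile == inMemory) return ReadPlan::Direct;
        if (rules_.count(std::make_pair(onFile, inMemory)) > 0) {
            return ReadPlan::ViaEquivalenceRule;  // "proceed as usual"
        }
        return ReadPlan::Unreadable;
    }

private:
    std::set<std::pair<std::string, std::string>> rules_;
};
```

Note that a rule is directional in this toy: allowing A to be read into B says nothing about reading a B into an A.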

If there is a need for more complex schema evolution, then instead of ‘proceed as usual’, the I/O engine would create a conversion StreamerInfo (a set of I/O operations) to read the on-disk A data into a B. For example, you could have a rule ‘renaming’ A::fVal into B::fOtherName, and this rule would be used to create an I/O operation used only in the conversion StreamerInfo.
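A minimal sketch of such a ‘renaming’ rule as it might appear in a LinkDef file. The member names come from the paragraph above; their type (`int`) and the version range `[1-]` are assumptions, since the thread gives neither.

```cpp
// LinkDef fragment; the guard keeps an ordinary compiler from seeing the pragma.
#if defined(__CINT__) || defined(__CLING__)

// Assumed: A::fVal and B::fOtherName are both plain ints.
// `onfile` gives the rule's code access to the members listed in `source`.
#pragma read sourceClass="A" targetClass="B" version="[1-]" \
    source="int fVal" target="fOtherName" \
    code="{ fOtherName = onfile.fVal; }"

#endif
```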

Similarly, you can provide a custom (free-standing) Streamer function that actually takes 3 parameters: the address of the object in memory, the TBuffer to read the data from, and the TClass corresponding to the data type used when writing (so in your case it would be called with the address of a B, the TBuffer and the TClass for A).
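To illustrate the shape of such a function without depending on ROOT, here is a toy version. `ToyBuffer` and `ToyClass` are hypothetical stand-ins for TBuffer and TClass, the parameter order simply follows the description above, and the on-disk layouts (A wrote two ints, B wrote one) are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for TBuffer: a sequence of ints consumed front to back.
struct ToyBuffer {
    std::vector<int> data;
    std::size_t pos = 0;
    int readInt() {
        assert(pos < data.size());  // reading past the end is a bug in the caller
        return data[pos++];
    }
};

// Stand-in for TClass: only the class name is modelled.
struct ToyClass {
    std::string name;
};

// In-memory target class of the conversion.
struct B {
    int fOtherName = 0;
};

// Free-standing streamer taking the three pieces of information: the object
// address, the buffer, and the class that was used when the data was written.
void StreamB(void* obj, ToyBuffer& buf, const ToyClass* onFileClass) {
    B* b = static_cast<B*>(obj);
    if (onFileClass != nullptr && onFileClass->name == "A") {
        buf.readInt();                  // skip A::fPadding (hypothetical member)
        b->fOtherName = buf.readInt();  // A::fVal lands in B::fOtherName
    } else {
        b->fOtherName = buf.readInt();  // data was written as a B
    }
}
```

The point of the third parameter is visible in the branch: the same in-memory B can be filled differently depending on what was actually written to disk.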

[quote] The generated code for schema evolution uses some magic to access the objects members (basically, black magic concerning the memory layout) and from reading the generated code it doesn’t appear like it allows me to access anything in the base class.[/quote]I am not quite sure how it relates to your use case. Still, you are correct: the base class is populated first. In the rules for the derived class you would have access to the in-memory object, which has first been initialized by the default (or I/O) constructor and then had its base class populated with the data from the disk, which you can access using the ‘normal’ API of the object.
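The ordering described here (construct, fill the base part, then run the derived-class rule) can be shown with a toy that does not use ROOT at all. `Base`, `Derived`, `readFrom`, `applyRule` and `readDerived` are hypothetical names; the rule itself (doubling the base value) is arbitrary.

```cpp
#include <cassert>

class Base {
public:
    int value() const { return fValue; }             // the 'normal' public API
    void readFrom(int onDisk) { fValue = onDisk; }   // stand-in for base-class I/O
private:
    int fValue = 0;
};

class Derived : public Base {
public:
    int doubled() const { return fDoubled; }
    // Stand-in for a derived-class rule: it sees the base part only through
    // Base's public interface, not through its private members.
    void applyRule() { fDoubled = 2 * value(); }
private:
    int fDoubled = -1;
};

// Mimics the order of operations when reading one Derived object.
Derived readDerived(int onDiskBaseValue) {
    Derived d;                             // 1. default (or I/O) constructor
    d.readFrom(onDiskBaseValue);           // 2. base class populated from disk
    assert(d.value() == onDiskBaseValue);  //    base is filled before the rule runs
    d.applyRule();                         // 3. derived-class rule
    return d;
}
```

For instance, `readDerived(21)` yields an object whose `value()` is 21 and whose `doubled()` is 42, because the rule ran after the base part was filled.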

To clarify further we ought to get a more concrete example of the content of A and B.

Cheers,
Philippe.