Order of array index and array in class/struct for Reflex

jwimberl · April 6, 2017, 6:05pm

As noted in the comments in TStreamerInfo, the length of an array to be reflected should be annotated thusly:

            //
            // look for a pointer data member with a counter
            // in the comment string, like so:
            //
            //      int n;
            //      double* MyArray; //[n]
            //

Is the reverse order allowed, e.g.

            //
            // look for a pointer data member with a counter
            // in the comment string, like so:
            //
            //      double* MyArray; //[n]
            //      int n;
            //

For reasons that have to do with the order of bits in memory, I must do the latter, but I’m not sure if this is responsible for some of the errors I get, which look like

Error in <TStreamerInfo::Build>: epm_gsl_block, discarding: double* data, illegal [reflex_size]

where epm_gsl_block is the struct containing these entries, data corresponds to MyArray, and reflex_size corresponds to n.

pcanal · April 6, 2017, 7:14pm

Is the reverse order allowed, e.g.

Not at the moment. This is a side effect of storing the data members in the strict order they are listed (and of course MyArray cannot be read until n has been read).

This is indeed a limitation that we could lift by updating TStreamerInfo::Build to detect this case and to update the order in which the data members are stored in the file (so that n comes before MyArray). This is doable, but would require external help to provide the resources to implement and test it.

Cheers,
Philippe.

jwimberl · April 6, 2017, 7:24pm

Hi Philippe,

OK, thank you for the clarification. Sadly, this does make things a bit more difficult for me, but I should be able to work around it with a small run-time overhead. Due to the limitation you explained to me in another thread, where the index variable must be 32-bit, I have worked on creating structs of the form

struct epm_gsl_block {
   std::size_t size;
   double* data; //[reflex_size]
   uint32_t reflex_size;
};

struct epm_gsl_vector {
   std::size_t size;
   std::size_t stride;
   double* data; //[reflex_size]
   epm_gsl_block* block;
   std::size_t order;
   uint32_t reflex_size;
};

With this structure, a pointer of type epm_gsl_vector* v can be converted to a gsl_vector pointer via reinterpret_cast<gsl_vector*>(v). Importantly, this correctly handles the epm_gsl_block* member.

However, things will be more difficult once I move my reflex_size members to the front of the structs. Such an epm_gsl_vector* v can be cast via reinterpret_cast<gsl_vector*>(&(v->size)), but the block member of the reinterpreted pointer will point to a section of memory beginning with the uint32_t reflex_size member rather than the std::size_t size member, as is required for bit-compatibility.

I’ll have to circumvent this by temporarily adjusting the block pointer and then adjusting it back. So, unlike my previous attempted solution, there will be run-time costs involved instead of just compile-time reinterpreting (but it should have the advantage of working…).

jwimberl · April 6, 2017, 7:29pm

I wish I could offer such resources, for this and the 32 bit limitation, but sadly I cannot.

pcanal · April 6, 2017, 7:56pm

You might be able to do something like:

struct epm_gsl_block_io_size {
   uint32_t reflex_size;
};

struct epm_gsl_block { };

struct epm_gsl_block_real : public epm_gsl_block_io_size, epm_gsl_block {
   std::size_t size;
   double* data; //[reflex_size]
};

With this, the part known by the compiler as ‘epm_gsl_block’ in an epm_gsl_block_real object would start at the ‘right’ address (and the members would be in the right order for ROOT).

Cheers,
Philippe.

jwimberl · April 6, 2017, 8:06pm

Hi Philippe,

I’m intrigued by your suggestion. However, why is the epm_gsl_block struct empty? Did you perhaps mean

struct epm_gsl_block_io_size {
   uint32_t reflex_size;
};

struct epm_gsl_block {
   std::size_t size;
   double* data; //[reflex_size]
};

struct epm_gsl_block_real : public epm_gsl_block_io_size, epm_gsl_block { };

On second thought, you couldn’t have meant this, because then //[reflex_size] would mean nothing without the inheritance - but then an epm_gsl_block pointer is a pointer to an empty struct. Should that then be reinterpreted?

pcanal · April 6, 2017, 8:22pm

On second thought, you couldn’t have meant this, because then //[reflex_size] would mean nothing without the inheritance

Exactly.

but then an epm_gsl_block pointer is a pointer to an empty struct. Should that then be reinterpreted?

Yes. In your code you would only use epm_gsl_block_real, while the other code (with which you are trying to be binary compatible) has a different definition of epm_gsl_block, one that lines up nicely (member-wise in memory) with yours.

The reason to have the ‘empty’ struct is to get the compiler to do (automatically) the offset shuffling you were mentioning.

Cheers,
Philippe.

jwimberl · April 6, 2017, 8:36pm

Not to get too far off topic, but I can’t get this to work in a simple test:

#include <iostream>
#include <cstdint>

struct block_original {
	size_t size;
	double val;
};

struct block_io {
	uint32_t reflex_size;
};

struct block {
};

struct block_real : public block_io, block {
	size_t size;
	double val; //[reflex_size]
};

using namespace std;
int main(int argc, char *argv[]) {
	block_real* test = new block_real;
	test->reflex_size = 1;
	test->size = 1;
	test->val = 5.0;
	auto stest = reinterpret_cast<block_original*>(static_cast<block*>(test));
	std::cout << stest->size << std::endl;
	delete test;
}

I believe this is a faithful but simplified implementation of your suggestion. However, the printout of stest->size is 10376293541461622785 rather than 1. (Actually, it’s unpredictable, and does give 1 sometimes).

pcanal · April 6, 2017, 9:04pm

The empty class is not working out as I thought it would.
This works:

#include <iostream>
#include <cstdint>

struct block_original {
        size_t size;
        double val;
};

struct block_io {
        uint32_t reflex_size;
};

struct block {
        size_t size;
};

struct block_real : public block_io, block {
        double val; //[reflex_size]
};

using namespace std;
int main(int argc, char *argv[]) {
        block_real* test = new block_real;
        test->reflex_size = 1;
        test->size = 1;
        test->val = 5.0;
        block* b = test;
        auto stest = reinterpret_cast<block_original*>(static_cast<block*>(test));
        std::cout << (void*)test << '\n';
        std::cout << (void*)b << ' ' << sizeof(block) << '\n';
        std::cout << (void*)(static_cast<block*>(test)) << '\n';
        std::cout << (void*)stest << '\n';
        std::cout << (void*)reinterpret_cast<block_original*>(b) << '\n';
        std::cout << &(test->size) << std::endl;
        std::cout << &(stest->size) << std::endl;
        std::cout << stest->size << std::endl;
        delete test;
}

jwimberl · April 6, 2017, 9:29pm

Your solution has a wonderful extra feature that you might not have realized - it permits (with caution) casting backwards! Adding the following in main

	block_original* d = new block_original;
	d->size = 3;
	d->val = 7.0;
	auto e = reinterpret_cast<block*>(d);
	auto f = static_cast<block_real*>(e);
	std::cout << "d: " << (void*)d << '\n';
	std::cout << "e: " << (void*)e << '\n';
	std::cout << "f: " << (void*)f << ' ' << sizeof(block_io) << '\n';
	std::cout << "d.size: " << &(d->size) << std::endl;
	std::cout << "e.size: " << &(e->size) << std::endl;
	std::cout << "f.size: " << &(f->size) << std::endl;
	std::cout << "f.reflex_size = " << f->reflex_size << std::endl;
	std::cout << "f.size = " << f->size << std::endl;
	std::cout << "f.val = " << f->val << std::endl;
	delete d;

produces the output

d: 0x7fa54a4033f0
e: 0x7fa54a4033f0
f: 0x7fa54a4033e8 4
d.size: 0x7fa54a4033f0
e.size: 0x7fa54a4033f0
f.size: 0x7fa54a4033f0
f.reflex_size = 0
f.size = 3
f.val = 7

I’m not quite sure why the difference between e and f is only 2. However, this code successfully takes a pointer to a block_original and casts it to a block_real pointer, which starts at some point earlier in memory. The size and val elements of the cast block_real pointer refer to the same memory as for the block_original pointer; the reflex_size element refers to whatever part of memory precedes the block_original, of course, so reading it is undefined behavior and it should not be used. With caution, though, this should allow me to take GSL pointers created/allocated by GSL functions and treat them as though they are my type for the purposes of using C++ operator overloads.

pcanal · April 6, 2017, 9:45pm

Indeed, this is one of the virtues of this option that I did not make clear! (Sorry.)

I’m not quite sure why the difference between e and f is only 2.

The numbers/addresses are in hexadecimal notation, so the difference is actually 8.

Cheers,
Philippe.
