How does Reflex treat pointers to the same array?

jwimberl · April 6, 2017, 6:35pm

I am trying to make Reflex compatible versions of the GSL linear algebra types. Before my modifications for reflection, these are structs of the form

struct gsl_block {
	std::size_t size;
	double *data;
};

struct gsl_vector {
	std::size_t size;
	std::size_t stride;
	double *data;
	gsl_block *block;
	int owner;
};

Within GSL, memory is allocated and types are initialized such that, for a gsl_vector* v object, the pointers v->data and v->block->data point to the same array in memory.
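To make the aliasing concrete, here is a minimal sketch. It uses local stand-ins for the two structs rather than the real GSL headers, and make_vector, free_vector and shares_storage are hypothetical helpers that only mimic the pointer relationship described above; they are not GSL functions.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-ins for the GSL structs quoted above (not the real GSL headers).
struct gsl_block {
    std::size_t size;
    double* data;
};

struct gsl_vector {
    std::size_t size;
    std::size_t stride;
    double* data;
    gsl_block* block;
    int owner;
};

// Hypothetical helper: builds a vector whose data pointer aliases its block's array.
gsl_vector* make_vector(std::size_t n) {
    assert(n > 0);
    auto* b = new gsl_block{n, new double[n]()};  // zero-initialized storage
    return new gsl_vector{n, 1, b->data, b, 1};   // data and block->data alias
}

// Hypothetical helper: releases what make_vector allocated.
void free_vector(gsl_vector* v) {
    if (v->owner) {
        delete[] v->block->data;
        delete v->block;
    }
    delete v;
}

// True when both pointers refer to the same underlying array.
bool shares_storage(const gsl_vector* v) {
    return v->data == v->block->data;
}
```

A writer that follows each pointer independently would see two arrays here, which is exactly the duplication I am asking about.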

When I create reflex-compatible versions of these structs (which might not be possible; several other questions I’ve posted here deal with various other issues), will Reflex recognize that these are two pointers to the same array? Or will it duplicate the array when saving it to persistence?

As an additional note, I recognize now that gsl_vector.block is a non-array pointer. How would this be annotated for Reflex – //[1]?

pcanal · April 6, 2017, 7:10pm

As an additional note, I recognize now that gsl_vector.block is a non-array pointer. How would this be annotated for Reflex – //[1]?

Pointers to numeric types are assumed to be arrays and thus need to be decorated.
Pointers to objects are assumed to be non-array and a single element is stored.
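As a rough illustration of those two cases, the sketch below uses hypothetical struct and member names (example_block, example_vector, n). The //[n] comment is the size decoration for a numeric pointer, naming an integer member that holds the element count; the object pointer carries no decoration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example types, used only to show where the comments go.
struct example_block {
    int size;              // element count for the array below
    double* data;          //[size]
};

struct example_vector {
    int n;                 // element count, declared before the array it sizes
    double* data;          //[n]
    example_block* block;  // pointer to an object: a single element, no decoration
};

// Hypothetical helper: checks that the counter agrees with the block it points to.
bool counter_matches(const example_vector& v) {
    return v.block != nullptr && v.n == v.block->size;
}
```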

will Reflex recognize that these are two pointers to the same array?

In ROOT I/O we do have, for objects only, code to recognize that the same object is being stored twice (within one I/O operation). Consequently, your example would have worked if the data had been structs/classes. For numeric types, this detection has not been implemented (the case is very rare, and the check would add a cost to all cases).

Instead, in this case, I recommend setting gsl_vector::data as a transient member (never saved) and adding an I/O customization rule that sets gsl_vector::data to the value of gsl_vector::block::data at the end of the streaming.
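In plain C++ terms, the effect of such a rule can be sketched as below. This is only an illustration of the fix-up step, not ROOT code: the struct definitions are simplified stand-ins and restore_data_pointer is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for the reflected structs.
struct epm_gsl_block {
    std::size_t size;
    double* data;
};

struct epm_gsl_vector {
    std::size_t size;
    std::size_t stride;
    double* data;          // transient: never written, so unset after reading
    epm_gsl_block* block;  // stored, and therefore valid after reading
    int owner;
};

// Hypothetical helper: re-establishes the alias once the block has been read.
void restore_data_pointer(epm_gsl_vector& v) {
    v.data = (v.block != nullptr) ? v.block->data : nullptr;
    // Postcondition: data either aliases the block's array or is null.
    assert(v.block == nullptr || v.data == v.block->data);
}
```

Only one copy of the array ends up on file, and the second pointer is recomputed from the first when the object is read back.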

Cheers,
Philippe.

jwimberl · April 6, 2017, 7:14pm

Hello Philippe,

Thank you for the very clear explanation. Indeed, I was wondering if declaring something as transient was the way to go, but I didn’t know about the I/O customization rules so I was worried it would leave code in a broken state. I am searching online for instructions on how to use these I/O customization rules, and I’ve found many presentation slides that you’ve written but they all seem to be high-level summaries, without instructions. Is there a resource explaining how to do this?

Cheers,

Jack

pcanal · April 6, 2017, 7:35pm

See https://root.cern.ch/root/html/io/DataModelEvolution.html

jwimberl · April 6, 2017, 7:54pm

OK, based on the contents of the documentation I believe

<lcgdict>
  ...
  <class name="epm_gsl_block" />
  <class name="epm_gsl_vector">
    <field name = "data" transient = "true"/>
  </class>
  <class name="epm_gsl_matrix">
    <field name = "data" transient = "true"/>
  </class>
  <ioread sourceClass="epm_gsl_vector"
	  source="epm_gsl_block* block"
	  targetClass="epm_gsl_vector"
	  target="double* data"
	  >
<![CDATA[
   data = onfile.block->data;
]]>
  </ioread>
  <ioread sourceClass="epm_gsl_matrix"
	  source="epm_gsl_block* block"
	  targetClass="epm_gsl_matrix"
	  target="double* data"
	  >
<![CDATA[
   data = onfile.block->data;
]]>
  </ioread>
</lcgdict>

should implement your suggestion.

jwimberl · April 7, 2017, 3:08am

Would the reverse be possible, setting epm_gsl_block::data as a transient member and adding a customization rule to set epm_gsl_vector::block::data equal to epm_gsl_vector::data, via

<lcgdict>
  ...
  <class name="epm_gsl_block">
    <field name = "data" transient = "true"/>
  </class>
  <class name="epm_gsl_vector" />
  <class name="epm_gsl_matrix" />
  <ioread sourceClass="epm_gsl_vector"
          source="double* data"
	  targetClass="epm_gsl_vector"
	  target="epm_gsl_block* block"
	  >
<![CDATA[
   block->data = onfile.data;
]]>
  </ioread>
  <ioread sourceClass="epm_gsl_matrix"
	  source="double* data"
	  targetClass="epm_gsl_matrix"
	  target="epm_gsl_block* block"
	  >
<![CDATA[
   block->data = onfile.data;
]]>
  </ioread>
</lcgdict>

I ask because, combining all your helpful responses from several threads, I think my best option is to apply your struct inheritance trick to epm_gsl_vector and epm_gsl_matrix. For the former (the latter follows by analogy),

struct epm_gsl_block {
        size_t size;
        double* data;
};

struct epm_gsl_vector_io {
        uint32_t reflex_size;
};

struct epm_gsl_vector_substruct {
        size_t size;
};

struct epm_gsl_vector : public epm_gsl_vector_io, epm_gsl_vector_substruct {
        size_t stride;
        double* data; //[reflex_size]
        epm_gsl_block *block;
        int owner;
};

In this setup, I believe that I can call

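gsl_vector* GSL(epm_gsl_vector* ev) {
        // Upcast to the base that mirrors gsl_vector::size, then reinterpret.
        auto gv = reinterpret_cast<gsl_vector*>(static_cast<epm_gsl_vector_substruct*>(ev));
        return gv;
}

epm_gsl_vector* DeGSL(gsl_vector* gv) {
        // Reverse of GSL(): reinterpret as the base, then downcast to the derived struct.
        auto ev = static_cast<epm_gsl_vector*>(reinterpret_cast<epm_gsl_vector_substruct*>(gv));
        return ev;
}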

The advantage this has is that the pointer epm_gsl_vector::block does not have to be modified in any way to be cast to a pointer to a gsl_block. If epm_gsl_block instead carried the modifications, I cannot find an easy way for casting operations on epm_gsl_vector to appropriately modify the epm_gsl_vector::block element.
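These casts only make sense if the members that follow the epm_gsl_vector_substruct base sit at the same byte offsets as in gsl_vector, which the language does not guarantee for a struct with two base classes. A small check along the following lines could confirm it for a given compiler. The gsl_block and gsl_vector definitions are local stand-ins for the GSL headers, and offset_in_view and layout_matches_gsl are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-ins for the GSL structs; the real ones come from the GSL headers.
struct gsl_block { std::size_t size; double* data; };
struct gsl_vector {
    std::size_t size;
    std::size_t stride;
    double* data;
    gsl_block* block;
    int owner;
};

struct epm_gsl_block { std::size_t size; double* data; };
struct epm_gsl_vector_io { std::uint32_t reflex_size; };
struct epm_gsl_vector_substruct { std::size_t size; };
struct epm_gsl_vector : epm_gsl_vector_io, epm_gsl_vector_substruct {
    std::size_t stride;
    double* data;
    epm_gsl_block* block;
    int owner;
};

gsl_vector* GSL(epm_gsl_vector* ev) {
    return reinterpret_cast<gsl_vector*>(static_cast<epm_gsl_vector_substruct*>(ev));
}

epm_gsl_vector* DeGSL(gsl_vector* gv) {
    return static_cast<epm_gsl_vector*>(reinterpret_cast<epm_gsl_vector_substruct*>(gv));
}

// Hypothetical helper: byte offset of a member, measured from the substruct base.
std::size_t offset_in_view(const epm_gsl_vector& ev, const void* member) {
    const char* base = reinterpret_cast<const char*>(
        static_cast<const epm_gsl_vector_substruct*>(&ev));
    const char* m = static_cast<const char*>(member);
    assert(m >= base);
    return static_cast<std::size_t>(m - base);
}

// True when every member sits where gsl_vector expects it on this compiler.
bool layout_matches_gsl() {
    epm_gsl_vector ev{};
    return offset_in_view(ev, &ev.size) == offsetof(gsl_vector, size)
        && offset_in_view(ev, &ev.stride) == offsetof(gsl_vector, stride)
        && offset_in_view(ev, &ev.data) == offsetof(gsl_vector, data)
        && offset_in_view(ev, &ev.block) == offsetof(gsl_vector, block)
        && offset_in_view(ev, &ev.owner) == offsetof(gsl_vector, owner);
}
```

If layout_matches_gsl() returns false on some platform, the reinterpret_cast approach should not be used there.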

pcanal · April 7, 2017, 9:41pm

Hi,

I think that a further simplification (and reduction of on-file size) is to mark both epm_gsl_vector_substruct::size and epm_gsl_vector::block as transient.

<ioread sourceClass="epm_gsl_vector"
        source="double* data"
        targetClass="epm_gsl_vector"
        target="data,block,size"
        >
<![CDATA[
   data = onfile.data; // copy pointer value in place.
   newObj->size = newObj->reflex_size;
   newObj->block = new epm_gsl_block;
   newObj->block->data = data;
   newObj->block->size = newObj->reflex_size;
]]>
</ioread>
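
For reference, the body of this rule corresponds roughly to the following plain C++ function. It is a sketch only: the structs are simplified stand-ins, and rebuild_after_read is a hypothetical name for what the rule does with newObj.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-ins for the reflected structs.
struct epm_gsl_block { std::size_t size; double* data; };
struct epm_gsl_vector_io { std::uint32_t reflex_size; };
struct epm_gsl_vector_substruct { std::size_t size; };  // size: transient
struct epm_gsl_vector : epm_gsl_vector_io, epm_gsl_vector_substruct {
    std::size_t stride;
    double* data;          //[reflex_size]
    epm_gsl_block* block;  // transient, rebuilt after reading
    int owner;
};

// Hypothetical helper: rebuilds the transient members from what was stored.
void rebuild_after_read(epm_gsl_vector* newObj) {
    assert(newObj != nullptr);
    newObj->size = newObj->reflex_size;
    newObj->block = new epm_gsl_block{newObj->reflex_size, newObj->data};
}
```

Note that the block is allocated with new here, so whoever releases it has to use delete. If the object is later handed to GSL's own deallocation routines, which are C code, the allocation would presumably need to use malloc instead.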

Cheers,
Philippe.

jwimberl · April 8, 2017, 2:18pm

Hi Philippe,

Thanks for this suggestion; I will try it. I’m currently trying to diagnose a segfault, however, in TStreamerInfo::InitCounter, related to my epm_gsl_matrix struct (renamed to espresso_matrix), which follows your prescription of inheriting from an espresso_matrix_io whose single field is reflex_size:

(gdb) info f
Stack level 0, frame at 0x7ffffffd64c0:
 rip = 0x7fffeed01be0 in InitCounter
    (/mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Debug/COMPILER/gcc49/LABEL/slc6/build/projects/ROOT-6.08.02/src/ROOT/6.08.02/core/meta/src/TStreamerElement.cxx:74); saved rip = 0x7fffeed04fad
 called by frame at 0x7ffffffd64f0
 source language c++.
 Arglist at 0x7ffffffd64b0, args: countClass=0x7edb350 "espresso_matrix_io", countName=0x7edb319 "reflex_size", directive=0x7ed25d0
 Locals at 0x7ffffffd64b0, Previous frame's sp is 0x7ffffffd64c0
 Saved registers:
  rbx at 0x7ffffffd64a8, rbp at 0x7ffffffd64b0, rip at 0x7ffffffd64b8

(gdb) info args
countClass = 0x7edb350 "espresso_matrix_io"
countName = 0x7edb319 "reflex_size"
directive = 0x7ed25d0

(gdb) info locals
info = 0x7ed25d0
rdCounter = 0x7ffff07d25e4 <TString::GetPointer() const+40>
dmCounter = 0x7ffffffd6490
cl = 0x7edb328
counter = 0x0

The cause of the segfault is not immediately obvious to me, because none of the local variables that have been set at this line (so, excluding counter) are null pointers.

pcanal · April 8, 2017, 2:24pm

Strange, at that location the code is:

         if (!rdCounter) return 0;
         TDataMember *dmCounter = rdCounter->GetDataMember();

Can you run the failing example with valgrind?

Cheers,
Philippe.

jwimberl · April 8, 2017, 3:05pm

Hi Philippe,

I think this is the relevant output of valgrind:

==26713== Invalid read of size 8
==26713==    at 0xF564BE0: InitCounter(char const*, char const*, TObject*) (TStreamerElement.cxx:74)
==26713==    by 0xF567FAC: TStreamerBasicPointer::Init(TObject*) (TStreamerElement.cxx:948)
==26713==    by 0xEAF69A2: TStreamerInfo::BuildOld() (TStreamerInfo.cxx:2000)
==26713==    by 0xEAF0924: TStreamerInfo::Build() (TStreamerInfo.cxx:635)
==26713==    by 0xEA70457: TBufferFile::WriteClassBuffer(TClass const*, void*) (TBufferFile.cxx:3934)
==26713==    by 0xF53207C: TClass::StreamerStreamerInfo(TClass const*, void*, TBuffer&, TClass const*) (TClass.cxx:6414)
==26713==    by 0xDBF1D66: TClass::Streamer(void*, TBuffer&, TClass const*) const (TClass.h:547)
==26713==    by 0xEA6C9ED: TBufferFile::WriteObjectClass(void const*, TClass const*) (TBufferFile.cxx:2588)
==26713==    by 0xEA6CC08: TBufferFile::WriteObjectAny(void const*, TClass const*) (TBufferFile.cxx:2646)
==26713==    by 0xEA6BEE1: TBufferFile::WriteFastArray(void**, TClass const*, int, bool, TMemberStreamer*) (TBufferFile.cxx:2331)
==26713==    by 0xEC99043: int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) (TStreamerInfoWriteBuffer.cxx:449)
==26713==    by 0xEB0DA11: TStreamerInfoActions::GenericWriteAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) (TStreamerInfoActions.cxx:174)
==26713==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==26713==

 *** Break *** segmentation violation

pcanal · April 8, 2017, 4:31pm

Can you remind me which version of ROOT you are using?

jwimberl · April 8, 2017, 4:40pm

I’m using 6.08/02 – hopefully the error isn’t caused by a mistake I’m making somewhere else. I get these errors when directly adding one of my classes (which has the GSL structs as members of members) to a TFile and saving it, or when saving it through a RooWorkspace.

pcanal · April 8, 2017, 8:07pm

Hi,

This is odd, since there is already a protection against a nullptr dereference.

Back to gdb, what does gdb show as the content of the error line?

Cheers,
Philippe.

jwimberl · April 10, 2017, 1:05pm

I printed all the GDB information that I have, unfortunately – “list” tells me nothing, and gdb just reports line 74 as the issue. I’m attaching gdb to a PyROOT script, so perhaps the intermediate Python layer limits the available debug information. It is odd that valgrind says there is a null pointer when GDB does not. I’m trying to compile an equivalent C++ program and investigate this in more detail – I’ll post here when I have more information. Thanks again for all of your help!
