Persistent 2D arrays

filimon · July 4, 2011, 2:37pm

Hi, I have a
class {
Int_t fRows, fColumns;
Float_t** fMatrix; //[fRow][fColumns]
};

fMatrix is allocated the usual way, i.e.
fMatrix=new Float_t*[fRows];
for(i=0; i<fRows; ++i) fMatrix[i]=new Float_t[fColumns];
I assume the streamer is not correct, since there is no guarantee that the rows are allocated contiguously in memory. Indeed, I do not get correct results… What would be the best way to achieve the intended result, i.e. store a matrix?
Is there any lighter version of TMatrixF? Thanks,
filimon

pcanal · July 5, 2011, 1:36pm

Hi,

The syntax:
//[fRow][fColumns]
is not supported and you would need to allocate a single array:
Int_t fRows, fColumns;
Int_t fElements; // always equal to fRows * fColumns
Float_t* fMatrix; //[fElements]
The easiest solution is to use a vector<vector<Float_t> > (for which you will need to provide the dictionary).

Cheers,
Philippe.
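
A minimal sketch of the single-array layout described above, assuming row-major order. The class name FlatMatrix, the At and GetElements accessors, and the Int_t/Float_t aliases (stand-ins so that it compiles without ROOT) are illustrative assumptions and not part of the original posts. In a real ROOT class you would use ROOT's own typedefs and generate a dictionary for the class.

```cpp
#include <cassert>

// Stand-ins for ROOT's typedefs so the sketch compiles without ROOT (assumption).
using Int_t = int;
using Float_t = float;

// Hypothetical class name; only the data-member layout comes from the reply above.
class FlatMatrix {
public:
   // ROOT I/O needs a default constructor; it leaves the matrix empty.
   FlatMatrix() : fRows(0), fColumns(0), fElements(0), fMatrix(nullptr) {}

   FlatMatrix(Int_t rows, Int_t columns)
      : fRows(rows), fColumns(columns), fElements(rows * columns),
        fMatrix(fElements > 0 ? new Float_t[fElements]() : nullptr) {}

   ~FlatMatrix() { delete[] fMatrix; }

   // Non-copyable to keep the sketch short; a real class would define copy semantics.
   FlatMatrix(const FlatMatrix &) = delete;
   FlatMatrix &operator=(const FlatMatrix &) = delete;

   // Row-major mapping: element (row, column) lives at row * fColumns + column.
   Float_t &At(Int_t row, Int_t column) {
      assert(row >= 0 && row < fRows && column >= 0 && column < fColumns);
      return fMatrix[row * fColumns + column];
   }

   Int_t GetElements() const { return fElements; }

private:
   Int_t fRows, fColumns;
   Int_t fElements;   // always equal to fRows * fColumns
   Float_t *fMatrix;  //[fElements]
};
```

The point of the extra fElements member is that the //[fElements] comment can only name a single integer data member as the array length, so the product of the two dimensions has to be stored explicitly.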

filimon · July 5, 2011, 5:28pm

Hi Philippe,
using vector is not a good solution for me because I cannot afford the extra overhead of saving it on the disk. I will try to implement it with a flat array, but since the discussion started I have 2 comments/questions that may also solve the problem in a different way and help me understand.

1. I get a segmentation violation when I instantiate objects via TClonesArray::New(idx) that contain an Int_t** fArray; //[fSize]. fSize (=frow*fcol) is set to 0 before this TClonesArray statement (in the initialization list, which also contains fArray(0)). How should the pointer initialization be treated properly then? The solution I randomly found involves some ugly new Int_t[0] and new Int_t[0] (where 0 is actually the value of “fSize” or equivalent in the default ctor).
2. Checking user guide page 172 bottom I see that in principle double indirection could be used. Assuming the above problem gets solved, how is this compatible with the fact that each row is allocated at a non-contiguous address in my original code snippet? (I assume it is not, unless your streamer follows the pointers one by one in the first indirection.) Should I do the trick of
int **a = new int*[m];
a[0] = new int[m*n];
for (int i = 1; i < m; i++) a[i] = a[i-1] + n;
which allows contiguous allocation? If done this way, would it work? Thanks,
filimon
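
For reference, the row-pointer trick from the post above can be sketched as a pair of helpers. The names AllocContiguous and FreeContiguous are hypothetical, and the sketch assumes m and n are both positive. It only demonstrates the memory layout; as the reply that follows explains, a //[fSize] comment on an Int_t** member still does not tell ROOT how long the inner arrays are.

```cpp
#include <cassert>

// Allocate an m x n int matrix as one contiguous block plus a table of row pointers.
// Hypothetical helper names; the technique is the one quoted in the post above.
int **AllocContiguous(int m, int n) {
   assert(m > 0 && n > 0);
   int **a = new int *[m];    // table of m row pointers
   a[0] = new int[m * n]();   // single zero-initialised block holding all elements
   for (int i = 1; i < m; ++i)
      a[i] = a[i - 1] + n;    // each row starts n elements after the previous one
   return a;
}

// Release both allocations made by AllocContiguous.
void FreeContiguous(int **a) {
   delete[] a[0];  // the element block
   delete[] a;     // the row-pointer table
}
```

With this layout a[i][j] and a[0][i*n + j] refer to the same element, so the data could equally be handled as a flat array of m*n values.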

pcanal · July 5, 2011, 5:58pm

[quote]using vector is not a good solution for me because I cannot afford the extra overhead of saving it on the disk.[/quote]Why? What is the extra cost that pushes you past your size budget?

Int_t** fArray; //[fSize]
ROOT cannot really handle this construct, as it has no way of knowing the size of the inner arrays (in the case of objects, the size is assumed to always be 1).

[quote]2. Checking user guide page 172 bottom I see that in principle double indirection could be used.[/quote]Yes, as long as the inner pointer is not allocated as an array (so new MyObject() and/or new int).

The non-vector solution is something like:
Int_t fRows, fColumns;
Int_t fElements; // always equal to fRows * fColumns
Float_t* fMatrix; //[fElements]
....
fMatrix = new Float_t[fElements];
....
float GetElement(Int_t row, Int_t column)
{
   return fMatrix[ row*fColumns + column ]; // row-major; or column*fRows + row for the transposed layout
}

Cheers,
Philippe.

filimon · July 6, 2011, 10:58am

The data structure is supposed to be digits of hits and/or events. Especially in the case of hits where each array can hold <10 entries (but then many such arrays) the extra overhead of a double vector seems a little bit too much to me…
In any case, it is now implemented in the well-known way, [i][j]==i*ncol+j.
Then just to be clear, the only possible interpretation by the ROOT streamer at the moment is that your example construct
Int_t** fArray; //[fSize]
will actually store fSize addresses to a file since there is no way for the streamer to follow the allocated addresses, right? It is not clear to me why this limitation exists though… There is the parsing possibility obviously of comments like //[fSizeRow, fSizeCol] already used in the Double32_t case. Then I assume the streamer already handles “deep” following of pointers at some point inside streaming objects of arbitrary types anyway, given that ROOT streaming seems quite complex. So putting it all together it should in principle work, or am I missing something fundamental? (This is not a feature request, just for my understanding…)
Finally, my use case would be to do something like myTree->Draw("fArray[myrow][mycol]"). This is already doable for matrices of many indices with compile-time constant dimensions but not possible through the way we discuss, since ROOT TTree correctly sees my array as one dimensional now (with the flattening trick that you proposed). The workaround is to provide inline member functions that do the index calculation (done already), so as to have myTree->Draw("<class_that_contains_the_fArray>.ArrayValAt(row, col)"). Cheers,
filimon

pcanal · July 6, 2011, 7:41pm

[quote]Especially in the case of hits where each array can hold <10 entries (but then many such arrays) the extra overhead of a double vector seems a little bit too much to me…[/quote]Humm … so you are worried about the size in memory? If so I agree (if you have a lot of them).

[quote]Int_t** fArray; //[fSize]
will actually store fSize addresses to a file since there is no way for the streamer to follow the allocated addresses, right?[/quote]Not quite, it would follow the pointer but only for exactly one element … because there is no (implemented) way to know the size (indeed, we could add support for this case via “//[fRow][fColumns]”, but this is not implemented).

[quote] There is the parsing possibility obviously of comments like //[fSizeRow, fSizeCol] already used in the Double32_t case.[/quote]Yes, it is implementable … it just has not been done so far.

[quote]So putting it all together it should in principle work[/quote]Yes

[quote]This is already doable for matrices of many indices with compile-time constant dimensions but not possible through the way we discuss, since ROOT TTree correctly sees my array as one dimensional now (with the flattening trick that you proposed).[/quote]You are correct.

Cheers,
Philippe.
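
To put a rough number on the in-memory overhead discussed above, here is a small sketch that compares the per-matrix bookkeeping of a vector<vector<float> > with that of the flat layout. The function names are hypothetical, the figures depend on the platform and standard library, and heap-allocator overhead per row is deliberately ignored, so treat the result as a lower bound rather than a measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bookkeeping bytes (excluding the elements themselves and allocator overhead)
// for a matrix stored as vector<vector<float>>: one outer vector object plus
// one inner vector object per row.
std::size_t NestedBookkeeping(std::size_t rows) {
   assert(rows > 0);
   return sizeof(std::vector<std::vector<float>>) + rows * sizeof(std::vector<float>);
}

// Bookkeeping bytes for the flat layout: fRows, fColumns, fElements and one pointer.
std::size_t FlatBookkeeping() {
   return 3 * sizeof(int) + sizeof(float *);
}
```

For many small matrices (fewer than 10 entries each, as in the hits use case), the nested bookkeeping can easily exceed the payload itself, which is the concern raised in the thread.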