This might be a misunderstanding, rather than a bug.
I examined the file from CERNBox (https://cernbox.cern.ch/s/4mOHVGU509efdAS) and created a new one using @umute97’s reproducer (https://root-forum.cern.ch/t/wrong-branch-type-detection/59236/19). In both cases, I see a TTree with TBranches whose type is double[] (variable-length arrays), and they have a common counter TLeaf in a TBranch named n. They are not std::vector<double> because Uproot does not write this data type. (https://github.com/scikit-hep/uproot5/issues/257 has been open for a long time, but it would be a big project.)
The variable-length arrays happen to all have the same length, 1024, but that’s because the Awkward Arrays were constructed this way, with a for loop over chunk_size filling a Pandas DataFrame with dtype=object (Python lists, not arrays), and then that was converted into an Awkward Array (by iteration over the Python lists in the DataFrame). That is, the array’s type is
10 * var * float64
instead of
10 * 1024 * float64
If you wanted arrays of fixed-size data, in which 1024 is part of the data type, you could construct it with NumPy and pass that to Awkward, or just convert the Awkward data with ak.to_regular (https://awkward-array.org/doc/main/reference/generated/ak.to_regular.html) after the fact, replacing
output_file[tree] = {"": ak.zip(data)}
with
output_file[tree] = {k: ak.to_regular(v) for k, v in data.items()}
(I also made the dict explicit, instead of concatenating "" to record field names.) With a construction like that, the ROOT file would be filled with
******************************************************************************
*Tree :tree : *
*Entries : 10 : Total = 165306 bytes File Size = 156980 *
* : : Tree compression factor = 1.05 *
******************************************************************************
*Br 0 :w0 : w0[1024]/D *
*Entries : 10 : Total Size= 82482 bytes File Size = 77811 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 1.05 *
*............................................................................*
*Br 1 :w1 : w1[1024]/D *
*Entries : 10 : Total Size= 82482 bytes File Size = 77826 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 1.05 *
*............................................................................*
instead of
******************************************************************************
*Tree :tree : *
*Entries : 10 : Total = 165973 bytes File Size = 157659 *
* : : Tree compression factor = 1.05 *
******************************************************************************
*Br 0 :n : n/I *
*Entries : 10 : Total Size= 577 bytes File Size = 92 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 1.17 *
*............................................................................*
*Br 1 :w0 : w0/D *
*Entries : 10 : Total Size= 82590 bytes File Size = 77881 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 1.05 *
*............................................................................*
*Br 2 :w1 : w1/D *
*Entries : 10 : Total Size= 82590 bytes File Size = 77872 *
*Baskets : 1 : Basket Size= 32000 bytes Compression= 1.05 *
*............................................................................*
if that’s what you’re trying to do.
But re-reading the whole thread again, it doesn’t seem to be about fixed-length versus variable-length types; it seems to be about double[] arrays versus std::vector<double> (both of which are variable-length types). Uproot doesn’t write std::vector, and the tree.show() table shows arrays, not vectors, as the C++ type (middle column):
>>> tree.show()
name | typename | interpretation
---------------------+--------------------------+-------------------------------
n | int32_t | AsDtype('>i4')
w0 | double[] | AsJagged(AsDtype('>f8'))
w1 | double[] | AsJagged(AsDtype('>f8'))
w10 | double[] | AsJagged(AsDtype('>f8'))
w11 | double[] | AsJagged(AsDtype('>f8'))
w12 | double[] | AsJagged(AsDtype('>f8'))
w13 | double[] | AsJagged(AsDtype('>f8'))
w14 | double[] | AsJagged(AsDtype('>f8'))
w15 | double[] | AsJagged(AsDtype('>f8'))
w2 | double[] | AsJagged(AsDtype('>f8'))
w3 | double[] | AsJagged(AsDtype('>f8'))
w4 | double[] | AsJagged(AsDtype('>f8'))
w5 | double[] | AsJagged(AsDtype('>f8'))
w6 | double[] | AsJagged(AsDtype('>f8'))
w7 | double[] | AsJagged(AsDtype('>f8'))
w8 | double[] | AsJagged(AsDtype('>f8'))
w9 | double[] | AsJagged(AsDtype('>f8'))
trg0 | double[] | AsJagged(AsDtype('>f8'))
trg1 | double[] | AsJagged(AsDtype('>f8'))
(Dang, I had to go back and remove a bunch of links, too!)