Extending arrays in PyROOT

tianlu · August 21, 2013, 4:08pm

I’ve been following the method outlined in the ROOT manual for storing lists of primitives using Python’s array module. The following code:

import array
import ROOT

x = array.array('i', [0]*10)
t = ROOT.TTree()
t.Branch('x', x, 'x[4]/I')

x[1]=1
x[2]=2
x[3]=3

t.Fill()

will fill the branch “x” with the first 4 elements of x (i.e. [0,1,2,3]). But if I modify the last part to

x.extend([0]*10)
t.Fill()

the branch “x” is filled with gibberish. Is this a known effect? I checked the memory address of x using id(x) and it remains the same before and after the x.extend call. It’d be quite useful to be able to extend the array when the number of elements to be filled is not known in advance. However, perhaps there’s a better class for handling list-like branches in PyROOT?

honk · August 21, 2013, 6:05pm

AFAIK this is not how arrays work on the C++ side (e.g. you already promised the TTree that x had 4 elements). A better solution would be a vector.

import ROOT as r
x = r.vector('int')(4)
t = r.TTree()
t.Branch('x', x)

x[0] = 0
x[1] = 1
x[2] = 2
x[3] = 3
t.Fill()

x.resize(10)
t.Fill()

tianlu · August 21, 2013, 8:26pm

I see. I wasn’t aware of the stdlib classes in PyROOT (duh), but I think that works. I’ve just been using large arrays so far. This seems better functionality-wise, but maybe not so much performance-wise if we’re calling resize a lot. Thanks for the help.

tianlu · August 21, 2013, 8:56pm

I’m still curious as to why resizing the array in python should cause issues when TTrees are filled. It’s not resizing the (C++) array in the branch, and the address of the python array remains unchanged… or does it?

honk · August 21, 2013, 9:15pm

On the C++ side arrays have a fixed size determined at compile time (whatever that means for CINT and the thing parsing your branch constructor). This size can never change: if you say you’ll take a size of 4 that’s what you’ll have forever.

Python’s array class is something completely different and seems to mostly map onto std::vector in terms of functionality. I suggested using the CINT facade for a vector to avoid any friction at the interface between the languages, but it might be that Python’s arrays can be used with some appropriate branch type (probably not C arrays). Note that vector::resize can be used to make a vector smaller as well as larger.

wlav · August 26, 2013, 6:25pm

Hi,

tianlu wrote: “I checked the memory address of x using id(x) and it remains the same before and after the x.extend call.”

But that does not actually check the address of the array: it only confirms that the wrapper object on the python side that holds the array is still the same (as it should be, since all python objects are held as references).

To see the underlying address of the held array, look at the result of the buffer_info() member function, which returns the buffer address together with the current length. Note that underneath, the extend call does the equivalent of a realloc. That is, sometimes the address may remain the same, if there is additional memory available directly after the buffer; at other times the whole array gets copied over to a new, larger memory location.
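
A minimal sketch of that check, using only the standard array module (no ROOT needed). The variable names are purely illustrative, and the large extension size is an assumption chosen to make a move of the buffer likely; whether the address actually changes depends on the allocator, so the script prints the result rather than asserting it.

import array

x = array.array('i', [0] * 10)

# id() identifies the Python wrapper object, not the data buffer it holds.
wrapper_id_before = id(x)
# buffer_info() returns (address of the data buffer, number of elements).
addr_before, len_before = x.buffer_info()

# Grow the array a lot; a reallocation to a new location is likely, not guaranteed.
x.extend([0] * 1000000)

wrapper_id_after = id(x)
addr_after, len_after = x.buffer_info()

print("same wrapper object:", wrapper_id_before == wrapper_id_after)
print("buffer address changed:", addr_before != addr_after)
print("length:", len_before, "->", len_after)

If the buffer address changes, a branch that was given the old address at Branch() time would still be reading from the old location, which would be consistent with the gibberish seen after x.extend.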

Cheers,
Wim