ROOT buffers and Numpy

jfcaron · September 27, 2012, 10:01pm

This question has been posed many times, and I found several solutions to the problem of converting ROOT buffers (such as the one returned by TGraph.GetY()) into a usable Numpy array. Could an expert chime in regarding the differences between the following three ways of proceeding? Are they completely equivalent, or is there some subtle difference?

In the code examples tgraph is a TGraph filled with Double_t values, so that tgraph.GetY() returns a “<read-write buffer ptr 0x… >” object.

y_buff = tgraph.GetY()
N = tgraph.GetN()
y_arr = numpy.frombuffer(y_buff,count=N)

y_buff = tgraph.GetY()
N = tgraph.GetN()
y_arr = numpy.ndarray(N,'d',y_buff)

y_buff = tgraph.GetY()
N = tgraph.GetN()
y_buff.SetSize(N)
y_arr = numpy.array(y_buff,copy=True)

I tried using the first two for my purposes, but I found inconsistent behavior: they worked some of the time, mostly with hand-typed examples rather than in scripts. Only the third way works consistently for me.

Edit: relevant old posts:

wlav · September 28, 2012, 3:04pm

Hi,

what does “inconsistent behavior” mean? The first two options use the buffer as their underlying memory, on which they provide the array interface. In other words, if the TGraph modifies the original buffer, the numpy array will reflect that change. Are there further operations on the TGraph that could cause it to change the Y values? Or does it perhaps get deleted (and the array with it)? The last option creates an independent array for numpy’s use.

Cheers,
Wim
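
To illustrate the distinction Wim describes, here is a minimal sketch that does not need ROOT. A Python array.array('d') stands in for the buffer returned by tgraph.GetY(); that substitution is an assumption made only so the snippet runs anywhere, and the real ROOT buffer may differ in details such as needing SetSize(N) first.

```python
import array
import numpy

# Stand-in for the ROOT buffer (assumption): any object exposing
# the Python buffer protocol with double-precision values.
y_buff = array.array('d', [1.0, 2.0, 3.0])
N = len(y_buff)

view_a = numpy.frombuffer(y_buff, dtype='d', count=N)  # option 1: shares memory
view_b = numpy.ndarray(N, 'd', y_buff)                 # option 2: shares memory
copy_c = numpy.array(y_buff, copy=True)                # option 3: independent copy

# Simulate the owner (the TGraph in the thread) changing its data.
y_buff[0] = 99.0

print(view_a[0], view_b[0], copy_c[0])  # 99.0 99.0 1.0
```

The two views follow the change made through the owner, while the copy keeps the old value.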

jfcaron · September 28, 2012, 6:16pm

In my code I take existing TGraphs, then I would GetY() and GetX() and perform some transformation on the values using numpy, then I’d put the transformed arrays back into a TGraph constructor to make the transformed graph.

If I was using one of the first two options, with the numpy array and the original TGraph sharing the same memory, I guess the transformations would change the TGraph? Or does GetY() return a copy of the values in the graph?

When I use a numpy array whose memory is shared with a TGraph to construct another TGraph, are the values copied then, or would the two TGraphs now share the same data?

I think my “inconsistent behavior” was because I would run my script, get unexpected results, then try doing the same thing by typing in the interpreter, getting the expected ones. The difference is likely that I use a lot more temporary values while doing things manually, so there is less room for accidentally getting TGraphs deleted and such.

I’ll use the third option from now on, since it does what I really want to do: get values from a graph in order to do transformations, without keeping them coupled. Thanks for the answers.
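
The decoupled workflow described here could look roughly like the sketch below. FakeGraph and decoupled_points are hypothetical names introduced for illustration; FakeGraph only mimics the GetN/GetX/GetY accessors so the example runs without ROOT, and it is an assumption that a real TGraph buffer behaves the same way under numpy.frombuffer.

```python
import array
import numpy

class FakeGraph:
    """Hypothetical stand-in exposing the three TGraph accessors used below."""
    def __init__(self, x, y):
        self._x = array.array('d', x)
        self._y = array.array('d', y)

    def GetN(self):
        return len(self._x)

    def GetX(self):
        return self._x

    def GetY(self):
        return self._y

def decoupled_points(graph):
    """Return copies of the graph's x and y values as numpy arrays."""
    n = graph.GetN()
    # .copy() detaches the arrays from the graph's own memory.
    x = numpy.frombuffer(graph.GetX(), dtype='d', count=n).copy()
    y = numpy.frombuffer(graph.GetY(), dtype='d', count=n).copy()
    return x, y

g = FakeGraph([0.0, 1.0, 2.0], [10.0, 20.0, 30.0])
x, y = decoupled_points(g)
y *= 2.0  # transform the copy only

print(list(g.GetY()))  # [10.0, 20.0, 30.0]
print(y.tolist())      # [20.0, 40.0, 60.0]
# With ROOT available, x and y could then be handed to a TGraph constructor.
```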

wlav · September 28, 2012, 10:37pm

Hi,

[quote=“jfcaron”]Or does GetY() return a copy of the values in the graph?[/quote]

From the code, it simply returns the pointer to the internal array:

Double_t *GetY() const {return fY;}
...
Double_t *fY; //[fNpoints] array of Y points

That really does sound like the original TGraph had been deleted: when typing into the interpreter, the variables will all be in the global scope, and presumably stay alive.

Cheers,
Wim
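
One way to check whether a given array depends on someone else's memory is numpy's flags.owndata attribute. The sketch below again uses array.array('d') as an assumed stand-in for the ROOT buffer; y_values is a hypothetical helper name.

```python
import array
import numpy

def y_values(buff, n):
    """Return an independent copy of the first n doubles in buff."""
    return numpy.frombuffer(buff, dtype='d', count=n).copy()

buff = array.array('d', [0.5, 1.5, 2.5])  # stand-in for tgraph.GetY()

view = numpy.frombuffer(buff, dtype='d', count=3)  # borrows buff's memory
safe = y_values(buff, 3)                           # owns its own memory

print(view.flags.owndata, safe.flags.owndata)  # False True
```

In this stand-in the view keeps the array.array object alive, so nothing can go wrong. A view onto a raw pointer owned by a C++ object has no such protection if that object is deleted, which is the scenario suspected above; an array with owndata set to True is unaffected.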

steve12 · December 24, 2012, 5:30am

[quote=“jfcaron”]This question has been posed many times, and I found several solutions to the problem of converting ROOT buffers (like returned from a TGraph.GetY()) into a usable Numpy array. Could an expert chime in regarding the differences between the following three ways of proceeding? Are they completely equivalent, or is there some subtle difference?

[...]

Edit: relevant old posts:

[url=http://sqlfury.com]Problem reading from buffer with numpy[/url]
https://root-forum.cern.ch/t/convert-of-root-double-t-buffer-to-list-numpy/14451/1[/quote]

The format of these values is Left channel MSB, LSB, Right channel MSB, LSB, … and so on. So, to convert this into a numpy array, here’s my function:

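import struct
import numpy as np

def bufToNumpy(inBuf):
    # Each sample is two bytes, MSB first (big-endian signed 16-bit).
    nSamples = len(inBuf) // 2
    outBuf = np.zeros(nSamples)
    for i in range(nSamples):
        # '>h' reads a big-endian short, matching the MSB, LSB order.
        outBuf[i] = struct.unpack('>h', inBuf[2*i:2*i+2])[0]
    return outBuf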
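
The per-sample loop above can probably be replaced by a single numpy.frombuffer call with an explicit byte order. This is a sketch under the assumption that the input is a bytes-like object holding signed 16-bit samples, MSB first, as described in this post; buf_to_numpy_fast is a hypothetical name.

```python
import numpy as np

def buf_to_numpy_fast(in_buf):
    """Decode big-endian signed 16-bit samples without a Python loop."""
    # '>i2' means big-endian ('>') signed integer of 2 bytes.
    samples = np.frombuffer(in_buf, dtype='>i2')
    # Convert to float64 to match the np.zeros() output of the loop version.
    return samples.astype(np.float64)

raw = bytes([0x00, 0x01, 0xFF, 0xFE])  # two samples: 0x0001 and 0xFFFE
print(buf_to_numpy_fast(raw).tolist())  # [1.0, -2.0]
```

Since the samples alternate between left and right channels, calling .reshape(-1, 2) on the result would give one column per channel, provided the number of samples is even.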