Problem reading from buffer with numpy

In my TTree I have branch mag[6][172]/D

It works fine with the array module:

ua = array('d', t.mag)
len(ua)

returns 1032. However, reading in this way is very slow, so I decided to give numpy a try.

ua2 = np.frombuffer(t.mag, dtype='d')
len(ua2)

gives 129 elements. Those elements are the same as in the tree, but why only 129? The same problem exists when I try to use numpy.ndarray.

It seems that numpy does not know how to handle this buffer - it treats its number of elements as a number of bytes…

Anyway, I found that converting the buffer to an array with the array module is very slow (a few times slower than reading the buffer from the TTree) and I am afraid numpy won’t be much faster. The question is: what is the most efficient way to convert a buffer to a Python array?

Hi,

didn’t have time to try this out, but the numpy version should be faster, as it shouldn’t copy the buffer (like the array does).

The number 129 comes from 6*172/sizeof(double) = 129. The problem is that, internally, buffers are character-based (i.e. byte-sized) and PyROOT fudges this by changing the stride on access. There have been improvements in the buffer interface for numpy, but that was for Python 3, with some compatibility in Python 2.6 and later, and I’ve never taken the time to flesh out support for this in detail in PyROOT. The array module, OTOH, does a normal iteration over the buffer and does not use the C-level buffer interface.
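That arithmetic can be checked without ROOT at all; the sketch below just reproduces the division numpy performs when it mistakes an element count for a byte count:

```python
import numpy as np

# The PyROOT buffer proxy reports its length in elements (6*172 = 1032),
# but numpy interprets that figure as a byte count and divides by the
# item size of a double (8 bytes), landing on 129 elements.
reported_length = 6 * 172              # what len(t.mag) would return
itemsize = np.dtype('d').itemsize      # 8 for a double
print(reported_length // itemsize)     # -> 129
```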

Note that, when it comes to speed, I’ve written off PyROOT completely in favor of CppyyROOT, although I still have to do builtin arrays from TTrees for it (data structures and objects work).

Anyway, again, I haven’t tried, but adding a "count=len(t.mag)" argument to np.frombuffer() should work.

Cheers,
Wim

Unfortunately "count=len(t.mag)" does not work. It gives: ValueError: buffer is smaller than requested size… So I am still stuck with a very slow buffer conversion :frowning:

An intermediate solution is to use fromiter(), which is ~30% faster than array(). However, a "GetEntry()"-only loop over my TTree still takes ~5 s, while filling an array with fromiter takes ~25 s, which is a very significant time difference.
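A minimal sketch of the fromiter() approach, with a plain Python iterable standing in for t.mag (which would need a live TTree):

```python
import numpy as np

# Stand-in for t.mag; in the real code this would be the branch buffer.
mag = [float(i) for i in range(6 * 172)]

# Passing count lets numpy preallocate the output instead of growing it,
# which is where the speedup over array('d', ...) comes from.
ua = np.fromiter(mag, dtype='d', count=len(mag))
print(len(ua))   # -> 1032
```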

Hi,

I’ll find some time to look into this in more detail, but what if you first do: t.mag.SetSize(len(t.mag)*8)?
Or, even easier, call t.mag.SetSize(sys.maxint)? After that, you can no longer iterate over t.mag (not until you reset the size, anyway), as it won’t terminate properly, but frombuffer() with a properly given count may work.
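The idea can be sketched with a raw byte buffer standing in for the PyROOT proxy: once the buffer reports its size in bytes (1032 doubles = 8256 bytes), frombuffer() can satisfy the requested element count. The stand-in below is an assumption; the real t.mag requires ROOT.

```python
import numpy as np

# Stand-in for t.mag after t.mag.SetSize(len(t.mag)*8):
# 6*172 doubles = 1032 elements = 8256 bytes of raw data.
raw = np.arange(6 * 172, dtype='d').tobytes()

ua2 = np.frombuffer(raw, dtype='d', count=6 * 172)
print(len(ua2))   # -> 1032
```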

Cheers,
Wim

Works :slight_smile: It is about 30% faster than fromiter. However, it is still very slow compared to simply calling GetEntry(). I guess it may not be simple to improve, but it looks like a bottleneck for reading from a TTree with PyROOT.

Hi,

at issue is that any access into an array from Python involves a lot more work than what is done in C++, where an array access is virtually free (memory needs to be loaded into the CPU, but that’s about it). Anything in CPython requires at least getting the buffer, unpacking the index, and packing the value.

For comparison, simple TTree benchmarks currently run in the range of 1.6-2.3x compiled, optimized C++ in CppyyROOT, as it requires no such unwrapping/wrapping. I expect to be able to reach 1x with Cling, as that should allow me to get rid of the stub overhead. CPython will never be able to touch that.

Cheers,
Wim

I understand. So, I’ll wait for the new version with Cling :slight_smile:

Oops, sorry, I made a mistake when checking. It seems that t.mag.SetSize(len(t.mag)*8) does not work - frombuffer still returns 129 elements, and with count=1032 it claims that the buffer is too small…

No, pypy-cint is available today, just not fully compatible with PyROOT yet (and therefore not announced):

[code]% source /afs/.cern.ch/sw/lcg/external/pypy/x86_64-slc5/setup.sh
% pypy-cint
Python 2.7.2 (a10072d752f3+, Jul 22 2012, 23:38:16)
[PyPy 1.9.1-dev0 with GCC 4.3.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import CppyyROOT as ROOT
>>> # etc ...[/code]

Note also that this is the .cern.ch afs disk, not the cern.ch one, and is therefore a tad slow. TTree usage works for objects. I’ll get around to numpy or array arrays soon enough.

Another option would be to create the numpy array first, then use SetBranchAddress(). I’ll give it a try.

Cheers,
Wim

Hi,

so the following works for me for CPython and numpy. This write_tree.py:

[code]from ROOT import TFile, TTree
import numpy

N = 5000
D1, D2 = 6, 172

f = TFile("test.root", "RECREATE")
t = TTree("test", "test tree")

a = numpy.zeros((D1, D2), dtype='d')
t.Branch("mag", a, "mag[%d][%d]/D" % (D1, D2))

for event in range(N):
    for i in range(D1):
        for j in range(D2):
            a[i][j] = N + i*D1 + j
    t.Fill()

f.Write()
f.Close()[/code]
and this read_tree.py:

[code]from ROOT import TFile
import numpy

D1, D2 = 6, 172

f = TFile("test.root")
t = f.test

a = numpy.zeros((D1, D2), dtype='d')
t.SetBranchAddress("mag", a)

N = t.GetEntriesFast()
assert N == 5000
for event in range(N):
    t.GetEntry(event)
    for i in range(D1):
        for j in range(D2):
            assert a[i][j] == N + i*D1 + j

f.Close()[/code]
It does not work for pypy-cint, as somewhat expected. I think that the numpypy array does not implement the raw buffer interface at the interpreter level. Using a normal Python array from the array module does work, but that’s not nice for a 2-dim array.

EDIT: and if I do implement the code with array on pypy-cint, the writing/reading is fully I/O bound, as opposed to almost CPU-bound on CPython. :slight_smile:

Cheers,
Wim

Hi,

so, theoretically, SetSize() works for me (i.e. it works with frombuffer()). However, numpy.frombuffer() causes some memory overwrite. I’m not sure where, but according to valgrind, numpy free()s the original buffer (maybe it’s relocating it?), which of course does not work (ROOT will still write to the old array address).

Cheers,
Wim

Well, anyway, setting the branch address to the numpy array works and is nicely multidimensional. Thanks! :slight_smile: