This is not exactly a PyROOT topic, but I think it may be useful for people who use numpy and ROOT together, especially in an IPython notebook.
I found myself spending a lot of time waiting for large data files to be read in and converted to numpy arrays.
So I wrote a small CPython extension that reads a ROOT file and converts it to a numpy structured array. It is a C++ extension that calls the ROOT library directly, which is where most of the speed improvement comes from.
A tutorial is included in the package.
Very nice! Would you be interested in including this in rootpy?
Your C extension could greatly speed up these functions.
Hello, I want to read a dataset from a *.root file and then process it with Python and numpy, but it seems that root_numpy cannot handle complex branches, such as a branch that is a vector. Is that right? Do you have any suggestions for how to read data with a structure like this:
*Tree :TruthTree : Truth Tree *
*Entries : 1000 : Total = 94806134 bytes File Size = 48397513 *
* : : Tree compression factor = 1.94 *
*Br 0 :Event : Event/I *
*Entries : 1000 : Total Size= 7278 bytes File Size = 238 *
*Baskets : 2 : Basket Size= 16017 bytes Compression= 17.45 *
*Br 1 :Event_Number : vector *
*Entries : 1000 : Total Size= 36518 bytes File Size = 4153 *
*Baskets : 2 : Basket Size= 16017 bytes Compression= 5.34 *
*Br 2 :Event_Nparticles : vector *
*Entries : 1000 : Total Size= 30070 bytes File Size = 4977 *
*Baskets : 2 : Basket Size= 16017 bytes Compression= 3.66 *
*Br 3 :Event_ProcessID : vector *
*Entries : 1000 : Total Size= 30062 bytes File Size = 1929 *
*Baskets : 2 : Basket Size= 16017 bytes Compression= 9.43 *
*Br 4 :Event_Weight : vector *
*Entries : 1000 : Total Size= 36518 bytes File Size = 1990 *
*Baskets : 2 : Basket Size= 16017 bytes Compression= 11.15 *
What is the goal? An std::vector can be read as an std::vector. To use it as a numpy array, it would need to be copied over. The problem with vectors is that they are templates, so you would need a converter method for each element type if you want that code to be in C++, as is done in root_numpy.
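To illustrate the "one converter per type" point, here is a minimal Python sketch of the same idea. The `TYPECODES` table and the `convert` helper are hypothetical names made up for this example, not part of root_numpy or PyROOT, and plain lists stand in for the vector contents so that it runs without ROOT.

```python
from array import array

# Hypothetical lookup table: one entry per supported element type.
# Every new vector<T> would need its own entry (its own "converter").
TYPECODES = {"int": "i", "float": "f", "double": "d", "long": "l", "char": "b"}

def convert(element_type, values):
    """Copy an iterable of values into a typed array for the given element type."""
    try:
        code = TYPECODES[element_type]
    except KeyError:
        raise TypeError("no converter registered for vector<%s>" % element_type)
    return array(code, values)

ints = convert("int", [1, 2, 3])
doubles = convert("double", [0.5, 1.5])
```

An element type that is missing from the table raises a `TypeError`, which mirrors the situation where a template instantiation has not been provided on the C++ side.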
My goal is to read the data from a tree (whose structure I posted above) and then use this data for some machine learning tasks.
The urgent task right now is to read the data and put it into an array.
I’m not familiar with ROOT and PyROOT, so I would like to know whether there is an easy way to read the data directly. Thanks.
Something simple like this should work for access:
for event in mytree:
    npart = event.Event_Nparticles
    for N in npart:
        print(N)  # each N is one element of the vector
Not too efficient (and it can’t be in pure Python, but that’s why I’m working on CppyyROOT). There’s no way to get to the underlying memory of an std::vector, so an element-wise copy is the only option. Since it’s iterable, though, you could hand it to a Python array:
[code]>>> from ROOT import std
>>> from array import array
>>> v = std.vector(int)(10)
>>> array('i', v)
array('i', [0, 0, 0, 0, 0, 0, 0, 0, 0, 0])[/code]
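For the machine learning use case, the per-event loop and the array conversion can be combined. The following is only a sketch: `flatten_events` is a hypothetical helper, and plain Python lists stand in for the per-event vector values (in PyROOT each item would be something like `event.Event_Nparticles`), so it runs without ROOT. Since vectors can have a different length in every event, it keeps one flat array of values plus an array of offsets.

```python
from array import array

def flatten_events(events, typecode="i"):
    """Copy variable-length per-event sequences into one flat array.

    Returns (values, offsets); the elements of event k are
    values[offsets[k]:offsets[k + 1]].
    """
    values = array(typecode)
    offsets = array("l", [0])
    for vec in events:
        values.extend(vec)           # element-wise copy; any iterable works
        offsets.append(len(values))  # remember where this event ends
    return values, offsets

# Stand-in data for three events with 2, 0 and 3 entries.
fake_events = [[3, 1], [], [7, 7, 2]]
values, offsets = flatten_events(fake_events)
third_event = values[offsets[2]:offsets[3]]
```

The flat array can then be handed to numpy (for example with `numpy.asarray`) if a numpy array is needed.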
Like Wim said, each vector type needs to be instantiated by hand.
I just added support for vectors of a few frequently used types (int, float, double, long, char) in the HEAD version.
This is done with memcpy, relying on &((*v)[0]) pointing to a guaranteed-contiguous array.
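The memcpy idea can be sketched in pure Python with ctypes. This is not the extension's actual code: `copy_contiguous` is a hypothetical helper, and a ctypes array of doubles stands in for the vector's contiguous storage. The point is that, given the address of the first element and the element count, one bulk copy replaces the per-element loop.

```python
import ctypes
from array import array

def copy_contiguous(src_address, n_items, typecode="d"):
    """Bulk-copy n_items elements starting at src_address into a new array."""
    itemsize = array(typecode).itemsize
    # Allocate a zero-filled destination buffer of the right size.
    out = array(typecode, bytes(n_items * itemsize))
    dst_address, _ = out.buffer_info()
    # Single copy of the whole block, analogous to memcpy in C++.
    ctypes.memmove(dst_address, src_address, n_items * itemsize)
    return out

# Stand-in for the vector's storage: a contiguous C array of doubles.
source = (ctypes.c_double * 4)(1.5, 2.5, 3.5, 4.5)
copied = copy_contiguous(ctypes.addressof(source), len(source))
```

The result is an independent copy, so later changes to the source buffer do not affect it. The typecode has to match the element type of the source, which is again why each vector type needs its own instantiation.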