I have a tree whose events each contain an array with multiple values. After some filtering with the RDataFrame functionalities, I’d like to make plots with matplotlib. I run into problems when I try to convert my data into numpy arrays:
{'mycolumn': ndarray([<cppyy.gbl.ROOT.VecOps.RVec<float> object at 0x145885600>,
<cppyy.gbl.ROOT.VecOps.RVec<float> object at 0x145885628>,
<cppyy.gbl.ROOT.VecOps.RVec<float> object at 0x145885650>,
...............)
Hi @Tim_Buktu ,
Your data contains, for every event, a collection whose size can differ from event to event. That cannot be described by a plain numpy array, which must be rectangular, so we return a numpy array of RVecs instead.
You can loop over that numpy array of RVecs and convert it to a list of Python lists; it should be just:
[numpy.array(v).tolist() for v in filtered_data]
Cheers,
Enrico
P.S.
The concatenation will not give the list of lists that you ask for, but rather a single flattened list of all the elements. I can’t tell where the error comes from, but you can run your code through the Python debugger and see where the zero-dimensional numpy array is created.
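To illustrate the difference, here is a small sketch. It assumes the concatenation was done with numpy.concatenate (the original code is not shown here), and it uses plain numpy arrays as stand-ins for the per-event RVecs:

```python
import numpy as np

# Stand-ins for three events with different numbers of elements
# (in the real case these would be the RVecs in the column).
events = [np.array([1.0, 2.0]), np.array([3.0]), np.array([4.0, 5.0, 6.0])]

# Concatenating merges everything into one flat array: event boundaries are lost.
flattened = np.concatenate(events)

# Converting event by event keeps one list per event.
per_event = [e.tolist() for e in events]

print(flattened.tolist())
print(per_event)
```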
Yes, sorry, the code was just meant to give you an idea. As filtered_data is a dictionary, "for v in filtered_data" loops over the keys of the dictionary, not over the RVecs. The correct code should be (it might need minor adjustments):
[numpy.array(v).tolist() for v in filtered_data["mycolumn"]]
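For reference, the same idea as a small self-contained sketch. The helper name column_to_lists is made up for illustration, and the dictionary below is only a stand-in (plain numpy arrays instead of RVecs) for the one shown in the first post:

```python
import numpy as np

def column_to_lists(filtered_data, column):
    """Return one plain Python list per event for the given column."""
    # Index the dictionary by column name first, then loop over the events.
    return [np.array(v).tolist() for v in filtered_data[column]]

# Stand-in for the real dictionary: one column, three events of different sizes.
fake_data = {
    "mycolumn": [np.array([1.0, 2.0]), np.array([3.0]), np.array([4.0, 5.0, 6.0])],
}

# Iterating over the dictionary itself only yields the column names.
keys = [k for k in fake_data]

lists = column_to_lists(fake_data, "mycolumn")
print(keys)
print(lists)
```

The result is a list of lists, so each event can be passed to matplotlib on its own, e.g. by plotting lists[0].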
Thank you very much, it works now. I should have remembered how dicts work!
Maybe since we are at it:
How would I plot it using only ROOT?
So I do my filtering and then I want to plot one of the filtered events using ROOT. Let us assume every event consists of an array with 1000 elements. It should work with TGraph somehow, I guess.
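A minimal sketch of one way this might be done with TGraph, not a tested recipe for this particular tree. It assumes the x axis is simply the element index (0, 1, 2, ...), that filtered_data["mycolumn"] is the column from above, and the helper names graph_points and draw_event as well as the output file name are made up for illustration:

```python
import numpy as np

def graph_points(event):
    """Build the x/y arrays for one event; x is the element index (an assumption)."""
    # TGraph expects double-precision arrays, so convert explicitly.
    y = np.array(event, dtype=np.float64)
    x = np.arange(len(y), dtype=np.float64)
    return x, y

def draw_event(event, out="event.png"):
    """Draw one event as a TGraph and save the canvas (requires ROOT)."""
    import ROOT  # imported here so graph_points can be used without ROOT

    x, y = graph_points(event)
    graph = ROOT.TGraph(len(y), x, y)
    graph.SetTitle("Filtered event;element index;value")
    canvas = ROOT.TCanvas("c", "c", 800, 600)
    graph.Draw("AL")  # "A": draw axes, "L": connect points with a line
    canvas.SaveAs(out)
    # Return both objects so they are not garbage-collected while displayed.
    return canvas, graph
```

Usage would then be something like draw_event(filtered_data["mycolumn"][0]) to plot the first filtered event.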