Trying to convert RDF generated histogram into numpy array

Hello,

I have created a histogram from an RDataFrame object df, and I am trying to convert it to a numpy array using the following commands (as recommended in another post):

hist_mHT = df.Histo1D(("", "", 10, 0, 100),"var")
yvals = hist_mHT.GetArray()
yvals.SetSize(hist_mHT.GetNbinsX())
yvals = np.array(yvals)

but this yields the following error:

yvals.SetSize(hist_mHT.GetNbinsX())
AttributeError: 'cppyy.LowLevelView' object has no attribute 'SetSize'

So I’m guessing I have to convert hist_mHT into a TH1D object first?

Hi @yburkard ,

as far as I can tell, after you call yvals = hist_mHT.GetArray(), RDataFrame and its result are completely out of the picture.

I guess the question is how to build a numpy array from a cppyy.LowLevelView object:

In [1]: import ROOT
In [2]: h = ROOT.RDataFrame(10).Histo1D("rdfentry_")
In [3]: import numpy as np
In [4]: arr = h.GetArray()
In [5]: type(arr)
Out[5]: cppyy.LowLevelView

Searching the forum I found one method:

arr = h.GetBuffer()
npa = np.ndarray((h.GetNbinsX()+2,), dtype=np.float64, buffer=arr, order='C')

although I’m not sure it’s the best method. @vpadulan or @moneta might know more (in particular why we need GetBuffer instead of GetArray here).

Cheers,
Enrico


Hi,

h.GetBuffer() returns the unbinned content of the histogram (the original data) and the weights you used to fill the histogram (the default is 1).
I guess you want to convert the histogram bin contents into a numpy array. In that case you need to use h.GetArray().
So here is the correct code. Note that if the histogram has automatic binning (i.e. it has a buffer), you might need to call h.BufferEmpty() before calling h.GetArray():

h.BufferEmpty()
arr = h.GetArray()
npa = np.ndarray((h.GetNbinsX()+2,), dtype=np.float64, buffer=arr, order='C')

Cheers

Lorenzo
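
The recipe above could be wrapped in a small helper. What follows is only a sketch, not an official ROOT or PyROOT utility: the function name th1_bins_to_numpy and the FakeTH1D class are hypothetical. FakeTH1D stands in for a real ROOT.TH1D so the snippet runs without ROOT installed. The sketch assumes a TH1D (double precision, hence float64) and that GetArray() exposes GetNbinsX()+2 entries, with underflow first and overflow last.

```python
from array import array

import numpy as np


def th1_bins_to_numpy(h, include_flow=False):
    """Copy the bin contents of a TH1D-like object into a numpy array."""
    # Flush a pending automatic-binning buffer, if there is one.
    h.BufferEmpty()
    # Index 0 is the underflow bin, index nbins+1 is the overflow bin.
    n = h.GetNbinsX() + 2
    view = np.ndarray((n,), dtype=np.float64, buffer=h.GetArray(), order='C')
    # The view shares memory with the histogram; copy so the result
    # stays valid after the histogram goes away.
    contents = view.copy()
    return contents if include_flow else contents[1:-1]


class FakeTH1D:
    """Hypothetical stand-in for ROOT.TH1D (assumption, not a ROOT class)."""

    def __init__(self, contents):
        self._data = array('d', contents)  # underflow, bins..., overflow

    def BufferEmpty(self):
        pass  # nothing to flush in the stand-in

    def GetNbinsX(self):
        return len(self._data) - 2

    def GetArray(self):
        return self._data


h = FakeTH1D([0.0, 1.0, 2.0, 3.0, 7.0])
bins = th1_bins_to_numpy(h)                          # regular bins only
with_flow = th1_bins_to_numpy(h, include_flow=True)  # with under/overflow
```

With a real histogram, the same function would be called on the object returned by Histo1D. Dropping the first and last entries is a design choice for the common case where only the in-range bins are wanted.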


Hi @yburkard ,

In addition to what the others have already written, which is correct, I see that in the cppyy docs the conversion from LowLevelView to numpy array is done by using the reshape method of the class (which acts as the SetSize method you tried to use in your original example).
Cheers,
Vincenzo
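
To illustrate why a size has to be supplied at all, here is a pure-Python analogy that uses ctypes instead of cppyy (a swapped-in technique, not the cppyy reshape call itself). A bare double* carries no length, so numpy needs an explicit shape before it can view the memory. The 12-element block is an assumed stand-in for a 10-bin histogram plus underflow and overflow.

```python
import ctypes

import numpy as np

# Stand-in for the C++ side: 12 doubles (10 bins + underflow + overflow).
storage = (ctypes.c_double * 12)(*range(12))

# A bare double*, comparable to what a C++ getter returns: no length attached.
ptr = ctypes.cast(storage, ctypes.POINTER(ctypes.c_double))

# numpy cannot guess the extent of a raw pointer, so the shape is mandatory.
view = np.ctypeslib.as_array(ptr, shape=(12,))

# The array is a view, not a copy: writes go through to the original memory.
view[0] = 42.0
first_after_write = storage[0]
```

The same idea applies to the LowLevelView returned by GetArray(): once it is told how many elements it spans, it can be handed to numpy.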


Ah, this was the missing ingredient, thanks!

This worked great, thank you very much!
