If I have an RDataFrame with a Bool_t column, AsNumpy fails:
df.AsNumpy()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-10-e856f5516a02> in <module>()
----> 1 df.AsNumpy()
/Applications/root_build/lib/ROOT.pyc in _RDataFrameAsNumpy(df, columns, exclude)
429 else:
430 tmp = numpy.empty(len(cpp_reference), dtype=numpy.object)
--> 431 for i, x in enumerate(cpp_reference):
432 tmp[i] = x # This creates only the wrapping of the objects and does not copy.
433 py_arrays[column] = ndarray(tmp, result_ptrs[column])
AttributeError: 'vector<bool>' object has no attribute 'data'
Reproducer below.
First, we create the data frame with an integer column and a boolean column:
In [1]: import ROOT
In [2]: df = ROOT.RDataFrame(10).Define('e', 'rdfentry_').Define('b', 'rdfentry_ == 1')
Then we verify the types of the columns:
In [6]: df.Snapshot('temp', 'temp.root')
Out[6]: <ROOT.ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > object at 0x7fe77bdf3fb0>
In [7]: f = ROOT.TFile.Open('temp.root')
In [8]: t = f.Get('temp')
In [9]: for b in t.GetListOfBranches():
   ...:     print b.GetName(), t.GetLeaf(b.GetName()).GetTypeName()
   ...:
e ULong64_t
b Bool_t
Then we try to convert to numpy, and notice that it works with the ULong64_t column but not the Bool_t column:
In [10]: df.AsNumpy()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-10-e856f5516a02> in <module>()
----> 1 df.AsNumpy()
/Applications/root_build/lib/ROOT.pyc in _RDataFrameAsNumpy(df, columns, exclude)
429 else:
430 tmp = numpy.empty(len(cpp_reference), dtype=numpy.object)
--> 431 for i, x in enumerate(cpp_reference):
432 tmp[i] = x # This creates only the wrapping of the objects and does not copy.
433 py_arrays[column] = ndarray(tmp, result_ptrs[column])
AttributeError: 'vector<bool>' object has no attribute 'data'
In [11]: df.AsNumpy(columns=['e'])
Out[11]: {'e': ndarray([0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L], dtype=object)}
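A possible workaround until this is fixed, sketched here and not verified against this build: std::vector<bool> is a bit-packed specialization with no data() member, so one option is to define an int copy of the boolean column, read that with AsNumpy, and convert back to bool on the numpy side. The helper names int_cast_column and as_numpy_with_bool_cast and the "_as_int" suffix are hypothetical, made up for this illustration. The sketch also assumes that AsNumpy handles int columns and returns a dict of numpy arrays.

```python
def int_cast_column(name):
    """Return (new column name, C++ expression) that casts a bool column to int.

    Hypothetical helper; the "_as_int" suffix is an arbitrary choice.
    """
    return name + "_as_int", "static_cast<int>({})".format(name)


def as_numpy_with_bool_cast(df, bool_columns, other_columns):
    """Hypothetical helper: read bool columns through int copies.

    Assumes df behaves like an RDataFrame (Define returns a new node,
    AsNumpy returns a dict mapping column names to numpy arrays).
    """
    cast_names = {}
    for name in bool_columns:
        new_name, expr = int_cast_column(name)
        df = df.Define(new_name, expr)
        cast_names[new_name] = name

    arrays = df.AsNumpy(columns=list(other_columns) + list(cast_names))

    # Store each int copy under the original column name, as a bool array.
    for new_name, name in cast_names.items():
        arrays[name] = arrays.pop(new_name).astype(bool)
    return arrays


# Usage with the reproducer above, assuming ROOT is importable (not run here):
#   import ROOT
#   df = ROOT.RDataFrame(10).Define('e', 'rdfentry_').Define('b', 'rdfentry_ == 1')
#   arrays = as_numpy_with_bool_cast(df, bool_columns=['b'], other_columns=['e'])
```

This costs one extra int column per boolean during the event loop, but it never asks for a std::vector<bool> result.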
ROOT Version: master
Platform: macOS
Compiler: Not Provided