Bool branch becomes python object with RDataFrame.AsNumpy()


ROOT Version: 6.36.04
Platform: macosxarm64
Compiler: Apple clang version 17.0.0 (clang-1700.0.13.3)

Python Version: 3.13.7


Hi all,

I see some unexpected behaviour when creating a dictionary of NumPy arrays from an RDataFrame. It looks like the same problem reported here, which I assumed had been fixed since then. Basically, a boolean column from the RDataFrame is returned as a NumPy array with object dtype. I only see the issue when the RDataFrame is created from a tree.

Minimal reproducer in Python:

>>> import ROOT
>>> df = ROOT.RDataFrame(10).Define('e', 'rdfentry_').Define('b', 'rdfentry_ == 1')
>>> aa = df.AsNumpy()
>>> aa['b'].dtype
dtype('bool') # as expected
>>> df.Snapshot("temp","temp.root")
<cppyy.gbl.ROOT.RDF.RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > object at 0x600000593a20>

## close and reopen python

>>> import ROOT
>>> df = ROOT.RDataFrame("temp","temp.root")
>>> aa = df.AsNumpy()
>>> aa['b'].dtype
dtype('O') # not expected

Thanks in advance for the help

cheers,

Federico

Hi @fbetti ,

Thanks for reaching out to the forum! I'm not surprised that your example does not work: when you snapshot the column to disk as a TTree, its type is reported as Bool_t:


>>> df = ROOT.RDataFrame(10).Define('e', 'rdfentry_').Define('b', 'rdfentry_ == 1')
>>> snap = df.Snapshot("temp","temp.root")
>>> snap.GetColumnType("b")
'Bool_t'

The issue you refer to, on the other hand, was fixed in the PR "[PyROOT][RDF] Support conversion of `bool` columns to NumPy arrays" by guitargeek (Pull Request #15180, root-project/root, GitHub), which only handles the case where the input column type is spelled bool. So we need to update the logic to also recognise the Bool_t spelling, or perhaps find a more robust approach; that remains to be seen.
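
To illustrate the kind of change meant here, below is a minimal sketch in plain Python. It is not the actual PyROOT code: the names ROOT_TYPE_ALIASES, FUNDAMENTAL_TO_DTYPE and dtype_for_column are hypothetical and exist only for this example. The idea is to map a ROOT typedef name such as Bool_t to its fundamental C++ type before deciding which NumPy dtype to use, so that both spellings end up on the same path.

```python
# Hypothetical alias table: ROOT typedef name -> fundamental C++ type name.
ROOT_TYPE_ALIASES = {
    "Bool_t": "bool",
    "Int_t": "int",
    "Float_t": "float",
    "Double_t": "double",
}

# Hypothetical lookup: fundamental C++ type name -> NumPy dtype string.
FUNDAMENTAL_TO_DTYPE = {
    "bool": "bool",
    "int": "int32",
    "float": "float32",
    "double": "float64",
}


def dtype_for_column(column_type):
    """Return a NumPy dtype string for a column type name.

    Typedef names are resolved to their fundamental type first;
    anything that is still unknown falls back to 'object'.
    """
    canonical = ROOT_TYPE_ALIASES.get(column_type, column_type)
    return FUNDAMENTAL_TO_DTYPE.get(canonical, "object")


print(dtype_for_column("bool"))
print(dtype_for_column("Bool_t"))
print(dtype_for_column("TLorentzVector"))
```

With a string comparison against "bool" alone, the second call would fall through to the object branch, which matches what the reproducer shows.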

I have created this issue to keep track of progress: "Also consider `Bool_t` in numpy array conversion" (Issue #20081, root-project/root, GitHub).
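
In the meantime, a possible stopgap on the user side is to cast the object-dtype array back to bool after calling AsNumpy(). The snippet below is only a sketch: to_bool_array is a hypothetical helper, the object array is a stand-in for aa['b'] so that the example runs without ROOT, and it assumes the elements of the returned array convert cleanly to Python booleans, which I have not verified for this case.

```python
import numpy as np


def to_bool_array(arr):
    """Cast an object-dtype array to bool; return other arrays unchanged."""
    arr = np.asarray(arr)
    if arr.dtype == object:
        # Each element is converted via its truth value.
        return arr.astype(bool)
    return arr


# Stand-in for aa['b'] as read back from the tree (object dtype).
b_obj = np.array([False, True, False], dtype=object)
b_fixed = to_bool_array(b_obj)
print(b_fixed.dtype)
```

In a real session this would be applied as aa['b'] = to_bool_array(aa['b']) right after df.AsNumpy().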

Cheers,

Vincenzo

Hi Vincenzo,

thanks a lot for the explanation! I was too naive: I assumed bool and Bool_t were equivalent.

I will follow the github issue.

cheers,

Federico