Problem with RDataFrame Histo1D with vector inputs and weights

I have a tree with these two branches:

*............................................................................*
*Br    2 :HGamTruthHiggsBosonsAuxDyn.pt : vector<float>                      *
*Entries :    20000 : Total  Size=     387205 bytes  File Size  =     168111 *
*Baskets :      200 : Basket Size=       3584 bytes  Compression=   2.28     *
*............................................................................*

and

******************************************************************************
*Br    0 :HGamEventInfoAuxDyn.weightInitial :                                *
*         | HGamEventInfoAuxDyn.weightInitial/F                              *
*Entries :    20000 : Total  Size=     106417 bytes  File Size  =     102000 *
*Baskets :      200 : Basket Size=       1536 bytes  Compression=   1.00     *
*............................................................................*

Unfortunately the first is a vector, but it always has exactly one element.

If I make a histogram of the quantity, without considering weights

df.Histo1D({"", "", 100, -10E3, 800E3}, "HGamTruthHiggsBosonsAuxDyn.pt").GetValue()

it works smoothly. I guess the logic is the same as in TTree::Draw: for each event it loops over all the elements of the vector (in my case there is always exactly one element).

Now, with weights:

df.Histo1D({"", "", 100, -10E3, 800E3}, "HGamTruthHiggsBosonsAuxDyn.pt", "HGamEventInfoAuxDyn.weightInitial").GetValue()

it triggers a bunch of errors saying that it cannot resolve the function (attached)
error2.txt (11.7 KB)

So: why? Why can't it handle a std::vector<float> when using weights? I would expect the same capabilities with and without weights.

I tried to solve this problem in the following way:

df.Define("ptH", "HGamTruthHiggsBosonsAuxDyn.pt[0]").Histo1D({"", "", 100, -10E3, 800E3}, "ptH", "HGamEventInfoAuxDyn.weightInitial").GetValue()

This seems to work, but it is quite ugly that the behaviour differs with and without weights.


ROOT Version: 6.16/00
Platform: Not Provided
Compiler: Not Provided


Hi @wiso,
The difference in the two snippets you provided is the type of the argument you are passing to Histo1D. I suggest you take a look at the docs: you have to pass a container-like object to the function. So HGamTruthHiggsBosonsAuxDyn.pt is fine, but HGamTruthHiggsBosonsAuxDyn.pt[0] is the value of the first element, not the vector itself. Passing the first element to Define like you did is a quick workaround.
Best,
Vincenzo

Update:
I believe that the issue comes down to this

note: candidate template ignored: requirement ‘IsContainer::value’ was not satisfied [with T = ROOT::VecOps::RVec, W = float]
void Exec(unsigned int slot, const T &vs, const W &ws)

So I assume that there is an internal cast of the value of the first column to an RVec (which is an std::vector-like container) and that’s why you need to redefine the column first.
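To illustrate how an error of this kind can arise, here is a minimal sketch, not ROOT's actual code. It assumes a simplified `IsContainer` trait and two hypothetical `Exec` overloads, one for (scalar, scalar) and one for (container, container). With only these two, a call with a container value and a scalar weight matches neither, and the compiler reports that every candidate template was ignored. The `CanExec` helper is also hypothetical and only checks whether a call would resolve.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Helper so the sketch compiles as C++14 (std::void_t only exists from C++17).
template <typename... Ts> struct MakeVoid { using type = void; };
template <typename... Ts> using VoidT = typename MakeVoid<Ts...>::type;

// Hypothetical trait (NOT ROOT's implementation): true if T has begin()/end().
template <typename T, typename = void>
struct IsContainer : std::false_type {};

template <typename T>
struct IsContainer<T, VoidT<decltype(std::declval<const T &>().begin()),
                            decltype(std::declval<const T &>().end())>>
    : std::true_type {};

// Overload 1: scalar value with scalar weight.
template <typename T, typename W,
          std::enable_if_t<!IsContainer<T>::value && !IsContainer<W>::value, int> = 0>
std::string Exec(const T &, const W &) { return "scalar, scalar"; }

// Overload 2: container value with container weight.
template <typename T, typename W,
          std::enable_if_t<IsContainer<T>::value && IsContainer<W>::value, int> = 0>
std::string Exec(const T &, const W &) { return "container, container"; }

// Hypothetical helper: true if a call Exec(T, W) resolves to some overload.
template <typename T, typename W, typename = void>
struct CanExec : std::false_type {};

template <typename T, typename W>
struct CanExec<T, W, VoidT<decltype(Exec(std::declval<const T &>(),
                                         std::declval<const W &>()))>>
    : std::true_type {};

static_assert(CanExec<float, float>::value, "scalar/scalar resolves");
static_assert(CanExec<std::vector<float>, std::vector<float>>::value,
              "container/container resolves");
// The mixed case has no matching overload, which mirrors the reported error.
static_assert(!CanExec<std::vector<float>, float>::value,
              "container/scalar does not resolve in this sketch");
```

In this sketch the mixed case would be fixed by adding a third overload for (container, scalar); extracting a scalar with a helper Define sidesteps the problem by selecting the (scalar, scalar) overload instead.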

Hello Vincenzo, thanks for the answer. I am not sure why you said that the arguments are different. Actually there was a typo in my original post. I am passing {"", "", 100, -10E3, 800E3} in both cases.

With or without the weights the arguments are identical (except for the addition of the weight argument).

Or are you saying that the variable I am plotting and the weight should have the same type (or at least the same dimensionality)? I guess that the weight (which is a float, i.e. 0D) should be broadcast to the shape of the variable (which is a 1D vector).
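To make the broadcasting I have in mind concrete, here is a minimal sketch in plain C++ (no ROOT). `Hist1D` and `FillEvent` are hypothetical names made up for illustration and are not part of RDataFrame: for each event, the single scalar weight is applied to every element of the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-binning 1D histogram, used only for illustration.
struct Hist1D {
    double lo, hi;
    std::vector<double> bins;  // sum of weights per bin

    Hist1D(std::size_t nbins, double lo_, double hi_)
        : lo(lo_), hi(hi_), bins(nbins, 0.0) {
        assert(nbins > 0 && hi_ > lo_);
    }

    void Fill(double x, double w = 1.0) {
        if (x < lo || x >= hi) return;  // under/overflow dropped for simplicity
        std::size_t i =
            static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding
        bins[i] += w;
    }
};

// One event: a vector of values and ONE scalar weight.
// The weight is "broadcast", i.e. reused for every element of the vector.
void FillEvent(Hist1D &h, const std::vector<float> &values, float weight) {
    for (float v : values) h.Fill(v, weight);
}
```

With a one-element vector, as in my tree, this reduces to a single weighted fill per event, which is what the Define workaround computes.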

Hi,
can you check whether you still see this behavior in v6.18 please?

Actually I cannot: I am using the ATLAS software, which is needed to open my ROOT file, and ATLAS does not support 6.18 yet.

So I have created a dummy ntuple with hvector.C, where I have added a float branch x.

With 6.18

******************************************************************************
*Tree    :tvec      : Tree with vectors                                      *
*Entries :    25000 : Total =         4311724 bytes  File  Size =    3216373 *
*        :          : Tree compression factor =   1.34                       *
******************************************************************************
*Br    0 :vpx       : vector<float>                                          *
*Entries :    25000 : Total  Size=    1052577 bytes  File Size  =     804472 *
*Baskets :       36 : Basket Size=      32000 bytes  Compression=   1.31     *
*............................................................................*
*Br    4 :x         : x/F                                                    *
*Entries :    25000 : Total  Size=     100741 bytes  File Size  =      30047 *
*Baskets :        4 : Basket Size=      32000 bytes  Compression=   3.34     *
*............................................................................*

ROOT::RDataFrame df("tvec", "hvector.root")
auto h2 = df.Histo1D({"", "", 100, -10, 10}, "vpx", "x").GetValue()

seems to work

Alright, thanks for checking.
I believe this was ROOT-10092 (does it look like ROOT-10092 to you, or am I missing something?). Edit: the ticket is actually ROOT-9985.
I will check whether it’s possible to backport the fix to v6.16 and make it available in the next patch release.

In the meantime the simplest workaround is the one you already found, using a helper Define.

Cheers,
Enrico

Exactly, thanks.

Errata: the relevant jira ticket is ROOT-9985 (array of values, scalar weight) and the fix is already in v6-16-00-patches (i.e. it will be available in v6.16/02 when released).
