Filtering individual elements of a vector branch with RDataFrame

Hi!

Using RDataFrame, is there a way to filter a dataframe according to individual elements of a vector branch (rather than creating many Define() calls)?

I have tried ROOT::VecOps::All() and ROOT::VecOps::Any(), like this: ROOT.RDataFrame(…).Filter("ROOT::VecOps::Any(PID==-11)"), but this does not produce the desired outcome.

For example, given a vector branch like this:

+-----+---------------+----------+
| Row | N             |   PID    |
+-----+---------------+----------+
| 0   | 8             | -11      |
|     |               | -11      |
|     |               | 22       |
|     |               | 11       |
|     |               | 22       |
|     |               | 11       |
+-----+---------------+----------+

What I want to do is to select, for each row, only the elements with PID==-11:

+-----+---------------+----------+
| Row | N             |   PID    |
+-----+---------------+----------+
| 0   | 8             | -11      |
|     |               | -11      |
+-----+---------------+----------+

Many thanks for any help!

Hi @hepFanatic ,

the tutorials should help, as well as the RDF user's guide (in particular this section).
Filter selects or discards whole events. In your case you probably want to Define a mask with the condition you need, which you can then apply to any vector of the same event:

// for every event, pid_mask is a boolean array that you can use to select just the array elements you want 
df.Define("pid_mask", "PID == -11")
  .Define("my_selected_vector", "some_vector[pid_mask]") // for example
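
To make the difference concrete, here is a minimal plain-Python sketch of the two behaviours. It does not use ROOT at all: the nested lists stand in for the PID branch, and the helper names any_filter and masked, as well as the second event, are made up for illustration. It only mimics what an event-level Filter with Any and an element-level mask do.

```python
# Plain-Python stand-in for the per-event semantics; no ROOT required.
# Each inner list plays the role of the PID vector branch for one event.
events = [
    [-11, -11, 22, 11, 22, 11],  # the PID values shown in the question
    [22, 11],                    # hypothetical event without any PID == -11
]


def any_filter(events, pid):
    """Event-level selection: keep or drop whole events.

    Mimics what Filter("ROOT::VecOps::Any(PID == -11)") does.
    """
    return [ev for ev in events if any(p == pid for p in ev)]


def masked(event, pid):
    """Element-level selection: build a boolean mask, then apply it.

    Mimics Define("pid_mask", "PID == -11") followed by indexing with the mask.
    """
    mask = [p == pid for p in event]
    return [p for p, keep in zip(event, mask) if keep]


kept_events = any_filter(events, -11)
selected = [masked(ev, -11) for ev in events]

print(kept_events)  # whole events that contain at least one -11
print(selected)     # per event, only the elements equal to -11
```

The event-level version returns the first event untouched, with all six elements, which is why Any() did not give the wanted result. The masked version keeps every event but shrinks its vector, so an event with no match ends up with an empty vector.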

I hope this helps!
Cheers,
Enrico


Thanks Enrico, that’s a little clearer for me now :smiley:

I was wondering whether Filter can partially filter out an event, but I guess not?
