Subtracting bad events using RDataFrame

Hi, I would like to remove some unwanted events and recreate a leaf without them, using:

	auto new_events = df.Define("unwanted_events", "Take(myleaf1, myleaf1.size() > myleaf2.size())") // exclude these events
	                    .Define("total_events", "myleaf1")
	                    .Define("new_events", "total_events - unwanted_events"); // the problem is here

	auto h = new_events.Histo1D("new_events");

I am getting the error:

Cannot call operator - on vectors of different sizes.

I understand that the vectors total_events and unwanted_events have different sizes, but what other way can I use to subtract them?


EDIT: example of data:

Notice event 4037. I want the whole event or preferably only particle [3] out.

I’m not sure I understand the use of Take(). To me it looks like you’d like to just filter some events, like this

df.Define("isBadEvent", "myleaf1.size() > myleaf2.size()")
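The boolean column can then be passed to Filter, for example `.Filter("!isBadEvent")`, which drops the whole event rather than individual particles. As a rough illustration of that event-level selection, here is a minimal plain C++ sketch in which `std::vector` stands in for the per-event arrays. The names `Event` and `keepGoodEvents` are hypothetical and not part of ROOT.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one event holding the two array branches.
struct Event {
    std::vector<unsigned int> myleaf1;
    std::vector<unsigned int> myleaf2;
};

// An event counts as "bad" when myleaf1 has more entries than myleaf2.
bool isBadEvent(const Event& ev) {
    return ev.myleaf1.size() > ev.myleaf2.size();
}

// Keep only the good events; bad events are dropped as a whole.
std::vector<Event> keepGoodEvents(const std::vector<Event>& events) {
    std::vector<Event> good;
    for (const Event& ev : events) {
        if (!isBadEvent(ev)) {
            good.push_back(ev);
        }
    }
    return good;
}
```

The point is that the selection is a single yes/no decision per event, which is what Filter expects.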

Hi @jblomer

It’s to define the bad events then filter for them.

I tried:

auto c1 = df.Define("unwanted_events", "Take(myleaf1, myleaf1.size() > myleaf2.size())") // exclude these events
            .Define("total_events", "myleaf1") // original leaf
            .Filter("!unwanted_events"); // filter bad events out

auto h = c1.Histo1D("new_events"); // draw original leaf (bad events excluded)

But I am getting an error:

error: no viable conversion from returned value of type 'RVec<unsigned int>' to
      function return type 'bool'

EDIT: Actually it’s one bad particle located at index [3] of some events. So if I could somehow say "exclude myleaf1[3] and return myleaf1", that would be it. I managed to define those indices using the first line of code, but so far I am unable to filter them out.

If I understand correctly, you should be able to do something like this

df.Define("cleaned_myleaf1", "myleaf1[ myleaf1 < 42 ]")

This creates a new vector column “cleaned_myleaf1” that contains only those elements of “myleaf1” for which the condition element < 42 holds. Of course you’d need to replace the condition myleaf1 < 42 with whatever defines a “good particle”.
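To spell out what the masking expression does per event, here is a minimal plain C++ sketch with `std::vector` in place of the RVec column. The function name `selectBelow` and the cut value are only for illustration.

```cpp
#include <vector>

// Return the elements of values that satisfy the condition value < cut,
// preserving their original order.
std::vector<unsigned int> selectBelow(const std::vector<unsigned int>& values,
                                      unsigned int cut) {
    std::vector<unsigned int> kept;
    for (unsigned int v : values) {
        if (v < cut) {
            kept.push_back(v);
        }
    }
    return kept;
}
```

For an event with values 10, 50, 41, 42 and a cut of 42, only 10 and 41 survive, so the cleaned column can be shorter than the original one.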

Thanks for your answer.

Correct, but this applies the condition to the values in the first column, which is the events (see photo above). What I would like to do is select on the index (column 2). So maybe something like:

df.Define("cleaned_myleaf1", "myleaf1[3].empty()")

So this should take out every element at index 3 and return the column. But unfortunately it does not work.

I hope it is clearer now what I want to do.
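For reference, `myleaf1[3]` is a single element (an unsigned int according to the error message above), not a container, so `.empty()` cannot be called on it. The intended operation, dropping the element at one position and keeping the rest, could look roughly like this in plain C++. `dropIndex` is a hypothetical helper, and `std::vector` stands in for the RVec column.

```cpp
#include <cstddef>
#include <vector>

// Return a copy of values without the element at position index.
// If the vector is too short to have that element, all elements are kept.
std::vector<unsigned int> dropIndex(const std::vector<unsigned int>& values,
                                    std::size_t index) {
    std::vector<unsigned int> kept;
    kept.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != index) {
            kept.push_back(values[i]);
        }
    }
    return kept;
}
```

Events with fewer than four particles pass through unchanged, so the same call can be applied to every event.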

Oh, now I also understand the use of Take. So that should work

df.Define("cleaned_myleaf1", "Take(myleaf1, myleaf2.size())");

I.e. cleaned_myleaf1 keeps only the first N elements of myleaf1, where N is the size of myleaf2.
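As a plain C++ sketch of that "first N elements" idea, with `std::vector` in place of the RVec columns and a hypothetical helper name `takeFirst`: the clamp to the vector size is an assumption of this sketch, so that events where myleaf2 is longer than myleaf1 come back unchanged. It is worth checking how Take behaves in that case in your ROOT version.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Return the first n elements of values.
// Assumption of this sketch: n is clamped to values.size(),
// so asking for more elements than exist returns the whole vector.
std::vector<unsigned int> takeFirst(const std::vector<unsigned int>& values,
                                    std::size_t n) {
    const std::size_t count = std::min(n, values.size());
    return std::vector<unsigned int>(
        values.begin(),
        values.begin() + static_cast<std::ptrdiff_t>(count));
}
```

With myleaf1 holding four particles and myleaf2 holding three, `takeFirst(myleaf1, myleaf2.size())` keeps particles [0] to [2] and drops particle [3].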

Actually it is my fault. I was defining bad events. Instead, I could define good events and see what to do with them.

Apologies and thanks for your help.
