Getting photon with maximum momentum?

I am trying to access the photon with maximum momentum from a ROOT RDataFrame. Can someone explain why this command does not accomplish this? I am receiving errors that the filter cannot be interpreted as a boolean.

    df = df.Alias("Photon0", "Photon#0.index")
    df = df.Define("photons_all", "FCCAnalyses::ReconstructedParticle::get(Photon0, ReconstructedParticles)")
    df = df.Define("photons_p", "FCCAnalyses::ReconstructedParticle::get_p(photons_all)")
    max_p = (df.Max("photons_p").GetValue())

    # Filter the DataFrame to include only rows with the maximum photon momentum
    df_max_photons = df.Filter("photons_p == max_p")

Welcome to the ROOT Forum!
Maybe I’m wrong, @vpadulan can correct me, but maybe you have to add:

df = df.Define("max_p", str(max_p))

Dear @somebody.nobody ,

First, a caveat. By calling GetValue and then performing new operations on the same dataframe, you trigger a run of the computation graph (i.e. an event loop), and a later action will trigger another, so you end up with more than one event loop over the same dataset. I believe in your case this is unavoidable, since you first need to compute a quantity (the maximum momentum) over all the events in the dataset.

Back to your question: @bellenot is right in the sense that your Filter expression "photons_p == max_p" assumes that the quantity max_p is somehow reachable by RDataFrame, but you have given RDataFrame no way to know about it. The variable max_p in your snippet is just a normal Python variable holding a value (in this case a float representing the maximum momentum over your events). So this quantity needs to be declared to RDataFrame somehow. @bellenot is showing you one way, which is to define a new column in the dataset holding this quantity. Another way, which avoids repeating the same float for each event, is as follows:

max_p = (df.Max("photons_p").GetValue())

df_max_photons = df.Filter(f"photons_p == {max_p}")

Here you insert the constant float value held by max_p into the string expression, effectively embedding it in the code that will be JIT-compiled by cling.
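
To make the interpolation concrete, here is a minimal sketch that only builds the expression string and does not need ROOT. The helper name make_filter_expr and the sample value are illustrative assumptions, not something from your code. Using repr keeps every digit of the Python float, so the literal that ends up in the compiled expression should be the same double that GetValue returned.

```python
def make_filter_expr(column: str, value: float) -> str:
    """Return a C++ expression string comparing `column` with a literal."""
    # !r formats the float with repr(), which round-trips to the same value.
    return f"{column} == {value!r}"


max_p = 45.625  # stand-in for df.Max("photons_p").GetValue()
expr = make_filter_expr("photons_p", max_p)
print(expr)  # photons_p == 45.625
```

The resulting string is what would be handed to df.Filter(expr).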

Cheers,
Vincenzo

I think the program may be running into an issue on the line max_p = (df.Max("photons_p").GetValue()).
I get an extended error message, but the key problem seems to be here: /tmp/root/spack-stage/spack-stage-root-6.28.06-dgx5r6vwya5ynyeoef36d2aw6vk6v2jc/spack-build-dgx5r6v/include/ROOT/RDF/InterfaceUtils.hxx:313:4: error: static_assert failed due to requirement 'std::is_convertible<ROOT::VecOps::RVec, bool>::value' "filter expression returns a type that is not convertible to bool"

The error talks about a filter expression, so it is unlikely that the problem is at the Max call. Do you have any other Filter calls in your application? Double check that every expression you write in a Filter returns a boolean value; that is a requirement.
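
The RVec in the error message suggests that the offending expression compares a per-event vector with a scalar, which gives one flag per particle rather than a single boolean. A plain-Python analogy (not ROOT code; the sample values and the name photons_p_event are made up for illustration) shows the difference, with the RDataFrame counterpart I would expect noted in a comment:

```python
# Momenta of the photons in one event (illustrative values).
photons_p_event = [12.0, 45.625, 3.5]
max_p = 45.625

# Element-wise comparison: one flag per photon, like an RVec result.
flags = [p == max_p for p in photons_p_event]

# Reduction to a single boolean, which is what a filter needs.
# Assumed RDataFrame counterpart: "ROOT::VecOps::Any(photons_p == <value>)".
keep_event = any(flags)

print(flags)       # [False, True, False]
print(keep_event)  # True
```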

Cheers,
Vincenzo

I fixed the filter problem, but I’m running into another difficulty with the selection. Is there a way to get the particle with max. momentum for each event, rather than out of every particle in the data frame?

Dear @somebody.nobody ,

I see from your previous snippet that you already have a column named photons_p which holds the momenta of all the photons in the event. Supposing that the elements of this column are of a type akin to std::vector, you could do something like

df = df.Define(
    "max_p_of_event",
    "*(std::max_element(std::begin(photons_p), std::end(photons_p)))"
)

Note the extra de-reference * which is needed since max_element returns an iterator to the maximum element of the vector.
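
One caveat worth checking: dereferencing the result of std::max_element on an empty vector is undefined behaviour, so events without photons may need a guard. The plain-Python sketch below is only a stand-in for the event loop, with invented sample events and a hypothetical helper max_p_per_event. It shows the per-event maximum next to the dataset-wide one, using a default of -1 for empty events:

```python
from typing import List


def max_p_per_event(events: List[List[float]], default: float = -1.0) -> List[float]:
    """Return the largest momentum in each event, or `default` if it is empty."""
    return [max(ps, default=default) for ps in events]


# Three toy events; the second one has no photons.
events = [[12.0, 45.5, 3.5], [], [7.25, 2.0]]

per_event = max_p_per_event(events)  # one value per event
overall = max(per_event)             # one value for the whole dataset

print(per_event)  # [45.5, -1.0, 7.25]
print(overall)    # 45.5
```

In RDataFrame the same guard could probably be written inside the expression, for example "photons_p.empty() ? -1.f : ROOT::VecOps::Max(photons_p)", assuming photons_p is an RVec<float>.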

Cheers,
Vincenzo