Exploding a list-like column

My TTree has branches of type std::vector<T>. Is there a way to make an RDataFrame ‘explode’ the vector branches, as pandas can?

I want to be able to do

d = d.Explode("v_pt")

where v_pt labels a branch of type std::vector<float>, so that a tuple that looks like:
entry | v_pt
0 | {1.5, 2.0, 1.8}
1 | {5.2}

becomes
entry | pt
0 | 1.5
0 | 2.0
0 | 1.8
1 | 5.2

Is that possible?
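To pin down the semantics being asked for, here is a minimal pure-Python sketch of the transformation in the tables above. It is only an illustration: explode_rows is a hypothetical helper, not a ROOT or pandas function, and the rows are modelled as plain dictionaries rather than as a TTree. The column keeps its original name (v_pt) after exploding.

```python
# Pure-Python illustration of the requested "explode" semantics.
# explode_rows is a hypothetical helper, not a ROOT or pandas API.

def explode_rows(rows, column):
    """Yield one output row per element of row[column]; other columns are repeated."""
    for row in rows:
        for value in row[column]:
            new_row = dict(row)      # shallow copy, so the other columns are repeated
            new_row[column] = value  # replace the list with a single element
            yield new_row


# The example input from the question.
rows = [
    {"entry": 0, "v_pt": [1.5, 2.0, 1.8]},
    {"entry": 1, "v_pt": [5.2]},
]

exploded = list(explode_rows(rows, "v_pt"))
for row in exploded:
    print(row["entry"], "|", row["v_pt"])
```

Note that this sketch drops entries whose vector is empty, whereas pandas keeps such a row and fills it with NaN, so the two are not equivalent in that corner case.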

ROOT Version: 6.30/04
Platform: archlinux
Compiler: gcc 12.3.0

@vpadulan and @mczurylo might help you.

Hi @danj1011,

Thanks for your inquiry. Unfortunately, such an operation does not yet exist in RDF. It is, however, a feature that is already on our list of planned items.

Have you tried using RVecs (see the ROOT::VecOps::RVec<T> class template reference) instead of std::vectors? That could make it easier to deal with the collections within RDF.

Cheers,
Marta

Thanks a lot @mczurylo. For the moment I’m just going to snapshot the dataframe, open it with uproot, and explode it in pandas, until the same functionality exists in ROOT.

I’ve played with RVecs a little bit but, as is often the case, the std::vector structure is inherited and not something I have control over. How would it make my life easier?

Hi @danj1011,

You can convert a std::vector into an RVec and continue with RDF (as described under the link provided earlier). Here is a forum post about an issue similar to yours: “What is the best way to create a flatten RDataFrame?”

Cheers,
Marta

Why were @Edeen1976’s posts removed? They appeared to be really useful.

Having said that, it might have only worked for a single branch, without exploding the other branches.
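For reference, here is a rough pure-Python sketch of what exploding several vector branches together could mean, assuming the branches have the same length in every entry. Everything in it is illustrative: explode_together is a hypothetical helper (not a ROOT or pandas function), and v_eta is an invented second branch added next to v_pt for the example.

```python
# Pure-Python sketch: explode several list-like columns in lockstep.
# explode_together and the v_eta column are hypothetical, for illustration only.

def explode_together(rows, columns):
    """Yield one row per element index, taking the i-th element of every listed column."""
    for row in rows:
        lengths = {len(row[c]) for c in columns}
        if len(lengths) != 1:
            raise ValueError("columns to explode must have equal lengths in each entry")
        for values in zip(*(row[c] for c in columns)):
            new_row = dict(row)                   # the other columns are repeated
            new_row.update(zip(columns, values))  # lists replaced by single elements
            yield new_row


rows = [
    {"entry": 0, "v_pt": [1.5, 2.0, 1.8], "v_eta": [0.1, -0.4, 2.2]},
    {"entry": 1, "v_pt": [5.2], "v_eta": [1.0]},
]

flat = list(explode_together(rows, ["v_pt", "v_eta"]))
for row in flat:
    print(row["entry"], "|", row["v_pt"], "|", row["v_eta"])
```

Raising an error on mismatched lengths is a design choice of this sketch; padding the shorter vector would be another option.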

The fact that Edeen replied to themselves makes it look like it was auto-generated content.
