Suggestion for documentation improvement for RDataFrame

maurik · June 13, 2025, 2:19pm

May I suggest some improvements to the RDataFrame documentation? Is this the best place to do so?

My reason here is that while I am capable of “translating” the suggested code on the RDataFrame documentation page (ROOT: ROOT::RDataFrame Class Reference) into functioning code, a new user without much C++ experience would simply get frustrated and give up.

An example of non-functioning code (about halfway down the page) is:

RDataFrame d(100); // an RDF that will generate 100 entries (currently empty)
int x = -1;
auto d_with_columns = d.Define("x", [&x] { return ++x; })
                       .Define("xx", [&x] { return x*x; });
d_with_columns.Snapshot("myNewTree", "newfile.root");

On my machines (macOS and Alma Linux), this code does not actually work. First, RDataFrame is not found. It needs to be specified as “ROOT::RDataFrame”. The next problem is that the lambda does not work, because the variable “x” cannot be captured since it does not have automatic storage duration. If we are not passing a captured variable, we need to specify the return type. So the correct code would be:

ROOT::RDataFrame d(100); // an RDF that will generate 100 entries (currently empty)
int x = -1;
auto d_with_columns = d.Define("x", []()->int { return ++x; })
                       .Define("xx", []()->int { return x*x; });
d_with_columns.Snapshot("myNewTree", "newfile.root");
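
The capture rule behind this fix can be reproduced without ROOT. The following is only a minimal sketch in plain C++, assuming that “x” ends up as a namespace-scope (global) variable, as it would when the lines are typed at the interpreter prompt or placed outside a function. The names g_counter, fill_from_global and fill_from_local are hypothetical and used purely for illustration; they do not come from the ROOT documentation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for `x` declared at global scope:
// it has static, not automatic, storage duration.
int g_counter = -1;

// A global cannot be captured, but a capture-less lambda can use it directly.
std::vector<int> fill_from_global(int n) {
    assert(n >= 0);
    g_counter = -1;  // reset so repeated calls give the same sequence
    // auto bad = [&g_counter] { return ++g_counter; };
    // ^ ill-formed: Clang reports that 'g_counter' cannot be captured
    //   because it does not have automatic storage duration.
    auto next = [] { return ++g_counter; };  // return type deduced as int
    std::vector<int> out;
    for (int i = 0; i < n; ++i) out.push_back(next());
    return out;
}

// A local variable has automatic storage duration, so [&x] is valid here.
std::vector<int> fill_from_local(int n) {
    assert(n >= 0);
    int x = -1;
    auto next = [&x] { return ++x; };
    std::vector<int> out;
    for (int i = 0; i < n; ++i) out.push_back(next());
    return out;
}
```

Both helpers return the same sequence, which suggests the original [&x] form is fine when “x” is declared inside a function and only fails when “x” is global. In standard C++ the return type of a capture-less lambda is still deduced, so the explicit ->int should be optional in a compiled program; whether the interpreter behaves the same way is not checked here.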

This seems minor, but it gets frustrating for my students when the official documentation is so full of little issues like these.

Thank you for your attention.

mczurylo · June 13, 2025, 3:03pm

Hi @maurik,

Thank you for your post, and yes, you are absolutely right that this may be very frustrating for someone inexperienced. Posting here is okay, but it is generally best to open GH issues so that nothing gets lost in the meantime (GitHub · Where software is built). For this particular change I will open a PR ASAP, but if you see something else that is not working, please let us know on GH.

Cheers,
Marta

mczurylo · June 13, 2025, 3:30pm

Hi @maurik,

Here is the PR that fixes the problem you mentioned directly and also replaces RDataFrame with ROOT::RDataFrame in all the other examples. If you have more examples on hand, feel free to comment on this pull request.

Cheers,
Marta

maurik · June 17, 2025, 1:08pm

Thanks Marta!

system · July 1, 2025, 1:09pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.