RDataFrame

Dear ROOT experts,

I am having trouble running an RDataFrame multithreaded (MT) task that is started from a TGTextButton. A function is connected to the TGTextButton, and this function runs some data analysis on the RDataFrame. The problem appears only when I call ROOT::EnableImplicitMT() at the beginning. Without MT, using a single core, everything finishes correctly. When I execute the same commands in a separate program, in the main body, or in the TGMainFrame constructor, everything works. Only the button is the problem.

The attached code reproduces the problem. In this case, I call a simple Mean() on “column1”. One call is in the main function buttonChangelabel(); this executes well. The other call is in ChangeStartLabel(), which is executed when a button is pressed. Here the whole application freezes. Removing ROOT::EnableImplicitMT() makes it work, but only single-threaded.

Do you know what I am doing wrong? How can I run my DF commands from the “button function”?

buttonChangelabel.cpp (2.8 KB)

ROOT Version: 6.26/10
Built for linuxarm64 on Nov 16 2022, 10:42:54
From tags/v6-26-10@v6-26-10

Welcome to the ROOT Forum!
Can you provide also the data file in order to be able to run your code?

What do you expect to see in the example you sent when clicking on Start? If I use “hsimple.root” as the data file and “px” instead of “column1”, it seems to work, or at least it does not seem to freeze. When running your macro (ROOT 6.32.04, on WSL2), there are many messages, but the window with the buttons shows up. Clicking on Start changes the button to Stop and prints the mean value; clicking again (the button keeps alternating between “Stop” and “Start”) prints the mean every time, and clicking on Exit quits ROOT:

 root
   ------------------------------------------------------------------
  | Welcome to ROOT 6.32.04                        https://root.cern |
  | (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Aug 14 2024, 04:01:00                 |
  | From tags/v6-32-04@v6-32-04                                      |
  | With c++ (Ubuntu 13.2.0-23ubuntu4) 13.2.0                        |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

root [0] .x buttonChangelabel.cpp
In file included from input_line_8:1:
/mnt/c/d/buttonChangelabel.cpp:21:4: warning: 'CheckTObjectHashConsistency' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
   ClassDef(MyMainFrame, 0)
   ^
input_line_4:22:28: note: expanded from macro 'ClassDef'
#define ClassDef(name, id) \
                           ^
input_line_4:9:27: note: expanded from macro '\
_ClassDefInterp_'
   virtual_keyword Bool_t CheckTObjectHashConsistency() const overrd { return true; } \
                          ^
/home/daniel/root/include/TGFrame.h:494:4: note: overridden virtual function is here
   ClassDefOverride(TGMainFrame,0)  // Top level window frame
   ^
/home/daniel/root/include/Rtypes.h:342:4: note: expanded from macro 'ClassDefOverride'
   _ClassDefOutline_(name,id,,override)              \
   ^
/home/daniel/root/include/Rtypes.h:304:4: note: expanded from macro '_ClassDefOutline_'
   _ClassDefBase_(name,id, virtual_keyword, overrd)       \
   ^
/home/daniel/root/include/Rtypes.h:275:55: note: expanded from macro '_ClassDefBase_'
   /** \cond HIDDEN_SYMBOLS */ virtual_keyword Bool_t CheckTObjectHashConsistency() const overrd       \
                                                      ^
In file included from input_line_8:1:
/mnt/c/d/buttonChangelabel.cpp:21:4: warning: 'IsA' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
   ClassDef(MyMainFrame, 0)
   ^
input_line_4:22:28: note: expanded from macro 'ClassDef'
#define ClassDef(name, id) \
                           ^
input_line_4:12:28: note: expanded from macro '\
_ClassDefInterp_'
   virtual_keyword TClass *IsA() const overrd { return name::Class(); } \
                           ^
/home/daniel/root/include/TGFrame.h:494:4: note: overridden virtual function is here
   ClassDefOverride(TGMainFrame,0)  // Top level window frame
   ^
/home/daniel/root/include/Rtypes.h:342:4: note: expanded from macro 'ClassDefOverride'
   _ClassDefOutline_(name,id,,override)              \
   ^
/home/daniel/root/include/Rtypes.h:304:4: note: expanded from macro '_ClassDefOutline_'
   _ClassDefBase_(name,id, virtual_keyword, overrd)       \
   ^
/home/daniel/root/include/Rtypes.h:294:76: note: expanded from macro '_ClassDefBase_'
   /** \return TClass describing current object */ virtual_keyword TClass *IsA() const overrd       \
                                                                           ^
In file included from input_line_8:1:
/mnt/c/d/buttonChangelabel.cpp:21:4: warning: 'ShowMembers' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
   ClassDef(MyMainFrame, 0)
   ^
input_line_4:22:28: note: expanded from macro 'ClassDef'
#define ClassDef(name, id) \
                           ^
input_line_4:13:25: note: expanded from macro '\
_ClassDefInterp_'
   virtual_keyword void ShowMembers(TMemberInspector&insp) const overrd { ::ROOT::Class_ShowMembers(name::Class(), this, insp); } \
                        ^
/home/daniel/root/include/TGFrame.h:494:4: note: overridden virtual function is here
   ClassDefOverride(TGMainFrame,0)  // Top level window frame
   ^
/home/daniel/root/include/Rtypes.h:342:4: note: expanded from macro 'ClassDefOverride'
   _ClassDefOutline_(name,id,,override)              \
   ^
/home/daniel/root/include/Rtypes.h:304:4: note: expanded from macro '_ClassDefOutline_'
   _ClassDefBase_(name,id, virtual_keyword, overrd)       \
   ^
/home/daniel/root/include/Rtypes.h:296:53: note: expanded from macro '_ClassDefBase_'
   /** \cond HIDDEN_SYMBOLS */ virtual_keyword void ShowMembers(TMemberInspector &insp) const overrd            \
                                                    ^
In file included from input_line_8:1:
/mnt/c/d/buttonChangelabel.cpp:21:4: warning: 'Streamer' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
   ClassDef(MyMainFrame, 0)
   ^
input_line_4:22:28: note: expanded from macro 'ClassDef'
#define ClassDef(name, id) \
                           ^
input_line_4:14:25: note: expanded from macro '\
_ClassDefInterp_'
   virtual_keyword void Streamer(TBuffer&) overrd { ::Error("Streamer", "Cannot stream interpreted class."); } \
                        ^
/home/daniel/root/include/TGFrame.h:494:4: note: overridden virtual function is here
   ClassDefOverride(TGMainFrame,0)  // Top level window frame
   ^
/home/daniel/root/include/Rtypes.h:342:4: note: expanded from macro 'ClassDefOverride'
   _ClassDefOutline_(name,id,,override)              \
   ^
/home/daniel/root/include/Rtypes.h:314:25: note: expanded from macro '_ClassDefOutline_'
   virtual_keyword void Streamer(TBuffer&) overrd;
                        ^
: -0.00382645
root [1] : -0.00382645
: -0.00382645
: -0.00382645
: -0.00382645
user@pc $

If this is expected but not what you see, maybe try a more recent version of ROOT.

I apologize for the warning messages; I just took some simple tutorial code and added a few lines.

I have played with my data, and it seems that everything works when I reduce the number of events in the tree from 1000 to about 200.

I have generated a random dataframe like this:

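// Create an empty dataframe with 1e7 entries.
ROOT::RDataFrame rdf(10000000);
// Fill "column1" with uniformly distributed random numbers.
auto rdf_x = rdf.Define("column1", [](){ return gRandom->Rndm(); });
// Write the result to the tree "ttree" in testMT.root.
rdf_x.Snapshot("ttree", "testMT.root");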

With this generated dataframe, my code freezes when I press the start button. Changing the number of events to e.g. 100 works fine. The file is quite large; if you want me to upload the same file I have, I will. The relevant output I get from ~100 events is:

: 0.499589
: 0.499589

However, if I generate 1e7 events as in the code above, I get:

: 0.499589
:

and everything freezes.

I have compiled ROOT 6.32 and it does the same thing. I am getting the same issue with Ubuntu and with Mac.

It looks like you are hitting some allocation limit. In your case, the highest number of entries (in your RDF) that works is 5501452; with 5501453 the code freezes. For these two cases ROOT shows, respectively:

ttree->Print()
******************************************************************************
*Tree    :ttree     : ttree                                                  *
*Entries :  5501452 : Total =        44143283 bytes  File  Size =   30012439 *
*        :          : Tree compression factor =   1.47                       *
******************************************************************************
*Br    0 :column1   : column1/D                                              *
*Entries :  5501452 : Total  Size=   44142942 bytes  File Size  =   29999995 *
*Baskets :     1379 : Basket Size=      32000 bytes  Compression=   1.47     *
*............................................................................*
ttree->Print()
******************************************************************************
*Tree    :ttree     : ttree                                                  *
*Entries :  5501453 : Total =        44143291 bytes  File  Size =   30012446 *
*        :          : Tree compression factor =   1.47                       *
******************************************************************************
*Br    0 :column1   : column1/D                                              *
*Entries :  5501453 : Total  Size=   44142950 bytes  File Size  =   30000001 *
*Baskets :     1379 : Basket Size=      32000 bytes  Compression=   1.47     *
*............................................................................*

So the issue appears when “File Size” > 30000000.
If you don’t need doubles, you could use floats instead (e.g. auto rdf_x = rdf.Define("column1", [](){ return (float)gRandom->Rndm(); });), and in this case it works. I think you can also increase the basket size in the trees, which might solve the issue, but I’m not sure; I’ll let an expert comment on that.
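
The size arithmetic behind the float suggestion can be sketched in plain C++ (no ROOT needed). The helper names below are hypothetical, and the sketch only counts the raw column payload, ignoring basket headers and compression. Whether the 30000000-byte mark is really what triggers the freeze is only an observation from the Print() output, not an established cause.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: raw (uncompressed) payload of a single column,
// ignoring basket headers and any other TTree bookkeeping.
constexpr std::uint64_t payloadBytes(std::uint64_t entries, std::size_t bytesPerValue)
{
   return entries * static_cast<std::uint64_t>(bytesPerValue);
}

// Usage example: re-derive the payload for the largest entry count
// reported to work (5501452), stored as double and as float.
// Assumes the usual 8-byte double and 4-byte float.
inline void checkReportedNumbers()
{
   assert(payloadBytes(5501452, sizeof(double)) == 44011616);
   assert(payloadBytes(5501452, sizeof(float)) == 22005808);
}
```

The 44011616 bytes of raw doubles sit just below the 44142942 bytes that Print() reports as Total Size; the difference is presumably per-basket overhead. Storing floats halves the payload, which would keep the same number of entries well below the size at which the freeze was observed.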

@dastudillo thank you for checking that!

I have generated two dataframes, both with 5e6 events. One has ones and one has zeroes in the column:

ROOT::RDataFrame rdf(5000000);
auto rdf_x = rdf.Define("column1", [](){ return 0.;});
//auto rdf_x = rdf.Define("column1", [](){ return 1.;});
rdf_x.Snapshot("ttree", "testMT.root");

I have hadded them to get one big dataframe with 1e7 events. The column mean is obviously 0.5. I did this to check that, with a big dataframe, the second half (the part beyond file size 30000000) is not ignored. It isn’t. And to my surprise, the code finishes without freezing with the hadded dataframe. So hadding dataframes makes it work.
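
The expected mean of the merged file follows from weighting each input by its entry count. A minimal sketch in plain C++, with a hypothetical helper name, assuming each input file is summarized by its entry count and its own mean:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mean of a column after merging two files,
// given each file's entry count and its per-file mean.
inline double mergedMean(std::uint64_t nA, double meanA, std::uint64_t nB, double meanB)
{
   assert(nA + nB > 0 && "at least one entry is required");
   const double total = static_cast<double>(nA) + static_cast<double>(nB);
   return (static_cast<double>(nA) * meanA + static_cast<double>(nB) * meanB) / total;
}
```

With 5000000 zeros and 5000000 ones, mergedMean(5000000, 0.0, 5000000, 1.0) gives 0.5. If the second half of the merged file were skipped, the result would instead be 0 or 1, depending on which file came first.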

The real dataframe I want to use has a TArrayD column holding a waveform from the oscilloscope. I made a small GUI using ROOT to analyze my data. Everything works in single-threaded mode, but with more data I need to go parallel.

The real enigma is: why does it not work inside the function invoked by the button, while it does work when called outside the GUI?
