Home | News | Documentation | Download

Parallel filling of histogram (PROOF)

tree

#1

Hi, rooters!

Does anybody use PROOF in your analysis?
I’ve tried to do it, because it’s strange to have a machine with 12 threads and use only one or two.
So for me now the problem is filling of histogram. I have such code:
I’ll skip the lines that I didn’t touch:

TreeParProc.h

class TreeParProc : public TSelector {
public:
....
  Float_t TrackPara[177][17];
  Int_t TriggerType[2];
  Int_t nTracklets;
  Int_t nTracks;
  TH1F* fhpt;
....
void TreeParProc::Init(TTree *tree)
{
  fTree = tree;
  fTree->SetBranchAddress("TrackPara", TrackPara);
  fTree->SetBranchAddress("nTracks", &nTracks);
  fTree->SetBranchAddress("nTracklets", &nTracklets);
  fTree->SetBranchAddress("TriggerType", TriggerType);
}


TreeParProc.C

void TreeParProc::Begin(TTree * /*tree*/)
{
  ::Info("","Begin of parallel data processing");
  fhpt = new TH1F("pt4p", "pt4p", 100, 0., 2.);
  TString option = GetOption();
}
Bool_t TreeParProc::Process(Long64_t entry)
{
  fTree->GetEvent(entry);
  Int_t nGoodTracks{};
  Int_t charge{};
  Float_t tempPt{};
  Float_t tempPx{};
  Float_t tempPy{};
  
  for (auto j = 0; j < nTracks; ++j) {
    if (!TrackPara[j][5] && !TrackPara[j][6]) return kTRUE;
    ::Info("", "Event num: %lld | track num: %d | px: %f | py: %f\n", entry, j, TrackPara[j][11], TrackPara[j][12]);
    nGoodTracks++;
    charge += TrackPara[j][7];
    tempPx += TrackPara[j][12];
    tempPy += TrackPara[j][12];
 }  
 if (nGoodTracks < 4) return kTRUE;
   
 tempPt = TMath::Sqrt(tempPx*tempPx+tempPy*tempPy);
 if (charge == 0) fhpt->Fill(tempPt);
   return kTRUE;
}

void TreeParProc::Terminate()
{
  auto fot  = new TFile("parallel.hist.root", "RECREATE");
  fhpt->Write();
  fot->Close();
}

In output of my info I see proper values of my data, but when I do hpt->Fill() I have the next one message in proof session:

14:52:28 11297 Wrk-0.3 | Info: Event num: 442 | track num: 2 | px: -0.492885 | py: 0.083113
14:52:28 11297 Wrk-0.3 | Info: Event num: 442 | track num: 3 | px: -0.053113 | py: -0.203568
14:52:28 11297 Wrk-0.3 | *** Break ***: segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f22d69fd687 in __GI___waitpid (pid=11721, stat_loc=stat_loc
entry=0x7ffd8274b168, options=options
entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x00007f22d6968067 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:149
#2  0x00007f22d7826c23 in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:2099
#3  TUnixSystem::StackTrace (this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:2393
#4  0x00007f22d7829614 in TUnixSystem::DispatchSignals (this=0x55b9aa7b47c0, sig=kSigSegmentationViolation) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:3624
#5  <signal handler called>
#6  0x00007f22c8b4d3b9 in TreeParProc::Process(long long) () from /home/bdrum/.proof/cache/TreeParProc_C.so
#7  0x00007f22c973f5c7 in TProofPlayer::Process (this=0x55b9abac6cb0, dset=0x55b9ab6a79f0, selector_file=0x7ffd8274e2a9 "TreeParProc.C+", option=0x7ffd8274e2c9 "", nentries=-1, first=-1) at /home/bdrum/Projects/cern/root-dev/root/proof/proofplayer/src/TProofPlayer.cxx:1314
#8  0x00007f22cabc8199 in TProofServ::HandleProcess (this=0x55b9ab242b20, mess=<optimized out>, slb=0x0) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:4047
#9  0x00007f22cabc3747 in TProofServ::HandleSocketInput (this=0x55b9ab242b20, mess=0x55b9ab5f27e0, all=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:1653
#10 0x00007f22cabb6a02 in TProofServ::HandleSocketInput (this=0x55b9ab242b20) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:1364
#11 0x00007f22cabcd7e7 in TProofServLiteInputHandler::Notify (this=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServLite.cxx:177
#12 TProofServLiteInputHandler::ReadNotify (this=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServLite.cxx:169
#13 0x00007f22d7828a00 in TUnixSystem::CheckDescriptors (this=this
entry=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:1302
#14 0x00007f22d782a398 in TUnixSystem::DispatchOneEvent (this=0x55b9aa7b47c0, pendingOnly=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:1057
#15 0x00007f22d773fb91 in TSystem::InnerLoop (this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TSystem.cxx:412
#16 TSystem::Run (this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TSystem.cxx:362
#17 0x00007f22d76d2caf in TApplication::Run (this=0x55b9ab242b20, retrn=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TApplication.cxx:1176
#18 0x000055b9a9169670 in main (argc=<optimized out>, argv=0x7ffd8274f6c8) at /home/bdrum/Projects/cern/root-dev/root/main/src/pmain.cxx:260
===========================================================

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00007f22c8b4d3b9 in TreeParProc::Process(long long) () from /home/bdrum/.proof/cache/TreeParProc_C.so
#7  0x00007f22c973f5c7 in TProofPlayer::Process (this=0x55b9abac6cb0, dset=0x55b9ab6a79f0, selector_file=0x7ffd8274e2a9 "TreeParProc.C+", option=0x7ffd8274e2c9 "", nentries=-1, first=-1) at /home/bdrum/Projects/cern/root-dev/root/proof/proofplayer/src/TProofPlayer.cxx:1314
#8  0x00007f22cabc8199 in TProofServ::HandleProcess (this=0x55b9ab242b20, mess=<optimized out>, slb=0x0) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:4047
#9  0x00007f22cabc3747 in TProofServ::HandleSocketInput (this=0x55b9ab242b20, mess=0x55b9ab5f27e0, all=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:1653
#10 0x00007f22cabb6a02 in TProofServ::HandleSocketInput (this=0x55b9ab242b20) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServ.cxx:1364
#11 0x00007f22cabcd7e7 in TProofServLiteInputHandler::Notify (this=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServLite.cxx:177
#12 TProofServLiteInputHandler::ReadNotify (this=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/proof/proof/src/TProofServLite.cxx:169
#13 0x00007f22d7828a00 in TUnixSystem::CheckDescriptors (this=this
entry=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:1302
#14 0x00007f22d782a398 in TUnixSystem::DispatchOneEvent (this=0x55b9aa7b47c0, pendingOnly=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/core/unix/src/TUnixSystem.cxx:1057
#15 0x00007f22d773fb91 in TSystem::InnerLoop (this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TSystem.cxx:412
#16 TSystem::Run (this=0x55b9aa7b47c0) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TSystem.cxx:362
#17 0x00007f22d76d2caf in TApplication::Run (this=0x55b9ab242b20, retrn=<optimized out>) at /home/bdrum/Projects/cern/root-dev/root/core/base/src/TApplication.cxx:1176
#18 0x000055b9a9169670 in main (argc=<optimized out>, argv=0x7ffd8274f6c8) at /home/bdrum/Projects/cern/root-dev/root/main/src/pmain.cxx:260
===========================================================


14:52:33 11297 Wrk-0.3 | Error in <TProofServLite::HandleException>: caugth exception triggered by signal '1' while processing dset:'TDSet:events', file:'/home/bdrum/Projects/cern/alice/data/result.root' - check logs for possible stacktrace - last event: 442

So, what I missed, because in tutorial for parallel filling of histogram via random values they use just hist->Fill(x)?

Cheers,
Boris


ROOT Version: 6.17/01
Platform: ubuntu 18.04
Compiler: gcc 7.3.0



#2

You might want to use RDataFrame instead of PROOF, as the latter is deprecated.


#3

Small caveat: RDataFrame does not support multidimensional arrays yet – just mentioning it because @bdrum asked about 2D arrays in another post


#4

Thanks! Wow, I didn’t know that PROOF is deprecated. Based on root docs, I thought that it’s most common method for parallel data processing with root. Ok, if so and also as @eguiraud already mentioned I have to use in my data 2d array. So I’ve two questions:
Is RDataFrame the modern and easiest way to use parallelism in my data processing?
I saw the jira-ticket dedicated to implementation of working with multi arrays in RDataFrame, so Is it means that you are moving in this direction? So, You see, I can redesign my data structures in tree to 1d arrays.


#5

Well, maybe deprecated is too strong. It’s considered legacy, but it won’t be removed from ROOT any time soon.


#6

Hi Boris,

our policy is not to leave behind anybody. Whenever possible, we try to support the users that depend on PROOF although its development is frozen since a while.

Currently RDataFrame is the reccomended way to process data in a parallel fashion since the very same code runs sequentially and in MT mode. We believe it is the simplest approach to express parallelism and we have demonstrated that we can scale analysis workflows to O(100) threads successfully (and here) and we have a working prototype which allows to spread the calculations described by RDataFrame based analysis code on distributed systems.

As you correctly note, we are still working to the reading of multi-dimensional arrays stored in columns of ROOT datasets via the TTreeReader/TTreeReaderValue mechanism on which RDataFrame relies. You mention the option to move to monodimensional arrays the data structures on which you base your data model. If this is a possibility, it would be definitively useful for us to collect also the feedback relative to RDataFrame from an experienced PROOF-Lite user.

Cheers,
Danilo


#7

Hi Danilo,

thanks for your response.

First of all, I figure out that my question in topic has the obviously answer. The problem is the container inside TH1 which I tried to fill doesn’t have concurrent version, so it means I can’t fill it in parallel mode. I just clarify it, may be somebody will have such question. I don’t know why I thought that it should be by default.

What about RDataFrame and PROOF:

For me RDataFrame is really good idea, because it’s way to make analysis more easier and it looks like attempt to not overload physicist by c++ code like it in proof, but for developers which use root like one of component of their software proof looks really nice.

So, I play with RDataFrame a bit, my first impression it’s a bit confusing. Just one example:
But before, I should clarify that I don’t familiar with RDataFrame syntax, and from the one side it’s problem, but from another side, e.g. in python it’s not a problem because python could forgive many of my mistakes. So, example:

ROOT::RDataFrame rd("events", "results.root");

 rd.Count()
(ROOT::RDF::RResultPtr<ULong64_t>) @0x55d4e5042860 // Here of course I expected value, not address, 
//But ok I have to use
rd.Count().GetValue() // or *
(unsigned long long) 7259517 //and it's works fine
rd.GetColumnNames()
(ROOT::RDF::RInterface::ColumnNames_t) { "dV0", "EnZDC", "dAD", "vertex", "nTracks", "nTracklets", "eventinfo", "TDCa", "TDCc", "TriggerType", "dca0", "dca1", "ITSNcls", "TPCNcls", "HasPointOnITSLayer0", "HasPointOnITSLayer1", "charge", "NumberOfSigmasTPCPion", "NumberOfSigmasTPCElectron", "NumberOfSigmasITSPion", "NumberOfSigmasITSElectron", "NumberOfSigmasITSKaon", "StatusAndTPCRefit", "StatusAndITSRefit", "Px", "Py", "Pz", "Pt" }
rd.Filter("HasPointOnITSLayer0 == 1 && HasPointOnITSLayer1 == 1").Count()
(ROOT::RDF::RResultPtr<ULong64_t>) @0x55d4e61ae8f0
rd.Filter("HasPointOnITSLayer0 == 1 && HasPointOnITSLayer1 == 1").Count().GetValue()
"filter functions must return a bool"
 *** Break *** segmentation violation

Actually I really like root, but how I could see it, RDataFrame is python like way for analysis, and from the one side it’s really good, because, you know people who work in data science especially so popular right now machine learning, they just open jupyter notebooks and code something even have no idea what is class, pointers, inheritance or so on, but their code will work, because it’s python. Of course one of popular python implementation is CPython via C, but for me cling so far from that level of comfort.
Them main problem after such seg viol, the reason of which is obviously, I didn’t know about what is filter, what types it needed and so on, but after seg viol RDataFrame not available anymore. I should re-run everything, of course I could use macro and it’s just .L or .x, but I don’t know.

Ok that’s just some really philosophy questions, anyway I sure that this is right direction and I belive that you be able to improve RDataFrame and all point from this direction, so it’s fresh, so users and developers need more time, one for getting new habits another for test, improve and fixes.

Good luck!

Boris


#8

Hi Danilo,

I solved my problem with RDataFrame.
Just will share my code, perhaps it will help for somebody:

key words: RDataFrame, Arrays, Lambda, RVec

void Analyse() {
  auto stpw = new TStopwatch();
  stpw->Start();
  ::Info("", "New analysis method via using RDataFrame is starting:");
  auto rd = new ROOT::RDataFrame("events", "newResult.root");
  auto hm = new ROOT::RDF::TH1DModel("Pt", "Pt", 100,0,2);
  
  auto ptl = [](Int_t n, ROOT::VecOps::RVec<Int_t> x, ROOT::VecOps::RVec<Int_t> y, ROOT::VecOps::RVec<Int_t> z) {
    int nGoodTracks = 0;
    int q = 0;
    if (n>177) return kFALSE;
    for(auto i=0;i<n;++i) {
      if(!x[i] && !y[i]) continue;
      nGoodTracks++;
      q+=z[i];
    }
    if (nGoodTracks < 4 || q!= 0) return kFALSE; 
    return kTRUE;
  };
    
  auto h = rd->Filter(ptl, {"nTracks", "HasPointOnITSLayer0", "HasPointOnITSLayer1", "charge"}).Histo1D(*hm,"Pt");
  auto c = new TCanvas("c","Pt");
  h.OnPartialResult(100000, [&c](TH1D &h_) { c->cd(); h_.Draw(); c->Update(); });
  h->DrawClone(); // event loop runs here, this `Draw` is executed after the event loop is finished
  
  ::Info("", "Elapsed time:  %f, sec", stpw->RealTime());
  std::cout << stpw->RealTime() << std::endl;
}

One interesting moment: it doesn’t show last ::Info, but show std::cout, do you have any idea why it so?

So my impressions really good. I really enjoy using RDataFrame. The reason why I started to investigate parallel working with root is performance improvement. Now it’s really fast. Just for information:
I have 7.2Gb file with data. If I will run my code not in parallel mode it takes me

Info: 99% completed. Time has passed: 12563.855918, sec

Now, processing of the same data with RDataFrame and ROOT::EnableImplicitMT(8) will take:

5.72845 sec

By the way with PROOF it takes something about 5000sec, but For sure the main reason of such speed it’s your code optimization in Filter.
So It’s really great work! Congratulation!
But of course working with 2d array like matrices really comfort for plenty of users, I guess, so we will wait when this feature will be available.

Cheers,
Boris


#9

Hi Boris,

great news! Thanks for sharing.

I am not sure why the second log printout is not appearing: how is the stream handled by the logger?

You can probably even gain some more performance changing the signature of your lambda into

auto ptl = [](Int_t n, const &ROOT::VecOps::RVec<Int_t> x, const &ROOT::VecOps::RVec<Int_t> y, const &ROOT::VecOps::RVec<Int_t> z) 

As we discussed before, reading multidimensional arrays through TTreeReader is an objective we want to meet.

Cheers,
Danilo