Hello,
I’m currently trying to figure out how to take advantage of the multithreading offered in ROOT 6 in my existing analysis code. As a first exercise, I wrote a short macro that reads two ROOT trees from separate files and simply loops over the entries using GetEntry(). (I’m aware that there are better, more modern ways to loop over ROOT trees; however, the logic of my existing analysis code relies heavily on the GetEntry() procedure.) The files “Test1.root” and “Test2.root” are 3 GB each. Each contains a single tree with six branches holding typical event-by-event measurement data.
In the first version of the code, the analysis is done sequentially, processing one tree at a time.
void singlecore2(){
   Long64_t TimeStamp;
   Long64_t n1, n2;
   TStopwatch watch;
   watch.Start();

   // Open both files and bind the same buffer to the timestamp branch of each tree
   TFile *f1 = new TFile("Test1.root", "read");
   TTree *Traw1 = (TTree*)f1->Get("RawData/Tglobal");
   Traw1->SetBranchAddress("TimeStampGlobal",&TimeStamp);
   TFile *f2 = new TFile("Test2.root", "read");
   TTree *Traw2 = (TTree*)f2->Get("RawData/Tglobal");
   Traw2->SetBranchAddress("TimeStampGlobal",&TimeStamp);
   n1 = Traw1->GetEntries();
   n2 = Traw2->GetEntries();

   // Read every entry of the first tree, then every entry of the second
   for(Long64_t j=0; j<n1; j++){
      Traw1->GetEntry(j);
   }
   printf("File 1 - done\n");
   for(Long64_t j=0; j<n2; j++){
      Traw2->GetEntry(j);
   }
   printf("File 2 - done\n");

   f1->Close();
   f2->Close();
   cout << "(Processing time: " << watch.RealTime() << ")" <<endl;
}
In the second version of the code, I wanted to do the same thing but process the two trees in parallel on two different CPU cores using multithreading. For this, I used TTaskGroup, following the tutorial mt301_TTaskGroupSimple.C.
void multicore2(){
   TStopwatch watch;
   watch.Start();
   ROOT::EnableImplicitMT(2);   // implicit multithreading with 2 threads
   ROOT::Experimental::TTaskGroup tg;

   // Task 1: open the first file and read every entry of its tree
   tg.Run([]() {
      Long64_t TimeStamp1 = 0;
      Long64_t n1;
      TFile *f1 = new TFile("Test1.root", "read");
      TTree *Traw1 = (TTree*)f1->Get("RawData/Tglobal");
      Traw1->SetBranchAddress("TimeStampGlobal",&TimeStamp1);
      n1 = Traw1->GetEntries();
      for(Long64_t j=0; j<n1; j++){
         Traw1->GetEntry(j);
      }
      f1->Close();
      cout << TimeStamp1 << endl;
      printf("File 1 - done\n");
   });

   // Task 2: the same for the second file, with its own local variables
   tg.Run([]() {
      Long64_t TimeStamp2 = 0;
      Long64_t n2;
      TFile *f2 = new TFile("Test2.root", "read");
      TTree *Traw2 = (TTree*)f2->Get("RawData/Tglobal");
      Traw2->SetBranchAddress("TimeStampGlobal",&TimeStamp2);
      n2 = Traw2->GetEntries();
      for(Long64_t j=0; j<n2; j++){
         Traw2->GetEntry(j);
      }
      f2->Close();
      cout << TimeStamp2 << endl;
      printf("File 2 - done\n");
   });

   tg.Wait();   // block until both tasks have finished
   ROOT::DisableImplicitMT();
   cout << "(Processing time: " << watch.RealTime() << ")" <<endl;
}
Unfortunately, I’m getting considerably worse performance with multithreading enabled than in sequential mode. Moreover, the processing time increases with the number of threads I request:
Multithreading disabled: 68 s
EnableImplicitMT(2): 247 s
EnableImplicitMT(4): 429 s
While the code is running, the task manager shows that the requested number of cores is indeed running at 100% load.
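One way to narrow this down might be to time each task on its own, so that the per-file cost can be compared between the sequential and the parallel run. The following is only a sketch in plain C++ with no ROOT calls: the helper names timeTasksInParallel and timeTasksSequentially are hypothetical and not part of ROOT, and it uses one std::thread per task rather than TTaskGroup.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using Task = std::function<void()>;

// Hypothetical helper (not a ROOT API): wall-clock seconds taken by one task.
double timeOneTask(const Task &task)
{
   const auto start = std::chrono::steady_clock::now();
   task();
   const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
   return elapsed.count();
}

// Hypothetical helper: run the tasks one after another and time each one.
std::vector<double> timeTasksSequentially(const std::vector<Task> &tasks)
{
   std::vector<double> seconds;
   seconds.reserve(tasks.size());
   for (const auto &task : tasks)
      seconds.push_back(timeOneTask(task));
   return seconds;
}

// Hypothetical helper: run each task on its own std::thread and time each one.
std::vector<double> timeTasksInParallel(const std::vector<Task> &tasks)
{
   std::vector<double> seconds(tasks.size(), 0.0);
   std::vector<std::thread> workers;
   workers.reserve(tasks.size());
   for (std::size_t i = 0; i < tasks.size(); ++i) {
      // Each thread writes only to its own slot, so no locking is needed.
      workers.emplace_back([&tasks, &seconds, i]() {
         seconds[i] = timeOneTask(tasks[i]);
      });
   }
   for (auto &worker : workers)
      worker.join();   // wait for all tasks before returning the timings
   return seconds;
}
```

In the macro, the two task bodies would be the lambdas currently passed to tg.Run(). If they were run on plain std::thread objects like this, I assume ROOT::EnableThreadSafety() would have to be called first, since the tasks open files and read trees concurrently. If the per-task times in the parallel run are each much larger than in the sequential run, that would point to the tasks slowing each other down rather than to overhead outside the loops.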
Any hint or suggestion would be highly appreciated.
ROOT Version: 6.34.02
Platform: Windows 11
Compiler: Visual Studio 17.12.2