Dear ROOT experts,
In my physics analysis, I need to construct O(100) RDataFrame instances (for different MC channels and data-taking periods). It seems that for every Define/Filter on each RDataFrame, ROOT needs to generate one call like ROOT::Internal::RDF::JitDefineHelper<...>(...) that is then JIT-compiled globally.
For my use case this means O(10000) such lines need to be JIT-compiled before the event loop can run with the RunGraphs([df1, df2, …]) call. This takes about 18 seconds on my local machine, while running the event loop itself also takes only about 18 seconds without multithreading.
Looking at the RLoopManager::Jit() method, it appears that the JIT compilation is done sequentially, in chunks of 1000 lines.
I was wondering if it is possible for ROOT to do the compilation in parallel to speed up the process?
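To make the numbers above concrete, here is a small pure-Python sketch of the situation as described. It does not call ROOT: the placeholder strings, the `chunked` helper, and the per-dataframe node count of 100 are made up for illustration, and the chunk size of 1000 is simply the value mentioned above.

```python
from itertools import islice


def chunked(lines, size=1000):
    """Yield successive lists of at most `size` items from `lines`."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical orders of magnitude: O(100) dataframes, O(100) Define/Filter nodes each.
n_dataframes = 100
n_nodes_per_dataframe = 100

# Placeholder strings standing in for the generated helper calls.
lines = [f"helper_call_{i}();" for i in range(n_dataframes * n_nodes_per_dataframe)]

# Sequential processing, one chunk after another, as described above.
chunks = list(chunked(lines))
n_chunks = len(chunks)
print(f"{len(lines)} lines -> {n_chunks} chunks of up to 1000 lines")
```

With these assumed numbers there are only a handful of chunks, and each one has to wait for the previous one to finish.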
Here is the script in which the slow JIT is observed:
rdf_jit_take_long.py (4.5 KB)
Here are some logs from the script and some sample code that is JITed:
Info in <[ROOT.RDF] Info /usr/src/debug/root/root-6.28.04/tree/dataframe/src/RDFHelpers.cxx:69 in void ROOT::RDF::RunGraphs(std::vector<RResultHandle>)>: Just-in-time compilation phase for RunGraphs (198 unique computation graphs) completed in 17.314043 seconds.
Info in <[ROOT.RDF] Info /usr/src/debug/root/root-6.28.04/tree/dataframe/src/RDFHelpers.cxx:91 in void ROOT::RDF::RunGraphs(std::vector<RResultHandle>)>: Finished RunGraphs run (198 unique computation graphs, 18.5s CPU, 20.2613s elapsed).
ROOT::Internal::RDF::CallBuildAction<ROOT::Internal::RDF::ActionTags::Histo2D, double, double, double>(reinterpret_cast<std::shared_ptr<ROOT::Detail::RDF::RNodeBase>*>(0x5586dbec2770), new const char*[3]{"var1_to_be_fill", "var2_to_be_fill", "final_weight"}, 3, 1, reinterpret_cast<shared_ptr<TH2D>*>(0x5586dbec2750), reinterpret_cast<std::weak_ptr<ROOT::Internal::RDF::RJittedAction>*>(0x5586dbec28c0), reinterpret_cast<ROOT::Internal::RDF::RColumnRegister*>(0x5586dbebd220));
ROOT::Internal::RDF::JitFilterHelper(R_rdf::func27, new const char*[0]{}, 0, "", reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedFilter>*>(0x5586dbec2ac0), reinterpret_cast<std::shared_ptr<ROOT::Detail::RDF::RNodeBase>*>(0x5586dbec2aa0),reinterpret_cast<ROOT::Internal::RDF::RColumnRegister*>(0x5586dbebd1c0));
ROOT::Internal::RDF::JitDefineHelper<ROOT::Internal::RDF::DefineTypes::RDefineTag>(R_rdf::func28, new const char*[1]{"omega"}, 1, "var1_to_be_fill", reinterpret_cast<ROOT::Detail::RDF::RLoopManager*>(0x558691fc6830), reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedDefine>*>(0x5586dbec2e60), reinterpret_cast<ROOT::Internal::RDF::RColumnRegister*>(0x5586dbec2ef0), reinterpret_cast<std::shared_ptr<ROOT::Detail::RDF::RNodeBase>*>(0x5586dbec2e40));
ROOT::Internal::RDF::JitDefineHelper<ROOT::Internal::RDF::DefineTypes::RDefineTag>(R_rdf::func29, new const char*[1]{"mutau_colin_p4"}, 1, "var2_to_be_fill", reinterpret_cast<ROOT::Detail::RDF::RLoopManager*>(0x558691fc6830), reinterpret_cast<std::weak_ptr<ROOT::Detail::RDF::RJittedDefine>*>(0x5586dbec4190), reinterpret_cast<ROOT::Internal::RDF::RColumnRegister*>(0x5586dbec4370), reinterpret_cast<std::shared_ptr<ROOT::Detail::RDF::RNodeBase>*>(0x5586dbec4170));
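For anyone who wants to compare the timings across runs, the two Info lines above can be parsed with a short script. This is only a sketch: the regular expressions are written against the exact wording of these two log lines from ROOT 6.28/04 and are an assumption about the format, which may differ in other versions.

```python
import re

# The two log lines quoted above, copied verbatim.
log = (
    "Info in <[ROOT.RDF] Info /usr/src/debug/root/root-6.28.04/tree/dataframe/src/RDFHelpers.cxx:69 "
    "in void ROOT::RDF::RunGraphs(std::vector<RResultHandle>)>: Just-in-time compilation phase "
    "for RunGraphs (198 unique computation graphs) completed in 17.314043 seconds.\n"
    "Info in <[ROOT.RDF] Info /usr/src/debug/root/root-6.28.04/tree/dataframe/src/RDFHelpers.cxx:91 "
    "in void ROOT::RDF::RunGraphs(std::vector<RResultHandle>)>: Finished RunGraphs run "
    "(198 unique computation graphs, 18.5s CPU, 20.2613s elapsed).\n"
)

# Duration reported for the just-in-time compilation phase.
jit_match = re.search(r"completed in ([\d.]+) seconds", log)
# CPU and wall-clock times reported for the RunGraphs run.
run_match = re.search(r"([\d.]+)s CPU, ([\d.]+)s elapsed", log)

jit_seconds = float(jit_match.group(1))
cpu_seconds = float(run_match.group(1))
elapsed_seconds = float(run_match.group(2))

print(f"JIT phase: {jit_seconds:.2f} s")
print(f"RunGraphs run: {cpu_seconds:.2f} s CPU, {elapsed_seconds:.2f} s elapsed")
```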
Cheers, Qichen Dong
ROOT Version: 6.28/04
Platform: Arch Linux
Compiler: gcc 13.1.1