Error while using RDataFrame with spark cluster (analytix)

wandering_particle · August 11, 2023, 9:17am

I tried to use a distributed ROOT RDataFrame in SWAN with a Spark cluster.
These are the steps I followed:

Log in to SWAN and select the analytix cluster.
Once in the Jupyter notebook, click on the star button to connect to the Spark cluster, including the “EOS system” option and the latest software stack.
Run the code snippet below:

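import ROOT

# Spark backend of the distributed RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame

# `sc` is the SparkContext made available after connecting the notebook to the cluster
df = RDataFrame("Events",
                "root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root",
                npartitions=2,
                sparkcontext=sc)

# Triggers the distributed computation
df.Count().GetValue()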

I get the following error:

File "/cvmfs/sft.cern.ch/lcg/views/LCG_103swan/x86_64-centos7-gcc11-opt/lib/ROOT/_pythonization/_tmva/__init__.py", line 25, in <module>
    hasRDF = gSystem.GetFromPipe("root-config --has-dataframe") == "yes"
ValueError: TString TSystem::GetFromPipe(const char* command) =>
    ValueError: nullptr result where temporary expected

I posted this incident on the CERN service portal and I got a comment from an expert:

I found out that the notebook runs fine if you select, when you are about to start your SWAN session, the software stack called “101” – it is in the list of “Other releases” if you scroll down. That LCG release has ROOT 6.24, so it seems that the issue was introduced in newer ROOT releases

I would like to draw attention to this problem, which might have been introduced in the latest ROOT release. I hope the information is enough.

Regards,
Nilima.

bellenot · August 11, 2023, 9:19am

Maybe @vpadulan or @eguiraud can take a look.

eguiraud · August 11, 2023, 2:36pm

Hi,

the error does not have much to do with RDF: when PyROOT loads ROOT’s pythonizations, at some point it also loads pythonizations for TMVA (ROOT’s ML module).

While doing so it calls the line shown in the error message, and something goes wrong.

If I try running that same line when ssh’d to LXPLUS, using the same environment as you (source /cvmfs/sft.cern.ch/lcg/views/LCG_103swan/x86_64-centos7-gcc11-opt/setup.sh), I can execute that line without errors.

The only problem I encounter is that I have to call ROOT.gROOT.SetBatch() before I do anything, or the runtime complains with an error about missing X11 (that’s because I’m ssh’d into LXPLUS without X11 forwarding, so I’m not sure it’s relevant).
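
The check described above can be sketched roughly as follows. This is only an illustration: the function name `run_tmva_check` is made up, and the `ImportError` guard is an addition so that the snippet also runs where ROOT is not installed. The call to `GetFromPipe` mirrors the line quoted in the error message.

```python
def run_tmva_check():
    """Re-run the line from the TMVA pythonization by hand.

    Returns None if PyROOT cannot be imported at all, otherwise
    True/False depending on what `root-config --has-dataframe` reports.
    """
    try:
        import ROOT
    except ImportError:
        # ROOT is not available in this Python environment
        return None

    # Avoid X11-related complaints in sessions without a display
    ROOT.gROOT.SetBatch(True)

    # Same expression as in ROOT/_pythonization/_tmva/__init__.py, line 25
    return ROOT.gSystem.GetFromPipe("root-config --has-dataframe") == "yes"


print(run_tmva_check())
```

Running this both in the SWAN notebook and in an ssh session with the same LCG view sourced would show whether the line behaves differently in the two places.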

In short I cannot tell what is causing that error. We would need someone to take a deeper look within the same SWAN setup you have. That could be @vpadulan when he’s back from holidays or if he can spare some time maybe @etejedor .

You can also try posting at https://swan-community.web.cern.ch in case someone encountered the same problem.

Again, given the information provided, it does not seem to be RDF-related but rather an issue with the environment (one that cannot be reproduced simply by sourcing that same environment in an ssh session).
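
One possible way to start comparing the two environments is to run the same command outside of PyROOT in each of them. The sketch below assumes that `root-config` is expected to be on `PATH` once the setup script has been sourced; the helper name `check_root_config` is hypothetical, and nothing here has been verified as the actual cause of the error.

```python
import shutil
import subprocess


def check_root_config(flag="--has-dataframe"):
    """Report what the current environment sees for `root-config <flag>`."""
    exe = shutil.which("root-config")
    if exe is None:
        # root-config is not reachable through PATH in this environment
        return {"found": False, "path": None, "returncode": None, "output": None}

    # Run the command directly, without going through PyROOT
    proc = subprocess.run([exe, flag], capture_output=True, text=True)
    return {
        "found": True,
        "path": exe,
        "returncode": proc.returncode,
        "output": proc.stdout.strip(),
    }


print(check_root_config())
```

If the output differs between the SWAN notebook and the ssh session, that difference would be a useful detail to include when reporting the problem.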

Sorry I cannot be of more help,
Enrico
