TProofOutputFile merging problems


ROOT Version: 6.08/04
Platform: CentOS 7
Compiler: gcc 4.8.5


Hi, I am trying to use the TProofOutputFile class to get the data of only a few variables (of basic types), for randomly sampled entries of a dataset on the PROOF cluster, into a single tree in a single file on the client machine. For this purpose I first create an entry list following the approach of “h1analysis.C” from the tutorials. Then I set the created entry list as input to the next Process() call. In this call, I use TProofOutputFile as shown in the “ProofNtuple.C” tutorial. The entry-list creation part works fine. However, in the second step, during merging, I get errors of the type

TProofOutputFile::AddFile: error from TFileMerger::AddFile(root://mace03.barc.gov.in//data4/EAS_LIB/TEST-POOL/proofbox/maceuser1/session-mace02-1539158141-17890/worker-0.90-mace03-1539157931-13622//SplitMergeTrain.root)

The output file thus generated on the client has an unexpected number of entries (sometimes the tree is even missing).
“ProofNtuple.C” itself runs fine on the cluster and is able to generate the merged output file without errors, so the cluster setup is most likely not the issue. What could be the reason?
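
For reference, the two-step flow on the client looks roughly like this (a minimal sketch; proof is the TProof handle, the dataset name and the entry-list name “elist” are illustrative, the selectors are the ones attached below):

proof->Process("myDataset", "SelSplitData.C+");    // step 1: build the entry list
TEntryList *el = (TEntryList *) proof->GetOutputList()->FindObject("elist");
proof->AddInput(el);                               // pass it to the next run
proof->Process("myDataset", "SelGetSplitData.C+"); // step 2: TProofOutputFile run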
I attach below the parts of the code and the selectors relevant to the problem, as well as the master and worker logs. One thing I noticed is that whenever the output tree on a worker has a non-zero number of entries (indicated by the info about the “cleanup flag” in the logs), the corresponding file merger reports an error.
SelGetSplitData.C (10.9 KB)
SelSplitData.h (3.2 KB)
SelGetSplitData.h (5.1 KB)
SelSplitData.C (8.6 KB)
MinWorking.C (2.8 KB)
worker-0.94-mace03.txt (167.9 KB)
worker-0.74-mace03.txt (261.6 KB)
master.txt (412.2 KB)

Sorry for so many files.
For the GetSplitDataTree function, the variable string should be given in the form “<branch1_to_be_read_name>/<branch1_type>:<branch2_to_be_read_name>/<branch2_type>:…”, where the branch types can be ‘D’, ‘F’, ‘I’, ‘i’, ‘S’ and ‘s’, corresponding to double, float, integer, unsigned integer, short and unsigned short respectively. <branch_to_be_read_name> is the variable for which event data needs to be retrieved on the client.
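
A minimal sketch of how such a specification string can be tokenized (illustrative only; the actual parsing is inside the attached GetSplitDataTree):

const char *vars = "energy/D:theta/F:nhit/I"; // hypothetical example spec
TString spec(vars);
TObjArray *toks = spec.Tokenize(":");
TIter next(toks);
while (TObjString *os = (TObjString *) next()) {
   TString b = os->GetString();
   Ssiz_t slash = b.Last('/');
   TString name = b(0, slash);        // branch to be read
   char type = b[slash + 1];          // 'D','F','I','i','S' or 's'
   Printf("branch %s, type %c", name.Data(), type);
}
delete toks;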

Hi,
Sorry, I uploaded the wrong master.log and currently do not have access to the right one.
However, the master log had the following error:

TNetXNGFile::Open error : cannot open file "<file url>" : server responded with error code [3010] : read/write to the path is disallowed

The above is obviously not the exact error, but I have included everything important. I can’t understand why the path is not allowed. The worker files are created in the “proofbox” of the PROOF session, and I have exported that path in my xrootd config file with the statement

all.export /home/<user>/easlib/pool/proofbox

/home/<user>/easlib is a soft link in the home directory that points to the top directory of the “sandbox” (pool).
I am starting xrootd on each machine as

xrootd -c /home/<user>/xpd_proof.cf -n CT -b -l xrootd.log

Does having a different name for the xrootd session (CT) make a difference, since <user> is not “CT”?

Hi,
Solved the problem. Basically, the symbolic link used in “all.export” in the xrootd config file was the problem.
I could open the remote file from the master using

TFile *f = TFile :: Open("root://ser.ver.in//home/< user >/symboliclink/rest-of-the-path-to-proofbox/File.root")

However, server error 3010, saying that opening the path is disallowed, was returned when doing

TFile *f = TFile :: Open("root://ser.ver.in//absolute-actual-path-to-proofbox/File.root")

on the master. I changed the line

all.export /home/<user>/easlib/pool/proofbox

in xrootd config files on all the nodes to

all.export /absolute-path-to-proofbox/pool/proofbox

and it worked.

Many thanks for sharing the solution with us! @ganis perhaps that’s interesting to you.

Hi,
The merging problem is solved. However, there is now another problem. The code worked for input datasets with a total size of ~100 GB, with each worker file ~1 GB in size. However, when the same was run on an input dataset of ~1 TB, with each worker file ~10 GB, the Process() part went smoothly, but when 99% of the processing was complete, the code hung. I also found out that it had crashed on the worker nodes. The same problem occurred even in “dataset creation” mode. I searched for similar threads on the forum and found quite a few, but no definitive answer. The cluster has 2 nodes, each with 128 GB RAM and 128 GB swap. One node runs 1 master + 47 workers, while the other runs 48 workers. Is the crash due to insufficient RAM (assuming 48 files, each ~10 GB in size, are opened simultaneously on each node)? How do I control the RAM usage in a PROOF session?
Also, my input datasets are generated by a previous call to proof->Process() using TProofOutputFile in dataset-creation mode. In a normal ROOT session, when the size of the tree crosses 2 GB during writing, ROOT automatically switches the file from file1.root to file2.root. Is there any such possibility with TProofOutputFile (apart from manually switching tree files on the workers and making TProofOutputFile adopt them)? If I understand correctly, even this way TProofOutputFile may not be able to keep track of multiple files generated on the workers.
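
(For what it is worth, my understanding is that this auto-switch is controlled by the static limit TTree::SetMaxTreeSize, so raising it would keep the tree in a single file per worker; the value below is only an example.)

TTree::SetMaxTreeSize(200000000000LL); // e.g. 200 GB before TTree::ChangeFile kicks in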

@ganis, can you comment?

Dear Chinmay,

No, as it is now, this helper class does not deal with ROOT’s automatic file splitting during writing. The best approach is to scan the sandbox in SlaveTerminate for the produced files and only then add one TProofOutputFile for each file.
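
In code, this could look roughly as follows (a sketch only; the file-name pattern is taken from the error message above, and the “M” merge option and the AdoptFile usage should be checked against your setup):

void SelGetSplitData::SlaveTerminate()
{
   void *dirp = gSystem->OpenDirectory(gSystem->WorkingDirectory());
   const char *ent = 0;
   while ((ent = gSystem->GetDirEntry(dirp))) {
      TString fn(ent);
      if (!fn.BeginsWith("SplitMergeTrain") || !fn.EndsWith(".root")) continue;
      TProofOutputFile *pof = new TProofOutputFile(fn, "M"); // "M" = merge on the client
      TFile *f = TFile::Open(fn, "READ");
      if (f && !f->IsZombie()) {
         pof->AdoptFile(f);  // point the helper at the already-written file
         f->Close();
         fOutput->Add(pof);  // one TProofOutputFile per produced file
      }
   }
   gSystem->FreeDirectory(dirp);
}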

It may be. I suggest that you make a few trial runs (a dataset of 100 GB, a dataset of 200 GB), print out the RAM usage, and extrapolate from that (look for the TStatus object in the output list and print it).
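
For example, on the client after Process() (assuming the usual object name “PROOF_Status”):

TStatus *st = (TStatus *) proof->GetOutputList()->FindObject("PROOF_Status");
if (st) st->Print();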

Also, on which kind of storage is your dataset (SSD, NAS, …)? Are you sure you need 96 workers to saturate the bandwidth from two physical nodes? What happens with half the workers?
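
For example, half of the workers can be requested at session start with the “workers=N” option (the master host below is taken from your logs):

TProof *p = TProof::Open("mace02", "workers=48");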

G Ganis

Hi…
Sorry for the very, very late reply. I had to get that thing done first somehow, so I was busy with that.
Currently I am dividing the large datasets (~1 TB in size) into smaller datasets of a few files (each ~10 GB in size)
by generating a TFileCollection and then processing it, as sketched below. For now this is working. I will look into the details after some time.
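
Roughly what I do now, as a sketch (the chunk size and dataset names are illustrative; allFiles is a TList of TObjString holding the file URLs):

Int_t nPerSet = 10, n = 0, iset = 0;
TFileCollection *fc = new TFileCollection(TString::Format("chunk%d", iset));
TIter next(allFiles);
while (TObjString *os = (TObjString *) next()) {
   fc->Add(os->GetName());                 // one TFileInfo per input file
   if (++n == nPerSet) {
      fc->Update();                        // fill the summary information
      proof->RegisterDataSet(TString::Format("EAS_chunk%d", iset), fc, "O");
      fc = new TFileCollection(TString::Format("chunk%d", ++iset));
      n = 0;
   }
}
if (n > 0) {
   fc->Update();
   proof->RegisterDataSet(TString::Format("EAS_chunk%d", iset), fc, "O");
}

Each registered chunk is then processed with a separate proof->Process("EAS_chunkN", …) call.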

I don’t really know / understand these things. I have just set up a small ssh cluster using one of the standard configuration setups shown in the online PROOF documentation.
Thanks for the replies.
