TTree sorting on lxplus batch mode

Dear ROOT community,

I am attempting to sort a TTree with the following code:

void sortS(int runid){
TFile * f = new TFile("/eos/PATH_TO_MY_FILE_ON_MY_EOS.root","READ");

TTree *t;
f->GetObject("C_data",t);
TFile * f_out = new TFile(Form("/eos/PATH_TO_OUTPUT_ALSO_ON_MY_EOS"),"RECREATE");

// Create index on the trigID branch
Int_t nb_idx = t->BuildIndex("trigID","0");
TTreeIndex *att_index = (TTreeIndex*)t->GetTreeIndex();
TTree* to = (TTree*)t->CloneTree(0); // Problematic line

to->SetName("C_data");
to->SetDirectory(f_out);
// Loop on the entries of t in index order and fill to
//for( Long64_t i = 0; i < att_index->GetN(); i++ ) {
//Running on 10k events for starters because it's faster
for( Long64_t i = 0; i < 10000; i++ ) {
    t->GetEntry( att_index->GetIndex()[i]) ;
    to->Fill();
}

f_out->WriteTObject(to);
f_out->Close();

}

Unfortunately, while it does work on my machine (ArchLinux, ROOT 6.30/04 from pacman), it crashes when started in batch mode on lxplus9.

I use the right shebang (#!/bin/bash) so I’m not sure what is wrong?

Cheers,
EK


ROOT Version: 6.30/06
Platform: lxplus, Red Hat Enterprise Linux release 9.3 (Plow)
Compiler: Not Provided


Hi,

What is the error you are facing?

Best,
D

I just get a Segmentation violation:

*** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:

#0 0x00001497cb7182ca in wait4 () from /lib64/libc.so.6
#1 0x00001497cb661953 in do_system () from /lib64/libc.so.6
#2 0x00001497cc106de4 in TUnixSystem::StackTrace() () from /usr/lib64/root/libCore.so.6.30
#3 0x00001497cc103d05 in TUnixSystem::DispatchSignals(ESignals) () from /usr/lib64/root/libCore.so.6.30
#4 <signal handler called>
#5 0x00001497ca8eada0 in TBufferFile::WriteFastArray(long long const*, int) () from /usr/lib64/root/libRIO.so
#6 0x00001497b7f80c27 in TTreeIndex::Streamer(TBuffer&) () from /usr/lib64/root/libTreePlayer.so.6.30.06
#7 0x00001497ca8f0db2 in TBufferFile::WriteObjectClass(void const*, TClass const*, bool) () from /usr/lib64/root/libRIO.so
#8 0x00001497ca8f81ec in TBufferIO::WriteObjectAny(void const*, TClass const*, bool) () from /usr/lib64/root/libRIO.so
#9 0x00001497ca8f0874 in TBufferFile::WriteFastArray(void**, TClass const*, int, bool, TMemberStreamer*) () from /usr/lib64/root/libRIO.so
#10 0x00001497cab5cd19 in int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) () from /usr/lib64/root/libRIO.so
#11 0x00001497ca9ad234 in TStreamerInfoActions::GenericWriteAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) () from /usr/lib64/root/libRIO.so
#12 0x00001497ca8eb2fd in TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*) () from /usr/lib64/root/libRIO.so
#13 0x00001497ca8f17e1 in TBufferFile::WriteClassBuffer(TClass const*, void*) () from /usr/lib64/root/libRIO.so
#14 0x00001497ca942cb9 in TDirectoryFile::CloneObject(TObject const*, bool) () from /usr/lib64/root/libRIO.so
#15 0x00001497cbfe0c51 in TNamed::Clone(char const*) const () from /usr/lib64/root/libCore.so.6.30
#16 0x00001497b8180524 in TTree::CloneTree(long long, char const*) () from /usr/lib64/root/libTree.so.6.30.06
#17 0x00001497cbc2e255 in ?? ()
#18 0x00007ffc45ad1070 in ?? ()
#19 0x00001497cbc4e1f9 in ?? ()
#20 0x00001497cc00b5e0 in ?? () from /usr/lib64/root/libCore.so.6.30
#21 0x00001497cbc2e410 in ?? ()
#22 0x00001497ca95da80 in ?? () from /usr/lib64/root/libRIO.so
#23 0x00001497cbc4e23b in ?? ()
#24 0x0000000000000000 in ?? ()

The lines below might hint at the cause of the crash. If you see question
marks as part of the stack trace, try to recompile with debugging information
enabled and export CLING_DEBUG=1 environment variable before running.
You may get help by asking at the ROOT forum https://root.cern/forum
preferably using the command (.forum bug) in the ROOT prompt.
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern/bugs or (preferably) using the command (.gh bug) in
the ROOT prompt. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else

To sort out the issue with your ROOT code on lxplus, make sure you’re using the same ROOT version as on your local setup. Check the environment settings on lxplus and use debugging tools if needed.

If you post the ROOT file somewhere, I can try to find out what’s going on.

Here is a file, it’s around 500MB though, thanks a lot!

https://cernbox.cern.ch/s/ZniP9eBaA78eRWA

Ok, here is the macro I used:

#include "TFile.h"
#include "TTreeIndex.h"
#include "TTree.h"

void sortS(){
TFile * f = new TFile("/tmp/Run10_list.root", "READ");

TTree *t;
f->GetObject("C_data",t);
TFile * f_out = new TFile(Form("/tmp/output.root"),"RECREATE");

// Create index on the trigID branch
Int_t nb_idx = t->BuildIndex("trigID","0");
TTreeIndex *att_index = (TTreeIndex*)t->GetTreeIndex();
TTree* to = (TTree*)t->CloneTree(0); // Problematic line

to->SetName("C_data");
to->SetDirectory(f_out);
// Loop on the entries of t in index order and fill to
//for( Long64_t i = 0; i < att_index->GetN(); i++ ) {
//Running on 10k events for starters because it's faster
for( Long64_t i = 0; i < 10000; i++ ) {
    t->GetEntry( att_index->GetIndex()[i]) ;
    to->Fill();
}

f_out->WriteTObject(to);
f_out->Close();

}

With it, I get the following crash using ROOT master, when calling TTree* to = (TTree*)t->CloneTree(0):

Fatal in <TBufferFile::WriteFastArray>: Not enough space left in the buffer (1GB limit). 522027660 elements is greater than the max left of 268317223
aborting

The reason is that when you clone the TTree, even with entries=0, the TTreeIndex you built is cloned (streamed) as well, and it is too big to be written into a ROOT buffer in one go. The TTreeIndex streamer calls:
R__b.WriteFastArray(fIndexValues, fN);
fIndexValues holds 64-bit values, and with 522027660 elements it exceeds the maximum size of 1GB per object in ROOT. See ROOT GitHub issue #6734, "Overcome 1GB size limit for IO buffers" (root-project/root).
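
To put numbers on this, here is a small ROOT-free sketch of the arithmetic, using only the two counts printed in the error message. The helper names indexPayloadBytes and fitsInBuffer are made up for illustration, and the 8 bytes per value is simply the size of a 64-bit integer:

```cpp
#include <cstdint>

// Bytes needed to stream nValues index values, assuming 8 bytes per 64-bit value.
constexpr std::int64_t indexPayloadBytes(std::int64_t nValues) {
    return nValues * static_cast<std::int64_t>(sizeof(std::int64_t));
}

// True if nValues fit into the number of elements the buffer can still accept.
constexpr bool fitsInBuffer(std::int64_t nValues, std::int64_t maxElementsLeft) {
    return nValues <= maxElementsLeft;
}

// Both numbers are copied from the "Fatal in <TBufferFile::WriteFastArray>" message.
constexpr std::int64_t kRequestedElements = 522027660;
constexpr std::int64_t kMaxElementsLeft = 268317223;

// The index values alone need about 4.18e9 bytes.
static_assert(indexPayloadBytes(kRequestedElements) == 4176221280LL,
              "522027660 values of 8 bytes each");
// The request is larger than what the buffer reported as still available.
static_assert(!fitsInBuffer(kRequestedElements, kMaxElementsLeft),
              "the index does not fit in the remaining buffer space");
```

So the index values alone are roughly 4.2 GB, which is why the write is refused no matter how few entries the clone itself contains.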

So the solution is easy:
Call t->SetTreeIndex(0) before t->CloneTree(0),
or clone the Tree before building the index :wink:
Either way, no crash will happen.

TTree* to = (TTree*)t->CloneTree(0); // No longer problematic line
Int_t nb_idx = t->BuildIndex("trigID","0");
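
In case the overall pattern is unclear: what the macro does with the index is essentially to build a list of entry numbers ordered by the key, then read the entries in that order and fill the output. Below is a ROOT-free sketch of that idea. The Entry struct and the functions buildSortedIndex and copySorted are hypothetical stand-ins, not ROOT API, and the use of std::stable_sort is my own choice for the sketch rather than a statement about how TTreeIndex orders equal keys:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one tree entry; only the sort key matters here.
struct Entry {
    std::int64_t trigID;
    double payload;
};

// Returns the entry numbers ordered by ascending trigID.
std::vector<std::size_t> buildSortedIndex(const std::vector<Entry>& entries) {
    std::vector<std::size_t> index(entries.size());
    std::iota(index.begin(), index.end(), 0);  // 0, 1, 2, ...
    std::stable_sort(index.begin(), index.end(),
                     [&entries](std::size_t a, std::size_t b) {
                         return entries[a].trigID < entries[b].trigID;
                     });
    return index;
}

// Copies at most maxEntries entries in sorted order, never reading past the index.
std::vector<Entry> copySorted(const std::vector<Entry>& entries, std::size_t maxEntries) {
    const std::vector<std::size_t> index = buildSortedIndex(entries);
    const std::size_t n = std::min(maxEntries, index.size());
    std::vector<Entry> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(entries[index[i]]);
    }
    return out;
}
```

Note the std::min bound in copySorted: a hard-coded limit such as 10000 is only safe while the input has at least that many entries.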

@pcanal if CloneTree is called with nentries different from -1, maybe the tree index should be internally disabled before and reenabled after the call to ::Clone()?
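
To make the suggestion concrete, here is a rough sketch of the detach-and-restore idea as a scope guard. It deliberately uses hypothetical stand-in types (FakeTree, FakeIndex, IndexDetachGuard), not ROOT classes, and says nothing about how this would actually be implemented inside TTree:

```cpp
// Hypothetical stand-ins; these are not ROOT classes.
struct FakeIndex {
    long long nValues;
};

struct FakeTree {
    FakeIndex* index = nullptr;
    // Mimics a clone that would also stream the attached index, if there is one.
    bool cloneWouldCopyIndex() const { return index != nullptr; }
};

// Detaches the index on construction and re-attaches it on destruction.
class IndexDetachGuard {
public:
    explicit IndexDetachGuard(FakeTree& tree) : tree_(tree), saved_(tree.index) {
        tree_.index = nullptr;
    }
    ~IndexDetachGuard() { tree_.index = saved_; }
    IndexDetachGuard(const IndexDetachGuard&) = delete;
    IndexDetachGuard& operator=(const IndexDetachGuard&) = delete;

private:
    FakeTree& tree_;
    FakeIndex* saved_;
};

// Runs the "clone" with the index detached and reports whether the index was visible to it.
bool cloneWithoutIndex(FakeTree& tree) {
    IndexDetachGuard guard(tree);
    return tree.cloneWouldCopyIndex();
}
```

With a guard like this the index is never seen by the clone, and the original tree gets it back as soon as the guard goes out of scope. On the user side, the same effect is what calling t->SetTreeIndex(0) before t->CloneTree(0) achieves.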
