Most efficient way to slice TTree in one variable

Hi @eguiraud,

these files are just debug outputs from the mac root.exe process. I thought they might be useful to you.

In my tests, the single-thread version works correctly. Can you confirm (i.e. can you just comment out the call to ROOT::EnableImplicitMT())?

The fact that I need multi-threading (and at least 10 slices) to reproduce the problem complicates debugging a bit. Work in progress…


Hi @eguiraud,
I tried this already in the past and I got

Maybe you are not seeing this because you are not running over all slices/files?

I ran on both files and, I think, on all slices (for (float eta = 0; eta < 5.0; eta += 0.05)).
It took 14 minutes with a single thread, it required 7.5GB of RAM (lots of TTrees opened at the same time, I guess), and it did not print any error message.

Did the machine where you got the Bus error have at least 8 GB of RAM? (An easy fix if increasing the number of slices causes memory problems: do just 10-15 slices at a time… while I try to figure out what's wrong.)
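To make the "10-15 slices at a time" suggestion concrete, here is a minimal sketch of how the slice list could be built and split into batches. The names `EtaSlice`, `makeSlices` and `makeBatches` are hypothetical and not part of ROOT or of the code discussed in this thread; the sketch only prepares the slice boundaries and assumes the actual filtering and Snapshot calls are run once per batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One slice in eta: [lo, hi). Hypothetical helper type, not a ROOT class.
struct EtaSlice {
    double lo;
    double hi;
};

// Build nSlices contiguous slices of the given width, starting at `start`.
// Edges are computed from an integer index rather than by repeated
// addition, so rounding errors do not accumulate from slice to slice.
std::vector<EtaSlice> makeSlices(double start, double width, std::size_t nSlices)
{
    std::vector<EtaSlice> slices;
    slices.reserve(nSlices);
    for (std::size_t i = 0; i < nSlices; ++i)
        slices.push_back({start + i * width, start + (i + 1) * width});
    return slices;
}

// Split the slices into consecutive batches of at most batchSize elements.
std::vector<std::vector<EtaSlice>> makeBatches(const std::vector<EtaSlice> &slices,
                                               std::size_t batchSize)
{
    std::vector<std::vector<EtaSlice>> batches;
    if (batchSize == 0)
        return batches; // a batch size of zero makes no sense: return nothing
    for (std::size_t first = 0; first < slices.size(); first += batchSize) {
        const std::size_t last = std::min(first + batchSize, slices.size());
        batches.emplace_back(slices.begin() + first, slices.begin() + last);
    }
    return batches;
}
```

For example, `makeBatches(makeSlices(0.0, 0.05, 100), 15)` would describe the eta range from 0 to 5 as seven batches. Processing one batch per pass should keep fewer output trees open at once, at the cost of reading the input once per batch.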


Hi @eguiraud,

any news on this? Without MT I can use it, but it is rather cumbersome: I have very large files that take a long time to process, and in addition for some slices I get

   SysError in <TFile::Flush>: error flushing file 

Let me know if you found a solution :)

Edit: It also gets killed if I run over too much data.

I cannot reproduce the Bus error, the "error flushing file" message, or the job getting killed when running over too much data on my machine, so I would attribute them to hitting some quota limit on lxplus or whichever machine you run on (or I need help to reproduce them). A workaround might be to run on a few slices at a time rather than on all slices at once; see my last post.

I have been looking into the errors:

Error in <TBranchElement::SetAddress>: STL container with fStreamerType: 500
Warning in <TTree::CopyEntries>: The export branch and the import branch do not have the same streamer type. (The branch name is m_vector.)

which I can reproduce (and which occasionally also result in a segfault). The problem is with the branches of type FCS_matchedcellvector, which multi-thread Snapshot does not handle correctly. I don't have a workaround other than turning off IMT for now, but we are actively investigating.

Also: is this Snapshot-based solution any better than your original code, in the end?


Hi @eguiraud,

okay, I guess I will have to deal with these restrictions for now.

Yes, this is definitely much better as the original copyTree method is much slower.

Cheers

The streamer type problem is now https://sft.its.cern.ch/jira/browse/ROOT-10648

Cheers,
Enrico


Hi @eguiraud,

just to let you know that even with one thread the slicing sometimes crashes (mostly shortly before the slicing is complete), and sometimes I even get a stack trace:

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f46090d241c in waitpid () from /lib64/libc.so.6
#1  0x00007f460904ff12 in do_system () from /lib64/libc.so.6
#2  0x00007f4609cc0533 in TUnixSystem::StackTrace() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_96b/ROOT/6.18.04/x86_64-centos7-gcc8-opt/lib/libCore.so
#3  0x00007f4609cc2d84 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_96b/ROOT/6.18.04/x86_64-centos7-gcc8-opt/lib/libCore.so
#4  <signal handler called>
#5  ~TTreeReaderArrayBase (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#6  ~TTreeReaderArray (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#7  TTreeReaderArray<float>::~TTreeReaderArray (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#8  0x00007f460a582bbb in ?? ()
#9  0x00007ffeca0b7618 in ?? ()
#10 0x0000000010478300 in ?? ()
#11 0x0000000010478300 in ?? ()
#12 0x0000000010ae4c08 in ?? ()
#13 0x00007ffeca0b7680 in ?? ()
#14 0x00007f460a58b143 in ?? ()
#15 0x000000001af03e50 in ?? ()
#16 0x00007f460a582b90 in ?? ()
#17 0x00007ffeca0b7670 in ?? ()
#18 0x0000000010ae4c08 in ?? ()
#19 0x0000000010ae4c08 in ?? ()
#20 0x00007f460a59e2e0 in ?? ()
#21 0x0000000010478300 in ?? ()
#22 0x0000000010ae4c08 in ?? ()
#23 0x00007ffeca0b76c0 in ?? ()
#24 0x00007f460a534f71 in ?? ()
#25 0x00007ffeca0b76d0 in ?? ()
#26 0x0000000000000000 in ?? ()
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  ~TTreeReaderArrayBase (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#6  ~TTreeReaderArray (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#7  TTreeReaderArray<float>::~TTreeReaderArray (this=0x10478300, __in_chrg=<optimized out>) at /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.18.04-c767d/x86_64-centos7-gcc8-opt/include/TTreeReaderArray.h:75
#8  0x00007f460a582bbb in ?? ()
#9  0x00007ffeca0b7618 in ?? ()
#10 0x0000000010478300 in ?? ()
#11 0x0000000010478300 in ?? ()
#12 0x0000000010ae4c08 in ?? ()
#13 0x00007ffeca0b7680 in ?? ()
#14 0x00007f460a58b143 in ?? ()
#15 0x000000001af03e50 in ?? ()
#16 0x00007f460a582b90 in ?? ()
#17 0x00007ffeca0b7670 in ?? ()
#18 0x0000000010ae4c08 in ?? ()
#19 0x0000000010ae4c08 in ?? ()
#20 0x00007f460a59e2e0 in ?? ()
#21 0x0000000010478300 in ?? ()
#22 0x0000000010ae4c08 in ?? ()
#23 0x00007ffeca0b76c0 in ?? ()
#24 0x00007f460a534f71 in ?? ()
#25 0x00007ffeca0b76d0 in ?? ()
#26 0x0000000000000000 in ?? ()
===========================================================


Bus error (core dumped)

I guess I cannot produce a reproducer in this case as it sometimes happens and sometimes it does not, but I thought this might be useful for you.

A reproducer that crashes only some of the time is still a reproducer! Or better, a full recipe… it does not happen on my workstation as far as I can tell.

Cheers,
Enrico

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Hi @mark1,
this is just to let you know that thanks to @pcanal the issue with the fStreamerType errors when running a Snapshot on multiple threads has been resolved. The fix is in master and it will be part of the upcoming ROOT release 6.22.

Feel free to open a fresh thread in case you encounter further issues.

Cheers,
Enrico
