After finally getting my TSelector working properly with PROOF, I’m running into some strange errors. My code seems to work great most of the time, but occasionally (and always when running on large chains) I get the following error, and the logs only show a segmentation fault with no explanation.
Info in <TProofLite::MarkBad>:
+++ Message from master at Darwin : marking 0.0-Darwin-1252752655-22839:-1 (0.0) as bad
+++ Reason: undefined message in TProof::CollectInputFrom(...)
I’m also getting segmentation faults every time I exit ROOT after running my PROOF code, but I’m not sure that’s related. Is there any way to solve this? How can I see where the error is coming from, and why is the error so random?
I’m using ROOT 5.22 on a 64-bit Ubuntu system with gcc 4.4.
The output isn’t big at all: one histogram and one printf().
It crashes on large inputs, around 0.5 GB.
The only trace I get is:
(no debugging symbols found)...done.
(no debugging symbols found)...done.
0x00007f459193ea8e in waitpid () from /lib/libc.so.6
error detected on stdin
I cannot say much based on the error messages you have posted.
To help you, I need to know more about your setup and, if possible, have a look at your selector.
From the first report it seems that you are using PROOF-Lite, which was first released in 5.22. Could you confirm that? How many cores do you have? If you are not using PROOF-Lite, could you give more details about your PROOF cluster?
Then, could you post your selector and describe the input (a TChain? How many files? Read from where: local disk or a remote server?)
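If it helps in gathering this information, here is a minimal sketch of a macro that prints the number of workers and the size of the chain, runs the selector, and then tries to display and save the session logs. The tree name "T", the file pattern, the selector name MySelector.C and the log file name are placeholders that do not come from your report, and I am assuming that an empty URL starts PROOF-Lite and that a log manager is available for the session; both may depend on the ROOT version.

```cpp
// proofdiag.C : hypothetical diagnostic macro, run with "root -l proofdiag.C".
// All names marked "placeholder" are assumptions, not taken from this thread.
#include "TChain.h"
#include "TCollection.h"
#include "TProof.h"
#include "TProofLog.h"
#include "TProofMgr.h"
#include "TString.h"

void proofdiag()
{
   // Assumption: an empty URL opens a local PROOF-Lite session.
   TProof *p = TProof::Open("");
   if (!p || !p->IsValid()) {
      Printf("could not start the PROOF session");
      return;
   }
   Printf("active workers: %d", p->GetParallel());

   // Describe the input: number of files and total entries.
   TChain chain("T");            // placeholder tree name
   chain.Add("data/*.root");     // placeholder file pattern
   Printf("files: %d, entries: %lld",
          chain.GetListOfFiles()->GetEntries(), chain.GetEntries());

   // Run the selector through PROOF.
   chain.SetProof();
   chain.Process("MySelector.C+");   // placeholder selector

   // Try to fetch the worker logs; the manager may be missing in some setups.
   TProofMgr *mgr = p->GetManager();
   TProofLog *pl = mgr ? mgr->GetSessionLogs() : 0;
   if (pl) {
      pl->Display("*");                      // last lines from every worker
      pl->Save("*", "proof-session.log");    // placeholder output file
   } else {
      Printf("no session logs available through the manager");
   }
}
```

The printed worker count and chain size, together with the saved log file, would answer most of the questions above.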