Proof process problem with ATLAS D3PD data

Dear PROOF experts,
I am running my analysis job with PROOF in SFrame, and the input files are ATLAS D3PD MC samples. In the last few days, my job has been crashing at the call m_proof->Process( dsets, cycle->GetName(), "", evmax, id->GetNEventsSkip() ). The crash output is below.
What is strange to me is that the job runs fine with the old ATLAS D3PDs, but crashes when I switch to the new D3PDs.
Can you please tell me what the problem is, and give me any hints for further investigation? Many thanks.
Cheers
Haiping

[quote] ( ERROR ) TUnixSystem::Di… : segmentation violation

===========================================================
There was a crash (kSigSegmentationViolation).
This is the entire stack trace of all threads:

#0 0x0000003f9ea9a115 in waitpid () from /lib64/libc.so.6
#1 0x0000003f9ea3c481 in do_system () from /lib64/libc.so.6
#2 0x00002b03ddce5cdb in TUnixSystem::Exec (this=0xc961280,
shellcmd=0xd2148d8 "/afs/cern.ch/sw/lcg/app/releases/ROOT/5.28.00b/x86_64-slc5-gcc43-dbg/root/etc/gdb-backtrace.sh 30184 1>&2")
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/unix/src/TUnixSystem.cxx:2005
#3 0x00002b03ddce4e94 in TUnixSystem::StackTrace (this=0xc961280)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/unix/src/TUnixSystem.cxx:2227
#4 0x00002b03ddce83d0 in TUnixSystem::DispatchSignals (this=0xc961280, sig=kSigSegmentationViolation)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/unix/src/TUnixSystem.cxx:1131
#5 0x00002b03ddce84fa in SigHandler (sig=kSigSegmentationViolation)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/unix/src/TUnixSystem.cxx:352
#6 0x00002b03ddcdd69c in sighandler (sig=11) at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/unix/src/TUnixSystem.cxx:3496
#7 <signal handler called>
#8 0x00002b03ddbae445 in TString::Length (this=0xd187ce0) at include/TString.h:345
#9 0x00002b03ddc1c191 in TString::FillBuffer (this=0xd187ce0, buffer=
0x7fffa2ddd9b0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TString.cxx:991
#10 0x00002b03defa7c86 in TStreamerInfoActions::TConfiguredAction::operator() (this=0x2b03de435110, buffer=…, object=0x7fffa2ddd9b0)
at include/TStreamerInfoActions.h:95
#11 0x00002b03defa027e in TBufferFile::ReadSequence (this=0xd187ce0, sequence=…, obj=0x7fffa2ddd9b0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:3623
#12 0x00002b03ddca0900 in TClass::StreamerTObject (this=0xd20fe40, object=0x7fffa2ddd9b0, b=…)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/meta/src/TClass.cxx:4954
#13 0x00002b03ddc9841b in TClass::StreamerDefault (this=0xd20fe40, object=0x7fffa2ddd9b0, b=…, onfile_class=0x0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/meta/src/TClass.cxx:5017
#14 0x00002b03de09f14e in TClass::Streamer (this=0xd20fe40, obj=0x7fffa2ddd9b0, b=…, onfile_class=0x0) at include/TClass.h:372
#15 0x00002b03defa4db9 in TBufferFile::WriteObjectClass (this=0xd187ce0, actualObjectStart=0x7fffa2ddd9b0, actualClass=0xd20fe40)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:2416
#16 0x00002b03defa4b83 in TBufferFile::WriteObjectAny (this=0xd187ce0, obj=0x7fffa2ddd9b0, ptrClass=0xd0d70e0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:2470
#17 0x00002b03ddc6f9f6 in operator<< (buf=…, obj=0x7fffa2ddd9b0) at include/TBuffer.h:392
#18 0x00002b03ddc75c31 in TList::Streamer (this=0xcc5b600, b=…) at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/cont/src/TList.cxx:1027
#19 0x00002b03ddbd02f5 in TDirectory::CloneObject (this=0x2b03de46b1c0, obj=0xcc5b600, autoadd=true)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TDirectory.cxx:264
#20 0x00002b03ddbecef6 in TObject::Clone (this=0xcc5b600) at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TObject.cxx:204
#21 0x00002b03ddc6f537 in TCollection::Clone (this=0xcc5b600, newname=0x2b03e0ad7718 "")
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/cont/src/TCollection.cxx:134
#22 0x00002b03e09cbd82 in TQueryResult::TQueryResult (this=0xd131960, seqnum=1, opt=0x2b03dd79f336 "", inlist=0xcc5b600,
entries=9223372036854775807, first=0, selec=0xd131728 "BlackHole::BHAnalysis")
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/tree/tree/src/TQueryResult.cxx:58
#23 0x00002b03e2b13faf in TProofQueryResult::TProofQueryResult (this=0xd131960, sn=1, opt=0x2b03dd79f336 "", inlist=0xcc5b600,
ent=9223372036854775807, fst=0, dset=0x0, sel=0xd131728 "BlackHole::BHAnalysis", elist=0x0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofQueryResult.cxx:33
#24 0x00002b03e2b00e8a in TProofLite::MakeQueryResult (this=0xcc8f490, nent=9223372036854775807, opt=0x2b03dd79f336 "", fst=0, dset=0x0,
selec=0xd131728 "BlackHole::BHAnalysis") at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofLite.cxx:906
#25 0x00002b03e2b03e0a in TProofLite::Process (this=0xcc8f490, dset=0x7fffa2ddda70, selector=0xd0d6c68 "BlackHole::BHAnalysis",
option=0x2b03dd79f336 "", nentries=9223372036854775807, first=0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofLite.cxx:1046
#26 0x00002b03dd73ba89 in SCycleController::ExecuteNextCycle (this=0x7fffa2dddf10) at src/SCycleController.cxx:657
#27 0x00002b03dd735e1a in SCycleController::ExecuteAllCycles (this=0x7fffa2dddf10) at src/SCycleController.cxx:322
#28 0x000000000040189c in main (argc=, argv=0x7fffa2dde218) at app/sframe_main.cxx:53

The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.

#8 0x00002b03ddbae445 in TString::Length (this=0xd187ce0) at include/TString.h:345
#9 0x00002b03ddc1c191 in TString::FillBuffer (this=0xd187ce0, buffer=
0x7fffa2ddd9b0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TString.cxx:991
#10 0x00002b03defa7c86 in TStreamerInfoActions::TConfiguredAction::operator() (this=0x2b03de435110, buffer=…, object=0x7fffa2ddd9b0)
at include/TStreamerInfoActions.h:95
#11 0x00002b03defa027e in TBufferFile::ReadSequence (this=0xd187ce0, sequence=…, obj=0x7fffa2ddd9b0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:3623
#12 0x00002b03ddca0900 in TClass::StreamerTObject (this=0xd20fe40, object=0x7fffa2ddd9b0, b=…)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/meta/src/TClass.cxx:4954
#13 0x00002b03ddc9841b in TClass::StreamerDefault (this=0xd20fe40, object=0x7fffa2ddd9b0, b=…, onfile_class=0x0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/meta/src/TClass.cxx:5017
#14 0x00002b03de09f14e in TClass::Streamer (this=0xd20fe40, obj=0x7fffa2ddd9b0, b=…, onfile_class=0x0) at include/TClass.h:372
#15 0x00002b03defa4db9 in TBufferFile::WriteObjectClass (this=0xd187ce0, actualObjectStart=0x7fffa2ddd9b0, actualClass=0xd20fe40)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:2416
#16 0x00002b03defa4b83 in TBufferFile::WriteObjectAny (this=0xd187ce0, obj=0x7fffa2ddd9b0, ptrClass=0xd0d70e0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/io/io/src/TBufferFile.cxx:2470
#17 0x00002b03ddc6f9f6 in operator<< (buf=…, obj=0x7fffa2ddd9b0) at include/TBuffer.h:392
#18 0x00002b03ddc75c31 in TList::Streamer (this=0xcc5b600, b=…) at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/cont/src/TList.cxx:1027
#19 0x00002b03ddbd02f5 in TDirectory::CloneObject (this=0x2b03de46b1c0, obj=0xcc5b600, autoadd=true)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TDirectory.cxx:264
#20 0x00002b03ddbecef6 in TObject::Clone (this=0xcc5b600) at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/base/src/TObject.cxx:204
#21 0x00002b03ddc6f537 in TCollection::Clone (this=0xcc5b600, newname=0x2b03e0ad7718 "")
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/core/cont/src/TCollection.cxx:134
#22 0x00002b03e09cbd82 in TQueryResult::TQueryResult (this=0xd131960, seqnum=1, opt=0x2b03dd79f336 "", inlist=0xcc5b600,
entries=9223372036854775807, first=0, selec=0xd131728 "BlackHole::BHAnalysis")
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/tree/tree/src/TQueryResult.cxx:58
#23 0x00002b03e2b13faf in TProofQueryResult::TProofQueryResult (this=0xd131960, sn=1, opt=0x2b03dd79f336 "", inlist=0xcc5b600,
ent=9223372036854775807, fst=0, dset=0x0, sel=0xd131728 "BlackHole::BHAnalysis", elist=0x0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofQueryResult.cxx:33
#24 0x00002b03e2b00e8a in TProofLite::MakeQueryResult (this=0xcc8f490, nent=9223372036854775807, opt=0x2b03dd79f336 "", fst=0, dset=0x0,
selec=0xd131728 "BlackHole::BHAnalysis") at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofLite.cxx:906
#25 0x00002b03e2b03e0a in TProofLite::Process (this=0xcc8f490, dset=0x7fffa2ddda70, selector=0xd0d6c68 "BlackHole::BHAnalysis",
option=0x2b03dd79f336 "", nentries=9223372036854775807, first=0)
at /build/bellenot/SPI/x86_64-slc5-gcc43-dbg/root/proof/proof/src/TProofLite.cxx:1046
#26 0x00002b03dd73ba89 in SCycleController::ExecuteNextCycle (this=0x7fffa2dddf10) at src/SCycleController.cxx:657
#27 0x00002b03dd735e1a in SCycleController::ExecuteAllCycles (this=0x7fffa2dddf10) at src/SCycleController.cxx:322
#28 0x000000000040189c in main (argc=, argv=0x7fffa2dde218) at app/sframe_main.cxx:53

[/quote]

Hi Haiping,

You will find a thread about this on the ATLAS PAT dev mailing list.
It seems that these D3PDs were produced with AutoFlush disabled.

Deactivating the TreeCache is a workaround.
proof->SetParameter( "PROOF_UseTreeCache", ( Int_t ) 0 );

I think a fix is already in the 5.30 trunk.

cheers, carlos

Hi,

This is to confirm that the fix will be in the forthcoming 5.30 and 5.28e releases, and that the workaround is indeed the one indicated.

Gerri Ganis