ROOT problem with hprod.C: segmentation fault, bizarre dump

I seem to be having a problem with rootn.exe.

I have two machines running root.

Now I am getting a very weird problem, which is why I am posting here.

The first machine is a P4 running Fedora Core 2.
The second machine is dual PIII running Fedora Core 2.

Both are configured with the same command, which was:
./configure linux --prefix=/usr/local

Both seemed to work. But recently, when I was testing shared memory using
root, I tried hprod.C. On one of the machines, the P4, the rootn.exe program crashed
with a segmentation fault, while on the other machine root ran smoothly and hcons.C
ran smoothly too.

On the P4, hprod.C crashed on this line: mfile->Update.

Does anyone know what might have caused this problem? The hprod.C files are the
same. I figure that since it ran on one machine and not the other, it must be something in
the install that is different. Any hints would be greatly appreciated.

Edit: I apologize, I just found out that the system that runs root fine was originally
an RH 7.3 box and was upgraded to FC2. The root running on it was
originally compiled while it was still an RH 7.3 box. I am attempting to rebuild the code
to see whether being an FC2 box now causes it to not run properly.

Also see 2nd post on some recent observations.

Quick reply here.

I just downloaded a few builds of root.

I am not sure about this, so I apologize if I am wrong,
but this is something I just observed.

I downloaded three binary versions of root:

Intel x86 Linux for Redhat 9.0.93 (Severn) and gcc 3.3, version 3.10/02
Intel x86 Linux for Redhat 9.0 and gcc 3.2.2, version 3.10/02
Intel x86 Linux for Redhat 7.3 and gcc 3.2, version 3.10/02

On a Fedora Core 1 or Core 2 system, if you run root from the binaries for Intel x86 Linux for Redhat 9.0.93 (Severn) and gcc 3.3, version 3.10/02, then starting rootn.exe and doing .x hprod.C gives a segmentation fault.

However, the other two sets of binaries, for Redhat 9.0 and 7.3, run correctly. I tried this on multiple Fedora systems that are available to me,
and each gave the same results.

Which leads me to ask: is there something that needs to be done on Fedora Core systems in order to run the Redhat 9.0.93 (RH10 / Fedora) binaries, or to run code compiled from source?

I should add that this crash seems to be isolated to using TMapFile and
accessing shared memory through root.

I can provide any other information that might be useful to help me diagnose this problem.
I appreciate any help that is offered.

I don’t think that you can run with binaries generated on RH7.3
under Fedora Core1 or 2. With version 4.00 of ROOT we provide
binaries for Fedora Core1. see:
root.cern.ch/root/Version400.html

The situation with Linux is becoming a big mess with no signs of improvement. When you meet problems of this type, you should install ROOT from source; otherwise you will always face problems of incompatibilities with the basic system libraries (glibc in particular).

Rene
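
One rough way to see which system libraries a given ROOT binary actually resolves against is to filter the output of ldd for the glibc components. The following is only a minimal sketch: the helper name glibc_libs and the binary path in the usage comment are assumptions for illustration, not something taken from the posts above.

```shell
# Hypothetical helper: read ldd output on standard input and keep only the
# glibc-related libraries (libc, libm, libpthread, libdl) that were resolved
# to a file, printing "name path" for each of them.
glibc_libs() {
    awk '$2 == "=>" && $3 ~ /^\// && $1 ~ /^lib(c|m|pthread|dl)\.so/ { print $1, $3 }'
}

# Example usage (the path is an assumption based on --prefix=/usr/local):
# ldd /usr/local/bin/rootn.exe | glibc_libs
```

Running this on both machines and comparing the output with the installed glibc package (for example via rpm -q glibc) may show whether the two installs really see the same system libraries.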

Thank you for the reply,

An interesting thing:

I downloaded the 4.00/04 source and the binaries for Fedora.

Using the Fedora binaries gives me the same crash as I was experiencing above.
Now another interesting thing. I have two machines running root. When I built from source,
one machine gave me a different crash after running,
which I posted about in a different thread:
http://root.cern.ch/phpBB2/viewtopic.php?t=739

However, the other machine had similar problems.
I can give system specs if it helps. It is a Fedora Core 2 system,
Linux scoot 2.6.5-1.358, using root v4.00/04 compiled from source
with ./configure linux --prefix=/usr/local (I can give more info if it helps).

I got a stack dump from the core dump and noticed an interesting thing.
This is the output of the stack:

#0 0x0034c1aa in mmalloc () from /usr/local/lib/root/libCore.so
#1 0x0034bb37 in mcalloc () from /usr/local/lib/root/libCore.so
#2 0x00163415 in operator new(unsigned) () from /usr/local/lib/root/libNew.so
#3 0x001635fb in operator new () from /usr/local/lib/root/libNew.so
#4 0x00258a82 in ErrorHandler () from /usr/local/lib/root/libCore.so
#5 0x00258c35 in Break(char const*, char const*, …) () from /usr/local/lib/root/libCore.so
#6 0x00335a61 in TUnixSystem::DispatchSignals(ESignals) () from /usr/local/lib/root/libCore.so
#7 0x00334a97 in SigHandler(ESignals) () from /usr/local/lib/root/libCore.so
#8 0x00338fd1 in sighandler(int) () from /usr/local/lib/root/libCore.so
#9 <signal handler called>
#10 0x03000000 in ?? ()
#11 0x0010b249 in lseek () from /lib/tls/libpthread.so.0
#12 0x006b6638 in JCR_LIST () from /usr/local/lib/root/libCore.so
#13 0x0034c6f5 in __mmalloc_mmap_morecore () from /usr/local/lib/root/libCore.so
#14 0x0034bf7c in align () from /usr/local/lib/root/libCore.so
#15 0x0034c055 in morecore () from /usr/local/lib/root/libCore.so

The interesting thing is that this repeats over and over, until it crashes the stack.
Question: what could cause this? I am unfamiliar with the root code, so
I am not sure whether there is something that could cause this group of events to repeat constantly.

If this helps at all, I scrolled to the bottom of the stack. This is the output before it
repeats. Note this is from starting rootn.exe and running .x hprod.C.
I am not as familiar with root as others, so if this is a bug and should have a bug
report submitted for it, I will do that. It is just my experience that user error is the norm,
so I am asking whether I am missing something on my system or whether something is wrong
with the build that would cause this.

Above this point, the same calls as in #108057 to #108072 repeat.

#108057 <signal handler called>
#108058 0x03000000 in ?? ()
#108059 0x0010b249 in lseek () from /lib/tls/libpthread.so.0
#108060 0x006b6638 in JCR_LIST () from /usr/local/lib/root/libCore.so
#108061 0x0034c6f5 in __mmalloc_mmap_morecore () from /usr/local/lib/root/libCore.so
#108062 0x0034bf7c in align () from /usr/local/lib/root/libCore.so
#108063 0x0034c055 in morecore () from /usr/local/lib/root/libCore.so
#108064 0x0034c476 in mmalloc () from /usr/local/lib/root/libCore.so
#108065 0x0034bb37 in mcalloc () from /usr/local/lib/root/libCore.so
#108066 0x00163415 in operator new(unsigned) () from /usr/local/lib/root/libNew.so
#108067 0x001635fb in operator new () from /usr/local/lib/root/libNew.so
#108068 0x00258a82 in ErrorHandler () from /usr/local/lib/root/libCore.so
#108069 0x00258c35 in Break(char const*, char const*, …) () from /usr/local/lib/root/libCore.so
#108070 0x00335a61 in TUnixSystem::DispatchSignals(ESignals) () from /usr/local/lib/root/libCore.so
#108071 0x00334a97 in SigHandler(ESignals) () from /usr/local/lib/root/libCore.so
#108072 0x00338fd1 in sighandler(int) () from /usr/local/lib/root/libCore.so

#108073 <signal handler called>
#108074 0x002977c0 in TSystem::ResetErrno() () from /usr/local/lib/root/libCore.so
#108075 0x0016356e in operator delete(void*) () from /usr/local/lib/root/libNew.so
#108076 0x002c0d3f in TListIter::~TListIter() () from /usr/local/lib/root/libCore.so
#108077 0x002d165f in TClass::GetDataMember(char const*) const () from /usr/local/lib/root/libCore.so
#108078 0x002ccca6 in TBuildRealData::Inspect(TClass*, char const*, char const*, void const*) () from /usr/local/lib/root/libCore.so
#108079 0x0082ac20 in TH1::ShowMembers(TMemberInspector&, char*) () from /usr/local/lib/root/libHist.so
#108080 0x0082b96d in TH1F::ShowMembers(TMemberInspector&, char*) () from /usr/local/lib/root/libHist.so
#108081 0x0085dd9c in G__G__Hist_182_5_2(G__value*, char const*, G__param*, int) () from /usr/local/lib/root/libHist.so
#108082 0x00e300e4 in G__CallFunc::Exec(void*) () from /usr/local/lib/root/libCint.so
#108083 0x002d03a9 in TClass::BuildRealData(void*) () from /usr/local/lib/root/libCore.so
#108084 0x002d56ce in TClass::WriteBuffer(TBuffer&, void*, char const*) () from /usr/local/lib/root/libCore.so
#108085 0x0082b8fe in TH1F::Streamer(TBuffer&) () from /usr/local/lib/root/libHist.so
#108086 0x00267c8a in TMapFile::Update(TObject*) () from /usr/local/lib/root/libCore.so
#108087 0x004250b0 in G__G__Base2_202_7_3(G__value*, char const*, G__param*, int) () from /usr/local/lib/root/libCore.so
#108088 0x00dbf323 in G__call_cppfunc () from /usr/local/lib/root/libCint.so
#108089 0x00daf163 in G__interpret_func () from /usr/local/lib/root/libCint.so
#108090 0x00d941d5 in G__getfunction () from /usr/local/lib/root/libCint.so
#108091 0x00e1c8e2 in G__getstructmem () from /usr/local/lib/root/libCint.so
#108092 0x00e159c2 in G__getvariable () from /usr/local/lib/root/libCint.so
#108093 0x00d8b8fe in G__getitem () from /usr/local/lib/root/libCint.so
#108094 0x00d8a524 in G__getexpr () from /usr/local/lib/root/libCint.so
#108095 0x00dd4a82 in G__exec_function () from /usr/local/lib/root/libCint.so
#108096 0x00ddb5c9 in G__exec_statement () from /usr/local/lib/root/libCint.so
#108097 0x00dd81ca in G__exec_if () from /usr/local/lib/root/libCint.so
#108098 0x00ddb169 in G__exec_statement () from /usr/local/lib/root/libCint.so
#108099 0x00dd88b2 in G__exec_loop () from /usr/local/lib/root/libCint.so
#108100 0x00dd8d93 in G__exec_while () from /usr/local/lib/root/libCint.so
#108101 0x00ddb20b in G__exec_statement () from /usr/local/lib/root/libCint.so
#108102 0x00d73b1d in G__exec_tempfile_core () from /usr/local/lib/root/libCint.so
#108103 0x00d73d1b in G__exec_tempfile () from /usr/local/lib/root/libCint.so
#108104 0x00de3361 in G__process_cmd () from /usr/local/lib/root/libCint.so
#108105 0x002c950f in TCint::ProcessLine(char const*, TInterpreter::EErrorCode*) () from /usr/local/lib/root/libCore.so
#108106 0x002c961c in TCint::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) () from /usr/local/lib/root/libCore.so
#108107 0x00241d4b in TApplication::ProcessFile(char const*, int*) () from /usr/local/lib/root/libCore.so
#108108 0x002413fd in TApplication::ProcessLine(char const*, bool, int*) () from /usr/local/lib/root/libCore.so
#108109 0x0011cf4f in TRint::HandleTermInput() () from /usr/local/lib/root/libRint.so
#108110 0x0011bd3a in TTermInputHandler::Notify() () from /usr/local/lib/root/libRint.so
#108111 0x0011d9de in TTermInputHandler::ReadNotify() () from /usr/local/lib/root/libRint.so
#108112 0x00335dd2 in TUnixSystem::CheckDescriptors() () from /usr/local/lib/root/libCore.so
#108113 0x00335627 in TUnixSystem::DispatchOneEvent(bool) () from /usr/local/lib/root/libCore.so
#108114 0x00297940 in TSystem::InnerLoop() () from /usr/local/lib/root/libCore.so
#108115 0x002978e5 in TSystem::Run() () from /usr/local/lib/root/libCore.so
#108116 0x00242069 in TApplication::Run(bool) () from /usr/local/lib/root/libCore.so
#108117 0x0011ca2e in TRint::Run(bool) () from /usr/local/lib/root/libRint.so
#108118 0x08048e4d in main ()

I appreciate any help that might be offered. I am going to see whether the stack
dump is similar for the v4.00/04 binaries and for a v3.10/02 build on a similar system, and I will post a reply afterwards. If anyone needs more information to help solve this problem, I will gladly provide it.
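
Reading the frames above, the repeating cycle appears to be: the signal handler calls ErrorHandler, which allocates through operator new in libNew.so, which goes through mcalloc, mmalloc and morecore, where the next signal is raised. With more than 100000 frames, a quick way to confirm such a cycle is to count how often each function name occurs in a saved backtrace. The sketch below assumes the backtrace was saved to a text file in the "#N 0xADDR in FUNCTION ..." form shown above; the helper name bt_frame_counts and the file name bt.txt are hypothetical.

```shell
# Hypothetical helper: bt_frame_counts FILE [N]
# Count how many frames in a saved gdb backtrace name each function and
# print the N most frequent ones (default 10), most frequent first.
# Only lines of the form "#N 0xADDR in FUNCTION ..." are counted.
bt_frame_counts() {
    awk '$1 ~ /^#[0-9]+$/ && $3 == "in" {
             name = $4
             if ($4 == "operator") name = $4 " " $5   # e.g. "operator new"
             sub(/\(.*/, "", name)                    # drop a glued argument list
             count[name]++
         }
         END { for (f in count) print count[f], f }' "$1" |
        sort -k1,1nr -k2,2 | head -n "${2:-10}"
}

# Example usage:
# bt_frame_counts bt.txt 8
```

Functions that are part of the loop should show up with nearly identical, very large counts, while the frames from the original call chain (TMapFile::Update and below) appear only once.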

Could you send the shortest possible script reproducing this problem?
Run it under gdb. When you get the crash, send the output of
(gdb) bt

Rene
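
One possible way to capture such a backtrace non-interactively is to let gdb run the macro in batch mode and print the stack as soon as the program stops. This is only a sketch: the wrapper name root_bt and the GDB override variable are assumptions for illustration, and it relies on a gdb that supports the -ex option.

```shell
# Hypothetical wrapper: root_bt MACRO
# Run a macro in rootn.exe under gdb and print a backtrace as soon as the
# program stops (for example on SIGSEGV). The GDB variable can select a
# different debugger binary; it defaults to "gdb".
# ROOT options: -b runs without graphics, -q exits after processing the macro.
root_bt() {
    "${GDB:-gdb}" --batch -ex run -ex bt --args rootn.exe -b -q "$1"
}

# Example usage:
# root_bt hprod.C > bt.txt 2>&1
```

For a runaway recursion like the one in this thread, the full output can be very long; inside gdb, "bt 30" prints only the innermost 30 frames and "bt -50" only the outermost 50.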

Hi,

I just compiled the ROOT v4 cvs head on a Fedora Core 2 AMD64 machine and had no problems running the hprod.C and hcons.C macros.

Please compile from source. I did:

cd root
./configure
make
cd tutorials
rootn.exe
.x hprod.C

In another window:
cd root/tutorials
rootn.exe
.x hcons.C

and I get the histogram display.

Let me know.

Cheers, Fons.