AMD Barton, Debian (Ubuntu), and ROOT

I’ve had nothing but difficulties with ROOT on Debian. After many attempts, however, I believe I have installed ROOT successfully on my AMD Barton machine running Ubuntu Linux. I did so by getting the source through APT and building from that, rather than from the source available in ROOT CVS.

Okay, so now I have problems with data. I’ve been running analysis code successfully on an Intel Red Hat 9 machine at the university. The data ntuples seem to be fine here. When I run at home the program eventually crashes with a segmentation fault and messages like:
ERROR leaf: len= XXXXXXX ,max = XXX
Where XXXX are some numbers.

My question is: could this problem arise simply from using the Debian build of ROOT, or could it be related to the processor? I’d be happy to post the full error messages if there is interest. If someone told me to abandon Debian and use Red Hat or Fedora to solve the problem, I would. If this is a processor issue, then forget it.

Many thanks to those willing to reply.

Hi,

Without your code it is hard to say. My best guess is that you should have only one leaf per branch (to avoid alignment issues) or, better yet, use a compiled class.

Philippe.

Yes, we have one leaf per branch in our ntuples. All of my programs work fine on the University server (Red Hat 9, dual Intel). One of my programs is the skimming program, which produces a new ntuple. Here’s the error message from that program:
Warning in TBasket::ReadBasketBuffers: basket:pi0_p4ErrCovXX has fNevBuf=104 but fEntryOffset=0, pos=6365939, len=7196, fNbytes=7196, fObjlen=7512, trying to repair

*** Break *** segmentation violation
Generating stack trace…
0xb795ac2d in TBuffer::ReadFastArray(double*, int) + 0x2b from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libCore.so
0xb6e08ea6 in TLeafD::ReadBasket(TBuffer&) + 0xee from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libTree.so
0xb6de0a43 in TBranch::ReadLeaves(TBuffer&) + 0x2b from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libTree.so
0xb6de1a21 in TBranch::GetEntry(long long, int) + 0x267 from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libTree.so
0xb6e1e959 in TTree::GetEntry(long long, int) + 0xb7 from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libTree.so
0xb6df72e4 in TChain::GetEntry(long long, int) + 0x42 from /home/rmwhite/programs/RootSource/root-5.09.01/lib/libTree.so
0x0804eac5 in main + 0x2ce3 from SkimTuple
0xb67e8001 in __libc_start_main + 0xd1 from /lib/tls/i686/cmov/libc.so.6
0x0804bc41 in TApplicationImp::ShowMembers(TMemberInspector&, char*) + 0x55 from SkimTuple
Aborted

I’ll attach my program. Many thanks for any help.
SkimTuplehh.txt (8.08 KB)
SkimTuplecc.txt (21.5 KB)

Hi,

Does valgrind (valgrind.kde.org) work in your environment? If it does, try to use it.
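If it is installed, a run along these lines is a reasonable starting point. This is only a sketch: `run_memcheck` is a helper name made up for this post, the log file name is arbitrary, and I am assuming the executable is called SkimTuple and takes whatever arguments or input you normally give it.

```shell
#!/bin/sh
# Run a program under valgrind's memcheck tool and write the report to a file.
# run_memcheck is a hypothetical helper; memcheck.log is an arbitrary name.
run_memcheck() {
    if ! command -v valgrind >/dev/null 2>&1; then
        echo "valgrind not found; install it first" >&2
        return 127
    fi
    valgrind --tool=memcheck --leak-check=full --log-file=memcheck.log "$@"
}

# Example (assumes the executable is ./SkimTuple; add your usual arguments):
# run_memcheck ./SkimTuple
```

Invalid reads or writes reported inside your own code are usually more telling than the leak summary at the end of the log.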

One possible issue is that the data file contains some array that does not fit in the memory you reserved for it (in the .hh file).

Also where was the input file written?

Could you provide enough files (and information on how to run them) for me to reproduce the failing case exactly?

Cheers,
Philippe

I am not sure what valgrind is, but I’ll look into it this evening if time permits. Let’s see if I can give you everything. Your suggestion that I have not allocated enough memory in the .hh file is the last thing I tried. I set all the array sizes to a large number (200) and let the program run, and it produced the same error message that I posted. Other messages, such as ERROR leaf: len = XXXXX max = XXX, stopped appearing, but the program still crashed.
The ntuples are written on the same partition on my system (why would this have an effect?). I just scp’d them from my University server. Okay so the regular names of the files are as follows:
SkimTuple.cc
SkimTuple.hh
SkimTuple.com
SkimTuple.inp
Ntuplestream.chn

Run SkimTuple.com to compile. The Ntuplestream.chn file is the list of ntuples. The SkimTuple.inp is the input file to chain the list of files together from Ntuplestream.chn.
The output file was produced by this program. Here’s the ROOT message when opening it:
Attaching file Test.root as _file0…
Warning in TFile::Init: file Test.root probably not closed, trying to recover
Info in TFile::Recover: Test.root, recovered key TTree:ntp100 at address 412810436
Warning in TFile::Init: successfully recovered 1 keys

I’ve taken a quick look at the file and the branches seem okay, but I have not run my analysis code over them. I’ll do some more investigation this evening.

If you are willing and have what you need, then go for it, but I think you’ll need some ntuples to see what’s really going on. As I said before, this only happens on my home machine.

Much thanks
SkimTupleinp.txt (36 Bytes)
SkimTuplecom.txt (243 Bytes)
Ntuplestreamchn.txt (66.5 KB)

Hi,

Can you make some of the (AllEventsSkim-Run4-OnPeak-R16a-100.root) files available? I cannot reproduce your problem without them.

Philippe.

Hi,

By the way, the problems you ran into with your build sound suspicious to me. We test ROOT on Debian, so it should just work. We need more details, such as how you configured the ROOT sources and which Debian and GCC versions you are using.
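Something like the following would collect the version details; treat it as a sketch. `tool_version` is a helper name made up here, and it assumes g++ and root-config are on your PATH (it reports them as missing otherwise).

```shell
#!/bin/sh
# Print the first line of a tool's --version output, or say it is missing.
# tool_version is a hypothetical helper, not part of ROOT or Debian.
tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --version 2>&1 | head -n 1
    else
        echo "$1: not found"
    fi
}

tool_version g++          # compiler used to build ROOT and your code
tool_version root-config  # ROOT version seen by your shell

# Debian-based systems normally record their release in this file.
if [ -r /etc/debian_version ]; then
    cat /etc/debian_version
fi
```

Posting that output together with the exact configure line you used would make it easier to compare your setup with the ones we test.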

Also, I hope that you are not copying object files, shared libraries, or even binaries from your university computer to your home computer and expecting them to still work. You will have to rebuild your analysis code on your home computer.

Bottom line: if it works on your university computer but not on your home computer then there is most probably a problem with the build of ROOT or of your code on your home computer.

Axel.

Thanks for the interest; I appreciate the comments. I have considered many of the suggestions and am at a loss as to why I have these problems. As for the ROOT build, I believe I did things properly. I got the ROOT source through APT and then followed the standard instructions, like so:
./configure
make
make cintdlls
make install

Do I need to do anything additional for configure, considering I run an AMD Barton chip?

Yes, I recompile all of my code at home, as one would expect. Everything compiles just fine, just like our University server.

I am suspicious of the ROOT build, but I have no idea what else to do. Any suggestions? Here’s my compiler:
g++ (GCC) 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)
Ubuntu just loves to put the latest and greatest on their dist.

What’s worse is I cannot reproduce the error on the same ntuple or on the same event, and these runtime errors seem to happen when I chain many files together not just a few. I even took a look at the events where I supposedly have errors and see nothing wrong with them when I run Scan() in ROOT.

Again thanks for the interest in my problem, and any help is appreciated. I would really like to have my home system mirror my work on our University server.

I am trying out valgrind now. Can I use valgrind with ROOT? If so, could this give me some idea of whether there is a problem with my build?

Okay, well, I just ran my analysis code with valgrind -memcheck, and the memory leaks seem to come from /lib/ld-2.3.5.so. I don’t see any memory leaks pointing back to my program, so I’m still at a loss. I’ve attached the valgrind log; any takers?

Hi,

There is no attachment in your posting. Note that phpBB2 only allows attachments with certain file extensions. Rename the file if necessary.

Axel.

Trying to attach log file from valgrind
memcheck.txt (20.8 KB)

Hi,
here are the options I can think of:

  • not enough disk space,
  • corruption while scp’ing or a disk problem (you should be able to check with md5sum on your side and on the uni side)
  • a build problem, either of ROOT or of your code.

If it’s the latter, I’m afraid I’ll only be able to help if you give me access to your home machine and detailed instructions on how to reproduce it. For that you can contact me at Axel.Naumann@cern.ch.
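For the first two points, a quick check could look like the sketch below. `file_md5` and `same_md5` are helper names made up for this post; the idea is to run `file_md5` on the same ntuple on both machines and compare the two hashes.

```shell
#!/bin/sh
# Print only the MD5 hash of a file (first field of md5sum's output).
file_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Succeed if the file's hash equals the expected hash given as 2nd argument.
same_md5() {
    [ "$(file_md5 "$1")" = "$2" ]
}

# Free space on the filesystem that holds the current directory.
df -h .

# Example: pass the ntuple and the hash that file_md5 printed on the uni side.
# same_md5 some_ntuple.root HASH_FROM_UNI && echo "copy is intact"
```

If the hashes differ for any file in the chain, copy that file again before looking further at the build.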

Axel.