PyROOT: TTree.Write() Seg Fault

Hello,

I’ve got a TPySelector which runs great, but recently it’s been segfaulting in Terminate() while writing some TTrees to file. I’ve noticed that this only happens if I process a certain number of events, so I assume it has something to do with the size of the trees being written (which may be quite large). Any help would be appreciated!

Here’s the offending code, and a stack trace is attached.

[code]def SlaveTerminate(self):
    print "ttSelector.py: slave terminating."

    for h in self.cutflows.values():
        self.GetOutputList().Add(h)
    for t in self.outTrees.values():
        self.GetOutputList().Add(t.tree)

    print self.nevents, "processed."
    print float(self.nevents)/(time() - self.begin), "events processed per", \
            "second, excluding startup and shutdown overhead."


def Terminate(self):
    print "ttSelector.py: terminating."

    of = TFile("out.root", "RECREATE")
    for item in self.GetOutputList():
        print "ttSelector.py: writing", item.GetName(), "to file."
        stdout.flush()
        item.Write()[/code]

seg_fault.txt (22.6 KB)

Hi,

this error message, “std::bad_alloc (C++ exception)”, would indicate that the job ran out of memory. Is it possible not to keep all the information in memory, but to write portions out to disk over the course of the job?
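As a minimal sketch of what writing to disk along the way could look like in plain PyROOT (outside of PROOF): a TTree created while an output file is open is attached to that file, and its baskets are written there as they fill up instead of accumulating in memory. The function name, tree name, and branch name below are hypothetical, and whether this fits into a PROOF job is a separate question.

```python
from array import array


def fill_tree_on_disk(path, values):
    """Fill a TTree attached to a file, so filled baskets go to disk."""
    import ROOT  # imported here so this module can be loaded without ROOT

    # Open the file BEFORE creating the tree: the tree then belongs to this
    # file and flushes its filled baskets there rather than keeping them
    # all in memory until the end of the job.
    out = ROOT.TFile.Open(path, "RECREATE")
    tree = ROOT.TTree("events", "hypothetical example tree")

    x = array("d", [0.0])        # buffer that the branch reads on each Fill()
    tree.Branch("x", x, "x/D")

    for v in values:
        x[0] = v
        tree.Fill()

    n = int(tree.GetEntries())
    out.Write()                  # writes the tree header and remaining baskets
    out.Close()                  # the tree is owned by the file; do not use it after this
    return n
```

Calling, say, `fill_tree_on_disk("out.root", [0.1, 0.2, 0.3])` would then leave only the current baskets in memory at any one time.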

Cheers,
Wim

Thanks for the reply!

Is there a prescription for doing this on the fly?

I’d still like to be able to run on a PROOF cluster, but I’m not sure at which step in the process all the trees from the slaves are added together. I assume it’s at Terminate(), but I suppose it could be in some method that can’t be overloaded in python.

Chris

Hi,

sorry, my PROOF-knowledge doesn’t reach very far. I hope that one of the experts pipes in (otherwise try re-posting in the PROOF subforum).

If there’s any method not overloadable in python (the selection is done by hand …), I can of course add it.

If a specific python problem is causing the growth in memory usage, which is certainly possible, then I’d need to see the code to figure that out.

Cheers,
Wim

Hey,

OK, well I can wait on the PROOF info for now. There should be a merging of the TSelectorLists from all the slaves on the master node at some point, but I can’t find the method in which that happens. I have the feeling that just understanding at which stage this happens would be helpful.

But I’m interested to know if there’s an automatic way to store chunks of the tree to file when they’re already Fill()'d? That would be extremely handy. Or is there a ‘standard’ manual way to do the same thing? That’d be helpful as well.

I don’t think that there’s a memory growth problem in the python bindings. I’ve tested a little toy version in C++ and I get the same problem: the trees are just too big.

Cheers,

Chris

Chris,

you could reduce the maximum amount of memory-resident data of the TTree by setting a lower value with SetMaxVirtualSize() (get the current value with GetMaxVirtualSize()). Actually, maybe first have a look at what GetMaxVirtualSize() gives and multiply it by the total number of trees that you have, to see whether that is indeed a number large enough to cause memory problems on the machine that you are running on.
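A rough sketch of that check, assuming the trees are reachable as in the selector above (t.tree for each entry of self.outTrees). The two helper names are made up for illustration; only GetMaxVirtualSize() and SetMaxVirtualSize() are TTree methods.

```python
def total_max_virtual_size(trees):
    """Sum GetMaxVirtualSize() over all trees, in bytes (a rough estimate)."""
    return sum(int(t.GetMaxVirtualSize()) for t in trees)


def cap_virtual_size(trees, max_bytes):
    """Lower the per-tree limit on memory-resident data."""
    for t in trees:
        t.SetMaxVirtualSize(max_bytes)
```

In the selector this could be called as total_max_virtual_size([t.tree for t in self.outTrees.values()]) and the result compared with the memory available on the machine.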

Cheers,
Wim