Skimming and slimming in PyROOT

Hi PyROOT users,

I’ve been trying to write a general-purpose Python script, called tree_trimmer.py, for dropping branches and skimming events in an ntuple. You can find it here:
svnweb.cern.ch/trac/penn/browse … trimmer.py

The skim function (that should return True/False if the event passes the skim) is configurable on the command-line. Also, the branches to be kept can be specified by giving a file with a branch name per line. So, I think I have the skimming and dropping branches working ok.

The problem is, I wanted to configurably request that additional trees in the input ntuples be copied in their entirety. The use case for me is copying the CollectionTree and TrigConfTree that come with ATLAS D3PDs. Keep in mind that the skimming job combines multiple ROOT files into a single file (or a few). These additional trees only have 1 entry each, and in general it makes sense to me to combine (like hadd) the entries of the additional trees in the output ntuple. So I’ve tried to CloneTree when a new output file is created, and CopyEntries when a new input file is encountered. Unfortunately, I get a segfault whenever the output file changes.

In the following example, I’ve called chain.SetMaxTreeSize with 10 MB, to force the output file to change.

Am I doing something silly, or is there a better way to do this?

./tree_trimmer.py --skim='skim' /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/*.root skim.py -k 'CollectionTree,tauPerfMeta/TrigConfTree' -M 10 -m 2000
Using rootlogon.py
TClass::TClass:0: RuntimeWarning: no dictionary for class AttributeListLayout is available
TClass::TClass:0: RuntimeWarning: no dictionary for class pair<string,string> is available
Processing event 0 of 2000
Writing to output file: skim.root
Cloning tree CollectionTree ...
  done.
Cloning tree tauPerfMeta/TrigConfTree ...
  done.
Processing event 1000 of 2000
TTree::ChangeFile:0: RuntimeWarning: file skim_1.root already exist, trying with 2 underscores
Fill: Switching to new file: skim__1.root
Writing to output file: skim__1.root
TFile::Append:0: RuntimeWarning: Replacing existing TH1: h_n_events (Potential memory leak).
Cloning tree CollectionTree ...
  done.
Cloning tree tauPerfMeta/TrigConfTree ...

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================

Thread 2 (Thread 0x411d5940 (LWP 14807)):
#0  0x0000003b1ca0cd01 in sem_wait () from /lib64/libpthread.so.0
#1  0x00000000004c39a8 in PyThread_acquire_lock (lock=0x12979da0, waitflag=128) at Python/thread_pthread.h:349
#2  0x0000000000493834 in PyEval_RestoreThread (tstate=0x137d2fd0) at Python/ceval.c:353
#3  0x00002b76c24fc482 in floatsleep (self=<value optimized out>, args=<value optimized out>) at /build/agaspar/work/Python-2.6.5/Modules/timemodule.c:921
#4  time_sleep (self=<value optimized out>, args=<value optimized out>) at /build/agaspar/work/Python-2.6.5/Modules/timemodule.c:206
#5  0x000000000049888c in call_function (f=0x137d9020, throwflag=<value optimized out>) at Python/ceval.c:3750
#6  PyEval_EvalFrameEx (f=0x137d9020, throwflag=<value optimized out>) at Python/ceval.c:2412
#7  0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x2b76bb4e3cd8, globals=<value optimized out>, locals=<value optimized out>, args=0x12fc9628, argcount=1, kws=0x137d3f80, kwcount=0, defs=0x0, defcount=0, 
    closure=0x0) at Python/ceval.c:3000
#8  0x00000000004f2244 in function_call (func=0x2b76c0766758, arg=0x12fc9610, kw=0x137d7680) at Objects/funcobject.c:524
#9  0x00000000004197b8 in PyObject_Call (func=0x2b76c0766758, arg=0x12fc9610, kw=0x137d7680) at Objects/abstract.c:2492
#10 0x000000000049590a in ext_do_call (f=0x137d8e40, throwflag=<value optimized out>) at Python/ceval.c:4063
#11 PyEval_EvalFrameEx (f=0x137d8e40, throwflag=<value optimized out>) at Python/ceval.c:2452
#12 0x0000000000499097 in fast_function (f=0x137d8be0, throwflag=<value optimized out>) at Python/ceval.c:3836
#13 call_function (f=0x137d8be0, throwflag=<value optimized out>) at Python/ceval.c:3771
#14 PyEval_EvalFrameEx (f=0x137d8be0, throwflag=<value optimized out>) at Python/ceval.c:2412
#15 0x0000000000499097 in fast_function (f=0x137d1570, throwflag=<value optimized out>) at Python/ceval.c:3836
#16 call_function (f=0x137d1570, throwflag=<value optimized out>) at Python/ceval.c:3771
#17 PyEval_EvalFrameEx (f=0x137d1570, throwflag=<value optimized out>) at Python/ceval.c:2412
#18 0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x12fc2738, globals=<value optimized out>, locals=<value optimized out>, args=0x12fc95e8, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:3000
#19 0x00000000004f213d in function_call (func=0x12fd3b18, arg=0x12fc95d0, kw=0x0) at Objects/funcobject.c:524
#20 0x00000000004197b8 in PyObject_Call (func=0x12fd3b18, arg=0x12fc95d0, kw=0x0) at Objects/abstract.c:2492
#21 0x0000000000420b00 in instancemethod_call (func=0x12fd3b18, arg=0x12fc95d0, kw=0x0) at Objects/classobject.c:2579
#22 0x00000000004197b8 in PyObject_Call (func=0x2b76baed9c30, arg=0x2b76b7805050, kw=0x0) at Objects/abstract.c:2492
#23 0x0000000000492df6 in PyEval_CallObjectWithKeywords (func=0x2b76baed9c30, arg=0x2b76b7805050, kw=0x0) at Python/ceval.c:3619
#24 0x00000000004c865d in t_bootstrap (boot_raw=0x1377ad10) at ./Modules/threadmodule.c:425
#25 0x0000003b1ca0673d in start_thread () from /lib64/libpthread.so.0
#26 0x0000003b1bed44bd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x2b76b7803080 (LWP 14793)):
#0  0x0000003b1be9a14f in waitpid () from /lib64/libc.so.6
#1  0x0000003b1be3c481 in do_system () from /lib64/libc.so.6
#2  0x0000003b1be3c7d7 in system () from /lib64/libc.so.6
#3  0x00002b76bc6ec882 in TUnixSystem::StackTrace() () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libCore.so
#4  0x00002b76bc6ed285 in TUnixSystem::DispatchSignals(ESignals) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libCore.so
#5  <signal handler called>
#6  0x00002b76bf0c1915 in TTree::~TTree() () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libTree.so
#7  0x00002b76bf0c8bc9 in ROOT::delete_TTree(void*) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libTree.so
#8  0x00002b76bc6a5bf7 in TClass::Destructor(void*, bool) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libCore.so
#9  0x00002b76bc20c0f9 in PyROOT::(anonymous namespace)::op_dealloc(PyROOT::ObjectProxy*) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libPyROOT.so
#10 0x0000000000465995 in subtype_dealloc (self=0x12fdfc80) at Objects/typeobject.c:1019
#11 0x00000000004460e7 in insertdict (mp=0x13c89a10, key=0x13cd48b0, hash=992138433894643587, value=0x12fdfdc0) at Objects/dictobject.c:459
#12 0x0000000000448430 in PyDict_SetItem (op=0x13c89a10, key=0x13cd48b0, value=0x12fdfdc0) at Objects/dictobject.c:701
#13 0x0000000000495ee7 in PyEval_EvalFrameEx (f=0x12a06c80, throwflag=<value optimized out>) at Python/ceval.c:1566
#14 0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x2b76b78d56c0, globals=<value optimized out>, locals=<value optimized out>, args=0x2b76baee76b4, argcount=0, kws=0x12997d80, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#15 0x00000000004987e8 in fast_function (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:3846
#16 call_function (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:3771
#17 PyEval_EvalFrameEx (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:2412
#18 0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x2b76b78db5d0, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:3000
#19 0x000000000049a3a2 in PyEval_EvalCode (co=0x14201390, globals=0x0, locals=0x18d81db0) at Python/ceval.c:541
#20 0x00000000004ba9aa in run_mod (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", start=<value optimized out>, globals=0x12931380, locals=0x12931380, closeit=1, flags=0x7fff8d86f180)
    at Python/pythonrun.c:1339
#21 PyRun_FileExFlags (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", start=<value optimized out>, globals=0x12931380, locals=0x12931380, closeit=1, flags=0x7fff8d86f180)
    at Python/pythonrun.c:1325
#22 0x00000000004bac9d in PyRun_SimpleFileExFlags (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", closeit=1, flags=0x7fff8d86f180) at Python/pythonrun.c:935
#23 0x00000000004150a3 in Py_Main (argc=13, argv=0x7fff8d86f2a8) at Modules/main.c:572
#24 0x0000003b1be1d994 in __libc_start_main () from /lib64/libc.so.6
#25 0x00000000004141d9 in _start ()
===========================================================


The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00002b76bf0c1915 in TTree::~TTree() () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libTree.so
#7  0x00002b76bf0c8bc9 in ROOT::delete_TTree(void*) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libTree.so
#8  0x00002b76bc6a5bf7 in TClass::Destructor(void*, bool) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libCore.so
#9  0x00002b76bc20c0f9 in PyROOT::(anonymous namespace)::op_dealloc(PyROOT::ObjectProxy*) () from /afs/cern.ch/sw/lcg/app/releases/ROOT/5.26.00e_python2.6/x86_64-slc5-gcc43-opt/root/lib/libPyROOT.so
#10 0x0000000000465995 in subtype_dealloc (self=0x12fdfc80) at Objects/typeobject.c:1019
#11 0x00000000004460e7 in insertdict (mp=0x13c89a10, key=0x13cd48b0, hash=992138433894643587, value=0x12fdfdc0) at Objects/dictobject.c:459
#12 0x0000000000448430 in PyDict_SetItem (op=0x13c89a10, key=0x13cd48b0, value=0x12fdfdc0) at Objects/dictobject.c:701
#13 0x0000000000495ee7 in PyEval_EvalFrameEx (f=0x12a06c80, throwflag=<value optimized out>) at Python/ceval.c:1566
#14 0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x2b76b78d56c0, globals=<value optimized out>, locals=<value optimized out>, args=0x2b76baee76b4, argcount=0, kws=0x12997d80, kwcount=0, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#15 0x00000000004987e8 in fast_function (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:3846
#16 call_function (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:3771
#17 PyEval_EvalFrameEx (f=0x12997c00, throwflag=<value optimized out>) at Python/ceval.c:2412
#18 0x000000000049a2a7 in PyEval_EvalCodeEx (co=0x2b76b78db5d0, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:3000
#19 0x000000000049a3a2 in PyEval_EvalCode (co=0x14201390, globals=0x0, locals=0x18d81db0) at Python/ceval.c:541
#20 0x00000000004ba9aa in run_mod (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", start=<value optimized out>, globals=0x12931380, locals=0x12931380, closeit=1, flags=0x7fff8d86f180)
    at Python/pythonrun.c:1339
#21 PyRun_FileExFlags (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", start=<value optimized out>, globals=0x12931380, locals=0x12931380, closeit=1, flags=0x7fff8d86f180)
    at Python/pythonrun.c:1325
#22 0x00000000004bac9d in PyRun_SimpleFileExFlags (fp=0x1297e340, filename=0x7fff8d871754 "./tree_trimmer.py", closeit=1, flags=0x7fff8d86f180) at Python/pythonrun.c:935
#23 0x00000000004150a3 in Py_Main (argc=13, argv=0x7fff8d86f2a8) at Modules/main.c:572
#24 0x0000003b1be1d994 in __libc_start_main () from /lib64/libc.so.6
#25 0x00000000004141d9 in _start ()
===========================================================

Ryan,

I see a warning about h_n_events, which appears to be recreated in a tighter loop than the one that fills it.

But as for the crash itself, I’m willing to bet that it is a TTree whose associated TFile has gone the way of the dodo. The clone of a tree stays connected to its ancestor and gets notified of its removal. I’m guessing that if it is kept alive, it should at least update its addresses (which were shared with the ancestor). To make the (old) clone share the addresses with the newly read tree, something like “tree.CopyAddresses(clones[tree_name])” needs to be done.

HTH,
Wim

Hi Wim,

Thanks a lot for your help.

Sorry, I’ve been messing with the code in the trunk link I sent you. The previous version referenced in my first post is here:
svnweb.cern.ch/trac/penn/browse … v=6919#L75

I’m not sure what to do with your suggestion

It may be worth restating plainly what my requirements are: I have several input ROOT files that contain a main “physics” tree and some smaller “configuration” trees. I want to write a general script to skim the physics tree, dropping branches and/or events, and save the output in one or more ROOT files. I want to save the skim efficiency in a histogram, h_n_events, in each output file. One should also be able to configurably ask to keep the additional config trees as well.

The trouble is that the script needs to handle cases where the input file changes, and possibly where the output file changes. For each output file, I want to store an h_n_events histogram recording the skim efficiency (total and passed events), and also make a complete copy of the additional config trees (beyond the main physics tree being skimmed). When an input file changes, I was thinking the best thing to do would be to append the entries of its config trees to the copies already in the current output file.

I was keeping this clones dictionary to keep track of the new output trees. I wouldn’t want to copy the address of the current input tree to that tree. Attempting to make this less confusing, I dropped the dictionary and just tried retrieving the output tree from the current output file:

tree_out = ch_out.GetCurrentFile().Get(tree_name)
tree_out.CopyEntries(tree_in)

svnweb.cern.ch/trac/penn/browse … =6935#L167
but that ended in similar crashes.

Then, I decided that hopefully the problem can be avoided if I don’t try to write the config trees while the current output file is still being written to by the skim of the main physics tree. Instead, I’ve tried waiting for the output file to change, while keeping track of the input files read so far. Then, when the output file changes, I re-open the previous output file and try to save the h_n_events histogram to it. Then I loop over the previous input files and copy/append the config trees into the previous output file. Finally I close up the files and continue the main skim. See this new revision here:
svnweb.cern.ch/trac/penn/browse … y?rev=6950

It gets mad like this:

./tree_trimmer.py -M 10 -k 'CollectionTree,tauPerfMeta/TrigConfTree' /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/*.root
Using rootlogon.py
TClass::TClass:0: RuntimeWarning: no dictionary for class AttributeListLayout is available
TClass::TClass:0: RuntimeWarning: no dictionary for class pair<string,string> is available
Processing event 0 of 49992
Reading from input file: /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/group10.perf-tau.21508_000590.TauMEDIUM._00001.root
Fill: Switching to new file: skim_1.root
Error in <TFile::ReadBuffer>: error reading all requested bytes from file skim_1.root, got 240 of 300
TFile::Init:0: RuntimeWarning: file skim_1.root probably not closed, cannot read free segments
TFile::Init:0: RuntimeWarning: file skim_1.root has no keys
Cloning tree CollectionTree ...
Error in <TBasket::Create>: Cannot allocate 1414 bytes for ID = StreamESD_ref Title = CollectionTree
Error in <TTree::Fill>: Failed filling branch:CollectionTree.StreamESD_ref, nbytes=-1
 This error is symptomatic of a Tree created as a memory-resident Tree
 Instead of doing:
    TTree *T = new TTree(...)
    TFile *f = new TFile(...)
 you should do:
    TFile *f = new TFile(...)
    TTree *T = new TTree(...)
...

But I didn’t think there would be this memory-resident tree problem, because I’m careful to cd() to the output file I want before calling CloneTree(0).

Since recording the skim efficiency is currently more important to me than the config trees, let’s try to debug saving the h_n_events histogram first, without worrying about saving the additional config trees.

./tree_trimmer.py -M 10 /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/*.root
Using rootlogon.py
TClass::TClass:0: RuntimeWarning: no dictionary for class AttributeListLayout is available
TClass::TClass:0: RuntimeWarning: no dictionary for class pair<string,string> is available
Processing event 0 of 49992
Reading from input file: /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/group10.perf-tau.21508_000590.TauMEDIUM._00001.root
Fill: Switching to new file: skim_1.root
Error in <TFile::ReadBuffer>: error reading all requested bytes from file skim_1.root, got 240 of 300
TFile::Init:0: RuntimeWarning: file skim_1.root probably not closed, cannot read free segments
TFile::Init:0: RuntimeWarning: file skim_1.root has no keys
Error in <TKey::Create>: Cannot allocate 269 bytes for ID = h_n_events Title = 
Error in <TKey::Create>: Cannot allocate 3100 bytes for ID = StreamerInfo Title = Doubly linked list
Writing to output file: skim_1.root
Fill: Switching to new file: skim_2.root
Error in <TFile::ReadBuffer>: error reading all requested bytes from file skim_2.root, got 240 of 300
TFile::Init:0: RuntimeWarning: file skim_2.root probably not closed, cannot read free segments
TFile::Init:0: RuntimeWarning: file skim_2.root has no keys
Error in <TKey::Create>: Cannot allocate 269 bytes for ID = h_n_events Title = 
Error in <TKey::Create>: Cannot allocate 3100 bytes for ID = StreamerInfo Title = Doubly linked list
Writing to output file: skim_2.root
...

As you point out, I’m creating this histogram in a loop, but in this latest revision I’m trying to create it, fill it, write it to the previous output file, and close up, before continuing the skim.

prev_outfile = ROOT.TFile(prev_outfile_name, 'UPDATE')
prev_outfile.cd()
h_n_events = ROOT.TH1D('h_n_events', '', 20, -0.5, 19.5)
h_n_events.SetDirectory(0)
h_n_events.SetDirectory(prev_outfile)
h_n_events.SetBinContent(1, nf_events)
h_n_events.SetBinContent(2, nf_events_passed)
h_n_events.Write()     # then written out and the file closed,
prev_outfile.Close()   # per the flow described above

Any ideas, or simpler routes to meet my requirements? Thanks a lot.

Ryan,

“simpler” might be the CollAppend tool (part of ATLAS releases)? Give it N inputs and 1 output and it’ll weave the collection trees together. Takes care of POOL headers as well.

As for the CopyAddresses, what I meant was that before doing something like:

tree_out.CopyEntries(tree_in)

something equivalent to:

tree_in.CopyAddresses(tree_out)

should be done.

The documentation of CopyEntries() says:

[quote]// Copy nentries from given tree to this tree.
// This routines assumes that the branches that intended to be copied are
// already connected.[/quote]

so that’s what needs to be done.

CopyEntries() in its explicit form is, roughly speaking, just GetEntry()/Fill() on the same address.
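Wim’s description can be sketched as a duck-typed helper (my own illustration, not the actual TTree implementation): read each entry of the source tree into the shared branch buffers, then fill the destination tree from those same buffers.

```python
def copy_entries_explicit(tree_in, tree_out, nentries=-1):
    """Rough equivalent of TTree::CopyEntries.
    Assumes the branch addresses are already connected, e.g. via
    tree_in.CopyAddresses(tree_out), so that GetEntry fills the
    same buffers that tree_out.Fill() reads from."""
    n = tree_in.GetEntries() if nentries < 0 else nentries
    copied = 0
    for i in range(n):
        if tree_in.GetEntry(i) <= 0:   # read entry i into the shared buffers
            break
        tree_out.Fill()                # write the buffers as a new entry
        copied += 1
    return copied
```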

Cheers,
Wim

Hi Wim,

Thanks again. Copying the addresses seems to have helped. Using your suggestion, I switched back to the original strategy I proposed of copying the config trees into each newly opened output file, instead of re-opening those finished with the skim. Now my script continues along without crashing.

The new revision is here:
svnweb.cern.ch/trac/penn/browse … y?rev=6960

The CollectionTree appears to have been copied fine, but tauPerfMeta/TrigConfTree was not copied to each file; only its sub-directory was created. I would have thought the following logic would create the directory and copy the tree fine.

ch_out.GetCurrentFile().cd()
tree_dirname = os.path.dirname(tree_name)
if tree_dirname:
    print 'Creating directory: %s' % tree_dirname
    ch_out.GetCurrentFile().mkdir(tree_dirname).cd()
print 'Cloning tree %s ...' % tree_name
tree_out = tree_in.CloneTree(0)
tree_in.CopyAddresses(tree_out)
tree_out.CopyEntries(tree_in)
print '  done.'

The output from running now looks like:

./tree_trimmer.py -M 10 -k 'CollectionTree,tauPerfMeta/TrigConfTree' /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/*.root
Using rootlogon.py
TClass::TClass:0: RuntimeWarning: no dictionary for class AttributeListLayout is available
TClass::TClass:0: RuntimeWarning: no dictionary for class pair<string,string> is available
Processing event 0 of 49992
--> skim.root
Cloning tree CollectionTree ...
  done.
Creating directory: tauPerfMeta
Cloning tree tauPerfMeta/TrigConfTree ...
  done.
Writing to output file: skim.root
CopyEntries tree CollectionTree ...
  done.
CopyEntries tree tauPerfMeta/TrigConfTree ...
  done.
Reading from input file: /exports/project/data_d03_1/reece/datasets/2011/mc10b/group10.perf-tau.mc10_7TeV.109910.SherpabbAtautaulhMA120TB20.e769_s933_s946_r2302_r2300.01-01-01.D3PD.110531150033_TauMEDIUM/group10.perf-tau.21508_000590.TauMEDIUM._00001.root
TTree::ChangeFile:0: RuntimeWarning: file skim_1.root already exist, trying with 2 underscores
Fill: Switching to new file: skim__1.root
TFile::Append:0: RuntimeWarning: Replacing existing TH1: h_n_events (Potential memory leak).
--> skim__1.root
Cloning tree CollectionTree ...
  done.
Creating directory: tauPerfMeta
Cloning tree tauPerfMeta/TrigConfTree ...
  done.
Writing to output file: skim__1.root

As you can see, I’m still getting the warning about the h_n_events histogram, and it never shows up in the output files. I am creating the histogram in a loop, but I’m trying to give up ownership to the current output file, in hopes that it will be saved properly when that file is Closed as the skim automatically changes output files. I’ve put in a debug print statement to show that the TDirectory for each histogram is propagating correctly, as you can see from the output above.

ch_out.GetCurrentFile().cd()
h_n_events = ROOT.TH1D('h_n_events', '', 20, -0.5, 19.5)
print '-->', h_n_events.GetDirectory().GetName()
ROOT.SetOwnership(h_n_events, False)

Thanks for your help. And thank you for PyROOT.

Hi Wim and those interested,

I finally got tree_trimmer.py to do what I want. I had to abandon trying to write the additional trees at the same time as the main skim. Now, I keep track of which input files have been read for each output file. Then I wait until the skim is complete and the files are closed, before I re-open each of the output files and add the h_n_events histogram and additional trees. This works fine.
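The bookkeeping described above (which input files fed which output file) can be sketched in plain Python; the function name and the event representation are hypothetical, for illustration only, since the real tree_trimmer.py keeps this state inline in its event loop.

```python
def track_files(events):
    """Given a sequence of ('out', name) / ('in', name) events in processing
    order, return {output_file: [input files read while it was current]}.
    After the skim, each output file can be re-opened and the config trees
    from exactly those inputs appended to it."""
    mapping = {}
    current_out = None
    for kind, name in events:
        if kind == 'out':
            current_out = name
            mapping.setdefault(current_out, [])
        elif kind == 'in' and current_out is not None:
            mapping[current_out].append(name)
    return mapping
```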

I’m curious why ATLAS’ CollectionTree takes so long to copy (~1-2 minutes). And why would I ever need it?

Thanks for the help.

A working and documented version is here:
svnweb.cern.ch/trac/penn/browse … trimmer.py

Ryan,

Without testing it, the copying of the CollectionTree may take such a long time because of the many (small) objects being constructed and destroyed when reading and writing.

I don’t know which version of ROOT you are using, but recent versions (from 5.28, I believe) allow a “fast” option to CopyEntries() (as the third function parameter; the second one should be -1 to copy all entries). The “fast” option tells CopyEntries() to copy the buffers directly:

[quote]// If ‘option’ contains the word ‘fast’ and nentries is -1, the cloning will be
// done without unzipping or unstreaming the baskets (i.e., a direct copy of the
// raw bytes on disk).[/quote]

and this may make quite a difference.
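Assuming a ROOT version of 5.28 or later as Wim describes, the call would look something like the sketch below; the wrapper name is my own.

```python
def fast_copy(tree_in, tree_out):
    """Copy all entries from tree_in into tree_out via the raw-basket path:
    -1 means all entries, and 'fast' skips unzipping/unstreaming baskets."""
    return tree_out.CopyEntries(tree_in, -1, 'fast')
```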

Cheers,
Wim