Copy tree to new file and rename it

You do not need the single quotes. Example:

% rootls -l hsimple.root
TProfile        May 16 10:50 2019 hprof   "Profile of pz versus px"
TH1F            May 16 10:50 2019 hpx     "This is the px distribution"
TH2F            May 16 10:50 2019 hpxpy   "py vs px"
TNtuple         May 16 10:50 2019 ntuple  "Demo ntuple"
TDirectoryFile  Feb 12 15:14 2020 subdir  "subdir"
% rootcp hsimple.root:ntuple h2.root
% rootls -l h2.root                 
TNtuple  Mar 17 13:23 2021 ntuple  "Demo ntuple"
% 

rootls -l allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root
TTree  Dec 21 08:32 2020 MUON_SCALE__1up                                     "MUON_SCALE__1up"
TTree  Dec 21 08:31 2020 MUON_SCALE__1up                                     "MUON_SCALE__1up"
TTree  Dec 21 08:32 2020 nominal                                             "nominal"
TTree  Dec 21 08:31 2020 nominal                                             "nominal"
TTree  Dec 21 08:32 2020 PRW_DATASF__1down                                   "PRW_DATASF__1down"
TTree  Dec 21 08:31 2020 PRW_DATASF__1down                                   "PRW_DATASF__1down"
TTree  Dec 21 08:32 2020 PRW_DATASF__1up                                     "PRW_DATASF__1up"
TTree  Dec 21 08:31 2020 PRW_DATASF__1up                                     "PRW_DATASF__1up"
TTree  Dec 21 08:32 2020 Truth                                               "Truth"
TTree  Dec 21 08:31 2020 Truth                                               "Truth"

 rootcp allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root:nominal test.root
WARNING: Same name objects aren't supported: 'nominal' of 'allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root' won't be processed
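
The duplicated names in the `rootls -l` listing are what this warning is about. As a rough sketch (assuming the column layout shown above, where the object name is the sixth whitespace-separated field), the names that occur more than once can be picked out of such a listing with `awk`. The `listing` variable holds a few lines copied from the output above with the padding shortened; in practice one would pipe the real `rootls -l` output into the same `awk` command.

```shell
# A few lines copied from the rootls -l output above (padding shortened).
listing='TTree  Dec 21 08:32 2020 nominal  "nominal"
TTree  Dec 21 08:31 2020 nominal  "nominal"
TTree  Dec 21 08:32 2020 Truth  "Truth"
TTree  Dec 21 08:31 2020 Truth  "Truth"'

# Field 6 is the object name in this layout; count how often each name occurs
# and print only the names that appear more than once.
dupes=$(printf '%s\n' "$listing" \
  | awk '{ n[$6]++ } END { for (k in n) if (n[k] > 1) print k, n[k] }' \
  | LC_ALL=C sort)
echo "$dupes"
```

Every name printed this way has more than one cycle in the file, which is the situation rootcp refuses to process here.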

You may try adding the cycle number:

 rootcp allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root:nominal;1 test.root

With that I get

usage: rootcp [-h] [-c COMPRESS] [--recreate] [-r] [--replace]
              SOURCE [SOURCE ...] DEST
rootcp: error: too few arguments
-bash: 1: command not found

and if I put quotation marks around the first argument I get

 rootcp "allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root:nominal;1" test.root
WARNING: can't find nominal;1 in allYears_501793_MGPy8EG_W0Z0_lvll_lvllj_LO_emu_PtZge150GeV_myOutput.root
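
The `usage` error and the `-bash: 1: command not found` message in the unquoted attempt come from the shell, not from rootcp: an unquoted `;` ends the command, so rootcp only receives the source argument and bash then tries to run `1 test.root` as a second command. A minimal sketch of that splitting, with a hypothetical `show_args` helper standing in for rootcp and `in.root` as a placeholder file name:

```shell
# Hypothetical stand-in for rootcp: prints each argument it receives in brackets.
show_args() { printf '[%s]\n' "$@"; }

# Quoted: the ';1' cycle suffix stays inside the first argument.
quoted=$(show_args "in.root:nominal;1" test.root)
echo "$quoted"

# Unquoted (run through eval so the failing second command stays contained):
# the shell runs 'show_args in.root:nominal', then tries to run a command named '1'.
unquoted=$(eval 'show_args in.root:nominal;1 test.root' 2>/dev/null || true)
echo "$unquoted"
```

This only explains the shell side. Why the quoted form still reports that it cannot find `nominal;1` is a separate question about rootcp itself.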

Can you provide access to this ROOT file, if it is not too big?

@pcanal To me, the original post here demonstrates a bug that appears when the original object (tree) has more than one “cycle” in the original file.

UPDATE: As stated in one of the posts below, this is NOT a “bug” but a “feature” (expected behavior).

A brutal fix:

    oldtree->SetNameTitle("PolSig", "PolSig");
    auto newtree = oldtree->CloneTree();
    newfile.Write();

@Wile_E_Coyote I am not sure why it would be related to the number of cycles in the original file.

My guess is that in the code:

    auto newtree = oldtree->CloneTree();
    newtree->SetNameTitle("PolSig", "PolSig");
    newfile.Write();

CloneTree does an autosave and/or write of the TTree with its "then current" name (i.e. nominal) so it works ‘as intended’.

The work-around that you provided is ‘correct’ (change the name before cloning it).

If one does not want to change the original tree name, then you can split the operation:

auto newtree = oldtree->CloneTree(0); // copy just the structure.
newtree->SetObject("PolSig", "PolSig"); // Somehow SetNameTitle is not yet overloaded in TTree :( ...
newtree->CopyEntries(oldtree, -1, "fast"); // "fast" is the default neither in CopyEntries nor in CloneTree :(

Cheers,
Philippe

@pcanal I did a small test with “hsimple.root”. As soon as there are two “cycles” of the “ntuple” (note: you need to increase its number of entries; I tried with 2500000), the problem with “SetObject” appears.

{ // hsimple_newfile.cxx
  TFile oldfile("hsimple.root");
  TTree *oldtree;
  oldfile.GetObject("ntuple", oldtree);
  TFile newfile("hsimple_newfile.root", "recreate");
  // newfile.cd();
#if 0 /* 0 or 1 */
  oldtree->SetNameTitle("PolSig","PolSig");
  auto newtree = oldtree->CloneTree();
#else /* 0 or 1 */
  auto newtree = oldtree->CloneTree();
  newtree->SetObject("PolSig","PolSig");
#endif /* 0 or 1 */
  newfile.Write();
  oldfile.ls(); std::cout << std::endl;
  newfile.ls();
  newfile.Close();
}

With your script (changing the old file name) and the file:

mac-135395:master.module pcanal$ root.exe -b -l  hs3.root -e 'ntuple->Print()'
root [0] 
Attaching file hs3.root as _file0...
(TFile *) 0x7fae38409660
******************************************************************************
*Tree    :ntuple    : Demo ntuple                                            *
*Entries :    25001 : Total =          504554 bytes  File  Size =     401247 *
*        :          : Tree compression factor =   1.25                       *

I get:

Processing wile.C...
TFile**		hs3.root	Demo ROOT file with histograms
 TFile*		hs3.root	Demo ROOT file with histograms
  OBJ: TNtuple	ntuple	Demo ntuple : 0 at: 0x7fb4fecd4070
  KEY: TH1F	hpx;1	This is the px distribution
  KEY: TH2F	hpxpy;1	py vs px
  KEY: TProfile	hprof;1	Profile of pz versus px
  KEY: TNtuple	ntuple;2	Demo ntuple [current cycle]
  KEY: TNtuple	ntuple;1	Demo ntuple [backup cycle]

TFile**		hsimple_newfile.root	
 TFile*		hsimple_newfile.root	
  OBJ: TNtuple	PolSig	PolSig : 0 at: 0x7fb4fee1bc80
  KEY: TNtuple	PolSig;1	PolSig

ROOT 6.22/08 on a Ubuntu 20.04 / x86_64, gcc 9.3.0:

[...]$ root -q hsimple_newfile.cxx 

Processing hsimple_newfile.cxx...
TFile**		hsimple.root	Demo ROOT file with histograms
 TFile*		hsimple.root	Demo ROOT file with histograms
  OBJ: TNtuple	ntuple	Demo ntuple : 0 at: 0x562993478980
  KEY: TNtuple	ntuple;2	Demo ntuple
  KEY: TNtuple	ntuple;1	Demo ntuple
  KEY: TH1F	hpx;1	This is the px distribution
  KEY: TH2F	hpxpy;1	py vs px
  KEY: TProfile	hprof;1	Profile of pz versus px

TFile**		hsimple_newfile.root	
 TFile*		hsimple_newfile.root	
  OBJ: TNtuple	PolSig	PolSig : 0 at: 0x562993926b00
  KEY: TNtuple	ntuple;1	Demo ntuple
  KEY: TNtuple	PolSig;1	PolSig

I’ll try 6.22 but it looks like I already fixed it :) … my attempts were with the tip of the main branch.

We have a mystery on our hands … I just tried with v6.22/08 and got the same (good) result I got before …
One noticeable difference between your output and mine (in both master and v6.22/08) is the order of the keys in the input file, you get:

  KEY: TNtuple	ntuple;2	Demo ntuple
  KEY: TNtuple	ntuple;1	Demo ntuple
  KEY: TH1F	hpx;1	This is the px distribution
  KEY: TH2F	hpxpy;1	py vs px
  KEY: TProfile	hprof;1	Profile of pz versus px

while I get:

  KEY: TH1F     hpx;1   This is the px distribution
  KEY: TH2F     hpxpy;1 py vs px
  KEY: TProfile hprof;1 Profile of pz versus px
  KEY: TNtuple  ntuple;2        Demo ntuple
  KEY: TNtuple  ntuple;1        Demo ntuple

So there is something different between your hsimple.root and mine. Could you send me yours?

Mine is created with:

bash: cp hsimple.root hs2.root
bash: root.exe -b -l
root [0] f = new TFile("hs2.root", "UPDATE");
root [1] ntuple->Fill(1.0);
root [2] f->Write();
root [3] delete f;
root [4] .q

For test purposes, I took the “${ROOTSYS}/tutorials/hsimple.C” macro and modified two lines:

   const Int_t kUPDATE = 100000;
   for (Int_t i = 0; i < 2500000; i++) {

Then:

rm -f hsimple.root # not really needed, of course
root -q hsimple.C # the modified macro which (re)creates "hsimple.root"
root -q hsimple_newfile.cxx # my test macro (output in my previous post)

@pcanal Let me know if you cannot reproduce this problem (I will then send you the “hsimple.root” file that I produced).

Depending on the content of the TTree you’re dealing with, the root-cp command from Go-HEP may help you (while the one you get from ROOT/C++ is being fixed).

see:

$> root-cp -h
Usage: root-cp [options] file1.root[:REGEXP] [file2.root[:REGEXP] [...]] out.root

ex:
 $> root-cp f.root out.root
 $> root-cp f1.root f2.root f3.root out.root
 $> root-cp f1.root:hist.* f2.root:h2 out.root

options:

There’s also a root-merge command (under the same location).

$> root-merge -h
Usage: root-merge [options] file1.root [file2.root [file3.root [...]]]

ex:
 $> root-merge -o out.root ./testdata/chain.flat.1.root ./testdata/chain.flat.2.root

options:
  -o string
    	path to merged output ROOT file (default "out.root")
  -v	enable verbose mode

@pcanal I saw a probably similar problem in master in my PR “Testing builtin fast lzma2 in ROOT” (Pull Request #216, root-project/rootbench).

Doesn’t work:

..
auto newtree = oldtree->CloneTree();
newfile->SetCompressionAlgorithm(algo);
newfile->SetCompressionLevel(comp_level);
..
newfile->Write();

Working version:

..
newfile->SetCompressionAlgorithm(algo);
newfile->SetCompressionLevel(comp_level);
auto newtree = oldtree->CloneTree();
..
newfile->Write();

This one worked for me and did not actually change the name of the tree in the original file.

So now I have the first step done, which means that I can take all the input files and produce new ones from them that contain only the properly renamed “nominal” trees, including all cycle numbers.

However, now I need to add all of these trees together, and I noticed that hadd always takes just the highest cycle. I need to combine all of the cycles from all of the files.

So currently I now have 2 files that contain PolSig;1 and PolSig;2.
One file that contains PolBkg;1 and one that contains PolBkg;1 and PolBkg;2.

What I need is one file that contains PolSig and PolBkg that are the sum of the four PolSig’s and three PolBkg’s. I do not care if they are also split into different cycles, but I need all events from the input files to be in the merged one. Is there any reasonable way to do that?

See:

@Wile_E_Coyote with your change to hsimple.C, I can reproduce the behavior …

So, the behavior is exactly as intended. The flow is:

  oldfile.GetObject("ntuple", oldtree);
  ...
  auto newtree = oldtree->CloneTree();

where the request here is “explicitly” to slow-copy a TTree named “ntuple”, which has, in both the original case and the modified hsimple case, more entries than the AutoSave limit. This means that during the execution of CloneTree a cycle for the TTree named “ntuple” is (of course) created.
Calling TTree::SetObject (a better version of SetNameTitle, but with the same result here) will change the name of the live object (we see that in the name of the key/cycle written by TFile::Write) but will not (and should not) retroactively change the name of the existing cycles in the file. So

  KEY: TNtuple	ntuple;1	Demo ntuple
  KEY: TNtuple	PolSig;1	PolSig

depicts accurately what has happened:

  1. a TNtuple named “ntuple” was auto-saved
  2. a TNtuple named “PolSig” was saved.

To modify the behavior you can change the name of the TNtuple *before* cloning it:

     oldtree->SetObject("PolSig","PolSig");
     auto newtree = oldtree->CloneTree();

*or* avoid the call to AutoSave (and save a lot of time! :) ) by using the fast cloning.

  auto newtree = oldtree->CloneTree(-1, "fast");
  newtree->SetObject("PolSig","PolSig");

but still even in that case I would rename the object before cloning.

Cheers,
Philippe.

Yes, this is the intent. For all the automatic tools, the lower cycles are considered backups that are to be ignored (as they contain redundant information).

So currently I now have 2 files that contain PolSig;1 and PolSig;2.

I am confused. Do you really have 2 completely distinct TNtuples with the same name, where PolSig;1 is not a backup/partial copy of PolSig;2? How did you get there? Why?

For example in the case of the extended hsimple.C from Wile, we ended up with a file containing:

TFile**		hs3.root	Demo ROOT file with histograms
 TFile*		hs3.root	Demo ROOT file with histograms
  OBJ: TNtuple	ntuple	Demo ntuple : 0 at: 0x7fb4fecd4070
  KEY: TH1F	hpx;1	This is the px distribution
  KEY: TH2F	hpxpy;1	py vs px
  KEY: TProfile	hprof;1	Profile of pz versus px
  KEY: TNtuple	ntuple;2	Demo ntuple [current cycle]
  KEY: TNtuple	ntuple;1	Demo ntuple [backup cycle]

where ntuple;1 has 1891734 entries and ntuple;2 has/points to the exact same 1891734 entries plus an additional 608266 entries, for a total of 2500000 entries.

In this case, if one were to use both ntuple;1 and ntuple;2, one would use the first 1891734 entries twice, most likely rendering the result wrong.
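
The size of that double counting can be checked with the entry counts quoted above. This is plain shell arithmetic on those numbers, not a ROOT call, and the variable names are only illustrative:

```shell
backup_entries=1891734   # entries reachable from ntuple;1 (backup cycle)
extra_entries=608266     # entries added after the backup cycle was written

# The current cycle (ntuple;2) already covers the backup entries plus the extra ones.
current_entries=$((backup_entries + extra_entries))

# Naively adding both cycles counts the backup entries a second time.
naive_sum=$((backup_entries + current_entries))

echo "current cycle:      $current_entries"
echo "both cycles summed: $naive_sum (overcounts by $backup_entries)"
```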

Unless you are doing something unusual, you should ignore the lowest cycles.

If you are doing something unusual, you have to weigh the advantage you found in doing so against the fact that the automatic tools will not recognize this unusual situation and will ignore the lower cycles (which usually contain duplicated/redundant information).