I want to split a TTree in a ROOT file into different files according to a selection criterion. I know how to do it with CopyTree and a cut, but that means looping over the tree once for each output file. To give an idea, my tree has 1100M entries and I want to split it into 6000 files. Performing 6000 CopyTree calls on a 1100M-entry tree would take ages, so I was hoping there is a way to do the splitting in just one loop.
I have tried a very simple test case but it doesn’t seem to work:
[code]int test() {
  TFile* origin = new TFile("origin.root");
  TFile* dest = new TFile("dest.root", "RECREATE");
  TTree* tree = (TTree*) origin->Get("Pi0-Tuple");
  TTree* newtree = tree->CloneTree(0);
The snippet in your post creates one tree whose branches are spread over several files, so reading one entry of the TTree means reading from ALL those files at the same time. Is that really what you want? It seems that the following code is what you are looking for:
[code]int test() {
  TFile* origin = new TFile("origin.root");
  TTree *tree; origin->GetObject("Pi0-Tuple", tree);
  TFile* downMassFile = new TFile("downMass.root", "RECREATE");
  TTree *downMass = tree->CloneTree(0);
  TFile* upMassFile = new TFile("upMass.root", "RECREATE");
  TTree *upMass = tree->CloneTree(0);
int cutter(TString filename) {
  TFile* origin = new TFile(filename);
  TTree* tree = (TTree*) origin->Get("Pi0-Tuple");
  int cell1, cell2;
  tree->SetBranchAddress("ind1", &cell1);
  tree->SetBranchAddress("ind2", &cell2);
  int n = tree->GetEntries();
  for (int i=0; i<n; i++){
    std::cout << i << std::endl;
    tree->GetEntry(i);
    if (0) {
      continue;
    }
    else if (cell1==1368 || cell2==1368 || cell1==2727 || cell2==2727)
    {
      TFile* newfile = new TFile("1368_2727.root", "UPDATE");
      TTree* newtree = (TTree*) newfile->Get("Pi0-Tuple");
      if (!newtree) newtree = tree->CloneTree(0);
      newtree->Fill();
      newtree->Write();
      newfile->Close();
    }
    else if (cell1==1910 || cell2==1910 || cell1==2185 || cell2==2185)
    {
      TFile* newfile = new TFile("1910_2185.root", "UPDATE");
      TTree* newtree = (TTree*) newfile->Get("Pi0-Tuple");
      if (!newtree) newtree = tree->CloneTree(0);
      newtree->Fill();
      newtree->Write();
      newfile->Close();
    }
    else if (cell1==2735 || cell2==2735 || cell1==1360 || cell2==1360)
    {
      TFile* newfile = new TFile("2735_1360.root", "UPDATE");
      TTree* newtree = (TTree*) newfile->Get("Pi0-Tuple");
      if (!newtree) newtree = tree->CloneTree(0);
      newtree->Fill();
      newtree->Write();
      newfile->Close();
    }
    else
    {
      TFile* newfile = new TFile("2736_1359.root", "UPDATE");
      TTree* newtree = (TTree*) newfile->Get("Pi0-Tuple");
      if (!newtree) newtree = tree->CloneTree(0);
      newtree->Fill();
      newtree->Write();
      newfile->Close();
    }
  }
  return 0;
}
[/code]
But after running over 74k entries I find that each root file has more than one “Pi0-Tuple” with the form “Pi0-Tuple;XX”, and I don’t understand why. I understand that what I am doing here is a little bit hacky, but since the “real” if has 6000 else-ifs and I cannot have that many files open, it’s the only solution I could come up with.
There is something in the way this works I am not getting correctly, can anybody enlighten me?
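As a side note on the 6000 else-ifs: the chain only decides which output file an entry belongs to, so it could in principle be replaced by a table lookup. The following is a minimal sketch in plain C++ (no ROOT calls), assuming the groups can be listed up front; `CellRouter`, `addGroup`, `route` and `makeExampleRouter` are hypothetical names introduced only for illustration. Groups registered first win, which mirrors the order of the branches in `cutter()`.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ROOT): maps a pair of cell indices to the
// name of the output file, replacing a long else-if chain with a table lookup.
class CellRouter {
public:
    explicit CellRouter(std::string fallbackFile)
        : fallback_(std::move(fallbackFile)) {}

    // Register one group. Groups added earlier take priority, like earlier
    // branches of an else-if chain.
    void addGroup(int cellA, int cellB, const std::string& file) {
        assert(!file.empty());
        const std::size_t rank = files_.size();
        files_.push_back(file);
        rankOf_.emplace(cellA, rank);  // emplace keeps an earlier registration
        rankOf_.emplace(cellB, rank);
    }

    // Return the file of the highest-priority group matched by either cell,
    // or the fallback file when neither cell is registered.
    const std::string& route(int cell1, int cell2) const {
        std::size_t best = files_.size();
        for (int cell : {cell1, cell2}) {
            const auto it = rankOf_.find(cell);
            if (it != rankOf_.end() && it->second < best) best = it->second;
        }
        return best < files_.size() ? files_[best] : fallback_;
    }

private:
    std::vector<std::string> files_;               // file name per group
    std::unordered_map<int, std::size_t> rankOf_;  // cell index -> group rank
    std::string fallback_;                         // plays the role of the final else
};

// The four cases from the cutter() function above.
CellRouter makeExampleRouter() {
    CellRouter router("2736_1359.root");
    router.addGroup(1368, 2727, "1368_2727.root");
    router.addGroup(1910, 2185, "1910_2185.root");
    router.addGroup(2735, 1360, "2735_1360.root");
    return router;
}
```

Inside the entry loop, `router.route(cell1, cell2)` would then give the file name to fill for the current entry.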
[quote]I find that each root file has more than one “Pi0-Tuple” with the form “Pi0-Tuple;XX”[/quote]Those are called ‘cycles’ (see the User’s Guide for details) and are backup copies of the TTree metadata. To avoid them you can call newtree->Write("",TObject::kOverwrite) or newtree->AutoSave().
To avoid having to open and close the TTree and TFile all the time (which are expensive operations), consider using something like this:
void fillTTree(const char *filename, TTree *original)
{
  static TList files;
  TFile *input = (TFile*)files.FindObject( filename );
  TTree *newtree;
  if (input == 0) {
    // Check if we have space.
    int alreadyOpened = files.GetEntries();
    if (alreadyOpened > 500) {
      // Close one of the files
      TFile *toclose = (TFile*)files.First();
      files.RemoveFirst();
      toclose->Write("",kOverwrite);
      delete toclose;
    }
    input = TFile::Open(filename,"UPDATE");
    input->GetObject("Pi0-Tuple", newtree);
    if (!newtree) newtree = tree->CloneTree(0);
    else {
      // Reconnect the TTree.
      original->AddClone( newtree );
      original->CopyAddresses( newtree );
    }
  } else {
    // Assumes we already connected the new tree.
    input->GetObject("PiO-Tuple", newtree);
  }
  newtree->Fill();
}
NOTE that your code is missing the lines:
  // Reconnect the TTree.
  original->AddClone( newtree );
  original->CopyAddresses( newtree );
without which the ‘reloaded’ TTree will NOT copy any actual data …
with a slightly modified version of your function:
void fillTTree(const char *filename, TTree *original)
{
  static TList files;
  TFile *input = (TFile*)files.FindObject( filename );
  TTree *newtree;
  if (input == 0) {
    // Check if we have space.
    int alreadyOpened = files.GetEntries();
    if (alreadyOpened > 10) {
      // Close one of the files
      TFile *toclose = (TFile*) files.First();
      files.RemoveFirst();
      toclose->Write("",TObject::kOverwrite);
      delete toclose;
    }
    input = TFile::Open(filename,"UPDATE");
    input->GetObject("Pi0-Tuple", newtree);
    if (!newtree) newtree = original->CloneTree(0);
    else {
      // Reconnect the TTree.
      original->AddClone( newtree );
      original->CopyAddresses( newtree );
    }
  } else {
    // Assumes we already connected the new tree.
    input->GetObject("PiO-Tuple", newtree);
  }
  newtree->Fill();
}
With a limit above 100 I get “Too many open files” errors, and with the value of 10 shown here I get the following errors (the numbers indicate the entry in the loop):
0
1
Error in <TFile::ReadBuffer>: error reading all requested bytes from file 406_3689.root, got 222 of 300
Warning in <TFile::Init>: file 406_3689.root probably not closed, cannot read free segments
Warning in <TFile::Init>: file 406_3689.root has no keys
2
Error in <TFile::ReadBuffer>: error reading all requested bytes from file 406_3689.root, got 222 of 300
Warning in <TFile::Init>: file 406_3689.root probably not closed, cannot read free segments
Warning in <TFile::Init>: file 406_3689.root has no keys
...
110
Error in <TFile::ReadBuffer>: error reading all requested bytes from file 406_3689.root, got 222 of 300
Warning in <TFile::Init>: file 406_3689.root probably not closed, cannot read free segments
Warning in <TFile::Init>: file 406_3689.root has no keys
Error in <TFile::ReadBuffer>: error reading all requested bytes from file 3241_854.root, got 222 of 300
Warning in <TFile::Init>: file 3241_854.root probably not closed, cannot read free segments
Warning in <TFile::Init>: file 3241_854.root has no keys
111
SysError in <TFile::TFile>: file 406_3689.root can not be opened (Too many open files)
I guess that puts my limit on open files at a little over 100, but I don’t know about the other errors. Besides, I thought there wouldn’t be more than 10 files open. Am I leaving some file handlers unhandled?
[quote] Am I leaving some file handlers unhandled? [/quote]Yes, my bad. The code I gave was lacking the essential “files.Add(input);” after opening a new file (hence the TList was always empty):
void fillTTree(const char *filename, TTree *original)
{
  static TList files;
  TFile *input = (TFile*)files.FindObject( filename );
  TTree *newtree;
  if (input == 0) {
    // Check if we have space.
    int alreadyOpened = files.GetEntries();
    if (alreadyOpened > 10) {
      // Close one of the files
      TFile *toclose = (TFile*) files.First();
      files.RemoveFirst();
      toclose->Write("",TObject::kOverwrite);
      delete toclose;
    }
    input = TFile::Open(filename,"UPDATE");
    // Remember the newly opened file so that it can be found and closed later.
    files.Add( input );
    input->GetObject("Pi0-Tuple", newtree);
    if (!newtree) newtree = original->CloneTree(0);
    else {
      // Reconnect the TTree.
      original->AddClone( newtree );
      original->CopyAddresses( newtree );
    }
  } else {
    // Assumes we already connected the new tree.
    input->GetObject("PiO-Tuple", newtree);
  }
  newtree->Fill();
}
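The bookkeeping in fillTTree (look the file up, close the oldest one when the limit is reached, open the new one and remember it) can be tried out without ROOT. The following is a rough sketch of the same pattern using plain `std::ofstream` in append mode; `OutputPool` and its members are hypothetical names, and text files stand in for the TFile/TTree pairs, so this models only the open/close logic and not the tree reconnection.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper (not ROOT): keeps at most `capacity` output files open.
class OutputPool {
public:
    explicit OutputPool(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Append one line to `name`, opening (or reopening) the file if needed.
    void append(const std::string& name, const std::string& line) {
        auto it = open_.find(name);
        if (it == open_.end()) {
            if (open_.size() >= capacity_) closeOldest();
            auto stream = std::make_unique<std::ofstream>(name, std::ios::app);
            order_.push_back(name);  // counterpart of files.Add(input)
            it = open_.emplace(name, std::move(stream)).first;
        }
        *it->second << line << '\n';
    }

    // Flush and close everything that is still open (end-of-job step).
    void closeAll() {
        open_.clear();   // each ofstream destructor flushes and closes its file
        order_.clear();
    }

    std::size_t openCount() const { return open_.size(); }
    std::size_t evictions() const { return evictions_; }

private:
    // Close the file that has been open the longest, like files.First().
    void closeOldest() {
        const std::string oldest = order_.front();
        order_.pop_front();
        open_.erase(oldest);
        ++evictions_;
    }

    std::size_t capacity_;
    std::size_t evictions_ = 0;
    std::list<std::string> order_;  // names in the order they were opened
    std::unordered_map<std::string, std::unique_ptr<std::ofstream>> open_;
};
```

If the `order_.push_back(name)` line is dropped, nothing is ever evicted in a meaningful way, which is the same kind of slip as the missing files.Add(input). A final `closeAll()` is needed so that files still open at the end of the loop are flushed.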
It is almost working, but I get a bus error when input != 0 and it tries to do
I have tried
and then it works. It seems like something is not being accessed properly, but frankly I am completely lost in the way ROOT handles this kind of thing… Is it safe to leave it with my correction? Anyway, I’d like to understand why this fails…
The only normal reason why the first one would fail while the second seems to succeed would be if the object in the file exists but does not inherit from TTree.
However, I just noticed that there is a difference (most likely introduced by my typo) between the names used in the two cases: the first one uses the letter O while the second one uses the digit 0. I suspect that with the correct name the first one would also work.
I have been testing the script, and if a file has not been closed, nothing has been written to it by the time the fillTTree function has finished. I do not understand why; maybe it is because the references are lost?
In order for the metadata (i.e. the TTree object itself) to be written to disk, you need to make sure that myfile->Write(…) is called …
So you need to make sure that all the files that are not yet closed by the end of cutter are finally written out … you can simply do:
TIter fileiter( gROOT->GetListOfFiles() );
TFile *file;
while ( (file = (TFile*) fileiter() ) ) {
  file->Write("",TObject::kOverwrite);
}
(do not delete the file, as that would invalidate the iterator that you are looping over).
Hi Philippe,
sorry for bringing up this old thread again, but something new has come up with this code. I am trying to compile it as standalone code, and now it fails because TTree::AddClone is protected. I have been looking at the TTree documentation and I don’t see any workaround. Any hints?
OK, so if you do need AddClone, use the following (it has to come before the TTree header is included):
#define protected public
as a way to work around the access restriction. We will need to add a new interface and/or make AddClone public in a future release.
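For readers who would rather not redefine a keyword, standard C++ also allows a protected member function to be reached through a pointer to member formed in a derived class. The following is only a sketch on a hypothetical stand-in class `Tree` (it is not ROOT's TTree, and it has not been tried against ROOT here); `TreeAccess` and `callAddClone` are names made up for the illustration.

```cpp
// Hypothetical stand-in for a class with a protected member; not ROOT's TTree.
class Tree {
public:
    int cloneCount() const { return cloneCount_; }
protected:
    void AddClone() { ++cloneCount_; }
private:
    int cloneCount_ = 0;
};

// Helper that derives from Tree only to gain access to the protected name.
struct TreeAccess : Tree {
    static void callAddClone(Tree& tree) {
        // Naming the member through the derived class is permitted here, and
        // the resulting pointer has type void (Tree::*)().
        void (Tree::*fn)() = &TreeAccess::AddClone;
        // Calling through the pointer involves no further access check.
        (tree.*fn)();
    }
};
```

A call such as `TreeAccess::callAddClone(someTree);` then invokes the protected function on an ordinary `Tree` object without touching the class definition.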