TTree as PROOF output

I tried to produce a TTree as PROOF output but failed. This is what I did:


TTree *Test_Tree;

Test_Tree = new TTree("T","ExampleTree");
Test_Tree->Branch("Test_Entry",&test_entry,"Test_Entry/D");
fOutput->Add(Test_Tree);

Test_Tree->Write("",TObject::kOverwrite);

It works in a plain ROOT session, but the branch is empty when I run it in PROOF.

Any ideas? Thanks for the help!

Hi,

The problem probably comes from where in the selector you do the steps you mention.

I suggest the following:

  1. Have Test_Tree as a private member of your TSelector implementation
class MySelector : public TSelector {
private:
   ...
   TTree      *Test_Tree;
   ...
public:
   ...
};
  2. Create the tree and register it in the output list in the SlaveBegin method of the selector:
void MySelector::SlaveBegin(TTree *)
{
   // function called before starting the event loop
   ...
   Test_Tree = new TTree("T","ExampleTree");
   Test_Tree->Branch("Test_Entry",&test_entry,"Test_Entry/D");
   fOutput->Add(Test_Tree); 
   ...
}
  3. The tree is filled in the Process method of your selector
Bool_t MySelector::Process(Long64_t entry)
{
   // entry is the entry number in the current Tree

   ...
   Test_Tree->Fill();
   ...
   return kTRUE;
}
  4. Access the tree locally in the Terminate method of your selector
void MySelector::Terminate()
{
   // function called at the end of the event loop

   Test_Tree = dynamic_cast<TTree*>(fOutput->FindObject("T"));

   if (Test_Tree) {
      // Play with the tree 
      ...
   } else {
      Error("Terminate", "TTree object missing");
   }
}

Hope it helps.

Gerri Ganis

Thanks - that helped!

Hi,

I’m trying to do the same thing, but this answer didn’t help me unfortunately.

I’m doing everything as suggested here (as much as I can tell), but when my job finishes (after TSelector::SlaveTerminate() is called on the worker nodes but before TSelector::Terminate() is called locally) I get a lot of error messages like this:

Warning in <TMessage::CheckObject>: reference to object of unavailable class TObject, offset=975197555 pointer will be 0
Error in <TExMap::Remove>: key 64 not found at 65
Warning in <TMessage::CheckObject>: reference to object of unavailable class TObject, offset=64 pointer will be 0
Error in <TExMap::Remove>: key 31232 not found at 47

If I try to process all my events (12k in my test) then the local job simply crashes after these messages. If I only process 100 events, then the job finishes after printing all these messages. But if I try to access one of the branches of the output tree (which all seem to exist in the TTree returned by TProof) ROOT crashes violently under me.

Could someone enlighten me about the memory management of TTree-s? How is one supposed to create reasonably large TTree-s (say a few hundred MB) in a PROOF job? What if the worker node has problems keeping the TTree in memory?

If anyone is willing to look at my problem, I’m happy to give more details, as in some places I’m doing things in a non-standard fashion. (Which works perfectly when running locally, but apparently not with PROOF.)

Cheers,
Attila

Dear Attila,

Does your code run on PROOF with a small number of events?

If NO then we need more details about what you are really trying to do, with possibly the essential extracts of your code.

If YES then you are probably hitting a memory problem. PROOF was not originally designed to produce big outputs.
For the ALICE collaboration we have designed a solution which uses temporary files; this is still somewhat experimental, but I can try to explain to you how to use it.

Gerri Ganis

Dear Gerri,

I’m trying to write the foundations of a framework that would make it a bit easier to handle flat ntuples as input and output of a general job running on PROOF. I’m attaching the code that I’m using right now.

The code tries to do very simple things, but does these in a quite complicated manner because I want to preserve generality. I have a base class in the code called CycleBase that provides the virtual functions that TSelector needs, provides a new set of virtual functions that the user should implement, and also provides some convenience functions that the user can use. There is also a test class in the package called TestCycle. This tries to use some of the basic functionalities of CycleBase by reading in a few variables from an input file, filling two 1-dimensional histograms and filling an output TTree with two relatively simple branches.

I add the new branches using the CycleBase::DeclareVariable(…) function. I created this function earlier for another project (which can be found under: atlas-sw.cern.ch/cgi-bin/viewcvs … me/SFrame/), and it works very well for handling output variables when running locally.

To compile this example, I execute “make && make par”. Afterwards I (try to) run the example with the “macro/proof_test.C” macro. (If you’d like to run it, you can find the input file under /afs/cern.ch/atlas/maxidisk/d181/SFrame.)

I think I’m doing something wrong when creating the tree, adding the output variables or when filling the tree. (The code for these can be found in the CycleBase.icc and CycleBase.cxx files.) I don’t think I should be hitting memory problems running over 100 events. The code is already complicated, for which I’m sorry. But if you can help me get it working, I’d be most grateful.

Later on I’d be interested in the ALICE approach to producing large outputs as well, so let’s remember that. :)

Cheers,
Attila
pframe.tar.gz (7.98 KB)

Hi,

Just to give an update: after playing some more with the code, I’m starting to get the idea that it’s not me making the mistake after all…

Running the same selector locally on a TChain produces the output TTree just as I intended. But if I run the selector on PROOF, even if I don’t add any branches to my output TTree, I get these error messages. I thought maybe the problem was with how I added the branches to the TTree, but since the errors are there without adding any branches to it, now I’m beginning to think the problem is somewhere deeper.

I still have a few ideas that I can try, will get back if I find out more.

Cheers,
Attila

Oh boy…

Turns out it was indeed ROOT/PROOF that caused the problems. I won’t bore you with the details, but it seems ROOT version 5.18 has some incompatibilities with 5.20.

In my tests I was sending a job from my MacBook running ROOT 5.20 to a PC running SLC4 and ROOT 5.18. Now that I changed the ROOT version on the PC to 5.20 as well, my job suddenly started working. No warnings, no errors, nothing.

Too bad, as I was quite impressed that I could send jobs from Mac OS X to Linux running two different versions of ROOT. Being able to send jobs from Mac OS X to Linux with the same version of ROOT is still pretty nice, don’t get me wrong, but the PROOF TWiki didn’t warn me about such possible incompatibilities.
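
A mismatch like this can be caught before submitting a job by comparing the version strings of the client and the workers. The following is only a sketch: ParseRootVersion and SameRootRelease are hypothetical helpers, not part of ROOT, and they assume version strings of the form "5.20/00". In a real session the strings would have to be obtained from ROOT itself on each machine (for example from gROOT->GetVersion()).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type (not part of ROOT): a parsed "major.minor/patch" string.
struct RootVersion {
   int  fMajor = 0;
   int  fMinor = 0;
   int  fPatch = 0;
   bool fValid = false;
};

// Parse a version string such as "5.20/00". The format is an assumption.
RootVersion ParseRootVersion(const std::string &text)
{
   RootVersion v;
   v.fValid = std::sscanf(text.c_str(), "%d.%d/%d",
                          &v.fMajor, &v.fMinor, &v.fPatch) == 3;
   return v;
}

// True only if both strings parse and agree on the major and minor number.
bool SameRootRelease(const std::string &client, const std::string &worker)
{
   const RootVersion a = ParseRootVersion(client);
   const RootVersion b = ParseRootVersion(worker);
   return a.fValid && b.fValid &&
          a.fMajor == b.fMajor && a.fMinor == b.fMinor;
}
```

With the versions from this thread, SameRootRelease("5.20/00", "5.18/00") returns false, so a check like this would have flagged the setup before any job was sent.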

So the minimum requirement is satisfied. Now, how about that ALICE support for large outputs? :)

Cheers,
Attila

Dear Attila,

Sorry for the late reply to this thread.

When you say that using 5.20 everywhere the problems are gone, do you mean that your examples based on PFrame work fine now?

As for the incompatibilities, could you give some details? This may be useful while debugging other problems. We try to be backward compatible, but that is a very complicated task when working with code still evolving in functionality.

I have added a first description and an example of the file-merging functionality to the wiki pages (http://root.cern.ch/twiki/bin/view/ROOT/ProofMergeFiles). We plan to solve the memory problem in a more dynamic way, without files on the workers, but the interface will probably not be very different.

Sorry again for the slowness in replying. As Anna said, this is a peculiar period because of absences. Things should be back to normal by the end of the month.

Gerri

Dear all,
I noticed a strange feature. In my PROOF macro I store a TTree along with some histograms in two different output files.
I followed this recipe:

[quote=“ganis”]
I suggest the following:

  1. Have Test_Tree as a private member of your TSelector implementation
class MySelector : public TSelector {
private:
   ...
   TTree      *Test_Tree;
   ...
public:
   ...
};
  2. Create the tree and register it in the output list in the SlaveBegin method of the selector:
void MySelector::SlaveBegin(TTree *)
{
   // function called before starting the event loop
   ...
   Test_Tree = new TTree("T","ExampleTree");
   Test_Tree->Branch("Test_Entry",&test_entry,"Test_Entry/D");
   fOutput->Add(Test_Tree); 
   ...
}
  3. The tree is filled in the Process method of your selector
Bool_t MySelector::Process(Long64_t entry)
{
   // entry is the entry number in the current Tree

   ...
   Test_Tree->Fill();

Gerri Ganis[/quote]

but instead of adding the tree to the output list I called
Test_Tree->Write();
in the SlaveTerminate method, along with the histograms I want to store in my output files.
The result is that the TTree branches in the first output file are filled properly, while the same TTree branches in the second output file always contain the value 0.

Do you have any suggestion on how I can get the second file TTree filled properly?

Thank you,
Leonardo

Dear Leonardo,

I guess the problem comes from the fact that after the first Write() the TTree buffers are empty.

Do you really need to write the same tree in different files?
Also, note that doing things in SlaveTerminate will create a copy of your two files for each worker.

G Ganis

Dear Ganis,
thank you for your quick reply.

[quote=“ganis”]
Do you really need to write the same tree in different files?
G Ganis[/quote]
I can write the tree in one file only, it is just more convenient for me to have it in two files that contain different histograms.

[quote=“ganis”]
Also, note that doing things in SlaveTerminate will create a copy of your two files for each worker.
G Ganis[/quote]
Where should I call the Write method then?

Leonardo

Hi Leonardo,

[quote=“cristella”]I can write the tree in one file only, it is just more convenient for me to have it in two files that contain different histograms.[/quote]

Then it seems that doing the Write (you actually should call it on the file rather than the TTree) in SlaveTerminate and not adding the TTree into the output list should have done the trick.

What was the actual code you used?

Cheers,
Philippe.

Hi,

[quote=“cristella”]it is just more convenient for me to have it in two files that contain different histograms.
[/quote]

I am not sure I understand:
do you want the same whole tree in all files? Or the tree split across several files?

The solution may also depend on the size of the tree …

G Ganis

Hi Ganis,
I want the same whole tree in all files.
So far my tree contains about ten Float_t branches with O(10K) events each.
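
For scale, the uncompressed payload of such a tree is small. A minimal sketch of the arithmetic, assuming exactly 10 branches, 10,000 entries and 4-byte Float_t values, and ignoring ROOT's header and basket overhead as well as compression (FlatTreeBytes is a hypothetical helper, not a ROOT function):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rough uncompressed payload of a flat ntuple in bytes.
// Ignores ROOT headers, basket overhead and compression.
constexpr std::size_t FlatTreeBytes(std::size_t nBranches,
                                    std::size_t nEntries,
                                    std::size_t bytesPerValue)
{
   return nBranches * nEntries * bytesPerValue;
}

// Assumed numbers based on the post: ten Float_t branches, 10,000 entries.
constexpr std::size_t kApproxBytes = FlatTreeBytes(10, 10000, sizeof(float));
```

Under these assumptions that is about 400 kB per copy of the tree.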

Leonardo

Hi Philippe,

I just followed the same steps I use when writing histograms.

In my header file:

class psiPrimePiK_MC : public TSelector {
public :
   TProofOutputFile *OutFile_1, *OutFile_2; 
   TFile            *fOut_1, *fOut_2 ;
   
   Float_t my_branch;
   TTree *my_tree;
};

and in my .C file:

void psiPrimePiK_MC::SlaveBegin(TTree * /*tree*/) {
   OutFile_1 = new TProofOutputFile( "myFile_1.root" );
   OutFile_2 = new TProofOutputFile( "myFile_2.root" );
   fOut_1 = OutFile_1->OpenFile("RECREATE");
   fOut_2 = OutFile_2->OpenFile("RECREATE");

   my_tree = new TTree("mva_variable_all","MVA variables");
   my_tree->Branch("my_branch", &my_branch, "my_branch/F" );
   ...
}

Bool_t psiPrimePiK_MC::Process(Long64_t entry) {
   ...
   my_branch = 3;
   my_tree->Fill();
   ...
}

void psiPrimePiK_MC::SlaveTerminate() {
   fOut_1->cd();
   my_tree->Write();
   
   fOut_2->cd();
   my_tree->Write();
}

but the tree in myFile_2.root is not filled properly.

Thank you,
Leonardo

Dear Leonardo,

Sorry for the late reply. In that case you should work with the tree in memory:

void MySelector::SlaveBegin(TTree *)
{
   // function called before starting the event loop
   ...
   Test_Tree = new TTree("T","ExampleTree");
   Test_Tree->Branch("Test_Entry",&test_entry,"Test_Entry/D");
   fOutput->Add(Test_Tree); 
   ...
}

and then in Terminate write it to the two files:

void MySelector::Terminate()
{
   // function called in the client at the end ...
   ...
   // look the tree up by its name ("T"), not by the branch name
   Test_Tree = (TTree *) fOutput->FindObject("T");
   if (Test_Tree) {
      TFile *f = TFile::Open("MyNewFileOne.root", "RECREATE");
      f->WriteObject(Test_Tree, "TestEntry");
      delete f;
      f = TFile::Open("MyNewFileTwo.root", "RECREATE");
      f->WriteObject(Test_Tree, "TestEntry");
      delete f;
   }
   ...
}

If this does not work please provide the simplest example allowing to reproduce the problem.

G Ganis