Std vector pointer branch and memory leak

wiso · January 4, 2011, 9:42am

Suppose I have a TFile containing a TTree with some vector branches, and I want to read it, for example:

   vector<float>   *truth_ph_pt;
   vector<float>   *truth_ph_E;
   TBranch        *b_truth_ph_pt;   //!
   TBranch        *b_truth_ph_E;   //!
   truth_ph_pt = 0;
   truth_ph_E = 0;
   fChain->SetBranchAddress("truth_ph_pt", &truth_ph_pt, &b_truth_ph_pt);
   fChain->SetBranchAddress("truth_ph_E", &truth_ph_E, &b_truth_ph_E);

For every event I call GetEntry(i). My question is: what happens to the vector objects allocated for the previous events? For example, when the loop loads event 0 with GetEntry(0), it loads the vectors from the TFile into memory, right? But when the first iteration has finished, is the memory used by the two vectors in the example freed? Or do I need to manually call something like truth_ph_pt->clear() (or delete truth_ph_pt)?

My problem is that I have a big TChain with a lot of vector branches, and my PROOF session crashes when the memory used reaches the maximum. I do not know why it needs so much memory; on a Ganglia monitor plot, the memory profile (RAM + swap) grows like sqrt(time).

pcanal · January 4, 2011, 2:57pm

Hi,

Besides the fact that you must initialize the vector pointers to zero:

   vector<float>   *truth_ph_pt = 0;
   vector<float>   *truth_ph_E = 0;
   TBranch        *b_truth_ph_pt;   //!
   TBranch        *b_truth_ph_E;   //!
   truth_ph_pt = 0;
   truth_ph_E = 0;
   fChain->SetBranchAddress("truth_ph_pt", &truth_ph_pt, &b_truth_ph_pt);
   fChain->SetBranchAddress("truth_ph_E", &truth_ph_E, &b_truth_ph_E);

setting the branch address means that you take ownership of the objects and that you must delete them every time you call SetBranchAddress or at the end of the processing (after the loop).

If you set the branch addresses before the loop, then the memory for the vector will be reused for each entry, and you need to call neither clear nor delete inside the loop (and no memory leak should happen).

Cheers,
Philippe.

PS. Which version of ROOT are you using? Do you have the same problem when you run within Proof?

wiso · January 4, 2011, 3:28pm

Thanks for the answer. I’m using ROOT 5.28 and I see this problem with PROOF (I have not checked without it).

I want to be sure about your answer. The point is that I’m not using vectors, but pointers to vectors. Normally, if a pointer to a heap-allocated vector goes out of scope without the vector being deleted, you have a memory leak:

{
  std::vector<double> *v = new std::vector<double>();
  v->push_back(2.1);
  // delete v;
}
// memory leak: v went out of scope without delete

A question is: is this a memory leak:

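#include <numeric>  // std::accumulate
#include <vector>

// get_vector and nevent are placeholders defined elsewhere
int total = 0;
std::vector<int> *v = 0; // declare it before the loop
for (int ievent = 0; ievent != nevent; ++ievent)
{
    v = get_vector(ievent);
    total += std::accumulate(v->begin(), v->end(), 0);
    // delete v;
    // memory leak?
}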

? As you said, if the vector is declared outside the loop it is not, because, as cplusplus.com/ says about operator=, “the elements contained in the vector object before the call are dropped, and replaced by copies of those in vector x, if any.” Does the branch mechanism use operator=?

I don’t know how vectors are implemented, but I imagine that std::vector has a private member like T *data; and I want to be sure that the data memory is freed after each step of the main loop. The question is: if a branch is responsible for a pointer to a vector, is it also responsible for the vector itself (like a smart pointer)?

pcanal · January 4, 2011, 3:46pm

[quote]A question is: is this a memory leak:[/quote]It depends on the content of ‘get_vector’ …
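
As a minimal sketch of why the answer depends on get_vector, here are two hypothetical implementations. The names get_vector_fresh, get_vector_shared and sum_first_elements are invented for illustration and come neither from ROOT nor from the original post. The first variant hands ownership to the caller, so a missing delete leaks on every iteration; the second returns the address of a reused buffer, so nothing leaks and the caller must not delete.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant 1: allocates a new vector on every call.
// The caller owns the result and must delete it, otherwise each call leaks.
std::vector<int> *get_vector_fresh(int ievent)
{
    return new std::vector<int>(3, ievent);  // three copies of ievent
}

// Hypothetical variant 2: refills one long-lived buffer and returns its address.
// The caller must NOT delete the result; nothing leaks across calls.
std::vector<int> *get_vector_shared(int ievent)
{
    static std::vector<int> buffer;
    buffer.assign(3, ievent);
    return &buffer;
}

// Sums the first element of each event's vector using the owning variant,
// deleting each vector once it has been used.
int sum_first_elements(int nevent)
{
    int total = 0;
    for (int ievent = 0; ievent != nevent; ++ievent) {
        std::vector<int> *v = get_vector_fresh(ievent);
        assert(!v->empty());
        total += v->front();
        delete v;  // required for variant 1; would be a bug for variant 2
    }
    return total;
}
```

With the first variant, commenting out the delete leaks one vector per event. With the second, the same pointer comes back on every call, which is closer to how a branch with a fixed address behaves.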

[quote]Does the branch mechanism use operator= ?[/quote]No. Instead, the branch looks at the value of the pointer: if the pointer is null, it creates the vector (doing the equivalent of new vector…); if the pointer is not null, it uses the existing vector, calls v->resize(), and then inserts the proper values directly in the right place in the reserved memory.
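
The behaviour described above can be mimicked in plain C++. This is only a toy stand-in, not ROOT's actual implementation: toy_read_entry is a made-up name, user_ptr plays the role of the pointer whose address was given to SetBranchAddress, and stored plays the role of the values read from the file for one entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for one branch read (not ROOT code).
void toy_read_entry(std::vector<float> *&user_ptr, const std::vector<float> &stored)
{
    if (user_ptr == 0) {
        user_ptr = new std::vector<float>();  // first read: create the vector
    }
    user_ptr->resize(stored.size());          // later reads: reuse the same object
    for (std::size_t i = 0; i < stored.size(); ++i) {
        (*user_ptr)[i] = stored[i];           // write the values in place
    }
    assert(user_ptr->size() == stored.size());
}
```

Calling it repeatedly with the same pointer leaves the pointer value unchanged after the first call; only the contents of the vector change, and its capacity does not shrink when a later entry is smaller.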

[quote]and I want to be sure that data memory is free after each step of the main loop[/quote]The memory is intentionally not freed after each iteration, as doing so is usually just a waste of time and a source of memory fragmentation. Instead, we make every possible attempt to ‘reuse’ the allocated memory.

[quote]The question is: if a branch has the responsibility to a pointer to a vector is it responsible also to the vector (as a smart pointer)?[/quote]Even though the branch will make every possible attempt to reuse the allocated memory, you still explicitly own the vector and should ultimately delete it (and IF you delete it before the TTree object is deleted, you must also set the vector pointer to zero so that the TTree knows the vector object no longer exists).
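
To make the ownership rule concrete, here is a small sketch in plain C++. ToyBranch and release are hypothetical names, not ROOT classes or functions; the toy class only assumes that the branch keeps the address of the user's pointer, as happens when &truth_ph_pt is passed to SetBranchAddress.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a branch (not a ROOT class): it remembers the ADDRESS of
// the user's pointer and never deletes the vector itself.
class ToyBranch {
public:
    explicit ToyBranch(std::vector<float> **addr) : addr_(addr) {}

    // Fills the user's vector with n copies of value, creating it if needed.
    void Read(std::size_t n, float value)
    {
        assert(addr_ != 0);
        if (*addr_ == 0) {
            *addr_ = new std::vector<float>();
        }
        (*addr_)->assign(n, value);
    }

private:
    std::vector<float> **addr_;  // not owned: the user deletes the vector
};

// User-side cleanup: delete the vector AND reset the pointer, so that a branch
// which is still alive does not later write through a dangling pointer.
void release(std::vector<float> *&user_ptr)
{
    delete user_ptr;
    user_ptr = 0;
}
```

If release only deleted the vector without zeroing the pointer, the next Read would write into freed memory; with the pointer reset, the toy branch simply allocates a fresh vector.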

Cheers,
Philippe.