TTree::GetEntries(), then problems occur

I have a long piece of code which performs discriminant analysis on the data from multiple trees. I’ve isolated a weird effect. When I call TTree::GetEntries() on one tree, my custom C++ object goes bad. What is it about calling TTree::GetEntries() that could cause this? (ROOT version 5.08)

Here’s a snippet of the code and the results.

Thanks

				printf("bg events %d) signal PDE has %d events . . .\n", k, signalPDE->GetNumberPoints() );
				bgTrees[i]->GetEntry( k );
				printf("bg events %d) signal PDE has %d events after bgTrees[i]->GetEntry( k ); . . .\n", 
					k, signalPDE->GetNumberPoints() );

bg events 0) signal PDE has 704 events . . .
bg events 0) signal PDE has -2131052265 events after bgTrees[i]->GetEntry( k ); . . .

I assume that you mean TTree::GetEntry (not GetEntries)

GetEntry stores the info from the Tree branches into your object.
A possible cause of problems is an error in your destructors
or an uninitialized class member.

Try to run with valgrind.

Rene

Thanks for the valgrind suggestion. Pretty amazing tool.

I haven’t been able to fix this error on TTree::GetEntry(). How can I be getting a write error when I call GetEntry()?

==21926== Invalid write of size 1
==21926== at 0x1C7E27F4: TBuffer::operator>>(double&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libCore.so)
==21926== by 0x1BCD2524: TLeafD::ReadBasket(TBuffer&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==21926== by 0x1BCAA978: TBranch::ReadLeaves(TBuffer&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==21926== by 0x1BCA9646: TBranch::GetEntry(long long, int) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==21926== by 0x1BCE262E: TTree::GetEntry(long long, int) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==21926== by 0x805322B: pde_tools::looPDE::CrossValidateBackground(TTree&, std::vector<TTree*, std::allocator<TTree> >&, std::valarray<double>, std::string) (looPDE.cpp:298)
==21926== by 0x805181C: pde_tools::looPDE::TestSignalPoints() (looPDE.cpp:146)
==21926== by 0x8050982: pde_tools::looPDE::looPDE(std::string, std::string) (looPDE.cpp:107)
==21926== by 0x806BDC3: main (runPDE.cpp:33)

I also created a smaller test application, but got the same error:

==11502== Invalid write of size 1
==11502== at 0x1C7E27B9: TBuffer::operator>>(double&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libCore.so)
==11502== by 0x1BCD2524: TLeafD::ReadBasket(TBuffer&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==11502== by 0x1BCAA978: TBranch::ReadLeaves(TBuffer&) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==11502== by 0x1BCA9646: TBranch::GetEntry(long long, int) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==11502== by 0x1BCE262E: TTree::GetEntry(long long, int) (in /D0/usr/products/root/Linux-2-4/v4_04_02b_fbits_eh-GCC_3_4_3-opt/lib/libTree.so)
==11502== by 0x805F763: pde_tools::FileUtils::ReadRootTreeIntoTMatrixD(TTree*, TMatrixD*, char const*) (FileUtils.cpp:218)
==11502== by 0x806C5B8: testKDE() (testKDE.cpp:169)
==11502== by 0x806CBF1: main (testKDE.cpp:219)

Hi,

a valgrind write error means that ROOT is trying to write a value into memory where it shouldn’t. That agrees with Rene’s assumption. Now you know what the problem is, but you don’t know yet how to fix it.

In connection with TTree::GetEntry this error usually means that you have supplied an invalid branch address to the TTree, or that the buffer at the branch address is too small to hold all the data in one of the entry’s branches.
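To make the "buffer too small" case concrete: a leaf list such as y0/D:y1/D:y2/D describes three doubles per entry, so the buffer handed to the branch has to hold at least three doubles. Below is a minimal sketch in plain C++ that makes no ROOT calls. The helper names countLeaves and bufferIsLargeEnough are hypothetical, and the sketch assumes every leaf is a scalar of type /D.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the colon-separated leaves in a leaf list
// such as "y0/D:y1/D:y2/D". Assumes there are no array leaves ("x[3]/D").
std::size_t countLeaves(const std::string& leafList) {
    if (leafList.empty()) return 0;
    std::size_t count = 1;
    for (char c : leafList) {
        if (c == ':') ++count;
    }
    return count;
}

// Hypothetical check: true if a buffer of bufferLength doubles can hold
// one entry of a branch described by leafList (all leaves assumed /D).
bool bufferIsLargeEnough(const std::string& leafList, std::size_t bufferLength) {
    return bufferLength >= countLeaves(leafList);
}
```

A check like bufferIsLargeEnough(leavesString, numberOfVariables) just before creating the branch would catch a mismatch between the leaf list and the allocated array. It cannot detect a buffer that has already been freed.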

We need your code (or an example, that we can compile and run, and that shows your error) and your data file. Or at least your code and the output of TTree::Print().

Axel.

Hi Axel,

I made a simplified version of the code and attached here. I think I’ve identified the problem but I don’t fully understand it.

Here’s my erroneous understanding:
My function adds a branch to a tree and fills it using a loop and dynamically allocated array of double values called leaves. Each time through the loop, new values are copied to the leaves array. I call the branch’s fill method which causes the values to be copied to the tree. Once the loop has finished, I need to delete the array or it will be a small memory leak. So I delete it.

But deleting the array actually causes a crash. If I don’t delete, the program runs fine though according to valgrind, there are some memory leaks.

What’s the correct understanding of what’s happening? Should I be allocating the memory each time through the loop?

Thank you,
Dennis

//C++ includes
#include <string>
#include <vector>
#include <valarray>
#include <iostream>
#include <cerrno>
#include <cmath>
#include <cassert>
#include <cstdio>

//Root includes
#include "TBrowser.h"
#include "TCanvas.h"
#include "TFile.h"
#include "TH1D.h"
#include "TH1F.h"
#include "TLeaf.h"
#include "TTree.h"
#include "TMatrixD.h"
#include "TGraph.h"
#include "TPad.h"
#include "TLatex.h"
#include "TRandom.h"
#include "TMatrixD.h"
#include "TCanvas.h"
#include "TH2D.h"
#include "TMatrixDEigen.h"
#include "TRandom.h"
#include "TSystem.h"
#include "TStopwatch.h"

//KDE Includes
//#include "../include/KDE.h"
//#include "../include/FileUtils.h"
#include "../include/StringUtils.h"

using namespace std;
//using namespace pde_tools;

int split(const string& original, const string& delimiter, vector<string>& results)
{
	int numFound = 0;
	int currentPos=0, findPos=0;
	results.clear();
	findPos = original.find( delimiter, 0);
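	while( 0 <= findPos){
		// (the forum software garbled this loop; the assert, the if-condition and
		// the position updates are reconstructed from context)
		assert( currentPos <= (int)original.size() );
		if( findPos - currentPos >= 1){ //make sure the token value is not empty
			numFound++;
			results.push_back( original.substr(currentPos, findPos - currentPos) );
		}
		//cout<<"token("<<currentPos<<", "<<findPos<<"): "<< original.substr(currentPos, findPos - currentPos) <<endl;
		//cout<<"original.size()="<<original.size()<<endl;
		currentPos = findPos + (int)delimiter.size();
		findPos = original.find( delimiter, currentPos );
	}
	if( (int)original.size() - currentPos >= 1){ //add the remainder of the original to the output
		//cout<<"token("<<currentPos<<", "<<findPos<<"): "<< original.substr(currentPos) <<endl;
		results.push_back( original.substr(currentPos) );
		numFound++;
	}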

	return numFound;
}//end of split


/////////////////////////////////////////////////////////////////////
void ReadRootTreeIntoTMatrixD(TTree *dataTree, TMatrixD *dataMatrix, const char *variableList){
/////////////////////////////////////////////////////////////////////
/// Extracts data from tree, applies cuts, and copies the values into
/// the TMatrixD object. TMatrixD objects are the data structure used
/// by PDE so this function is useful in converting data from the 
/// the commonly used storage format, TTree, to format used for PDE 
/// processing
/////////////////////////////////////////////////////////////////////
///
/// @param dataTree Tree containing the data to be extracted
/// @param mat TMatrixD object to hold the data from the tree
/// @returns TCut object with all of the cuts for this extractor
/// @bug The cuts should not be hard coded
/////////////////////////////////////////////////////////////////////	
	cout<<"Copying tree "<<dataTree->GetName()<<" to TMatrixD . . ."<<endl;
	vector<string> variablesVec;
	string variableString(variableList);
	Int_t numberVars = split(variableString, ":", variablesVec);
	assert( numberVars == (Int_t)variablesVec.size() );
	printf("%d variables in variableVec\n", (int)variablesVec.size() );
									
	int rows = dataTree->GetEntries();
	int columns = variablesVec.size();
	dataMatrix->Clear();
	dataMatrix->ResizeTo( rows, columns );
	
	for(int i=0; i < rows; i++){
		// (the forum software garbled this loop; reconstructed from context)
		dataTree->GetEntry(i);
		for(int j=0; j< columns; j++){
			//cout<<"getting "<<variablesVec[j].c_str()<<" . . ."<<endl;
			(*dataMatrix)(i,j) = dataTree->GetLeaf( variablesVec[j].c_str() )->GetValue( );
			//cout<<dataTree->GetLeaf( variablesVec[j].c_str() )->GetValue(0) << endl;
		}
	}
	
	cout << "Built TMatrixD: (" <<dataMatrix->GetNrows() << "," <<dataMatrix->GetNcols() << ") matrix"<<endl;
} //end of ReadRootTreeIntoTMatrixD	
//////////////////////////////////////////////////////////////////////////////////////////////

///////////////////////////////////////////////////////////////////////////////
/// GetListOfVariables
///////////////////////////////////////////////////////////////////////////////
/// returns a : delimited list of variables y0:y1:. . .:y[numberOfVariables-1]
///////////////////////////////////////////////////////////////////////////////
void GetListOfVariables(int numberOfVariables, string &variables){

	char variablesArray[1000], buffer[1000];
	variablesArray[0] = '\0';
	string delimiter("");
	for(int j=0; j<numberOfVariables; ++j){
		sprintf(&buffer[0], "%s%sy%d", variablesArray, delimiter.c_str(), j);
		sprintf(&variablesArray[0], "%s", buffer);
		delimiter = ":";
	}

	variables = variablesArray;
}


///////////////////////////////////////////////////////////////////////////////
/// GetLeavesString
///////////////////////////////////////////////////////////////////////////////
/// returns string formatted to create ROOT branch numberOfVariables leaves.
///////////////////////////////////////////////////////////////////////////////
void GetLeavesString(int numberOfVariables, string &leaves){

	char leavesString[1000], buffer[1000];
	leavesString[0] = '\0';
	string delimiter("");
	for(int j=0; j<numberOfVariables; ++j){
		sprintf(&buffer[0], "%s%sy%d/D", leavesString, delimiter.c_str(), j);
		sprintf(&leavesString[0], "%s", buffer);
		delimiter = ":";
	}
	leaves = leavesString;
}


///////////////////////////////////////////////////////////////////////////////
/// GetGaussianTree
///////////////////////////////////////////////////////////////////////////////
/// populates tree with numberOfEvents events,  numberOfVariables, 
/// randomly distributed as Gaussian(mean, std). The random seed can be specified
/// by variable randomSeed.
///////////////////////////////////////////////////////////////////////////////
void GetGaussianTree(TTree &tree, int numberOfVariables, double numberOfEvents, double mean, double std, int randomSeed=12 ){

	printf("Filling tree %s with Gaussian data.\n", tree.GetName() );
	TRandom rand(randomSeed);
	Double_t *leaves = new Double_t[numberOfVariables];
	string leavesString;
	GetLeavesString(numberOfVariables, leavesString);
	printf("%s\n", leavesString.c_str());

	TBranch *branch = tree.Branch("TestPDE", leaves, leavesString.c_str() );

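	for(int i=0; i < numberOfEvents; ++i){
		// (the forum software garbled this loop; reconstructed from context)
		for(int j=0; j<numberOfVariables; ++j){
			leaves[j] = rand.Gaus(mean, std);
		}
		branch->Fill( );
		tree.Fill( );
	}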
	
	//delete[] leaves;
}

int memError(){

	static const Int_t numberOfTrainingEvents = 3000;
	static const Int_t numberOfTestEvents = 30;
	static const Int_t numberOfVariables = 100;

	cout<<"Testing KDE . . ."<<endl;

	TTree *signalTree = new TTree("signalTree","signal");
	TTree *bgTree = new TTree("bgTree", "background ");
	TTree *testTree = new TTree("dataTree", "test events");
	TMatrixD *signalData = new TMatrixD(numberOfTrainingEvents, numberOfVariables);
	TMatrixD *bgData = new TMatrixD(numberOfTrainingEvents, numberOfVariables);
	TMatrixD *testData = new TMatrixD(numberOfTestEvents , numberOfVariables);

	GetGaussianTree(*signalTree, numberOfVariables, numberOfTrainingEvents, 0.2, 1);
	GetGaussianTree(*bgTree, numberOfVariables, numberOfTrainingEvents, 0, 1);
	GetGaussianTree(*testTree, numberOfVariables, numberOfTestEvents, -0.1, 1);

	//Copy trees into TMatrixD's for input to KDE
	string variableList;
	GetListOfVariables(numberOfVariables, variableList);
	cout<<"vars: "<<variableList<<endl;
	ReadRootTreeIntoTMatrixD( signalTree, signalData, variableList.c_str() );
	cout << "Copied " <<signalTree->GetName()<<endl;
	ReadRootTreeIntoTMatrixD( bgTree, bgData, variableList.c_str() );
	cout << "Copied " <<bgTree->GetName()<<endl;
	ReadRootTreeIntoTMatrixD( testTree, testData, variableList.c_str() );
	cout << "Copied " <<testTree->GetName()<<endl;

	// (the forum software dropped the text between these two statements)
	signalData->Delete();
	bgData->Delete();
	testData->Delete();

	testTree->Delete();
	signalTree->Delete();
	bgTree->Delete();
	 
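	cout<<"\ntest complete."<<endl;
	return 0;
}

int main(){
	// (the forum software garbled the start of main(); reconstructed from context)
	gSystem->Load("libTree");
	memError();
}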

Hi,

could you please attach this file? The phpBB2 code posting facility currently corrupts code (see the event fill loop and the GetEntry() call, exactly where it gets interesting). Sorry about that.

And no, you should not allocate new memory each time you fill the tree.

Axel.

No problem. Here it is.

Thanks for looking at it.
memError.cpp (7.74 KB)

You have

Double_t *leaves = new Double_t[numberOfVariables];
TBranch *branch = tree.Branch("TestPDE", leaves, leavesString.c_str() );
delete[] leaves;

and you never tell the tree that the memory has been deleted (it cannot guess this) AND you re-use this tree later … which consequently uses the deleted memory!

One quick solution is to add:

tree.ResetBranchAddresses();

where the delete happens.
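The ordering matters: first tell the tree to forget the buffer, then free it. The sketch below models that with a toy class in plain C++. ToyTree, fillToyTree and laterReadIsSafe are hypothetical names, not ROOT's API. The toy simply skips the copy when no address is registered, and does not try to model what ROOT does internally after a reset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tree with a single branch of doubles. This is NOT
// ROOT's TTree; the class and method names are hypothetical.
class ToyTree {
public:
    void SetAddress(double* address, std::size_t count) {
        address_ = address;
        count_ = count;
    }

    // Forget the user buffer (the role ResetBranchAddresses plays above).
    void ResetAddress() {
        address_ = nullptr;
        count_ = 0;
    }

    // Store a copy of the values currently in the user buffer.
    void Fill() {
        entries_.emplace_back(address_, address_ + count_);
    }

    // Copy a stored entry into the user buffer, if one is registered.
    // Returns true only if user memory was written.
    bool GetEntry(std::size_t entry) {
        if (address_ == nullptr) return false;
        for (std::size_t i = 0; i < count_; ++i) {
            address_[i] = entries_[entry][i];
        }
        return true;
    }

    std::size_t GetEntries() const { return entries_.size(); }

private:
    double* address_ = nullptr;
    std::size_t count_ = 0;
    std::vector<std::vector<double>> entries_;
};

// Same shape as the filling function in this thread: allocate, fill, free.
void fillToyTree(ToyTree& tree, std::size_t numberOfVariables,
                 std::size_t numberOfEvents) {
    double* leaves = new double[numberOfVariables];
    tree.SetAddress(leaves, numberOfVariables);
    for (std::size_t event = 0; event < numberOfEvents; ++event) {
        for (std::size_t j = 0; j < numberOfVariables; ++j) {
            leaves[j] = static_cast<double>(event);
        }
        tree.Fill();
    }
    tree.ResetAddress();  // first tell the tree the buffer is going away
    delete[] leaves;      // then it is safe to release it
}

// Returns true if, after fillToyTree, all entries are still stored and a
// later read no longer touches the freed buffer.
bool laterReadIsSafe() {
    ToyTree tree;
    fillToyTree(tree, 3, 5);
    return tree.GetEntries() == 5 && !tree.GetEntry(0);
}
```

Without the ResetAddress() call, the GetEntry(0) in laterReadIsSafe would write into freed memory, which is the kind of invalid write valgrind reported earlier in this thread.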

Cheers,
Philippe.

True, but when I tell the Branch to fill(), don’t I ask it to copy the memory to the tree, in which case, the tree would have its own copy and have no need for the original array.

If I were using a static array, it would go out of scope at the end of the function. Why is delete any different?

If I changed the values in the array rather than deleting it, would the tree’s values change?

When does the tree let go of the array? Is only when Fill is called again?

[quote]True, but when I tell the Branch to fill(), don’t I ask it to copy the memory to the tree, in which case, the tree would have its own copy and have no need for the original array.[/quote]
I think you are missing a little bit of information (you may want to re-read the User’s Guide chapter on TTree). There are (at least) two different parts concerning the memory. One part is used for you to communicate with the TTree: you tell it where it should find (and later write) your values. This is the address that you pass when building a branch (or when calling SetBranchAddress). When you call Fill, the TTree copies the information from this memory to its cache (the TBasket in memory). When the TBasket is full, it is flushed/copied to disk (if the TTree is connected to a file).
When you call GetEntry, the TTree recovers the proper basket from the file (if needed) and copies the information from the basket to the user memory.
The format on file (and thus in the TBasket) is not appropriate for direct use by C/C++, so there has to be a copy/deserialization from the TBasket to the user memory (the address set in TTree::Branch or TTree::SetBranchAddress).
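The two directions of copying described above can be sketched with a toy model in plain C++. ToyBranch and readBackFirstValue are hypothetical names, and the byte vector only stands in for a basket. This is an illustration of the idea, not how ROOT implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy stand-in for a branch; NOT the ROOT implementation.
class ToyBranch {
public:
    // Remember where the user's values live (like the address given to a branch).
    void SetAddress(double* address, std::size_t count) {
        address_ = address;
        count_ = count;
    }

    // Fill: copy the current user values into the internal byte buffer ("basket").
    void Fill() {
        const unsigned char* bytes =
            reinterpret_cast<const unsigned char*>(address_);
        basket_.insert(basket_.end(), bytes, bytes + count_ * sizeof(double));
    }

    // GetEntry: copy one stored entry from the basket back into the user memory.
    void GetEntry(std::size_t entry) {
        std::memcpy(address_,
                    basket_.data() + entry * count_ * sizeof(double),
                    count_ * sizeof(double));
    }

private:
    double* address_ = nullptr;
    std::size_t count_ = 0;
    std::vector<unsigned char> basket_;
};

// Fills two entries, scribbles on the user array, then reads entry 0 back.
double readBackFirstValue() {
    double leaves[2] = {1.0, 2.0};
    ToyBranch branch;
    branch.SetAddress(leaves, 2);
    branch.Fill();        // entry 0 = {1, 2}
    leaves[0] = 10.0;
    leaves[1] = 20.0;
    branch.Fill();        // entry 1 = {10, 20}
    leaves[0] = -99.0;    // changes the user array only, not the stored entries
    branch.GetEntry(0);   // writes entry 0 back INTO the user array
    return leaves[0];
}
```

readBackFirstValue() returns 1.0: changing the array after Fill does not alter what was stored, but GetEntry does write into the registered address, which is why that address must still be valid at read time.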

If it is static, by definition it does not go away after the end of the function.

[quote]If I changed the values in the array rather than deleting it, would the tree’s values change?[/quote]
No. See the explanation above.

[quote]When does the tree let go of the array? Is only when Fill is called again?[/quote]
It stops referring to the array once you tell it to (either via ResetBranchAddress or SetBranchAddress).

Cheers,
Philippe.

Static was a poor choice of words. Let me re-ask one question, then I’ll go away.

If I knew that I needed 100 leaves, I would write:

double leaves[100];

This array would go out of scope at the end of the function.

Since I don’t know how many leaves I need, I write:

double *leaves = new double[numberOfLeaves];
. . .
. . .
. . .
delete[] leaves;

How does TTree know the difference between the first case, where the memory goes away when it goes out of scope, and the second case, where the memory goes away when I delete it?

Indeed.

It does NOT know the difference. The result in both cases (if you do not reset the branch address) will be the same: unpredictable.

The correct solution is either to use a static or global array OR, better yet, to use ResetBranchAddress.

Cheers,
Philippe