Get distinct entries from a TBranch

Hi,

I am looking for a nice way to get all different values from branches which contain either ints or strings.

Other branches in the tree contain a continuous range of values, but some contain very few different values which I would like to retrieve to enable a selection by those values.

The closest analogy that comes to mind is a "select distinct" in an Oracle query.

Any suggestions apart from manually looping over all entries of the tree?

Cheers

Erik

If your tree has been created from an object, you can get some inspiration from the example below, where Event.root is the file generated by running
Event 100 1 99 1
in $ROOTSYS/test (run make in this directory first).

[code]void selectb() {
   // open a tree and print the names of the leaves holding integers or strings

   TFile f("Event.root");
   TTree *T = (TTree*)f.Get("T");
   TIter next(T->GetListOfLeaves());
   TLeafElement *leaf;
   while ((leaf = (TLeafElement*)next())) {
      TString typeName = leaf->GetTypeName();
      printf("leaf = %s, typeName = %s\n", leaf->GetName(), typeName.Data());
      if (typeName == "Int_t") {
         printf(" found leaf: %s with integer type\n", leaf->GetName());
      } else if (typeName == "Char_t") {
         printf(" found leaf: %s with string type\n", leaf->GetName());
      }
   }
}
[/code]

Rene

Hi Rene,

Thanks for the answer. If I read the code correctly, however, this is not what I intended to do.

The example below may illustrate the problem better. What I would like to get is the range [0:9] for i and [0:4] for j.

Cheers,

Erik

void branchrange()
{
  TFile *f = new TFile("test.root","RECREATE");
  TTree *tree = new TTree("tree","My Tree");
  Int_t i, j;
  Double_t x,y;
  tree->Branch("i",&i,"i/I");
  tree->Branch("j",&j,"j/I");
  tree->Branch("x",&x,"x/D");
  tree->Branch("y",&y,"y/D");
  
  for(int k = 0; k < 10000; ++k) {
    x = gRandom->Gaus();
    y = gRandom->Gaus();
    i = k/1000;
    j = k/2000;
    tree->Fill();
  }

  tree->Write();
    
  for(int k = 0; k < tree->GetEntriesFast(); ++k) {
    tree->GetEntry(k);
    std::cout << i << ' ' << j << ' ' << x << ' ' << y << std::endl;
  }
  
}

Your explanations are still quite unclear. Maybe have a look at the following changes:

Rene

[code]void branchrange()
{
   TFile *f = new TFile("test.root","RECREATE");
   TTree *tree = new TTree("tree","My Tree");
   Int_t i, j;
   Double_t x,y;
   // keep the branch pointers so that single branches can be read later
   TBranch *bi = tree->Branch("i",&i,"i/I");
   TBranch *bj = tree->Branch("j",&j,"j/I");
   TBranch *bx = tree->Branch("x",&x,"x/D");
   TBranch *by = tree->Branch("y",&y,"y/D");

   for(int k = 0; k < 10000; ++k) {
      x = gRandom->Gaus();
      y = gRandom->Gaus();
      i = k/1000;
      j = k/2000;
      tree->Fill();
   }

   tree->Write();

   for(int k = 0; k < tree->GetEntriesFast(); ++k) {
      bi->GetEntry(k);
      if (i != 3) continue; //do not read anything else
      bj->GetEntry(k);
      if (j != 1) continue;
      tree->GetEntry(k); //read everything else otherwise
      std::cout << i << ' ' << j << ' ' << x << ' ' << y << std::endl;
   }

}
[/code]

Hi Rene,

Let me try to clarify (hopefully) a bit more.

The tree is the output from a different script and prior to doing any full analysis on it, I would like to get the range of values from some of the branches.

The code example should illustrate the data structure, namely that I have a very limited number of different values in branches i and j which I would like to retrieve.

If I then want to exclude or include certain values of either variable, the code changes you provided do what I want, but I would like to get the different values beforehand if possible.

Maybe there is no truly simple method and I can just disable all other branches prior to looping over the tree, to minimize the amount of data to be read.

Any other suggestion?

Thanks,

Erik
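
For reference, the manual loop mentioned above reduces to collecting the values of one branch into a set. The following is only a minimal sketch using the standard library: the `std::vector<int>` stands in for the values that would be read entry by entry from a single branch (for instance with `bi->GetEntry(k)` as in the earlier example, possibly after switching off the other branches with `SetBranchStatus`). The helper names `distinctValues` and `makeBranch` are hypothetical and not part of ROOT.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns the sorted distinct values of a sequence.
// The vector stands in for the values read from one branch, entry by entry.
std::set<int> distinctValues(const std::vector<int>& values) {
    std::set<int> seen;
    for (int v : values) {
        seen.insert(v);  // std::set silently ignores duplicates
    }
    return seen;
}

// Hypothetical helper reproducing the layout of the branchrange() example:
// divisor 1000 mimics branch i (k/1000), divisor 2000 mimics branch j (k/2000).
std::vector<int> makeBranch(int nEntries, int divisor) {
    std::vector<int> values;
    values.reserve(nEntries);
    for (int k = 0; k < nEntries; ++k) {
        values.push_back(k / divisor);
    }
    return values;
}
```

A set keeps the result sorted and works for any value type with an ordering, so the same pattern would apply to `std::string` values as well.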

Simply make a histogram of each of your variables, such as i and j, and then select the non-empty bins of that histogram.

Rene
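
The histogram suggestion above can be illustrated without ROOT. This is a rough sketch of the bin logic only, assuming integer values and one bin per integer (with wider bins, neighbouring values would be merged into one bin). The struct `IntHistogram` and the function `nonEmptyBins` are hypothetical stand-ins. In ROOT itself, one would presumably fill a histogram with something like `tree->Draw("i>>hi(20,0,20)")` and scan its bins with `GetNbinsX()` and `GetBinContent()`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram with unit-width bins covering [lo, hi).
struct IntHistogram {
    int lo;
    std::vector<long> counts;

    IntHistogram(int lo_, int hi_)
        : lo(lo_), counts(static_cast<std::size_t>(hi_ - lo_), 0L) {}

    void fill(int v) {
        const int bin = v - lo;
        // Values outside [lo, hi) are dropped in this sketch.
        if (bin >= 0 && bin < static_cast<int>(counts.size())) {
            ++counts[static_cast<std::size_t>(bin)];
        }
    }

    // Returns the values whose bin holds at least one entry.
    std::vector<int> nonEmptyValues() const {
        std::vector<int> out;
        for (std::size_t b = 0; b < counts.size(); ++b) {
            if (counts[b] > 0) out.push_back(lo + static_cast<int>(b));
        }
        return out;
    }
};

// Fills a histogram with k/divisor for k in [0, nEntries), mimicking the
// branches i (divisor 1000) and j (divisor 2000) of the branchrange() example,
// and returns the values found in non-empty bins.
std::vector<int> nonEmptyBins(int nEntries, int divisor, int lo, int hi) {
    IntHistogram h(lo, hi);
    for (int k = 0; k < nEntries; ++k) {
        h.fill(k / divisor);
    }
    return h.nonEmptyValues();
}
```

The histogram range has to be chosen wide enough beforehand. Values outside it are silently lost in this sketch, so a generous range is the safer choice.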

Hi Rene,

Now that's indeed a good and simple solution. I did not think of this.

Thanks