Hi there,
now that the nice TTreeReader is available, is there an easy way to write only a subset of an input TTree (meaning only a few entries are retained)?
Thanks.
If you mean that you want to select certain events, you can use TTree::CopyTree.
If you mean, copy only certain branches, then you can disable branches you don’t want to see in the new file:
auto inFile = TFile::Open("in.root");
auto tree = (TTree*) inFile->Get("tree_name");
// Disable all branches
tree->SetBranchStatus("*", kFALSE);
// Re-enable desired branches
tree->SetBranchStatus("mass", kTRUE);
...
auto newFile = new TFile("out.root", "recreate");
// Copy only those parts of the tree for which the selection string is true
// Note that any branch needed in the selection needs to be enabled (and will therefore also be copied)
auto newTree = tree->CopyTree("mass > 100");
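// Writing the file also writes newTree, which is attached to newFile;
// an additional newTree->Write() would store a second cycle of the tree
newFile->Write();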
newFile->Close();
If you want to add branches whose values depend on some of the old branches, have a look at my post about a function that accomplishes that: "Add a new branch to a tree given only a formula".
All of this will hopefully become much easier as soon as ROOT 6.10 is published with its TDataFrame.
Hi,
ROOT 6.10 is out: https://root.cern.ch/content/release-61000
As was correctly stressed in the previous post, TDataFrame makes this kind of task easy.
In a nutshell, suppose you have a tree with two branches, b1 and b2, and you want to select some events according to a cut, create an additional column, and save the result as a new dataset:
#include <ROOT/TDataFrame.hxx>
#include <vector>

int MyMacro()
{
   // Names of the input file, the output file and the tree
   auto fileName = "myinputfile.root";
   auto outFileName = "myoutputfile.root";
   auto treeName = "myTree";

   // We read the tree from the file and create a TDataFrame.
   ROOT::Experimental::TDataFrame d(treeName, fileName);

   // ## Select entries
   // We now select some entries in the dataset and save the intermediate TDataFrame
   auto d_cut = d.Filter("b1 % 2 == 0");

   // ## Enrich the dataset
   // Build some temporary columns: we'll write them out
   // One column is defined with a jitted C++ string, one with a lambda
   auto d2 = d_cut.Define("b1_square", "b1 * b1")
                  .Define("b2_vector",
                          [](float b2) {
                             std::vector<float> v;
                             for (int i = 0; i < 3; i++) v.push_back(b2 * i);
                             return v;
                          },
                          {"b2"});

   // ## Write it to disk in ROOT format
   // We now write to disk a new dataset with one of the variables originally
   // present in the tree and the new variables.
   // The user can explicitly specify the types of the columns as template
   // arguments of the Snapshot method, otherwise they will be automatically
   // inferred.
   d2.Snapshot(treeName, outFileName, {"b1", "b1_square", "b2_vector"});

   return 0;
}
All the tutorials can be found here: https://root.cern/doc/master/group__tutorial__tdataframe.html
Cheers,
D