Automated flattening of trees

Dear ROOT-experts,

I’m trying to figure out whether the following problem can be solved with a reasonable coding effort:

I have a tree with std::vectors of basic types, i.e. double, bool, and int, which I need to transform into a flat tree. The flattening is necessary because I want to then use the data in TMVA, which expects flat trees.

As an example, say there is a set of particle properties pT, eta, and phi, each stored in a std::vector, where the index in each vector identifies a particle. I want to flatten this structure, i.e. create branches that hold the basic types instead of vectors and write the information for each particle to a new entry in the tree. In doing so I’d like to apply a selection to the objects, e.g. keep only particles with pT > 10.
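The flattening-plus-selection step described here can be sketched without any ROOT machinery. The following is only an illustration under stated assumptions: the `Event` and `Particle` structs and the `flatten` function are hypothetical names standing in for the input entry, the output entry, and the loop that would normally read one tree and fill another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one input entry: parallel vectors,
// where index i in each vector describes particle i.
struct Event {
    std::vector<double> pt, eta, phi;
};

// Hypothetical stand-in for one output entry: a single particle.
struct Particle {
    double pt, eta, phi;
};

// Emit one flat row per particle whose pT exceeds minPt.
std::vector<Particle> flatten(const std::vector<Event>& events, double minPt) {
    std::vector<Particle> rows;
    for (const Event& ev : events) {
        // The parallel vectors must have equal length for the shared index to make sense.
        assert(ev.pt.size() == ev.eta.size() && ev.pt.size() == ev.phi.size());
        for (std::size_t i = 0; i < ev.pt.size(); ++i) {
            if (ev.pt[i] > minPt) {
                rows.push_back(Particle{ev.pt[i], ev.eta[i], ev.phi[i]});
            }
        }
    }
    return rows;
}
```

Entries with empty vectors simply contribute no rows, so the number of output entries equals the number of selected particles rather than the number of input entries.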

I know how to achieve this functionality “manually”, but I was wondering whether there is a more elegant way. Specifically, I want to avoid having to specify the output branch names explicitly, needing to manually create the intermediate variables to be used for the definition of the branches of the new tree etc.

Now my idea so far would be to get a list of all the Branches in the tree, select the names of the branches I want to keep and create the branches of the new tree based on that information. I see the following difficulty in doing so:
I somehow need to extract the type of the variables stored in the branch’s vector and pass that information to the new branch. Is there an easy way to extract this information from a branch? (It must be possible, since MakeSelector does exactly that.)

Regarding the filling of the tree, I was considering using a std::map keyed by the branch/variable name, filling it with the values from the old tree, and passing the address of the mapped value (i.e. the second member of each map element) to the TTree::Branch() function.
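One reason the std::map idea is workable is that std::map does not relocate existing elements when new ones are inserted, so a pointer to a mapped value stays valid for as long as that element remains in the map (a std::vector would not give this guarantee). A minimal sketch, assuming a hypothetical `BranchBuffers` holder with one map per basic type; the names are illustrative and not part of ROOT:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical holder for per-branch scalar buffers, keyed by branch name.
struct BranchBuffers {
    std::map<std::string, double> doubles;
    std::map<std::string, int>    ints;
    std::map<std::string, bool>   bools;

    // Create the buffer on first use and return its address.
    // Later insertions into the map do not move this element.
    double* bookDouble(const std::string& name) { return &doubles[name]; }
    int*    bookInt(const std::string& name)    { return &ints[name]; }
    bool*   bookBool(const std::string& name)   { return &bools[name]; }

    // Write a value into an already booked buffer.
    void setDouble(const std::string& name, double value) {
        std::map<std::string, double>::iterator it = doubles.find(name);
        assert(it != doubles.end());  // the buffer must be booked first
        it->second = value;
    }
};
```

In a ROOT setting, the pointer returned by a booking call would be the address handed to TTree::Branch() together with a leaf list, and the buffer would be updated before each call to Fill().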

Before getting my hands dirty in trying to piece something together, I wanted to ask whether there already exists a tool to do the job and if that is not the case, if my outlined approach seems reasonable.

Many thanks for your comments!

Cheers
Philipp

Hi Philipp,

I’ll try to address your points.
- Unfortunately there is no tool to achieve what you describe. Fortunately the task is not too complicated.
- The method to extract the type name from a branch is TBranch::GetClassName(). And yes, utilities like MakeSelector and MakeProject work by fetching this information :)
- The approach seems reasonable.
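Once the class name of a vector branch is available as a string, the element type still has to be pulled out of it. The sketch below shows one way to do that with the standard library only. Both helpers (`elementType` and `leafList`) are hypothetical, and the exact spelling of the class-name string (with or without `std::`, with or without an allocator argument) is an assumption that should be checked by printing it for a real branch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: extract "double" from strings such as
// "vector<double>" or "std::vector<double,allocator<double> >".
// Returns an empty string if the input does not look like a vector.
std::string elementType(const std::string& className) {
    const std::string key = "vector<";
    const std::size_t open = className.find(key);
    const std::size_t close = className.rfind('>');
    if (open == std::string::npos || close == std::string::npos ||
        close < open + key.size()) {
        return "";
    }
    const std::size_t begin = open + key.size();
    std::string inner = className.substr(begin, close - begin);
    // Drop a trailing allocator argument, if any.
    const std::size_t comma = inner.find(',');
    if (comma != std::string::npos) inner.erase(comma);
    // Trim surrounding spaces.
    const std::size_t first = inner.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = inner.find_last_not_of(' ');
    return inner.substr(first, last - first + 1);
}

// Hypothetical helper: build a leaf list such as "pt/D" for the three
// basic types discussed in this thread (D = double, I = int, O = bool).
// Returns an empty string for any other type.
std::string leafList(const std::string& branchName, const std::string& elemType) {
    assert(!branchName.empty());
    if (elemType == "double") return branchName + "/D";
    if (elemType == "int")    return branchName + "/I";
    if (elemType == "bool")   return branchName + "/O";
    return "";
}
```

Returning an empty string for unsupported types lets the calling code skip branches it cannot flatten instead of guessing a type.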

Cheers,
Danilo

Hello Danilo,

thanks for your reply! It encouraged me to proceed with writing the tool I need, but the first step/exercise has already met with some resistance in the form of a seg fault I’m unable to debug. I’m certain it’s because I’m doing something stupid when setting up the new tree, but I cannot pinpoint it:

Selector.h:
void Selector::Init(TTree *tree) {
[...]
    newFile = new TFile("blubb.root", "RECREATE");
    if (first) {
        // Clone the structure of the input tree without copying any entries.
        output_tree = fChain->CloneTree(0);
        fChain->GetTree()->CopyAddresses(output_tree);
        output_tree->SetName("TrigSelector0");
        fOutput->Add(output_tree);
        first = false;
    }
}

and then

Selector.C:
Bool_t Selector::Process(Long64_t entry) {
    GetEntry(entry);
    //gDebug=2;
    // Placeholder condition: every entry is written for now.
    if (true) output_tree->Fill();
    return kTRUE;
}

Running this code crashes with a seg fault after roughly 100 calls to Selector::Process. The stack trace is incomprehensible to me, so I was not able to debug this much further. I am sure it’s due to an incorrect link between the output file and the tree (even though I’ve also tried adding the new tree to the same file, with the same seg fault as the result).

Many thanks for any hints on how to understand the stack trace or on what’s wrong with my code.

Cheers
Philipp

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00000037358ac61e in waitpid () from /lib64/libc.so.6
#1  0x000000373583e609 in do_system () from /lib64/libc.so.6
#2  0x00007f2997199fc7 in TUnixSystem::StackTrace() () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCore.so
#3  0x00007f299719bf5c in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCore.so
#4  <signal handler called>
#5  0x00007f298c9a1ab9 in TTree::OptimizeBaskets(unsigned long long, float, char const*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTree.so
#6  0x00007f298c9a66cb in TTree::Fill() () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTree.so
#7  0x00007f298b9a8d2f in TrigSelector::Process(long long) () from /grid_mnt/vol__vol_U__u/llr/cms/pigard/eIDflattener/TrigSelector/TrigSelector_C.so
#8  0x00007f298bae55fe in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTreePlayer.so
#9  0x00007f298bae8dd6 in TTreePlayer::Process(char const*, char const*, long long, long long) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTreePlayer.so
#10 0x00007f29942fe5de in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x00007f2996edcb78 in std::string::_Rep::_S_empty_rep_storage () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#13 0x0000000003baff98 in ?? ()
#14 0x0000000003a90c90 in ?? ()
#15 0x00007f2996edcb78 in std::string::_Rep::_S_empty_rep_storage () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#16 0x0000000003959258 in ?? ()
#17 0x00000000039a0ed0 in ?? ()
#18 0x00000000039a0ed8 in ?? ()
#19 0x00000000039a0ed0 in ?? ()
#20 0x00007fff5ddebc98 in ?? ()
#21 0x00000000039a1c40 in ?? ()
#22 0x00007f2994d5c869 in cling::Value::isValid() const () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#23 0x00000000039a0ed0 in ?? ()
#24 0x00000000039a0ed8 in ?? ()
#25 0x00000000039a0ed8 in ?? ()
#26 0x00007fff5ddec240 in ?? ()
#27 0x00007f29942fe1d0 in ?? ()
#28 0x00007fff5ddec240 in ?? ()
#29 0x00000000039a0d48 in ?? ()
#30 0x00007f298d112018 in ?? ()
#31 0x00007f29942fe1d0 in ?? ()
#32 0x00000000036e18c8 in ?? ()
#33 0x0000000000000015 in ?? ()
#34 0x00007fff5ddec240 in ?? ()
#35 0x00007f2996edcb60 in ?? () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#36 0x00007fff5ddebd20 in ?? ()
#37 0x00007f29942fe21c in ?? ()
#38 0x0000000000000015 in ?? ()
#39 0x00007fff5ddec240 in ?? ()
#40 0x0000000000509c60 in ?? ()
#41 0x00007f2994d24c8d in cling::IncrementalExecutor::executeFunction(llvm::StringRef, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#42 0x00007f2994d2c3b0 in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#43 0x00007f2994d315b5 in cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions const&, cling::Value*, cling::Transaction**) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#44 0x00007f2994d31791 in cling::Interpreter::echo(std::string const&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#45 0x00007f2994d7a610 in cling::MetaSema::actOnxCommand(llvm::StringRef, llvm::StringRef, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#46 0x00007f2994d7292d in cling::MetaParser::isXCommand(cling::MetaSema::ActionResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#47 0x00007f2994d7387e in cling::MetaParser::isCommand(cling::MetaSema::ActionResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#48 0x00007f2994d7426b in cling::MetaProcessor::process(char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#49 0x00007f2994c24f5a in TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#50 0x00007f2994c17e37 in TCling::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#51 0x00007f29970be79d in TApplication::ExecuteFile(char const*, int*, bool) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCore.so
#52 0x00007f29970bf766 in TApplication::ProcessLine(char const*, bool, int*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCore.so
#53 0x00007f2996eecf75 in TRint::ProcessLineNr(char const*, char const*, int*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libRint.so
#54 0x00007f2996eee3a7 in TRint::Run(bool) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libRint.so
#55 0x0000000000400fc0 in main ()
===========================================================


The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00007f298c9a1ab9 in TTree::OptimizeBaskets(unsigned long long, float, char const*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTree.so
#6  0x00007f298c9a66cb in TTree::Fill() () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTree.so
#7  0x00007f298b9a8d2f in TrigSelector::Process(long long) () from /grid_mnt/vol__vol_U__u/llr/cms/pigard/eIDflattener/TrigSelector/TrigSelector_C.so
#8  0x00007f298bae55fe in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTreePlayer.so
#9  0x00007f298bae8dd6 in TTreePlayer::Process(char const*, char const*, long long, long long) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libTreePlayer.so
#10 0x00007f29942fe5de in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x00007f2996edcb78 in std::string::_Rep::_S_empty_rep_storage () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#13 0x0000000003baff98 in ?? ()
#14 0x0000000003a90c90 in ?? ()
#15 0x00007f2996edcb78 in std::string::_Rep::_S_empty_rep_storage () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#16 0x0000000003959258 in ?? ()
#17 0x00000000039a0ed0 in ?? ()
#18 0x00000000039a0ed8 in ?? ()
#19 0x00000000039a0ed0 in ?? ()
#20 0x00007fff5ddebc98 in ?? ()
#21 0x00000000039a1c40 in ?? ()
#22 0x00007f2994d5c869 in cling::Value::isValid() const () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#23 0x00000000039a0ed0 in ?? ()
#24 0x00000000039a0ed8 in ?? ()
#25 0x00000000039a0ed8 in ?? ()
#26 0x00007fff5ddec240 in ?? ()
#27 0x00007f29942fe1d0 in ?? ()
#28 0x00007fff5ddec240 in ?? ()
#29 0x00000000039a0d48 in ?? ()
#30 0x00007f298d112018 in ?? ()
#31 0x00007f29942fe1d0 in ?? ()
#32 0x00000000036e18c8 in ?? ()
#33 0x0000000000000015 in ?? ()
#34 0x00007fff5ddec240 in ?? ()
#35 0x00007f2996edcb60 in ?? () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/external/gcc/4.9.1-cms/lib64/libstdc++.so.6
#36 0x00007fff5ddebd20 in ?? ()
#37 0x00007f29942fe21c in ?? ()
#38 0x0000000000000015 in ?? ()
#39 0x00007fff5ddec240 in ?? ()
#40 0x0000000000509c60 in ?? ()
#41 0x00007f2994d24c8d in cling::IncrementalExecutor::executeFunction(llvm::StringRef, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#42 0x00007f2994d2c3b0 in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#43 0x00007f2994d315b5 in cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions const&, cling::Value*, cling::Transaction**) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#44 0x00007f2994d31791 in cling::Interpreter::echo(std::string const&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#45 0x00007f2994d7a610 in cling::MetaSema::actOnxCommand(llvm::StringRef, llvm::StringRef, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#46 0x00007f2994d7292d in cling::MetaParser::isXCommand(cling::MetaSema::ActionResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#47 0x00007f2994d7387e in cling::MetaParser::isCommand(cling::MetaSema::ActionResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
#48 0x00007f2994d7426b in cling::MetaProcessor::process(char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/lcg/root/6.02.00-odfocd4/lib/libCling.so
===========================================================

Is there any update on this topic? I mean, how can normal trees be flattened?

Hi @Qamar-Ul-Hassan!

That depends. What do you want to flatten your TTree for? And do you want to do it in Python or C++?

Hi Jonas,

I just want to compare a Delphes output file to NanoAOD output, which has flat trees. The Delphes file just has normal trees. So I am looking for an example in C++ showing how I can extract my tree from the Delphes file and save it to another file as a flat tree.

Thanks for your prompt reply.
Qamar

Hi!

> I just want to compare a delphes out file to nanoaod output which has flat trees.

Okay, then I think your issue is unrelated to this forum thread. What this thread means by “flattening” is transforming a TTree with variable-size array branches into a tree where each element of the variable-size arrays is mapped to one TTree entry.

For example if this is the input:

Entry index | ele_pt (std::vector<double>) | ele_eta (std::vector<double>)
--------------------------------------------------------------------------
0           | [ 10.1, 25.3, 37.2 ]         | [ 0.91, 2.37, 0.85 ]
1           | [ ]                          | [ ]
2           | [ 87.4 ]                     | [ 1.82 ]
3           | [ ]                          | [ ]

The “flattened” TTree would look like this:

Entry index | ele_pt (double) | ele_eta (double)
------------------------------------------------
0           | 10.1            | 0.91
1           | 25.3            | 2.37
2           | 37.2            | 0.85
3           | 87.4            | 1.82

The original use case for this flattening was to pass the data to machine learning frameworks, but in the meantime other solutions have been developed to do this (e.g. RDataFrame and uproot).

Your problem is a different one. I think NanoAOD doesn’t have “flat” trees according to the definition here, but it has collections with variable size per event (each entry in NanoAOD is an event, which can have an arbitrary number of electrons or jets for example).

As your problem is a different one, please feel free to open a new forum issue with an example Delphes and NanoAOD file such that we can help you precisely.

Cheers,
Jonas