Quiet writing failure when applying schema evolution while slow cloning


ROOT Version: 6.34.04
Platform: linuxx8664gcc
Compiler: GCC 13.3.0

Specifically, I am using the container image ldmx/pro:v4.4.0 on DockerHub.


Boilerplate Intro Copied from Previous Post

Previous Forum Post: Schema Evolution: Renaming a `std::map` Member Variable

I am working in a project where we recently adopted some naming guidelines for our member variables and have now applied them to classes that are serialized by ROOT into output TTrees. However, unsurprisingly, we still have data files with the old schema (old names for member variables) floating around and so I am interested in having some schema rules that rename the member variables when reading the old format.

The full project ldmx-sw is large and takes quite some time to compile, but I’ve been able to partially replicate the issue with a smaller example.

The source code for this example is available on GitHub: tomeichlersmith/ldmx-root-schema-evolution-testbench in the slow-clone-schema-evolve sub-directory.

I have a simple class Header with two int members that have changed names.
The v1 Header uses camel case names while the v2 Header uses snake case names
and includes an updated #pragma read statement in its LinkDef.h file to be
able to evolve the v1 schema into v2.
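
For readers who have not opened the repository, here is a rough sketch in plain C++ (no ROOT needed) of what a rename-only read rule amounts to. The member names, the separate HeaderV1/HeaderV2 struct names, and the version number in the commented rule are assumptions for illustration only. In the testbench both versions of the class are called Header and the real rule lives in LinkDef.h.

```cpp
#include <cassert>

/* Assumed shape of the rule in the v2 LinkDef.h (member names are hypothetical):
 *
 *   #pragma read sourceClass="Header" version="[1]" \
 *     source="int runNumber; int eventNumber" \
 *     targetClass="Header" target="run_number, event_number" \
 *     code="{ run_number = onfile.runNumber; event_number = onfile.eventNumber; }"
 */

// Hypothetical on-disk layout written by v1 (camel case names).
struct HeaderV1 {
  int runNumber;
  int eventNumber;
};

// Hypothetical in-memory layout used by v2 (snake case names).
struct HeaderV2 {
  int run_number{0};
  int event_number{0};
};

// What the rule body does conceptually: copy each old member into its renamed slot.
HeaderV2 evolve(const HeaderV1& onfile) {
  HeaderV2 in_memory;
  in_memory.run_number = onfile.runNumber;
  in_memory.event_number = onfile.eventNumber;
  return in_memory;
}

// Usage: evolving a v1 record keeps both values.
void example() {
  HeaderV1 onfile{42, 3};
  HeaderV2 in_memory = evolve(onfile);
  assert(in_memory.run_number == 42 && in_memory.event_number == 3);
}
```

The point of the sketch is only that the old names exist on the "onfile" side and the new names exist in memory; nothing in the v2 object itself carries the old names.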

Running the ./show script displays all of the grisly details, but the summary is

  • I can write, read, and copy[1] a TTree of v1 while only using v1
  • I can write, read, and copy a TTree of v2 while only using v2
  • I can write v1 and read it with v2, but if I attempt to copy v1 with v2, the output file does not read correctly (even though the printouts while doing the copy are correct)
  • I can avoid this write-out error by manually syncing the addresses between the input and output TTree (instead of using CloneTree), but then I cannot expect branches that are not “observed” while copying to be copied at all.

Compile

Normal config and build cycle using CMake to find and configure the ROOT installation.

cmake -B build -S .
cmake --build build

Run

Write a file using the old version of the Header object and attempt to read that
file with both the old and new versions.
Reading goes okay, but copying with the new schema and then reading does not go okay.

$ ./build/write-v1 v1-output.root
$ ./build/read-v1 v1-output.root
{ run: 42, event: 0 }
{ run: 42, event: 1 }
{ run: 42, event: 2 }
{ run: 42, event: 3 }
{ run: 42, event: 4 }
{ run: 42, event: 5 }
{ run: 42, event: 6 }
{ run: 42, event: 7 }
{ run: 42, event: 8 }
{ run: 42, event: 9 }
$ ./build/read-v2 v1-output.root
manual schema evolution rule being applied
{ run: 42, event: 0 }
manual schema evolution rule being applied
{ run: 42, event: 1 }
manual schema evolution rule being applied
{ run: 42, event: 2 }
manual schema evolution rule being applied
{ run: 42, event: 3 }
manual schema evolution rule being applied
{ run: 42, event: 4 }
manual schema evolution rule being applied
{ run: 42, event: 5 }
manual schema evolution rule being applied
{ run: 42, event: 6 }
manual schema evolution rule being applied
{ run: 42, event: 7 }
manual schema evolution rule being applied
{ run: 42, event: 8 }
manual schema evolution rule being applied
{ run: 42, event: 9 }
$ ./build/copy-v2-clone-tree v1-output.root v2-clone-tree-copy-v1-output.root
manual schema evolution rule being applied                                                                            
{ run: 42, event: 0 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 1 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 2 }                                                                                                 
manual schema evolution rule being applied
{ run: 42, event: 3 }
manual schema evolution rule being applied
{ run: 42, event: 4 }
manual schema evolution rule being applied
{ run: 42, event: 5 }
manual schema evolution rule being applied
{ run: 42, event: 6 }
manual schema evolution rule being applied
{ run: 42, event: 7 }
manual schema evolution rule being applied
{ run: 42, event: 8 }
manual schema evolution rule being applied
{ run: 42, event: 9 }
$ ./build/read-v2 v2-clone-tree-copy-v1-output.root
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
manual schema evolution rule being applied
{ run: -1904246784, event: -1904246784 }
$ denv ./build/copy-v2-manual data/v1-output.root data/v2-copy-manual-v1-output.root
manual schema evolution rule being applied                                                                            
{ run: 42, event: 0 }                                                                                                 
manual schema evolution rule being applied
{ run: 42, event: 1 }
manual schema evolution rule being applied
{ run: 42, event: 2 }
manual schema evolution rule being applied
{ run: 42, event: 3 }
manual schema evolution rule being applied
{ run: 42, event: 4 }
manual schema evolution rule being applied
{ run: 42, event: 5 }
manual schema evolution rule being applied
{ run: 42, event: 6 }
manual schema evolution rule being applied
{ run: 42, event: 7 }
manual schema evolution rule being applied
{ run: 42, event: 8 }
manual schema evolution rule being applied
{ run: 42, event: 9 }
$ denv ./build/read-v2 data/v2-copy-manual-v1-output.root 
{ run: 42, event: 0 }
{ run: 42, event: 1 }
{ run: 42, event: 2 }
{ run: 42, event: 3 }
{ run: 42, event: 4 }
{ run: 42, event: 5 }
{ run: 42, event: 6 }
{ run: 42, event: 7 }
{ run: 42, event: 8 }
{ run: 42, event: 9 }

  [1] When I say “copy” here, I mean “slow clone”. I am focusing on slow cloning because our data processing framework slow clones while allowing the user to read some of the branches and potentially write new ones.

Hi @tomeichlersmith,

thanks for reporting. Maybe @pcanal could help you here.

Cheers,

Marta

This is (for better or worse) the behavior I expect, since CloneTree has no way to know that the data members were renamed (or, more exactly, this is not implemented and is likely ‘hard’ to do in a generic way).

The solution is to combine both options, i.e.:

  • Use CloneTree
  • Call SetBranchAddress to sync the addresses between the input and output TTree for just the renamed data members.

Can you provide more detail on how to call SetBranchAddress for just the renamed data members?

In the example, I do call SetBranchAddress on the input TTree:

This is done before calling CloneTree. Are you saying I should make the SetBranchAddress call after CloneTree? On the output TTree as well? Or is there some option where I call SetBranchAddress for specific members of my Header class?

What I meant was: do the SetBranchAddress before the CloneTree, but also call SetBranchAddress on the output for the new branches …

Except that I realize that CloneTree is actually not what you (seem to) want. CloneTree will keep the structure the same as the input. Since the live object does not have a memory slot for the renamed members (the system does not know they are ‘just’ renamed, only that they are inputs to a rule), the address used is likely not properly passed on to the clone (I guess this is a bug).

So a work-around (as long as the type of the member did not change) is to set the branch address (using the old name of the members) in the output Tree to the current address in memory of the data members.
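
To make the address problem concrete, here is a toy model in plain C++. It is not ROOT code and does not claim to reproduce ROOT internals: ToyBranch, Header, and copy_events are made-up names, and the sentinel value -1 stands in for whatever stale content sits in a staging slot that is never refreshed. It only illustrates why an output branch has to point at the live object's member to record the evolved value; it does not show that the workaround succeeds in ROOT.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a branch: it remembers an address and copies the int
// found there every time Fill() is called.
struct ToyBranch {
  const int* address{nullptr};
  std::vector<int> stored;
  void SetAddress(const int* a) { address = a; }
  void Fill() {
    assert(address != nullptr);
    stored.push_back(*address);
  }
};

// Hypothetical v2 in-memory object (new member names only).
struct Header {
  int run_number{0};
  int event_number{0};
};

// Copy n entries. If rebind is true, the output branch is pointed at the
// live object's member; otherwise it stays on a scratch slot that nothing updates.
std::vector<int> copy_events(int n, bool rebind) {
  int staging = -1;  // sentinel for stale scratch memory (assumption of this model)
  Header live;
  ToyBranch out_event;
  out_event.SetAddress(rebind ? &live.event_number : &staging);
  for (int i = 0; i < n; ++i) {
    live.event_number = i;  // stands in for reading entry i and applying the rule
    out_event.Fill();
  }
  return out_event.stored;
}
```

In this model the printouts during the copy would look fine in both cases, because they read the live object; only the rebound branch records the values that were printed.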

I can avoid this write-out error by manually syncing the addresses between the input and output TTree (instead of using CloneTree), but then I cannot expect branches that are not “observed” while copying to be copied at all.

This might actually be the best solution. And to make it work you can do:

auto inputTree = file->Get<TTree>(treename);
TTree *outputTree = createOutputTree();
inputTree->GetEntry(0); // Make sure the addresses are populated in the inputTree
inputTree->CopyAddresses(outputTree);
// Point both trees at the same in-memory object for the changed branch
changedType *transfer = nullptr;
inputTree->SetBranchAddress(toplevel_branch_name, &transfer);
outputTree->SetBranchAddress(toplevel_branch_name, &transfer);

What do you imagine is in createOutputTree?

When I simply construct a new TTree in place of createOutputTree:

I get a few warnings that are understandable:

# when attempting to copy
Warning in <TTree::CopyAddresses>: Could not find branch named 'header' in tree named 'tree'
Error in <TTree::SetBranchAddress>: unknown branch -> header

# when attempting to read the copy
Error in <TTreeReaderValueBase::CreateProxy()>: The tree does not have a branch called header. You could check with TTree::Print() for available branches.

I want to inherit the structure of the input tree for all branches that are not “observed” (i.e. read into memory while doing the copying/slow cloning). This feels like what CloneTree is intended to do, so is there a way to clone a tree and then detach specific branches from their old memory slots? I had assumed that SetBranchAddress would do this, but when I try to replace createOutputTree with input_tree->CloneTree(0), it still doesn’t work.

More detail on “it still doesn’t work”:

diff --git i/slow-clone-schema-evolve/copy.cxx w/slow-clone-schema-evolve/copy.cxx
index 92c653c..07d4adf 100644
--- i/slow-clone-schema-evolve/copy.cxx
+++ w/slow-clone-schema-evolve/copy.cxx
@@ -38,8 +38,8 @@ int main(int nargs, char** argv) {
   /**
    * Solution as described by @pcanal on ROOT Forum.
    */
-  TTree* output_tree = new TTree("tree", "tree");
   input_tree->GetEntry(0);
+  TTree* output_tree = input_tree->CloneTree(0);
   input_tree->CopyAddresses(output_tree);
   Header* h_ptr = nullptr; //new Header;
   input_tree->SetBranchAddress("header", &h_ptr);
$ denv ./build/copy-v2-copy-addr data/v1-output.root data/v2-copy-addr-clone-tree-v1-output.root                                                               
manual schema evolution rule being applied                                                                            
manual schema evolution rule being applied                                                                            
{ run: 42, event: 0 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 1 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 2 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 3 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 4 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 5 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 6 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 7 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 8 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 9 }
$ denv ./build/read-v2 data/v2-copy-addr-clone-tree-v1-output.root
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                         

What do you imagine is in createOutputTree?

I assume that in

I can avoid this write-out error by manually syncing the addresses between the input and output TTree (instead of using CloneTree)

there was a step creating the output TTree.

This feels like what CloneTree is intended to do, so is there a way to clone a tree and then detach specific branches from their old memory slots?

Maybe … i.e. call SetBranchAddress on those (sub)branches after the first call to GetEntry(0), so that it is not overridden when the top-level addresses are set.

but when I try to replace createOutputTree with input_tree->CloneTree(0), it still doesn’t work.

Yes, but in that scenario the output tree has the old structure and you need to explicitly connect the sub-branches of the renamed data members to the in-memory object (it is probably a bug that the cloned tree is not connected to the scratch area where the old data members are staged, but there might also be lifetime issues).

Ok, this is still not working. Here is the full copying program:

This reads appropriately while copying, but the resulting file it writes is not readable by either v1 or v2.

Copying v1-output with v2 producing v2-copy-v1-output:
                                                    
manual schema evolution rule being applied                                                                            
manual schema evolution rule being applied                                                                            
{ run: 42, event: 0 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 1 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 2 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 3 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 4 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 5 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 6 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 7 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 8 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 9 }       

Contents of v2-copy-v1-output according to v2:                                                                        
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 536879104, event: 536879104 }         

Contents of v2-copy-v1-output according to v1:
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }
{ run: 536879104, event: 536879104 }

Moving input_tree->GetEntry(0) to before CloneTree also does not work, but instead of writing garbage data, it writes all zeroes (from the default constructor?).

diff --git a/slow-clone-schema-evolve/copy.cxx b/slow-clone-schema-evolve/copy.cxx
index 1913d6a..155e434 100644
--- a/slow-clone-schema-evolve/copy.cxx
+++ b/slow-clone-schema-evolve/copy.cxx
@@ -17,8 +17,8 @@ int main(int nargs, char** argv) {
   TTree* input_tree{f.Get<TTree>("tree")};
   TFile o{argv[2], "recreate"};
 
-  input_tree->GetEntry(0);
   TTree* output_tree = input_tree->CloneTree(0);
+  input_tree->GetEntry(0);
   Header* h_ptr = nullptr; //new Header;
   input_tree->SetBranchAddress("header", &h_ptr);
   output_tree->SetBranchAddress("header", &h_ptr);

Results in

Copying v1-output with v2:                                                                                            
manual schema evolution rule being applied                                                                            
manual schema evolution rule being applied                                                                            
{ run: 42, event: 0 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 1 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 2 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 3 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 4 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 5 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 6 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 7 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 8 }                                                                                                 
manual schema evolution rule being applied                                                                            
{ run: 42, event: 9 }         

Contents of v2-copy-v1-output according to v2:                                                                        
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied                                                                            
{ run: 0, event: 0 }                                                                                                  
manual schema evolution rule being applied
{ run: 0, event: 0 }
manual schema evolution rule being applied
{ run: 0, event: 0 }
manual schema evolution rule being applied
{ run: 0, event: 0 }
manual schema evolution rule being applied
{ run: 0, event: 0 }
manual schema evolution rule being applied
{ run: 0, event: 0 }
manual schema evolution rule being applied
{ run: 0, event: 0 }

Contents of v2-copy-v1-output according to v1:
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }
{ run: 0, event: 0 }