Segmentation violation during Snapshot() of an existing tree using RDataFrame

Dear ROOT experts,

I’ve got a file that contains 3 trees. I want to add a branch to one of these trees and write it back to the same file under the same name. I’ve converted the code from this post to Python and it works perfectly.

import ROOT

df = ROOT.ROOT.RDataFrame(10).Define("x", "gRandom->Rndm()")

# produce initial file
df.Snapshot("t1", "f.root", "")

# write another tree to that file
opts1 = ROOT.ROOT.RDF.RSnapshotOptions()
opts1.fMode = "update"
df.Snapshot("t2", "f.root", "", opts1)

# overwrite the t1
opts2 = ROOT.ROOT.RDF.RSnapshotOptions()
opts2.fMode = "update"
opts2.fOverwriteIfExists = True
df.Range(5).Snapshot("t1", "f.root", "", opts2)

However, in a more realistic situation, when I try to open an already existing file and write to it, the program fails with a segmentation violation and non-descriptive Python and C++ errors.

import ROOT

def create_file(file_name, tree_name_list):
    df = ROOT.ROOT.RDataFrame(10).Define("x", "gRandom->Rndm()")
    df.Snapshot(tree_name_list[0], file_name, "")

    for tree_name in tree_name_list[1:]:
        opts = ROOT.ROOT.RDF.RSnapshotOptions()
        opts.fMode = "update"
        df.Snapshot(tree_name, file_name, "", opts)

def change_tree(file_name, tree_name):
    df = ROOT.ROOT.RDataFrame(tree_name, file_name)

    opts = ROOT.ROOT.RDF.RSnapshotOptions()
    opts.fMode = "update"
    opts.fOverwriteIfExists = True
    df.Range(5).Snapshot(tree_name, file_name, "", opts)

def change_tree_add_branch(file_name, tree_name, branch_name):
    df = ROOT.ROOT.RDataFrame(tree_name, file_name)
    df = df.Define(branch_name, "x")

    opts = ROOT.ROOT.RDF.RSnapshotOptions()
    opts.fMode = "update"
    opts.fOverwriteIfExists = True
    df.Snapshot(tree_name, file_name, "", opts)

def main():
    file_name = "f.root"
    tree_name_list = ["t1", "t2", "t3"]
    create_file(file_name, tree_name_list)
    change_tree(file_name, tree_name_list[0])
    change_tree_add_branch(file_name, tree_name_list[0], "y")

if __name__ == "__main__":
    main()

The create_file() function works perfectly fine, but both change_tree() (which truncates the tree as in the example thread) and change_tree_add_branch() (which adds a branch to the tree) fail.

What is the correct way to add a new branch to the existing tree using RDataFrame?

Thanks in advance,
Aleksandr


ROOT Version: 6.28.04
Platform: Ubuntu 20.04
Compiler: Precompiled


Dear @apetukho ,

In the post you mention (“Is there a way to save dataframe to an already existing root file?”, #6 by Karl007) it is also discussed that it is not possible to update an already existing tree within a file. This hasn’t changed in the meantime. But as Enrico suggests in that post, you could create a new tree with the branch you were trying to attach and use it as a friend tree via TTree::AddFriend later on.

Cheers,
Vincenzo

Dear Vincenzo,

Thank you for your response.
Then I’m confused about the meaning of the fOverwriteIfExists snapshot option. The description says “If fMode is “UPDATE”, overwrite object in output file if it already exists.” Isn’t it the same as “updating an already existing tree within a file”? And why does it work for the first example, updating the t1 tree?

I’m afraid using a friend tree is not an option, as it is not supported by the fitting frameworks that are used afterwards. I’m trying to calculate the TMVA BDT classifier response and write it as a branch in the same tree as the input variables using PyROOT. Right now I’ve got a working but poorly performing implementation that uses Python loops of the form “for event in tree”. I was hoping to use the RReader interface within RDataFrame to speed up this process, but this approach is now hindered by the inability to write the updated tree directly to the existing file.
So I guess the easiest workaround right now is to write the trees to a new file using Snapshot(), join the files and remove the old ones? Or is there any other approach that would speed up the TMVA response computation in PyROOT and would allow updating an already existing tree within a file?

Best regards,
Aleksandr

Hi,

sorry for the late reply. Do I understand correctly that you want to overwrite the contents of a tree while you are reading that same tree? I don’t think that’s supported (by TTree/TFile, that is).

Cheers,
Enrico

If you make the ‘friendship’ permanent (by storing an updated TTree object after setting the friendship, which does not duplicate the data), some tools should not notice the difference between the single-tree and friend-tree cases.
