Dear @vpadulan, thanks for the reply.
I spent some time trying to figure out what was going on, and it turned out to be sufficient to change the default RooAbsData storage type to a tree:
ROOT.RooAbsData.setDefaultStorageType(ROOT.RooAbsData.Tree)
However, when I do
ws = ROOT.RooWorkspace("myws")
ws.Import(dataset)
# The second argument of writeToFile is the boolean "recreate" flag, not a TFile option string
ws.writeToFile("myfile.root", True)
I am a bit puzzled as to why the final TFile contains both a TTree and a RooWorkspace.
Are the workspace and the TTree somehow ‘linked’ in the final storage? I.e., can the saved workspace be read regardless of whether the TTree that was added on saving is present?
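To look at what actually ends up in the file, something like the following could be used. It is only a minimal sketch: `read_top_level_keys` and `group_by_class` are hypothetical helper names I made up, and the sketch only lists the top-level keys; it does not by itself show whether the workspace depends on the TTree.

```python
from collections import defaultdict


def read_top_level_keys(path):
    """Return (name, class name) pairs for the top-level keys of a ROOT file."""
    import ROOT  # imported lazily so that group_by_class can be used without ROOT

    f = ROOT.TFile.Open(path)
    pairs = [(key.GetName(), key.GetClassName()) for key in f.GetListOfKeys()]
    f.Close()
    return pairs


def group_by_class(pairs):
    """Group object names by the class name stored in their key."""
    grouped = defaultdict(list)
    for name, class_name in pairs:
        grouped[class_name].append(name)
    return dict(grouped)
```

Calling `group_by_class(read_top_level_keys("myfile.root"))` should then show which objects are stored as `RooWorkspace` and which as `TTree`.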
For the reproducer, I think the issue is that RooWorkspace.writeToFile saves a normal TKey when the storage type defaults to a vector, whereas the TTree data store type does not have the same limitation. It is still a bit unclear to me, though, why both a TTree and a RooWorkspace are ultimately saved.
Here is some example code that creates and saves a dataset and loads it back. The error shows up when the dataset has too many entries and too many columns to save.
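If that guess is right, a back-of-the-envelope estimate shows how quickly a dataset of doubles grows with entries and columns. This is only a sketch: the limit of roughly 1 GiB for a single object's buffer is my assumption and is not something I measured here, and the estimate ignores any overhead or compression.

```python
# Rough payload estimate for a dataset holding only doubles.
BYTES_PER_DOUBLE = 8

# ASSUMPTION: a single-object buffer limit of about 1 GiB (not measured in this post).
ASSUMED_BUFFER_LIMIT = 1 << 30


def payload_bytes(n_entries, n_cols):
    """Raw size of n_entries rows with n_cols double columns, without overhead."""
    return n_entries * n_cols * BYTES_PER_DOUBLE


def max_entries(n_cols, limit=ASSUMED_BUFFER_LIMIT):
    """Largest number of rows whose raw payload still fits under the assumed limit."""
    return limit // (n_cols * BYTES_PER_DOUBLE)


print(payload_bytes(1000, 80))  # 640000 bytes for the reproducer's default settings
print(max_entries(80))
```

With the 1000 entries and 80 columns used in the reproducer the payload is tiny, so `n_entries` has to be raised by several orders of magnitude before any such limit would matter.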
import ROOT
import numpy as np

def save():
    n_entries = 1000
    ROOT.EnableImplicitMT()
    df = ROOT.RDataFrame(n_entries)
    # Define columns in the RDataFrame
    # df = df.Define("float_col" , ",".join(map(str, float_col))) #.Define("float_col2","float(float_col)")
    ncols = 80
    for i in range(ncols):
        df = df.Define(f"double_col_{i}", "1.5")
    ROOT.RDF.Experimental.AddProgressBar(ROOT.RDF.AsRNode(df))
    ROOT.RooAbsData.setDefaultStorageType(ROOT.RooAbsData.Tree)
    vars_list = []
    vars_name = []
    for c in df.GetColumnNames():
        v = ROOT.RooRealVar(str(c), str(c), 0)
        v.setConstant(0)
        vars_list.append(v)
        vars_name.append(str(c))
    helper = ROOT.RooDataSetHelper("dataset", "Title of dataset", ROOT.RooArgSet(*vars_list))
    roo_data_set_result = df.Book(ROOT.std.move(helper), vars_name)
    df.Count()
    roo_data_set_result.Print()
    ws = ROOT.RooWorkspace("space", "space")
    ws.Import(roo_data_set_result.GetValue())
    ws.writeToFile("test.root", True)

def load():
    # ROOT.RooAbsData.setDefaultStorageType(ROOT.RooAbsData.Tree)
    f = ROOT.TFile("test.root")
    ws = f.Get("space")
    return ws, f

if __name__ == "__main__":
    save()
    wspace, filein = load()
    wspace.Print()
    print(wspace["dataset"].sumEntries())
    wspace["double_col_0"].Print()
    v0 = wspace["double_col_0"]
    ds = wspace["dataset"]
    frame = v0.frame(ROOT.RooFit.Bins(10), ROOT.RooFit.Range(0, 4))
    ds.plotOn(frame)
    cc = ROOT.TCanvas()
    frame.Draw()
    cc.SaveAs("test.pdf")