RDataFrame: "vectors of different sizes" error

Hi, I’m getting the error message below.

line 479, in <module>  hNQvT.Add(i.GetValue())
cppyy.gbl.std.runtime_error: const TH2D& ROOT::RDF::RResultPtr<TH2D>::GetValue() =>
    runtime_error: Cannot call operator / on vectors of different sizes.

Below is part of my code:

df = rt.RDataFrame("Events", [argv[1],argv[2],argv[3],argv[4],argv[5]])
HList=[]     # holds all histograms
hb_NormQ = defaultdict(dict)  #Linearized charge, normalized by the amplitude of SOI (hb_fc3)

s_QSum = 'hb_fc0+hb_fc1+hb_fc2+hb_fc3+hb_fc4+hb_fc5+hb_fc6+hb_fc7'
df = df.Define('QSum',s_QSum) # sum of charges from all time samples, for HB

#Bunch of G matrices....

#generate signal amplitudes
matrices={-10:Gm10 ,-9:Gm9, -8:Gm8, -7:Gm7, -6:Gm6, -5:Gm5, -4:Gm4, -3:Gm3, -2:Gm2, -1:Gm1, 0:G0, 1:G1, 2:G2, 3:G3, 4:G4, 5:G5, 6:G6, 7:G7, 8:G8, 9:G9, 10:G10, 11:G11, 12:G12, 13:G13, 14:G14, 15:G15, 16:G16, 17:G17, 18:G18, 19:G19, 20:G20, 21:G21}

sig_amp = defaultdict(dict)

def MatMul(M,i):

sig_amp = defaultdict(dict)

for tshift in range(-10,21):
    for tslice in range(8):

#make RDataFrame with amplitudes
for tshift in range(-10,21):
    for tslice in range(8):
        varName = str(tshift).replace("-","m")
        df = df.Define(f"tshift_{tslice}_{varName}",f"tshift=={tshift}").Define(f"sig_amp_{tslice}_{varName}",sig_amp[tshift][tslice])

#make GoodPulse cut

for tshift in range(-10,21):
    varName = str(tshift).replace("-","m")
    df=df.Define(f"GoodPulse_{varName}",f"0.1*(sig_amp_3_{varName})>sig_amp_2_{varName} && 0.1*(sig_amp_3_{varName})>sig_amp_4_{varName} && sig_amp_0_{varName}+sig_amp_1_{varName}+sig_amp_5_{varName}+sig_amp_6_{varName}+sig_amp_7_{varName}<10000")

#make Norm_Q
for tshift in range(-10,21):
    varName = str(tshift).replace("-","m") 
    for tslice in range(8):
        df = df.Define(f"RealTime{tslice}_{varName}",f"0*cut_hb_fc{tslice}_{varName}+25*{tslice}-(tshift)")
        df = df.Define(f"hb_NormQ_{tslice}_{varName}", f"cut_hb_fc{tslice}_{varName}/sig_amp_3_{varName}")
    df = df.Define(f"EWeight{varName}",f"cut_hb_fc0_{varName}+cut_hb_fc1_{varName}+cut_hb_fc2_{varName}+cut_hb_fc3_{varName}+cut_hb_fc4_{varName}+cut_hb_fc5_{varName}+cut_hb_fc6_{varName}+cut_hb_fc7_{varName}")

#make plots

for tshift in range(-10,21):
    for tslice in range(8):
        varName = str(tshift).replace("-","m")

hNQvT = rt.TH2D("hNQvT","Phase aligned normal pulses;Time [ns];Normalized charge [fC]",200,0,200,200,0,2)
hNQvT_EnW = rt.TH2D("hNQvT_EnW","Phase aligned normal pulses | Energy weighted;Time [ns];Normalized charge [fC]",200,0,200,200,0,2.0)

for i in htemp1:

for i in htemp2:

HList.append(df.Histo1D(("hTShift",";tshift [ns]",30,-10.5,19.5),'tshift'))
HList.append(df.Histo1D(("hQSum","Charge summed of all time samples;charge [fC]",100,0,2e6),"QSum"))

tf_out = rt.TFile(argv[6],'RECREATE')
for hh in HList:

I believe this is because the histograms I’m appending to the list htemp1 have different numbers of x and y values, but I don’t think that should happen. Am I missing something? I expect the same number of x and y values because I applied the [GoodPulse] cut to build cut_hb_fc, and then used that to make both the x value (RealTime) and the y value (hb_NormQ).

ROOT Version: 6.26/11
Platform: Ubuntu
Compiler: python3


Thanks for posting, and welcome to the forum!
The code you posted is sophisticated (it does many things, as it should) and a bit hard to read. We do not have indications of bugs in the detection of input collections of different sizes so far.

My suggestion would be to add somewhere a debug node to check what is going on, perhaps a filter like "cout << rdfentry_; return true;" to check at what entry the problem occurs and then to check the content of that row with the Display method.

Let us know how it goes.


It says rdfentry_ is not defined. I didn’t put return True because it said it needs to be inside a function.

Apologies for that.
Could you try

Define("x", "rdfentry_").Filter("cout << x << endl; return true;")

That should work for 6.26 and given it’s debugging it might be considered ok-ish.


I did

df=df.Define("x", "rdfentry_").Filter("print(x), return True")

since this is python3, but I may have written nonsense (sorry, I am not familiar with mixing C++ and Python), because it gave me errors.

I might be off but since the error is

Cannot call operator / on vectors of different sizes.

and the only occurrence of the / operator in the code is in this expression:

f"cut_hb_fc{tslice}_{varName}/sig_amp_3_{varName}"

the problem is likely that variables cut_hb_fc{tslice}_{varName} and sig_amp_3_{varName} are arrays of different sizes (at least for one combination of tslice and varName). You can verify that with a printout like Danilo suggests (using C++'s cout rather than Python’s print in the string expression).
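
A minimal sketch of such a printout, assuming `df` already has the helper column `x` from `Define("x", "rdfentry_")` and that both columns are `RVec`s (so `.size()` is available). The function names here are hypothetical. The strings passed to `Filter` are C++ that ROOT compiles on the fly, so Python's `print` and `True` are not valid inside them.

```python
def size_check_expr(num_col, den_col, entry_col="x"):
    """Build a C++ Filter expression that reports entries where two
    vector columns differ in size. It always returns true, so no
    event is dropped by this diagnostic filter."""
    return (
        f"if ({num_col}.size() != {den_col}.size()) "
        f'std::cout << "entry " << {entry_col} '
        f'<< ": {num_col}.size()=" << {num_col}.size() '
        f'<< ", {den_col}.size()=" << {den_col}.size() << std::endl; '
        "return true;"
    )


def add_size_checks(df, tshifts=range(-10, 21), tslices=range(8)):
    """Attach one diagnostic Filter per (tshift, tslice) pair, using the
    column naming scheme from the posted code. `df` is assumed to be an
    RDataFrame node that already defines the entry column `x`."""
    for tshift in tshifts:
        var_name = str(tshift).replace("-", "m")
        for tslice in tslices:
            expr = size_check_expr(f"cut_hb_fc{tslice}_{var_name}",
                                   f"sig_amp_3_{var_name}")
            df = df.Filter(expr)
    return df


# Usage (requires ROOT; shown as comments so the helpers stay importable):
#   df = add_size_checks(df.Define("x", "rdfentry_"))
#   df.Count().GetValue()   # triggers the event loop and the printouts
```

Running the check before the `hb_NormQ` columns are defined keeps the failing division out of the event loop, so the printout can finish and list every mismatching entry.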

I hope this helps!

Hi, my error was indeed from this part. I fixed it by adding [GoodPulse] cut on sig_amp_3_{varName}. Thank you so much!
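
A sketch of what that fix could look like, assuming `cut_hb_fc{tslice}_{varName}` was built by indexing with `GoodPulse_{varName}` (that definition is not shown in the posted code) and that `GoodPulse_{varName}` is a per-element mask. Applying the same mask to the denominator keeps both vectors the same length. The helper name `norm_q_expr` is hypothetical.

```python
def norm_q_expr(tslice, var_name):
    """Return the C++ expression for the normalized charge, with the
    GoodPulse mask applied to the denominator as well. In C++ the
    subscript binds tighter than '/', so the mask is applied first."""
    mask = f"GoodPulse_{var_name}"
    return f"cut_hb_fc{tslice}_{var_name}/sig_amp_3_{var_name}[{mask}]"


# Usage inside the original loop (requires ROOT, so shown as a comment):
#   df = df.Define(f"hb_NormQ_{tslice}_{varName}", norm_q_expr(tslice, varName))
```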