What's with my code?

I’ve written the following code, which takes in three sorted ROOT files, each containing a TNtupleD of UNIX timestamps from the year 2015, and pairs them up into the closest unique triplets of the form {a,b,c}. I then plot these triplets on a TH2, with the horizontal axis being a-b, and the vertical axis being a-c.

To ensure the algorithm is working properly, I produced two ROOT files of completely random timestamps, using gRandom->Uniform() and setting the seed value using time(). I fed the code one live dataset and the two random datasets. I expected a random distribution, but instead I’m getting these weird, perfect diagonal lines: https://filebin.net/spbswnhkfy8xdkp8/random_plot.pdf?t=48f2yhjq

Moreover, when I plot using three real datasets, if I use enough bins there is a weird diagonal line, plus two more lines running straight through the graph horizontally and vertically. Have I made some mistake with my code? Is this a ROOT thing? A C++ problem?

My code:

void unbiasedAnalysis(){

	TNtupleD *D  = new TNtupleD("D","D","x:y");
 
	ROOT::RDataFrame statn1("D", "./pathtodata");
	ROOT::RDataFrame statn2("D", "./pathtodata");
	ROOT::RDataFrame statn3("D", "./pathtodata");
 
	vector<double> vec_1, vec_2, vec_3;
	statn1.Foreach([&](double tstamp){ vec_1.push_back(tstamp); },{"UNIX"});
	statn2.Foreach([&](double tstamp){ vec_2.push_back(tstamp); },{"UNIX"});
	statn3.Foreach([&](double tstamp){ vec_3.push_back(tstamp); },{"UNIX"});
 
	vector<vector<double>> pairs;
	for(auto tstamp : vec_1){
 
		double first,second;
 
		//get iterator pointing to closest element greater than or equal to
		auto geq = std::lower_bound(vec_2.begin(), vec_2.end(), tstamp);
		//get iterator pointing to nearest element less than
		auto leq = geq - 1;
 
		double foo = tstamp - *geq;
		double bar = tstamp - *leq;
 
		//compare iterators, save the closest 
		if(dabs(foo) <  dabs(bar)){ first = *geq; }
		else { first = *leq; }
 
		//repeat
		geq = std::lower_bound(vec_3.begin(), vec_3.end(), tstamp);
		leq = geq - 1;
 
		foo = tstamp - *geq;
		bar = tstamp - *leq;
 
		if(dabs(foo) < dabs(bar)){ second = *geq; }
		else { second = *leq; }
 
		//add to pairs
		pairs.push_back({tstamp, first, second, (tstamp-first), (tstamp-second), std::min((tstamp-first), (tstamp-second))});
 
	}
 
	//sort vector of vectors by size of smallest difference
	std::sort(pairs.begin(), pairs.end(),
		[](const vector<double>& A, const vector<double>& B){
			return A[5] < B[5];
	});
 
	std::set<double> cache;
 
	ROOT::EnableImplicitMT();
 
	for(auto pair : pairs){
		//if not in cache, add to TNtuple
		if(cache.find(pair[1]) == cache.end() && cache.find(pair[2]) == cache.end()){
 
				D->Fill(pair[3],pair[4]);

			//add to cache
			cache.insert(pair[1]); cache.insert(pair[2]);
		}
	}
 
	D->Draw("x:y>>htemp(100,-0.02,0.02,100,-0.02,0.02)","","colz");
 
}

ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided


Hi,

Can you re-share the plot you get? I can’t access it.

Asking @eguiraud to give a hand with this one.

Oh, sorry: https://filebin.net/5gl6a5hw30qg0ja2/Lines.pdf?t=aa7l5bxd
That link should work.

Hi,

Unix timestamps as doubles? Could something be wrong with the conversion? How do you write the unix time to the NTuple?
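
To make the concern concrete, here is a minimal sketch of how much resolution each floating-point type has left at a 2015-era UNIX timestamp. I don't know whether your files are affected; this only shows why the storage type matters (a TNtupleD stores doubles, while a plain TNtuple stores floats). The names `spacing_at`, `through_float` and `kExampleTimestamp` are made up for this illustration and are not part of ROOT.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Illustrative value only: 2015-07-01 00:00:00 UTC as a UNIX timestamp.
constexpr double kExampleTimestamp = 1435708800.0;

// Gap between `value` and the next larger representable number of type T,
// i.e. the finest time step that T can resolve near `value`.
template <typename T>
T spacing_at(T value) {
    assert(std::isfinite(value));
    return std::nextafter(value, std::numeric_limits<T>::infinity()) - value;
}

// What a timestamp becomes if it passes through a float anywhere
// between the input file and the final double.
double through_float(double t) {
    return static_cast<double>(static_cast<float>(t));
}
```

Near this timestamp, `spacing_at<double>(kExampleTimestamp)` is 2^-22 s (about 0.24 microseconds), while `spacing_at<float>(static_cast<float>(kExampleTimestamp))` is 128 s, so `through_float(kExampleTimestamp + 0.25)` returns `kExampleTimestamp` with the sub-second part gone. If every step of your conversion stays in double, this is not the problem.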

Hi @KAM,
given that statn1, statn2 and statn3 contain the correct numbers (I suppose they do, the code looks ok), I suggest you manually fill x and y with numbers that you know will produce the plot you want, and double-check that D->Draw(..) works as expected (I expect it will). With that, ROOT features are out of the way and what remains is to debug your logic, I’m afraid :sweat_smile: Stepping through with a debugger like gdb will definitely help.
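
One thing worth checking while you step through: `geq - 1` is not a valid iterator when std::lower_bound returns begin(), and `*geq` must not be dereferenced when it returns end(). I can't say whether that is what produces the lines, but it is cheap to rule out. A minimal sketch of a lookup that handles both edge cases, assuming a non-empty vector sorted in ascending order (`nearest` is a hypothetical helper name, not a ROOT or standard library function):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Returns the element of `sorted` closest to `t`.
// Assumes `sorted` is non-empty and in ascending order.
// On a tie the smaller element wins, matching the `<` comparison in the post.
double nearest(const std::vector<double>& sorted, double t) {
    assert(!sorted.empty());
    assert(std::is_sorted(sorted.begin(), sorted.end()));

    // First element >= t; this is end() if t is larger than every element.
    auto geq = std::lower_bound(sorted.begin(), sorted.end(), t);

    if (geq == sorted.begin()) return *geq;        // nothing below t
    if (geq == sorted.end())   return *(geq - 1);  // nothing at or above t

    auto below = geq - 1;
    return (std::abs(t - *geq) < std::abs(t - *below)) ? *geq : *below;
}
```

Feeding it a tiny hand-made vector such as {1.0, 2.0, 4.0} with query values below, between and above the elements gives answers you can verify by eye, which is the same idea as filling x and y by hand.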

If you do find out that Foreach or Draw do not work as expected, please post a small reproducer with data we can use to investigate and fix the ROOT bug.

Also note that the ROOT::EnableImplicitMT call you have does not do anything: there are no implicitly parallel ROOT features used afterwards.

Cheers,
Enrico

Sorry. To clarify, these are UNIX timestamps plus the sub-second part (hence the double). I’ve used Foreach() to print out the timestamps, and have verified that the files are ok. To write the timestamps to the TNtuple, I iterate over a file containing the time information and use Fill() to fill the TNtuple with each timestamp.
