What's with my code?

I’ve written the following code, which takes in three sorted ROOT files, each containing a TNtupleD of UNIX timestamps from the year 2015, and pairs them up into the closest unique triplets of the form {a,b,c}. I then plot these triplets on a TH2, with the horizontal axis being a-b, and the vertical axis being a-c.

To ensure the algorithm is working properly, I produced two ROOT files of completely random timestamps, using gRandom->Uniform() and setting the seed value using time(). I fed the code one live dataset and the two random datasets. I expected a random distribution, but instead I’m getting these weird, perfect diagonal lines: https://filebin.net/spbswnhkfy8xdkp8/random_plot.pdf?t=48f2yhjq

Moreover, when I plot using three real datasets and use enough bins, there is a weird diagonal line, plus two more lines running straight through the graph horizontally and vertically. Have I made some mistake with my code? Is this a ROOT thing? A C++ problem?

My code:

void unbiasedAnalysis(){

	TNtupleD *D  = new TNtupleD("D","D","x:y");
 
	ROOT::RDataFrame statn1("D", "./pathtodata");
	ROOT::RDataFrame statn2("D", "./pathtodata");
	ROOT::RDataFrame statn3("D", "./pathtodata");
 
	vector<double> vec_1, vec_2, vec_3;
	statn1.Foreach([&](double tstamp){ vec_1.push_back(tstamp); },{"UNIX"});
	statn2.Foreach([&](double tstamp){ vec_2.push_back(tstamp); },{"UNIX"});
	statn3.Foreach([&](double tstamp){ vec_3.push_back(tstamp); },{"UNIX"});
 
	vector<vector<double>> pairs;
	for(auto tstamp : vec_1){
 
		double first,second;
 
		//get iterator pointing to closest element greater than or equal to
		auto geq = std::lower_bound(vec_2.begin(), vec_2.end(), tstamp);
		//get iterator pointing to nearest element less than
		auto leq = geq - 1;
 
		double foo = tstamp - *geq;
		double bar = tstamp - *leq;
 
		//compare iterators, save the closest 
		if(dabs(foo) <  dabs(bar)){ first = *geq; }
		else { first = *leq; }
 
 
 
 
		//repeat
		geq = std::lower_bound(vec_3.begin(), vec_3.end(), tstamp);
		leq = geq - 1;
 
		foo = tstamp - *geq;
		bar = tstamp - *leq;
 
		if(dabs(foo) < dabs(bar)){ second = *geq; }
		else { second = *leq; }
 
		//add to pairs
		pairs.push_back({tstamp, first, second, (tstamp-first), (tstamp-second), std::min((tstamp-first), (tstamp-second))});
 
	}
 
	//sort vector of vectors by size of smallest difference
	std::sort(pairs.begin(), pairs.end(),
		[](const vector<double>& A, const vector<double>& B){
			return A[5] < B[5];
	});
 
	std::set<double> cache;
 
	ROOT::EnableImplicitMT();
 
	for(auto pair : pairs){
		//if not in cache, add to TNtuple
		if(cache.find(pair[1]) == cache.end() && cache.find(pair[2]) == cache.end()){
 
				D->Fill(pair[3],pair[4]);

			//add to cache
			cache.insert(pair[1]); cache.insert(pair[2]);
		}
	}
 
	D->Draw("x:y>>htemp(100,-0.02,0.02,100,-0.02,0.02)","","colz");
 
}


ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided


Hi,

Can you re-share the plot you get? I can’t access it.

Asking @eguiraud to give a hand with this one.

Oh, sorry: https://filebin.net/5gl6a5hw30qg0ja2/Lines.pdf?t=aa7l5bxd
That link should work.

Hi,

Unix timestamps as doubles? Could something be wrong with the conversion? How do you write the unix time to the NTuple?
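For what it's worth, here is a quick way to check the precision side of that question without ROOT. It is only a sketch: `spacing_double` and `spacing_float` are made-up helper names, and 1420070400.25 is an illustrative value (a quarter of a second past 2015-01-01 00:00:00 UTC), not one of your data points. It measures the gap between a timestamp and the next representable value, once as a double and once after narrowing to float.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: gap between t and the next representable double above it.
double spacing_double(double t) {
    assert(std::isfinite(t));
    return std::nextafter(t, std::numeric_limits<double>::infinity()) - t;
}

// Hypothetical helper: the same gap after narrowing t to single precision,
// which is what happens if a timestamp passes through a float on the way in.
float spacing_float(double t) {
    assert(std::isfinite(t));
    const float f = static_cast<float>(t);
    return std::nextafter(f, std::numeric_limits<float>::infinity()) - f;
}
```

For a 2015 timestamp the double gap is 2^-22 s (about 0.24 microseconds), well below the 0.4 ms bin width in the Draw call, while the float gap is 128 s. A TNtupleD stores doubles, but a plain TNtuple stores floats, so it is worth confirming that no step of the writing chain goes through single precision.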

Hi @KAM,
given that statn1, statn2 and statn3 contain the correct numbers (I suppose they do, the code looks ok), I suggest you manually fill x and y with numbers that you know will produce the plot you want, and double-check that D->Draw(..) works as expected (I expect it will). With that, ROOT features are out of the way and what remains is to debug your logic, I’m afraid :sweat_smile: Stepping through with a debugger like gdb will definitely help.
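As a starting point for testing the logic in isolation, here is a minimal sketch of the nearest-timestamp lookup as a standalone function. The names `nearest` and `sort_key` are hypothetical, not taken from the posted macro. Two things it guards against: std::lower_bound returns end() when no element is greater than or equal to the value, and begin() when the first element already is, so dereferencing `geq` or computing `geq - 1` without a check is undefined behaviour in those cases. Also, taking std::min of the signed differences sorts large negative differences first, whereas the comment in the macro asks for the smallest difference. Whether either point explains the lines in the plot is not established here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: the element of a sorted, non-empty vector closest to t.
// Ties go to the smaller element, as in the posted loop.
double nearest(const std::vector<double>& sorted, double t) {
    assert(!sorted.empty());
    auto geq = std::lower_bound(sorted.begin(), sorted.end(), t);
    if (geq == sorted.begin()) return *geq;        // no element below t
    if (geq == sorted.end())   return *(geq - 1);  // no element >= t
    auto lt = geq - 1;                             // nearest element below t
    return (std::fabs(t - *geq) < std::fabs(t - *lt)) ? *geq : *lt;
}

// Hypothetical sort key: the smaller of the two absolute differences,
// so the ordering does not depend on the sign of a-b or a-c.
double sort_key(double a, double b, double c) {
    return std::min(std::fabs(a - b), std::fabs(a - c));
}
```

With a handful of hand-picked timestamps you can check by eye which triplets should come out, then compare against what the macro fills into the TNtupleD.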

If you do find out that Foreach or Draw do not work as expected, please post a small reproducer with data we can use to investigate and fix the ROOT bug.

Also note that the ROOT::EnableImplicitMT call you have does not do anything: there are no implicitly parallel ROOT features used afterwards.

Cheers,
Enrico

Sorry. To clarify, these are UNIX timestamps plus the sub-second part (hence the double). I’ve used Foreach() to print out the timestamps, and have verified that the files are ok. To write the timestamps to the TNtuple, I iterate over a file containing the time information and use Fill() to fill the TNtuple with each timestamp.
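In case it helps anyone reproduce this, below is a rough sketch of the reading step with a sortedness check added, since std::lower_bound only gives meaningful results on sorted input. It assumes one timestamp per line in a text stream, which may not match the real file layout, and the names `read_timestamps` and `is_usable_for_lower_bound` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: parses one timestamp per line as a double.
// Empty lines are skipped; std::stod throws on lines that are not numbers.
std::vector<double> read_timestamps(std::istream& in) {
    std::vector<double> out;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        out.push_back(std::stod(line));
    }
    return out;
}

// Hypothetical check: true if the vector is non-empty and in ascending order,
// i.e. safe to search with std::lower_bound.
bool is_usable_for_lower_bound(const std::vector<double>& v) {
    return !v.empty() && std::is_sorted(v.begin(), v.end());
}
```

Running the check on each of the three vectors right after they are filled would rule out an ordering problem before the matching loop starts.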