Hello,
I have a question about TTree file I/O. Once my task grows past a certain size, I get the following errors when writing my TTree to a TFile:
Error in TBufferFile::CheckCount: buffer offset too large (larger than 1073741822)
Error in TBufferFile::CheckCount: buffer offset too large (larger than 1073741822)
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
I read a few other posts and tried to fix the issue based on my understanding, but so far without any progress…
As far as I understand, the data for a branch is buffered and then streamed into the TFile, and if the total data size for the branch exceeds 1073741822 bytes, you get this error. Is that correct?
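For reference, here is a minimal sketch of the number in the error text. It only shows that 1073741822 is two bytes short of 1 GiB; the names kReportedLimit and exceeds_reported_limit are hypothetical helpers for illustration, not ROOT API, and the sketch says nothing about which buffer ROOT actually checks against this value.

```cpp
#include <cstdint>

// Limit quoted in the error messages (in bytes).
constexpr std::uint64_t kReportedLimit = 1073741822ULL;

// The value is two bytes short of 1 GiB (2^30), i.e. 0x3FFFFFFE.
static_assert(kReportedLimit == (1ULL << 30) - 2, "limit is 2^30 - 2");
static_assert(kReportedLimit == 0x3FFFFFFEULL, "limit is 0x3FFFFFFE");

// Hypothetical helper: true if a byte count is above the reported limit.
constexpr bool exceeds_reported_limit(std::uint64_t n_bytes) {
    return n_bytes > kReportedLimit;
}
```

With this, exceeds_reported_limit(440ULL * 1000 * 1000) is false, so a single 440 MB payload on its own would stay below the quoted threshold.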
Now to my task. In principle I want to save this map into a ROOT file
[code]
std::map<Point2D, Point2DCloud> hit_map_2d;
[/code]
with
[code]
struct Point2D {
  double x;
  double y;
};
struct Point2DCloud {
  std::map<Point2D, unsigned int> points;
  unsigned long total_count;
};
[/code]
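A side note on the structs: to be usable as a std::map key, Point2D needs an ordering, which is not shown here. The sketch below assumes a lexicographic comparison in (x, y); the actual operator in my code may differ, and any such ordering is only a valid strict weak ordering as long as no coordinate is NaN.

```cpp
struct Point2D {
    double x;
    double y;
};

// Assumed ordering (not part of the snippet above): compare x first,
// then y when the x values are equivalent.
constexpr bool operator<(const Point2D& a, const Point2D& b) {
    return a.x < b.x || (!(b.x < a.x) && a.y < b.y);
}
```

With this ordering, two points compare as equivalent map keys only if both coordinates match exactly.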
This map carries all of the detector resolution information I need for my smearing algorithm. Initially I had only two branches: one for the key (a Point2D branch) and one for the value (a Point2DCloud branch). The usual advice I found is that splitting the data should solve this problem, which I understood as splitting the Point2DCloud into its parts. So I created separate branches for the x coordinates, the y coordinates and the counters, as shown in the following code snippet.
[code]
data_tree = new TTree();
// new format
Point2D mc_point;
const Point2D *pmc_point(&mc_point);
std::vector<double> reco_points_x;
const std::vector<double> *preco_points_x(&reco_points_x);
std::vector<double> reco_points_y;
const std::vector<double> *preco_points_y(&reco_points_y);
std::vector<unsigned int> reco_points_count;
const std::vector<unsigned int> *preco_points_count(&reco_points_count);
ULong64_t total_count;
data_tree->Branch("mc_point", &pmc_point);
data_tree->Branch("reco_points_x", &preco_points_x);
data_tree->Branch("reco_points_y", &preco_points_y);
data_tree->Branch("reco_points_count", &preco_points_count);
data_tree->Branch("total_count", &total_count, "total_count/l");
// one tree entry per key of the map
for (auto const& entry : hit_map_2d) {
  pmc_point = &entry.first;
  total_count = entry.second.total_count;
  reco_points_x.clear();
  reco_points_y.clear();
  reco_points_count.clear();
  for (auto const& reco_point : entry.second.points) {
    reco_points_x.push_back(reco_point.first.x);
    reco_points_y.push_back(reco_point.first.y);
    reco_points_count.push_back(reco_point.second);
  }
  data_tree->Fill();
}
[/code]
But the problem persists. So I did a little analysis of how much memory my object should consume:
[code]
std::cout << "estimating size of data in memory...\n";
unsigned long bytes_unsigned_ints(0);
unsigned long bytes_doubles(0);
unsigned long total_overhead_bytes_unsigned_ints(0);
unsigned long total_overhead_bytes_doubles(0);
// fixed per-vector overhead (size of the vector object itself)
unsigned short overhead_unsigned_ints(sizeof(std::vector<unsigned int>));
unsigned short overhead_doubles(sizeof(std::vector<double>));
unsigned long all_reco_points(0);
for (auto const& entry : hit_map_2d) {
  bytes_unsigned_ints += overhead_unsigned_ints;
  total_overhead_bytes_unsigned_ints += overhead_unsigned_ints;
  bytes_doubles += overhead_doubles;
  total_overhead_bytes_doubles += overhead_doubles;
  // payload of one unsigned int vector and one double vector for this entry
  bytes_unsigned_ints += sizeof(unsigned int) * entry.second.points.size();
  bytes_doubles += sizeof(double) * entry.second.points.size();
  all_reco_points += entry.second.points.size();
}
std::cout << "memory summary:\n";
std::cout << "overhead for a single unsigned int vector: " << overhead_unsigned_ints << " bytes\n";
std::cout << "overhead for a single double vector: " << overhead_doubles << " bytes\n";
std::cout << "number of entries: " << hit_map_2d.size() << std::endl;
std::cout << "number of reco entries: " << all_reco_points << std::endl;
std::cout << "total memory consumption for all unsigned int vectors: " << bytes_unsigned_ints << " bytes\n";
std::cout << "total memory consumption for all double vectors: " << bytes_doubles << " bytes\n";
std::cout << "total overhead for the unsigned int vectors: " << total_overhead_bytes_unsigned_ints << " bytes\n";
std::cout << "total overhead for the double vectors: " << total_overhead_bytes_doubles << " bytes\n\n";
std::cout << "converting hit map to root tree...\n";
[/code]
So the largest branches should be reco_points_x and reco_points_y, each with a size of roughly 440 MB. How much overhead does the buffer add for storing one tree entry? Or am I missing something else? Thanks in advance.
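To put the numbers side by side, here is a rough back-of-the-envelope sketch. It assumes that "roughly 440 MB" means about 440e6 bytes per double branch, that unsigned int is 4 bytes and double is 8 bytes, and it ignores any per-entry overhead. The constant names are hypothetical. Whether the three vector branches are ever counted together in one buffer is exactly what I do not know, so the combined figure is only a what-if.

```cpp
#include <cstdint>

// Limit quoted in the error messages (in bytes).
constexpr std::uint64_t kReportedLimit = 1073741822ULL;

// Assumption: about 440e6 bytes of payload per double branch.
constexpr std::uint64_t kBytesPerDoubleBranch = 440ULL * 1000 * 1000;

// reco_points_count holds the same number of elements, but as unsigned int
// (assumed 4 bytes) instead of double (8 bytes), so about half the payload.
constexpr std::uint64_t kBytesCountBranch =
    kBytesPerDoubleBranch / sizeof(double) * sizeof(unsigned int);

// What-if: payload of the x, y and count branches added together.
constexpr std::uint64_t kBytesAllVectorBranches =
    2 * kBytesPerDoubleBranch + kBytesCountBranch;

constexpr bool kSingleBranchFits = kBytesPerDoubleBranch <= kReportedLimit;
constexpr bool kAllBranchesFit = kBytesAllVectorBranches <= kReportedLimit;
```

Under these assumptions a single branch stays well below the limit, while the three vector branches together (about 1.1e9 bytes) would exceed it.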
Best regards,
Stefan