ReadStream issue for unseekable file

Hi,

I need your help to find out whether something is possible:

I want to open a compressed file using the Boost C++ libraries.
Decompressing the entire file at once is too heavy.
To save memory, I do this:

	ifstream file("ntrac.pin_detector.gz", std::ios_base::in | std::ios_base::binary);
	boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
	inbuf.push(boost::iostreams::gzip_decompressor());
	inbuf.push(file);
	istream instream(&inbuf); //Convert streambuf to istream
	getline(instream, first_file_line);

After that, I use TTree::ReadStream() to load my file into a TTree, and I get this error:
Error in <TTree::ReadStream>: Error reading stream

Do you know whether this is expected, and how to make it work?

Regards,
William.

PS: Someone told me that this error could happen because the stream is unseekable when I use Boost.
Maybe he’s right, what do you think?
Here is the full code:

#include "TTree.h"
#include "TH1F.h"
#include "Riostream.h"
#include "TDirectory.h"
#include <string>
#include <sstream>
#include <iostream>
#include <map>
#include <memory>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/algorithm/string.hpp>
#include <boost/format.hpp>

using namespace std;


int main()
{
	//init the map which contains the data type for each existing column in our file
	std::map<string, string> type = {{"CAID", "/I"}, 
		{"TIME", "/F"},
		{"COOR_X", "/F"},
		{"COOR_Y", "/F"},
		{"COOR_Z", "/F"}};

	//init of string variable
	string first_file_line;
	string file_head;
	string str_temp;

	ifstream file("ntrac.pin_detector.gz", std::ios_base::in | std::ios_base::binary);
	boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
	inbuf.push(boost::iostreams::gzip_decompressor());
	inbuf.push(file);
	istream instream(&inbuf); //Convert streambuf to istream
	getline(instream, first_file_line);
	istringstream first_line_stream;

	first_line_stream.str(first_file_line);
	first_line_stream >> str_temp;

	//creating the branchDescriptor for the tree
	file_head = file_head + str_temp + type[str_temp];
	//extract in the loop condition so a failed read (e.g. after trailing whitespace) adds no empty column
	while(first_line_stream >> str_temp)
	{
		file_head = file_head + ":" + str_temp + type[str_temp];
	}

	//branchDescriptor is a const char*, not a string
	const char* char_file_head = file_head.c_str();

	cout << char_file_head << endl;

	std::shared_ptr<TTree> mytree(new TTree("Data","This is my first ttree"));
	mytree->ReadStream(instream, char_file_head, ' ');

	return 0;
}

And the first ten lines of the file:
```
CAID TIME COOR_X COOR_Y COOR_Z
1 .869214 -8.7287829E-02 -1.9229909E-02 9.8033469E+01
1 1.570078 5.2402355E-01 -8.8949773E-02 1.3084988E+02
1 25.121292 6.3431012E-02 5.2537755E-01 7.2844195E+01
1 4.765265 3.0164797E-01 -1.3995741E-01 6.4161585E+01
1 55.628903 4.5086203E-01 2.9084185E-01 1.1436328E+02
1 9.530194 -3.3718074E-01 -1.1671732E-01 5.1681154E+01
1 .720152 -3.4568707E-01 -2.1176673E-01 9.7635021E+01
1 24.776037 4.7535869E-01 -9.5100680E+00 1.3371108E+02
1 23.394644 3.6313655E-02 4.4193817E-01 8.1504813E+01
```

I fear you have to patch either boost or ROOT :cry:

Reason is this piece of code in ReadStream:

   Long_t inPos = inputStream.tellg();
   if (!inputStream.good()) {
      Error("ReadStream","Error reading stream");
      return 0;
   }
   if (inPos == -1) {
      ss << std::cin.rdbuf();
      newline = GetNewlineValue(ss);
      inTemp = &ss;
   } else {
      newline = GetNewlineValue(inputStream);
      inTemp = &inputStream;
   }

tellg does not work on unseekable streams. For the boost decompressed stream it returns -1 AND sets the badbit. Thus ReadStream breaks the stream right at the beginning.

The question is: why do you have to use “tellg”? From the lines below it seems to be a check for cin (that comparison to -1 is the only use of the inPos variable). tell+seek are also used in the GetNewlineValue function. I fear you have to rewrite these parts of ReadStream before you can use gzip_decompressor.
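To make the failure mode concrete without Boost or ROOT, here is a minimal sketch. `NonSeekableBuf` and `survives_tellg` are hypothetical names made up for this illustration, and the sketch assumes (as described above) that the decompressing buffer refuses position queries by throwing. It is not the Boost implementation, only a stand-in that behaves the same way from the point of view of `tellg`.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical stand-in for a decompressing stream buffer: it can be read
// sequentially, but any attempt to query or change the position throws.
class NonSeekableBuf : public std::streambuf {
public:
    explicit NonSeekableBuf(std::string data) : data_(std::move(data)) {
        char* begin = &data_[0];
        setg(begin, begin, begin + data_.size());  // expose data_ as the get area
    }

protected:
    pos_type seekoff(off_type, std::ios_base::seekdir,
                     std::ios_base::openmode) override {
        throw std::runtime_error("no random access");
    }

private:
    std::string data_;
};

// Mirrors the check at the top of ReadStream: ask for the position, then
// test the stream state. Returns true only if the stream is still usable.
bool survives_tellg(std::istream& in) {
    const std::istream::pos_type pos = in.tellg();
    return pos != std::istream::pos_type(-1) && in.good();
}
```

Calling `survives_tellg` on a `std::istream` built over a `NonSeekableBuf` should return false, because the stream catches the exception and records an error, while a `std::istringstream` passes the same check. That also hints at the only workaround that needs no patching, namely copying the decompressed data into a `std::stringstream` first, but that keeps the whole file in memory, which is exactly what the original post wanted to avoid.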

One final comment: std::shared_ptr<TTree> mytree(new TTree("Data","This is my first ttree")); looks strange for multiple reasons:
a) in ROOT, objects created with “new” are associated with a file, so you do not need to delete them very often - and especially before creating a tree you usually open a file for writing…
b) in general std::shared_ptr<T>(new T(...)) is better written as std::make_shared<T>(...)
c) Why a shared_ptr? If it has to be a pointer, wouldn’t a unique_ptr (created with std::make_unique) be better? Or even a TTree object with automatic storage duration (no pointer at all)?


Hi,

Thank you for telling me that.
In the end I decided to change my strategy, because this problem needs too much time for a small benefit…

I didn’t understand why you asked me to use std::make_shared.
Do I have to do this:

std::make_shared<TTree>("Data","This is my first ttree"); ?

OK, but when do I give it a name?
The TTree “mytree” will be returned by a function at the end, so I need a name.
Furthermore, I have to do:

mytree->ReadStream(instream, char_file_head, ' ');

just after creating my tree.
Or maybe I didn’t understand what you proposed…

And I use shared_ptr because I have a function which returns a TTree. It didn’t work with unique_ptr or with an automatic storage duration object, so I tried shared_ptr and it worked. I don’t really know how shared and unique pointers work because I haven’t had to. But maybe I did something wrong…

Regards,
William

Use the return value: auto mytree = std::make_shared...

Why make_shared instead of shared_ptr + new: it is better style mainly because of exception safety (and you can save one extra allocation). Only drawback: no custom deleter possible.

But what I wanted to tell you is that a shared pointer is probably not what you want here. Use a shared pointer only if you need to. By default, try either automatic storage duration (for small objects or objects that don’t live long) or a unique_ptr. For ROOT objects that are owned by the current TFile you don’t need to care about this, so you can use plain new. You might even want to keep your objects alive so that you can inspect them in the TBrowser. So in that case simply use new and bare pointers.

Sounds very strange. You need to show code.
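
For comparison, here is a minimal sketch of how returning a smart pointer from a function usually looks. `Tree`, `make_tree` and `make_shared_tree` are hypothetical names invented for this illustration; `Tree` is a plain stand-in so that the sketch compiles without ROOT, and nothing here says how TTree itself behaves with respect to ROOT's file ownership.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for TTree so the sketch compiles without ROOT.
struct Tree {
    Tree(std::string name, std::string title)
        : name(std::move(name)), title(std::move(title)) {}
    std::string name;
    std::string title;
};

// Factory handing sole ownership to the caller (std::make_unique needs C++14).
std::unique_ptr<Tree> make_tree() {
    auto tree = std::make_unique<Tree>("Data", "This is my first ttree");
    // ... fill the tree here through the named variable ...
    return tree;  // a local unique_ptr is moved out, never copied
}

// Same shape with shared ownership, for the cases that really need it.
std::shared_ptr<Tree> make_shared_tree() {
    auto tree = std::make_shared<Tree>("Data", "This is my first ttree");
    return tree;
}
```

The caller gives the result a name with `auto mytree = make_tree();`. A unique_ptr cannot be copied, so passing it on afterwards needs `std::move(mytree)`; an attempted copy is a common reason why code using unique_ptr fails to compile, though whether that was the problem here cannot be told without seeing the code.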
