Incorrect # of events in tree constructed from ascii file

I adapted cernbuild.C from /root/tutorial/tree to build a tree from events stored in an ASCII file. However, the constructed tree contains more events than the ASCII data file does. Here is a snapshot:

FILE *fbk = fopen(Form("%ssample-noal.tes",dir.Data()),"r");

tree->Branch("Class",&Class,"Class/I"); // a total of 10 variables
char line[80];

while (fgets(&line,80,fp)) {
sscanf(line,"%f %f %f %f %f %f %f %f %f %d " , &len,&wid,&size,&conc,&conc1,&asym,&m3long,&m3trans,&dist,&Class);
if (print) tree->Print();


My ASCII file contains 12680 events whereas the tree has 12686 events. I am not sure where the problem is.


Search your ascii data file for empty lines and lines which are longer than 80 characters.

The data file has lines with a maximum of 78 characters (checked with awk 'length() > 78' file) and there are no empty lines. The file under consideration (sample-noal.dat) is attached.
sample-noal.dat (886 KB)

Your data file has two spaces before the last column.

{
  TTree *t = new TTree("t", "sample-noal");
  t->ReadFile("sample-noal.dat", "len/F:wid:size:conc:conc1:asym:m3long:m3trans:dist:Class/I");
  t->Print();
}

std::cerr << "Hi Wilie\n!";

Yes, that should work, but you didn’t explain why.

The format string claimed that all columns were separated by one space, but in the actual input the last column was offset by two spaces. That caused sscanf to not produce the expected output (with no diagnostic: if you use sscanf, you are expected to know that you can shoot yourself in the foot).

Now one can use TTree::ReadFile which will automatically figure out how many spaces there are between columns, but that doesn’t explain why parsing by hand failed here.

BTW. You do have 6 lines which need more than 80 characters (just 81). Increase your “line” buffer to 90 and it should be fine.
P.S. I don’t think you need to care about the “double space” before the “last column” (i.e. it should not cause you any problems). In a “format string” … a “sequence of white-space characters (space, tab, newline, etc.; see isspace(3)) … matches any amount of white space, including none, in the input”.
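
To make the white-space rule concrete, here is a minimal sketch in plain C++ (no ROOT needed). The helper name parse_fields and the three-column layout are made up for illustration; the real file has ten columns. It only shows that a single space in the format string also matches a run of spaces or a tab in the input.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: parse "float float int" with single spaces in the format.
// Returns the number of fields sscanf managed to convert (3 on success).
int parse_fields(const char *line, float *a, float *b, int *cls) {
    assert(line && a && b && cls);
    // One space in the format matches any amount of white space in the input,
    // including none, so double spaces and tabs are accepted as well.
    return std::sscanf(line, "%f %f %d", a, b, cls);
}
```

With the input "1.5 2.5  3" (two spaces before the last column), parse_fields returns 3 and cls is set to 3.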

Dear Wile,
Increasing the “line” buffer to 90 solved the problem, and you are also right about the “double space”.

Thanks a lot.

BTW, with which command did you find that the data file has 6 lines which need more than 80 characters?

@Wilie: You were right, sscanf isn’t picky about extra spaces like I thought. I think it’s still a good idea to give some basic explanation instead of just a working sample.



The description of the “fgets” function explicitly says that, in the “buffer” string, you need space for at least 2 (two) additional characters -> a “trailing” newline character (“LF”) and a “terminating” null character (“NUL”).
So, you need a “buffer” string which is at least 2 (two) characters longer than your maximum line length.
If you were dealing with a text file in the “DOS format”, you would need to make sure that you have space for 3 (three) additional characters -> “CR”, “LF”, “NUL”.
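
A small sketch of why a too-short buffer inflates the count, assuming the loop fills the tree once per successful fgets call (the Fill call is not shown in the snapshot, so that part is a guess). The helper names make_line and count_fgets_calls are hypothetical. A line of 79 characters plus its newline does not fit into an 80-byte buffer, so fgets hands it back in two pieces and the loop body runs twice for that one line.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: a line of n payload characters followed by a newline.
std::string make_line(int n) {
    return std::string(n, 'x') + '\n';
}

// Hypothetical helper: write `text` to a temporary file and count how many
// times fgets succeeds when reading it back through a buffer of `bufsize` bytes.
// Returns -1 if the temporary file could not be created.
int count_fgets_calls(const std::string &text, int bufsize) {
    char buf[256];
    assert(bufsize > 1 && bufsize <= static_cast<int>(sizeof(buf)));
    std::FILE *fp = std::tmpfile();
    if (!fp) return -1;
    std::fputs(text.c_str(), fp);
    std::rewind(fp);
    int calls = 0;
    // fgets stores at most bufsize - 1 characters, then the terminating NUL.
    while (std::fgets(buf, bufsize, fp)) ++calls;
    std::fclose(fp);
    return calls;
}
```

For a 78-character line followed by a 79-character line, count_fgets_calls returns 3 with a buffer of 80 and 2 with a buffer of 81. Six such over-long lines would be consistent with seeing 12686 entries instead of 12680.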

The command:
awk 'length() >= 79' sample-noal.dat
returns 6 lines.
Each of them is actually just 79 characters long, so you need a “buffer” string which is at least 81 (= 79 + 2) characters long.
You could try (please note also that the first “fgets” parameter is “line”, not “&line”):
// ...
char line[81];
// ...
while (fgets(line, sizeof(line), fp)) {
// ...
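
If you prefer to keep parsing by hand, it may help to check what fgets and sscanf actually delivered before filling anything. The following is only a sketch under assumptions: ReadStats, read_rows and kBufSize are illustrative names, the row layout is cut down to three columns, and the buffer is deliberately tiny so that the over-long-line branch is easy to trigger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical bookkeeping: rows that parsed completely vs. lines that were dropped.
struct ReadStats {
    int rows = 0;
    int skipped = 0;
};

// Deliberately small for this sketch; real code would use the longest line + 2.
const std::size_t kBufSize = 32;

ReadStats read_rows(std::FILE *fp) {
    assert(fp != nullptr);
    ReadStats stats;
    char line[kBufSize];
    while (std::fgets(line, sizeof(line), fp)) {
        if (!std::strchr(line, '\n')) {
            // No newline in the buffer: either the file ended, only the newline
            // itself did not fit, or the line really is too long.
            int c = std::fgetc(fp);
            if (c != EOF && c != '\n') {
                // Too long: discard the rest of the line and do not count it as a row.
                while ((c = std::fgetc(fp)) != EOF && c != '\n') {
                }
                ++stats.skipped;
                continue;
            }
        }
        float a = 0, b = 0;
        int cls = 0;
        // Count the row only if every field was converted.
        if (std::sscanf(line, "%f %f %d", &a, &b, &cls) == 3) {
            ++stats.rows;
        } else {
            ++stats.skipped;
        }
    }
    return stats;
}
```

In a tree-building loop, the place where rows is incremented is where the fill would go, so empty, malformed or truncated lines never become extra entries.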


As an aside, there is also a function TTree::ReadFile which may (or may not) be able to process your input file.