TChain::Add() and TTree::CopyTree() documentation issues

Dear Root Developers,

I recently ran into a ROOT usage problem that may interest you. Its
solution may suggest a minor ROOT code change or some changes to the
documentation. The issue is important because, at least in this case,
valid data was not processed as expected, and no warning message of
any kind was issued. First, let me describe the scenario.

We use a standard, locally developed tool (a ROOT macro) that
creates a TChain of input ntuple datasets, applies a TCut and outputs
the result via CopyTree(), an operation we refer to as “pruning”. We
use a web page front-end to assemble data for the job, and then submit
a batch job to run the ROOT macro. The user depends on the return
code to indicate success or failure. Typically, a user may not
even look at the output log beyond the return code.

The case of interest included 792 separate files, for a grand total
of 1,885,273,770 events. The user had specified a TCut which selected
a contiguous group of ~21k events in the 540th file in the TChain
(although he did not know this beforehand). When the job ended, the
return code was zero, and no events had been delivered to the output
stream.

Inside the ROOT macro, two function calls accomplish this task:

chain->Add(filename);  		// Build the TChain of files

tree->CopyTree(selection);	// Create output file based on TCut criteria

What was not at all obvious from the documentation (ROOT Reference
Manual) is that in both cases arbitrary limits were imposed by
default.

In the case of chain->Add(filename), the documentation discusses
the second “nentries” parameter but implies that the default is safe
to use. I do not know a priori how many events are in the
requested TChain, so I never thought about using case “B” (an explicit
entry count). And case “A” (nentries=0), while ultimately appropriate
for this situation, is made out to be an inefficient mode when the
files are to be read sequentially (which they are).

[ref root.cern.ch/root/html402/TChain.html#TChain:Add ]

So I changed the code to the following, and that seems to work:

chain->Add(filename,0);

Now, about the documentation and code, may I suggest a few changes:

  • First, could the documentation link to the value of kBigNumber
    rather than to its type declaration? This would help the user
    quickly find its value.

  • Next, in today’s world of high energy (and astro) particle physics,
    the current value of kBigNumber, 1,234,567,890, is hardly a “big
    number” any more. Add to that the fact that the “nentries” parameter
    is a Long64_t and kBigNumber becomes laughably small! How about
    modernizing this value? Or perhaps case “C” is no longer a good
    default?

  • Is it possible and reasonable for the user to specify an
    arbitrarily large value for “nentries” without incurring a significant
    performance penalty? For example, if I were to routinely specify
    nentries=999,999,999,999,999,999 would this hinder performance? Would
    doing so be more desirable than specifying nentries=0?

  • Most importantly, when chain->Add() decides to stop accumulating
    events (having reached nentries), could it please emit a warning
    message to the user just in case that was not the intended action?
    This was, in my opinion, the most dangerous part of this situation:
    ROOT quietly threw events on the floor and, at least in some (many?)
    cases, the user would be none the wiser.
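To put the numbers in these bullets side by side, here is a minimal sketch in plain C++ (no ROOT required). It assumes that Long64_t is a 64-bit signed integer, and the constant and function names are illustrative, not ROOT identifiers. It only compares magnitudes: it does not model how TChain actually uses the limit, and it says nothing about performance.

```cpp
#include <cstdint>
#include <limits>

// Values quoted in this thread; the names are illustrative, not ROOT identifiers.
constexpr std::int64_t kOldBigNumber  = 1234567890LL;          // kBigNumber as quoted above
constexpr std::int64_t kChainEntries  = 1885273770LL;          // events in the 792-file chain
constexpr std::int64_t kProposedLimit = 999999999999999999LL;  // the "arbitrarily large" nentries

// Largest values representable in 32-bit and 64-bit signed integers.
constexpr std::int64_t kMax32 = std::numeric_limits<std::int32_t>::max();
constexpr std::int64_t kMax64 = std::numeric_limits<std::int64_t>::max();

// True when a chain holding `entries` events is larger than `limit`.
constexpr bool exceedsLimit(std::int64_t entries, std::int64_t limit) {
    return entries > limit;
}

// The quoted kBigNumber still fits in a 32-bit signed integer...
static_assert(kOldBigNumber < kMax32, "kBigNumber fits in 32 bits");
// ...and the chain described above is larger than it.
static_assert(exceedsLimit(kChainEntries, kOldBigNumber), "chain exceeds kBigNumber");
// A 64-bit limit leaves ample headroom, even for the proposed large value.
static_assert(!exceedsLimit(kChainEntries, kMax64), "chain fits in 64 bits");
static_assert(kProposedLimit < kMax64, "proposed nentries is representable");
```

The static_assert lines are checked at compile time, so the file compiling at all confirms each comparison.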

Unrelated question: Add() seems to be a superset of AddFile(). Is there
any advantage to using AddFile() over Add() when it is possible to do so?

===

In the case of tree->CopyTree(), the documentation says nothing
about the 2nd, 3rd and 4th parameters. (This is not the only
function with this problem.) Tracking down the meaning of these
parameters in the code eventually led me to these changes:

Long64_t nentries = chain->GetEntries();
tree->CopyTree(selection,"",nentries);

because the “invisible” default value for nentries is 1,000,000,000,
and its meaning was a surprise: “nentries” refers to the maximum
number of events read from the input stream, not the maximum written
to the output stream!

[ref root.cern.ch/root/html402/TTree. … e:CopyTree ]
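The difference between the two readings of “nentries” can be sketched with a toy model in plain C++. This is not ROOT code. It assumes, purely for illustration, that the events are spread evenly over the 792 files (the real per-file counts are not given here) and that the selection is a contiguous block of 21,000 events starting at the beginning of the 540th file. All names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Half-open range [first, first + count) of chain entries that pass the cut.
struct SelectedRange {
    std::int64_t first;
    std::int64_t count;
};

// Read-limit reading (the behavior described above): only input entries
// [0, nentries) are examined, so a selection beyond that point is never seen.
std::int64_t copiedWithReadLimit(SelectedRange sel, std::int64_t nentries) {
    const std::int64_t end = std::min(sel.first + sel.count, nentries);
    return std::max<std::int64_t>(0, end - sel.first);
}

// Write-limit reading (what one might have expected): the whole input is
// scanned and at most nentries selected entries are written.
std::int64_t copiedWithWriteLimit(SelectedRange sel, std::int64_t nentries) {
    return std::min(sel.count, nentries);
}

// ASSUMPTION: events are spread evenly over the files. Returns a selection
// of `count` entries starting at the first entry of file `fileNumber` (1-based).
SelectedRange selectionInFile(std::int64_t totalEntries, std::int64_t nFiles,
                              std::int64_t fileNumber, std::int64_t count) {
    const std::int64_t perFile = totalEntries / nFiles;
    return {(fileNumber - 1) * perFile, count};
}
```

Under these assumptions, calling selectionInFile(1885273770, 792, 540, 21000) places the selection past entry 1,000,000,000, so copiedWithReadLimit with the default limit returns 0 while copiedWithWriteLimit returns all 21,000 events. Passing the full chain length as the limit recovers them in both readings.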

  • Could all parameters for every function please be properly
    documented, along with default values?

  • Again, the most dangerous part of this situation is that ROOT lets
    data quietly fall on the floor. When CopyTree() runs into its
    "nentries" limit, could it please emit a warning to the user?

  • Does ROOT have a “verbose mode” by which long-running, monolithic
    functions (e.g. CopyTree()) can be monitored? For example, to emit a
    heartbeat INFO message every 1M events could be useful for
    diagnostics.

===

For the record, most of my testing was done with ROOT v4.02 but a
limited amount was done with v5.10 to verify similar behavior. The
machine type was Linux RHEL 2.4.21-47 (2xCPU 2GB).

Thank you for your consideration,

  • Tom Glanzman
    SLAC

Hi Tom,

Thanks for this interesting message. I agree with all your points.
These artificial limitations were introduced when the loop index for
Trees was limited to 32-bit integers.
I believe that we could change these constants without any problems
of backward compatibility. We need some time to digest all this,
but will implement the changes in the coming weeks.

Rene

It’s been 9 years, but TChain::kBigNumber is still set to 1,234,567,890, and the default for nentries in TTree::CopyTree() is still 1,000,000,000, and the TTree::CopyTree() documentation still contains no information on the meaning of its arguments.

It’s less hidden, but TTree::Draw also has an arbitrary and unnecessarily small nentries parameter at 10 million: https://root.cern.ch/root/html/TTree.html#TTree:Draw@1 If you have a large tree and ::Draw it without specifying nentries, you won’t get what you expect.

Now that I look at the documentation, it actually lies to you: the text says that the default is “all entries” but nentries’ default value is 10000000.

This was mentioned in another RootTalk post but I can’t find it right now.

Jean-François

Hi,

This issue was resolved in v6.06 (the ‘big’ number is now the max that can be held in a long long).

Cheers,
Philippe.