For quite some time now, we have not been able to use the latest versions of ROOT as they make our
data analysis extremely slow (timing comparisons given below). The last version which works for us is 5.34/08, which is starting to get a bit old!
I have identified the problem, it is the modifications made to TList::Clear, TList::Delete and TObjArray::Delete in the following commit:
made on 28/6/2013 by Philippe in order to “Fix ROOT-5284 which was due to an unusual setup (in TParallelCoord) leading to only partial cleanup”. Of these 3, it is TList::Clear which is most to blame in our case (probably because it is the one that is most used).
Basically, every time e.g. TList::Clear is called, a search is
made in gROOT->GetListOfCleanups() to see if the TList in question is in there; if not, it is added temporarily, then once the Clear() has been performed, it is removed again. When ROOT starts up, gROOT->GetListOfCleanups() contains 6 lists, so the overhead is negligible. However, in our software (indra.in2p3.fr/KaliVedaDoc) we rely rather heavily on the cleanup mechanism, and so we have around 6000 lists of multiply cross-referenced objects in gROOT->GetListOfCleanups(). In this case the overhead is not negligible at all, even after doing a rehash of the cleanups list
in order to reduce the collision rate from ~300 to ~1 (I realise now that the rehash doesn’t help because
although this makes gROOT->GetListOfCleanups()->FindObject() faster, the real problem is gROOT->GetListOfCleanups()->Remove() which does not use the object’s Hash() value but performs a sequential scan, and so gets slower proportional to the number of objects in the cleanups list).
If I comment out the modifications to these 3 methods in any ROOT 5.34 version from /09 to /18 then our analysis runs at the same speed as in 5.34/08. I have spent a lot of time trying to find a solution on our side, but I haven’t been able to get anything faster than a factor ~20 slower than before. Therefore, I would like to ask if you could perhaps find another solution to bug ROOT-5284, because the current one seems to be a very general solution to a rather specific problem, with very wide-ranging side-effects.
It is very difficult for me to give a simple example of code which reproduces this problem. The following timings have been obtained by reading 5000 events from a TTree, a TStopwatch was used to give the CPU time in seconds:
with v5.34/08: 5.8 sec.
with v5.34/09: 450.1 sec.
v5.34/09 with all modifications commented out: 5.7 sec.
v5.34/09 with only TList::Delete modification: 5.7 sec.
v5.34/09 with only TList::Clear modification: 116.2 sec.
v5.34/09 with only TObjArray::Delete modification: 75.8 sec.
During the reading of these 5000 events, TList::Clear is called 3.6e+06 times (same value for 34/08 or 34/09).
Out of these, gROOT->GetListOfCleanups()->FindObject() is called 1.2e+06 times, which suggests that 2/3 of the time Clear() is called for empty lists (condition ‘if(fFirst …)’).
And finally, the ‘needRegister’ flag is set (i.e. Add/Remove is performed) 6.3e+05 times i.e. roughly half of the time.
I would be grateful for any help or advice!