Snapshot shuffles data invisibly with ImplicitMT

mwilkins · February 9, 2021, 4:56pm

If I have enabled ImplicitMT, RDataFrame::Snapshot does not preserve the row order. Moreover, this happens completely invisibly with no errors or warnings, so the user is left with shuffled rows and no indication this happened without manually checking. This strikes me as a feature-breaking bug.

Reproducer below.

ROOT Version: 6.22/06
Platform: macOS
Compiler: conda-forge

import ROOT as r
r.ROOT.EnableImplicitMT(4)
rdf = r.RDataFrame(10).Define("e", "rdfentry_")
rdf.Snapshot("test", "test.root")
f = r.TFile.Open("test.root")
f.test.Scan()

Output:

************************
*    Row   *       e.e *
************************
*        0 *         7 *
*        1 *         0 *
*        2 *         1 *
*        3 *         8 *
*        4 *         6 *
*        5 *         2 *
*        6 *         3 *
*        7 *         9 *
*        8 *         4 *
*        9 *         5 *
************************

eguiraud · February 9, 2021, 7:08pm

Hi,
It’s not a bug: if that TTree is used as a friend of another tree, we warn users that it was written by a “shuffling” operation.

What use case of yours is broken by this behavior?

Cheers,
Enrico

mwilkins · February 9, 2021, 7:26pm

Well, anything that depends on the order of the rows, but mainly friend trees. If friends are considered the only use-case for row-ordering, then indeed, this is not a bug.

eguiraud · February 10, 2021, 8:38am

In many cases rows are independent (as in, independent physical events) and order does not matter. Order relative to other TTrees of course matters, hence the problem with friend trees.

In 6.22 you can’t use a “shuffled tree” as a friend unless you manually unset a certain flag in that TTree.

In 6.24 (being released in O(few weeks)) we filled a feature gap and added support for indexed friend trees in RDataFrame, so you can use one of the TTree columns as an index to recover ordered access into the shuffled TTree if needed.
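
To illustrate why an index column recovers ordered access, here is a minimal pure-Python sketch (no ROOT involved). The rows, the `payload` column, and the dictionary lookup are assumptions made for illustration. In ROOT itself the index would be built on the friend tree, for example with `TTree::BuildIndex`.

```python
# Column "e" in the order written by the multi-threaded Snapshot in the reproducer.
shuffled_e = [7, 0, 1, 8, 6, 2, 3, 9, 4, 5]

# Hypothetical main tree: one row per entry, in the original order.
main_rows = [{"entry": i} for i in range(10)]

# Hypothetical friend tree: the shuffled rows plus a payload derived from "e".
friend_rows = [{"e": e, "payload": e * e} for e in shuffled_e]

# Positional pairing (what a plain, un-indexed friend does): row i with row i.
positional = [(m["entry"], f["e"]) for m, f in zip(main_rows, friend_rows)]

# Indexed pairing: look up the friend row whose "e" equals the main entry number.
index = {f["e"]: f for f in friend_rows}
indexed = [(m["entry"], index[m["entry"]]["e"]) for m in main_rows]

mismatches = sum(1 for entry, e in positional if entry != e)
print(f"positional pairing mismatches: {mismatches} of {len(positional)}")
print("indexed pairing consistent:", all(entry == e for entry, e in indexed))
```

The positional pairing silently matches unrelated rows, while the indexed lookup is insensitive to the order in which the friend rows were written.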

mwilkins · February 10, 2021, 12:05pm

This seems like strange behavior for an (otherwise) ordered data structure, but friend trees are my only use-case, so this doesn’t currently break anything for me.

There are, however, workflows where this could be a problem. For example:
Tree1 created in ROOT
→ Tree2 created from Tree1 using ROOT with ImplicitMT
→ Tree3 created from Tree2 using pandas dataframe
→ Tree1.AddFriend(Tree3)
This involves leaving the ROOT package, and one could see it as a case of outside programs not providing feature support. But since ROOT is designed to work with other storage formats and be somewhat interoperable, I think it would be appropriate for Snapshot to throw a warning when writing out with ImplicitMT turned on.

eguiraud · February 10, 2021, 12:23pm

I think a runtime warning is too much (it would affect all programs that currently use RDF+multi-thread+Snapshot, and that’s a lot of programs) – but we can certainly add a big-letter warning to Snapshot’s docs.

It would also be nice to have an option to force Snapshot to maintain entry ordering (at the cost of performance and RAM): that’s a stretch goal for this year’s plan of work.

nmangane · February 11, 2021, 1:04pm

Would this operate via a cache for ‘forward’ results that wait until the thread handling earlier chunks finishes and writes out?

By the way, are simultaneous snapshots supported right now? I think I tried this in version 6.20 or 6.18, and trying to snapshot two different nodes at the same time resulted in segfaults (i.e. I split the data into two or four distinct datasets and want to dump them to separate ROOT files with one processing loop). I ended up predicting the largest of them (they are pretty asymmetric), writing that one out immediately, putting the remaining sets into Cache, and then snapshotting those one at a time. Of course, this just shifts the performance bottleneck around, and it was severely limited by the available memory.

eguiraud · February 11, 2021, 1:16pm

That’s the only idea I have for now, because it works well with the design of TBufferMerger, which Snapshot uses for multi-thread writes to a ROOT file – there is already a thread-safe queue of buffered “clusters ready to be written out”, we would “just” need to switch from FIFO to an ordered processing (much easier said than done, but it’s what I have so far). The main problem is that memory usage, in this scenario, might have hard-to-predict, highly undesirable long tails.
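
The switch from FIFO to ordered processing can be sketched in pure Python as a reorder buffer. This is only an illustration of the idea, not how TBufferMerger is implemented: the function name `write_in_order`, the cluster indices, and the use of a heap are all assumptions. The arrival order reuses the shuffled sequence from the reproducer as a stand-in for the order in which threads finish clusters.

```python
import heapq


def write_in_order(arrivals):
    """Re-emit (cluster_index, payload) pairs in increasing index order.

    `arrivals` is the order in which worker threads hand over finished
    clusters. Returns the payloads in index order and the peak number of
    clusters that had to be held in memory at the same time.
    """
    pending = []        # min-heap of clusters that arrived ahead of their turn
    next_index = 0      # index of the next cluster that may be written
    written = []
    peak_buffered = 0
    for index, payload in arrivals:
        heapq.heappush(pending, (index, payload))
        peak_buffered = max(peak_buffered, len(pending))
        # Flush every buffered cluster that is now contiguous with the output.
        while pending and pending[0][0] == next_index:
            written.append(heapq.heappop(pending)[1])
            next_index += 1
    return written, peak_buffered


arrivals = [(i, f"cluster-{i}") for i in [7, 0, 1, 8, 6, 2, 3, 9, 4, 5]]
ordered, peak = write_in_order(arrivals)
print(ordered)
print("peak clusters buffered:", peak)
```

The output order is restored, but every cluster that finishes early has to wait in memory until all of its predecessors have been written. One slow thread can therefore make the buffer grow arbitrarily, which is the long-tail memory problem described above.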

Assuming each Snapshot writes to a different file, I think it has always been supported. In other words, please report the bug, ideally with a self-contained reproducer, on the root-project/root issue tracker on GitHub.
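
For reference, a hedged PyROOT sketch of booking two Snapshots on different nodes so that a single event loop fills both files. It assumes `RSnapshotOptions.fLazy` is available in the ROOT version in use. The function name, the filters on column `e`, and the output file names are hypothetical.

```python
def snapshot_two_selections(df, out_a="selection_a.root", out_b="selection_b.root"):
    """Book two lazy Snapshots on different nodes and run them in one event loop."""
    import ROOT  # imported here so the sketch can be read without ROOT installed

    opts = ROOT.RDF.RSnapshotOptions()
    opts.fLazy = True  # book the Snapshot instead of running it immediately

    # Hypothetical selections on the reproducer's column "e"; each Snapshot
    # writes to its own output file.
    snap_a = df.Filter("e % 2 == 0").Snapshot("test", out_a, "", opts)
    snap_b = df.Filter("e % 2 == 1").Snapshot("test", out_b, "", opts)

    # Accessing one result triggers the event loop, which runs all booked actions.
    snap_a.GetValue()
    return snap_a, snap_b
```

With ImplicitMT enabled, both output trees are still subject to the row shuffling discussed in this thread.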

Cheers,
Enrico

eguiraud · February 19, 2021, 2:32pm

Pull Request #7259 in root-project/root on GitHub, “[DF][NFC][skip-ci] Warn about "shuffled" output TTrees in Snapshot docs”, adds a warning about this behavior to Snapshot’s docs.

mwilkins · February 19, 2021, 2:45pm

Thanks!
