ForeachSlot not updating slot

Inside RDataFrame::ForeachSlot, the ‘slot’ argument doesn’t seem to be updating. Here’s a reproducer:

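#include <ROOT/RDataFrame.hxx>
#include <TROOT.h>
#include <iostream>
#include <thread>

void testSel()
{
    // Set up multi-threading
    const unsigned int n_thread_capacity = std::thread::hardware_concurrency();
    std::cout << "System has " << n_thread_capacity << " threads." << std::endl;
    // Assumed step: the output below reports nSlots: 8, which requires
    // implicit multi-threading to have been enabled before this point.
    ROOT::EnableImplicitMT(n_thread_capacity);
    const unsigned int nSlots = ROOT::GetThreadPoolSize();
    std::cout << "nSlots: " << nSlots << std::endl;

    ROOT::RDataFrame d("DecayData", "myrootfile.root");

    // Called for each entry; `slot` is the processing slot of the calling thread
    auto applySelection = [](unsigned int slot) {
        std::cout << "Slot is " << slot << std::endl;
    };

    d.ForeachSlot(applySelection, {});
}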

The output is just:

  | Welcome to ROOT 6.28/00               |
  | (c) 1995-2022, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 05 2023, 06:52:00                 |
  | From tag , 3 February 2023                                       |
  | With                                                             |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |

System has 8 threads.
nSlots: 8
Slot is 7
Slot is 7

Why does slot seem to be stuck at 7?

ROOT Version: 6.28.00
Platform: Linux 6.2.2-arch1-1
Compiler: gcc (conda-forge gcc 11.3.0-19) 11.3.0

@eguiraud or @vpadulan might help.

Thanks. The same piece of code has been working fine for months, and I have the impression nothing changed when it stopped working. But perhaps I’ve missed something basic.

It looks to me like only one thread is being ‘dispatched’. Why might that be?

Hi @danj1011 ,

Most likely N threads are being dispatched but only one has work to do, probably because the input file is too small.

You can activate RDF logging with a level of kDebug to get a log of each thread task start and end. There is probably only one task (i.e. only one chunk of data to process) being dispatched.
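As a rough sketch of what enabling that logging could look like in ROOT 6.28 (the version reported above): the macro name `testSelVerbose` is made up for illustration, and the logging classes are taken from the `ROOT::Experimental` namespace, which is where they live in that release; other releases may expose them under a different namespace.

```cpp
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RLogger.hxx>
#include <TROOT.h>
#include <iostream>

// Hypothetical macro name; tree and file names are the ones from the reproducer.
void testSelVerbose()
{
    ROOT::EnableImplicitMT();

    // While `verbosity` is alive, RDataFrame emits debug-level log messages.
    auto verbosity = ROOT::Experimental::RLogScopedVerbosity(
        ROOT::Detail::RDF::RDFLogChannel(), ROOT::Experimental::ELogLevel::kDebug);

    ROOT::RDataFrame d("DecayData", "myrootfile.root");
    d.ForeachSlot([](unsigned int slot) { std::cout << "Slot is " << slot << std::endl; }, {});
}
```

If only one task shows up in the resulting log, the whole dataset was processed as a single chunk, which would explain why a single slot number is printed.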


Maybe the ROOT version? It’s possible the dataset splitting between threads has been made coarser from one release to another – as dataset size increases, it’s better to run few larger tasks rather than many small ones. It makes less sense for small dataset sizes, but we are willing to take a (small) performance hit there because those are the cases where processing time is small anyway.

OK, thank you, that’s all very interesting. Do you know why the slot would be ‘7’ and not ‘0’? Wouldn’t the latter make more sense?

There is a stack of slot numbers [0, …, nThreads-1] and slot numbers are popped from the top of the stack, so higher first :smiley: (not that it matters)
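To make the stack picture concrete, here is a small stand-alone model in plain C++. It is only an illustration of the behaviour described, not ROOT's actual code: `SlotStack` and `SlotsSeenSequentially` are hypothetical names, and returning a slot to the top of the stack when a task finishes is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Illustrative model only; this is not ROOT's actual implementation.
class SlotStack {
public:
    // Fill the stack with 0, 1, ..., nSlots-1, so nSlots-1 ends up on top.
    explicit SlotStack(unsigned int nSlots)
    {
        for (unsigned int i = 0; i < nSlots; ++i)
            fSlots.push_back(i);
    }

    // A task takes the slot currently on top of the stack.
    unsigned int Acquire()
    {
        assert(!fSlots.empty());
        const unsigned int slot = fSlots.back();
        fSlots.pop_back();
        return slot;
    }

    // Assumption of this sketch: a finished task puts its slot back on top.
    void Release(unsigned int slot) { fSlots.push_back(slot); }

private:
    std::vector<unsigned int> fSlots;
};

// Slots seen when `nTasks` tasks run one after another with `nSlots` available.
std::vector<unsigned int> SlotsSeenSequentially(unsigned int nSlots, unsigned int nTasks)
{
    SlotStack stack(nSlots);
    std::vector<unsigned int> seen;
    for (unsigned int t = 0; t < nTasks; ++t) {
        const unsigned int slot = stack.Acquire();
        seen.push_back(slot);
        stack.Release(slot);
    }
    return seen;
}
```

In this model, `SlotsSeenSequentially(8, 1)` contains only slot 7, which matches the output in the original post: with a single task and 8 slots, the one slot ever handed out is the highest one.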

Thanks! Actually I write each thread to a separate file, so the numbering ends up being significant where not all threads are used :slight_smile:
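For the one-file-per-slot pattern, one possible approach is to key the outputs by slot number and open each file only when its slot is first used, so nothing assumes that slots 0, 1, ... are filled in order. The helper below is a hypothetical sketch in plain C++ (`PerSlotFiles` and the file naming scheme are invented here, and plain text files stand in for whatever format is really written).

```cpp
#include <cassert>
#include <fstream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ROOT: one lazily opened text file per slot.
class PerSlotFiles {
public:
    PerSlotFiles(unsigned int nSlots, std::string prefix)
        : fFiles(nSlots), fPrefix(std::move(prefix)) {}

    // Return the stream for `slot`, opening "<prefix><slot>.txt" on first use.
    // Different slots touch different vector elements, so this is safe to call
    // concurrently as long as each slot is used by one thread at a time.
    std::ofstream &Get(unsigned int slot)
    {
        assert(slot < fFiles.size());
        auto &file = fFiles[slot];
        if (!file)
            file = std::make_unique<std::ofstream>(fPrefix + std::to_string(slot) + ".txt");
        return *file;
    }

    // Slots that actually received data; call this after processing has finished.
    std::vector<unsigned int> UsedSlots() const
    {
        std::vector<unsigned int> used;
        for (unsigned int i = 0; i < fFiles.size(); ++i)
            if (fFiles[i])
                used.push_back(i);
        return used;
    }

private:
    std::vector<std::unique_ptr<std::ofstream>> fFiles;
    std::string fPrefix;
};
```

Inside a `ForeachSlot` callable this would be used as `files.Get(slot) << ...`, with `files` constructed from the `nSlots` value printed in the reproducer. After the run, `UsedSlots()` lists which files exist (only slot 7 in the single-task case above), so later steps can merge exactly those files instead of guessing the numbering.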
