Sending configuration data to TSelector

krasznaa · July 28, 2008, 8:45am

Hi,

First off: Is this forum the main source of support for PROOF? I’m just asking because it starts to feel like I’m talking to myself here… By the way, I did figure out the cure for my crash that I described in a previous post, but because of a lack of interest, didn’t answer myself yet again.

I have a new (slightly longer) question. I hope this will start a bit of a discussion at least, as I’ll be very disappointed if no one has an opinion on this either.

As I wrote before, I’m evaluating whether it would be realistic to convert an existing ROOT analysis framework to use PROOF. Over the last couple of days I have been testing some basic PROOF functionality to find any bottlenecks. I think I’m down to one issue that I don’t know how to solve elegantly: handling configuration data. It seems that PROOF doesn’t have anything to handle cases where the TSelector code needs configuration information from the client machine. Let me give an example of what I’m thinking about:

One pivotal feature of our current analysis framework is that it can calculate event weights based on a number of configuration options. For instance, the framework makes it possible to add up the distributions of physics quantities from the same MC process generated with different generator cuts. Let’s say I have two MC datasets for the Z->ee decay. In one dataset the pT of the electrons is >10 GeV, in the other it’s >20 GeV. For the framework to assign weights to the individual events from these datasets automatically, it has to know that we’re currently processing the “Zee” sample, and it has to know how much total integrated luminosity is available with each generator cut in the “Zee” datasets.
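To make the weighting example concrete, here is a minimal standalone C++ sketch. The scheme is an assumption, not the framework's actual code: each event is weighted by a target luminosity divided by the summed luminosity of all datasets whose generator cut the event passes. The `Dataset` struct, the `eventWeight` function and all numbers in the usage note are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one MC dataset of a given process.
struct Dataset {
    double ptCut;       // generator-level electron pT cut, in GeV
    double luminosity;  // integrated luminosity of the dataset, in pb^-1
};

// Assumed weighting scheme: every dataset whose cut lies below the event's
// pT can contain such an event, so the luminosities of those datasets add up.
// Returns 0 if no dataset covers the event.
double eventWeight(const std::vector<Dataset>& datasets, double pt,
                   double targetLumi) {
    assert(targetLumi >= 0.0);
    double available = 0.0;
    for (const Dataset& d : datasets) {
        if (pt > d.ptCut) {
            available += d.luminosity;
        }
    }
    return available > 0.0 ? targetLumi / available : 0.0;
}
```

With 100 pb^-1 at pT > 10 GeV, 300 pb^-1 at pT > 20 GeV and a target of 200 pb^-1, an event at 15 GeV would get a weight of 2 and an event at 25 GeV a weight of 0.5 under this scheme. The dataset list and the target luminosity are exactly the kind of configuration that would have to reach the worker nodes.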

As I said at the start of the post, I don’t know how to implement this functionality in PROOF. In our current framework we have a complex object describing this kind of configuration (which is initialised from an XML file at startup). I would need to transfer this configuration object somehow to the worker nodes running my TSelector. I can think of a number of brute force methods (configuration file in an NFS area, writing my own network protocol for sending the configuration, etc.), but I don’t think that what I need to do is all that special. Am I really the only user who ever needs to send configuration information to the worker nodes? I even have a number of ideas for how PROOF itself could do this, but none of them looks easy.

I really hope that some PROOF developers are reading these posts and I can get in touch with them. Since this issue is currently a deal-breaker for me, if I don’t manage to find a good solution to it, I’m afraid I’ll have to drop PROOF…

Cheers,
Attila

pcanal · July 28, 2008, 1:31pm

Hi,

The TSelector has an ‘InputList’ that will be sent to each of the slaves. You should be able to use it to pass along your configuration information.

Cheers,
Philippe.

krasznaa · July 28, 2008, 1:47pm

Hi Philippe,

Can you comment on how this input list (“fInput”) variable works? I thought that PROOF was setting it up itself and I didn’t have any influence on what kind of objects can be in it.

Unfortunately the documentation about the members of TSelector is quite thin. In any case, I haven’t found anything in TProof so far that hints at modifying this variable of my TSelector-s. I see all the TProof::SetParameter(…) functions which mention input list parameters, but from this alone it’s not obvious what they’re good for.

Cheers,
Attila

anna · July 28, 2008, 5:12pm

Hi Attila,

Have you tried TProof::AddInput()? It will add the objects to the input list. For example, in your macro:

TProof *proof = TProof::Open("localhost");
TH1F *h = new TH1F("histhist", "histhist", 100, 0, 1);
h->FillRandom("gaus", 100);
proof->AddInput(h);

And then in the selector::SlaveBegin():
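A minimal sketch of that retrieval step, written with plain C++ stand-ins so that it compiles without ROOT. The types `Object`, `Histogram`, `InputList` and `Selector` are hypothetical substitutes for TObject, TH1F, the TList behind fInput and the user's selector; the `bool` return value is also an assumption made for illustration. In real ROOT code the lookup would presumably be a cast of the result of fInput->FindObject("histhist").

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for TObject/TNamed: a polymorphic base class with a name.
struct Object {
    explicit Object(std::string n) : name(std::move(n)) {}
    virtual ~Object() = default;
    std::string name;
};

// Stand-in for TH1F.
struct Histogram : Object {
    Histogram(std::string n, int bins) : Object(std::move(n)), nBins(bins) {}
    int nBins;
};

// Stand-in for the input list: owns objects and finds them by name.
class InputList {
public:
    void Add(std::unique_ptr<Object> obj) {
        assert(obj != nullptr);
        objects_.push_back(std::move(obj));
    }
    // Returns the first object with the given name, or nullptr.
    Object* FindObject(const std::string& name) const {
        for (const auto& o : objects_) {
            if (o->name == name) return o.get();
        }
        return nullptr;
    }
private:
    std::vector<std::unique_ptr<Object>> objects_;
};

// Stand-in for the selector running on a worker.
struct Selector {
    const InputList* fInput = nullptr;
    Histogram* fHist = nullptr;

    // Look the object up by the name it was given on the client and check
    // its type. Returns false if it is missing or of an unexpected type.
    bool SlaveBegin() {
        fHist = fInput
            ? dynamic_cast<Histogram*>(fInput->FindObject("histhist"))
            : nullptr;
        return fHist != nullptr;
    }
};
```

The point of the pattern is that the object is looked up by name and then cast to its concrete type, so the name used on the client has to match the one used in the selector.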

Just a thought: if you are working on a ROOT-to-PROOF transition for an analysis framework, it might be interesting for you to take a look at the Alice analysis framework here:
http://aliceinfo.cern.ch/Offline/Activities/Analysis/AnalysisFramework/index.html

Our answering time is a bit longer than usual now because of the holiday season, sorry about that.

Cheers,
Anna

krasznaa · July 28, 2008, 6:37pm

Hi Anna,

Hmm… When skimming through the documentation I never realised that TProof::AddInput() would be used for anything other than specifying the input files. Now that you have pointed me towards it, the one sentence written in the documentation makes some sense.

But without you telling me, I couldn’t have easily connected this function with the TSelector::fInput variable… In any case, this looks promising. I’ll try to use it tomorrow. Also, thanks for the link!

Cheers,
Attila

krasznaa · July 29, 2008, 2:57pm

Hi,

I’m kind of stuck with this, but I have no idea why…

I defined a custom class called “Config” that I want to send to all the worker nodes. This class inherits from “TNamed”, and I think I took care of generating the necessary dictionary for it.

The TSelectors running on the worker nodes acknowledge receiving such an object, but when I print the contents of the objects on the worker nodes, they don’t have the configuration that I set on the client. They just have the configuration that is set by the default constructor of the “Config” class.

I tried Anna’s example, and I could send a 1-dimensional histogram to the workers. So that seems to work. But I can’t figure out why this configuration object of mine doesn’t get copied correctly. Do you have any ideas what I might have left out when writing the class? (By now I even wrote a custom copy constructor and assignment operator for it. But those are not called by the framework, I checked.)

I’ll now try to write a little simpler class as this “Config” class is a bit complicated already, but I’m not so sure that this will help. So any ideas are welcome.

Cheers,
Attila

anna · July 29, 2008, 3:11pm

Hi,

Is your Config class streamable, i.e. is the ClassDef version >0? In other words, if you are not able to write this object to a file and then read it back in the form that you want, it won’t be transferred to the slaves in the form that you want.
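A toy model of this behaviour in plain C++, not ROOT's actual streamer. It assumes that the base-class part (the name) is always written and that the class's own members are written only when the class version is greater than 0, which is consistent with an object that arrives on the workers but carries only default values. The `Config` struct, `Buffer`, `streamOut` and `streamIn` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a TNamed-derived configuration class.
struct Config {
    std::string name = "config";     // base-class part
    std::string sample = "default";  // the class's own data member
};

// Toy wire format: field name -> value.
using Buffer = std::map<std::string, std::string>;

// Write the object. Assumption: the base part is always written, the
// class's own members only if its version is greater than 0.
Buffer streamOut(const Config& c, int classVersion) {
    assert(classVersion >= 0);
    Buffer buf;
    buf["name"] = c.name;
    if (classVersion > 0) {
        buf["sample"] = c.sample;
    }
    return buf;
}

// Read the object back: start from a default-constructed Config and
// overwrite only the fields that are present in the buffer.
Config streamIn(const Buffer& buf) {
    Config c;
    auto it = buf.find("name");
    if (it != buf.end()) c.name = it->second;
    it = buf.find("sample");
    if (it != buf.end()) c.sample = it->second;
    return c;
}
```

In this model a round trip with version 0 yields an object with the right name but the default `sample`, while version 1 preserves the value set on the client.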

Cheers,
Anna

krasznaa · July 29, 2008, 3:25pm

Hi Anna,

You win… It’s just obvious that there are a lot of things I don’t know about ROOT. I automatically defined 0 as the version number for the Config class. Now that I changed it to 1, everything started working.

Thanks a lot!

Attila

P.S. The really annoying thing is that I now remember spending quite a while on the same issue some time ago. (Unrelated code.) Still, I wouldn’t have remembered it…

anna · July 29, 2008, 3:36pm

Attila,

I’m glad it works for you now. Looks like the PROOF documentation needs some serious upgrading. Well, we always knew it did, but thanks for bringing it to our attention again. I promise, we’ll improve it some time soon.

Anna