Max number of events when using a txt file with datasets

Hi, after a quick check it seems that, when using a txt file with several datasets (one per line), the maximum number of events is ignored by PROOF and the analysis runs over the full number of events in the sum of the datasets. Is this the intended behaviour? I have a dataset with 22M events in total and the analysis went past 25% even though I had set a maximum of 5M events.
filimon

Hi,

No, it is not the expected behavior.
Could you do the following:

  1. Restart a run after setting
     gProof->SetLogLevel(2, TProofDebug::kPacketizer)
  2. When the query is running, push the 'stop' button
  3. When the prompt is back, retrieve and save the master log file
     TProofLog *pl = gProof->GetManager()->GetSessionLogs();
     pl->Save("0", "master.log");

and post master.log.
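For convenience, the steps above can be typed as one ROOT session fragment (assuming gProof already points to an open connection; the log file name is just an example):

   // 1. enable packetizer debug printouts before (re)submitting the query
   gProof->SetLogLevel(2, TProofDebug::kPacketizer);
   // 2. run the query and push the 'stop' button while it is processing
   // 3. back at the prompt, retrieve and save the master log
   TProofLog *pl = gProof->GetManager()->GetSessionLogs();
   pl->Save("0", "master.log");   // "0" selects the master's log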

Thanks.

G. Ganis

Hi, it is rather difficult to get the logs this way, for other reasons: there is a crash during merging when 'stop' is pressed, which I have also observed in different PROOF setups since v5-34-02 (to be looked into further and reported back). Asking for the logs without stopping first also crashes at the moment. This looks like a new regression (it definitely worked correctly in previous setups I had), but it seems to be a different issue and could be related to my environment rather than PROOF.

I did, however, confirm that the requested number of events is definitely not respected, by hardcoding it at the place of the call as follows:

   mgr->StartAnalysis("proof", gridDataDir, 100000, nEventsSkip); // gridDataDir is actually "alice-caf.cern.ch" in this context and nEventsSkip=0

Is there a way to enable logging to a file in advance, without explicitly requesting to gather the logs?

Alternatively, I attach the txt file with the datasets. Assuming you have a bare task for your tests, it should be straightforward to verify the problem on your side just by using this dataset list file. If it somehow works properly on your side, we should make some more effort to get the logs on my side.
proof_LHC10d1_dataset.txt (946 Bytes)

Hi,

Sorry for the late reply. I will try to reproduce it and get back to you.

For your information, logs can be retrieved after a crash by reconnecting and running

  root [] TProofLog *pl = TProof::Mgr("alice-caf.cern.ch")->GetSessionLogs()
  root [] pl->Save("0", "master.log")

The logs of the last session will be retrieved and saved.

G. Ganis

Hi, I am currently not able to reproduce the problem, and the other random crashes are not appearing at the moment either. I can only assume that the strange behaviour was due to connecting to previously disconnected and not well cleaned-up batch sessions (left over after the crashes). The recent cleanup of AAF seems to help a lot. Anyway, I have two unrelated questions:

  1. I am using AliAnalysisManager to do the analysis, not TProof::Process. Is there a way (env variable/setter) to request asynchronous processing (option "ASYN") in this case?
  2. Is there a way to disable workers from becoming mergers, i.e. go back to the old scheme where only the master does the merging? (for test purposes)

Thanks,
filimon

Hi,

No, unfortunately. Did you check with the ALICE experts whether there is a way to pass a PROOF option via the AliAnalysisManager? If there is none, I think it would be a reasonable thing to have.

Yes, this should do that:

   gProof->SetParameter("PROOF_UseMergers", -1)
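For reference, a minimal sketch of where the call fits in a session (the connection string is the one used in this thread; the parameter must be set before starting the processing):

   TProof *p = TProof::Open("alice-caf.cern.ch");
   p->SetParameter("PROOF_UseMergers", -1);   // -1: no submergers, the master does all the merging
   // ... load packages/selector and process as usual ...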

Gerri