How does PROOF work with dCache?

It looks like PROOF is trying to look up the files directly from dCache. Is it supposed to do this? I'm not really posting this as an error report; I just want to understand what is going on.

-bash-3.00$ root


     *******************************************
     *                                         *
     *        W E L C O M E  to  R O O T       *
     *                                         *
     *   Version 5.14/00b      17 January 2007 *
     *                                         *
     *  You are welcome to visit our Web site  *
     *          http://root.cern.ch            *
     *                                         *
     *******************************************
    

FreeType Engine v2.1.9 used to render TrueType fonts.
Compiled on 9 February 2007 for linux with thread support.

CINT/ROOT C/C++ Interpreter version 5.16.16, November 24, 2006
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.
root [0] .x /grp/cms/root/etc/CMSSW_1_3_1/rootlogon.C
root [1] .x main.C
root://dcache-00.rcac.purdue.edu/pnfs/r … 44C40.root
There are 1000 events

constructing TrackTSelector
Starting master: opening connection …
Starting master: OK
Opening connections to workers: OK (3 workers)
Error in TXSocket::TXSocket on master0: severe error occurred while opening a connection to server [proof://:1093//]: Server not allowed to be worker - ignoring request
Error in TXSlave::Init on master0: some severe error occurred while opening the connection at proof://:1093// - exit
PROOF set to parallel mode (2 workers)
Connected to: gluon.rcac.purdue.edu (valid)
Port number: 1093
User: dbraun
Proofd protocol version: 12
Client protocol version: 12
Remote protocol version: 12
Log level: 0
Session unique tag: gluon-1183132544-6632
Default data pool: root://gluon.rcac.purdue.edu//proofpool
*** Master server 0 (parallel mode, 2 slaves):
Master host name: gluon.rcac.purdue.edu
Port number: 1093
User: dbraun
Protocol version: 12
Image name: gluon.rcac.purdue.edu
Working directory: /tmp/dbraun/session-gluon-1183132544-6632/master-0-gluon-1183132544-6632
Config directory: /apps/02/cmssoft/cms/slc3_ia32_gcc323/lcg/root/5.14.00b-pCMS1/root
Config file: proof.conf
Log level: 0
Number of workers: 2
Number of active workers: 2
Number of unique workers: 2
Number of inactive workers: 0
Number of bad workers: 1
Total MB’s processed: 0.00
Total real time used (s): 0.001
Total CPU time used (s): -0.000
(const char* 0x81d9978)"cms-069.rcac.purdue.edu"
(const char* 0x81d9978)"cms-070.rcac.purdue.edu"
(int)0
(int)0
Begin
Looking up for exact location of files: OK (1 files)
Info in TSignalHandler::Notify: Processing interrupt signal …
Info in TXSlave::HandleError: 0xe28c510: got called … fProof: 0xe28b2c0
Info in TMonitor::Select: *** interrupt occured ***
root [2]
*** Break *** keyboard interrupt :0:
root [2] TXProofMgr::HandleError: 0xe27f328: got called …
Info in TXSlave::HandleError: 0xe28c510: got called … fProof: 0xe28b2c0

*** Break *** keyboard interrupt
Root >
*** Break *** keyboard interrupt
Root > Cannot get entries for file: root://cms-042.rcac.purdue.edu:33115/pn … 44C40.root - skipping
Info in TProof::CollectInputFrom: the processing was aborted - 0 events processed

Info in TXProofServ::SetQueryRunning on master0: starting query: 1
Error in TXNetFile::CreateXClient on slave0.1: open attempt failed on root://cms-042.rcac.purdue.edu:33115/pn … 44C40.root
SysError in TDSet::GetEntries on slave0.1: cannot open file root://cms-042.rcac.purdue.edu:33115/pn … 44C40.root (Operation now in progress)
Error in TPacketizer::ValidateFiles on master0: cannot get entries for root://cms-042.rcac.purdue.edu:33115/pn … 44C40.root (

SlaveBegin
SlaveTerminate
SlaveBegin
SlaveTerminate

Hi,

Sorry for the late reply.

The default PROOF packetizer opens the files before processing starts in order to collect information (such as the number of entries) that it needs to optimize the work distribution. It will be able to use cached information in the near future, but for the moment this is not yet the case.
However, the files are opened from the workers, so if opening fails at this stage (as it seems to), it is likely to fail later on as well, when the files are opened for processing.
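
To illustrate why the per-file entry counts matter for work distribution, here is a minimal sketch in plain C++. It is not the actual TPacketizer code: `FileInfo`, `Packet`, `makePackets`, and the fixed packet size are all hypothetical and only show the idea that a file cannot be split into packets until its number of entries is known.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one validated file: its URL and its entry count.
struct FileInfo {
    std::string url;
    long long entries;
};

// Hypothetical unit of work handed to a worker: a range of entries in one file.
struct Packet {
    std::string url;
    long long first;  // index of the first entry in the packet
    long long count;  // number of entries in the packet
};

// Split every file into packets of at most packetSize entries.
// Without the entry count of each file (obtained by opening it),
// the ranges below could not be computed.
std::vector<Packet> makePackets(const std::vector<FileInfo>& files,
                                long long packetSize) {
    std::vector<Packet> packets;
    if (packetSize <= 0) return packets;  // nothing sensible to do
    for (const FileInfo& f : files) {
        for (long long first = 0; first < f.entries; first += packetSize) {
            long long count = std::min(packetSize, f.entries - first);
            packets.push_back({f.url, first, count});
        }
    }
    return packets;
}
```

With this sketch, a file whose entry count cannot be obtained simply contributes no packets, which is loosely analogous to the "Cannot get entries for file ... skipping" message in the log above.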

Do you have any idea why opening fails from the workers?
Can you open the files from your client machine?

G. Ganis

I can open the files on a local machine just fine. I now understand what it's trying to do, but it's not possible to open root doors on all our dCache nodes. I'm wondering if there is a way to turn that functionality off. I have tested on the worker nodes directly, and they can load the files fine through the TXNetFile interface. But when I use the PROOF chain it doesn't work, so I concluded that it's taking a different path.

Do you know of a way to turn this off?
I think everything will work as long as it goes through the specified dCache door.

Hi,

You do not need to open root doors on all your dCache nodes: if the file is not local to the worker,
as is the case with a dCache backend, the worker just needs the entry point from which to get the file.

But I am a bit confused by your reply.
You say that you can open the files "by hand" from the workers, but that PROOF file validation still fails?
Are the URLs exactly the same?
There may be some problem in the lookup step (we never tested it with dcache).

Can you do the following:

     // Create a TDSet instance
     root[0] TDSet *d = new TDSet("myset")

     // Fill it with your files, e.g.
     root[1] d->Add("root://dcache-00.rcac.purdue.edu/pnfs/rcac.purdue.edu/data/store/mc/2007/4/26/...")
     root[2] d->Add("root://dcache ...")

     // Look them up
     root[] d->Lookup()

     // Print the result
     root[] d->Print("a")

I would expect that for dcache the URLs should be the same as before; if not (possibly because of the xrootd interface)
they should at least make sense to you.
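
When comparing the printed URLs with the ones you added, the part to look at is the host and port. As a rough illustration, here is a small sketch in plain C++ (no ROOT needed). The helpers `hostPort` and `sameEndpoint` are hypothetical names, and the parsing assumes URLs of the simple form `scheme://host[:port]/path`.

```cpp
#include <string>

// Return the "host[:port]" part of a URL of the form scheme://host[:port]/path.
// Returns an empty string if the URL contains no "://" separator.
std::string hostPort(const std::string& url) {
    const std::string sep = "://";
    std::string::size_type start = url.find(sep);
    if (start == std::string::npos) return "";
    start += sep.size();
    // The host part ends at the first '/' after the separator, if any.
    std::string::size_type end = url.find('/', start);
    if (end == std::string::npos) end = url.size();
    return url.substr(start, end - start);
}

// True if both URLs point to the same host[:port].
bool sameEndpoint(const std::string& a, const std::string& b) {
    return hostPort(a) == hostPort(b);
}
```

If a URL added with the door host `dcache-00.rcac.purdue.edu` comes back after the lookup with a different host and port, `sameEndpoint` returns false for the pair, which would suggest that the lookup step rewrote the URL to point at a pool node instead of the door.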

In any case, can you post your main.C so that I can see exactly all the steps ?

G. Ganis

By "by hand" I mean using the Exec command on each worker node.

I will try what you described and see if it works.

Here is my test main.

{
// Load the CMSSW FWLite libraries on the client.
gSystem->Load("libFWCoreFWLite");
AutoLibraryLoader::enable();

// Prefix of the dCache door; the dcap door is kept commented out for reference.
// TString dcache = TString("dcap://dcache.rcac.purdue.edu:22125/pnfs/rcac.purdue.edu/data");
TString dcache = TString("root://dcache-00.rcac.purdue.edu/pnfs/rcac.purdue.edu/data");
const char *files[] = {
  "/store/RelVal/2007/8/2/RelVal-RelVal160pre7SingleMuPlusPt10-1186056193/0000/80C7A533-FC40-DC11-B871-000423D944A4.root",
  "/store/RelVal/2007/8/2/RelVal-RelVal160pre7SingleMuPlusPt10-1186056193/0000/80014F26-1741-DC11-A994-000423D99BF2.root",
  "/store/RelVal/2007/8/2/RelVal-RelVal160pre7SingleMuPlusPt10-1186056193/0000/445FC4D5-1141-DC11-AC68-000423D94524.root",
  NULL};

// Build the chain from the full URLs (door prefix + file path).
TChain chain("Events");
int index = 0;
while (files[index] != NULL) {
  TString filename = TString(files[index]);
  filename.Prepend(dcache);
  std::cout << filename << std::endl;
  chain.Add(filename);
  index++;
}

std::cout << "There are " << chain.GetEntries() << " events" << std::endl;

gSystem->Load("libPhysicsToolsParallelAnalysis");

// Open the PROOF session and load the same libraries on the workers.
TProof *gProof = TProof::Open("cms.rcac.purdue.edu");
gProof->Print();
gProof->Exec("gSystem->HostName()");
gProof->Exec(".x /grp/cms/root/etc/purdueCMS_CMSSW_1_6_0_pre9/rootlogon.C");
gProof->Exec("{gSystem->Load(\"libFWCoreFWLite\"); AutoLibraryLoader::enable(); gSystem->Load(\"libPhysicsToolsParallelAnalysis\");}");

// Run the selector on the chain through PROOF.
chain.SetProof();

chain.Process("TestSelector.C");
}