Problems with GLOBUS authentication (part 2)

Hello Rooters,

as you can see from our last postings, we are still working on building a PROOF environment on LSF/PBS.

This time the problem is with GLOBUS authentication.
After running grid-proxy-init and copying the proxy files to the PROOF hosts, the next step is to start the PROOF session, and this is the moment things go wrong:

the PROOF session runs through the authentication code and afterwards crashes with the following failure:


Info in TSocket::SendHostAuth: sent 4 bytes for closing

*** Break *** segmentation violation
Generating stack trace…
0x4117f6b8 in from /lib/libc.so.6
0x4152b5bc in TProof::Collect(TSlave const *) + 0x4c from /usr/local/pub/debian

The master is authenticated correctly, but the slave seems to have no idea which authentication method it should use, so it falls back to the default method 0. This crashes:


RpdAuthenticate got: 2036 – 3702 1002
RpdCheckAuthAllow: Checking file: /usr/local/pub/debian3.0/gcc295-04/rootmgr/400-04/etc/system.rootdaemonrc for meth:5 host:lxg0505.gsi.de (gNumAllow: 3)
RpdCheckAuthAllow: new auth method proposed by client not in the list or already attempted
Authenticate: method not in the list sent to the client

This session runs under version 400-04 on one host.
Using 400-06 (with -w -noauth on the proofds) works fine.

Any ideas what’s going wrong?

Cheers Carsten.

P.S.: Merry Christmas to everybody out there and a good start to the new year.

Hi Carsten,

Is the “last posting” you refer to the one from September?
If yes, did you try what I suggested there?
If not, could you try forcing the master-slave authentication method
by adding the following line to the system.rootauthrc file seen by the
master

<slave_name> list globus

or by adding the keyword ‘globus’ to the slave line in proof.conf?

Cheers, Gerri

Hallo Gerri,

thanks for your help. I did what you suggested and it seems to work now (the errors have disappeared)
and we can connect to the master.

Unfortunately there is now a new problem with the authentication between master and slaves, but this seems to be a problem with our environment at GSI, because we can set up a complete PROOF session if the master runs on one particular host.

Is there anything that must be set or done if we want to set up not only a slave but also a master on a host?

Cheers Carsten.

Hi Carsten,

In principle master and slaves should run on a node without any special settings. I suspect a configuration and/or privileges problem.

Could you please send (or re-send, where applicable) the following information:

  1. Exact names of the master and slave machines
  2. How the proofds are started (xinetd or by “hand”) and with which privileges (i.e. normal or super-user)
  3. Location of the “server-side” certificates/keys the proofds are supposed to use for mutual authentication and the privileges needed to read them.

This information is needed to check (and set up) the configuration files you need, so that we may understand what’s wrong and where.

Cheers,

Gerri

Hi Gerri,

I also think that it is a problem with the host configuration.

After tracing the problem and the errors through the PROOF source code, I came to the point where the software should create a shared memory segment, but instead the variable gShmIdCred/cShmIdCred
is set to 0 and the proofservers are started with this value (argvv[9]).
One possible reason for this behaviour is that R__GLBS may not be defined (where must this variable be set, and how can I check its state?).

We have one node, lxts04.gsi.de, which is our Globus front-end host
and is accessible from outside GSI. We can run a master on this host without any problems.
On this host we have the globus directory and also the /etc/grid-security
directory. These directories are only visible from this host, so we decided to mount them on another host (lxts05.gsi.de).
Setting up a master on lxts05 and a slave on a different host ends with the error:
“GlobusGetLocalEnv:Delegate credentials undefined.”

The proofds are started by hand with normal privileges.
The server certificates are stored in the default /etc/grid-sec…
and are readable by everyone ( -rw-r--r-- , which is the same as on the original file).

The names of the slaves change because we want to run a PROOF cluster on an LSF batch farm and not on a dedicated PROOF cluster.

One more question from me: must there be a hostcert.pem for
every host on which we will start a master?
I ask because we have copied the hostcert.pem from lxts04 to lxts05 and PROOF does not complain about it.

Cheers Carsten.

Hi Carsten,

The compilation flag R__GLBS is defined by configure and you should find it in EXTRA_AUTHFLAGS in config/Makefile.config.

In principle you would need a hostcert.pem/hostkey.pem pair for each host that needs to act as a server. In the Globus philosophy, this would be needed only by the GateKeeper, which, once it has identified you, starts the jobs on the internal nodes.

Unfortunately, this structure does not fit very well with what we have at present in PROOF, in particular when one runs the proofds from unprivileged accounts: the hostkey.pem is readable only by the superuser, which means that a proofd with normal privileges cannot run mutual authentication. A possible workaround is to use a valid user proxy certificate (the normal user certificate is not ok because the related private key requires the password). This however is painful, especially in your case, where you do not know which machines are going to be used.
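
To see quickly whether this applies on a given host, one can check which of the server-side files the account running proofd can actually read. The following is only a minimal Python sketch, not part of ROOT or Globus; the helper name describe_access is made up for illustration, and it assumes hostcert.pem and hostkey.pem sit in the /etc/grid-security directory mentioned earlier in this thread.

```python
import os
import stat


def describe_access(path):
    """Report whether the current user can read `path` (hypothetical helper)."""
    if not os.path.exists(path):
        return "missing"
    # Permission bits only, e.g. 0o644 for -rw-r--r--
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if os.access(path, os.R_OK):
        return "readable (mode %o)" % mode
    return "not readable (mode %o)" % mode


if __name__ == "__main__":
    # Assumed location; adjust if your certificates live elsewhere.
    for name in ("hostcert.pem", "hostkey.pem"):
        path = os.path.join("/etc/grid-security", name)
        print(path, "->", describe_access(path))
```

Run it as the same user that starts proofd. If hostkey.pem is reported as not readable, that proofd cannot use the host credentials for mutual authentication.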

From your last and previous messages I understand that, at least on the master, proofd can be run with special privileges.
What I suggest, in this case, is a mixed solution: you authenticate to the master using globus, and from master to slaves using a password. This may be particularly convenient if the machines in the LSF cluster share the same HOME directories.
You need to choose a password, e.g. ProofPwd, save its crypt hash in the file $HOME/.rootdpass on the slave proofds, and create on the master a $HOME/.rootnetrc file with a line like

machine lxts04 user password ProofPwd

for each slave host (no wildcards accepted here, unfortunately).
You then have to force password authentication by adding the line

lxts*:proofd list usrpwd

in the $ROOTSYS/etc/system.rootauthrc on the master.
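
Since one line per slave host is needed and the host names change from job to job, the .rootnetrc entries could be generated rather than typed. This is only a rough Python sketch under assumptions: the function name rootnetrc_lines is hypothetical, and the default line template is copied verbatim from the example line above, so adjust it if your ROOT version expects a different layout.

```python
def rootnetrc_lines(hosts, password,
                    template="machine {host} user password {password}"):
    """Build one .rootnetrc line per distinct host (hypothetical helper).

    The default template mirrors the example line in this post; it is
    an assumption, not a checked description of the file format.
    """
    seen = set()
    lines = []
    for host in hosts:
        if host in seen:
            continue  # a batch system may list the same host several times
        seen.add(host)
        lines.append(template.format(host=host, password=password))
    return lines


if __name__ == "__main__":
    # Example host names only; replace with the hosts of the current job.
    for line in rootnetrc_lines(["lxts04", "lxts05", "lxts04"], "ProofPwd"):
        print(line)
```

The output would be written to $HOME/.rootnetrc on the master, with permissions that keep the password private. Where the host list comes from (for instance an environment variable set by the batch system) depends on your LSF setup and is not covered here.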

Please, let me know if this is of any help.

Cheers, Gerri

Hallo Gerri,

thanks for your help.

Unfortunately, your suggested method of combining globus authentication with usrpwd authentication doesn’t fit very well with what we would prefer.

Your second suggestion seems to be what we need.
But what I don’t understand is the following sentence:

“A possible workaround is to use a valid user proxy certificate (the normal user certificate is not ok because the related private key requires the password).”

I think you mean the x509up_u… file, is that right?
How can I force PROOF to use this file instead of the usercert/userkey?

At the moment we copy the x509up_u… file to every host with a running proofd, to spare the users multiple grid-proxy-init runs.

Thanks,
Carsten.

Hi Carsten,

Yes, I mean the proxy file of the form x509up_u created by grid-proxy-init.

This should work automatically if the file exists in the standard location with the standard name (i.e. /tmp/x509up_u) and is valid (i.e. not expired).
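
A quick sanity check on each host could look like the sketch below. It is only an illustration, not part of ROOT or Globus: the helper names are made up, and it assumes that the proxy name ends in the numeric user ID (the usual Globus default) and that the file must not be accessible by group or others. It does not check expiry; for that, a Globus tool such as grid-proxy-info is the right place to look.

```python
import os
import stat


def default_proxy_path(uid=None):
    """Assumed default proxy location: /tmp/x509up_u followed by the numeric uid."""
    if uid is None:
        uid = os.getuid()
    return "/tmp/x509up_u%d" % uid


def proxy_file_problems(path):
    """Return a list of obvious problems with the proxy file (empty if none found)."""
    if not os.path.isfile(path):
        return ["proxy file not found"]
    problems = []
    info = os.stat(path)
    if info.st_size == 0:
        problems.append("proxy file is empty")
    # Any permission bit set for group or others is treated as a problem here.
    if stat.S_IMODE(info.st_mode) & 0o077:
        problems.append("proxy file is accessible by group or others")
    return problems


if __name__ == "__main__":
    path = default_proxy_path()
    print(path, "->", proxy_file_problems(path) or "looks fine")
```

This may be handy after copying the proxy file around, since a copy can end up with different ownership or permissions than the original.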

Cheers, Gerri

Hallo Gerri,

thanks for your answer.

What you describe in your posting is exactly what we have at the moment.
PROOF reads and uses the grid-proxy files.

What about the hostcert.pem?
Is this file needed by PROOF in our case?
Can I force PROOF to read this file from another location?

Thanks Carsten.