Sunday, January 11, 2009

Sun Cluster 3.2 Globaldevices issue...

Hi,
I had an issue with my Sun Cluster 3.2: for some reason only one node's /global/.devices was mounted at a time. Because of that I couldn't switch resources between nodes (the switch/remaster command hung), and on one node the cluster globaldevices service failed to start.

When the resource switch hung, there was no message in syslog; it just waited until it timed out and failed the resource back.

Also, when the server booted I could see this message:
mount: /dev/md/dsk/d6 is already mounted or /global/.devices/node@1 is busy
Trying to remount /global/.devices/node@1
mount: /dev/md/dsk/d6 is already mounted or /global/.devices/node@1 is busy

WARNING - Unable to mount one or more of the following filesystem(s):
/global/.devices/node@1
If this is not repaired, global devices will be unavailable.
Run mount manually (mount filesystem...).
After the problems are corrected, please clear the
maintenance flag on globaldevices by running the
following command:
/usr/sbin/svcadm clear svc:/system/cluster/globaldevices:default
The problem was that both nodes used the same metadevice name, /dev/md/dsk/d6, for /global/.devices. Here is how the vfstab looked on each node before the fix:
Node1:
/dev/md/dsk/d6 /dev/md/rdsk/d6 /global/.devices/node@1 ufs 2 no global

Node2:
/dev/md/dsk/d6 /dev/md/rdsk/d6 /global/.devices/node@2 ufs 2 no global

To solve it, all I had to do was rename the metadevice on each node using metarename and update /etc/vfstab to reflect the change:
Node1:
metarename d6 d601
Node2:
metarename d6 d602

After the change, restart svc:/system/cluster/globaldevices:default on both nodes and everything works.
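The whole fix can be sketched as the following commands (a sketch only: the new names d601/d602 just need to be unique across nodes, and metarename expects the metadevice to be unused, so run it with the filesystem unmounted):

```shell
# On node 1: rename the metadevice backing /global/.devices/node@1
metarename d6 d601
# Then update /etc/vfstab to match the new name:
#   /dev/md/dsk/d601 /dev/md/rdsk/d601 /global/.devices/node@1 ufs 2 no global

# On node 2: pick a different name so the two nodes no longer clash
metarename d6 d602
# And in that node's /etc/vfstab:
#   /dev/md/dsk/d602 /dev/md/rdsk/d602 /global/.devices/node@2 ufs 2 no global

# On both nodes, clear the maintenance flag and restart the service
svcadm clear svc:/system/cluster/globaldevices:default
svcadm restart svc:/system/cluster/globaldevices:default
```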

4 comments:

Unknown said...

You just saved me from pulling out my hair. Thank you very much.

FAAS said...

Hi, I have the same problem, but when I try to rename the device, the metarename command gives this output:

metarename: sun2-0: there are no existing databases

I'm using Solaris Cluster 3.3u1.

I'm trying to solve the problem and I'd appreciate your help a lot.

Anonymous said...

Hi FAAS
The point is that when the cluster is running on all nodes, it mounts /global/.devices from every node, so each device must be named uniquely. If your /globaldevices entry looks like:
/dev/md/dsk/d30 /dev/md/rdsk/d30 /globaldevices ufs 2 yes -

on all nodes (i.e. /dev/md/dsk/d30 everywhere), then the cluster won't be able to cross-mount /global/.devices from the other servers. What needs to be done is renaming the metadevice, in this case d30.
Maybe your problem is that you don't use metadevices at all. What is the output of the metastat command? Do you see any output?

FAAS said...

Hi, thanks for your answer.
The output of the metastat command is:

metastat: sun2-0: there are no existing databases

However, reading the scmountdev code I managed to find a workaround by changing the device in the /etc/vfstab file from:

/dev/did/dsk/d1s3 /dev/did/rdsk/d1s3
to:
/dev/did/dsk/d2s3 /dev/did/rdsk/d2s3

The cluster mounts /globaldevices with no sign of error, but I think this is not an appropriate solution.