Tuesday, January 20, 2009

Strange SMF "Bug?"

Yesterday I installed Apache 2 on Solaris 10 x86 (on two machines) from the Solaris CD:
pkgadd -d . SUNWapch2r
pkgadd -d . SUNWapch2u
pkgadd -d . SUNWapch2d
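(For reference: SUNWapch2r carries the root-filesystem parts of the Apache 2 package, SUNWapch2u the /usr parts, and SUNWapch2d the documentation.)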
After the successful install I tried to enable the service, but the service was not there :). The next step was to reimport the service into the SMF repository:
bash-3.00# svccfg -v import /var/svc/manifest/network/http-apache2.xml
svccfg: Scope "localhost" changed unexpectedly (service "network/http" added).
svccfg: Could not refresh svc:/network/http:apache2 (deleted).
svccfg: Successful import.
Again, the service was not there. Searching the net, I found this: http://mail.opensolaris.org/pipermail/smf-discuss/2006-October/005565.html
"This happens when a buggy i.manifest is used. If your package has an
i.manifest file which uses SVCCFG_REPOSITORY, then that is your problem.
You can fix it by restarting svc.configd ("pkill configd" as root).
Then you should fix your package, or file a bug."
Reading it, I thought it was a long shot, but it worked! Killing configd and reimporting the service did the trick:
bash-3.00# svccfg -v import /var/svc/manifest/network/http-apache2.xml
svccfg: Taking "previous" snapshot for svc:/network/http:apache2.
svccfg: Upgrading properties of svc:/network/http according to instance "apache2".
svccfg: Taking "last-import" snapshot for svc:/network/http:apache2.
svccfg: Refreshed svc:/network/http:apache2.
svccfg: Successful import.
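This time the service was there and could be enabled as usual (for reference, the standard SMF commands, using the instance name from the transcript above):
bash-3.00# svcadm enable svc:/network/http:apache2
bash-3.00# svcs svc:/network/http:apache2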
Regards

Sunday, January 11, 2009

Fast and simple Solaris Zone creation

- Run zonecfg to create the initial zone configuration:
bash-3.00# zonecfg -z mynewzone
create -b
set zonepath=/zones/mynewzone
set autoboot=true
set ip-type=shared
add net
set address=192.168.0.122
set physical=nxge1
set defrouter=192.168.0.254
end

commit
exit

NOTE: we can also save the commands in a file and run zonecfg like this:
zonecfg -z mynewzone -f myzone.conf
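For example, myzone.conf would simply hold the same commands we typed interactively above:
create -b
set zonepath=/zones/mynewzone
set autoboot=true
set ip-type=shared
add net
set address=192.168.0.122
set physical=nxge1
set defrouter=192.168.0.254
end
commit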

- Create the zone directory and change the permissions:
mkdir /zones/mynewzone
chmod 700 /zones/mynewzone

The zone installation will fail if the zone directory has the wrong permissions:
/zones/mynewzone must not be group readable.
/zones/mynewzone must not be group executable.
/zones/mynewzone must not be world readable.
/zones/mynewzone must not be world executable.
could not verify zonepath /zones/mynewzone because of the above errors.
zoneadm: zone mynewzone failed to verify
- Install the zone (takes time):
zoneadm -z mynewzone install

- Check the log:
grep -v "successfully installed" /zones/mynewzone/root/var/sadm/system/logs/install_log | grep -v ^$

- List the zones:
zoneadm list -cv

  ID NAME       STATUS    PATH              BRAND   IP
   0 global     running   /                 native  shared
   1 mynewzone  running   /zones/mynewzone  native  shared
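(Right after the install, the zone shows up here with status "installed"; it changes to "running" once the zone is booted in the next step.)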
- Boot the zone:
zoneadm -z mynewzone boot

- Now we need to do some last standard Solaris configuration steps; we will log in to the zone console and answer the questions for the basic configuration (language, terminal, network...):
zlogin -C mynewzone
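(To disconnect from the zone console later, type the escape sequence ~. at the beginning of a line.)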
If this step is skipped, many of the services won't start.
Done.

Sun Cluster 3.2 Globaldevices issue...

Hi,
I had an issue with my Sun Cluster 3.2: for some reason, only one node's /global/.devices was mounted at a time. Because of that I couldn't switch the resources between nodes (the switch/remaster command hung), and on one node the cluster globaldevices service failed to start.

When the resource switch hung, there was no message in syslog; it just waited until it timed out and failed the resource back.

Also, when the server booted I could see this message:
mount: /dev/md/dsk/d6 is already mounted or /global/.devices/node@1 is busy
Trying to remount /global/.devices/node@1
mount: /dev/md/dsk/d6 is already mounted or /global/.devices/node@1 is busy

WARNING - Unable to mount one or more of the following filesystem(s):
/global/.devices/node@1
If this is not repaired, global devices will be unavailable.
Run mount manually (mount filesystem...).
After the problems are corrected, please clear the
maintenance flag on globaldevices by running the
following command:
/usr/sbin/svcadm clear svc:/system/cluster/globaldevices:default
The problem was that both nodes used the same metadevice name, /dev/md/dsk/d6, for /global/.devices. Here is how my vfstab looked on each node before the fix:
Node1:
/dev/md/dsk/d6 /dev/md/rdsk/d6 /global/.devices/node@1 ufs 2 no global

Node2:
/dev/md/dsk/d6 /dev/md/rdsk/d6 /global/.devices/node@2 ufs 2 no global

To solve it, all I had to do was rename the metadevice on each node using metarename and update /etc/vfstab to reflect the change:
Node1:
metarename d6 d601
Node2:
metarename d6 d602
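After the rename, the vfstab entries no longer collide; applying the new names from above gives:
Node1:
/dev/md/dsk/d601 /dev/md/rdsk/d601 /global/.devices/node@1 ufs 2 no global

Node2:
/dev/md/dsk/d602 /dev/md/rdsk/d602 /global/.devices/node@2 ufs 2 no global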

After the change, you can restart svc:/system/cluster/globaldevices:default on both nodes and everything works.
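For reference, that restart is the usual svcadm call, run on each node:
svcadm restart svc:/system/cluster/globaldevices:default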

Wednesday, January 07, 2009

Add a ZFS dataset to a zone

I think the easiest method for adding (and later managing) a ZFS pool/filesystem in a zone is to add it as a dataset.

Adding the ZFS dataset this way allows the zone administrator to manage the pool/filesystem and use it like a normal ZFS filesystem.
zonecfg -z myzone
zonecfg:myzone> add dataset
zonecfg:myzone:dataset> set name=zpool/mydata
zonecfg:myzone:dataset> end
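Once the zone is rebooted, the delegated dataset becomes visible inside it, and the zone admin can use the normal ZFS commands on it, along these lines (a sketch using the names from above):
# zlogin myzone
myzone# zfs list -r zpool/mydata
myzone# zfs create zpool/mydata/docs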
There are a few properties that the zone admin won't be able to change, and it could be that the zone admin will see more than you want him to see. For more information about the delegation properties: http://docs.sun.com/app/docs/doc/819-5461/gbbsn?a=view

There are times when all you want is to share space between the global zone and the non-global zone; in this case, you can add the ZFS filesystem as a generic filesystem:
# zfs set mountpoint=legacy zpool/mydata
# zonecfg -z myzone
zonecfg:myzone> add fs
zonecfg:myzone:fs> set type=zfs
zonecfg:myzone:fs> set special=zpool/mydata
zonecfg:myzone:fs> set dir=/data
zonecfg:myzone:fs> end
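With this method, the zone just sees an ordinary filesystem mounted at /data (e.g. in the output of df -h inside the zone), while the filesystem itself is still managed with zfs commands from the global zone.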

Tuesday, January 06, 2009

Adding a new shared disk to a running Sun Cluster 3.2

Adding a new shared disk to a running Solaris system is easy: just add the device and wait a few seconds for the OS to recognize it.
After that, we need the cluster to be aware of the newly added disk and to create a new DID device for it. Just run "cldevice refresh", and you will then be able to see the disk in the output of "cldevice list -v".
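Putting it together, the whole procedure comes down to something like this (a sketch; the devfsadm step is only needed if the OS does not pick up the new disk on its own):
# devfsadm -c disk
# cldevice refresh
# cldevice list -v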