vxdisk


The original can be found here.

Exact Error Message
DEVICE TYPE DISK GROUP STATUS
c0t1d0s2 sliced online failing

Details:

When a failure causes Veritas Volume Manager to mark a disk as failing even though the disk itself is not bad, the failing flag for the disk can be cleared.

Do not do this if the state of the disk is unknown. However, if the disk does not appear to be failing and there are no failure messages in the system logs (such as /var/adm/messages), the flag can be reset.

The steps below also apply if the flag was set because of a different hardware failure (i.e., a controller failure). Once the flag is cleared, try simple disk accesses and then review the online status before modifying anything on the disk. If there actually are disk problems, the failing flag will be set again.

Note: If the disk is truly failing and this flag is cleared, there is a risk of losing data due to hardware problems.
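
The "simple disk accesses" mentioned above can be as basic as a read-only pass over the start of the raw device. Here is a minimal sketch of that idea, assuming a POSIX shell and dd. The function name read_test is made up for this example, and the block size and count are arbitrary choices, not values from Veritas documentation.

```shell
# Hypothetical helper: read the first 8 MB of a device and report the result.
# It only reads (output goes to /dev/null), so it does not modify the disk.
read_test() {
    dev=$1
    if dd if="$dev" of=/dev/null bs=128k count=64 2>/dev/null; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev"
        return 1
    fi
}

# Usage on a live system (not run here):
#   read_test /dev/rdsk/c0t1d0s2
```

A clean read does not prove the disk is healthy, so still check /var/adm/messages and the vxdisk list status afterwards.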

1. List the disks under Veritas Volume Manager control to determine which disk is marked bad:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t1d0s2 sliced disk01 rootdg online failing
c0t3d0s2 sliced rootdisk rootdg online
……..(more)
2. Clear the failing flag for the disk that is marked as failing:

# vxedit set failing=off <disk_name>     (for example: vxedit set failing=off disk01)

3. Verify that the flag has been cleared:

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
c0t1d0s2 sliced disk01 rootdg online
c0t3d0s2 sliced rootdisk rootdg online

……..(more)
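
On a box with a lot of disks it is easy to miss a failing flag in the vxdisk list output by eye. As a rough sketch, the output can be filtered with awk. The function name list_failing_disks is made up for this example, and it assumes the five-column layout shown above (DEVICE TYPE DISK GROUP STATUS), with the status words starting in the fifth column.

```shell
# Hypothetical helper: read `vxdisk list` output on stdin and print
# "device disk-name" for every disk whose STATUS includes "failing".
# Assumes the status words start at field 5, as in the listing above.
list_failing_disks() {
    awk '{
        for (i = 5; i <= NF; i++) {
            if ($i == "failing") { print $1, $3; break }
        }
    }'
}

# Usage on a live system (not run here):
#   vxdisk list | list_failing_disks
```

Running it before and after the vxedit step gives a quick check that the flag really went away.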

On a system with a large amount of SAN-attached EMC disk, I’ve found that as you add and remove disks over time, the Veritas disk names get out of sync with the OS native disk names. This is dangerous: you may be working with a disk that you think is unused when in reality it is being used in another disk group. Here is an example using the command vxdisk -e list:

DEVICE TYPE DISK GROUP STATUS OS_NATIVE_NAME
c1t0d0s2 auto - - online c1t0d0s2
c3t3d0s2 auto - - error c3t3d107s2
c3t3d1s2 auto - - error c3t3d156s2
c3t3d2s2 auto - - error c3t3d99s2
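
A listing like this can be checked mechanically by comparing the first and last columns. The sketch below assumes OS_NATIVE_NAME is the last column, as in the sample above, and that OS-native naming is in use so the two names are expected to match. The function name find_name_mismatches is made up for this example.

```shell
# Hypothetical helper: read `vxdisk -e list` output on stdin and print
# "device os-native-name" for every row where the two differ.
# Assumes OS_NATIVE_NAME is the last column, as in the sample listing.
find_name_mismatches() {
    awk '$1 != "DEVICE" && NF >= 6 && $1 != $NF { print $1, $NF }'
}

# Usage on a live system (not run here):
#   vxdisk -e list | find_name_mismatches
```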

A mismatch like this is bad: if you didn’t use vxdisk -e list, you might never know until it’s too late. This typically happens in a clustered environment where many servers share the same disks: you add disks to the environment but forget to update the other servers. To fix this, do the following:

# vxdiskconfig
# rm /etc/disk.info
# vxconfigd -k

The last command restarts vxconfigd, which rebuilds the /etc/disk.info file with the correct information. Also, keep in mind that in a clustered environment, any VCS service group using these disks could fail when you run vxconfigd -k, so you may want to freeze the service group first.
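
For the clustered case, the freeze can be wrapped around the rebuild so the group is always unfrozen afterwards. This is only a sketch: the function name rescan_with_freeze, the RUN variable and the group name appsg are made up for this example. By default it just prints the commands it would run, and you would set RUN to an empty string to execute them for real.

```shell
# Hypothetical wrapper around the rebuild steps above.
# RUN defaults to "echo" (dry run: print commands only).
# Set RUN= (empty) to actually execute them.
RUN=${RUN-echo}

rescan_with_freeze() {
    group=$1
    # Temporarily freeze the VCS service group; stop if that fails.
    $RUN hagrp -freeze "$group" || return 1
    # Rebuild the device view; each step runs only if the previous one worked.
    $RUN vxdiskconfig &&
        $RUN rm /etc/disk.info &&
        $RUN vxconfigd -k
    status=$?
    # Always unfreeze, even if one of the steps above failed.
    $RUN hagrp -unfreeze "$group"
    return $status
}

# Dry run with a hypothetical group name (prints the five commands):
#   rescan_with_freeze appsg
```

Keeping the dry run as the default means you can review the exact command sequence before letting it touch /etc/disk.info.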