Friday, December 9, 2011

Issues with TSM server not able to talk to client...

We ran into this today:

ANR2716E Schedule prompter was not able to contact client using type 1

After some Google searching:

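If a firewall sits between the TSM server and its clients, the clients must use the option

SCHEDMODE POLLING

In polling mode each client contacts the server to pick up its schedule, so the server never has to open a connection to the client. The clients then start their backup when the scheduled time comes.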
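
To double-check which scheduling mode a client is actually configured for, the client options file can be scanned for the SCHEDMODE line. Below is a minimal Python sketch of that check, not an IBM tool. The function name sched_mode and the sample file contents are made up for illustration. It assumes comment lines start with an asterisk, treats option names as case-insensitive, and does not handle abbreviated option names.

```python
def sched_mode(options_text):
    """Return the SCHEDMODE value found in client options text, or None."""
    mode = None
    for raw in options_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '*').
        if not line or line.startswith("*"):
            continue
        parts = line.split()
        # Compare the option name case-insensitively; the last match wins.
        if len(parts) >= 2 and parts[0].upper() == "SCHEDMODE":
            mode = parts[1].upper()
    return mode


# Hypothetical options file contents, for illustration only.
sample = """\
* hypothetical dsm.opt contents
NODENAME   EXAMPLENODE
SCHEDMODE  POLLING
"""
print(sched_mode(sample))
```

If the function returns None, the option is not set in that file and the client is running with whatever default applies.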

Monday, June 20, 2011

TSM Client cluster password deleted when generic resource brought online

Problem
TSM Client cluster password deleted when generic resource brought online
 
Solution
Manually updating the cluster node password while the MSCS generic resource is offline causes the checkpoint file held on the quorum disk to become out of sync with the password entry for the cluster node in the registry. When bringing the generic resource online, the password entry for the cluster node is deleted and the ANS2050E error is observed in the error log.

Assumptions:
  • Resetting the password on the TSM client on each machine has not resolved the issue.
  • The encrypted values match for the password key in the registry key: HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\NODENAME\SERVERNAME
  • When the generic service resource is brought online, the password value is deleted from the registry.
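
To compare the registry entry on both cluster machines, the same key has to be queried on each one. The Python sketch below only builds the command line for the standard Windows reg query tool, so it can be pasted into a command prompt on each node. The function name reg_query_command is made up for illustration, and NODENAME and SERVERNAME are the same placeholders used in the key above.

```python
# Key under which the client stores per-node entries (from the path above,
# with HKEY_LOCAL_MACHINE abbreviated to HKLM as reg.exe allows).
REG_BASE = r"HKLM\SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes"


def reg_query_command(node_name, server_name):
    """Build a 'reg query' command line for one node/server pair."""
    key = "\\".join([REG_BASE, node_name, server_name])
    # Quote the key so names containing spaces survive the command prompt.
    return 'reg query "{}"'.format(key)


# Placeholders only; substitute the real cluster node and server names.
print(reg_query_command("NODENAME", "SERVERNAME"))
```

Run the printed command on each machine and compare the password values by eye.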

In a cluster environment, TSM uses the generic service resource to control the stopping and starting of the scheduler service, and to start the TSM scheduler service on the failover machine when a failover occurs. When the generic service resource is initialized, it compares the registry value of:

HKEY_LOCAL_MACHINE\SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\NODENAME\SERVERNAME

with a checkpoint file (.cpt file) located on the quorum drive. If the password for the client node is changed while the generic service resource is offline, this checkpoint file and the registry may become out of sync. When this occurs, the generic service resource will either overwrite the value in the registry with the value in the checkpoint file, or remove the password value from the registry.

One way to verify whether the checkpoint file and registry have become out of sync is to take the generic service resource offline, reset the password for the client node (using DSMC Q SE -OPTFILE=XXXX from the client command line), and try to start the TSM scheduler service without the generic service resource. If the scheduler service starts and stays in a "started" state, this confirms that the checkpoint file and the registry are out of sync.


There are two possible solutions. One is to contact Microsoft support to recreate the checkpoint file. The other is to follow the steps below, which should also create the checkpoint file:
  1. Reset the clusternode password on the TSM server.
  2. On the active node, open a command line and start up dsmc with the appropriate dsm.opt specified for the clusternode.
  3. TSM will prompt for the new password, and load this into the registry when it is supplied.
  4. The clusternode scheduler can then be started manually, as a local service.
  5. Once the clusternode scheduler is started as a local service, the cluster Generic Resource which manages it can be manually brought online through the Cluster Administrator. If it is started after the scheduler is started as a local service, and neither it nor the cluster are bounced, it should stay online.
  6. *While the Generic Service is running* reset the clusternode password at the TSM server *again*.
  7. Again, open up a command line, start up dsmc with the appropriate dsm.opt file, and fill in the password when requested.
  8. Fail the nodes over, so that the active node is now passive and vice versa.
  9. The cluster Generic Service, with its newly-filled-in password, should successfully fail over as well, and stay online.
  10. Start up a command-line dsmc session with the appropriate dsm.opt file, to fill in the new password if necessary and to check that the session is connecting properly.

The new checkpoint file has been written, matches the registry key, and the clusternode TSM scheduler can once again run under the control of the cluster Generic Resource.

Without that second password reset, the Generic Resource fails as soon as the cluster fails over. With the second password reset, done while the Generic Resource is running, it rewrites the checkpoint file.

Friday, February 11, 2011

TSM NDMP Restore Setup

Many thanks to my TSM co-worker Bill, you are the man!

http://www-01.ibm.com/support/docview.wss?uid=swg21254984

Wednesday, December 15, 2010

VFiler Commands

vfiler create vfilername [-n] [-s ipspace ] -i ipaddr [-i ipaddr ]… path [ path ...]
vfiler create vfilername -r path
vfiler destroy [-f] vfilername
vfiler rename old_vfilername new_vfilername
vfiler add vfilername [-f] [-i ipaddr [-i ipaddr]…] [ path [ path ...]]
vfiler remove vfilername [-f] [-i ipaddr [-i ipaddr]…] [ path [path ...]]
vfiler limit [ max_vfilers ]
vfiler move vfiler_from vfiler_to [-f] [-i ipaddr [-i ipaddr]…] [path [path ...]]
vfiler start vfilertemplate
vfiler stop vfilertemplate
vfiler status [-r|-a] [ vfilertemplate]
vfiler run [-q] vfilertemplate command [args]
vfiler allow vfilertemplate [proto=cifs] [proto=nfs] [proto=rsh] [proto=iscsi] [proto=ftp] [proto=http]
vfiler disallow vfilertemplate [proto=cifs] [proto=nfs] [proto=rsh] [proto=iscsi] [proto=ftp] [proto=http]
vfiler context vfilername
vfiler dr configure [-l user:password ] [-e ifname:IP address:netmask, ... ] [-d dns_server_ip:... ] [-n nis_server_ip:... ] [-s ] remote_vfiler@remote_filer
vfiler dr status remote_vfiler@remote_filer
vfiler dr delete [-f] remote_vfiler@remote_filer
vfiler dr activate remote_vfiler@remote_filer
vfiler dr resync [-l remote_login:remote_passwd ] [-a alt_src, alt-dst ] [-s ] vfilername@destination_filer
vfiler migrate [-m nocopy [-f]] [-l user:password ] [-e ifname:IP address:netmask, ... ] remote_vfiler@remote_filer
vfiler migrate start [-l user:password ] [-e ifname:IP address:netmask, ... ] remote_vfiler@remote_filer
vfiler migrate status remote_vfiler@remote_filer
vfiler migrate cancel remote_vfiler@remote_filer
vfiler migrate complete remote_vfiler@remote_filer
vfiler help

Monday, November 1, 2010

AIX Device Commands


lscfg                    lists all installed devices
lscfg -v                 lists all installed devices in detail
lscfg -vl (device name)  lists device details

bootinfo -b                reports last device the system booted from
bootinfo -k                reports keyswitch position (1=secure, 2=service, 3=normal)
bootinfo -r                reports amount of real memory in KB (divide by 1024 for MB)
bootinfo -s (disk device)  reports size of disk drive
bootinfo -T                reports type of machine, e.g. rspc
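
The "/ by 1024" note on bootinfo -r is a unit conversion: assuming the reported figure is in kilobytes, dividing it by 1024 gives megabytes. A minimal Python sketch of the arithmetic follows. The function name kb_to_mb and the sample figure are made up for illustration and do not come from a real machine.

```python
def kb_to_mb(kilobytes):
    """Convert a memory size in KB to MB."""
    return kilobytes / 1024


# Hypothetical value standing in for the number printed by bootinfo -r.
reported_kb = 4194304
print(kb_to_mb(reported_kb))
```

Dividing the result by 1024 again gives gigabytes.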

lsattr -El sys0 -a realmem  reports amount of usable memory

mknod (device) c (major no) (minor no)  Creates a /dev/ device file.
mknod /dev/null1 c 2 3

lsdev -C  lists all customised devices, i.e. installed
lsdev -P  lists all pre-defined devices, i.e. supported
lsdev -(C or P) -c (class) -t (type) -s (subtype)

chdev -l (device) -a (attribute)=(new value)  Change a device attribute
chdev -l sys0 -a maxuproc=80

lsattr -EH -l (device)  Lists the current attribute values of a device, with column headers
lsattr -DH -l (device)  Lists the defaults in the pre-defined db
lsattr -EH -l sys0 -a modelname

rmdev -l (device)      Change device state from available to defined
rmdev -l (device) -d   Delete the device
rmdev -l (device) -SR  S stops device, R unconfigures child devices

lsresource -l (device)  Displays bus resource attributes of a device.

pmctrl -a  Displays the Power Management state

rmdev -l pmc0  Unconfigure Power Management
mkdev -l pmc0  Configure Power Management