Thursday, December 17, 2009

DS4300 Controller Issues - CONTROLLER HEALTH CHECK FAILURE

This past weekend we started encountering some odd messages in the AIX error report.  When inspecting the Storage Manager client we also saw that all paths had moved over to the second controller.

BC669AA7   1212153609 P H dac1     CONTROLLER HEALTH CHECK FAILURE
BC669AA7   1212152409 P H dac1     CONTROLLER HEALTH CHECK FAILURE
483C9D10   1212151109 I H dac0     ARRAY ACTIVE CONTROLLER SWITCH
D5385D18   1212151109 T H hdisk3   ARRAY OPERATION ERROR
C86ACB7E   1212151109 I H hdisk3   ARRAY CONFIGURATION CHANGED
483C9D10   1212151109 I H dac0     ARRAY ACTIVE CONTROLLER SWITCH
D5385D18   1212151109 T H hdisk5   ARRAY OPERATION ERROR
C86ACB7E   1212151109 I H hdisk5   ARRAY CONFIGURATION CHANGED
BC669AA7   1212150309 P H dac1     CONTROLLER HEALTH CHECK FAILURE
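
For anyone who wants to tally entries like the ones above, here is a minimal Python sketch. It assumes the six-column errpt summary layout shown in this post and the MMDDhhmmYY timestamp format; the function names are my own and purely illustrative.

```python
from collections import Counter
from datetime import datetime

# Sample lines copied from the errpt output in this post.
SAMPLE = """\
BC669AA7   1212153609 P H dac1     CONTROLLER HEALTH CHECK FAILURE
483C9D10   1212151109 I H dac0     ARRAY ACTIVE CONTROLLER SWITCH
D5385D18   1212151109 T H hdisk3   ARRAY OPERATION ERROR
"""


def parse_errpt_line(line):
    """Split one errpt summary line into its six columns."""
    ident, stamp, etype, eclass, resource, description = line.split(None, 5)
    return {
        "identifier": ident,
        # errpt summary timestamps are MMDDhhmmYY
        "timestamp": datetime.strptime(stamp, "%m%d%H%M%y"),
        "type": etype,
        "class": eclass,
        "resource": resource,
        "description": description.strip(),
    }


def count_by_resource(text):
    """Count how many entries each resource (dac0, hdisk3, ...) logged."""
    entries = [parse_errpt_line(l) for l in text.splitlines() if l.strip()]
    return Counter(e["resource"] for e in entries)


for resource, count in count_by_resource(SAMPLE).items():
    print(resource, count)
```

Decoding the timestamp this way puts the first dac1 failure at 15:03 on December 12, 2009, which matches the weekend the errors started.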

After inspecting the controller errors via the Storage Manager client, I Googled them and saw references to the controller being faulty.  The details of the AIX errors also pointed to the controller as the issue.  All HBAs were online and available, so it wasn't a connection issue on that front.

After a call to IBM support and some onsite troubleshooting with the CE, we reseated the controller.  This also had the benefit of power cycling it.  Once the controller was reseated and brought online, it functioned just fine and even moved the paths back over automatically.

Monday, December 7, 2009

What’s the Login User Name and Password for VMWare Server 2.0?


from:  www.tipandtrick.net
VMWare Server 2.0 (currently in Beta 2 release) has departed significantly from legacy VMWare Server 1.0. It’s still free virtualization software that lets users create, manage and run virtual machines, but instead of the usual standalone desktop (or notebook) application, VMWare Server 2.0 is now managed solely through a web-based user interface, along with many other new features, enhancements and improvements.

To log in to VMWare Server 2.0, users have to open https://localhost:8333/ui/ (or http://localhost:8222/ui/ for a non-secure connection; the URL may use your computer name instead of localhost) in a web browser to reach the VMware Infrastructure (VI) Web Access management interface, which is the VMWare Server console, usually just called the Web-UI. Don’t worry if your system doesn’t have a web server such as Apache or Microsoft IIS running; VMWare Server 2.0 installs a Tomcat web server in the background.


But users will first come to a VI Credentials page asking for a Login Name and Password, as shown in the screenshot below:

[Screenshot: VMWare Server 2.0 Login Screen]

What login user name and password should you use? This is probably your first installation of VMWare Server 2.0, and even if you have installed VMWare Server 1.0 before, no version of the installer ever asks you to create a user account, user ID or password. And searching up and down the VMWare Server program folder in the Start Menu doesn’t reveal any program to create or manage login names for the VMWare console either.


Actually, VMWare Infrastructure Web Access, and hence VMWare Server 2.0, uses the user accounts of the host operating system, i.e. Windows XP, Windows Vista, Windows Server 2003, Windows Server 2008 or a Linux distro. So to log in to the VMWare Server 2.0 Web Console, use the user name of an administrative Windows or Linux account (Administrator or root) and the corresponding password. Note that a password is a must. In Windows, the built-in Administrator account often has no password by default even after being enabled, so a password must be assigned.


It’s also possible to create another user account with administrator privileges specifically for logging in to VMWare Server.

Thursday, October 8, 2009

TSM Select Queries - Updated 1-4-2010

This list will continue to grow as I add more queries over time. Newest additions are at the bottom.

List Out Activity Log by Message Number, Message and Date and Time
select DATE_TIME,MSGNO,MESSAGE from ACTLOG where MESSAGE like '%PROD_DISK%' and date_time>timestamp(current date -4 day,'00:00:00') and (msgno=1210 or msgno=1214 or msgno=2753)

List Scratch Volumes:


select volume_name from libvolumes where status='Scratch'

Number of Files grouped by node name:

select node_name, sum(num_files) from occupancy where node_name like 'HAIM%' group by node_name

DSMADMC command line examples that export to a file:

dsmadmc -optfile=tsmaprod01.opt -id= -password= -tab select node_name, sum(num_files) from occupancy where node_name like 'AX%' group by node_name >c:\temp\occupancy.txt

dsmadmc -optfile=tsmblibm.opt -id= -password= -tab select volume_name from libvolumes where status='Scratch' >c:\temp\scratchb.txt

dsmadmc -optfile=tsmblibm.opt -id= -password= -tab all:q stg old >c:\temp\oldstg.txt

dsmadmc -optfile=tsmblibm.opt -id= -password= -tab select * from archives where node_name='PRDAUSRVS01' and ll_name='a4200.rpt' >c:\temp\a4200.rpt.txt

***This is case sensitive***
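
If you would rather collect the results in a script than redirect them to a file, the same kind of dsmadmc call can be wrapped in Python. This is only a sketch: it uses the -optfile, -id, -password, -tabdelimited and -dataonly=yes options that appear elsewhere in this post, and the function names are hypothetical.

```python
import subprocess


def build_dsmadmc_command(optfile, admin_id, password, query):
    """Return the argument list for a tab-delimited, data-only dsmadmc query."""
    return [
        "dsmadmc",
        f"-optfile={optfile}",
        f"-id={admin_id}",
        f"-password={password}",
        "-tabdelimited",
        "-dataonly=yes",
        query,
    ]


def parse_tab_output(text):
    """Split tab-delimited output into one list of column values per row."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def run_query(optfile, admin_id, password, query):
    """Run the query and return parsed rows (needs dsmadmc on the PATH)."""
    cmd = build_dsmadmc_command(optfile, admin_id, password, query)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_tab_output(result.stdout)
```

Passing the query as a single list element avoids the shell quoting problems you can run into with the single quotes in the where clause.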

Total Tapes Used by Node for all Storage Pools

select node_name,stgpool_name,count(distinct volume_name) as TOTAL_TAPES from volumeusage where node_name='A4ID3P01' and stgpool_name in (select stgpool_name from stgpools where pooltype='PRIMARY') group by node_name,stgpool_name

***This was also handy in a restore situation to tell how many tapes TSM needed to mount to restore the data.

Locate Individual File

select * from backups where node_name='NODENAMEINALLCAPS' and ll_name='file.txt'

TSM Script Containing Select Statement to Gather Client Error Messages - use DSMADMC to run the script and export to a file

/*-----*/
/* Script Name: Q_Client_Errors */
/* Description: Find client errors since 18:00 yesterday */
/*-----*/
set sqldisplaymode wide
select date_time, msgno, nodename, substr(message,27) as MESSAGE from actlog where date_time>timestamp(current date - 1 day,'18:00:00') and (msgno=4005 or msgno=4007 or msgno=4037 or msgno=4987)

Using ABC Client and Showing Backup Statistics

select cast(sum(cast(substr(message,31,12) as float(12)))/1024/1024 as decimal(12,2)) as "GB" from actlog where date_time>'2009-01-20 06:00:00' and date_time<='2009-01-21 06:00' and msgno=4990 and message like '%Data transferred%' and nodename like 'AV%'

Show GB/Week for all servers

select entity, sum(bytes)/1024/1024/1024 as "GB/week" from summary where start_time>'2009-01-15 18:00:00' and activity='BACKUP' group by entity order by 2 desc

Select Servers on a particular subnet

select node_name, tcp_address from nodes where tcp_address like '192.168.50.%'

Show Objects Backed Up on a Node in Last 24 Hours

select backup_date, hl_name, ll_name from backups where node_name='NODENAME' and backup_date>=current_timestamp-24 hours

List All Volumes for a Specific Storage Pool

select volume_name,stgpool_name from volumes where stgpool_name like 'ST06_TAPEC_01_OLD%'

Select Management Classes that are associated with backups. This lets you see whether any backups use a longer retention.
select distinct class_name from backups where node_name='AWSQLP23'

Total Backed Up in 24 hours by node, platform and affected.
SELECT node_name as NODE,platform_name as PLATFORM, activity,sum(cast(bytes/1024/1024/1024 as decimal(6,2))) as GB, affected FROM nodes, summary WHERE (end_time between current_timestamp - 24 hours and current_timestamp) and activity='BACKUP' and ((node_name=entity)) GROUP BY node_name, platform_name, activity, affected ORDER BY platform, GB, node asc


Total Size from Filespaces
select sum(pct_util*capacity/100) from filespaces
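
To make the arithmetic in that query concrete, here is a small Python sketch of the same calculation. It assumes capacity is reported in MB and pct_util as a percentage; the sample rows are made up for illustration.

```python
def used_mb(capacity_mb, pct_util):
    """Used space of one filespace: capacity scaled by percent utilized."""
    return pct_util * capacity_mb / 100


def total_used_mb(rows):
    """Sum used space over (capacity_mb, pct_util) rows, like the select above."""
    return sum(used_mb(capacity, pct) for capacity, pct in rows)


# Hypothetical filespaces: a 100 GB volume half full and a 20 GB volume a quarter full.
rows = [(102400.0, 50.0), (20480.0, 25.0)]
print(total_used_mb(rows))          # total in MB
print(total_used_mb(rows) / 1024)   # total in GB
```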

Total GB Per Node from Backup Activity
select node_name, type, sum (logical_mb)/1024 as logical_gb from occupancy where stgpool_name in (select stgpool_name from stgpools where pooltype='PRIMARY') group by node_name, type

Filespace Reporting - Sorted by Node name showing the last backed up date
select node_name, filespace_name, backup_end from filespaces order by node_name

Total GB Per Node from Backup Activity in Last 24 Hours - can be used for archives too; just change activity='BACKUP' to activity='ARCHIVE'.
SELECT entity, activity, CAST(FLOAT(SUM(bytes)) / 1024 / 1024 / 1024 AS DECIMAL(8,2)) as "GB" FROM summary WHERE end_time>current_timestamp-24 hours and activity='BACKUP' GROUP BY entity, activity

Report Start Time of a Backup for a Particular Node
all:Select entity, start_time from summary where entity like '%ALWDM%' and start_time >{ts '2009-10-02 22:00:00'} AND start_time <{ts '2009-10-03 06:30:00'}

Report on Schedule Associations, Domain, Setup Time and various other things - run it through dsmadmc and combine the output with Excel's CONCATENATE

dsmadmc -optfile=instancename -id= -password= -tabdelimited -dataonly=yes select * from associations, client_schedules where associations.node_name='nodename' and client_schedules.domain_name=associations.domain_name and client_schedules.schedule_name=associations.schedule_name >>c:\temp\sox\scheds.txt

Find Backups from a Specific Node
select * from backups where node_name='NODENAME'

Find a File from Backups of a Specific Node
select * from backups where node_name='NODENAME' and ll_name='file_name'

Find all backup objects in a given Filespace from Backups of a Specific Node
select * from backups where node_name='NODENAME' and filespace_name='/filespace'

List out the Filespace Name, Directory, and the File from Backups of a Specific Node
select filespace_name,hl_name,ll_name from backups where node_name='NODENAME'

List out the Filespace Name, Directory, File, and Archive Date from Archives of a Specific Node
select filespace_name, hl_name, ll_name, archive_date from archives where node_name like 'NODENAME' order by ll_name

List out the Filespace Name, Directory, File, and Deactivation Date from Backups of a Specific Node
select deactivate_date, filespace_name, type, ll_name, hl_name from backups where node_name='NODENAME'

List out the Filespace Name, Directory, File, and Deactivation Date from Backups of a Specific Node sorted by active/inactive version.
select deactivate_date, filespace_name, type, ll_name, hl_name, deactivate_date, class_name, state from backups where node_name='NODENAME' order by state

List out the Filespace Name, Directory, File, and Deactivation Date from Backups of a Specific Node sorted by backup date
select deactivate_date, filespace_name, type, ll_name, hl_name, deactivate_date, class_name, state, backup_date, node_name from backups where node_name='NODENAME' order by backup_date

List what volumes contain active file data for a particular node.
select node_name, filespace_name, stgpool_name, volume_name from volumeusage where node_name='NODENAME' and node_name in (select node_name from backups where state='ACTIVE_VERSION' group by node_name)

List what volumes contain a particular file for a particular node
select volume_name from contents where file_name='/ .list_filesets.out' and node_name='NODENAME'

List what archives have been created within the last 8 hours.
select node_name, hl_name, ll_name from archives where archive_date>(current_timestamp-8 hours) order by node_name

Generates a count of archives by nodename.
select node_name, count(*) from archives group by node_name

Generates a list of all full volumes that are checked into the library.
select distinct volumes.volume_name, volumes.status, volumes.pct_reclaim, libvolumes.library_name from volumes, libvolumes where volumes.volume_name=libvolumes.volume_name and volumes.status='FULL' and libvolumes.library_name='LIBRARYNAME'

Generates a count of volumes by type, private or scratch, for a particular library.
select status, count(*) from libvolumes where library_name='LIBRARY' group by status

Saturday, August 1, 2009

OPT File Domain Statements

I've been working to clean up duplicate backups in our Windows Cluster environment. The general backup was also picking up all the cluster drives, so a modification of the domain statement was needed. Here are two examples of domain statements I wanted to document for future use:

Win2k3 Server TSM Client
DOMAIN C: D: SYSTEMSTATE SYSTEMSERVICES

Win2k Server TSM Client
DOMAIN C: D: SYSTEMOBJECT

You should also restart the Sched and CAD services after you modify the domain statement to ensure that the updated OPT file is picked up.
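
When cleaning up several servers, it can help to check which drives an OPT file actually includes. The Python sketch below reads the DOMAIN statements out of options-file text. It assumes comment lines start with an asterisk, and the helper name is hypothetical.

```python
def parse_domain(opt_text):
    """Collect the values of every DOMAIN statement in options-file text."""
    domains = []
    for line in opt_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (assumed to start with "*").
        if not stripped or stripped.startswith("*"):
            continue
        parts = stripped.split()
        if parts[0].upper() == "DOMAIN":
            domains.extend(p.upper() for p in parts[1:])
    return domains


sample = """\
* Win2k3 node, local drives only
DOMAIN C: D: SYSTEMSTATE SYSTEMSERVICES
"""
print(parse_domain(sample))
```

A cluster drive letter showing up in the returned list would mean the general backup is still picking it up.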

Friday, March 27, 2009

Problems Starting TSM Services - Cryptic Messages

When attempting to start TSM services, I received some cryptic messages that looked like memory error failures. Upon further investigation, I found some interesting errors in the Application Eventlog of Windows:

TSM Client Acceptor terminated abnormally: appMain exit code 959.

Error Opening/Creating Registry Path 'SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Scheduler Service'

TSM Central Scheduler Service terminated abnormally: Error Obtaining Registry Parameters..

These eventually led me to believe the services were corrupted. I uninstalled and reinstalled them and I was able to start them.

Cryptic Messages when Starting Services

I was receiving some cryptic error messages when trying to start TSM services. They looked like memory errors. After further investigation and checking the event logs, I noticed entries that reported this:


TSM Central Scheduler Service terminated abnormally: Error Obtaining Registry Parameters..

or

Error Opening/Creating Registry Path 'SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Scheduler Service'
or
TSM Client Acceptor terminated abnormally: appMain exit code 959.

Monday, March 9, 2009

Domino TDP "Return Code is 804 Error"

I am documenting this error here because I can never find this on a web search:

Error was found in domarch.log

Return Code is 804
Current date is:
Sun 03/08/2009
Current time is:
08:01 AM

IBM Tivoli Storage Manager for Mail:
Data Protection for Lotus Domino
Version 5, Release 4, Level 2.01
(C) Copyright IBM Corporation 1999, 2007. All rights reserved.

ACD0053E License file (C:\Program Files\Tivoli\TSM\domino\domclient.lic) could not be opened.

Return Code is 804

TSM 6.1 Data Dump for Reference

TSM 6.1 Data Dedupe and DB2 - Curtis Preston

http://www.backupcentral.com/content/view/223/47/

Volume Shadow Services (VSS) - Bane of TSM Administrators

Persistent and frustrating VSS errors have been plaguing our systems. I recently found some good info I wanted to document.


Windows 2003 Volume Shadow Copy Service (VSS) Fixes for Systemstate Backup

Problem
TSM client backups of Windows 2003 system state fail with varying VSS errors

Cause
Microsoft VSS failures

Solution
This list represents current fixes available from Microsoft, listed in historical order, for Windows 2003 systemstate backup problems experienced by 3rd party backup vendors, including Tivoli Storage Manager (TSM).
Blue represents the most current fix that is available. All the below VSS fixes should be requested from Microsoft.

KBASE#   FIX #     WEBSITE LINK
--------------------------------------------------------------------------------
826936   139876    http://support.microsoft.com/kb/826936
833167   158865    http://support.microsoft.com/kb/833167
867686   188827    No web link
887827   unknown   No web link

All of the above fixes are rolled into Windows 2003 SP1. The Microsoft kbase document with the list of fixes included in SP1:
http://support.microsoft.com/?kbid=824721

Subsequent to Win2k3 SP1, Microsoft has provided VSS Post SP1 fix packages:
KBASE#   FIX #     WEBSITE LINK
--------------------------------------------------------------------------------
891957   unknown   http://support.microsoft.com/?id=891957
903234   unknown   http://support.microsoft.com/?id=903234

Both of the above fixes require this fix as a prerequisite:
913648   unknown   http://support.microsoft.com/?id=913648

http://www-1.ibm.com/support/docview.wss?uid=swg21242128

Redirected Restore for DB2 using TSM

I was asked last week about restoring a DB2 database from one server to another server. After a quick bit of searching, I found this article. The DBAs are going to test it this week to see how well it works. I thought I would document it so I don't lose the link:


http://www.ibm.com/developerworks/db2/library/techarticle/0212mulligan/0212mulligan.html

NetApp Quick Learnings

• 16 TB of raw storage per aggregate.
• Data not shared between aggregates.
• Always leave at least one spare disk in NetApp. If you don't, NetApp freaks out.
• A mixture of SATA and FC drives can be used in the same NetApp device. However, you cannot mix drives in an aggregate. They must all be the same type of drive.
• Min number of drives per array is 3, but drives can be added to expand the array.
• When creating an array, 2 drives must be used as parity.

Maximum and default RAID group sizes - from NetApp

Maximum and default RAID group sizes vary according to the storage system model, level of RAID group protection provided, and the types of disks used in the RAID group. Generally, you should use the default RAID group sizes.

Table 1. RAID group sizing for RAID-DP groups
Disk type     Minimum group size   Maximum group size   Default group size
ATA or SATA   3                    16                   14
FC or SAS     3                    28                   16

Table 2. RAID group sizing for RAID4 groups
Storage system model                       Minimum group size   Maximum group size   Default group size
FAS250                                     2                    14                   7
All other models using ATA or SATA disks   2                    7                    7
All other models using FC or SAS disks     2                    14                   8
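
As a quick worked example of what those group sizes mean for usable disks, here is a short Python sketch. It assumes two parity disks per RAID-DP group (as noted in the quick learnings) and one per RAID4 group; the function name is my own.

```python
# Parity disks consumed by each RAID group type.
PARITY_DISKS = {"RAID-DP": 2, "RAID4": 1}


def data_disks(raid_type, group_size):
    """Number of disks left for data in one RAID group of the given size."""
    parity = PARITY_DISKS[raid_type]
    if group_size <= parity:
        raise ValueError("group needs at least one data disk besides parity")
    return group_size - parity


# Default group sizes taken from the two tables.
print(data_disks("RAID-DP", 14))  # ATA or SATA
print(data_disks("RAID-DP", 16))  # FC or SAS
print(data_disks("RAID4", 7))     # FAS250
```

So a default SATA RAID-DP group of 14 disks leaves 12 for data, and the minimum group sizes (3 for RAID-DP, 2 for RAID4) each leave a single data disk.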

NetApp Ontap Simulator + CentOS + Computer Hardware = NetApp Learning Fun!!

You'll need a NOW account to get the Ontap Simulator. Here's a link to some general info about it:
http://partners.netapp.com/go/techontap/matl/sample/0206tot_monthlytool.html
CentOS to get some RH Linux experience:
http://www.centos.org/
http://centos.mirrors.tds.net/pub/linux/centos/5.2/isos/x86_64/

TSM Operational Reporter

What is TSM Operational Reporter and where to get it?

http://www.tek-tips.com/faqs.cfm?fid=5643

Old PPT file showing TSM-OR:

ftp://service.boulder.ibm.com/storage/tivoli-storage-management/techprev/tsmopreport/latest/tsmrept_walk_through_d7.ppt

The IBM FTP site:

ftp://ftp.software.ibm.com/storage/tivoli-storage-management/maintenance/server/v5r4/WIN/5.4.3.0

Download: tsmcon54X0_win.exe

View DSMError.log and DSMSched.log Files Batch File

This batch file has been invaluable to me for checking the error and sched logs every day. Drop it into a text file and save it as a batch file. When launched, it deletes the current Z: drive mapping and asks for the name of the server you would like to view. After you enter a server name and hit Enter, the batch file pings the server and maps a new Z: drive to the TSM baclient folder. It then copies the DSMError.log and DSMSched.log files to your C:\Temp folder and opens them one at a time, starting with DSMError.log. You have to close the error log for DSMSched.log to come up. If the batch file cannot map the drive, it may hang or show a message in the command line window and then open the last dsm files that were copied. So in a way, it troubleshoots the connection to the server right off the bat.

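@echo off
rem Drop any existing Z: mapping before mapping the new server
net use z: /delete
Echo View TSM DSMError Logs
Echo Enter the name of the server you would like to view:
SET /p servername=
rem Quick connectivity check before mapping the drive
ping "%servername%"
net use z: "\\%servername%\c$\program files\tivoli\tsm\baclient"
z:
rem Copy both logs locally, then open them one at a time
copy "z:\dsmerror.log" "C:\temp"
copy "Z:\dsmsched.log" "C:\temp"
"C:\temp\dsmerror.log"
"C:\temp\dsmsched.log"
exit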
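
The same idea can be sketched in Python without mapping a drive letter, by copying straight from the UNC path. This is a rough equivalent, not a replacement for the batch file: it assumes the same admin share and folder layout, and the function names are hypothetical.

```python
import ntpath
import shutil

# TSM client folder on the remote admin share, as used in the batch file.
BACLIENT_SHARE = r"c$\program files\tivoli\tsm\baclient"
LOG_NAMES = ("dsmerror.log", "dsmsched.log")


def remote_log_paths(servername):
    """Build the UNC paths of both client logs for one server."""
    base = "\\\\" + servername + "\\" + BACLIENT_SHARE
    return [ntpath.join(base, name) for name in LOG_NAMES]


def copy_logs(servername, dest=r"C:\temp"):
    """Copy both logs to dest; only meaningful on Windows with share access."""
    copied = []
    for src in remote_log_paths(servername):
        try:
            copied.append(shutil.copy(src, dest))
        except OSError as err:
            # An unreachable share shows up here instead of a hung drive mapping.
            print(f"could not copy {src}: {err}")
    return copied
```

Calling copy_logs with a server name returns the list of local copies, which you can then open in whatever viewer you prefer.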

TSM Client for Windows 2k and Windows 2k3 Regarding Core System File Restores

Windows Server 2k refers to the registry and other core files as System Objects. Windows Server 2k3 refers to the registry and other core files as System State and System Services.

Both groups of files can only be restored while connected to TSM from the original machine. You cannot restore these files to another server first. The restore will then prompt for registry activation. Restoring these files without activating them is not recommended. If you choose to activate, TSM will prompt for a restart when the restore is complete.