Thursday, January 15, 2009

Unable to install software: bosboot verification failure

When I tried to install software on an AIX box, the installation failed with a "bosboot verification failed" error.

I checked and found that /dev/ipldevice was missing. This file is a hard link to /dev/hdisk0 (the boot disk).

So recreate /dev/ipldevice as a hard link to /dev/hdisk0:

ln /dev/hdisk0 /dev/ipldevice

Then rebuild the boot image:

bosboot -ad /dev/hdisk0

After that, the software installation worked.
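
The check and the fix can be wrapped in a small helper. This is only a sketch: the function name check_ipldevice and its directory argument are made up for illustration (the argument lets you try it on a scratch directory instead of /dev), it assumes a shell whose test command supports -ef (ksh and bash do), and it only prints the repair commands rather than running them.

```shell
# Report whether <dir>/ipldevice exists and is a hard link to <dir>/<disk>.
# check_ipldevice is a hypothetical helper; on a real system it would be
# called as: check_ipldevice /dev hdisk0
check_ipldevice() {
    dir=$1
    disk=$2
    if [ ! -e "$dir/ipldevice" ]; then
        echo "missing: run 'ln $dir/$disk $dir/ipldevice' and then 'bosboot -ad $dir/$disk'"
        return 1
    fi
    # -ef is true when both names refer to the same file (same inode)
    if [ "$dir/ipldevice" -ef "$dir/$disk" ]; then
        echo "ok: $dir/ipldevice is linked to $dir/$disk"
        return 0
    fi
    echo "mismatch: $dir/ipldevice is not a link to $dir/$disk"
    return 1
}
```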

Tuesday, December 16, 2008

Procedure for exchanging a "Hot Swap" mirrored rootvg disk.

In the following example, an RS6000 has 3 disks, 2 of which hold the mirrored
AIX filesystems. The bootlist contains both hdisk0 and hdisk1.
There are no logical volumes in rootvg other than the AIX system
logical volumes. hdisk0 has failed and needs replacing. Both hdisk0 and hdisk1
are in "Hot Swap" carriers, so the machine does not need to be shut
down.

lspv

hdisk0 00522d5f22e3b29d rootvg
hdisk1 00522d5f90e66fd2 rootvg
hdisk2 00522df586d454c3 datavg

lsvg -l rootvg

rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd6 paging 4 8 2 open/syncd N/A
hd5 boot 1 2 2 closed/syncd N/A
hd8 jfslog 1 2 2 open/syncd N/A
hd4 jfs 1 2 2 open/syncd /
hd2 jfs 12 24 2 open/syncd /usr
hd9var jfs 1 2 2 open/syncd /var
hd3 jfs 2 4 2 open/syncd /tmp
hd1 jfs 1 2 2 open/syncd /home
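
In the listing above every logical volume has twice as many PPs as LPs, which is what a two-copy mirror looks like. A quick way to spot a volume that is not mirrored is to compare those two columns. The sketch below assumes the column layout shown above; the helper name unmirrored_lvs is made up for illustration.

```shell
# Read "lsvg -l <vg>" output on stdin and print every logical volume whose
# PP count is not exactly twice its LP count (i.e. not a two-copy mirror).
# unmirrored_lvs is a hypothetical helper name.
unmirrored_lvs() {
    # Skip the "rootvg:" line and the column header, then compare
    # the LPs column ($3) with the PPs column ($4).
    awk 'NR > 2 && NF >= 6 && $4 != 2 * $3 { print $1 }'
}

# On a live system: lsvg -l rootvg | unmirrored_lvs
# No output means every logical volume is mirrored.
```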



1, Reduce the logical volume copies from both disks to hdisk1 only :-

rmlvcopy hd6 1 hdisk0
rmlvcopy hd5 1 hdisk0
rmlvcopy hd8 1 hdisk0
rmlvcopy hd4 1 hdisk0
rmlvcopy hd2 1 hdisk0
rmlvcopy hd9var 1 hdisk0
rmlvcopy hd3 1 hdisk0
rmlvcopy hd1 1 hdisk0

2, Check that no logical volumes are left on hdisk0 :-

lspv -p hdisk0

hdisk0:
PP RANGE STATE REGION LV ID TYPE MOUNT POINT
1-101 free outer edge
102-201 free outer middle
202-301 free center
302-401 free inner middle
402-501 free inner edge

3, Remove hdisk0 from the volume group :-

reducevg -df rootvg hdisk0

4, Recreate the boot image on hdisk1, and reset the bootlist :-

bosboot -a -d /dev/hdisk1
bootlist -m normal rmt0 cd0 hdisk1

5, Check that everything has been removed from hdisk0 :-

lspv

hdisk0 00522d5f22e3b29d None
hdisk1 00522d5f90e66fd2 rootvg
hdisk2 00522df586d454c3 datavg

6, Delete hdisk0 :-

rmdev -l hdisk0 -d

7, Remove the failed hard drive and replace with a new hard drive.

8, Configure the new disk drive :-

cfgmgr

9, Check new hard drive is present :-

lspv

10, Include the new hdisk in root volume group :-

extendvg rootvg hdisk? (where hdisk? is the new hard disk)

11, Re-create the mirror :-

mirrorvg rootvg hdisk? (where hdisk? is the new hard disk)

12, Synchronise the mirror :-

syncvg -v rootvg

13, Reset the bootlist :-

bootlist -m normal rmt0 cd0 hdisk0 hdisk1

14, Turn off Quorum checking on rootvg :-

chvg -Q n rootvg
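
Step 1 needs one rmlvcopy command per logical volume, which is tedious to type when rootvg holds more volumes than in this example. The sketch below builds that list from the lsvg -l rootvg output. It assumes the column layout shown earlier, the helper name print_rmlvcopy is made up for illustration, and it deliberately prints the commands instead of running them so they can be reviewed first.

```shell
# Read "lsvg -l rootvg" output on stdin and print one rmlvcopy command per
# logical volume for the given disk. Nothing is executed.
# print_rmlvcopy is a hypothetical helper name.
print_rmlvcopy() {
    disk=$1
    # Skip the "rootvg:" line and the column header; $1 is the LV name.
    awk -v disk="$disk" 'NR > 2 && NF >= 6 { print "rmlvcopy", $1, 1, disk }'
}

# On a live system: lsvg -l rootvg | print_rmlvcopy hdisk0
```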

Monday, December 8, 2008

Fix a Full / Filesystem

Use the following steps:

1. Use the who command to read the contents of the /etc/security/failedlogin file:
# who /etc/security/failedlogin
The condition of TTYs respawning too rapidly can create failed login entries.
To clear the file after reading or saving the output, execute the following
command:
# cp /dev/null /etc/security/failedlogin

2. Use the find command to check for very large files that might be removed. For
example, to find all files in the root (/) file system larger than 1 MB, use the
following command:
# find / -xdev -size +2048 -ls | sort -rn +6

Before removing any files, use the fuser command to ensure a file is not
currently in use by a user process:
# fuser filename
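
The search in step 2 can be kept as a small function for reuse on other file systems. This is a sketch under a few assumptions: the name large_files is made up, file names are assumed not to contain whitespace, and it sorts on the size column of ls -l (field 5) with the -k option instead of the older +6 form.

```shell
# List regular files larger than 1 MB under a directory, biggest first,
# without crossing into other file systems.
# large_files is a hypothetical helper name.
large_files() {
    dir=$1
    # -size +2048 means more than 2048 blocks of 512 bytes, i.e. over 1 MB.
    # Field 5 of "ls -l" is the size in bytes, so sort on it numerically.
    find "$dir" -xdev -type f -size +2048 -exec ls -l {} \; 2>/dev/null |
        sort -k 5,5nr
}

# Example: large_files /
```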

Fix a Full /var Filesystem

Use the following steps:

1. Use the find command to look for large files in the /var directory. For
example:
# find /var -xdev -size +2048 -ls | sort -rn +6

2. Check for obsolete or leftover files in /var/tmp.

3. Check the size of the /var/adm/wtmp file, which logs all logins, rlogins and
telnet sessions. The log will grow indefinitely unless system accounting is
running. System accounting clears it out nightly. The /var/adm/wtmp file can
be cleared out or edited to remove old and unwanted information. To clear it,
use the following command:
# cp /dev/null /var/adm/wtmp

4. Clear the error log in the /var/adm/ras directory using the following procedure.
The error log is never cleared unless it is manually cleared.

a. Stop the error daemon using the following command:
# /usr/lib/errstop

b. Remove the error log file, or move it to a different file system, by using
one of the following commands:
# rm /var/adm/ras/errlog
or
# mv /var/adm/ras/errlog filename

c. Restart the error daemon using the following command:
# /usr/lib/errdemon


5. Check whether the trcfile file in the /var/adm/ras directory is large.
If it is large and trace is not currently being run, you can remove the file
using the following command:
# rm /var/adm/ras/trcfile

6. If your dump device is set to hd6 (which is the default), there might be a
number of vmcore* files in the /var/adm/ras directory. If their file dates are old
or you do not want to retain them, you can remove them with the rm
command.

7. Check the /var/spool directory, which contains the queuing subsystem files.
Clear the queuing subsystem using the following commands:
# stopsrc -s qdaemon
0513-044 The qdaemon Subsystem was requested to stop.
# rm /var/spool/lpd/qdir/*
# rm /var/spool/lpd/stat/*
# rm /var/spool/qdaemon/*
# startsrc -s qdaemon
0513-059 The qdaemon Subsystem has been started. Subsystem PID is 291042.
#

8. Check the /var/adm/acct directory, which contains accounting records. If
accounting is running, this directory may contain several large files.

9. Modify the /var/tmp/snmpd.log, which records events from the snmpd daemon.
If the file is removed it will be recreated by the snmpd daemon.

10. Modify the /var/adm/sulog file, which records the number of attempted uses of
the su command and whether each was successful. This is a flat file and can
be viewed and modified with any text editor. If it is removed, it will be
recreated by the next attempted su command.
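
Several of the steps above clear a file by copying /dev/null over it, which empties the file in place and keeps its owner and permissions. If the old contents may still be needed, a copy can be saved first, ideally on a file system that is not full. The sketch below shows one way to do that; the helper name save_and_clear and the .old suffix are made up for illustration.

```shell
# Save a copy of a log file in another directory, then empty it in place.
# save_and_clear and the ".old" suffix are hypothetical choices.
save_and_clear() {
    log=$1
    backup_dir=$2
    # Stop if the copy fails, so the log is never emptied without a backup.
    cp "$log" "$backup_dir/$(basename "$log").old" || return 1
    # Copying /dev/null over the file truncates it without removing it.
    cp /dev/null "$log"
}

# Example: save_and_clear /var/adm/wtmp /home/backup
```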

Sunday, December 7, 2008

AIX Tunable Parameter Settings

For each command, record the output and compare it with the expected result.

Parameters               Command                           Expected result

Operating system         oslevel -r                        5.3.0.0-02, or the latest level with the latest ML
                         lppchk -c                         No errors

OS settings              date                              Date and time
                         date -u                           Difference of 5:30 hrs
                         cat /etc/environment | grep TZ    TZ=IST-5:30,, or TZ=IST-5:30
                         lsattr -El mem0                   As per the config
                         lsps -a                           2 x memory size if the memory size is less than 8 GB;
                                                           1.5 x if it is more than 8 GB; the same as physical
                                                           memory if it is more than 16 GB. If the paging space
                                                           is in rootvg only, make one paging space only.
                         lsattr -El sys0 | grep maxuproc   More than 2048 (minimum, or as per the customer requirement)
                         lsattr -El sys0 | grep maxpout    32 (only if there is a cluster)
                         lsattr -El sys0 | grep minpout    24 (only if there is a cluster)
                         ulimit -a                         Unlimited for all except core size

RootVG settings          df -k /                           More than 512 MB
                         df -k /var                        More than 1 GB
                         df -k /tmp                        More than 1 GB
                         df -k /usr                        More than 2.5 GB
                         df -k /opt                        More than 512 MB
                         df -k /home                       More than 512 MB
                         lsvg -l rootvg                    All LVs mirrored and syncd except lg_dumplv
                         sysdumpdev -l                     always allow dump true, dump compression on
                         sysdumpdev -e                     As per the memory size
                         lslv lg_dumplv                    Above value x 1.2

Virtual memory settings  vmo -L | grep maxfree             128 x number of CPUs
                         vmo -L | grep minfree             120 x number of CPUs
                         vmo -L | grep maxperm             20% of pages
                         vmo -L | grep minperm             10% of pages

Network parameters       no -L | grep sb_max               1310720
                         no -L | grep rfc1323              1
                         no -L | grep tcp_sendspace        221184
                         no -L | grep tcp_recvspace        221184
                         no -L | grep udp_sendspace        65536
                         no -L | grep udp_recvspace        655360
                         no -L | grep ipqmaxlen            512 (needs reboot)

IO parameters            ioo -L | grep sync_release_ilock  1

Error report             errpt                             No permanent hardware errors
                         diag                              No problem found in system verification mode
                                                           after selecting all the resources
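
Two of the expected results in the checklist above are calculations rather than fixed values: the paging space size and the maxfree/minfree targets. The sketch below turns those rules into shell arithmetic. The function names are made up for illustration, and the checklist does not say what to do at exactly 8 GB or 16 GB, so treating both as the 1.5 x case is an assumption.

```shell
# Recommended paging space in MB for a given physical memory size in MB.
# Assumption: exactly 8 GB and exactly 16 GB both fall in the 1.5 x band,
# since the checklist only covers "less than 8 GB" and "more than 16 GB".
paging_space_mb() {
    mem_mb=$1
    if [ "$mem_mb" -lt 8192 ]; then
        echo $((mem_mb * 2))
    elif [ "$mem_mb" -le 16384 ]; then
        echo $((mem_mb * 3 / 2))
    else
        echo "$mem_mb"
    fi
}

# maxfree and minfree targets for a given number of CPUs.
vmo_free_targets() {
    ncpu=$1
    echo "maxfree=$((128 * ncpu)) minfree=$((120 * ncpu))"
}

# Example: paging_space_mb 4096   (4 GB of memory gives 8192 MB of paging space)
# Example: vmo_free_targets 4     (gives maxfree=512 minfree=480)
```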

Problem Determination: Server hangs at LED 546 due to bosboot failure

The server did not boot in normal mode: it hung at LED 546 during boot. This means that the bosboot command failed to re-create the boot image on the server. Boot the server in maintenance mode from CD and access a shell without mounting the filesystems.



1. Check file system consistency

fsck /dev/hd4
fsck /dev/hd2
fsck /dev/hd3
fsck /dev/hd9var
fsck /dev/hd1


2. Format JFS log

/usr/sbin/logform /dev/hd8


3. Check disk in root vg

lsvg -p rootvg --> hdisk0, hdisk3


4. Check boot partitions

lslv -m hd5 --> hdisk0


5. Check bootlist in normal mode.

bootlist -m normal -o --> hdisk0

cd /dev, ls -l | grep ipl --> nothing comes back, which shows there is a problem



If everything is OK, the output should be:

# cd /dev
# ls -l | grep ipl
crw-rw---- 2 root system 10, 1 Aug 04 18:21 ipl_blv
crw------- 2 root system 17, 0 Aug 04 18:19 ipldevice
#


6. Create link as mentioned below


ln rhd5 ipl_blv -->ok
ln hdisk0 ipldevice -->ok


7. Set bootlist

bootlist -m normal hdisk0 -->ok
bootlist -m normal -o -->

8. Re-create boot image

bosboot -ad /dev/ipldevice --> 0301-162 save base failed /dev/hdisk0
0301-165 bosboot failed do not attempt to boot device
savebase -->ok
bosboot -ad /dev/ipldevice --> 0301-162 save base failed /dev/hdisk0
0301-165 bosboot failed do not attempt to boot device


9. List hd5 in rootvg.

lsvg -l rootvg --> hd5 has 1 pp and not mirrored


10. Remove hd5

rmlv hd5


11. Clear boot image

chpv -c hdisk0
chpv -c hdisk3

12. Create hd5

mklv -t boot -y hd5 -ae -c 1 rootvg 1 hdisk0 -->savebase failed
savebase -d /dev/hdisk0 -->ok

mklv -t boot -y hd5 -ae -c 1 rootvg 1 hdisk0 -->savebase failed

synclvodm -Pv rootvg
savebase -d /dev/hdisk0

mklv -t boot -y hd5 -ae -c 1 rootvg 1 hdisk0 -->savebase failed
lsvg -l rootvg --> hd5 was there
savebase -d /dev/hdisk0
ln /dev/rhd5 /dev/ipl_blv
ls -al | grep ipl --> make sure this lists ipldevice and ipl_blv
bosboot -ad /dev/ipldevice
ipl_varyon -i --> yes in front of hdisk0
bootlist -m normal -o --> hdisk0 blv=hd5
sync;sync;sync;reboot --> worked, the server booted up

Thursday, November 13, 2008

Crontab Example

Crontab file
___________
Crontab syntax :-
A crontab entry has five fields specifying the minute, hour, day of month, month, and day of week, followed by the command to be run at that interval.

* * * * * command to be executed
- - - - -
| | | | |
| | | | +----- day of week (0 - 6) (Sunday=0)
| | | +------- month (1 - 12)
| | +--------- day of month (1 - 31)
| +----------- hour (0 - 23)
+------------- min (0 - 59)
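
To make the field order concrete, here is a small sketch that splits one crontab entry into its five time fields and the command. The helper name explain_cron and the sample entry (a script at /usr/local/bin/backup.sh) are made up for illustration.

```shell
# Split one crontab entry into its five time fields and the command.
# explain_cron is a hypothetical helper name.
explain_cron() {
    # read splits on whitespace and does not expand the * wildcards;
    # everything after the fifth field ends up in cmd.
    read -r min hour dom mon dow cmd <<EOF
$1
EOF
    echo "min=$min hour=$hour day_of_month=$dom month=$mon day_of_week=$dow command=$cmd"
}

# Example (runs the hypothetical script at 02:30 every Sunday):
#   explain_cron '30 2 * * 0 /usr/local/bin/backup.sh'
# prints:
#   min=30 hour=2 day_of_month=* month=* day_of_week=0 command=/usr/local/bin/backup.sh
```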