
Solaris Host to Pillar redundancy test using ORACLE Database

Topic ID: 2745
Created By: 2007-NOV-13 19:08:15 [Vitaliy]
Updated By: 2007-NOV-14 18:32:58 [Vitaliy]
Status: Open
Severity: Normal
Read Only: No
8516
2007-NOV-13 19:08:15
Moderator
 
 
Registered On: Mar 2006
Total Posts: 267
2007-NOV-13 19:19 Vitaliy eros_perf_test.txt 1757 Bytes    
##
## TEST I
##    Objective - "find what it takes to break it"
##

Current connection map from our test host (eros) to the two Pillar Control
Units (c0 and c1). Each Control Unit has two redundant ports (p0 and p1):


                       o-> Brocade0 |-> c0p0
                       |            |-> c1p0
   EROS -> Dual HBA |--o
                       |            |-> c0p1
                       o-> Brocade1 |-> c1p1

If you are in a hurry, see: Conclusion of Solaris Host to Pillar redundancy test using ORACLE Database

## PHASE - I
##
## Disconnect just one of the uplinks from EROS to Brocade0
## 


Nov 13 17:33:40 eros qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(0): Link OFFLINE


Nov 13 17:35:10 eros fctl: [ID 517869 kern.warning] WARNING: fp(1)::OFFLINE timeout


Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=a (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=9 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=8 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=7 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=6 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=5 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=4 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=3 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=2 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=1 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=0 (trace=0), target=10100 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=a (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=9 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=8 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=7 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=6 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=5 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=4 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=3 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=2 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=1 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fcp1):
Nov 13 17:35:29 eros    offlining lun=0 (trace=0), target=10000 (trace=2800004)
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080014002056 (ssd10) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,a is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080013002056 (ssd11) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,9 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080012002056 (ssd12) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,8 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080011002056 (ssd13) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,7 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080010002056 (ssd14) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,6 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000f002056 (ssd15) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,5 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000e002056 (ssd16) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,4 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000d002056 (ssd17) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,3 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000c002056 (ssd18) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,2 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000b002056 (ssd19) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,1 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000a002056 (ssd20) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2200000b08043bd0,0 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080014002056 (ssd10) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,a is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080013002056 (ssd11) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,9 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080012002056 (ssd12) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,8 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080011002056 (ssd13) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,7 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080010002056 (ssd14) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,6 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000f002056 (ssd15) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,5 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000e002056 (ssd16) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,4 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000d002056 (ssd17) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,3 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000c002056 (ssd18) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,2 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000b002056 (ssd19) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,1 is offline Load balancing: round-robin
Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000a002056 (ssd20) multipath status: optimal, path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: w2100000b08043bd0,0 is offline Load balancing: round-robin
#


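PHASE - I Result:
   Everything continues to function fine. The fp1 paths go offline,
   but the multipath status of each ssd device logged above stays "optimal".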
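
To check a result like this without reading every line, the genunix "multipath status" messages can be parsed into fields. This is only a minimal sketch: the regular expression is written against the message layout pasted above, and the names MPATH_RE, parse_mpath and SAMPLE are made up for illustration. Other Solaris releases may format these lines differently.

```python
import re

# Pattern written against the genunix "multipath status" lines in this thread.
# Assumption: the message layout is exactly as pasted above.
MPATH_RE = re.compile(
    r"\((?P<dev>ssd\d+)\) multipath status: (?P<status>\w+), "
    r"path \S+ \((?P<port>fp\d+)\) to target address: "
    r"(?P<target>w[0-9a-f]+),(?P<lun>[0-9a-f]+) is (?P<state>\w+)"
)


def parse_mpath(line):
    """Return the fields of one multipath status line as a dict, or None."""
    match = MPATH_RE.search(line)
    return match.groupdict() if match else None


# One line copied from the PHASE - I log.
SAMPLE = (
    "Nov 13 17:35:29 eros genunix: [ID 834635 kern.info] "
    "/scsi_vhci/ssd@g000b080014002056 (ssd10) multipath status: optimal, "
    "path /pci@1d,700000/SUNW,qlc@1/fp@0,0 (fp1) to target address: "
    "w2200000b08043bd0,a is offline Load balancing: round-robin"
)

if __name__ == "__main__":
    print(parse_mpath(SAMPLE))
```

Feeding a whole messages file through parse_mpath and collecting the "status" values per "dev" is one way to confirm that no device left the "optimal" state.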

continued in the next section ...
[edited by: Vitaliy at 19:17 (CST) on Nov. 13, 2007]
8517
2007-NOV-13 19:08:39
## PHASE - II
##
## disconnect c0p1 (link from c0p1 to Brocade1)
## this still leaves the other P1 port (on c1) working, so we should be OK
##

Nov 13 17:40:54 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000b002056 (ssd19):
Nov 13 17:40:54 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Requested Block: 18464                     Error Block: 18464
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:40:54 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000d002056 (ssd17):
Nov 13 17:40:54 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Requested Block: 18464                     Error Block: 18464
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:40:54 eros fctl: [ID 517869 kern.warning] WARNING: fp(0)::GPN_ID for D_ID=10000 failed
Nov 13 17:40:54 eros fctl: [ID 517869 kern.warning] WARNING: fp(0)::N_x Port with D_ID=10000, PWWN=2300000b08043bd0 disappeared from fabric
Nov 13 17:40:55 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000e002056 (ssd16):
Nov 13 17:40:55 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:40:55 eros scsi: [ID 107833 kern.notice]      Requested Block: 9347056                   Error Block: 9347056
Nov 13 17:40:55 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:40:55 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:40:55 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0


Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=a (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=9 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=8 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=7 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=6 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=5 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=4 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=3 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=2 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=1 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:41:14 eros    offlining lun=0 (trace=0), target=10000 (trace=2800004)
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080014002056 (ssd10) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,a is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080013002056 (ssd11) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,9 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080012002056 (ssd12) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,8 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080011002056 (ssd13) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,7 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b080010002056 (ssd14) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,6 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000f002056 (ssd15) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,5 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000e002056 (ssd16) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,4 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000d002056 (ssd17) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,3 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000c002056 (ssd18) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,2 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000b002056 (ssd19) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,1 is offline Load balancing: round-robin
Nov 13 17:41:14 eros genunix: [ID 834635 kern.info] /scsi_vhci/ssd@g000b08000a002056 (ssd20) multipath status: degraded, path /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fp0) to target address: w2300000b08043bd0,0 is offline Load balancing: round-robin
Nov 13 17:41:14 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b080013002056 (ssd11):
Nov 13 17:41:14 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Requested Block: 10687                     Error Block: 10687
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:41:14 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000f002056 (ssd15):
Nov 13 17:41:14 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Requested Block: 4477504                   Error Block: 4477504
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:41:14 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000a002056 (ssd20):
Nov 13 17:41:14 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Requested Block: 14095968                  Error Block: 14095968
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:41:14 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000c002056 (ssd18):
Nov 13 17:41:14 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Requested Block: 18464                     Error Block: 18464
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:14 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0


Nov 13 17:41:45 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000c002056 (ssd18):
Nov 13 17:41:45 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Requested Block: 14705712                  Error Block: 14705712
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 13 17:41:45 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000a002056 (ssd20):
Nov 13 17:41:45 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Requested Block: 11250352                  Error Block: 11250352
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:45 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0


Nov 13 17:41:50 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000f002056 (ssd15):
Nov 13 17:41:50 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:50 eros scsi: [ID 107833 kern.notice]      Requested Block: 6074176                   Error Block: 6074176
Nov 13 17:41:50 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:50 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:50 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0


Nov 13 17:41:57 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b080011002056 (ssd13):
Nov 13 17:41:57 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:41:57 eros scsi: [ID 107833 kern.notice]      Requested Block: 98460                     Error Block: 98460
Nov 13 17:41:57 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:41:57 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:41:57 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0



Nov 13 17:42:37 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b080011002056 (ssd13):
Nov 13 17:42:37 eros    Error for Command: write(10)               Error Level: Retryable
Nov 13 17:42:37 eros scsi: [ID 107833 kern.notice]      Requested Block: 16                        Error Block: 16
Nov 13 17:42:37 eros scsi: [ID 107833 kern.notice]      Vendor: Pillar                             Serial Number:            
Nov 13 17:42:37 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention
Nov 13 17:42:37 eros scsi: [ID 107833 kern.notice]      ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0


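PHASE - II Result:
   Everything continues to function fine.
   The log fills with retryable write(10) errors (Sense Key: Unit Attention,
   ASC 0x29) and the multipath status drops to "degraded", but the setup
   continues to work!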
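
To see how the errors are spread over the LUNs, the scsi warnings can be tallied per device. The sketch below pairs each "WARNING: /scsi_vhci/... (ssdNN):" line with the "Error for Command" line that follows it. The names DEV_RE, ERR_RE, count_errors and SAMPLE are illustrative, and the patterns assume the exact line layout pasted above.

```python
import re
from collections import Counter

# Header line of a scsi warning: captures the ssd instance name.
DEV_RE = re.compile(r"WARNING: /scsi_vhci/\S+ \((ssd\d+)\):")
# Detail line: captures the SCSI command and the error level.
ERR_RE = re.compile(r"Error for Command: (\S+)\s+Error Level: (\w+)")


def count_errors(lines):
    """Count (device, command, level) triples in syslog lines."""
    counts = Counter()
    current = None  # device named by the most recent WARNING header
    for line in lines:
        dev = DEV_RE.search(line)
        if dev:
            current = dev.group(1)
            continue
        err = ERR_RE.search(line)
        if err and current:
            counts[(current, err.group(1), err.group(2))] += 1
            current = None  # one detail line per header
    return counts


# Lines copied from the PHASE - II log.
SAMPLE = [
    "Nov 13 17:40:54 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000b002056 (ssd19):",
    "Nov 13 17:40:54 eros    Error for Command: write(10)               Error Level: Retryable",
    "Nov 13 17:40:54 eros scsi: [ID 107833 kern.notice]      Sense Key: Unit Attention",
    "Nov 13 17:40:54 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000d002056 (ssd17):",
    "Nov 13 17:40:54 eros    Error for Command: write(10)               Error Level: Retryable",
]

if __name__ == "__main__":
    for key, n in sorted(count_errors(SAMPLE).items()):
        print(key, n)
```

Any level other than "Retryable" in the result would be the first thing to look at.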

continued in the next section ...
[edited by: Vitaliy at 19:13 (CST) on Nov. 13, 2007]
8518
2007-NOV-13 19:09:24
## PHASE - III
##
## disconnect last functioning P1 port for the c1 unit (link from c1p1 to Brocade1)
## 

## this kills the connection to Pillar:

Nov 13 17:50:23 eros fctl: [ID 517869 kern.warning] WARNING: fp(0)::GPN_ID for D_ID=10100 failed
Nov 13 17:50:23 eros fctl: [ID 517869 kern.warning] WARNING: fp(0)::N_x Port with D_ID=10100, PWWN=2400000b08043bd0 disappeared from fabric


Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=a (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=9 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=8 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=7 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=6 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=5 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=4 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=3 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=2 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=1 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,qlc@1,1/fp@0,0 (fcp0):
Nov 13 17:50:43 eros    offlining lun=0 (trace=0), target=10100 (trace=2800004)
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b080013002056 (ssd11):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000b002056 (ssd19):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b080010002056 (ssd14):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000e002056 (ssd16):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000a002056 (ssd20):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000c002056 (ssd18):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u08 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u08 and run fsck(1M)
Nov 13 17:50:43 eros scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g000b08000d002056 (ssd17):
Nov 13 17:50:43 eros    transport rejected fatal error
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing ufs log
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u03 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u03 and run fsck(1M)
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u07 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u07 and run fsck(1M)
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u04 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u04 and run fsck(1M)
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u09 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u09 and run fsck(1M)
Nov 13 17:50:43 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:50:43 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u10 changed state to Error
Nov 13 17:50:43 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u10 and run fsck(1M)


Nov 13 17:51:20 eros ufs: [ID 702911 kern.warning] WARNING: Error writing master during ufs log roll
Nov 13 17:51:20 eros ufs: [ID 127457 kern.warning] WARNING: ufs log for /u05 changed state to Error
Nov 13 17:51:20 eros ufs: [ID 616219 kern.warning] WARNING: Please umount(1M) /u05 and run fsck(1M)
#



## and it also kills Oracle:

17:31:58 SYSTEM@AGP:eros> exec pillar_test_proc(10000) ;
BEGIN pillar_test_proc(10000); END;

*
ERROR at line 1:
ORA-03113: end-of-file on communication channel


17:50:13 SYSTEM@AGP:eros>


Tue Nov 13 17:50:43 2007
KCF: write/open error block=0x33203 online=1
     file=10 /u10/oradata/AGP/AGILE_INDX201AGP.ora
     error=27063 txt: 'SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 8192'
Tue Nov 13 17:50:43 2007
Errors in file /u01/app/oracle/admin/AGP/bdump/agp_lgwr_3220.trc:
ORA-00345: redo log write error block 282970 count 2048
ORA-00312: online log 2 thread 1: '/u03/oradata/AGP/log2AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 1048576
Tue Nov 13 17:50:43 2007
Errors in file /u01/app/oracle/admin/AGP/bdump/agp_ckpt_3222.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u10/oradata/AGP/ctl03AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u09/oradata/AGP/ctl02AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u08/oradata/AGP/ctl01AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
Tue Nov 13 17:50:43 2007
Errors in file /u01/app/oracle/admin/AGP/bdump/agp_ckpt_3222.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u10/oradata/AGP/ctl03AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u09/oradata/AGP/ctl02AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '/u08/oradata/AGP/ctl01AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 16384
Tue Nov 13 17:50:43 2007
CKPT: terminating instance due to error 221
Tue Nov 13 17:50:44 2007
Errors in file /u01/app/oracle/admin/AGP/bdump/agp_lgwr_3220.trc:
ORA-00340: IO error processing online log 2 of thread 1
ORA-00345: redo log write error block 282970 count 2048
ORA-00312: online log 2 thread 1: '/u03/oradata/AGP/log2AGP.ora'
ORA-27063: number of bytes read/written is incorrect
SVR4 Error: 5: I/O error
Additional information: -1
Additional information: 1048576
Instance terminated by CKPT, pid = 3222



PHASE - III Result:
   I left no available path for data to travel from
   host->brocade->Pillar so there's no surprise that
   everything came down.

continued in the next section ...
[edited by: Vitaliy at 19:20 (CST) on Nov. 13, 2007]
8519
2007-NOV-13 19:09:39
TEST I Conclusion:

   Our setup offers a fully redundant connection from
   our host all the way to Pillar:
   
      host/dual-HBA-card -> brocade0
                         -> brocade1
      
      brocade0 -> c0p0 (Pillar's CU0-Port-0)
               -> c1p0 (Pillar's CU1-Port-0)
      
      brocade1 -> c0p1 (Pillar's CU0-Port-1)
               -> c1p1 (Pillar's CU1-Port-1)
    
   PHASE - I:
      we cut host->brocade0
      data is routed via brocade1 to both Control Units via Port-1
      
   PHASE - II:
      we cut brocade1->c0p1
      there's no direct path to c0 so Pillar routes all data 
      destined to c0 via internal interconnect from c1->c0
   
   PHASE - III:
      we cut the last link brocade1->c1p1
      no path is left for data to travel from host to Pillar,
      so Sun offlines the LUNs and ORACLE goes down
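
The three phases can be replayed on paper with a small sketch. This is only a
model of the connection map, not something that was run during the test. The
link labels (h-b0, b1-c0p1 and so on) are names made up for the illustration,
it assumes a POSIX shell such as ksh, and it does not model Pillar's internal
c1->c0 interconnect; it only counts host-to-port paths.

```shell
#!/bin/sh
# Model of the TEST I connection map. Each entry is one end-to-end path,
# written as the two links it depends on: host-uplink:switch-to-Pillar-port.
# The link labels are invented for this sketch.
ALL_PATHS="h-b0:b0-c0p0 h-b0:b0-c1p0 h-b1:b1-c0p1 h-b1:b1-c1p1"

# surviving_paths CUT_LINK...
# Prints every path that uses none of the cut links.
surviving_paths() {
    for p in $ALL_PATHS; do
        alive=yes
        for c in "$@"; do
            case ":$p:" in
                *":$c:"*) alive=no ;;
            esac
        done
        if [ "$alive" = yes ]; then
            echo "$p"
        fi
    done
}

echo "PHASE I:";   surviving_paths h-b0
echo "PHASE II:";  surviving_paths h-b0 b1-c0p1
echo "PHASE III:"; surviving_paths h-b0 b1-c0p1 b1-c1p1
```

With all three links cut nothing is printed under PHASE III, which matches
the result above: no path left, everything comes down.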
   
   
   

##
## Fix test HOST (EROS) after TEST I
##


# ls -l /u02 /u03 /u04 /u05 /u06 /u07 /u08 /u09 /u10 /arch /copy
/u03: I/O error
/u04: I/O error
/u05: I/O error
/u07: I/O error
/u08: I/O error
/u09: I/O error
/u10: I/O error
/arch:
total 0

/copy:
total 0

/u02:
total 0

/u06:
total 0
#


# egrep "/u03|/u04|/u05|/u07|/u08|/u09|/u10" /etc/vfstab
/dev/dsk/c5t000B080010002056d0s2 /dev/rdsk/c5t000B080010002056d0s2 /u03  ufs 1 yes -
/dev/dsk/c5t000B08000E002056d0s2 /dev/rdsk/c5t000B08000E002056d0s2 /u04  ufs 1 yes -
/dev/dsk/c5t000B080013002056d0s2 /dev/rdsk/c5t000B080013002056d0s2 /u05  ufs 1 yes -
/dev/dsk/c5t000B08000A002056d0s2 /dev/rdsk/c5t000B08000A002056d0s2 /u07  ufs 1 yes -
/dev/dsk/c5t000B08000B002056d0s2 /dev/rdsk/c5t000B08000B002056d0s2 /u08  ufs 1 yes -
/dev/dsk/c5t000B08000C002056d0s2 /dev/rdsk/c5t000B08000C002056d0s2 /u09  ufs 1 yes -
/dev/dsk/c5t000B08000D002056d0s2 /dev/rdsk/c5t000B08000D002056d0s2 /u10  ufs 1 yes -
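
The egrep pattern above was put together by hand from the ls errors. A minimal
sketch of deriving it instead, assuming the error lines look exactly like the
ones shown ("/u03: I/O error"). The function name io_error_pattern is made up
for this example.

```shell
#!/bin/sh
# io_error_pattern: reads ls diagnostics on stdin and prints the failed
# mount points joined with "|" so the result can be handed to egrep.
io_error_pattern() {
    sed -n 's|^\(/[^:]*\): I/O error$|\1|p' | paste -s -d '|' -
}

# Sample input copied from the ls output above.
printf '%s\n' '/u03: I/O error' '/u04: I/O error' '/arch:' 'total 0' |
    io_error_pattern
```

On the host the input would come from the ls command itself with 2>&1 added,
since ls writes the I/O errors to stderr.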

I also mapped the target IDs to Pillar LUNs/CUs to see why only specific LUNs
were offlined. What I think is happening is this: when a LUN is accessed while
it is unavailable, Sun offlines it to avoid further damage. So it's just a
lottery: if you happen to access a particular file system, the underlying LUN
is taken offline; otherwise it stays online.

Target ID on Host         LUN Name on Pillar  Pillar CU Id
------------------------  ------------------- -------------
c5t000B080010002056d0s2   Eros-P-C1-L6        CU1 
c5t000B08000E002056d0s2   Eros-H-AU-L4        CU1
c5t000B080013002056d0s2   Eros-H-AU-L9        CU0
c5t000B08000A002056d0s2   Eros-M-C0-L0        CU0
c5t000B08000B002056d0s2   Eros-M-C1-L1        CU1
c5t000B08000C002056d0s2   Eros-M-C0-L2        CU0
c5t000B08000D002056d0s2   Eros-M-C1-L3        CU1
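
A quick tally of the table supports the lottery idea: the offlined LUNs are
spread over both Control Units rather than sitting behind just one of them.
The sketch below only counts the rows above per CU; the function names are
made up for the example.

```shell
#!/bin/sh
# Rows copied from the table above: device, Pillar LUN name, Pillar CU.
offlined_luns() {
cat <<'EOF'
c5t000B080010002056d0s2 Eros-P-C1-L6 CU1
c5t000B08000E002056d0s2 Eros-H-AU-L4 CU1
c5t000B080013002056d0s2 Eros-H-AU-L9 CU0
c5t000B08000A002056d0s2 Eros-M-C0-L0 CU0
c5t000B08000B002056d0s2 Eros-M-C1-L1 CU1
c5t000B08000C002056d0s2 Eros-M-C0-L2 CU0
c5t000B08000D002056d0s2 Eros-M-C1-L3 CU1
EOF
}

# count_by_cu: reads "device lun cu" rows on stdin, prints "CU count" per CU.
count_by_cu() {
    awk '{ n[$3]++ } END { for (cu in n) print cu, n[cu] }' | sort
}

offlined_luns | count_by_cu
```

This prints CU0 3 and CU1 4, so neither Control Unit was singled out.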


## unmount and fix affected file systems
##

umount /u03
umount /u04
umount /u05
umount /u07
umount /u08
umount /u09
umount /u10


## note below that /u08 and /u10 report
## FILE SYSTEM STATE IN SUPERBLOCK IS WRONG, which fsck fixes;
## a second fsck pass on each of them then comes back clean
##


# fsck /dev/rdsk/c5t000B080010002056d0s2
** /dev/rdsk/c5t000B080010002056d0s2
** Last Mounted on /u03
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
6 files, 614747 used, 3604099 free (11 frags, 450511 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000E002056d0s2
** /dev/rdsk/c5t000B08000E002056d0s2
** Last Mounted on /u04
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
5 files, 3073531 used, 5377162 free (10 frags, 672144 blocks, 0.0% fragmentation)
#
#
# fsck /dev/rdsk/c5t000B080013002056d0s2
** /dev/rdsk/c5t000B080013002056d0s2
** Last Mounted on /u05
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
9 files, 717267 used, 7733426 free (10 frags, 966677 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000A002056d0s2
** /dev/rdsk/c5t000B08000A002056d0s2
** Last Mounted on /u07
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
6 files, 4098043 used, 4352650 free (10 frags, 544080 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000B002056d0s2
** /dev/rdsk/c5t000B08000B002056d0s2
** Last Mounted on /u08
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups

FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? yes

7 files, 4107187 used, 4343506 free (10 frags, 542937 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000B002056d0s2
** /dev/rdsk/c5t000B08000B002056d0s2
** Last Mounted on /u08
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
7 files, 4107187 used, 4343506 free (10 frags, 542937 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000C002056d0s2
** /dev/rdsk/c5t000B08000C002056d0s2
** Last Mounted on /u09
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
7 files, 4189147 used, 4261546 free (10 frags, 532692 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000D002056d0s2
** /dev/rdsk/c5t000B08000D002056d0s2
** Last Mounted on /u10
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups

FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? yes

7 files, 4107187 used, 4343506 free (10 frags, 542937 blocks, 0.0% fragmentation)
#
# fsck /dev/rdsk/c5t000B08000D002056d0s2
** /dev/rdsk/c5t000B08000D002056d0s2
** Last Mounted on /u10
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3a - Check Connectivity
** Phase 3b - Verify Shadows/ACLs
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cylinder Groups
7 files, 4107187 used, 4343506 free (10 frags, 542937 blocks, 0.0% fragmentation)
#
#


## mount the file systems back
##

mount /u03
mount /u04
mount /u05
mount /u07
mount /u08
mount /u09
mount /u10
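
The umount / fsck / mount sequence above can be scripted. The sketch below
only prints the commands (a dry run) so they can be reviewed first, because
fsck may stop and ask FIX? as it did for /u08 and /u10. It looks up the raw
device in a vfstab-style file laid out like the egrep output earlier (block
device, raw device, mount point). The function names are invented for the
example, and it assumes a POSIX shell such as ksh.

```shell
#!/bin/sh
# raw_device_for VFSTAB MOUNTPOINT
# Prints the raw device (2nd column) of the vfstab line whose
# mount point (3rd column) matches.
raw_device_for() {
    while read -r blk raw mp rest; do
        if [ "$mp" = "$2" ]; then
            echo "$raw"
        fi
    done < "$1"
}

# plan_repair VFSTAB MOUNTPOINT...
# Prints, but does not run, the commands used in this post.
plan_repair() {
    vfstab=$1
    shift
    for m in "$@"; do
        dev=$(raw_device_for "$vfstab" "$m")
        if [ -z "$dev" ]; then
            echo "# $m: not found in $vfstab, skipped"
            continue
        fi
        echo "umount $m"
        echo "fsck $dev"
        echo "mount $m"
    done
}

# Demo against a two-line sample instead of the real /etc/vfstab.
sample=${TMPDIR:-/tmp}/vfstab.sample.$$
cat > "$sample" <<'EOF'
/dev/dsk/c5t000B080010002056d0s2 /dev/rdsk/c5t000B080010002056d0s2 /u03  ufs 1 yes -
/dev/dsk/c5t000B08000E002056d0s2 /dev/rdsk/c5t000B08000E002056d0s2 /u04  ufs 1 yes -
EOF
plan_repair "$sample" /u03 /u04
rm -f "$sample"
```

On the host the first argument would be /etc/vfstab followed by the seven
affected mount points. Any file system where fsck reports a fix should get a
second fsck pass before mounting, as was done above.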


# ls -l /u10
total 18
drwx------   2 root     root        8192 Nov  8 16:13 lost+found
drwxr-xr-x   3 oracle   dba          512 Nov 12 16:16 oradata

# ls -l /u08
total 18
drwx------   2 root     root        8192 Nov  8 16:12 lost+found
drwxr-xr-x   3 oracle   dba          512 Nov 12 16:16 oradata
#


# find /u10 -ls
    2    1 drwxr-xr-x   4 root     root          512 Nov 12 16:16 /u10
    3    8 drwx------   2 root     root         8192 Nov  8 16:13 /u10/lost+found
    4    1 drwxr-xr-x   3 oracle   dba           512 Nov 12 16:16 /u10/oradata
    5    1 drwxr-xr-x   2 oracle   dba           512 Nov 12 16:38 /u10/oradata/AGP
    6 9144 -rw-r-----   1 oracle   dba       9355264 Nov 13 17:49 /u10/oradata/AGP/ctl03AGP.ora
    7 2049016 -rw-r-----   1 oracle   dba      2097160192 Nov 13 17:49 /u10/oradata/AGP/AGILE_INDX201AGP.ora
    8 2049016 -rw-r-----   1 oracle   dba      2097160192 Nov 13 17:49 /u10/oradata/AGP/AGILE_INDX401AGP.ora
#
#
# find /u08 -ls
    2    1 drwxr-xr-x   4 root     root          512 Nov 12 16:16 /u08
    3    8 drwx------   2 root     root         8192 Nov  8 16:12 /u08/lost+found
    4    1 drwxr-xr-x   3 oracle   dba           512 Nov 12 16:16 /u08/oradata
    5    1 drwxr-xr-x   2 oracle   dba           512 Nov 12 16:38 /u08/oradata/AGP
    6 9144 -rw-r-----   1 oracle   dba       9355264 Nov 13 17:49 /u08/oradata/AGP/ctl01AGP.ora
    7 2049016 -rw-r-----   1 oracle   dba      2097160192 Nov 13 17:49 /u08/oradata/AGP/AGILE_DATA201AGP.ora
    8 2049016 -rw-r-----   1 oracle   dba      2097160192 Nov 13 17:49 /u08/oradata/AGP/AGILE_DATA401AGP.ora
#


## restart ORACLE
##

eros.AGP-> ps -ef | grep oracle
  oracle  2023     1   0   Nov 12 ?           0:04 /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr LISTENER -inherit
  oracle  2052   825   0   Nov 12 pts/2       0:00 -ksh
  oracle  2329   801   0   Nov 12 pts/1       0:00 -ksh
  oracle  3416  2052   0 17:47:33 pts/2       0:00 tail -f alert_AGP.log
  oracle  3482  2329   0 18:30:01 pts/1       0:00 grep oracle
  oracle  3481  2329   0 18:30:01 pts/1       0:00 ps -ef
eros.AGP-> sqlplus /nolog

SQL*Plus: Release 10.2.0.3.0 - Production on Tue Nov 13 18:30:03 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

SQL> connect / as sysdba
Connected to an idle instance.
SQL> startup
ORACLE instance started.

Total System Global Area 1694498816 bytes
Fixed Size                  2030616 bytes
Variable Size             637535208 bytes
Database Buffers         1040187392 bytes
Redo Buffers               14745600 bytes
Database mounted.
Database opened.
SQL>


The alert log shows that crash recovery completed cleanly:

   Tue Nov 13 18:30:20 2007
   ALTER DATABASE OPEN
   Tue Nov 13 18:30:20 2007
   Beginning crash recovery of 1 threads
    parallel recovery started with 2 processes
   Tue Nov 13 18:30:20 2007
   Started redo scan
   Tue Nov 13 18:30:30 2007
   Completed redo scan
    1601030 redo blocks read, 42805 data blocks need recovery
   Tue Nov 13 18:30:33 2007
   Started redo application at
    Thread 1: logseq 31, block 468530
   Tue Nov 13 18:30:33 2007
   Recovery of Online Redo Log: Thread 1 Group 3 Seq 31 Reading mem 0
     Mem# 0: /u02/oradata/AGP/log3AGP.ora
   Tue Nov 13 18:30:35 2007
   Recovery of Online Redo Log: Thread 1 Group 4 Seq 32 Reading mem 0
     Mem# 0: /u03/oradata/AGP/log4AGP.ora
   Tue Nov 13 18:30:49 2007
   Recovery of Online Redo Log: Thread 1 Group 1 Seq 33 Reading mem 0
     Mem# 0: /u02/oradata/AGP/log1AGP.ora
   Tue Nov 13 18:31:06 2007
   Recovery of Online Redo Log: Thread 1 Group 2 Seq 34 Reading mem 0
     Mem# 0: /u03/oradata/AGP/log2AGP.ora
   Tue Nov 13 18:31:14 2007
   Completed redo application
   Tue Nov 13 18:31:14 2007
   Completed crash recovery at
    Thread 1: logseq 34, block 282970, scn 1241643
    42805 data blocks read, 42805 data blocks written, 1601030 redo blocks read
   Tue Nov 13 18:31:14 2007
   LGWR: STARTING ARCH PROCESSES
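
One way to back up the word "clean" is to compare two numbers from the log:
the data blocks that needed recovery and the data blocks actually written.
A rough sketch of that check, assuming the alert log wording shown above and
a POSIX shell such as ksh. recovery_consistent is a made-up name, and matching
counts are only a sanity check, not proof that the database is intact.

```shell
#!/bin/sh
# recovery_consistent: reads alert-log text on stdin and succeeds when the
# number of data blocks needing recovery equals the number written.
recovery_consistent() {
    log=$(cat)
    need=$(printf '%s\n' "$log" |
        sed -n 's/.* \([0-9][0-9]*\) data blocks need recovery.*/\1/p' | tail -1)
    written=$(printf '%s\n' "$log" |
        sed -n 's/.* \([0-9][0-9]*\) data blocks written.*/\1/p' | tail -1)
    [ -n "$need" ] && [ "$need" = "$written" ]
}

# The two summary lines copied from the alert log above.
printf '%s\n' \
    '    1601030 redo blocks read, 42805 data blocks need recovery' \
    '    42805 data blocks read, 42805 data blocks written, 1601030 redo blocks read' |
    recovery_consistent && echo "counts match"
```

On the host the function would be fed the tail of alert_AGP.log instead of the
two sample lines.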
[edited by: Vitaliy at 18:32 (CST) on Nov. 14, 2007]