RAC with ASM on AIX: CSS Initialization wait event, ORA-01114 error


Check the Oracle bug: OCSSD threads are not set to the correct priority.
[Oracle document ID: 1493943.1]
The OCSSD.BIN threads must run at real-time priority.


SOLUTION:
1. Bug 13940331, fixed in 11.2.0.4: request/apply patch 13940331 if it affects business.
2. Bug 16586971, fixed in 12.1.0.2: request/apply patch 13940331 if it affects business.

To check the current priority of the OCSSD threads, list them with ps and read the pri column (the fifth field):

[on AIX]
/usr/sysv/bin/ps -eLo  user,s,pid,lwp,pri,args | grep ocss 
       grid S   6947068  16187409  60 /data/grid/bin/ocssd.bin  
       grid S   6947068  25690307   0 /data/grid/bin/ocssd.bin  


The first thread runs at priority 60, which is not a real-time priority.
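The check above can be scripted. The following is a minimal sketch, not an Oracle-supplied tool: check_ocssd_pri is a hypothetical helper name, and the expected priority of 0 is an assumption taken from the second thread in the sample output. Confirm the correct value against the Oracle document before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: report ocssd.bin threads whose priority differs
# from the expected value. Reads the output of
#   ps -eLo user,s,pid,lwp,pri,args
# on stdin (fields: user, state, pid, lwp, pri, command).
# The default expected priority (0) is an assumption, not a documented constant.
check_ocssd_pri() {
    expected="${1:-0}"
    awk -v want="$expected" '
        # Field 6 is the command path; match only the ocssd.bin binary.
        $6 ~ /ocssd\.bin$/ {
            if ($5 != want) {
                printf "lwp %s pri %s\n", $4, $5
                bad = 1
            }
        }
        # Exit non-zero if any thread had an unexpected priority.
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage on AIX: /usr/sysv/bin/ps -eLo user,s,pid,lwp,pri,args | check_ocssd_pri
Each printed line names a thread (lwp) that is not at the expected priority; no output and exit status 0 mean every ocssd.bin thread matched.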

1. Our Environment

  • 11g RAC (with asm) R2 on AIX
  • version : 11.2.0.3

2. Our Symptoms

  • Sessions wait for a long time on the “CSS Initialization” wait event
  • ORA-01114 is raised
  • Database alert log
    Tue Oct 02 16:02:39 2012
    Errors in file mydb_ora_55707154.trc:
    ORA-01114: IO error writing block to file (block # )
  • Trace file
    2012-09-02 16:02:39.409: [ CSSCLNT]clssscConnect: gipcWait failed with 16 (12)
    2012-09-02 16:02:39.409: [ CSSCLNT]clsssInitNative: connect to (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=61100)) failed, rc 16
    kgxgncin: CLSS init failed with status 3
    kgxgncin: return status 3 (1311719766 SKGXN not av) from CLSS
    kjfmsgr: unable to connect to NM for reg in shared group
    ORA-01114 .....
  • ocssd.log
    2012-10-02 16:01:25.645: [GIPCXCPT][1029] gipcmodClscCallback: async request failed req 122591870 [0000000000737522] { gipcSendRequest : addr '', data 122174250, len 48, olen 0, parentEndp 128bd5250, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x224 }, ret gipcretConnectionLost (12)
    2012-10-02 16:01:25.648: [GIPCXCPT][1029] gipcmodMuxTransferAccept: internal accept request failed endp 111274cd0, child 128bd5250, ret gipcretConnectionInvalid (13)
    2012-10-02 16:01:25.648: [ GIPCMUX][1029] gipcmodMuxTransferAccept: EXCEPTION[ ret gipcretConnectionInvalid (13) ] error during accept on endp 111274cd0
    2012-10-02 16:01:25.649: [GIPCXCPT][1029] gipcmodClscCallback: async request failed req 1221466b0 [000000000073754d] { gipcSendRequest : addr '', data 1157026f0, len 48, olen 0, parentEndp 128eeae10, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x224 }, ret gipcretConnectionLost (12)
    2012-10-02 16:01:25.650: [GIPCXCPT][1029] gipcmodMuxTransferAccept: internal accept request failed endp 111274cd0, child 128eeae10, ret gipcretConnectionInvalid (13)
    2012-10-02 16:01:25.650: [ GIPCMUX][1029] gipcmodMuxTransferAccept: EXCEPTION[ ret gipcretConnectionInvalid (13) ] error during accept on endp 111274cd0
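
To see how often each GIPC return code appears in a log such as the ocssd.log excerpt above, a small filter can tally them. This is only a sketch: count_gipc_rets is a hypothetical helper name, and the pattern is inferred from the log lines shown here rather than from any documented log format.

```shell
#!/bin/sh
# Hypothetical helper: count occurrences of GIPC return codes such as
# "gipcretConnectionLost (12)" in log text read from stdin.
# The pattern is inferred from the sample log lines, not from Oracle documentation.
count_gipc_rets() {
    awk '{
        line = $0
        # Consume every match on the line, not just the first one.
        while (match(line, /gipcret[A-Za-z]+ \([0-9]+\)/)) {
            count[substr(line, RSTART, RLENGTH)]++
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END { for (k in count) printf "%d %s\n", count[k], k }' | sort -rn
}
```

Usage: count_gipc_rets < ocssd.log
The output lists one return code per line, prefixed by its count, with the most frequent first.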