The other day I did a switchover in a RAC environment , which went pretty smooth , but after doing the switchover in the primary, I kept getting the following error:
select dest_name,status,error from gv$archive_dest_status where dest_id=2;
DEST_NAME
--------------------------------------------------------------------------------
STATUS ERROR
--------- -----------------------------------------------------------------
LOG_ARCHIVE_DEST_2
ERROR ORA-01033: ORACLE initialization or shutdown in progress
LOG_ARCHIVE_DEST_2
ERROR ORA-01033: ORACLE initialization or shutdown in progress
I went and checked the standby, and saw the standby was in recover mode and waiting for the redo log
PROCESS STATUS CLIENT_P CLIENT_PID THREAD# SEQUENCE# BLOCK# ACTIVE_AGENTS KNOWN_AGENTS
------- ------------ -------- ---------- ---------------- ---------------- ---------------- ---------------- ----------------
ARCH CONNECTED ARCH 44474 0 0 0 0 0
RFS IDLE ARCH 133318 0 0 0 0 0
RFS IDLE ARCH 50602 0 0 0 0 0
ARCH CLOSING ARCH 44470 1 21623 14336 0 0
ARCH CLOSING ARCH 44476 1 21624 1 0 0
ARCH CLOSING ARCH 44472 2 19221 96256 0 0
RFS IDLE LGWR 133322 1 21625 17157 0 0
RFS IDLE LGWR 50620 2 19222 36611 0 0
MRP0 WAIT_FOR_LOG N/A N/A 2 19222 36617 0 0
My first train of thought was that the password file was incorrect, so I recreated them and copied them from the primary to the standby nodes, but I still kept getting the same error. I reviewed the environment with the scripts in DOC ID 1581388.1 and everything seemed alright. It really kept bugging me that the logs were not being applied even though the logs were being shipped to the standby (so it did have to do with the password file), but what really bothered me, was that I had just recreated the password file in $ORACLE_HOME/dbs and I still kept getting the same error.
So after a while of troubleshooting, I found that in the new primary the password file was residing in an ASM Diskgroup, and that was the main culprit. This meant that I had to copy the password file from the ASM diskgroup in the primary to the standby.
Primary
[[email protected] trace]$ srvctl config database -d renedb
Database unique name: renedb
Database name:
Oracle home: /u01/app/oracle/product/12.1.0.2/db_1
Oracle user: oracle
Spfile: +DATA1/renedb/spfilerenedb.ora
Password file: +DATA1/renedb/PASSWORD/pwrenedb
Domain:
Start options: open
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Server pools:
Disk Groups: ARCH1,DATA1,REDO
Mount point paths:
Services:
Type: RAC
Start concurrency:
Stop concurrency:
OSDBA group: dba
OSOPER group: oper
Database instances: renedb1,renedb2
Configured nodes: localhost,localhost
Database is administrator managed
[[email protected] trace]$ exit
-bash-4.1$ sudo su - grid
[sudo] password for pythian:
[[email protected] ~]$ . oraenv
ORACLE_SID = [+ASM1] ?
The Oracle base remains unchanged with value /u01/app/grid
[[email protected] ~]$ asmcmd
ASMCMD> pwcopy +DATA1/renedb/PASSWORD/pwrenedb /tmp/pwrenedb
copying +DATA1/renedb/PASSWORD/pwrenedb -> /tmp/pwrenedb
ASMCMD> exit
Standby
[[email protected] dbs]$ scp 10.10.0.1:/tmp/pwrenedb /tmp/pwrenedb_stby
pwrenedb_stby_phdb 100% 7680 7.5KB/s 00:00
[[email protected] dbs]$ exit
logout
[[email protected] ~]$ sudo su - grid
[sudo] password for pythian:
Last login: Fri Mar 31 21:55:53 MST 2017
[[email protected] ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM1
The Oracle base has been set to /u01/app/grid
[[email protected] ~]$ asmcmd
ASMCMD> mkdir DATA/RENEDB/PASSWORD
ASMCMD> pwcopy /tmp/pwrenedb_stby_phdb +DATA/RENEDB/PASSWORD/pwrenedb_stby
copying /tmp/pwrenedb_stby_phdb -> +DATA/RENEDB/PASSWORD/pwrenedb_stby
ASMCMD> exit
[[email protected] ~]$ exit
logout
[[email protected] ~]$ sudo su - oracle
Last login: Sat Apr 1 01:35:46 MST 2017 on pts/4
The Oracle base has been set to /u01/app/oracle
[[email protected] dbs]$ srvctl modify database -d renedb_stby -pwfile +DATA/RENEDB/PASSWORD/pwrenedb_stby
[[email protected] dbs]$ srvctl config database -d renedb_stby
Database unique name: renedb_stby
Database name:
Oracle home: /u01/app/oracle/product/12.1.0/dbhome_1
Oracle user: oracle
Spfile: /u01/app/oracle/product/12.1.0/dbhome_1/dbs/spfilerenedb_stby.ora
Password file: +DATA/RENEDB/PASSWORD/pwrenedb_stby
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools:
Disk Groups: ARCH,DATA,REDO
Mount point paths:
Services:
Type: RAC
Start concurrency:
Stop concurrency:
OSDBA group: dba
OSOPER group: oper
Database instances: renedb_stby1,renedb_stby2
Configured nodes: *******,***********
Database is administrator managed
Once I did this, the standby started applying the redo logs and after the gap was closed the Primary switchover status was “TO STANDBY”
Primary
Primary Site last generated SCN
*******************************
DB_UNIQUE_NAME SWITCHOVER_STATUS CURRENT_SCN
--------------- -------------------- ----------------
renedb TO STANDBY 134480468945
Standby
Data Guard Apply Lag
********************
NAME LAG_TIME DATUM_TIME TIME_COMPUTED
------------ -------------------- -------------------- --------------------
apply lag +00 00:00:00 04/01/2017 04:05:51 04/01/2017 04:05:52
1 row selected.
Data Guard Gap Problems
***********************
no rows selected
PROCESS STATUS CLIENT_P CLIENT_PID THREAD# SEQUENCE# BLOCK# ACTIVE_AGENTS KNOWN_AGENTS
------- ------------ -------- ---------- ---------------- ---------------- ---------------- ---------------- ----------------
ARCH CONNECTED ARCH 44474 0 0 0 0 0
RFS IDLE ARCH 133318 0 0 0 0 0
RFS IDLE ARCH 50602 0 0 0 0 0
ARCH CLOSING ARCH 44470 1 21623 14336 0 0
ARCH CLOSING ARCH 44476 1 21624 1 0 0
ARCH CLOSING ARCH 44472 2 19221 96256 0 0
RFS IDLE LGWR 133322 1 21625 17157 0 0
RFS IDLE LGWR 50620 2 19222 36611 0 0
MRP0 APPLYING_LOG N/A N/A 2 19222 36617 33 33
9 rows selected.
Conclusion
In 12.1 it is recommended as per DOC ID 1984091.1, to have the password file in ASM diskgroups. So once I did this, I was able to workaround error ORA-01033 and able to sleep well!
Note.- This was originally published in rene-ace.com
4 Comments. Leave new
Will the asm remote cp syntax only work with ASM instances in the same cluster? i.e. is that why you had to do a file system copy on both nodes?
No real reason as to why using I used file system copy instead of remote ASM , you can use ASM remote cp if you can
Hi,
great article.
If you use the “–dbuniquename” option for the pwcopy, then you don’t need to use “srvctl modify database”, because the resource will be automatically updated.
asmcmd pwcopy –dbuniquename RENEDB_STBY /tmp/pwrenedb_stby_phdb +DATA/RENEDB/PASSWORD/pwrenedb_stby
Hi Dejan
Thank you for the great suggestion I will try it out next time I have to do it :)
Regards
René