Recovering from quarterly PSU patching disaster

Applying GRID INFRASTRUCTURE PATCH SET UPDATE 11.2.0.4.181016 (Patch 28429134) for AIX (64-Bit) was a disaster for an Oracle Restart environment, as described in a previous post.

# $GRID_HOME/OPatch/opatch auto $GI_PSU -ocmrf /tmp/ocm.rsp

patch /stage/psu/11g/GI/2018OCT/28429134/27735020/custom/server/27735020  apply successful for home  /u01/app/oracle/product/11.2.0/db_1
patch /stage/psu/11g/GI/2018OCT/28429134/28204707  apply failed  for home  /u01/app/oracle/product/11.2.0/db_1

patch /stage/psu/11g/GI/2018OCT/28429134/27735020  apply successful for home  /u01/11.2.0/grid
patch /stage/psu/11g/GI/2018OCT/28429134/28204707  apply failed  for home  /u01/11.2.0/grid
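With several homes and sub-patches, it can help to pull only the failures out of a saved copy of the opatch auto output. The following is a minimal sketch, assuming the output was saved to a file and that each result line has the "patch <path> apply <status> for home <home>" shape shown above. The function name failed_patches is hypothetical and not part of OPatch.

```shell
#!/bin/sh
# Hypothetical helper (not part of OPatch): print "<home><TAB><patch dir>"
# for every "apply failed" line in a saved opatch auto output file.
failed_patches() {
    # Expected line shape, taken from the output above:
    #   patch <patch_dir>  apply failed  for home  <home>
    # awk splits on runs of blanks, so the double spaces do not matter.
    awk '$1 == "patch" && $3 == "apply" && $4 == "failed" { print $NF "\t" $2 }' "$1"
}
```

For example, failed_patches /tmp/opatch_auto.out (a hypothetical file name) would print one line per failed patch and home, and nothing at all if every apply succeeded.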

After the failed patching, the following commands no longer ran. A rollback was therefore not feasible, and we had to restore both the GI and DB homes from backup.

# crsctl config has
exec(): 0509-036 Cannot load program crsctl.bin because of the following errors:
rtld: 0712-001 Symbol ztca_Shutdown was referenced
      from module crsctl.bin(), but a runtime definition
      of the symbol was not found.
rtld: 0712-001 Symbol ztpk_SetKeyInfo was referenced
      from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.
rtld: 0712-001 Symbol ztpk_DestroyKey was referenced
      from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.
rtld: 0712-001 Symbol ztpk_Sign was referenced
      from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.
rtld: 0712-001 Symbol ztpk_Verify was referenced
      from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.
rtld: 0712-002 fatal error: exiting.

$ srvctl config database -d $ORACLE_SID
exec(): 0509-036 Cannot load program getcrshome because of the following errors:
rtld: 0712-001 Symbol ztpk_SetKeyInfo was referenced
      from module /u01/app/oracle/product/11.2.0/db_1/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.
rtld: 0712-001 Symbol ztpk_DestroyKey was referenced
      from module /u01/app/oracle/product/11.2.0/db_1/lib/libhasgen11.so(), but a runtime definition
      of the symbol was not found.

Below are the high-level steps we used to restore, reconfigure, and relink the GI and DB homes. They may vary by environment.

Restore GI HOME:
cd $GRID_HOME
cp /backup/GIHome.tar .
tar -xvpf GIHome.tar
$GRID_HOME/crs/install/roothas.pl -unlock
cd $GRID_HOME/rdbms/lib/
ls -l config.o
mv config.o config.o.bkup
$GRID_HOME/bin/relink all

# ./rootadd_rdbms.sh
# ./roothas.pl -patch

Restore DB HOME:
cd $ORACLE_HOME
cp /backup/DBHome.tar .
tar -xvpf DBHome.tar
$ORACLE_HOME/bin/relink all
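Once both homes are restored and relinked, the two commands that failed after patching make a natural smoke test. The sketch below wraps them in a hypothetical function, check_homes, which takes the GI home, the DB home, and the database name as arguments. It only reports whether each command exits cleanly and is not a full health check.

```shell
#!/bin/sh
# Hypothetical post-restore check: rerun the two commands that failed
# after patching and report OK/FAIL for each.
# Arguments: <GI home> <DB home> <database name>
check_homes() {
    gi_home=$1
    db_home=$2
    db_name=$3
    rc=0

    # "crsctl config has" failed to load crsctl.bin after the bad patch.
    if "$gi_home/bin/crsctl" config has >/dev/null 2>&1; then
        echo "OK: crsctl config has"
    else
        echo "FAIL: crsctl config has"
        rc=1
    fi

    # "srvctl config database" failed to load getcrshome from the DB home.
    if ORACLE_HOME=$db_home "$db_home/bin/srvctl" config database -d "$db_name" >/dev/null 2>&1; then
        echo "OK: srvctl config database"
    else
        echo "FAIL: srvctl config database"
        rc=1
    fi

    return $rc
}
```

A call such as check_homes "$GRID_HOME" "$ORACLE_HOME" "$ORACLE_SID" returns non-zero if either command still fails, so it can gate any follow-up steps.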

In conclusion, don’t depend solely on rollback to recover from failed patching. It is a good idea to back up the GI and DB home binaries before patching.
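A pre-patch backup along these lines might look like the sketch below. It assumes the archive is created from inside the home, which matches the restore steps above where the tar file is extracted in $GRID_HOME or $ORACLE_HOME. It also assumes the backup directory is an absolute path outside the home. The function name backup_home is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: archive a home directory before patching and
# confirm that the archive can be read back.
# Arguments: <home dir> <backup dir (absolute, outside the home)> <label>
backup_home() {
    home=$1
    dest=$2
    label=$3
    archive="$dest/$label.tar"

    # Archive relative to the home so it can be extracted in place later.
    # -p preserves permissions; running as root avoids skipping
    # root-owned files.
    (cd "$home" && tar -cpf "$archive" .) || return 1

    # Listing the archive is a cheap readability check before patching.
    tar -tf "$archive" >/dev/null || return 1

    echo "$archive"
}
```

For example, backup_home "$GRID_HOME" /backup GIHome would produce /backup/GIHome.tar, the file name used in the restore steps.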

Some useful references:
How To Relink The Oracle Grid Infrastructure Standalone (Restart) Installation Or Oracle Grid Infrastructure RAC/Cluster Installation (11.2 or 12c) (Doc ID 1536057.1)
How to backup a Grid Infrastructure installation (Doc ID 1482803.1)
Relinking Oracle Home FAQ (Frequently Asked Questions) (Doc ID 1467060.1)

It’s nice to see that Oracle has updated its note and that Patch 28429134 has been withdrawn.

Master Note for Database Proactive Patch Program (Doc ID 756671.1)
GI PSU 11.2.0.4.181016 Patch 28429134
All except HP-UX PA-RISC (64-Bit), AIX (64-Bit)
ETA: 13-November-2018


5 Comments

Hi Michael,

Binary restore after patch failure can be pretty challenging at times for DBAs. I would like to add one more point when backing up $OH and $GI, i.e., further gzipping the tarball to save some space.

tar -cf - ${target1} | gzip > ${bkp_loc}/ORACLE_HOME_Bkp_${db}_${date}.tar.gz

Regards,
Maaz Khan

Michael Dinh
March 13, 2019 8:45 am

Thank you Maaz for sharing.

Here is the updated version I am planning to use.

nohup tar -czvpf /tmp/backup_`echo $ORACLE_HOME | awk -F/ '{print $NF}'`_"$(hostname -s)".tar.gz . > /tmp/backup_`echo $ORACLE_HOME | awk -F/ '{print $NF}'`_"$(hostname -s)".log 2>&1 &

Please try it out and let me know what you think.


Hi Michael,

I am liking it too. :)

Just tried it recently. The .log file gives a clearer way of tracking the files being backed up and reviewing any failures, especially permission issues when backing up grid binaries.

Regards,
Maaz

Michael Dinh
May 24, 2019 8:24 am

Hello Maaz,

It might be better to run the backup as root to avoid permission issues. I have found instances where the DB home has items owned by root. Use tar -czvpf (p preserves permissions).

-Michael.


Hi Michael,

Yes, we can always have these commands run as the root user (by asking the OS admins to run them for us). Your command will help me get a log file and review it, just to be sure the grid binaries are backed up before I go ahead with patching.

Regards,
Maaz
