How I Finished a GI OOP Patch From 19.6 to 19.8 After Hitting "cluutil: No Such File or Directory" and CLSRSC-740 Errors

Posted in: Oracle, Technical Track

This past weekend I did an out-of-place (OOP) patch of a production grid infrastructure (GI) from 19.6 to 19.8 for a client. During this exercise, I hit a couple of bugs (20785766 and 27554103).

This post explains how I solved them. I hope it saves you a lot of time if you ever face these issues.

As I have already blogged in the past about how to do a GI OOP, I won’t go into all the details of the process; I will address only the ones relevant to today’s post.

I ran the switchGridHome from 19.6 to 19.8 without any issues and successfully ran root.sh on node1.

[grid@hostname1 grid]$ ./gridSetup.sh -switchGridHome -silent
Launching Oracle Grid Infrastructure Setup Wizard...

You can find the log of this install session at:
 /u01/app/oraInventory/logs/cloneActions2020-11-20_09-10-17PM.log


As a root user, execute the following script(s):
        1. /u01/app/19.8.0.0/grid/root.sh

Execute /u01/app/19.8.0.0/grid/root.sh on the following nodes:
[hostname1, hostname2]

Run the scripts on the local node first. After successful completion, run the scripts in sequence on all other nodes.

Successfully Setup Software.
...
[root@hostname1 ~]# /u01/app/19.8.0.0/grid/root.sh
Check /u01/app/19.8.0.0/grid/install/root_oracle-db01-s01_2020-11-20_21-13-24-032842094.log for the output of root script

When I ran root.sh on node2, I ran into the error "The CRS executable file ‘clsecho’ does not exist." I checked, and indeed the file didn’t exist in GI_HOME/bin. Comparing node1 and node2, node2 was missing about 100 files from this directory.

[root@hostname2 ~]$ /u01/app/19.8.0.0/grid/root.sh
Check /u01/app/19.8.0.0/grid/install/root_hostname2_2020-11-21_03-53-47-360707303.log for the output of root script

[root@hostname2 ~]$ tail /u01/app/19.8.0.0/grid/install/root_hostname2_2020-11-21_03-53-47-360707303.log
2020-11-20 21:42:27: The 'ROOTCRS_PREPATCH' is either in START/FAILED state
2020-11-20 21:42:27:  The CRS executable file /u01/app/19.8.0.0/grid/bin/cluutil either does not exist or is not executable
2020-11-20 21:42:27: Invoking "/u01/app/19.8.0.0/grid/bin/cluutil -ckpt -oraclebase /u01/app/oracle -chkckpt -name ROOTCRS_PREPATCH -status"
2020-11-20 21:42:27: trace file=/u01/app/oracle/crsdata/hostname2/crsconfig/cluutil3.log
2020-11-20 21:42:27: Running as user grid: /u01/app/19.8.0.0/grid/bin/cluutil -ckpt -oraclebase /u01/app/oracle -chkckpt -name ROOTCRS_PREPATCH -status
2020-11-20 21:42:27: Removing file /tmp/X9bxqSWx3c
2020-11-20 21:42:27: Successfully removed file: /tmp/X9bxqSWx3c
2020-11-20 21:42:27: pipe exit code: 32512
2020-11-20 21:42:27: /bin/su exited with rc=127

2020-11-20 21:42:27: bash: /u01/app/19.8.0.0/grid/bin/cluutil: No such file or directory

2020-11-20 21:42:27:  The CRS executable file /u01/app/19.8.0.0/grid/bin/clsecho either does not exist or is not executable
2020-11-20 21:42:27: The CRS executable file 'clsecho' does not exist.
2020-11-20 21:42:27: ###### Begin DIE Stack Trace ######
2020-11-20 21:42:27:     Package         File                 Line Calling
2020-11-20 21:42:27:     --------------- -------------------- ---- ----------
2020-11-20 21:42:27:  1: main            rootcrs.pl            357 crsutils::dietrap
2020-11-20 21:42:27:  2: crspatch        crspatch.pm          2815 main::__ANON__
2020-11-20 21:42:27:  3: crspatch        crspatch.pm          2203 crspatch::postPatchRerunCheck
2020-11-20 21:42:27:  4: crspatch        crspatch.pm          2015 crspatch::crsPostPatchCkpts
2020-11-20 21:42:27:  5: crspatch        crspatch.pm           394 crspatch::crsPostPatch
2020-11-20 21:42:27:  6: main            rootcrs.pl            370 crspatch::new
2020-11-20 21:42:27: ####### End DIE Stack Trace #######

2020-11-20 21:42:27:  checkpoint has failed
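One detail worth noting in the log above: /bin/su exited with rc=127. Exit status 127 is the shell’s conventional "command not found" code, which is consistent with the "No such file or directory" message for cluutil. A quick, self-contained illustration (the path is made up):

```shell
#!/bin/sh
# Invoking a path that does not exist makes the shell return 127,
# the conventional "command not found" exit status.
/no/such/path/cluutil 2>/dev/null
rc=$?
echo "exit code: $rc"   # prints: exit code: 127
```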

########################################################################
## Difference of Number of files between node1 and node2
########################################################################
[root@hostname1 ~]$ ls -ltr /u01/app/19.8.0.0/grid/bin | wc -l
405
[root@hostname2 ~]$ ls -ltr /u01/app/19.8.0.0/grid/bin | wc -l
303
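The raw counts show that files are missing, but not which ones. Diffing the sorted file lists with comm pinpoints them exactly. A minimal sketch, using temp directories to stand in for the two nodes’ GI_HOME/bin (in practice you would capture the ls output from each node):

```shell
#!/bin/sh
# Simulate the bin directories of two nodes with temp dirs.
node1=$(mktemp -d); node2=$(mktemp -d)
touch "$node1/clsecho" "$node1/cluutil" "$node1/crsctl"
touch "$node2/crsctl"   # node2 is missing clsecho and cluutil

# comm -23 prints lines present only in the first (sorted) list,
# i.e. the files node2 is missing.
ls "$node1" | sort > /tmp/node1.lst
ls "$node2" | sort > /tmp/node2.lst
comm -23 /tmp/node1.lst /tmp/node2.lst
```

Running this prints clsecho and cluutil: the exact files that need to be copied over.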

The first thing I did after the failure was to check the status of the cluster with Fred Denis’s rac_status script. I found that everything was up and that the cluster was in ROLLING PATCH mode. CRS was running the 19.8 version on node1 and the 19.6 version on node2.

[grid@hostname1 antunez]$ ./rac_status.sh -a

                Cluster rene-ace-cluster

        Type      |      Name      |      hostname1    |      hostname2      |
  ---------------------------------------------------------------------------
   MGMTLSNR       | MGMTLSNR       |       Online       |          -         |
   asm            | asm            |       Online       |       Online       |
   asmnetwork     | asmnet1        |       Online       |       Online       |
   chad           | chad           |       Online       |       Online       |
   cvu            | cvu            |          -         |       Online       |
   dg             | ORAARCH        |       Online       |       Online       |
   dg             | ORACRS         |       Online       |       Online       |
   dg             | ORADATA        |       Online       |       Online       |
   dg             | ORAFLASHBACK   |       Online       |       Online       |
   dg             | ORAREDO        |       Online       |       Online       |
   network        | net1           |       Online       |       Online       |
   ons            | ons            |       Online       |       Online       |
   qosmserver     | qosmserver     |          -         |       Online       |
   vip            | hostname1      |       Online       |          -         |
   vip            | hostname2      |          -         |       Online       |
   vip            | scan1          |       Online       |          -         |
   vip            | scan2          |          -         |       Online       |
   vip            | scan3          |          -         |       Online       |
  ---------------------------------------------------------------------------
    x  : Resource is disabled
       : Has been restarted less than 24 hours ago
       : STATUS and TARGET are different

      Listener    |      Port      |     hostname1      |      hostname2     |     Type     |
  ------------------------------------------------------------------------------------------
   ASMNET1LSNR_ASM| TCP:1526       |       Online       |       Online       |   Listener   |
   LISTENER       | TCP:1521,1525  |       Online       |       Online       |   Listener   |
   LISTENER_SCAN1 | TCP:1521,1525  |       Online       |          -         |     SCAN     |
   LISTENER_SCAN2 | TCP:1521,1525  |          -         |       Online       |     SCAN     |
   LISTENER_SCAN3 | TCP:1521,1525  |          -         |       Online       |     SCAN     |
  ------------------------------------------------------------------------------------------
       : Has been restarted less than 24 hours ago

         DB       |     Version    |      hostname1     |      hostname2     |    DB Type   |
  ------------------------------------------------------------------------------------------
   mgm            |            (2) |        Open        |         -          |  MGMTDB (P)  |
   prod           | 12.1.0     (1) |        Open        |        Open        |    RAC (P)   |
  ------------------------------------------------------------------------------------------
  ORACLE_HOME references listed in the Version column ("''" means "same as above")

         1 : /u01/app/oracle/product/12.1.0/db_1        oracle oinstall
         2 : %CRS_HOME%                                 grid      ''

       : Has been restarted less than 24 hours ago
       : STATUS and TARGET are different

[grid@hostname1 antunez]$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [2701864972].

I found MOS note "Grid Infrastructure root script (root.sh etc) fails as remote node missing binaries" (Doc ID 1991928.1). It explains that a bug in the 12.1 GI installer (20785766) can leave files missing from GI_HOME/bin and/or GI_HOME/lib. Even though the document mentions 12.1, I hit it with 19.8. It matched my issue, so I did what the note says:

“… the workaround is to manually copy missing files from the node where installer was started and re-run root script.”

I excluded the soft link lbuilder, as it had already been created on the second node. I also changed the ownership of the GI_HOME/bin files on node2 to root:oinstall.

########################################################################
## From node2
########################################################################
[root@hostname2 bin]# ls -al | grep "lbuilder"
lrwxrwxrwx.  1 grid oinstall        24 Nov 20 21:10 lbuilder -> ../nls/lbuilder/lbuilder

########################################################################
## From node1
########################################################################
[root@hostname1 ~]$ cd /u01/app/19.8.0.0/grid/bin 
[root@hostname1 ~]$ find . ! -name "lbuilder" | xargs -i scp {} hostname2:/u01/app/19.8.0.0/grid/bin

########################################################################
## Difference of Number of files between node1 and node2
########################################################################
[root@hostname1 ~]$ ls -ltr /u01/app/19.8.0.0/grid/bin | wc -l
405
[root@hostname2 ~]$ ls -ltr /u01/app/19.8.0.0/grid/bin | wc -l
405

########################################################################
## Changed the ownership to root:oinstall in hostname2
########################################################################
[root@hostname2 ~]$ cd /u01/app/19.8.0.0/grid/bin 
[root@hostname2 bin]$ chown root:oinstall ./*

Now that I had copied the files, I relinked the GI_HOME on node2, following this documentation note, because the sticky bits were lost with the scp.

A few notes on the relink in this situation:

  1. As the active GI binaries in node2 were still from the 19.6 GI_HOME, I didn’t need to run rootcrs.sh -unlock.
  2. I didn’t run rootadd_rdbms.sh, as this runs as part of the /u01/app/19.8.0.0/grid/root.sh which I was going to rerun after the fix above.
  3. Similar to point 1, I didn’t run rootcrs.sh -lock.

[grid@hostname2 ~]$ export ORACLE_HOME=/u01/app/19.8.0.0/grid
[grid@hostname2 ~]$ $ORACLE_HOME/bin/relink
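As a side note on why the relink was needed: a plain file copy drops the setuid/setgid bits that some GI binaries depend on. This can be demonstrated locally with cp standing in for the scp hop between nodes (the file name and modes are purely illustrative):

```shell
#!/bin/sh
umask 022
work=$(mktemp -d)

# Create a file with setuid/setgid bits set, like some GI executables.
touch "$work/orig"
chmod 6755 "$work/orig"

cp "$work/orig" "$work/plain"     # plain copy: special bits are cleared (755)
cp -p "$work/orig" "$work/kept"   # -p preserves the full mode (6755)

stat -c '%a %n' "$work/orig" "$work/plain" "$work/kept"
```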

After the relink, I reran /u01/app/19.8.0.0/grid/root.sh on node2. This time I received a new error: "CLSRSC-740: inconsistent options specified to the postpatch command."

[root@hostname2 ~]$ /u01/app/19.8.0.0/grid/root.sh
Check /u01/app/19.8.0.0/grid/install/crs_postpatch_hostname2_2020-11-20_11-39-26PM.log for the output of root script

[root@hostname2 ~]$ tail /u01/app/19.8.0.0/grid/install/crs_postpatch_hostname2_2020-11-20_11-39-26PM.log

2020-11-20 23:39:28: NONROLLING=0

2020-11-20 23:39:28: Succeeded to get property value:NONROLLING=0

2020-11-20 23:39:28: Executing cmd: /u01/app/19.8.0.0/grid/bin/clsecho -p has -f clsrsc -m 740
2020-11-20 23:39:28: Executing cmd: /u01/app/19.8.0.0/grid/bin/clsecho -p has -f clsrsc -m 740
2020-11-20 23:39:28: Command output:
>  CLSRSC-740: inconsistent options specified to the postpatch command
>End Command output
2020-11-20 23:39:28: CLSRSC-740: inconsistent options specified to the postpatch command
2020-11-20 23:39:28: ###### Begin DIE Stack Trace ######
2020-11-20 23:39:28:     Package         File                 Line Calling
2020-11-20 23:39:28:     --------------- -------------------- ---- ----------
2020-11-20 23:39:28:  1: main            rootcrs.pl            357 crsutils::dietrap
2020-11-20 23:39:28:  2: crspatch        crspatch.pm          2212 main::__ANON__
2020-11-20 23:39:28:  3: crspatch        crspatch.pm          2015 crspatch::crsPostPatchCkpts
2020-11-20 23:39:28:  4: crspatch        crspatch.pm           394 crspatch::crsPostPatch
2020-11-20 23:39:28:  5: main            rootcrs.pl            370 crspatch::new
2020-11-20 23:39:28: ####### End DIE Stack Trace #######

2020-11-20 23:39:28:  checkpoint has failed

After investigating, I saw that the ROOTCRS_PREPATCH checkpoint had been marked SUCCESS by the previous failed run of root.sh.

[grid@hostname2 ~]$ /u01/app/19.8.0.0/grid/bin/cluutil -ckpt -oraclebase /u01/app/oracle -chkckpt -name ROOTCRS_PREPATCH -status
SUCCESS

Further investigation showed that this error is caused by bug 27554103. I resolved it by resetting the ROOTCRS_PREPATCH checkpoint to the START state and rerunning /u01/app/19.8.0.0/grid/root.sh on node2.

[root@hostname2 ~]# /u01/app/19.8.0.0/grid/bin/cluutil -ckpt -oraclebase /u01/app/oracle -writeckpt -name ROOTCRS_PREPATCH -state START

[root@hostname2 ~]# /u01/app/19.8.0.0/grid/bin/cluutil -ckpt -oraclebase /u01/app/oracle -chkckpt -name ROOTCRS_PREPATCH -status
START
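In short, the fix is: if ROOTCRS_PREPATCH reads SUCCESS while root.sh still has work to do, write it back to START and rerun. The reset logic can be sketched as a small script, with a stub standing in for the real cluutil (the stub functions and temp file are purely illustrative; the real tool persists checkpoint state on disk under the Oracle base):

```shell
#!/bin/sh
# Stub for "cluutil -ckpt ... -chkckpt/-writeckpt", backed by a temp file.
CKPT=$(mktemp)
echo "SUCCESS" > "$CKPT"   # stale state left behind by the failed run

ckpt_status() { cat "$CKPT"; }
ckpt_write()  { echo "$1" > "$CKPT"; }

# Reset ROOTCRS_PREPATCH only if it was wrongly marked SUCCESS.
if [ "$(ckpt_status)" = "SUCCESS" ]; then
    ckpt_write "START"
fi
echo "ROOTCRS_PREPATCH: $(ckpt_status)"
```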

[root@hostname2 ~]# /u01/app/19.8.0.0/grid/root.sh
Check /u01/app/19.8.0.0/grid/install/root_hostname2_2020-11-21_03-53-47-360707303.log for the output of root script

After completing the steps above, everything was as it should be on both nodes, and the cluster upgrade state was back to NORMAL.

[grid@hostname1 antunez]$ ./rac_status.sh -a

                Cluster rene-ace-cluster

        Type      |      Name      |      hostname1    |      hostname2      |
  ---------------------------------------------------------------------------
   MGMTLSNR       | MGMTLSNR       |       Online       |          -         |
   asm            | asm            |       Online       |       Online       |
   asmnetwork     | asmnet1        |       Online       |       Online       |
   chad           | chad           |       Online       |       Online       |
   cvu            | cvu            |          -         |       Online       |
   dg             | ORAARCH        |       Online       |       Online       |
   dg             | ORACRS         |       Online       |       Online       |
   dg             | ORADATA        |       Online       |       Online       |
   dg             | ORAFLASHBACK   |       Online       |       Online       |
   dg             | ORAREDO        |       Online       |       Online       |
   network        | net1           |       Online       |       Online       |
   ons            | ons            |       Online       |       Online       |
   qosmserver     | qosmserver     |          -         |       Online       |
   vip            | hostname1      |       Online       |          -         |
   vip            | hostname2      |          -         |       Online       |
   vip            | scan1          |       Online       |          -         |
   vip            | scan2          |          -         |       Online       |
   vip            | scan3          |          -         |       Online       |
  ---------------------------------------------------------------------------
    x  : Resource is disabled
       : Has been restarted less than 24 hours ago
       : STATUS and TARGET are different

      Listener    |      Port      |     hostname1      |      hostname2     |     Type     |
  ------------------------------------------------------------------------------------------
   ASMNET1LSNR_ASM| TCP:1526       |       Online       |       Online       |   Listener   |
   LISTENER       | TCP:1521,1525  |       Online       |       Online       |   Listener   |
   LISTENER_SCAN1 | TCP:1521,1525  |       Online       |          -         |     SCAN     |
   LISTENER_SCAN2 | TCP:1521,1525  |          -         |       Online       |     SCAN     |
   LISTENER_SCAN3 | TCP:1521,1525  |          -         |       Online       |     SCAN     |
  ------------------------------------------------------------------------------------------
       : Has been restarted less than 24 hours ago

         DB       |     Version    |      hostname1     |      hostname2     |    DB Type   |
  ------------------------------------------------------------------------------------------
   mgm            |            (2) |        Open        |         -          |  MGMTDB (P)  |
   prod           | 12.1.0     (1) |        Open        |        Open        |    RAC (P)   |
  ------------------------------------------------------------------------------------------
  ORACLE_HOME references listed in the Version column ("''" means "same as above")

         1 : /u01/app/oracle/product/12.1.0/db_1        oracle oinstall
         2 : %CRS_HOME%                                 grid      ''

       : Has been restarted less than 24 hours ago
       : STATUS and TARGET are different

[grid@hostname1 antunez]$ crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [19.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [441346801].

Hopefully this blog post saves you a few headaches and long overnight hours if you ever hit these two bugs while doing an OOP of your 19.x GI.

Note: This was originally published on rene-ace.com.


About the Author

Currently I am an Oracle ACE; a speaker at Oracle OpenWorld, Oracle Developers Day, OTN Tour Latin America and the APAC region, and IOUG Collaborate; and Co-President of ORAMEX (Mexico Oracle User Group). At the moment I am an Oracle Project Engineer at Pythian. In my free time I like to say that I'm a movie fanatic and a music lover, bringing the best from México (Mexihtli) to the rest of the world and photographing it in the process ;)
