Oracle RAC on the Cloud, Part 3

This is part 3 of a multipart series on getting Oracle RAC running in a cloud environment. In part 1, we set up an NFS server for shared storage. In part 2, we set up OS components for each RAC server. Now we finish up the OS configuration and move on to the Oracle grid infrastructure.

Passwordless SSH, take two

Now that we have oracle users on both rac01 and rac02, we need to configure passwordless SSH between them. (It’s also possible to do it from the installer, but I prefer to do it myself.)

On rac01-pub as oracle:

cd ~/.ssh
scp rac02-pub:$PWD/id_rsa.pub rac02.pub
(enter oracle user password, and confirm the hostkey addition)
cat rac02.pub >> authorized_keys

And on rac02-pub, again as oracle:

cd ~/.ssh
scp rac01-pub:$PWD/id_rsa.pub rac01.pub
(shouldn't have a password prompt, but you can confirm hostkey addition)
cat rac01.pub >> authorized_keys
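
Before moving on, it’s worth confirming that the key exchange actually worked in both directions. Here’s a minimal sketch (the check_passwordless_ssh function name is made up for this example). It relies on ssh’s BatchMode option, which makes ssh fail rather than prompt for a password, and it assumes the host keys were already accepted in the steps above:

```shell
# Try a no-op command on each host; BatchMode=yes turns any password
# prompt into a failure instead of waiting for input.
check_passwordless_ssh() {
    status=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            status=1
        fi
    done
    return "$status"
}

# Usage, as oracle on each node in turn:
# check_passwordless_ssh rac01-pub rac02-pub
```

If either host reports FAILED, re-check the contents and permissions of ~/.ssh/authorized_keys on that host before starting the installer.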

Getting RAM for the install

Before we run the Oracle installer, we should expand the physical RAM for each machine. This can be done from the Gandi control panel for each server. When I first tried this I got a quota error, and had to raise a support ticket (and wait for a response) to get the quota raised. A second issue with the RAM is that the VM doesn’t see the full amount of RAM allocated: when I tried firing up a 4GB instance, Linux only saw 3667716k available, and the Oracle installer promptly complained about insufficient memory.
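
The two figures above are enough to work out how much memory goes missing and how much to allocate instead. A quick sketch of the arithmetic, assuming the hidden amount is a fixed overhead regardless of VM size and that the installer wants the OS to see a full 4GB (4194304k). Both are assumptions for illustration, and the function names are made up:

```shell
# $1 = MB allocated in the control panel, $2 = total KiB reported by Linux.
# Prints the KiB that never reach the OS.
hidden_kb() {
    echo $(( $1 * 1024 - $2 ))
}

# $1 = KiB the OS must see, $2 = KiB hidden from the OS.
# Prints the smallest allocation in whole MB, rounding up.
min_alloc_mb() {
    echo $(( ($1 + $2 + 1023) / 1024 ))
}

overhead=$(hidden_kb 4096 3667716)
echo "hidden from the OS: ${overhead}k"
echo "smallest allocation showing 4194304k: $(min_alloc_mb 4194304 "$overhead")MB"
```

That works out to 526588k hidden (a bit over 500MB) and a minimum allocation of 4611MB under those assumptions.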

So instead of 4096MB of memory, we’re going to adjust rac01 and rac02 to have 4800MB. After adjusting in the control panel, you may see the operation reported as complete within a minute or so, but the servers didn’t consistently pick up the extra memory when I tried it. So while logged onto rac01 as oracle, have a look:

for host in rac01-pub rac02-pub; do echo $host; ssh $host free; done

If each node shows 4388612 total memory, you’re golden. Otherwise, reboot the nodes.
(And yes, over 500MB seems like an awful lot of memory to simply not be available to the OS; I’m wondering what’s using the space?)
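
To avoid eyeballing the numbers, the same check can be scripted. A minimal sketch, assuming free prints a Mem: line whose second column is total memory in KiB, which is how procps free lays out its default output. The function names are invented for this example:

```shell
# Read `free` output on stdin and print the total column of the Mem: line.
mem_total_kb() {
    awk '/^Mem:/ { print $2 }'
}

# Read `free` output on stdin; succeed if total memory is at least $1 KiB.
has_enough_mem() {
    expected_kb=$1
    total_kb=$(mem_total_kb)
    [ -n "$total_kb" ] && [ "$total_kb" -ge "$expected_kb" ]
}

# Usage from rac01 as oracle:
# for host in rac01-pub rac02-pub; do
#     ssh "$host" free | has_enough_mem 4388612 || echo "$host needs a reboot"
# done
```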

Getting ready for the installer

By now the Oracle software download should be complete, and we need to give the downloaded files .zip extensions and install an unzipper to extract them. (Note to Oracle packagers: unzip isn’t _all_ that common in the Linux world, and gzip provides better compression rates anyways. Why not do tarballs?)

Back on rac01:

cd /srv/datadisk01/dl
for file in *zip?AuthParam*[0-9a-f]; do sudo mv "$file" "$file".zip; done
sudo yum -y install unzip
for file in *.zip; do sudo unzip "$file"; done
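
The rename above leaves the AuthParam token in the middle of each file name. If you’d rather end up with clean names, a variation is to strip everything from the question mark onward. This is only a sketch: strip_auth_param is a made-up helper and the file name in the example is hypothetical, so check it against what your download actually produced:

```shell
# Print a file name with any "?query" suffix removed.
strip_auth_param() {
    printf '%s\n' "${1%%\?*}"
}

strip_auth_param 'grid_1of2.zip?AuthParam=0123_abcdef'   # prints grid_1of2.zip

# Possible use in place of the rename loop:
# for file in *zip?AuthParam*; do sudo mv "$file" "$(strip_auth_param "$file")"; done
```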

Now setting up a VNC server so that we can actually run the Oracle installer, as well as a firewall rule to let us connect:

sudo yum -y install tigervnc-server xterm twm
sudo iptables -I INPUT 4 -p tcp --dport 5901 -j ACCEPT

Starting the server; you’ll need to supply a password the first time around. (still logged in as oracle)

vncpasswd
vncserver :1
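
The port in the firewall rule and the display number passed to vncserver are linked: by VNC convention, display :N listens on TCP port 5900+N. A trivial sketch (vnc_port is a made-up helper) in case you need a different display:

```shell
# TCP port that a given VNC display number listens on (display :N -> 5900 + N).
vnc_port() {
    echo $(( 5900 + $1 ))
}

vnc_port 1   # prints 5901, the port opened in the iptables rule
```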

Now we start a VNC viewer locally. If you don’t have one already, you can download one from www.realvnc.com

Grid Infrastructure Install

Connecting to display 1 of the server external IP, you should get an xterm window if all went well. Running the installer:

cd /srv/datadisk01/dl/grid
./runInstaller

Skipping software updates, and doing a cluster install, using a standard cluster. Doing an advanced install. Using the default language. Under “grid plug and play” we need to set up the node naming. Using cluster name “rac-cluster”, and SCAN name “rac-cluster” as defined in /etc/hosts earlier. On the cluster node screen, we should see that rac01-pub has been detected. Adding rac02-pub too, with rac02-pub-vip as its VIP address.

Now comes the validation, where we learn if SSH, naming, etc. were properly set up. If all goes well, you’ll make it to the network interface usage screen. Here we need to make changes: eth0 shouldn’t be used, eth1 is public, and eth2 is private. The management repository is a choice: it takes up memory and install time, but it does allow us to use such things as QoS management, and it can only be created at install time. I chose to skip it.

For storage, we’re using a shared file system: the NFS we created. Using external redundancy since it’s a single disk anyway. Doing the same for the voting disk.

Not using IPMI. We’ll also leave the ASM oper group blank, and accept the warning.

Using the default “/u01/app/oracle” and “/u01/app/12.1.0/grid” directories for ORACLE_BASE and grid home. Using /u01/app/oraInventory for oraInventory. You can either run sudo yourself or let the installer do it. I like to run it myself to have more control over re-running and deconfigs if required.

Now it’s time for the prerequisite checks. If all previous steps have succeeded, you shouldn’t see any warnings at all.

Saving the response file and kicking off the install itself.

Running the orainstRoot.sh from the oraInventory, plus root.sh from the grid home. Running on rac01 first.

Just got errors starting ASM:

PRCR-1079 : Failed to start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'rac01-pub'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-00600: internal error code, arguments: [SKGMHASH], [1], [18446744073549507196], [0], [0], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/rac01-pub/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.asm' on 'rac01-pub' failed
CRS-2679: Attempting to clean 'ora.asm' on 'rac01-pub'
CRS-2681: Clean of 'ora.asm' on 'rac01-pub' succeeded
CRS-2674: Start of 'ora.asm' on 'rac01-pub' failed
2013/11/05 21:28:33 CLSRSC-113: Start of ASM instance failed
Preparing packages for installation...
cvuqdisk-1.0.9-1
2013/11/05 21:28:54 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

But there is no ASM here, so ignoring the error.
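
With that much noise in the output, it helps to check mechanically whether root.sh finished and which resources failed to start. A small sketch that reads the root.sh output on standard input. The two function names are made up, and the patterns are based only on the messages shown above:

```shell
# Succeed if the output contains the final CLSRSC-325 success message.
rootsh_succeeded() {
    grep -q 'CLSRSC-325: .* succeeded'
}

# Print each distinct "resource on node" pair from CRS-2674 start failures.
failed_starts() {
    sed -n "s/^CRS-2674: Start of '\([^']*\)' on '\([^']*\)' failed\$/\1 on \2/p" | sort -u
}

# Usage, with the root.sh output saved to a file of your choosing:
# rootsh_succeeded < rootsh.out && echo "grid configuration succeeded"
# failed_starts < rootsh.out
```

On the output above, this would report success overall and list ora.asm on rac01-pub as the only failed start.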

On rac02, it didn’t even try starting ASM with root.sh.

Running the rootinventory and root.sh since we have sudo running. It does take some time to run as the grid infrastructure is shut down and started up a few times.
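
Once root.sh has run on both nodes, a quick sanity check is to ask the clusterware which nodes it considers active. olsnodes -s in the grid home prints one node name and status per line; the helper below (a made-up name) just filters that output for anything not marked Active. The grid home path is the default chosen earlier:

```shell
# Read `olsnodes -s` output on stdin and print nodes whose status is not Active.
inactive_nodes() {
    awk 'NF >= 2 && $2 != "Active" { print $1 }'
}

# Usage on either node:
# /u01/app/12.1.0/grid/bin/olsnodes -s | inactive_nodes
# No output means every node reported Active.
```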

Database install

Now that the grid infrastructure is in place, we can move on to the actual database install. We can re-use the same VNC window to install:

Skipping software updates and skipping the DB creation too (software only). Picking a RAC install. At this point we should see both nodes detected. Using the default language.

Installing enterprise edition with default home locations. In the group selection, it won’t let me select dba, even though the group was installed by the preinstall RPM. For now I’ll select oinstall.

The rest are default.

Running root.sh, which this time is very short.

Database creation assistant

With a database home we can run the creation assistant. But first, working on a hugepage configuration. /proc/meminfo is missing the HugePages lines entirely, and it does look like, regrettably, the supplied kernel does not support hugepages:

$ zgrep HUGETLB /proc/config.gz
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set

And a quick web search seems to show that, while custom kernel support has been a long-standing user request at Gandi, it’s still not available.
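
If you want to repeat the check on another provider or a newer kernel, it can be scripted. A minimal sketch (hugepages_supported is a made-up name) that looks for the HugePages_Total line a hugepage-enabled kernel exposes in /proc/meminfo:

```shell
# Succeed if the given meminfo file (default /proc/meminfo) has a
# HugePages_Total line, which only appears when the kernel supports hugepages.
hugepages_supported() {
    grep -q '^HugePages_Total:' "${1:-/proc/meminfo}"
}

# Usage:
# hugepages_supported && echo "hugepages available" || echo "no hugepage support"
```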

So onto the install. From the same VNC session:

/u01/app/oracle/product/12.1.0/dbhome_1/bin/dbca

Creating a new database. Using the advanced install with:

  • RAC database (default)
  • Admin-managed
  • General purpose/transaction processing
  • DB name: racdb
  • non-PDB
  • Selecting both nodes to run on
  • Configuring EM express
  • Running CVU periodically
  • Picking a password
  • File system storage
  • /srv/datadisk01/oradata/oradata – default
  • Default FRA, using the default size of 5G
  • Archiving disabled
  • Skipping sample schemas and Database Vault
  • Unselecting automatic memory management
  • Leaving the remaining parameters default

And we’re installed and have a database. It can be tested via SQL*Plus:

export ORACLE_SID=racdb
export ORAENV_ASK=NO
. oraenv
export ORACLE_SID=racdb1
sqlplus "/ as sysdba"

If all went well, you should see a SQL prompt:

$ sqlplus "/ as sysdba"
SQL*Plus: Release 12.1.0.1.0 Production on Wed Nov 6 23:38:51 2013
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options
SQL>
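
Since this is RAC, it’s also worth confirming that an instance is running on each node, not just the one you’re logged into. srvctl status database -d racdb reports one line per instance; the sketch below (function name made up, and the expected line format is an assumption based on typical 12.1 srvctl output) pulls out the instance and node names:

```shell
# Read `srvctl status database` output on stdin and print
# "instance node" for every instance reported as running.
running_instances() {
    sed -n 's/^Instance \([^ ]*\) is running on node \([^ ]*\)$/\1 \2/p'
}

# Usage, with the environment set by oraenv as above:
# srvctl status database -d racdb | running_instances
# Two lines of output, one per node, means both instances are up.
```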

And that’s it for the series. I made it through with 50,000 credits remaining in my Gandi account to play with.

Feel free to ping me in case of issues getting running. Some of these steps are a combination of several iterations as bugs were worked out, so it’s likely that there are some gremlins still lurking, and I’ll try and incorporate fixes as issues are discovered.

Lessons Learned

  • Yes, Oracle RAC can be installed cleanly on a cloud environment, and at $17, the price is right
  • True shared storage from a cloud provider is still hard to come by, limiting the high-availability potential
  • There are quite a few extra steps required to satisfy the RAC installer and its prerequisite checks
  • In the Gandi environment, you need to overallocate RAM as not all of it is visible to the OS
  • The lack of hugepage support in the Gandi kernel (and complete lack of custom kernel support) further increases memory requirements
  • A dummy oracle-release RPM is all we need to keep the OS prerequisite checks happy

About the Author

Marc is a passionate and creative problem solver, drawing on deep understanding of the full enterprise application stack to identify the root cause of problems and to deploy sustainable solutions. Marc has a strong background in performance tuning and high availability, developing many of the tools and processes used to monitor and manage critical production databases at Pythian. He is proud to be the very first DataStax Platinum Certified Administrator for Apache Cassandra.

3 Comments

Hi Marc, great article! You mentioned that it cost around $17-18 for the initial configuration, with 50,000 credits left once everything was set up. I assume you’ve had time to play around with it since the article was written; would you have any details on the average usage cost so far? I looked on the Gandi site for usage calculators similar to Amazon’s but couldn’t seem to find anything.

Marc Fielding
April 30, 2014 7:40 am

Hi Oscar,

In my use case, months or sometimes even years can go by without me actually using a specific VM environment, so the disk space costs are the biggest consideration. Even with the servers shut down, 80GB of storage costs $12.80/month when bought in units of 150k credits, according to https://www.gandi.net/hosting/iaas/prices. This compares to $4/month for Amazon EBS with the latest price cuts, and even less if you go to the trouble to archive to S3 or even Glacier.

Your use case may be very different though. It shouldn’t be too hard to create your own calculation spreadsheet using the info from the pricing page to figure out how your costs would run, but I would suspect that AWS would continue to beat Gandi in most cases. It’s just unfortunate that they don’t support VLANs.

Marc

Hi Marc, thanks for the reply. I did actually do a quick estimate using the figures on their site, but was also looking for feedback on real world usage to compare it with, even if just for testing purposes :) Anyway I’ll try it myself and see how I go. Thanks again