I’ve been working on setting up a demo for my upcoming presentation on application continuity at RMOUG training days later this month. The challenge is to get a multi-node cluster, plus a load generator, and a host OS, to fit on a memory-constrained laptop.
According to the Oracle grid installation guide, 4GB per virtual host is the minimum requirement. However, with a few tweaks I’ve been able to get the full stack to run in 2GB of memory. For anyone else out there installing 12c clusters into virtual machines, here are a few tips.
But first the disclaimer: these changes are mostly unsupported by Oracle, and intended for test and demo databases. They can potentially cause unexpected behaviour, hangs, or crashes. Don’t try this in production!
- Grid Infrastructure management repository database (GIMR): This is a full Oracle database that stores operating system workload metrics generated by the cluster health monitor, for use by Oracle QoS management and troubleshooting. Being a full database, it has a large memory and CPU footprint. I originally installed Oracle 220.127.116.11, skipping the database on install, and upgraded to 18.104.22.168 without it. However, it looks like it’s no longer possible to skip the installation in the GUI. My colleague Gleb suggests adding
-J-Doracle.install.mgmtDB=false on the installer command line to skip it.
- Cluster health monitor (CHM): this tool collects a myriad of workload-related metrics to store in the GIMR. It also uses a surprising amount of CPU: it was the top CPU consumer in my VM before removal. It can be disabled fairly easily, with a hat tip to rocworks:
# crsctl stop res ora.crf -init
# crsctl delete res ora.crf -init
- Trace File Analyzer Collector (TFA): collects log and trace files from all nodes and products into a single location. Unfortunately it’s written in Java with its own Java Virtual Machine, again requiring a large memory footprint for the heap etc. It can be removed with a single command, though note that the next time you run rootcrs.pl (when patching, for example) it will reinstall itself.
# tfactl uninstall
- Cluster Verification Utility (CVU): As you install Oracle Grid Infrastructure, the CVU tool automatically runs, pointing out configuration issues that may affect system operation (such as running under 4GB of RAM). In Oracle 22.214.171.124, it also gets scheduled to run automatically every time the cluster is started and periodically after that. The CVU and its checks use CPU and RAM resources, and are better run manually when such resources are limited. It’s also a quick removal:
$ srvctl stop cvu
$ srvctl disable cvu
- OC4J: Every Oracle 12c grid infrastructure install contains OC4J, Oracle’s old Java J2EE web application server, since replaced with WebLogic. And no, please don’t make me install WebLogic too now, Oracle! I’m honestly not sure what it’s used for, but I’ve been able to disable it without any obvious ill effects:
$ srvctl stop oc4j
$ srvctl disable oc4j
- ASM memory target: as of 12c, the ASM instance has a default memory target of 1 gigabyte, a big jump from the 256 MB of Oracle 11g. And if you set a lower target, you’ll find it’s ignored unless it’s overridden with a hidden parameter. I’ve set it to 750 MB with good results, and it can possibly be set even lower in light-utilization workloads:
$ sqlplus "/ as sysasm"
alter system set "_asm_allow_small_memory_target"=true scope=spfile;
alter system set memory_target=750m scope=spfile;
alter system set memory_max_target=750m scope=spfile;
# service ohasd stop
# service ohasd start
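As a recap, the removal commands from the list above could be collected into one small script. This is only a sketch: the `run`, `slim_gi_root` and `slim_gi_grid` names and the `DRY_RUN` variable are made up for illustration, the split between root and the Grid Infrastructure owner is an assumption, and the CVU commands are written in srvctl's verb-first form (`srvctl stop cvu`). The ASM memory changes are left out because they need sqlplus and a restart. By default the script only prints what it would do.

```shell
#!/bin/sh
# Dry-run helper: prints each command unless DRY_RUN=0 is set.
# The default is dry-run, so nothing is changed by accident.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steps assumed to need root: cluster health monitor and TFA.
slim_gi_root() {
    run crsctl stop res ora.crf -init
    run crsctl delete res ora.crf -init
    run tfactl uninstall
}

# Steps assumed to run as the Grid Infrastructure owner: CVU and OC4J.
slim_gi_grid() {
    run srvctl stop cvu
    run srvctl disable cvu
    run srvctl stop oc4j
    run srvctl disable oc4j
}
```

Calling `slim_gi_root` or `slim_gi_grid` after sourcing the file lists the commands; setting `DRY_RUN=0` first would actually execute them, with all the caveats from the disclaimer above.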
A non-memory issue I’ve run into is the VKTM (virtual keeper of time) background process using large amounts of CPU time in both ASM and database instances. I’ve found it to be especially pronounced in virtual environments, and in Oracle Enterprise Linux 6. I’ve ended up disabling it completely without obvious ill effects, but as always, don’t try on your “real” production clusters.
alter system set "_disable_highres_ticks"=TRUE scope=spfile;
(Hat tip to MOS community discussion 3252157, also this IBM slide deck)
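To see whether VKTM is really the culprit on a given VM, and to compare before and after the change, something like the following filter might help. It is a rough sketch: `vktm_cpu` is a made-up name, and it assumes the usual background process naming of `ora_vktm_<SID>` and `asm_vktm_<SID>`.

```shell
#!/bin/sh
# Filter "pcpu args" lines, as printed by: ps -eo pcpu=,args=
# down to the VKTM background processes of database and ASM instances.
# Prints the CPU percentage followed by the process name.
vktm_cpu() {
    awk '$2 ~ /^(ora|asm)_vktm_/ { print $1, $2 }'
}
```

A typical call would be `ps -eo pcpu=,args= | vktm_cpu`. Keep in mind that the `pcpu` column in ps is an average over the lifetime of the process, so `top` is the better tool for a live view.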
Additionally, Jeremy Schneider has taken on the biggest remaining GI memory user, the Oracle cluster synchronization service daemon (OCSSD). This is an important cluster management process, and Jeremy figured out a way to unlock its memory in the gdb debugger, allowing it to be swapped out. My own tests were less successful: the process wasn’t swapped out even after trying his changes. But his blog post is worth a read, and others may have more success than I did.
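For finding the biggest memory users in the first place, a per-command total of resident memory is a reasonable starting point. The sketch below is illustrative only: `rss_by_command` is a made-up name, and RSS counts shared memory once per process, so instances with many background processes will look larger than they really are.

```shell
#!/bin/sh
# Sum resident memory (kB) per command name and list the largest first.
# Reads "rss command" pairs on stdin, as printed by: ps -eo rss=,comm=
# Only the first word of the command name is used as the key.
rss_by_command() {
    awk '{ total[$2] += $1 }
         END { for (c in total) printf "%d %s\n", total[c], c }' |
        sort -rn
}
```

Running `ps -eo rss=,comm= | rss_by_command | head` gives a quick top-ten list to work through.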
I also noted that during the link phase of installation and patching, the ld process alone takes over 1GB of RAM. So either shut down clusterware or add swap and wait while linking.
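A quick headroom check before relinking can save a failed patch run. The following is a minimal sketch, not a sizing rule: the function names and the `NEEDED_KB` variable are invented here, and the 1 GB default simply mirrors the ld usage I observed.

```shell
#!/bin/sh
# Memory the link step is assumed to need, in kB (1 GB by default).
NEEDED_KB=${NEEDED_KB:-1048576}

# Print available memory plus free swap in kB from a meminfo-style file.
# Uses MemAvailable when the kernel reports it, otherwise MemFree.
headroom_kb() {
    awk '
        $1 == "MemAvailable:" { avail = $2; have_avail = 1 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "SwapFree:"     { swap = $2 }
        END { printf "%d\n", (have_avail ? avail : free) + swap }
    ' "${1:-/proc/meminfo}"
}

# Succeeds when the headroom covers NEEDED_KB.
enough_for_link() {
    [ "$(headroom_kb "$1")" -ge "$NEEDED_KB" ]
}
```

For example, `enough_for_link || echo "add swap or stop clusterware first"` before starting a patch. The optional file argument exists mainly so the functions can be tried against a saved copy of /proc/meminfo.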
So to wrap up, I’ve managed to get a full Oracle GI 126.96.36.199 stack including database to run in a virtual machine with 2GB RAM. Readers, any other tips to put the goliath that is Oracle GI on a diet?
I love reading most of the Pythian team’s technical posts, but here I have to add a “but” …
Even if these suggestions are interesting from a technical point of view, I strongly dislike this kind of solution.
Your own disclaimer says it all. The fact that you haven’t noticed any ill behaviour doesn’t mean you aren’t running into it; in the worst case, it can make your test results useless or non-reproducible. Or you end up wasting your time searching for the root causes of errors that are really side effects of the non-running components.
Personally, I prefer to add some more physical memory to my testing machines to get rid of memory shortages when testing new products.
I tried and couldn’t run with less than 3 GB. My setup: no MGMTDB database, ASM with memory_target=256M, database instance with memory_target=436M, no TFA, no OC4J, no ora.cvu, no ora.crf. It ran, but started swapping.
Thank you for the useful tips!
I was able to reduce the memory consumption of my sandbox system.
It looks like there’s a typo. The word order is wrong in:
srvctl cvu stop
Also I had to stop and remove mgmtdb, as stopping ora.crf was not enough:
srvctl stop mgmtdb
srvctl remove mgmtdb
oc4j is used for QoS Management. In your case, where you don’t use any services or server pools, it can be disabled without problems.
But I’m not sure whether runtime load balancing still works if you disable it.