ASM is definitely one of the coolest technologies inside the Oracle Database. On the other hand, the ability of the storage arrays to provide a read/write access to a copy or a “snapshot” of its content is something we can easily leverage as an Oracle DBA. For a couple of weeks, I wanted to copy a database stored in an ASM Disk Group with one of those storage technologies and mount it on the same server; unfortunately, this is not supported even with 18.104.22.168. The good news is that I finally overcame all the obstacles to do it in a specific case. This post relates a couple of the tips I used to get to that result.
What You May Want to Know Before You Start
Several people helped me, even if most of them don’t know. Luca Canali did an awesome presentation about ASM at UKOUG. Alexandro Vargas’s Blog is a must-have in the list of your RSS feed if you work with ASM or RAC. Last but not least, Alex Fatkulin rocks when you have a problem with Oracle. I’m very lucky to work 10 metres away from him and other amazing folks at Pythian.
N.B.: What I describe below is among the worst things you can ever do with ASM. You can use it to play around, but never use it with anything other than test data. If you lose something because of me, youâ€™ll be the only one to blame!
I’ve tested the whole process with Oracle 22.214.171.124 on Ubuntu Gutsy Gibbon, based on a previous post I made on this blog. In order to copy the ASM disks, I’ve used the
dd command, but it should not make any difference with a storage array command.
The only way I’ve found to have the original Disk Group and its copy mounted all together on the same server is to rename the cloned Disk Groups and Disks. Running the copy of the Disk Group on separate ASM instances just failed for me. The error will come up later in this post. Obviously, there is no command for now to rename the Disk Group or Disk, and I have no idea how to generate those Metadata with a C program, though.
Here’s a simple idea: let Oracle create the Disk Group and Group Metadata by creating a brand new Disk Group with its set of Disks that exactly fit the layout of the original Disk Group. Once this is done, you’ll:
- Copy the Metadata with a couple of
ddcommands in a set of files
- Turn off ASM rebalancing
- Set your database in backup mode
- Clone/Copy the disks into the set of disks you used to create the new Disk Group
- End the backup mode for the database and turn on ASM rebalancing
- Inject the New Disk and Disk Group metadata in the copy/clone of your disk
- Mount the copy of the Disk Group with its new name in ASM
- End with the database cloning process (i.e.: make sure you’ve deleted the copies of control files, spfiles, redo logs and file alias in the new disk group, and created a new one to mount a new instance with the different file paths)
Step 1: Create a Separate ASM Instance
Why use a separate ASM instance? It may be related to the fact the test case was not running on a supported platform, but once a new the diskgroup created, and even after changing the
asm_diskstring parameter and querying
v$asm_disk, the devices couldn’t get dropped. The
lsof command was always showing
LGWR processes using them, and
losetup -d was complaining because of that. Anyway, to prevent us from stopping the primary ASM instance, you can always remove one device from the ASM discovery string by changing the
asm_diskstring on the primary ASM instance:
SQL> alter system set asm_diskgroup='/dev/loop[0-6]';
Once you have done this, you can create a new ASM instance on the same server. To do that, all you need is a separate parameter file. Here is an example with an instance named
+CLONE, and its
*.asm_diskstring='/dev/loop7' *.diagnostic_dest='/u01/app/oracle' *.instance_type='asm' *.large_pool_size=12M *.db_unique_name='+CLONE'
Note that, if you want to mount a Disk Group with the same name as one Disk Group this is already mounted in the primary ASM instance (i.e. +ASM) in a separate instance, you get an error like the one below:
SQL> alter diskgroup DGBLUJ mount; alter diskgroup DGBLUJ mount * ERROR at line 1: ORA-15032: not all alterations performed ORA-15003: diskgroup "DGBLUJ" already mounted in another lock name space
We tried a couple of things but haven’t been able to overcome this situation. This is just a shame; it would have enabled us to use a copy of the Disk Group without renaming it with the
_asmsid parameter on a new database instance.
Step 2: Create a Disk Group in the New ASM Instance
To help you understand the test case, I’ll assume the original Disk Group is named
DGBLUJ and it has only one disk in it. Its redundancy is set to external. We’ll create a new Disk Group named
DGREDX. You can look at
V$ASM_DISK to find what your next mount should look like. Once you have done this, create the new Disk Group in the ASM secondary instance. The set of commands to do so, in my case, is the following:
$ . oraenv ORACLE_SID = [+CLONE] ? +CLONE The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 $ sqlplus / as sysdba SQL> create diskgroup DGREDX external redundancy disk '/dev/loop7';
You can then shut down the secondary ASM instance and delete your device from the server (in this test with
SQL> shutdown immediate; SQL> exit; $ losetup -d /dev/loop7
Step 3: Copy the Disk Group Information from DGREDX
I wish I could tell you more about why I caught those data. I may not have spent enough time with
strace yet. What I suspected was that the ASM Disk’s and Disk Group’s information was in the file #0 and file #2 headers. Despite the block size being 4k for the ASM files in my case, it worked perfectly when I copied the first 2k of those files. I’m pretty sure many of you know more than me. Assuming the
disk.bis file is actually the disk with the
DGREDX data, I use the
commands as below:
$ dd if=disk.bis of=dgredx-file0.2k bs=2048 count=1 1+0 records in 1+0 records out 2048 bytes (2.0 kB) copied, 6.4291e-05 seconds, 31.9 MB/s $ dd if=disk.bis of=dgredx-file2.2k bs=2048 count=1 skip=1536 1+0 records in 1+0 records out 2048 bytes (2.0 kB) copied, 0.0437474 seconds, 46.8 kB/s
To view the content of this segment, you can use vi as below:
$ vi -b dgredx-file0.2k [Esc]+:%!xxd [Esc]+:w dgredx-file0.2k.txt [Esc]+:q!
If you want to investigate further, I suggest you read the web sites I earlier pointed out. It will give you many useful information about
X$KFFI, and many more. You may also want to use the
amdu utility that comes with 11g. Try
amdu -help to start. Here is an example of such commands to look at the content of the Disk Group MetaData:
$ amdu -diskstring /dev/loop0 -dump DGBLUJ -noimage $ amdu -diskstring /dev/loop7 -dump DGREDX -noimage
Step 4: Make Your Clone
As I mentioned earlier, this is important. You turn off the ASM rebalancing and start the the database BEGIN backup. Once this is done, you’ll have to use your favourite storage commands. In my case, the command is:
$ dd if=disk of=disk.bis bs=4096
Then you can end with the database backup mode and restart the ASM automatic I/O rebalancing.
Step 5: Merge the Meta Data from DGREDX Into the DGBLUJ Copy
Once you’ve got the Disk Group clone, you can merge the metadata you’ve copied from
DGREDX into the
DGBLUJ clone. You can use the
dd command as below:
$ dd if=dgredx-file0.2k of=disk.bis bs=2048 conv=notrunc,nocreat 1+0 records in 1+0 records out 2048 bytes (2.0 kB) copied, 0.0156065 seconds, 131 kB/s $ dd if=dgredx-file2.2k of=disk.bis bs=2048 seek=1536 conv=notrunc,nocreat 1+0 records in 1+0 records out 2048 bytes (2.0 kB) copied, 0.0488208 seconds, 41.9 kB/s
Step 6: Create the Device on the Server
Once again, this operation is very specific to this test case; to create a device with
losetup, use the commands below:
# losetup /dev/loop1 /asmdisks/disk.bis # losetup -a /dev/loop0: :7438421 (/asmdisks/disk) /dev/loop1: :7438425 (/asmdisks/disk.bis)
To avoid any problem, check the privileges of the files and of the devices.
Step 7: Mount the Diskgroup in the Primary ASM Instance
Once the disk is copied, patched, and mounted to a system device, you can simply mount it on the primary ASM instance as below:
$ . oraenv ORACLE_SID = [BLACK] ? +ASM The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 $ sqlplus / as sysdba SQL> alter diskgroup DGREDX mount; Diskgroup altered. SQL> exit;
The safer way to continue is probably to delete the files that pointed to files in the primary Disk Group. Here is an example showing how to delete the controlfile copies:
$ asmcmd ASMCMD> ls DGBLUJ/ DGREDX/ ASMCMD> ls DGREDX/BLUJ/CONTROLFILE/* Current.260.646592475 Current.261.646592475 ASMCMD> rm DGREDX/BLUJ/CONTROLFILE/* You may delete multiple files and/or directories. Are you sure? (y/n) y ASMCMD> exit
This not only includes the controlfiles but also:
- the ASM aliases
- the SPFILE
- the Redo Logs — because ASM uses OMF and you’ll have to open the clone of the database you’ll create with RESETLOGS. You shouldn’t be able to rename those files anyway.
Step 8: End by Cloning the Database
I guess if you’re come this far, this step should be easy for you.
Anyway, this time more than ever, I’ll be very interested if you go further into this test case.