When I started my career in IT, I worked in a very small shop. Even though we had people in several places across the same state, everything was very centralized, and work was from 9 to 5. Since we were basically only two people, our action plan was a talk over the lunch table, and that would be it. We would go ahead and execute it after 5 PM and, I won't lie, sometimes before 5 :) .
Over the years, I learned that whether you are a two-person shop or a team of 15 separated by oceans and miles apart, communication is the most important thing to have on your team. One way to communicate is to have an action plan in place for any medium or major change you make in your organization. Writing one generates discussion among your teammates and reduces the possibility of errors when you are implementing under time and pressure constraints.
This might sometimes feel like a mundane and boring task, since writing the plan takes effort and verifying it takes time, but when game day comes along, you will see the great benefit of having an action plan.
Another great benefit of having an action plan is that it also gives you a road map if you need to roll back your change. That is critical because normally, any major change or rollback is not done by only one person. Take, for example, a change that takes about 7 or 8 hours, plus 1 or 2 more hours at the end for UAT (User Acceptance Testing). If the application team decides that a rollback is needed, you are probably not in a good state of mind to do it after 8 hours of continuous work. However, if you have an action plan, one of your teammates can step in and allow you to rest, even if it is just to go to the kitchen and take 10 minutes to yourself for a sandwich and a Coke.
Having an action plan doesn't mean that everything will go smoothly or that you won't hit an error along the way. But believe me: it will reduce the errors that creep in when you execute from memory, or when one person writes the plan without anyone reviewing it.
I hope that you already have an action plan as part of your major/medium changes, but if you don't, it is time to get FIT-ACER. Here is an example (kudos to Cesar Sanchez, as it is his action plan template). Use it and modify it to your needs. It's a good start!
Implementation environment:
===========================
SERVERS:
DATABASES:

Action Plan Overview:
=====================
- The high-level steps for the maintenance are:
--> Tasks below will be performed
A.1 - Client  : Pre-requisite Tasks on SERVER.
A.2 - Pythian : Patch Installation on SERVER.
A.3 - Pythian : Post Patch Installation on SERVER.

Approved by:
============
This has been approved by John Doe.

Created by:
===========
Rene Antunez

Reviewed by:
============
John Doe II
Jane Doe

Implementation Window Date/Time:
================================
A.1 - Any time before MM-DD-YYYY 3:00pm EDT - Pythian : Pre-requisite Tasks on SERVER.
A.2 - MM-DD-YYYY 3:00pm EDT - Pythian : Patch Installation on SERVER.
A.3 - MM-DD-YYYY 4:00pm EDT - Pythian : Post Patch Installation on SERVER.

Detailed Communication Plan:
============================
- All the communication for this task will be done via .
- Escalation path in case of any issue is:

Detailed Action Plan:
=====================
A.1 Pre-requisite Tasks on SERVER
  A.1.1 Back up the Oracle Home
  A.1.2 Stage the patch on SERVER in directory
  A.1.3 OPatch conflict resolution
  A.1.4 Create the OCM response file at DIRECTORY/ocm.rsp
A.2 Patch Installation on SERVER
  A.2.1 Database backup verification
  A.2.2 Blackout any monitoring on SERVER (OEM)
  A.2.3 Notify that the PSU application is about to begin
  A.2.4 Verify one last time that the staged binaries in DIRECTORY are owned by the GI_HOME owner user and group, and not by root
  A.2.5 Stop the Oracle agent on all servers
  A.2.6 Patch installation on SERVER
A.3 Post Patch Installation on SERVER
  A.3.1 Verify that the patch is applied successfully to the GRID_HOME/RDBMS_HOME binaries
  A.3.2 If the SERVER instances are only mounted after the patch, set the apply state in DGMGRL to "APPLY-ON"
  A.3.3 Start the Oracle agent on all servers
  A.3.4 End the blackout and start monitoring on SERVER (OEM)
  A.3.5 There are no post-installation database tasks on SERVER databases, as this is a Data Guard Standby-First Patch Apply (ID 1265700.1)
  A.3.6 Notify that the PSU application is complete

Rollback Plan:
==============
R.1.1 Notify that the PSU rollback is about to begin
R.1.2 Blackout any monitoring on SERVER for 2 hours
R.1.3 Verify that media recovery is stopped and the standby instances are shut down
R.1.4 Roll back the January 2013 PSU patch one server at a time
R.1.5 Restart the standby instances, as follows:
R.1.6 End the blackout and start monitoring on SERVER (OEM)
R.1.7 Notify that the PSU rollback is complete
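A nice property of a written plan is that checks like A.2.4 (verifying that the staged binaries are owned by the right user, not root) can be turned into a small script, so whoever executes the plan runs exactly the same verification. Here is a minimal sketch; `STAGE_DIR`, `EXPECTED_OWNER`, and the sample patch file name are placeholders I made up for illustration, not values from the template:

```shell
#!/bin/sh
# Hedged sketch of step A.2.4: confirm that every file staged for the patch
# is owned by the expected OS user (e.g. the GI home owner), not by root.
# STAGE_DIR and EXPECTED_OWNER are placeholders -- set them for your environment.
STAGE_DIR="${STAGE_DIR:-/tmp/patch_stage}"
EXPECTED_OWNER="${EXPECTED_OWNER:-$(id -un)}"

# Demonstration only: create the staging directory with one sample file.
mkdir -p "$STAGE_DIR"
touch "$STAGE_DIR/p12345678_112040_Linux-x86-64.zip"   # hypothetical patch zip

# find prints any file NOT owned by EXPECTED_OWNER; empty output means we are good.
bad_files=$(find "$STAGE_DIR" ! -user "$EXPECTED_OWNER")
if [ -z "$bad_files" ]; then
    echo "ownership OK: everything under $STAGE_DIR belongs to $EXPECTED_OWNER"
else
    echo "WARNING: files not owned by $EXPECTED_OWNER:" >&2
    echo "$bad_files" >&2
    exit 1
fi
```

The same idea applies to the other mechanical checks in the plan (backups present, agents stopped, blackout active): anything scripted is one less thing to get wrong at hour eight of the change window.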