Hi,
My name is Mark Tiger, creator of this blog. I am an Oracle Certified Professional (OCP DBA 11g).
Gathering information for some DBA tasks can be time-consuming, even though the commands that you eventually need to issue can be over quite quickly. I have gone through this process over and over again, and have decided to help other Oracle DBAs in the community.
In this blog, I will give you the details of how to carry out those tasks that typically need a lot of research before you can do them. I will try to present the information in an easy-to-understand way. My hope is that this will save you lots of time in research, and help to make you more productive as an Oracle DBA. The illustrations are primarily meant for Linux, since this is a platform that enjoys preference from Oracle. However, they are easily adaptable for versions of UNIX/AIX, Windows, etc.
11g R2 - Configuring Automatic restart of an Oracle Database
Oracle Restart Overview
With Oracle Restart installed and configured, various Oracle components can be automatically restarted after a software or hardware failure, or whenever your database host server restarts. These are the components that Oracle Restart can handle:
Database instance:
· Oracle Restart can handle multiple database instances on a single host machine.
Oracle Net listener
Database services:
· This does not include the default service created when you install, because it is automatically managed by Oracle Database. It also does not include any other default services created during database creation.
Oracle Automatic Storage Management (ASM) instance
Oracle ASM disk groups:
· Restarting a disk group means mounting it.
Oracle Notification Services (ONS):
· In a standalone server environment, ONS can be used in Oracle Data Guard installations to automate failover of connections between the primary and standby databases, through Fast Application Notification (FAN). ONS is a service for sending FAN events to integrated clients upon failover.
Oracle Restart runs periodic check operations to monitor the health of these components. If a check operation fails for a specific component, then that component is shut down and restarted.
Oracle Restart is designed for
standalone single instance environments.
For RAC environments the functionality to automatically restart
components is already provided by the Oracle Clusterware.
Oracle Restart runs out of the
Oracle Grid Infrastructure Home. This
home is installed separately from the Oracle Database homes.
Startup Dependencies
Oracle Restart ensures that the components are started in the correct order. For example, if you are using ASM, then Oracle Restart ensures that the Oracle ASM instance is started and the relevant disk groups are mounted before the database is started. Likewise, if a component must be shut down, then Oracle Restart ensures that the dependent components are shut down first.
Oracle Restart also manages the relationship between the listener and the database. It attempts to start the listener before starting the database instance. If the listener fails while the database is running, Oracle Restart does not restart the database instance.
Starting and Stopping Components with Oracle Restart
Oracle Restart automatically starts and stops dependent components in predefined orders when you start up or shut down your system.
There may be times when you want to start or stop just one component, without the dependent components being affected. The way to do this is to use srvctl, which is a utility that comes with Oracle Restart.
After you stop a component with srvctl, Oracle Restart does not automatically restart that component. When you start that same component again with srvctl, it is again available for automatic restart.
Utilities such as SQL*Plus, lsnrctl (listener control), and ASMCMD are integrated with Oracle Restart. If you shut the database down using SQL*Plus, then Oracle Restart does not try to restart the database. If you shut the Oracle ASM instance down with SQL*Plus or ASMCMD, then Oracle Restart does not attempt to restart it.
The difference between starting a component with srvctl and starting it with another utility is:
· When you start a component with srvctl, all the components that it depends on are started first, in the correct order.
· When you start a component with a utility like SQL*Plus, the other components in the dependency chain are not automatically started. You must first ensure that all the components in the dependency chain are started.
· With srvctl, Oracle Restart enables you to start or stop all of the components in an Oracle home or Grid Infrastructure home with a single command.
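As a minimal sketch of the srvctl usage described above, this script stops and restarts a single database and then reports its status. The database name orcl is a placeholder, and the SRVCTL variable and the bounce_database function are my own illustrative names, not part of Oracle's tooling. By default the script only prints the commands, so that you can review them first.

```shell
#!/bin/sh
# Dry run by default: $SRVCTL expands to "echo srvctl", so each command is
# printed instead of executed. Set SRVCTL=srvctl on a host with Oracle
# Restart to run the commands for real.
SRVCTL="${SRVCTL:-echo srvctl}"

# bounce_database is a hypothetical helper; $1 is the DB_UNIQUE_NAME.
bounce_database() {
    db="$1"
    # Stop only this database.
    $SRVCTL stop database -d "$db" -o immediate
    # Start it again; srvctl first starts the components it depends on.
    $SRVCTL start database -d "$db"
    # Show the resulting state.
    $SRVCTL status database -d "$db"
}

bounce_database orcl   # "orcl" is a placeholder name
```

Printing the commands first is a deliberate choice: it lets you check the database name and stop option before anything is shut down.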
CRSCTL
The crsctl utility starts and stops Oracle Restart. It is also used to enable and disable the Oracle High Availability Services daemon. When the high availability services are disabled, none of the components managed by Oracle Restart are started when the node is rebooted.
crsctl is useful when you need to stop Oracle Restart while installing a patch or carrying out OS maintenance. When your maintenance is complete, you can just start Oracle Restart again with crsctl.
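A maintenance window along these lines could be sketched as follows. The function names maintenance_begin and maintenance_end and the CRSCTL variable are illustrative assumptions of mine. The script prints the crsctl commands by default rather than running them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set CRSCTL=crsctl (run from the Grid Infrastructure home) to execute them.
CRSCTL="${CRSCTL:-echo crsctl}"

# Hypothetical helper: stop Oracle Restart before patching or OS maintenance.
maintenance_begin() {
    # Keep the high availability services from starting at the next reboot.
    $CRSCTL disable has
    # Stop Oracle Restart and the components that it manages.
    $CRSCTL stop has
}

# Hypothetical helper: bring Oracle Restart back after the maintenance.
maintenance_end() {
    $CRSCTL enable has
    $CRSCTL start has
    # Display whether automatic startup is enabled.
    $CRSCTL config has
}

maintenance_begin
maintenance_end
```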
Oracle Restart Configuration
Oracle Restart maintains configuration information for each component. When a component is started, it is started according to the configuration information for that specific component. For example, the configuration includes the location of the spfile for a database, and the TCP port for a listener.
If you install Oracle Restart before you create the database with DBCA, then DBCA automatically adds the database to the Oracle Restart configuration.
Otherwise, you can manually add and remove components from the Oracle Restart configuration using the srvctl tool. When you manually add a component with srvctl, Oracle Restart starts to manage the component, restarting it when required. Adding a component to the Oracle Restart configuration is also known as “registering a component with Oracle Restart”.
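To illustrate manual registration, here is a hedged sketch that adds a database and a listener to the Oracle Restart configuration with srvctl. The database name, listener name, port and all paths are placeholders, and the helper function names are my own. The script prints the commands by default instead of executing them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set SRVCTL=srvctl to run them against a real Oracle Restart installation.
SRVCTL="${SRVCTL:-echo srvctl}"

# Hypothetical helper: register a database.
# $1 = DB_UNIQUE_NAME, $2 = Oracle home, $3 = spfile location
register_database() {
    $SRVCTL add database -d "$1" -o "$2" -p "$3"
    # Display what Oracle Restart has recorded for the database.
    $SRVCTL config database -d "$1"
}

# Hypothetical helper: register a listener.
# $1 = listener name, $2 = TCP port, $3 = Oracle home
register_listener() {
    $SRVCTL add listener -l "$1" -p "$2" -o "$3"
}

# Placeholder values; substitute your own names and paths.
OH=/u01/app/oracle/product/11.2.0/dbhome_1
register_database orcl "$OH" "$OH/dbs/spfileorcl.ora"
register_listener LISTENER 1521 "$OH"
```

The same approach applies to a database that was created manually with CREATE DATABASE and therefore was not registered automatically.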
Other Oracle tools automatically add newly created components to the Oracle Restart configuration. This happens when you:
· Create a database with OUI or DBCA
· Create an Oracle ASM instance with OUI, DBCA or ASMCA
· Create a disk group using any method
· Create a listener with NETCA
· Create a database service with srvctl
In these cases, the newly created components are not added to the Oracle Restart configuration:
· Create a database with the CREATE DATABASE statement in SQL*Plus
· Create a database service by modifying the SERVICE_NAMES initialization parameter
· Create a database service with the DBMS_SERVICE.CREATE_SERVICE procedure
· Create a standby database
Drop / remove / delete operations that update the Oracle Restart configuration:
· Delete a database with DBCA
· Delete a listener with NETCA
· Drop an Oracle ASM disk group, using any method
· Delete a database service with srvctl
Drop / remove / delete operations that don’t update the Oracle Restart configuration:
· Delete a database by removing database files with OS commands
· Delete a database service without using srvctl
Data Guard
Oracle Restart is integrated with Data Guard and the Data Guard broker.
Following a Data Guard role transition, all database services configured to run in the new role are started, and all the services that are not configured to run in the new role are stopped.
When you add a database to the Oracle Restart configuration, you can specify the current Data Guard role for the database: PRIMARY, PHYSICAL_STANDBY, LOGICAL_STANDBY, or SNAPSHOT_STANDBY.
If the role is later changed using the Data Guard broker, then the new role is automatically updated in the Oracle Restart configuration. If you change the database role without the broker, then you must manually update the database role in the Oracle Restart configuration using srvctl.
When adding a database service to the Oracle Restart configuration, you can specify one or more Data Guard roles for the service. If you have this option configured, then Oracle Restart starts the service only if one of the service roles matches the current database role.
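The role handling described above could look roughly like this with srvctl. The database and service names are placeholders and the helper names are hypothetical. As a precaution the script prints the commands by default.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set SRVCTL=srvctl to run them for real.
SRVCTL="${SRVCTL:-echo srvctl}"

# Hypothetical helper: record a new role after a role change that was made
# without the Data Guard broker. $1 = DB_UNIQUE_NAME, $2 = role
set_database_role() {
    $SRVCTL modify database -d "$1" -r "$2"
}

# Hypothetical helper: add a service that should only run in given roles.
# $1 = DB_UNIQUE_NAME, $2 = service name, $3 = comma-separated role list
add_role_service() {
    $SRVCTL add service -d "$1" -s "$2" -l "$3"
}

# Placeholder names: a reporting service that runs only on a physical standby.
add_role_service orcl_stby reports PHYSICAL_STANDBY
set_database_role orcl_stby PHYSICAL_STANDBY
```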
Oracle Restart uses Oracle Notification Services (ONS) and Oracle Advanced Queuing to publish Fast Application Notification (FAN) high availability events. Integrated Oracle clients use FAN to notify applications quickly when an instance or service goes down. The client can then automate the failover between a primary database and a standby database.
Fast Application Notification(FAN)
Oracle Restart uses FAN to
notify other processes about configuration changes, and service status changes
that could be UP or DOWN events.
Integrated Oracle clients receive the events and respond.
Applications can respond by propagating
the error to the user, or by resubmitting the transaction and masking the error
from the application user. When a DOWN
event occurs, integrated clients immediately clean up the connections. When an UP event occurs, integrated clients
create new connections to the new primary database instance.
Oracle Restart publishes FAN events whenever a managed instance or service goes up or down. After a failover, the Oracle Data Guard broker publishes FAN events. These events can be used in the following ways:
· Applications can use FAN with Oracle Restart without programmatic changes, provided that they use one of these Oracle integrated database clients, which can be configured for Fast Connection Failover (FCF) to automatically connect to a new primary database after a failure:
o Oracle Database JDBC
o Universal Connection Pool for Java
o Oracle Call Interface
o Oracle Database ODP.NET
· FAN server-side callouts can be configured on the database tier
For DOWN events, such as a failed primary database, FAN provides immediate notification to the clients. This enables the clients to fail over to the new primary database as fast as possible. The clients don’t have to wait for a timeout: they are notified immediately and, if they are configured to fail over, can do so before the timeout occurs.
For UP events, when services and instances are started, new connections can immediately be created to take advantage of the extra resources.
Using server-side callouts, FAN can:
· Log status information
· Open support tickets, and page DBAs when resources fail to start
· Automatically start external dependent applications that need to be co-located with a service
FAN events are published using ONS and Oracle Streams Advanced Queuing. The queues are configured automatically when you create a service. ONS must be manually configured using srvctl.
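A minimal sketch of the manual ONS configuration, assuming the default ports and no remote ONS hosts, might look like this. The configure_ons function name is my own. Your environment may need additional srvctl options for ports and remote hosts, so treat this as a starting point rather than a complete setup. The commands are printed by default.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set SRVCTL=srvctl to run them for real.
SRVCTL="${SRVCTL:-echo srvctl}"

# Hypothetical helper: register, enable and start ONS under Oracle Restart.
configure_ons() {
    # Add ONS to the Oracle Restart configuration
    # (skip this step if ONS is already registered).
    $SRVCTL add ons
    # Allow Oracle Restart to start and manage ONS.
    $SRVCTL enable ons
    $SRVCTL start ons
    # Confirm that ONS is running.
    $SRVCTL status ons
}

configure_ons
```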
The Connection Manager (CMAN) and Oracle Net listeners are integrated with FAN events. This enables CMAN and the listeners to immediately de-register services that are associated with the failed instance. This is important, because it avoids requests being sent to a service or instance that is not available.
Oracle achieves high availability with Oracle Restart and FAN. When Oracle Restart detects an outage, it isolates the failed component and recovers the dependent components. If the failed component is a database instance, then after Data Guard has failed over to the standby database, Oracle Restart starts any services that are defined with the current role.
FAN events are published by Oracle Restart and the Oracle Data Guard broker through ONS and Advanced Queuing. FAN callouts are another way that you can perform notifications. Callouts are run asynchronously, and are subject to scheduling variability. Therefore, with callouts you can’t guarantee the order of events.
Oracle Restart restarts and recovers services and instances automatically. This includes starting and recovering the listener processes and the ASM instance if required. You can also use FAN callouts to report faults to your fault management system, and to initiate repair jobs.
Managing Planned Outages with Oracle Restart
For repairs, upgrades, and maintenance that require you to shut down the primary database, Oracle Restart provides interfaces that disable and enable services to minimize disruption to application users. To achieve a coordinated failover of the database service from the primary to the standby database, you should use Oracle Data Guard with Oracle Restart. Once the maintenance is complete, you can revert the service back to normal operation.
The important configuration setting with Oracle Restart is the management policy:
AUTOMATIC - the service will start automatically
MANUAL - you will have to start the service manually
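As a rough sketch of a planned outage, the following helpers stop and disable a service before maintenance, re-enable it afterwards, and change the management policy of a database. All names are placeholders and the function names are hypothetical. The commands are printed by default so that nothing is stopped by accident.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set SRVCTL=srvctl to run them for real.
SRVCTL="${SRVCTL:-echo srvctl}"

# Hypothetical helper: take a service out of use before maintenance.
# $1 = DB_UNIQUE_NAME, $2 = service name
service_outage_begin() {
    $SRVCTL stop service -d "$1" -s "$2"
    # A disabled service is not started automatically by Oracle Restart.
    $SRVCTL disable service -d "$1" -s "$2"
}

# Hypothetical helper: return the service to normal operation.
service_outage_end() {
    $SRVCTL enable service -d "$1" -s "$2"
    $SRVCTL start service -d "$1" -s "$2"
}

# Hypothetical helper: set the management policy (AUTOMATIC or MANUAL).
# $1 = DB_UNIQUE_NAME, $2 = policy
set_management_policy() {
    $SRVCTL modify database -d "$1" -y "$2"
}

service_outage_begin orcl sales   # placeholder names
service_outage_end orcl sales
set_management_policy orcl AUTOMATIC
```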
FAN Events, for high availability
Here is a description of the FAN
event record parameters, and a description of their meanings:
Parameter | Description
VERSION | Version of the event record. Used to identify release changes.
EVENT_TYPE | Event types include SERVICE, SERVICE_MEMBER, DATABASE, INSTANCE, NODE, ASM, SRV_PRECONNECT. Database and instance types provide the database service, like DB_UNIQUE_NAME.DB_DOMAIN.
DATABASE UNIQUE NAME | The database that is supporting the service; matches the initialization parameter DB_UNIQUE_NAME, which defaults to the value of the initialization parameter DB_NAME.
INSTANCE | The name of the instance that supports the service; matches the ORACLE_SID value.
NODE NAME | The name of the node that supports the service, or the name of the node that has stopped; matches the node name known to Cluster Synchronization Services (CSS).
SERVICE | The service name; matches the service name in DBA_SERVICES.
STATUS | UP, DOWN, NOT_RESTARTING, PRECONN_UP, PRECONN_DOWN, UNKNOWN
REASON | Data_guard_failover, Failure, Dependency, User, Autostart, Restart
CARDINALITY | The number of service members that are currently active; included in service UP events.
TIMESTAMP | The local time zone; this is used when ordering notification events.
A FAN record matches the signature of each session, which you can read from the system context (sys_context):
SERVICE | sys_context('userenv','service_name')
DATABASE UNIQUE NAME | sys_context('userenv','db_unique_name')
INSTANCE | sys_context('userenv','instance_name')
NODE NAME | sys_context('userenv','server_host')
Using FAN callouts
FAN callouts are server-side executables. Oracle Restart executes a FAN callout immediately when a high availability event occurs.
FAN callouts can be used to automate the following activities:
· Opening fault tracking tickets
· Sending messages to pagers
· Sending email
· Starting and stopping server-side applications
· Maintaining an uptime log, in which each event is logged as it occurs
To make use of FAN callouts, you can place executables or shell scripts in this directory: $GRID_HOME/racg/usrco. You can have the same scripts on both the primary and standby nodes.
For example, the callout script $GRID_HOME/racg/usrco/Callout.sh:
#!/bin/ksh
FAN_LOGFILE=[<custom path>]/admin/log/`hostname`_uptime.log
echo $* "reported="`date` >> $FAN_LOGFILE &
You could get something like the following as output from the previous script:
NODE VERSION=1.3 host=AIX5 status=nodedown reason= Timestamp=12-AUG-2012 09:10:00 reported=<formatted date and time stamp>
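A callout can also inspect the event before acting on it. The sketch below assumes that the event arrives as command-line arguments in the form shown in the sample output, that is, an event type followed by name=value pairs. The function names fan_field and handle_event, and the choice to alert only on down events, are my own illustration rather than anything Oracle prescribes.

```shell
#!/bin/sh
# fan_field NAME ARGS...
# Prints the value of the first NAME=value argument; returns 1 if absent.
fan_field() {
    key="$1"
    shift
    for arg in "$@"; do
        case "$arg" in
            "$key"=*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${arg#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# handle_event is a hypothetical callout body.
# $1 is the event type; the remaining arguments are name=value pairs.
handle_event() {
    event_type="$1"
    status=$(fan_field status "$@")
    host=$(fan_field host "$@")
    # Only react to down events; everything else is ignored here.
    if [ "$status" = "nodedown" ] || [ "$status" = "down" ]; then
        echo "ALERT: $event_type on ${host:-unknown host} is $status"
    fi
}

# Sample arguments taken from the output shown above.
handle_event NODE VERSION=1.3 host=AIX5 status=nodedown reason=
```

In a real callout you would call handle_event "$@" and replace the echo with whatever opens a ticket or sends a page in your environment.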
Because FAN records match the sys_context signature, you can determine which sessions are affected by a FAN record.
Oracle has integrated FAN with many of the Oracle client drivers that are used to connect to a database configured with Oracle Restart. You can therefore use FAN by using one of the integrated clients: CMAN session pools, Oracle Call Interface, Universal Connection Pool for Java, the JDBC simplefan API, and ODP.NET connection pools.
The goal should be to enable applications to consistently obtain connections to the current or available primary database.
Mark Tiger,
P.S. I am busy preparing nine books on the Oracle 11g R2 OCM DBA exam. Watch this spot for details on the books, as they start becoming available.
Hi,
I built a single-instance database (on a file system) manually with the "create database ..." command on Windows 2008 Server, and I am having issues with database restart. Is there any way to configure the database now with the "Oracle Restart" functionality? Please help.