3 Managing Site Objects

This chapter describes site objects and how the broker manages them during switchover and failover operations.

This chapter includes the following sections:

Section 3.1, "Site Objects"
Section 3.2, "Role Management"

3.1 Site Objects

A site object is the middle level of the hierarchy of objects managed by the broker. A site object corresponds to a primary or standby site in a Data Guard configuration. Through site objects, you can centrally control the states and behavior of the primary and standby databases in the configuration, such as starting up and mounting the databases, starting and stopping log transport services and log apply services, performing a switchover or failover operation, dismounting and shutting down databases, and so on.

A site object may be enabled or disabled. When disabled, a site object is no longer managed and monitored by the broker. When enabled, a site object can be in an offline or an online state.

Offline: If a site's state is offline, the site has been shut down. If you take a site offline, its database instance is put into a started, nomount state. If this is a primary database resource, then the log transport services will stop sending archived redo logs to the standby database. If this is a standby database resource, then the log apply services will stop accepting and applying the archived redo logs to the standby database.
Online: If a site is online, the site is being managed by the broker and the database resource for the site will be put into its appropriate state:
- The primary database will be opened and the log transport services will ship archived redo log files to the standby databases.
- Physical standby databases will be mounted and the log apply services will apply archived redo logs to the databases.
- Logical standby databases will be opened (with the database guard set to on) and the log apply services will apply archived redo logs to the databases.

The state of a site object is dependent upon the state of the configuration containing the site, and the state of the database object is dependent upon that of the site. Thus, if a site is in an offline state, the database that is dependent on the site must also be in an offline state. Similarly, if the configuration is offline, all of the sites and resources in the configuration are also offline because all are logically dependent on the configuration object.
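The dependency described above can be sketched in a few lines. This is only an illustrative model, not broker code: the function name effective_states and the state strings are assumptions made for the example.

```python
ONLINE = "online"
OFFLINE = "offline"

def effective_states(config_state, site_state, database_state):
    """Return the (site, database) states once parent dependencies are applied.

    An offline parent forces every object below it offline.
    """
    if config_state == OFFLINE:
        site_state = OFFLINE
    if site_state == OFFLINE:
        database_state = OFFLINE
    return site_state, database_state

# An online database under an offline site is effectively offline.
print(effective_states(ONLINE, OFFLINE, ONLINE))
```

The checks are applied from the top of the hierarchy down, so an offline configuration takes both the site and its database offline in a single pass.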

When sites are enabled and in an online state, the broker manages them in their mutually exclusive roles, primary or standby:

Primary role: In this role, the primary site contains the primary database from which redo logs are transmitted to one or more standby sites.
Standby role: In this role, a standby site contains a standby database on which redo logs are received and applied to the standby database.

Thus, if a site is in a primary role, the database that is dependent on the site must also be in a primary role. With the broker, you can change these roles dynamically as a planned transition called a switchover operation, or you can change these roles as a result of a database failure through either a graceful failover or a forced failover operation. These are known as role transitions. The broker manages the steps involved in switchover and failover operations automatically for you by coordinating the role transitions for all of the affected sites and their dependent databases.

In configurations that include multiple standby sites, the standby sites that are not involved in the role transition are referred to as bystanders.
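As a rough illustration of a role transition and of which sites count as bystanders, consider the following sketch. The site names (boston, chicago, denver) and the function switch_roles are hypothetical and exist only for this example. The sketch models the bookkeeping only, not the broker's actual coordination of the databases.

```python
def switch_roles(roles, target):
    """Move the primary role onto `target` and return (new_roles, bystanders).

    `roles` maps a site name to "primary" or "standby".
    """
    if roles.get(target) != "standby":
        raise ValueError(f"{target!r} is not a standby site")
    old_primary = next(name for name, role in roles.items() if role == "primary")
    new_roles = dict(roles)          # leave the caller's mapping untouched
    new_roles[old_primary] = "standby"
    new_roles[target] = "primary"
    # Every site that is neither the old nor the new primary is a bystander.
    bystanders = [name for name in roles if name not in (old_primary, target)]
    return new_roles, bystanders

roles = {"boston": "primary", "chicago": "standby", "denver": "standby"}
new_roles, bystanders = switch_roles(roles, "chicago")
print(new_roles, bystanders)
```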

3.2 Role Management

When the primary site fails, such as when a system or software failure occurs, you may need to transition one of its corresponding standby sites to take over the primary role by performing a failover operation. Even in the absence of a disaster, you may have reason to perform a switchover operation to direct one of the standby sites to assume the role of being the primary site, while the former primary site assumes the role of being a standby site.

Without the broker, failover and switchover operations are manual processes that can be automated only by using script-based solutions. For example, if a physical standby site is in read-only mode (log apply services are offline) when a failure occurs on the primary site, you must change the standby database to managed recovery mode, apply archived redo logs that have not yet been applied to the standby database, and fail over the standby database to the primary role.

The broker simplifies the switchover or failover operations by allowing you to invoke them through a single command and then coordinating role transitions on all sites in the configuration.

Note:

If you are using Data Guard Manager and there are both physical and logical standby sites in the configuration, the broker will perform the switchover operation to a physical standby site. Data Guard Manager will switch over a logical standby site to the primary role only if there is no viable (enabled and online with NORMAL status) physical standby site. For failover operations, Data Guard Manager will fail over to the physical or logical standby database that you specify as the target of the failover.
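The selection rule in the note above can be sketched as follows. The Standby record and the function names are assumptions made for illustration, and the sketch further assumes that a logical standby site must also be viable to be chosen, which the note does not state explicitly.

```python
from typing import NamedTuple

class Standby(NamedTuple):
    name: str
    kind: str       # "physical" or "logical"
    enabled: bool
    online: bool
    status: str     # "NORMAL" is the only value the note treats as viable

def is_viable(site):
    """Viable means enabled and online with NORMAL status."""
    return site.enabled and site.online and site.status == "NORMAL"

def pick_switchover_target(standbys):
    """Prefer a viable physical standby; fall back to a viable logical one."""
    viable = [site for site in standbys if is_viable(site)]
    for kind in ("physical", "logical"):
        for site in viable:
            if site.kind == kind:
                return site.name
    return None  # no viable standby site at all
```

For example, given one viable logical site and one disabled physical site, the function returns the logical site, because no viable physical site exists.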

3.2.1 Managing Switchover Operations

You can switch a site role from primary to standby, as well as from standby to primary, without resetting the online redo logs of the associated new primary database. This is known as a database switchover operation, because the standby database on the site that you specify becomes the primary database, and the original primary database becomes a standby database. There is no loss of application data, the data does not diverge between the original and the new primary database after the switchover operation completes, and there is no need to restart the bystander databases.

Whenever possible, perform a switchover operation to a physical standby site:

If the switchover operation transitions a physical standby site to the primary role, then the original primary site will be switched to a physical standby role. The redo logs are continuously shipped from the new primary database to all standby sites in the configuration.
If the switchover operation transitions a logical standby site to the primary role, then the original primary site will be switched to a logical standby role. If there are physical bystanders in the configuration, they will not be able to serve as standby sites to the new primary site, because the new log stream has become that of a logical standby site.
If the switchover operation transitions a physical standby site to the primary role, then both the primary database and the target standby database will be restarted after the switchover operation completes.
If the switchover operation transitions a logical standby site to the primary role, nothing needs to be restarted after the switchover operation completes. Neither the primary database nor the logical standby databases need to be restarted.

Warning:
Switchover operations to a logical standby database will result in the physical standby databases being permanently disabled in the configuration.
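The consequences listed above depend only on the type of the target standby site, so they can be summarized in a small lookup. The function and key names below are assumptions made for this sketch.

```python
def switchover_outcome(target_kind):
    """Summarize what a switchover to the given standby type implies."""
    if target_kind == "physical":
        return {
            "old_primary_becomes": "physical standby",
            "restart_required": True,              # primary and target are restarted
            "physical_bystanders_disabled": False,
        }
    if target_kind == "logical":
        return {
            "old_primary_becomes": "logical standby",
            "restart_required": False,             # nothing needs to be restarted
            "physical_bystanders_disabled": True,  # see the warning above
        }
    raise ValueError(f"unknown standby type: {target_kind!r}")

print(switchover_outcome("logical"))
```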

3.2.1.1 Before You Perform a Switchover Operation

Consider the following points before you begin a switchover operation:

When you start a switchover operation, the broker verifies that at least one standby database (including the original primary database that is about to be transitioned to the standby role) is configured to support the overall protection mode (maximum protection, maximum availability, or maximum performance).
You should prepare the primary database in advance for its possible future role as a standby database. For example, if the primary site might be transitioned to a physical standby role and the LogXptMode property is set to SYNC or ASYNC, then you need to set up standby redo logs on the primary site. If you pre-set database properties for the standby database role, note that these properties are not verified by the broker until you actually switch over the primary database to the standby role.
After a switchover operation completes, the overall Data Guard protection mode (maximum protection, maximum availability, or maximum performance) remains at the same protection level it was in prior to the switchover operation. Also, the log transport mode (SYNC, ASYNC, or ARCH) of bystanders does not change after a switchover operation. Log apply services for all bystanders automatically begin applying archived redo logs from the new primary database.
If there are both logical and physical standby databases in the configuration and the switchover operation occurs to a logical standby database, you will need to reinstantiate all physical bystanders in the new configuration after the switchover operation completes.

3.2.1.2 Starting a Switchover Operation

The act of switching roles should be a well-planned activity. The primary and standby databases involved in the site switchover operation should have as small a transactional lag as possible. Oracle Corporation highly recommends that you consider performing a full, consistent backup of the primary database prior to starting the switchover operation. (Oracle9i Data Guard Concepts and Administration provides detailed information about setting up the sites and databases in preparation for a switchover operation.)

To start a switchover operation using Data Guard Manager, select the Data Guard broker configuration and select Switchover from the right-click menu to invoke the Switchover wizard. When using the CLI, you need to issue only one SWITCHOVER command to specify the name of the standby site that you want to change into the primary role.

The broker controls the rest of the switchover operation, as described in Section 3.2.1.3.

3.2.1.3 How the Broker Performs a Switchover Operation

Once you start the switchover operation, the broker:

Verifies that the primary and the target standby sites and databases are in the following states:
- The primary site and database must be enabled and online, with log transport services started. (For the CLI, this is the READ-WRITE-XPTON substate.)
- A participating physical standby site and database must be enabled and online, with log apply services started. (For the CLI, this is the PHYSICAL-APPLY-ON substate.)
- A participating logical standby site and database must be enabled and online, with log apply services started. (For the CLI, this is the LOGICAL-APPLY-ON substate.)
The broker allows the switchover operation to proceed as long as there are no errors for the primary site and standby site that you selected to participate in the switchover operation. However, errors occurring for any bystanders will not stop the switchover operation.
Switches roles between the primary and standby sites.

The broker first converts the original primary database to run in the standby role. Then, the broker transitions the target standby database to the primary role. If any errors occur during either conversion, the broker stops the switchover operation. See Section 3.2.1.4 for more information.
Updates the Data Guard configuration file to record the change in roles.

Because the configuration file describes all site and resource objects in the configuration, this ensures that each object will run in the correct role.
Restarts the new primary database if the switchover operation occurs with a physical standby database, opening it in read/write mode, and starts log transport services shipping archived redo logs to the standby databases, including to the former primary database. If the switchover operation occurs to a logical standby database, then there is no need to restart any databases.
Restarts the new standby database if the switchover operation occurs with a physical standby database, and log apply services begin applying archived redo logs shipped from the new primary database.

The broker verifies the state and status of the database resources on each site to ensure that the switchover operation has transitioned the sites to their new roles correctly. Bystanders will continue operations in the state they were in before the switchover operation. For example, if a bystander physical standby database was in read-only mode, it will remain in that mode after switchover completes. Log apply services for all bystanders automatically begin applying archived redo logs from the new primary database.
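The verification step at the start of this procedure can be modeled as a simple precheck. The substate names come from the text above. The dictionary layout and the function names are assumptions made for this sketch and do not reflect how the broker stores its state.

```python
REQUIRED_SUBSTATE = {
    "primary": "READ-WRITE-XPTON",
    "physical": "PHYSICAL-APPLY-ON",
    "logical": "LOGICAL-APPLY-ON",
}

def precheck(site):
    """Return the reasons why `site` cannot take part in a switchover."""
    problems = []
    if not site["enabled"]:
        problems.append("not enabled")
    if not site["online"]:
        problems.append("not online")
    required = REQUIRED_SUBSTATE[site["kind"]]
    if site["substate"] != required:
        problems.append("expected substate " + required)
    if site["errors"]:
        problems.append("has errors")
    return problems

def can_switch_over(primary, target, bystanders=()):
    # Bystanders are accepted but deliberately not checked:
    # their errors do not stop the switchover operation.
    return not precheck(primary) and not precheck(target)

primary = {"kind": "primary", "enabled": True, "online": True,
           "substate": "READ-WRITE-XPTON", "errors": []}
target = {"kind": "physical", "enabled": True, "online": True,
          "substate": "PHYSICAL-APPLY-ON", "errors": []}
bystander = {"kind": "logical", "enabled": True, "online": True,
             "substate": "LOGICAL-APPLY-ON", "errors": ["hypothetical error"]}
print(can_switch_over(primary, target, [bystander]))
```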

3.2.1.4 Troubleshooting Switchover Operations

If the switchover operation fails due to problems with the configuration, the broker reports any problems it encounters. In general, you can choose another site for the switchover operation or fix the problem and then retry the switchover operation. The following subsections describe how to recover from the most common problems.

Problems Transitioning the Primary Site to the Standby Role

If the error messages returned indicate a problem when transitioning the original primary site and database to the standby role (including stopping log transport services and starting log apply services), use these general guidelines to fix the problem:

Investigate the error message returned by the broker to find the source of the problem on the primary site and correct it. For example, you can look in the Data Guard Manager Viewlog for alert log information.
Reenable the configuration to refresh and restore the sites and database resources to their original roles and states.
Perform the switchover operation again.

Problems Transitioning the Standby Site to the Primary Role

If the error messages that have been returned indicate that a problem occurred when transitioning the original standby database to the primary role (including stopping log apply services and starting log transport services), use these general guidelines to fix the problem:

Disable the configuration.
Investigate the error messages returned by the broker to find the source of the problem on the standby site and correct it.
Restart the original primary database to run in the standby role. (You must restart this site as a standby site and database because the switchover operation has already successfully transitioned it to run in the standby role.)
Execute SQL*Plus commands to convert the new standby database back to running in the primary database role. To do this, perform the following steps:
1. Locate the trace file in the log directory where you issued the SQL statements to create the control file for the original primary database.
2. Extract the SQL commands from the trace file into a temporary file and execute the file from the SQL*Plus command line.
3. Execute the SHUTDOWN IMMEDIATE command on the original primary database instance to shut it down so that it can be restarted.
Restart the original primary database as the primary database.
Reenable the configuration.
Perform the switchover operation again.

3.2.2 Managing Failover Operations

Database failover transitions one of the standby sites to the role of primary site. You should perform a failover operation only when a catastrophic failure occurs on the primary site, and there is no possibility of recovering the primary site and database in a timely manner. The failed primary site is discarded and the target standby site and database assume the primary role.

The broker supports two grades of failover operations:

Graceful failover

This is the recommended failover option. Graceful failover automatically recovers some or all of the original primary database application data and attempts to bring along any bystander sites and databases to continue serving as standby databases to the new primary database:
- After a graceful failover to a physical standby database, the original primary database must be re-created. In addition, some physical standby databases may be permanently disabled if the broker detects that the data has diverged from the new primary database. However, physical standby databases that were disabled during the failover operation may be salvaged if the required logs are available and can be recovered. Otherwise, you must re-create bystanders that were permanently disabled prior to the failover operation or required reinstantiation as a result of the failover operation before they can serve as standby sites to the new primary site.
- After a graceful failover to a logical standby database, the broker attempts to reinstate logical standby bystanders. However, the failover operation may result in all logical standby databases being permanently disabled under some circumstances, for example, if there is a gap in the log sequence and the logical standby bystanders cannot finish applying all of the redo data that the target logical standby database had applied prior to the failover operation. All physical standby databases will be permanently disabled when the failover occurs to a logical standby database.
Forced failover

Do not perform a forced failover to a standby site except in an emergency. Forced failover may result in lost application data even when standby redo logs are configured on the (physical) standby database. A consequence of a forced failover operation is that you must re-create the original primary database and all bystanders before they can serve as standby sites to the new primary site. Another consequence is that there may be lost application data unless the standby and primary databases had been configured to run in maximum protection mode prior to the failover, and all logs have been successfully applied to the standby database.

Depending on the log transport services destination attributes, a graceful failover may provide no data loss or minimal data loss. A forced failover may result in data loss. Always try to perform a graceful failover operation; only when a graceful failover is unsuccessful should you perform a forced failover operation.

Note:

After a failover operation, the overall Data Guard protection mode is always reset to the maximum performance mode. The log transport mode (SYNC, ASYNC, or ARCH) of the bystanders does not change.
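The effect of a role transition on the overall protection mode can be stated compactly. The sketch below combines the note above with the switchover behavior described in Section 3.2.1.1. The function name is an assumption made for the example.

```python
MODES = ("maximum protection", "maximum availability", "maximum performance")

def protection_mode_after(operation, mode_before):
    """Return the overall protection mode once a role transition completes."""
    if mode_before not in MODES:
        raise ValueError(f"unknown protection mode: {mode_before!r}")
    if operation == "switchover":
        return mode_before               # a switchover keeps the mode unchanged
    if operation == "failover":
        return "maximum performance"     # a failover always resets the mode
    raise ValueError(f"unknown operation: {operation!r}")

print(protection_mode_after("failover", "maximum availability"))
```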

3.2.2.1 Starting a Failover Operation

To start a failover operation using Data Guard Manager, select the Data Guard configuration in the navigator tree and then select Failover from the right-click menu to invoke the Failover wizard. The Failover wizard guides you through the steps necessary to transition one of the standby sites into the primary role. When using the CLI, you issue one FAILOVER command that specifies the name of the standby site that you want to change into the primary role, and the keyword GRACEFUL or FORCED to specify the type of failover operation.

The standby site that is the target of the failover operation should be a physical standby site in an enabled state. You can fail over to logical standby sites only if there are no enabled physical standby sites in the configuration.

After the failover operation, the overall protection mode of the new configuration (maximum protection, maximum availability, or maximum performance) is reset to the maximum performance mode, which is the default.

The broker controls the failover operation steps described in Section 3.2.2.2. However, you must perform the additional steps described in Section 3.2.2.4 after the failover operation completes.

3.2.2.2 How the Broker Performs a Graceful Failover Operation

Once you start the failover operation, the broker:

Verifies that the target standby site and database are in the enabled state. (For the CLI, this is the PHYSICAL-APPLY-ON substate for physical standby databases, or the LOGICAL-APPLY-ON substate for logical standby databases.) If the database is not enabled, then you will not be able to perform a failover operation to this site.
Waits for the target standby site to finish applying any remaining archived redo logs before stopping log apply services on it.

Updates the Data Guard configuration file to record the change in roles.

If a bystander was in an online state, then the bystander will be restarted in the state it was in before the failover operation. For example, if a physical standby database was operating in read-only mode, it will remain in read-only mode. If a bystander was in the offline state, then it will be taken to its default online state during the failover operation.

Note:

Standby bystanders may be permanently disabled during a graceful failover operation, and they must be re-created in the configuration before they can serve as standby sites to the new primary site. A graceful failover to a logical standby database may result in all logical standby bystanders being permanently disabled, and it always results in all physical standby databases being permanently disabled.

Transitions the target standby site into the primary role, opens the new primary database in read/write mode, and starts log transport services that begin shipping archived redo logs to bystanders.

The broker allows the failover operation to proceed as long as there are no errors for the standby site that you selected to participate in the failover operation. However, errors occurring for any bystanders will not stop the failover operation. If you initiated a graceful failover operation and it fails, you might need to restart it as a forced failover operation.

3.2.2.3 How the Broker Performs a Forced Failover Operation

Once you start the failover operation, the broker:

Verifies that the target standby site and database are enabled. If the standby site is not enabled for management by the broker, then the failover operation cannot occur.
Stops log apply services on the standby site immediately, without waiting for log apply services to finish applying the available archived redo logs. Note that this may result in some data loss.
Updates the Data Guard configuration file to record the change in roles.
Transitions the target standby site into the primary role, opens the new primary database in read/write mode, and starts log transport services.

Because a forced failover operation starts a new log stream from the new primary site, all bystanders are permanently disabled in the broker configuration. These standby sites are left in an online state, but they are no longer manageable by the broker.

The broker allows the failover operation to proceed as long as there are no errors for the standby site that you selected to participate in the failover operation.

3.2.2.4 Completing the Failover Operation

You must perform recovery steps after the failover operation completes:

After a graceful or forced failover operation completes, the original, failed primary database and the new primary database have diverged. The original primary database is permanently disabled by the broker until such time as the database can be reinstantiated as a standby to the new primary database.
After a graceful failover completes, any of the bystander standby sites that determine for themselves that they cannot continue as a viable standby for the new primary will be permanently disabled by the broker.

For instance, this could happen if a bystander finds that it has applied more logs than the new primary itself has applied, hence diverging from the new primary. The bystander must be reinstantiated before it may serve as a standby for the new primary database.
After a graceful failover to a logical standby completes, all physical bystander standby databases in the configuration have diverged from the new primary database. The broker permanently disables the physical bystanders. They must be reinstantiated before they can serve as standby to the new primary database.
After a forced failover completes, the new primary database has diverged from all bystander standby sites regardless of their type. The broker permanently disables all of them. They must be reinstantiated before they can serve as standby to the new primary database.
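The rules above for bystanders reduce to a short decision function. This is a sketch under stated assumptions: the function name and the can_continue flag (standing for the bystander's own determination that it remains a viable standby) are hypothetical. The original primary database is not covered here because it is always permanently disabled.

```python
def bystander_disabled(failover_type, target_kind, bystander_kind, can_continue):
    """Return True if the broker permanently disables the bystander.

    failover_type:  "graceful" or "forced"
    target_kind:    "physical" or "logical" (type of the new primary's old role)
    bystander_kind: "physical" or "logical"
    can_continue:   False if the bystander has diverged from the new primary
    """
    if failover_type == "forced":
        return True                      # all bystanders, regardless of type
    if target_kind == "logical" and bystander_kind == "physical":
        return True                      # physical bystanders cannot follow a logical primary
    return not can_continue              # otherwise only diverged bystanders

print(bystander_disabled("graceful", "physical", "physical", True))
```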

To recover a permanently disabled site for broker operation:

Remove the site object from the configuration. This also removes any dependent objects, that is, the database object that depends upon that site.
Reinstantiate the database itself from the new primary database using the procedures described in Oracle9i Data Guard Concepts and Administration.
Restore the site and its database to the broker configuration.
Enable the restored site. The newly reinstantiated standby database will begin serving as standby to the new primary database.

3.2.2.5 Troubleshooting Failover Operations

Although it is possible for a failover operation to stop, it is very unlikely. If an error occurs, it is likely to happen when the standby site is transitioning to the primary role. If the error messages that have been returned indicate that this is when the problem occurred, use these general guidelines to fix the problem:

Investigate the error message returned by the broker to find the source of the problem and correct it.
Perform the failover operation again.
