6
User-Managed Media Recovery Scenarios

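This chapter describes how to recover from common media failures. It covers the loss of datafiles, recovery through an added datafile, recovery of transportable tablespaces, the loss of online and archived redo log files, recovery from user errors, and media recovery in a distributed environment.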

Recovering After the Loss of Datafiles: Scenarios

If a media failure affects datafiles, then the recovery procedure depends on:

- The archiving mode of the database: ARCHIVELOG or NOARCHIVELOG
- The type of media failure
- The files affected by the media failure

The following sections explain the appropriate recovery strategy for each archiving mode.

Losing Datafiles in NOARCHIVELOG Mode

If either a permanent or temporary media failure affects any datafiles of a database operating in NOARCHIVELOG mode, then Oracle automatically shuts down the database. Depending on the type of failure, use one of the following recovery methods:

If the media failure is . . .	Then . . .
Temporary	Correct the hardware problem and restart the database. Usually, crash recovery is possible, and all committed transactions can be recovered using the online redo log.
Permanent	Follow the procedure "Performing Complete User-Managed Media Recovery".

Losing Datafiles in ARCHIVELOG Mode

If either a permanent or temporary media failure affects the datafiles of a database operating in ARCHIVELOG mode, then the following scenarios can occur.

Damaged Datafiles	Database Status	Solution
Datafiles in the `SYSTEM` tablespace or datafiles with active rollback or undo segments.	Oracle shuts down.	If the hardware problem is temporary, then fix it and restart the database. Usually, crash recovery recovers lost transactions. If the hardware problem is permanent, then refer to "Performing Closed Database Recovery".
Datafiles not in the `SYSTEM` tablespace or datafiles that do not contain active rollback or undo segments.	Oracle takes affected datafiles offline, but the database stays open.	If the unaffected portions of the database must remain available, then do not shut down the database. Take tablespaces containing problem datafiles offline using the temporary option, then follow the procedure in "Performing Datafile Recovery in an Open Database".
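
The two tables above amount to a small decision procedure. As a rough illustration only, the following Python sketch encodes them. The function name `datafile_loss_action` and its parameters are hypothetical and not part of any Oracle tool, and the returned strings simply paraphrase the table cells.

```python
def datafile_loss_action(archivelog: bool, permanent: bool, critical: bool = False) -> str:
    """Paraphrase the action the tables prescribe for a datafile media failure.

    archivelog: True if the database runs in ARCHIVELOG mode.
    permanent:  True for a permanent failure, False for a temporary one.
    critical:   True if a damaged datafile is in the SYSTEM tablespace or holds
                active rollback or undo segments (only consulted in ARCHIVELOG mode).
    """
    if not archivelog:
        # NOARCHIVELOG mode: Oracle shuts down the database for any datafile failure.
        if permanent:
            return 'Follow "Performing Complete User-Managed Media Recovery".'
        return "Correct the hardware problem and restart the database."
    if critical:
        # ARCHIVELOG mode, SYSTEM or active rollback/undo datafiles: Oracle shuts down.
        if permanent:
            return 'Follow "Performing Closed Database Recovery".'
        return "Correct the hardware problem and restart the database."
    # ARCHIVELOG mode, other datafiles: Oracle takes them offline, database stays open.
    return ("Take the affected tablespaces offline with the temporary option, then "
            'follow "Performing Datafile Recovery in an Open Database".')


print(datafile_loss_action(archivelog=True, permanent=True, critical=True))
```

In ARCHIVELOG mode the sketch ignores the `permanent` flag for noncritical datafiles, because the table prescribes the same open-database procedure in both cases.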

Recovering Through an Added Datafile: Scenario

If database recovery with a backup control file rolls forward through a CREATE TABLESPACE or an ALTER TABLESPACE ADD DATAFILE operation, then Oracle stops recovery when applying the redo record for the added files and lets you confirm the filenames.

For example, suppose you make a whole database backup, and then later create a new tablespace containing two datafiles: /oracle/dbs/db2.f and /oracle/dbs/db3.f. If you later restore a backup control file and perform media recovery through the CREATE TABLESPACE operation, then Oracle may signal the following error when applying the CREATE TABLESPACE redo data:

ORA-00283: recovery session canceled due to errors 
ORA-01244: unnamed datafile(s) added to controlfile by media recovery
ORA-01110: data file 3: '/oracle/dbs/db2.f'
ORA-01110: data file 2: '/oracle/dbs/db3.f'

To recover through an ADD DATAFILE operation:

View the files added by selecting from V$DATAFILE. For example:

SELECT FILE#,NAME 
FROM V$DATAFILE;

FILE#           NAME
--------------- ----------------------
1               /oracle/dbs/db1.f
2               /oracle/dbs/UNNAMED00002
3               /oracle/dbs/UNNAMED00003

If multiple unnamed files exist, then determine which unnamed file corresponds to which datafile by using one of these methods:
- Open the alert_SID.log, which contains messages about the original file location for each unnamed file.
- Derive the original file location of each unnamed file from the error message and V$DATAFILE: each unnamed file corresponds to the file in the error message with the same file number.

Issue the ALTER DATABASE RENAME FILE statement to rename the datafiles. For example, enter:

ALTER DATABASE RENAME FILE '/oracle/dbs/UNNAMED00002' TO '/oracle/dbs/db3.f';
ALTER DATABASE RENAME FILE '/oracle/dbs/UNNAMED00003' TO '/oracle/dbs/db2.f';

Continue recovery by issuing the previous recovery statement. For example:
```
RECOVER AUTOMATIC DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL
```
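
The matching step in this procedure (pairing each unnamed file with the ORA-01110 message that carries the same file number) can be sketched in Python as follows. This is only an illustration: the helper `rename_statements` is hypothetical, it assumes the error text and the V$DATAFILE rows have already been copied out of the session by hand, and it only prints statements for review rather than running them.

```python
import posixpath
import re

# Matches lines such as: ORA-01110: data file 3: '/oracle/dbs/db2.f'
ORA_01110 = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")


def rename_statements(error_text, datafile_rows):
    """Build RENAME FILE statements for unnamed files.

    error_text:    the ORA- messages signaled by recovery.
    datafile_rows: (FILE#, NAME) tuples as returned by V$DATAFILE.
    """
    # Original location of each file, keyed by file number.
    original = {int(num): path for num, path in ORA_01110.findall(error_text)}
    statements = []
    for file_no, name in sorted(datafile_rows):
        is_unnamed = posixpath.basename(name).startswith("UNNAMED")
        if is_unnamed and file_no in original:
            statements.append(
                f"ALTER DATABASE RENAME FILE '{name}' TO '{original[file_no]}';"
            )
    return statements


errors = """\
ORA-00283: recovery session canceled due to errors
ORA-01244: unnamed datafile(s) added to controlfile by media recovery
ORA-01110: data file 3: '/oracle/dbs/db2.f'
ORA-01110: data file 2: '/oracle/dbs/db3.f'
"""
rows = [
    (1, "/oracle/dbs/db1.f"),
    (2, "/oracle/dbs/UNNAMED00002"),
    (3, "/oracle/dbs/UNNAMED00003"),
]
for statement in rename_statements(errors, rows):
    print(statement)
```

Files whose names do not start with UNNAMED, or whose file number does not appear in an ORA-01110 message, are left alone, so an incomplete error listing produces fewer statements rather than wrong ones.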

Recovering Transportable Tablespaces: Scenario

The transportable tablespace feature of Oracle allows a user to transport a set of tablespaces from one database to another. Transporting a tablespace into a database is like creating a tablespace with preloaded data. Using this feature is often an advantage because:

- It is faster than using the Export or SQL*Loader utilities because it involves only copying datafiles and integrating metadata.
- You can use it to move index data, which avoids the need to rebuild indexes.

Like normal tablespaces, transportable tablespaces are recoverable. While you can recover normal tablespaces without a backup, you must have a version of the transported datafiles in order to recover a transported tablespace.

To recover a transportable tablespace:

If the database is open, then take the transported tablespace offline. For example, if you want to recover the users tablespace, then issue:
```
ALTER TABLESPACE users OFFLINE IMMEDIATE;
```
Restore a backup of the transported datafiles using an operating system utility. The backup can be the initial version of the transported datafiles or any backup taken after the tablespace is transported. For example, enter:
```
% cp /backup/users.dbf /oracle/dbs/users.dbf
```
Recover the tablespace as normal. For example, enter:
```
RECOVER TABLESPACE users
```

Oracle may signal ORA-01244 when recovering through a transportable tablespace operation just as when recovering through a CREATE TABLESPACE operation. In this case, rename the unnamed files to the correct locations using the procedure in "Recovering Through an Added Datafile: Scenario".

See Also:

Oracle9i Database Administrator's Guide for detailed information about using the transportable tablespace feature

Recovering After the Loss of Online Redo Log Files: Scenarios

If a media failure has affected the online redo logs of a database, then the appropriate recovery procedure depends on the following:

- The configuration of the online redo log: mirrored or non-mirrored
- The type of media failure: temporary or permanent
- The types of online redo log files affected by the media failure: current, active, unarchived, or inactive

Table 6-1 displays V$LOG status information that can be crucial in a recovery situation involving online redo logs.

Table 6-1 STATUS Column of V$LOG

Status	Description
`UNUSED`	The online redo log has never been written to.
`CURRENT`	The log is active, that is, needed for instance recovery, and it is the log to which Oracle is currently writing. The redo log can be open or closed.
`ACTIVE`	The log is active, that is, needed for instance recovery, but is not the log to which Oracle is currently writing. It may be in use for block recovery, and may or may not be archived.
`CLEARING`	The log is being re-created as an empty log after an `ALTER DATABASE CLEAR LOGFILE` statement. After the log is cleared, the status changes to `UNUSED`.
`CLEARING_CURRENT`	The current log is being cleared of a closed thread. The log can stay in this status if there is some failure in the switch such as an I/O error writing the new log header.
`INACTIVE`	The log is no longer needed for instance recovery. It may be in use for media recovery, and may or may not be archived.

The following sections describe the appropriate recovery strategies for these situations.

Recovering After Losing a Member of a Multiplexed Online Redo Log Group

If the online redo log of a database is multiplexed, and if at least one member of each online redo log group is not affected by the media failure, then Oracle allows the database to continue functioning as normal. Oracle writes error messages to the LGWR trace file and the alert_SID.log of the database.

Solve the problem by taking one of the following actions:

- If the hardware problem is temporary, then correct it. LGWR accesses the previously unavailable online redo log files as if the problem never existed.
- If the hardware problem is permanent, then drop the damaged member and add a new member by using the following procedure.

Note:
The newly added member provides no redundancy until the log group is reused.

To replace a damaged member of a redo log group:

Locate the filename of the damaged member in V$LOGFILE. The status is INVALID if the file is inaccessible:

SELECT GROUP#, STATUS, MEMBER 
FROM V$LOGFILE
WHERE STATUS='INVALID';

GROUP#    STATUS       MEMBER
-------   -----------  ---------------------
0002      INVALID       /oracle/dbs/log2b.f

Drop the damaged member. For example, to drop member log2b.f from group 2, issue:
```
ALTER DATABASE DROP LOGFILE MEMBER '/oracle/dbs/log2b.f';
```
Add a new member to the group. For example, to add log2c.f to group 2, issue:
```
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/dbs/log2c.f' TO GROUP 2;
```
If the file you want to add already exists, then it must be the same size as the other group members, and you must specify REUSE. For example:
```
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/dbs/log2b.f' REUSE TO GROUP 2;
```

Recovering After the Loss of All Members of an Online Redo Log Group

If a media failure damages all members of an online redo log group, then different scenarios can occur depending on the type of online redo log group affected by the failure and the archiving mode of the database.

If the damaged log group is inactive, then it is not needed for crash recovery; if it is active, then it is needed for crash recovery.

If the group is . . .	Then . . .	And you should . . .
Inactive	It is not needed for crash recovery	Clear the archived or unarchived group.
Active	It is needed for crash recovery	Attempt to issue a checkpoint and clear the log; if impossible, then you must restore a backup and perform incomplete recovery up to the most recent available log.
Current	It is the log that Oracle is currently writing to	Attempt to clear the log; if impossible, then you must restore a backup and perform incomplete recovery up to the most recent available log.

Your first task is to determine whether the damaged group is active or inactive.

To determine whether the damaged groups are active:

Locate the filename of the lost redo log in V$LOGFILE and then look for the group number corresponding to it. For example, enter:

SELECT GROUP#, STATUS, MEMBER FROM V$LOGFILE;

GROUP#    STATUS       MEMBER
-------   -----------  ---------------------
0001                    /oracle/dbs/log1a.f
0001                    /oracle/dbs/log1b.f
0002      INVALID       /oracle/dbs/log2a.f
0002      INVALID       /oracle/dbs/log2b.f
0003                    /oracle/dbs/log3a.f
0003                    /oracle/dbs/log3b.f

Determine which groups are active. For example, enter:

SELECT GROUP#, MEMBERS, STATUS, ARCHIVED FROM V$LOG;

GROUP#  MEMBERS           STATUS     ARCHIVED
------  -------           ---------  -----------
 0001   2                 INACTIVE   YES
 0002   2                 ACTIVE     NO
 0003   2                 CURRENT    NO

If the affected group is inactive, follow the procedure in "Losing an Inactive Online Redo Log Group". If the affected group is active (as in the preceding example), then follow the procedure in "Losing an Active Online Redo Log Group".
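
Combining the two query results is a matter of finding the groups in which every member is INVALID and then reading their status from V$LOG. The Python sketch below illustrates that cross-check on the sample output above. The function `fully_damaged_groups` is a hypothetical helper, and it assumes the rows have already been fetched from the two views.

```python
from collections import Counter


def fully_damaged_groups(logfile_rows, log_rows):
    """Return {group#: (status, archived)} for groups that lost all members.

    logfile_rows: (GROUP#, STATUS, MEMBER) tuples from V$LOGFILE.
    log_rows:     (GROUP#, MEMBERS, STATUS, ARCHIVED) tuples from V$LOG.
    """
    # Count the INVALID members of each group.
    invalid = Counter(group for group, status, _member in logfile_rows
                      if status == "INVALID")
    damaged = {}
    for group, members, status, archived in log_rows:
        # A group is lost only when every one of its members is INVALID.
        if members > 0 and invalid[group] == members:
            damaged[group] = (status, archived)
    return damaged


logfile_rows = [
    (1, "", "/oracle/dbs/log1a.f"),
    (1, "", "/oracle/dbs/log1b.f"),
    (2, "INVALID", "/oracle/dbs/log2a.f"),
    (2, "INVALID", "/oracle/dbs/log2b.f"),
    (3, "", "/oracle/dbs/log3a.f"),
    (3, "", "/oracle/dbs/log3b.f"),
]
log_rows = [
    (1, 2, "INACTIVE", "YES"),
    (2, 2, "ACTIVE", "NO"),
    (3, 2, "CURRENT", "NO"),
]
print(fully_damaged_groups(logfile_rows, log_rows))
```

A group with only some INVALID members is deliberately excluded, because that case is handled by replacing the damaged member rather than by the procedures for a lost group.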

Losing an Inactive Online Redo Log Group

If all members of an online redo log group with INACTIVE status are damaged, then the procedure depends on whether you can fix the media problem that damaged the inactive redo log group.

If the failure is . . .	Then . . .
Temporary	Fix the problem. LGWR can reuse the redo log group when required.
Permanent	The damaged inactive online redo log group eventually halts normal database operation. Reinitialize the damaged group manually by issuing the `ALTER DATABASE CLEAR LOGFILE` statement as described in this section.

You can clear an inactive redo log group when the database is open or closed. The procedure depends on whether the damaged group has been archived.

To clear an inactive, online redo log group that has been archived:

If the database is shut down, then start a new instance and mount the database:
```
STARTUP MOUNT
```
Reinitialize the damaged log group. For example, to clear redo log group 2, issue the following statement:
```
ALTER DATABASE CLEAR LOGFILE GROUP 2;
```

To clear an inactive, online redo log group that has not been archived:

Clearing an unarchived log allows it to be reused without archiving it. This action makes backups unusable if they were started before the last change in the log, unless the file was taken offline prior to the first change in the log. Hence, if you need the cleared log file for recovery of a backup, then you cannot recover that backup. Also, it prevents complete recovery from backups due to the missing log.

If the database is shut down, then start a new instance and mount the database:
```
STARTUP MOUNT
```
Clear the log using the UNARCHIVED keyword. For example, to clear log group 2, issue:
```
ALTER DATABASE CLEAR LOGFILE UNARCHIVED GROUP 2;
```
If there is an offline datafile that requires the cleared unarchived log to bring it online, then the keywords UNRECOVERABLE DATAFILE are required. The datafile and its entire tablespace have to be dropped because the redo necessary to bring it online is being cleared, and there is no copy of it. For example, enter:
```
ALTER DATABASE CLEAR LOGFILE UNARCHIVED GROUP 2 UNRECOVERABLE DATAFILE;
```
Immediately back up the database with an operating system utility as described in "Making User-Managed Backups of the Whole Database". Now you can use this backup for complete recovery without relying on the cleared log group. For example, enter:
```
% cp /disk1/oracle/dbs/*.f /disk2/backup
```
Back up the database's control file using the ALTER DATABASE statement as described in "Backing Up the Control File to a Binary File". For example, enter:
```
ALTER DATABASE BACKUP CONTROLFILE TO '/oracle/dbs/cf_backup.f';
```
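
The choice among the three forms of the CLEAR LOGFILE statement shown in this section can be summarized in a short Python sketch. The helper name `clear_logfile_statement` and its parameters are hypothetical; the sketch only assembles the statement text from the keywords used above and does not connect to a database.

```python
def clear_logfile_statement(group: int, archived: bool,
                            offline_datafile_needs_log: bool = False) -> str:
    """Assemble the CLEAR LOGFILE statement for an inactive group.

    archived: True if the damaged group has already been archived.
    offline_datafile_needs_log: True if an offline datafile would need the
        unarchived log to be brought online (that datafile and its
        tablespace must then be dropped).
    """
    if archived:
        # An archived group needs no extra keywords.
        return f"ALTER DATABASE CLEAR LOGFILE GROUP {group};"
    statement = f"ALTER DATABASE CLEAR LOGFILE UNARCHIVED GROUP {group}"
    if offline_datafile_needs_log:
        statement += " UNRECOVERABLE DATAFILE"
    return statement + ";"


print(clear_logfile_statement(2, archived=False, offline_datafile_needs_log=True))
```

When `archived` is true the third argument is ignored, on the assumption that the archived copy of the log remains available for the offline datafile.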

Failure of CLEAR LOGFILE Operation

The ALTER DATABASE CLEAR LOGFILE statement can fail with an I/O error due to media failure when it is not possible to:

- Relocate the redo log file onto alternative media by re-creating it under the currently configured redo log filename
- Reuse the currently configured log filename to re-create the redo log file because the name itself is invalid or unusable (for example, due to media failure)

In these cases, the ALTER DATABASE CLEAR LOGFILE statement (before receiving the I/O error) would have successfully informed the control file that the log was being cleared and did not require archiving. The I/O error occurred at the step in which the CLEAR LOGFILE statement attempts to create the new redo log file and write zeros to it. This fact is reflected in V$LOG.CLEARING_CURRENT.

Losing an Active Online Redo Log Group

If the database is still running and the lost active log is not the current log, then issue the ALTER SYSTEM CHECKPOINT statement. If successful, then the active log is rendered inactive, and you can follow the procedure in "Losing an Inactive Online Redo Log Group". If unsuccessful, or if your database has halted, then perform one of the procedures in this section, depending on the archiving mode.

Note that the current log is the one LGWR is currently writing to. If an LGWR I/O fails, then LGWR terminates and the instance crashes. In this case, you must restore a backup, perform incomplete recovery, and open the database with the RESETLOGS option.

To recover from loss of an active online redo log group in NOARCHIVELOG mode:

If the media failure is temporary, then correct the problem so that Oracle can reuse the group when required.
Restore the database from a consistent, whole database backup (datafiles and control files) as described in "Restoring Datafiles". For example, enter:
```
% cp /disk2/backup/*.f /disk1/oracle/dbs
```
Mount the database:
```
STARTUP MOUNT
```
Because online redo logs are not backed up, you cannot restore them with the datafiles and control files. In order to allow Oracle to reset the online redo logs, you must first mimic incomplete recovery:
```
RECOVER DATABASE UNTIL CANCEL
CANCEL
```
Open the database using the RESETLOGS option:
```
ALTER DATABASE OPEN RESETLOGS;
```
Shut down the database consistently. For example, enter:
```
SHUTDOWN IMMEDIATE
```
Make a whole database backup as described in "Making User-Managed Backups of the Whole Database". For example, enter:
```
% cp /disk1/oracle/dbs/*.f /disk2/backup
```

To recover from loss of an active online redo log group in ARCHIVELOG mode:

If the media failure is temporary, then correct the problem so that Oracle can reuse the group when required. If the media failure is not temporary, then use the following procedure.

Begin incomplete media recovery. Use the procedure given in "Performing Incomplete User-Managed Media Recovery", recovering up through the log before the damaged log.
Ensure that the current name of the lost redo log can be used for a newly created file. If not, then rename the members of the damaged online redo log group to a new location. For example, enter:
```
ALTER DATABASE RENAME FILE '/oracle/dbs/log_1.rdo' TO '/temp/log_1.rdo';
ALTER DATABASE RENAME FILE '/oracle/dbs/log_2.rdo' TO '/temp/log_2.rdo';
```
Open the database using the RESETLOGS option:
```
ALTER DATABASE OPEN RESETLOGS;
```
Note:
All updates executed from the endpoint of the incomplete recovery to the present must be re-executed.

Loss of Multiple Redo Log Groups

If you have lost multiple groups of the online redo log, then use the recovery method for the most difficult log to recover. The order of difficulty, from most difficult to least difficult, follows:

- The current online redo log
- An active online redo log
- An unarchived online redo log
- An inactive online redo log
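
Picking the recovery method for several lost groups then reduces to finding the most difficult kind among them. A minimal Python sketch, in which the names `DIFFICULTY` and `hardest_log_kind` are purely illustrative, might look like this:

```python
# Kinds of online redo log, ordered from most to least difficult to recover.
DIFFICULTY = ["current", "active", "unarchived", "inactive"]


def hardest_log_kind(lost_kinds):
    """Return the kind whose recovery method should be used for all lost groups."""
    if not lost_kinds:
        raise ValueError("no lost redo log groups given")
    # A lower index in DIFFICULTY means a harder recovery.
    return min(lost_kinds, key=DIFFICULTY.index)


print(hardest_log_kind(["inactive", "active", "unarchived"]))
```

An unrecognized kind raises `ValueError` from `list.index`, which seems preferable here to silently ranking it.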

Recovering After the Loss of Archived Redo Log Files: Scenario

If the database is operating in ARCHIVELOG mode, and if the only copy of an archived redo log file is damaged, then the damaged file does not affect the present operation of the database. The following situations can arise, however, depending on when the redo log was written and when you backed up the datafile.

If you backed up . . .	Then . . .
All datafiles after the filled online redo log group (which is now archived) was written	The archived version of the filled online redo log group is not required for complete media recovery operation.
A specific datafile before the filled online redo log group was written	If the corresponding datafile is damaged by a permanent media failure, use the most recent backup of the damaged datafile and perform incomplete recovery up to the damaged log.

Caution:

If you know that an archived redo log group has been damaged, immediately back up all datafiles so that you will have a whole database backup that does not require the damaged archived redo log.

Recovering from User Errors: Scenario

An accidental operational or programmatic change to the database can cause loss or corruption of data. Recovery may require a return to a state prior to the error.

Note:

If you have granted powerful privileges (such as DROP ANY TABLE) to only selected, appropriate users, you can minimize user errors that require database recovery.

To recover a table that has been accidentally dropped:

If possible, keep the database that experienced the user error online and available for use. Back up all datafiles of the existing database in case an error is made during the remaining steps of this procedure.
Restore a database backup to an alternative location, then perform incomplete recovery of this backup using a restored backup control file, to the point just before the table was dropped (as described in "Performing Incomplete User-Managed Media Recovery").
Export the lost data from the temporary, restored version of the database using the Oracle utility Export. In this case, export the accidentally dropped table.

Note:
System audit options are exported.

Use the Import utility to import the data back into the production database.
Delete the files of the temporary copy of the database to conserve space.

See Also:
Oracle9i Database Utilities for more information about the Import and Export utilities

Performing Media Recovery in a Distributed Environment: Scenario

The manner in which you perform media recovery depends on whether your database participates in a distributed database system. The Oracle distributed database architecture is autonomous. Therefore, depending on the type of recovery operation selected for a single, damaged database, you may have to coordinate recovery operations globally among all databases in the distributed system.

Table 6-2 summarizes different types of recovery operations and whether coordination among nodes of a distributed database system is required.

Table 6-2 Recovery Operations in a Distributed Database Environment

If you are . . .	Then . . .
Restoring a whole backup for a database that was never accessed from a remote node	Use non-coordinated, autonomous database recovery.
Restoring a whole backup for a database in `NOARCHIVELOG` mode that was accessed by a remote node	Shut down all databases and restore them using the same coordinated full backup.
Performing complete media recovery of one or more databases in a distributed database	Use non-coordinated, autonomous database recovery.
Performing incomplete media recovery of a database that was never accessed by a remote node	Use non-coordinated, autonomous database recovery.
Performing incomplete media recovery of a database that was accessed by a remote node	Use coordinated, incomplete recovery to the same global point in time for all databases in the distributed system.
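
Table 6-2 can be read as a function of two inputs: the kind of operation and whether a remote node has accessed the database. The following Python sketch is one way to express it. The function name and the three operation labels are hypothetical, chosen only for this illustration.

```python
OPERATIONS = ("restore_whole_backup", "complete_recovery", "incomplete_recovery")


def distributed_recovery_mode(operation: str, remotely_accessed: bool) -> str:
    """Paraphrase the row of Table 6-2 that applies to the given situation."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    # Complete media recovery never needs coordination, and neither does any
    # operation on a database that no remote node has accessed.
    if operation == "complete_recovery" or not remotely_accessed:
        return "non-coordinated, autonomous database recovery"
    if operation == "restore_whole_backup":
        return "shut down all databases and restore them from the same coordinated full backup"
    return "coordinated incomplete recovery to the same global point in time"


print(distributed_recovery_mode("incomplete_recovery", remotely_accessed=True))
```

Only two rows of the table require coordination, which is why the sketch checks for the autonomous cases first.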

Coordinating Time-Based and Change-Based Distributed Database Recovery

In special circumstances, one node in a distributed database may require recovery to a past time. To preserve global data consistency, it is often necessary to recover all other nodes in the system to the same point in time. This operation is called coordinated, time-based, distributed database recovery. The following tasks should be performed with the standard procedures of time-based and change-based recovery described in this chapter.

Recover the database that requires the recovery operation using time-based recovery, as described in "Performing Time-Based Incomplete Recovery". For example, if a database needs to be recovered because of a user error (such as an accidental table drop), then recover this database first using time-based recovery. Do not recover the other databases at this point.
After you have recovered the database and opened it with the RESETLOGS option, search the alert_SID.log of the database for the RESETLOGS message.

If the message is, "RESETLOGS after complete recovery through change xxx", then you have applied all the changes in the database and performed complete recovery. Do not recover any of the other databases in the distributed system, or you will unnecessarily remove changes in them. Recovery is complete.

If the message is, "RESETLOGS after incomplete recovery UNTIL CHANGE xxx", then you have successfully performed an incomplete recovery. Record the change number from the message and proceed to the next step.
Recover all other databases in the distributed database system using change-based recovery, specifying the change number (SCN) that you recorded from the RESETLOGS message in the previous step.
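
Searching the alert log for the RESETLOGS message and extracting the change number can be sketched in Python as follows. This is an illustration under stated assumptions: it presumes the message text matches the two forms quoted above, with a decimal number in place of xxx, and the names `resetlogs_outcome` and `change_based_recover` are hypothetical. The generated RECOVER command is only printed for review.

```python
import re

# The two message forms quoted above, with the change number captured.
RESETLOGS = re.compile(
    r"RESETLOGS after (complete recovery through|incomplete recovery UNTIL)"
    r"\s+CHANGE\s+(\d+)",
    re.IGNORECASE,
)


def resetlogs_outcome(alert_log_text):
    """Return (kind, scn) for the most recent RESETLOGS message, or None."""
    matches = RESETLOGS.findall(alert_log_text)
    if not matches:
        return None
    phrase, scn = matches[-1]  # the last match is the most recent message
    kind = "incomplete" if phrase.lower().startswith("incomplete") else "complete"
    return kind, int(scn)


def change_based_recover(scn):
    """Text of a change-based recovery command for the other databases."""
    return f"RECOVER DATABASE UNTIL CHANGE {scn}"


sample = "RESETLOGS after incomplete recovery UNTIL CHANGE 309121"
outcome = resetlogs_outcome(sample)
if outcome and outcome[0] == "incomplete":
    print(change_based_recover(outcome[1]))
```

If the outcome is "complete", no command is produced, which mirrors the instruction not to recover the other databases in that case.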
