Long running RMAN jobs

If you are going to restore a huge database, then it’s important, that the rman session can’t abort due to a session disconnection. So it’s best to start rman as follows under linux:

nohup rman target / cmdfile rmancommand log rmanlog.txt &

you can monitor the rman session and check the estimated remaining time in seconds

> set lines 132
> col message for a100
> SELECT time_remaining, message FROM v$session_longops WHERE substr(opname,1,4)='RMAN' and time_remaining > 0;

TIME_REMAINING MESSAGE
-------------- ----------------------------------------------------------------------------------------------------
           767 RMAN: incremental datafile backup: Set Count 98209: 458878 out of 1048576 Blocks done
           767 RMAN: incremental datafile backup: Set Count 98206: 459262 out of 1048576 Blocks done
            74 RMAN: incremental datafile backup: Set Count 98226: 323710 out of 786432 Blocks done
           760 RMAN: incremental datafile backup: Set Count 98212: 458622 out of 1048576 Blocks done
           725 RMAN: incremental datafile backup: Set Count 98203: 487678 out of 1048576 Blocks done
           879 RMAN: incremental datafile backup: Set Count 98219: 328574 out of 1048576 Blocks done
           383 RMAN: aggregate input: backup 32: 509908909 out of 541901262 Blocks done
           305 RMAN: incremental datafile backup: Set Count 98217: 618878 out of 1048576 Blocks done
            45 RMAN: incremental datafile backup: Set Count 98225: 756862 out of 1048576 Blocks done

Here you can see 383 remaining seconds for this backup.

Recreate lost redo log or controlfile

What you do if you lose a online redo log or controlfile (just one or one member of a group)

Tests with 11g and dNFS.

Example 1: After a controlfile is lost you will see the following messages in alert log and after ~3 seconds the db will shutdown itself

Tue Oct 20 13:37:20 2009
Direct NFS: Read FAIL status  70
Direct NFS: NFS3ERR 70 error encountered. path fas-0002 mntport 4046 nfsport 2049
Errors in file /oracle/admin/diag/rdbms/testdb1/TESTDB1/trace/TESTDB1_lgwr_11804.trc:
ORA-00202: control file: '/oracle/oradata/TESTDB1/control2/control02.ora'
ORA-17500: ODM err:KGNFS READ FAIL
Errors in file /oracle/admin/diag/rdbms/testdb1/TESTDB1/trace/TESTDB1_lgwr_11804.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '/oracle/oradata/TESTDB1/control2/control02.ora'
ORA-17500: ODM err:KGNFS READ FAIL
LGWR (ospid: 11804): terminating the instance due to error 204
Instance terminated by LGWR, pid = 11804

Solution: Copy the other controlfile or adapt the parameter control_files

Example 2: A online redo log member is lost. you will see the following msg in alert log each time a log switch happens, but db is still running

Tue Oct 20 13:51:14 2009
Direct NFS: write FAILED 70
Direct NFS: NFS3ERR 70 error encountered. path fas-0002 mntport 4046 nfsport 2049
Errors in file /oracle/admin/diag/rdbms/testdb1/TESTDB1/trace/TESTDB1_lgwr_12637.trc:
ORA-00345: redo log write error block 55 count 1
ORA-00312: online log 2 thread 1: '/oracle/oradata/TESTDB1/redo2/log2.dbf'
ORA-17500: ODM err:KGNFS WRITE FAIL
Errors in file /oracle/admin/diag/rdbms/testdb1/TESTDB1/trace/TESTDB1_lgwr_12637.trc:
ORA-00346: log member marked as STALE and closed
ORA-00312: online log 2 thread 1: '/oracle/oradata/TESTDB1/redo2/log2.dbf'

Solution: switch the logfiles until the group is no more active, then drop it and recreate the member:

select status,member from v$logfile;
STATUS  MEMBER
------- ------------------------------------------------------------
        /oracle/oradata/TESTDB1/redo1/log1.dbf
        /oracle/oradata/TESTDB1/redo2/log1.dbf
        /oracle/oradata/TESTDB1/redo1/log2.dbf
INVALID /oracle/oradata/TESTDB1/redo2/log2.dbf
alter system switch logfile;
ALTER DATABASE DROP LOGFILE MEMBER '/oracle/oradata/TESTDB1/redo2/log1.dbf';
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/oradata/TESTDB1/redo2/log1.dbf' TO GROUP 1;

Oracle 11g Direct NFS

source: http://url4u.co/15FTGk

I have tested the Oracle NFS Driver on Oracle Enterprise Linux 5.1 with Oracle 11.1.0.7

Instead of using the Kernel NFS Driver, Oracle 11g provides its own Direct NFS driver. Hence Oracle will optimize the I/O path between Oracle and the NFS server.

The following is the order where Oracle searchs for NFS exports.

  1. $ORACLE_HOME/dbs/oranfstab
  2. /etc/oranfstab
  3. /etc/mtab (content of mounted /etc/fstab)

So you have just to configure your /etc/fstab,  with no special parameter as mentioned in metalink#359515.1 , as follow:

filername:/vol/nfs_data01/q_nfs_data01  /nfs/SID/data1   nfs

mount -a

Then you have to link some libraries in your $ORACLE_HOME

Oracle 11gR1

cd $ORACLE_HOME/lib
mv libodm11.so libodm11.so_stub
ln -s libnfsodm11.so libodm11.so

Oracle 11gR2

cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk dnfs_on    <-- to enable
make -f ins_rdbms.mk dnfs_off   <-- to disable

Now you can start your Oracle Instance and check the alertlog file for this line:

Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 2.0

Hence the Driver of Oracle is used for NFS.

There are also some Views to check the stats and the current nfs exports you are using:

SQL> SELECT svrname, dirname FROM v$dnfs_servers;

SVRNAME    DIRNAME
---------------
filername  /vol/nfs_data01/q_nfs_data01

Some other views:

v$dnfs_servers
v$dnfs_files
v$dnfs_channels
v$dnfs_stats