Skip Headers

Oracle9i Database Utilities
Release 2 (9.2)

Part Number A96652-01
Go To Documentation Library
Go To Product List
Book List
Go To Table Of Contents
Go To Index

Master Index


Go to previous page Go to next page

External Tables Concepts

The Oracle9i external tables feature is a complement to existing SQL*Loader functionality. It allows you to access data in external sources as if it were in a table in the database.

External tables are read-only. No data manipulation language (DML) operations or index creation is allowed on an external table. Therefore, SQL*Loader may be the better choice in data loading situations that require additional indexing of the staging table. See Behavior Differences Between SQL*Loader and External Tables for more information on how load behavior differs between SQL*Loader and external tables.

To use the external tables feature, you must have some knowledge of the file format and record format of the datafiles on your platform. You must also know enough about SQL to be able to create an external table and perform queries against it.

This chapter discusses the following topics:

The Access Driver

An external table describes how the external table layer must present the data to the server. The access driver and the external table layer transform the data in the datafile to match the external table definition.

When you create an external table of a particular type, you provide access parameters that describe the external data source. If you do not specify a type for the external table, then the ORACLE_LOADER type is used as a default. For a description of the access parameters for the ORACLE_LOADER type, see Chapter 12.

The description of the data in the data source is separate from the definition of the external table. This means that:

The access driver ensures that data from the data source is processed so that it matches the definition of the external table.

In the following example, a traditional table named emp is defined along with an external table named emp_load.

CREATE TABLE emp (emp_no CHAR(6), last_name CHAR(25), first_name CHAR(20), middle_initial 

CREATE TABLE emp_load (employee_number CHAR(5), employee_last_name CHAR(20),
                       employee_first_name CHAR(15), employee_middle_name CHAR(15))
                      ACCESS PARAMETERS (RECORDS FIXED 62 FIELDS (employee_number INTEGER(2),
                                                                 employee_dob CHAR(20),
                                                                 employee_last_name CHAR(18),
                                                                 employee_first_name CHAR(11),
                                                                 employee_middle_name CHAR(11)))
                                          LOCATION ('foo.dat'));

INSERT INTO emp (emp_no, first_name, middle_initial, last_name)
(SELECT employee_number, employee_first_name, 
        substr(employee_middle_name, 1, 1),
FROM emp_load);

Note the following in the preceding example:

External Table Restrictions

This section lists what external tables does not do and also describes some processing restrictions.

Location of Datafiles and Output Files

The access driver runs inside of the database server. This is different from SQL*Loader, which is a client program that sends the data to be loaded over to the server. This difference has the following implications:

The access driver does not allow you to specify random names for a file. This is because the server may have access to files that you do not, and allowing you to read this data would affect security. Similarly, you cannot specify a location for an output file, because the server could overwrite a file that you might not normally have privileges to delete.

Instead, you are required to specify directory objects as the locations from which to read files and write files. A directory object maps a name to a directory name on the file system. For example, the following statement creates a directory object named load_src.

create directory load_src as '/usr/apps/datafiles';

Directory objects can be created by DBAs or by any user with the CREATE ANY DIRECTORY privilege. After a directory is created, the user creating the directory object needs to grant READ or WRITE permission on the directory to other users. For example, to allow the server to read files on behalf of user scott in the directory named by load_src, the user who created the directory object must execute the following command:


The name of the directory object can appear in the following places in a CREATE TABLE...ORGANIZATION EXTERNAL statement:

The SYS user is the only user that can own directory objects, but the SYS user can grant other users the privilege to create directory objects. Note that READ or WRITE permission to a directory object only means that the Oracle database server will read or write that file on your behalf. You are not given direct access to those files outside of the Oracle database server unless you have the appropriate operating system privileges. Similarly, the Oracle database server requires permission from the operating system to read and write files in the directories.

Using External Tables to Load Data

The main use for external tables is as a row source for loading data into a real table in the database. After you create an external table, you can issue a CREATE TABLE AS SELECT or INSERT INTO... AS SELECT statement using the external table as the source of the SELECT clause. Remember that external tables are read-only, so you cannot insert into them or update records in them.

When the external table is accessed through a SQL statement, the fields of the external table can be used just like any other field in a normal table. In particular, the fields can be used as arguments for any SQL built-in function, PL/SQL function, or Java function. This allows you to manipulate the data from the external source.

Although external tables cannot contain a column object, you can use constructor functions to build a column object from attributes in the external table. For example, assume a table in the database is defined as follows:

CREATE TYPE student_type AS object (
student_no CHAR(5),
name CHAR(20))

  student student_type,
  grade CHAR(2));

Also assume there is an external table defined as follows:

CREATE TABLE roster_data (
  student_no CHAR(5),
  name CHAR(20),
  grade CHAR(2))
                         ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
                         LOCATION ('foo.dat'));

To load table roster from roster_data, you would specify something similar to the following:

INSERT INTO roster (student, grade)
  (SELECT student_type(student_no, name), grade FROM roster_data);

Parallel Access to External Tables

To enable external table support of parallel processing on the datafiles, use the PARALLEL clause when you create the external table. The access driver attempts to divide large datafiles into chunks that can be processed separately.

The following file, record, and data characteristics make it impossible for a file to be processed in parallel:

Performance Hints When Using External Tables

When you monitor performance, the most important measurement is the elapsed time for a load. Other important measurements are CPU usage, memory usage, and I/O rates.

You can alter performance by increasing or decreasing the degree of parallelism. The degree of parallelism indicates the number of access drivers that can be started to process the datafiles. The degree of parallelism allows you to choose on a scale between slower load with little resource usage and faster load with all resources utilized. The access driver cannot automatically tune itself, because it cannot determine how many resources you want to dedicate to the access driver.

Performance can also sometimes be increased with use of date cache functionality. By using the date cache to specify the number of unique dates anticipated during the load, you can reduce the number of date conversions done when many duplicate date or timestamp values are present in the input data. The date cache functionality provided by external tables is identical to the date cache functionality provided by SQL*Loader. See Specifying a Value for the Date Cache for a detailed description.

In addition to changing the degree of parallelism and using the date cache to improve performance, consider the following information:

The access driver takes advantage of multithreading to streamline the work as much as possible.

Behavior Differences Between SQL*Loader and External Tables

This section describes important differences between loading data with external tables as opposed to loading data with SQL*Loader conventional and direct path loads.

Multiple Primary Input Datafiles

If there are multiple primary input datafiles with SQL*Loader loads, a bad file and a discard file are created for each input datafile. With external table loads, there is only one bad file and one discard file for all input datafiles. If parallel access drivers are used for the external table load, each access driver has its own bad file and discard file.

Syntax and Datatypes

The following are not supported with external table loads:

Rejected Rows

With SQL*Loader, if the SEQUENCE parameter is used and there are rejected rows, the rejected row still updates the sequence number value. With external tables, if the SEQUENCE parameter is used, rejected rows do not update the sequence number value. For example, suppose you load 5 rows with sequence numbers beginning with 1 and incrementing by 1. In SQL*Loader, if rows 2 and 4 are rejected, the successfully loaded rows are assigned the sequence numbers 1, 3, and 5. In an external table load, the successfully loaded rows are assigned the sequence numbers 1, 2, and 3.

Byte-Order Marks

With SQL*Loader, if a primary datafile uses a Unicode character set (UTF8 or UTF16) and it also contains a byte-order mark (BOM), then the byte-order mark is written at the beginning of the corresponding bad and discard files. With external table loads, the byte-order mark is not written at the beginning of the bad and discard files.

Default Character Sets and Date Masks

For fields in a datafile, it is the client's NLS environment that determines the default character set and date masks. For fields in external tables, it is the server's NLS environment that determines the default character set and date masks.

Go to previous page Go to next page
Copyright © 1996, 2002 Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Go To Product List
Book List
Go To Table Of Contents
Go To Index

Master Index