Oracle9i Data Warehousing Guide
Release 2 (9.2)

Part Number A96520-01

7
Integrity Constraints

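This chapter describes integrity constraints and discusses why they are useful in a data warehouse, the states a constraint can be in, and the integrity constraints typically used in a data warehouse.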

Why Integrity Constraints are Useful in a Data Warehouse

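Integrity constraints provide a mechanism for ensuring that data conforms to guidelines specified by the database administrator. The most common types of constraints include UNIQUE constraints, which ensure that a given column is unique; NOT NULL constraints, which ensure that no null values are allowed; and FOREIGN KEY constraints, which ensure that two keys share a primary key to foreign key relationship.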

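Constraints can be used for two purposes in a data warehouse. The first is data cleanliness: constraints verify that the data conforms to a basic level of consistency and correctness. The second is query optimization: Oracle uses constraints when optimizing SQL queries, and they are particularly important for query rewrite with materialized views.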

Unlike data in many relational database environments, data in a data warehouse is typically added or modified under controlled circumstances during the extraction, transformation, and loading (ETL) process. Multiple users normally do not update the data warehouse directly, as they do in an OLTP system.

See Also:

Chapter 10, "Overview of Extraction, Transformation, and Loading"

Many significant constraint features have been introduced for data warehousing. Readers familiar with Oracle's constraint functionality in Oracle7 and Oracle8 should take special note of the functionality described in this chapter. In fact, many Oracle7-based and Oracle8-based data warehouses lacked constraints because of concerns about constraint performance. Newer constraint functionality addresses these concerns.

Overview of Constraint States

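To understand how best to use constraints in a data warehouse, you should first understand the basic purposes of constraints. Some of these purposes are enforcement, validation, and belief. To use a constraint for enforcement, it must be in the ENABLE state, so that data modifications that would violate it are rejected. To use a constraint for validation, it must be in the VALIDATE state, which means that all data currently in the table satisfies it. When you know that a constraint holds and want it present only to improve query optimization, you declare it in the RELY state; this is called a belief or RELY constraint.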

Typical Data Warehouse Integrity Constraints

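This section assumes that you are familiar with the typical use of constraints, that is, constraints that are both enabled and validated. For data warehousing, many users have discovered that such constraints may be prohibitively costly to build and maintain. The topics discussed are UNIQUE constraints, FOREIGN KEY constraints, RELY constraints, integrity constraints and parallelism, integrity constraints and partitioning, and view constraints.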

UNIQUE Constraints in a Data Warehouse

A UNIQUE constraint is typically enforced using a UNIQUE index. However, in a data warehouse whose tables can be extremely large, creating a unique index can be costly both in processing time and in disk space.

Suppose that a data warehouse contains a table sales, which includes a column sales_id. sales_id uniquely identifies a single sales transaction, and the data warehouse administrator must ensure that this column is unique within the data warehouse.

One way to create the constraint is as follows:

ALTER TABLE sales ADD CONSTRAINT sales_unique
  UNIQUE (sales_id);

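By default, this constraint is both enabled and validated. Oracle implicitly creates a unique index on sales_id to support this constraint. However, this index can be problematic in a data warehouse for three reasons. First, the unique index can be very large, because the sales table can easily have millions or even billions of rows. Second, the unique index is rarely used for query execution, because most data warehousing queries do not have predicates on unique keys. Third, if sales is partitioned along a column other than sales_id, the unique index must be global, which can detrimentally affect maintenance operations on the sales table.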

An enabled unique constraint requires a unique index so that Oracle can check that each individual row modified in the sales table satisfies the UNIQUE constraint.

For data warehousing tables, an alternative mechanism for unique constraints is illustrated in the following statement:

ALTER TABLE sales ADD CONSTRAINT sales_unique 
UNIQUE (sales_id) DISABLE VALIDATE;

This statement creates a unique constraint, but, because the constraint is disabled, a unique index is not required. This approach can be advantageous for many data warehousing environments because the constraint now ensures uniqueness without the cost of a unique index.
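To confirm which state a constraint ended up in, you can query the data dictionary. The following is a minimal sketch, assuming the sales table and the sales_unique constraint from the statement above are in the current schema; it reads the STATUS, VALIDATED, and RELY columns of the USER_CONSTRAINTS view.

```sql
-- Show the enforcement, validation, and RELY state of every
-- constraint on SALES (assumes SALES is in the current schema).
SELECT constraint_name,
       constraint_type,   -- 'U' = unique, 'R' = foreign key
       status,            -- ENABLED or DISABLED
       validated,         -- VALIDATED or NOT VALIDATED
       rely               -- RELY, or null when RELY is not set
FROM   user_constraints
WHERE  table_name = 'SALES';
```

For a DISABLE VALIDATE constraint, STATUS should read DISABLED and VALIDATED should read VALIDATED.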

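However, there are trade-offs for the data warehouse administrator to consider with DISABLE VALIDATE constraints. Because this constraint is disabled, no DML statements that modify the unique column are permitted against the sales table. You can use one of two strategies for modifying this table in the presence of such a constraint. The first is to use DDL to add data to the table, for example by exchanging partitions. The second is to drop the constraint, make all necessary data modifications, and then re-create the disabled constraint; while the constraint is dropped, however, nothing prevents duplicate values from being added to the sales table.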
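One such strategy is to drop the constraint, apply the changes, and then re-create the constraint in the DISABLE VALIDATE state. The sketch below assumes a hypothetical staging table, sales_stage, with the same columns as sales. The duplicate check in step 3 is a suggested precaution, not a required step.

```sql
-- 1. Drop the disabled, validated constraint so that DML is allowed again.
ALTER TABLE sales DROP CONSTRAINT sales_unique;

-- 2. Apply the data modifications (sales_stage is a hypothetical staging table).
INSERT INTO sales SELECT * FROM sales_stage;
COMMIT;

-- 3. Optional check: list any sales_id values that are now duplicated.
SELECT sales_id, COUNT(*) AS occurrences
FROM   sales
GROUP  BY sales_id
HAVING COUNT(*) > 1;

-- 4. Re-create the constraint. Oracle validates the existing rows,
--    so the statement fails if duplicates remain.
ALTER TABLE sales ADD CONSTRAINT sales_unique
  UNIQUE (sales_id) DISABLE VALIDATE;
```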

FOREIGN KEY Constraints in a Data Warehouse

In a star schema data warehouse, FOREIGN KEY constraints validate the relationship between the fact table and the dimension tables. A sample constraint might be:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE VALIDATE;

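However, in some situations, you may choose to use a different state for the FOREIGN KEY constraints, in particular, the ENABLE NOVALIDATE state. A data warehouse administrator might use an ENABLE NOVALIDATE constraint when either the tables contain data that currently disobeys the constraint but the administrator wishes to create a constraint for future enforcement, or an enforced constraint is required immediately.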

Suppose that the data warehouse loaded new data into the fact tables every day, but refreshed the dimension tables only on the weekend. During the week, the dimension tables and fact tables may in fact disobey the FOREIGN KEY constraints. Nevertheless, the data warehouse administrator might wish to maintain the enforcement of this constraint to prevent any changes that might affect the FOREIGN KEY constraint outside of the ETL process. Thus, you can create the FOREIGN KEY constraints every night, after performing the ETL process, as shown here:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE NOVALIDATE;

ENABLE NOVALIDATE can quickly create an enforced constraint, even when the constraint is believed to be true. Suppose that the ETL process verifies that a FOREIGN KEY constraint is true. Rather than have the database re-verify this FOREIGN KEY constraint, which would require time and database resources, the data warehouse administrator could instead create a FOREIGN KEY constraint using ENABLE NOVALIDATE.
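Because ENABLE NOVALIDATE leaves existing rows unchecked, it can be useful to find the rows that currently violate the relationship, and to validate the constraint once the dimension tables have been refreshed. The following is a minimal sketch, assuming the sales_time_fk constraint shown above already exists in the ENABLE NOVALIDATE state.

```sql
-- Fact rows whose sales_time_id has no matching row in the time dimension.
SELECT s.sales_time_id, COUNT(*) AS orphan_rows
FROM   sales s
WHERE  s.sales_time_id IS NOT NULL
AND    NOT EXISTS (SELECT 1 FROM time t WHERE t.time_id = s.sales_time_id)
GROUP  BY s.sales_time_id;

-- After the dimension tables have been refreshed, validate the existing rows.
-- The statement fails if orphan rows still exist.
ALTER TABLE sales MODIFY CONSTRAINT sales_time_fk ENABLE VALIDATE;
```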

RELY Constraints

The ETL process commonly verifies that certain constraints are true. For example, it can validate all of the foreign keys in the data coming into the fact table. This means that you can trust the ETL process to provide clean data, instead of enforcing and validating the constraints in the data warehouse. You create a RELY constraint as follows:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id) 
  RELY DISABLE NOVALIDATE;

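RELY constraints, even though they are not used for data validation, can enable more sophisticated query rewrites for materialized views, and can enable other data warehousing tools to retrieve information regarding constraints directly from the Oracle data dictionary.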

Creating a RELY constraint is inexpensive and does not impose any overhead during DML or load. Because the constraint is not being validated, no data processing is necessary to create it.
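For query rewrite to take advantage of a constraint that is declared with RELY but not validated, the session generally has to permit rewrite based on trusted, unenforced relationships. The following sketch shows the session settings typically involved. Whether they are appropriate depends on how much you trust the ETL process.

```sql
-- Allow the optimizer to rewrite queries against materialized views.
ALTER SESSION SET query_rewrite_enabled = TRUE;

-- Trust relationships declared with RELY even though Oracle has not validated them.
ALTER SESSION SET query_rewrite_integrity = TRUSTED;
```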

Integrity Constraints and Parallelism

All constraints can be validated in parallel. When validating constraints on very large tables, parallelism is often necessary to meet performance goals. The degree of parallelism for a given constraint operation is determined by the default degree of parallelism of the underlying table.
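Since the degree of parallelism is taken from the table, one approach is to raise the table's default degree for the duration of the validation and restore it afterward. In the following sketch, the degree of 8 is an arbitrary illustrative value, and sales_time_fk is the constraint from the earlier examples.

```sql
-- Temporarily set the default degree of parallelism on the table.
ALTER TABLE sales PARALLEL 8;

-- Validate the constraint; the check can now use the table's parallel degree.
ALTER TABLE sales MODIFY CONSTRAINT sales_time_fk ENABLE VALIDATE;

-- Restore serial execution as the table default.
ALTER TABLE sales NOPARALLEL;
```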

Integrity Constraints and Partitioning

You can create and maintain constraints before you partition the data. Later chapters discuss the significance of partitioning for data warehousing. Partitioning can improve constraint management just as it improves the management of many other operations. For example, Chapter 14, "Maintaining the Data Warehouse" provides a scenario in which UNIQUE and FOREIGN KEY constraints are created on a separate staging table and are then maintained during the EXCHANGE PARTITION statement.
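The general shape of such a scenario can be sketched as follows. The staging table sales_stage, the partition name sales_2002_q1, and the constraint name sales_stage_time_fk are hypothetical. The sketch assumes that the staging table has the same columns as sales and that its constraints mirror those on sales; see Chapter 14 for the complete procedure.

```sql
-- Create the constraint on the staging table, before the data joins the fact table.
ALTER TABLE sales_stage ADD CONSTRAINT sales_stage_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE NOVALIDATE;

-- Swap the staging table's data into the partition. WITHOUT VALIDATION
-- skips the check that every row belongs in that partition.
ALTER TABLE sales EXCHANGE PARTITION sales_2002_q1
  WITH TABLE sales_stage
  INCLUDING INDEXES WITHOUT VALIDATION;
```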

View Constraints

You can create constraints on views. The only type of constraint supported on a view is a RELY constraint.

This type of constraint is useful when queries typically access views instead of base tables, and the DBA thus needs to define the data relationships between views rather than tables. View constraints are particularly useful in OLAP environments, where they may enable more sophisticated rewrites for materialized views.
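As an illustration, a view constraint is declared inside the CREATE VIEW statement. The sketch below uses a hypothetical view, sales_v, over the sales table from the earlier examples; the view and constraint names are illustrative only.

```sql
-- A view with a primary key declared as a RELY constraint.
-- DISABLE NOVALIDATE means Oracle neither enforces nor checks it.
CREATE VIEW sales_v (
  sales_id,
  sales_time_id,
  CONSTRAINT sales_v_pk PRIMARY KEY (sales_id) RELY DISABLE NOVALIDATE
) AS
SELECT sales_id, sales_time_id
FROM   sales;
```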

See Also:

Chapter 8, "Materialized Views" and Chapter 22, "Query Rewrite"


Copyright © 1996, 2002 Oracle Corporation.

All Rights Reserved.