Working with Relational Tables, 3 of 7

Creating an Analytic Workspace from Relational Tables

When relational tables have been defined to the OLAP catalog using CWM1 metadata, you can use a tool provided with Oracle OLAP to design and populate an analytic workspace for the tables. For more information on creating an analytic workspace from relational tables in this manner, see Oracle9i OLAP User's Guide.

In other cases, you can design and populate an analytic workspace by taking the following steps:

Design the analytic workspace as described in "Process: Designing and Defining an Analytic Workspace to Hold Relational Data".
1. Define the analytic workspace using OLAP AW CREATE command.
2. Define the analytic workspace objects using the OLAP DEFINE command.
Define, write, and execute OLAP DML programs to populate the analytic workspace objects with relational data as described in "Process: Writing Programs that Populate Analytic Workspaces with Relational Data".
Aggregate the fact data up any hierarchies as described in Chapter 12, "Aggregating Data".

Process: Designing and Defining an Analytic Workspace to Hold Relational Data

One way that you can map a relational database to an analytic workspace is to take the following steps:

Identify the table columns that contain the fact data that you want to analyze. When the relational database is a data warehouse, these columns will be columns of a measure tables.
Identify the primary keys to the tables identified in step 1 and determine if any of these keys participate in any hierarchies. When the relational database is fully normalized, you can do this by following the foreign keys of the table. When the relational database contains summarized data, you can do this by, first, determining if the primary key columns are "children" of other columns, and then, following the "parent" columns up until you determine the complete hierarchy.
When there are hierarchies, decide if your applications need aggregated (summarized) fact data for each level of the hierarchy.
When your applications do not need aggregated data for any of the levels, then define a non-hierarchical dimension that you can use to hold the values of the primary key column as described in "Defining Dimensions".
When your applications need aggregated fact data for some or all of the levels, then define the following analytic workspace objects to represent the hierarchy:
1. An analytic workspace dimensions to hold the values of the levels for which you want aggregated data. You can define a hierarchical dimension as described in "Defining Hierarchical Dimensions and Variables That Use Them"; or you can define a concat dimension as described in "Defining Concat Dimensions and Variables That Use Them".
2. A self-relation for the hierarchy. This relation is dimensioned by the dimension described in step 5a. The values of this self-relation are the parents of each value in the hierarchy. For an example of a self-relation, see "Example: Self-relation".
Define the variables for facts you identified in step 1 and for dimension attributes that you want to use in your analysis. Typically, these variables are dimensioned by the dimensions that you identified in steps 4 and 5. However, if any of these variables are sparsely populated, then you can define a composite for the dimensions, and dimension the variables by that composite.

For an example, of an analytic workspace designed following this process, see "Designing and Defining an Analytic Workspace for Sales History Data".

Process: Writing Programs that Populate Analytic Workspaces with Relational Data

To populate the analytic workspace structures with data from relational tables, you write and execute one or more OLAP SQL programs that perform the following actions:

Define a SQL cursor and associate it with a SELECT statement or procedure as described in "Declaring a Cursor".
Open the SQL cursor defined in step 1 as described in "Opening a Cursor".
Retrieve and process data specified by the cursor opened in step 2 using wither the OLAP DML SQL IMPORT or SQL FETCH command as described in "Importing and Fetching Relational Table Data into Analytic Workspace Objects".

Note: You must declare and open a cursor from within a single OLAP DML program. You can fetch the data and close the cursor either in the same program or a different program.
Close the SQL cursor opened in step 2 as described in "Closing a Cursor".
Cancel all SQL cursor definitions and free the memory resources of SQL cursors as described in "Cleaning up the SQL Cursors".

Once the analytic workspace objects are populated, you can make these changes permanent using the OLAP DML UPDATE and COMMIT commands.

The rest of the topics in this section describe these steps in more detail. For examples of programs that populate an analytic workspace with data from relational tables, see "Populating Analytic Workspace Objects with Sales History Data".

Declaring a Cursor

In an OLAP DML program, you cannot issue a SELECT statement interactively. Instead, you must define a cursor which contains the SELECT statement. In the context of a query, a cursor can be thought of as simply a row marker in a relational table of data resulting from a query. Instead of receiving the results of a query all at once, your program receives the results row by row using the cursor.

A DECLARE CURSOR statement associates a cursor by name with the results of a data query. As an argument to the OLAP DML SQL command, the DECLARE CURSOR statement has the following syntax.

SQL DECLARE cursor-name CURSOR FOR select-statement

Tip:

You should write down SELECT statements that you think will retrieve the data you want to fetch. When possible, use an interactive interface such as SQL*Plus, SQL Worksheet, or OLAP Worksheet to test these SQL statements and make sure that they produce the results you expect. Afterward, you can modify these SELECT statements for use in your OLAP DML programs.

Example: Declaring a Cursor

In Example 10-1, "Declaring a Cursor", the cursor declaration selects rows from a relational table named costs in the sample Sales History (sh) schema. The costs table has several columns, including a column for product identification codes (prod_id) and a column for unit_price. The unit_price column is used in a WHERE clause to limit the returned rows to only those products in which the unit price is greater than $20.00.

Example 10-1 Declaring a Cursor

SQL DECLARE highprice CURSOR FOR -
   SELECT prod_id FROM costs -
      WHERE unit_price > 20

Using Variables in the WHERE Clause of the SELECT Statement

When you are declaring a cursor to be used by the OLAP DML SQL IMPORT command, you can only use literal values in the WHERE clause of a SELECT statement. However, when you are declaring a cursor to be used by the OLAP DML SQL FETCH command, you can use the values of input host variables instead of providing literal values in the WHERE clause of a SELECT statement.

Input host variables are values supplied by Oracle OLAP as parameters to a SQL command. They specify the data to be selected or provide values for data that is being modified. If you specify a dimension or a dimensioned variable, the first value in status is used; no implicit looping occurs, although you can use a FOR command to loop through all of the values. An input host variable can be any expression with an appropriate data type. When you use input host variables in a WHERE clause to match the data in a relational table, any required conversions between data types is performed wherever conversion is possible. The value of an input host variable is taken when a cursor is opened, not when it is declared.

An input host variable can be any expression preceded by a colon (for example, :myvar). However, if you specify a multidimensional expression, such as a variable or dimension, then the first value in status is used. Table 10-1 gives examples of expressions that can be used as input host variables. Example 10-2, "Using Input Host Variables" shows a program fragment that modifies the SQL command shown previously. Instead of using an explicit value in the WHERE clause, it uses the value of a local variable named set_price.

Table 10-1 Examples of Expressions That Can Be Used as Input Host Variables

Type of Expression	Example
Variable (database or local)	:set_price
Dimension	:prod
Qualified data reference	:units(prod 'P8', geog 'G12', time 'T36')
Program argument	:newval
Text expression	:joinchars('first_name' 'last_name')
Arithmetic expression	:intpart(6.3049) + 1
User-defined function	:getgeog

Example 10-2 Using Input Host Variables

VARIABLE set_price SHORT
set_price = 20
SQL DECLARE highprice CURSOR FOR -
   SELECT prod_id FROM costs -
      WHERE unit_price > :set_price

Using Conjunctions in a WHERE Clause

Because both the OLAP DML and SQL include AND and OR as part of their language syntax, you must use parentheses when using one of these conjunctions with an input host variable. Otherwise, the command might be ambiguous and produce unexpected results. Place the parentheses around the input host variable preceding AND and OR.

If a host variable expression begins with a parenthesis, then the matching right parenthesis is interpreted as the end of the host variable expression. If you plan to add more text to the expression after a matching right parenthesis, then you must enclose the entire expression with an extra set of parentheses.

The fragment of the program shown in Example 10-3 uses the values of two arguments to limit the range of values selected for the prod_id column of the relational table named products.

Example 10-3 Using Conjunctions in a WHERE Clause

prod1 = 415
prod1 = 49990      
      ...
SQL DECLARE twoprods CURSOR FOR -
   SELECT prod_id FROM products -
      WHERE prod_id EQ :(prod1) -
      AND :prod2

Opening a Cursor

After the SQL DECLARE CURSOR command has associated a cursor with a selection of data, you use the SQL OPEN statement to get ready to retrieve the data. These commands for a particular cursor must appear in the same OLAP DML program and can not contain ampersand substitution.

The following is the syntax of the SQL command with an OPEN statement as an argument.

SQL OPEN cursor-name

The SQL OPEN command:

Evaluates the input host variables (if any) used in the definition of the specified cursor.
Determines the active set of the cursor (that is, the rows that satisfy the SELECT statement).
Leaves the cursor in the open state for use by SQL FETCH or SQL IMPORT. The cursor is positioned before the first row of the result set

The active set of a cursor is determined when it is opened, and it is not updated later. Therefore, changing the value of an input host variable after opening its cursor does not affect the active set of a cursor.

Importing and Fetching Relational Table Data into Analytic Workspace Objects

After you open a cursor, you can use a SQL IMPORT or a SQL FETCH command statement to copy data from relational tables into analytic workspace objects. Before you use these SQL commands, ensure that you have access rights to the tables that you want to use.

SQL IMPORT or a SQL FETCH both copy data from relational tables into analytic workspace objects. Although SQL FETCH offers the most functionality, SQL IMPORT offers improved performance when copying large amounts of data from relational tables into analytic workspace objects.

SQL FETCH retrieves and processes data specified by a SQL cursor and assigns the retrieved data to OLAP objects. When you use a FETCH statement to retrieve data from relational tables, you must include it in a loop or use the LOOP argument to retrieve all of the rows of the active set of a cursor. Also, if you include a THEN clause, SQL FETCH may perform processing on the retrieved data. The following is the syntax of the SQL command using a FETCH statement as an argument.
```
SQL FETCH cursor [LOOP [loopcount]] INTO :targets... -

[THEN action-statements...]
```
SQL IMPORT advances the cursor position to each subsequent row of the active set of a cursor and delivers the selected fields into analytic workspace objects. The following is the syntax of the OLAP DML SQL command using an IMPORT statement as an argument.
```
SQL IMPORT cursor INTO :targets... 
```

In the syntax for SQL IMPORT and SQL FETCH, targets represents output host variables. An output host variable is an analytic workspace object that will be used to store the data retrieved from the relational tables. The order of the output host variables must match the order of the columns in the DECLARE CURSOR statement, and a colon must precede each output host variable name. The variable or dimension receiving the data must be defined already. It must also have a compatible data type.

For both IMPORT and FETCH, output host variables can be one or more of the following:

[MATCH] dimension|surrogate
APPEND  dimension
ASSIGN  surrogate
variable|qualified data reference|relation|composite

When an output host variable is a dimension, retrieved values are handled based on the keyword that you specify before the host variable name. You can specify either the MATCH keyword (the default) or the APPEND keyword.

With the MATCH keyword, only values that are the same as existing values of the dimension are fetched, and an error is signalled when a new value is encountered. You use it when fetching data into a variable whose dimensions are already maintained; the dimensions are included in the fetch only to align the data.
With the APPEND keyword, all values that do not match are added to the end of the list of dimension values. Also, for FETCH, values can be appended to an output host variable based on position using the following syntax for target:
```
APPEND [position] dimension
```

Table 10-2 provides examples of expressions that can be used as output host variables.

Table 10-2 Examples of Expressions That Can Be Used as Output Host Variables

Type of Expression	Example
Variable (database or local)	:sales_quantity_sold
Dimension or surrogate	:prodid
Qualified data reference	:sales_quantity_sold(prod_id 415 cust_id 18670 time_id '1998-01-04' channel_id 'S' promo_id 9999)

Whenever you fetch data into a dimensioned workspace variable, you must include the dimension values in the fetch. While you can add new dimension values at the same time, you do not need to add them when they already exist in your analytic workspace; instead, you use the dimension values in the fetch to align the data. In either case, be sure to fetch the dimension values before you fetch the values of the variable. Otherwise, the fetch will not loop through the dimension value.

Important:

When data is written into a dimension, it temporarily limits the status of the dimension to the value being matched or appended. This means that when the IMPORT statement or the FETCH statement also includes output host variables that are dimensioned by the specified dimension, the temporary status is observed when values are assigned to those variables.

Null values in a relational table are equivalent to NAs. In OLAP DML variables, null values do not pose a problem; they appear as NAs. However, you cannot have a dimension value of NA. Therefore, any rows that have a value of null are discarded in a column being fetched into a dimension.

Example: Copying Relational Table Data into Analytic Workspace Objects

Sometimes you want to copy data from relational tables into the analytic workspace to perform a quick analysis. For example, the sample Sales History database includes the sales table (described in Example 10-4 ) whose keys are prod_id, cust_id, time_id, channel_id, and promo_id and that contains two facts (quantity_sold and amount_sold).

Assume that you want to forecast the quantity sold for product 415 for the year 2002 using the forecasting commands available in the OLAP DML. In order to perform this analysis using the OLAP DML, the data must be in an analytic workspace. To copy the data into the analytic workspace, you must define the analytic workspace objects to hold the data, write an OLAP DML program to copy the data from the relational table to the analytic workspace objects, and, then, execute that program.

The simplest way to map the sales table to analytic workspace objects is to define one analytic workspace dimension for each of the key columns (aw_prod_id, aw_cust_id, aw_time_id, aw_channel_id, and aw_promo_id) and to define analytic workspace variables (dimensioned by those dimensions) to hold the data from the other columns (aw_quantity_sold and aw_amount_sold). However, in this case, the variables will be quite sparse along the time dimension. To avoid this sparsity, you can define a composite that represents all of the key dimensions and define the analytic workspace variables using this composite as shown in Example 10-5, "Analytic Workspace Definitions for Sales Data".

Example 10-6, "import_sales_for_prod415 Program" ) illustrates using SQL IMPORT to copy the data from the relational table into the analytic workspace objects. The fetch_sales_for_prod415 program (shown in Example 10-7, "fetch_sales_for_prod415 Program" ) illustrates using SQL FETCH to copy the data from the relational table into the analytic workspace objects. Both of these programs assume that values for aw_prod_id, aw_cust_id, aw_time_id, aw_channel_id, and aw_promo_id have not previously been copied into the analytic workspace. When you have defined a composite, Oracle OLAP automatically populates the composite as it populates the other analytic workspace objects.

Example 10-4 Description of the sales Table

 PROD_ID                                   NOT NULL NUMBER(6)
 CUST_ID                                   NOT NULL NUMBER
 TIME_ID                     NOT NULL DATE
 CHANNEL_ID                                NOT NULL CHAR(1)
 PROMO_ID                                  NOT NULL NUMBER(6)
 QUANTITY_SOLD                             NOT NULL NUMBER(3)
 AMOUNT_SOLD                               NOT NULL NUMBER(10,2)

Example 10-5 Analytic Workspace Definitions for Sales Data

DEFINE aw_prod_id DIMENSION NUMBER (6)
DEFINE aw_cust_id DIMENSION NUMBER (6) 
DEFINE aw_date DIMENSION TEXT
DEFINE aw_channel_id DIMENSION TEXT
DEFINE aw_promo_id DIMENSION NUMBER (6)
DEFINE aw_sales_dims COMPOSITE <aw_prod_id aw_cust_id aw_date -
     aw_channel_id aw_promo_id>
DEFINE aw_sales_quantity_sold VARIABLE NUMBER (3) <aw_sales_dims <aw_prod_id -
     aw_cust_id aw_date aw_channel_id paw_romo_id>>
DEFINE aw_sales_amount_sold VARIABLE NUMBER (10,2) <aw_sales_dims <aw_prod_id -
     aw_cust_id aw_date aw_channel_id aw_promo_id>>

Example 10-6 import_sales_for_prod415 Program

ALLSTAT
NLS_DATE_FORMAT = '<YYYY><MM><DD>'
DATEFORMAT = '<YYYY>-<MM>-<DD>'
" Declare a cursor named GRABDATA
SQL DECLARE grabdata CURSOR FOR SELECT prod_id, cust_id, time_id, -
    channel_id, promo_id, quantity_sold, amount_sold FROM sh.sales -
    WHERE prod_id = 415
" Import new values into the analytic workspace objects
SQL IMPORT grabdata INTO :APPEND aw_prod_id -
                         :APPEND aw_cust_id -
                         :APPEND aw_date -
                         :APPEND aw_channel_id -
                         :APPEND aw_promo_id -
                         :aw_sales_quantity_sold -
                         :aw_sales_amount_sold
" Update the analytic workspace and make the updates permanent
UPDATE
COMMIT

Example 10-7 fetch_sales_for_prod415 Program

ALLSTAT
NLS_DATE_FORMAT = '<YYYY><MM><DD>'
DATEFORMAT = '<YYYY>-<MM>-<DD>'
" Declare a cursor named GRABDATA
SQL DECLARE grabdata CURSOR FOR SELECT prod_id, cust_id, time_id, -
    channel_id, promo_id, quantity_sold, amount_sold FROM sh.sales -
    WHERE prod_id = 415
" Open the cursor
SQL OPEN grabdata
" Fetch new values into the analytic workspace objects
SQL FETCH grabdata LOOP INTO :APPEND aw_prod_id -
                             :APPEND aw_cust_id -
                             :APPEND aw_date -
                             :APPEND aw_channel_id -
                             :APPEND aw_promo_id -
                             :aw_sales_quantity_sold -
                             :aw_sales_amount_sold
" Close the cursor
SQL CLOSE grabdata
" Cleanup from SQL query
SQL CLEANUP
" Update the analytic workspace and make the updates permanent
UPDATE
COMMIT

Closing a Cursor

After you have used a cursor to retrieve all the data in its active set, you close the cursor. If you want to use the cursor again to retrieve data starting from the first row of its active set, then you can use the OPEN statement without having to declare the cursor again. The CLOSE statement does not cancel a cursor declaration; it only renders the active set undefined.

The following is the syntax of the CLOSE statement when it is used as an argument in the OLAP DML SQL command.

SQL CLOSE cursor-name

Cleaning up the SQL Cursors

Once you are completely done making OLAP DML SQL calls, you should cancel all the SQL cursor declarations and free the memory resources for all SQL cursors. You perform these actions by using CLEANUP as the argument to the OLAP DML SQL command:

SQL CLEANUP

After you have cancelled all SQL cursors in this manner, you cannot use them again unless you issue new SQL DECLARE CURSOR and SQL OPEN commands.