Skip Headers

Oracle9i OLAP Developer's Guide to the OLAP DML
Release 2 (9.2)

Part Number A95298-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to beginning of chapter Go to next page

Defining Data Objects, 8 of 11


Defining Variables That Handle Sparse Data Efficiently

A variable with sparse data is one in which a relatively high percentage of the cells of the variable do not contain actual data. Such "empty," or NA, values take up storage space in the analytic workspace.

There are two types of sparsity:

Definition: Composite

A composite is an internal object that is used to store sparse data compactly in a variable. A composite is a list of dimension-value combinations in which one value is taken from each of the dimensions on which the composite is based.

Composites can be named or unnamed:

Because the values in composites are maintained automatically, using composites is the recommended way of handling sparsity in your analytic workspace.

Using composites is one of the most important steps you can take to manage sparsity, which contributes to keeping analytic workspace size to a minimum and promoting good performance.

Why You Should Use Named Composites

Using named composites makes it easier to track which variables share the same composite. A named composite in the dimension list of a variable tells Oracle OLAP that the dimensions in the named composite are sparse in this variable, and that this composite is shared only with other variables that have the same sparsity pattern.

In contrast, all variables defined with an unnamed composite that have exactly the same dimensions in the same order will automatically share that unnamed composite. If these variables have different sparsity patterns, performance will suffer.

You can also manage sparsity by using a conjoint dimension to hold dimension-value combinations for which a given variable has data. However, because the values in composites are automatically maintained, using composites is the recommended way of handling sparsity in your analytic workspace.

How to Use Composites

When you define a multidimensional variable, you can specify a composite in the list of dimensions.

First, define a named composite by using the DEFINE COMPOSITE command. Then, define the variables by using the following syntax to include a named composite in the dimension list of each variable.

composite-name <dims>

For example, suppose you define a composite named proddist, whose dimensions include product and district, as shown in the following command.

DEFINE proddist COMPOSITE <product district>

Now, suppose you want to define a sales variable in which time will be the fastest-varying dimension and the proddist composite will be the slowest-varying dimension, as shown in the following command.

DEFINE sales <time proddist<product district>>

Note that you should never use the SPARSE keyword with a composite. Essentially, you use the name of the composite instead of the SPARSE keyword.


See Also:

Naming, Renaming, and Unnaming Composites

You can use the RENAME command to:

Adding Data to a Variable That Uses a Composite

When you define a multidimensional variable, you can specify that a composite is used instead of its base dimensions to dimension the data. Later, as you add values to the dimensions of the variable for which you defined a composite, the following actions are taken:

For a variable that uses a composite, cells are created for only those dimension values that are used in the composite dimension-value combinations; it does not create a variable cell for every value in the base dimensions. Data for a variable is stored in order, cell by cell, for each combination of dimension values. From the perspective of data storage, each combination of base dimension values in a composite is treated like the value of a regular dimension. This means that if you define a variable with one regular dimension and one composite, then it is stored like a two-dimensional variable.

Example 3-1 Defining a Variable That Uses a Named Composite

If your company does promotional marketing for certain products in some but not all districts, then your variable data will be sparse along the product and district dimensions. Therefore, suppose you define a composite named proddist, whose base dimensions are product and district. There are dimension-value combinations in the composite only for those values that have data. For example, if you run a promotion for tents but not canoes, then the composite includes the tents and city combinations, but not the canoes and city combinations.

The following command creates a variable called promo that is dimensioned by month and a composite named proddist, whose base dimensions are product and district.

DEFINE promo INTEGER <month proddist<product district>>

The following conceptual figure illustrates the promo variable that is created by this command, the month, product and district base dimensions, a named composite (proddist) created from the product and district base dimensions, and the internal relation that is created between the product and district base dimensions and the proddist composite.


Text description of compvar.gif follows
Text description of the illustration compvar.gif

The following table is an example of the sequence in which the data for the promo variable might be stored.


 TENTS  BOSTON  JAN95  TENTS
 BOSTON
 FEB95
 TENTS
 BOSTON
 MAR95
 . . .
 . . .
 . . .
 RACQUETS
 CHICAGO
 JAN95
 RACQUETS
 CHICAGO
 FEB95
 . . .
 . . .
 . . .

 257

 379

 428

 . . .

 635

 192

 . . .

Defining a Variable with a Single-Dimension Composite

When you specify a composite for just one dimension in a variable definition, a single-dimension composite is created. The values of this composite will be a subset of the values in its base dimension.

It is a good idea to use single dimension composites when a variable will share the same dimensions as some other variables, but for a particular single dimension, the variable will only have data for some of the values of the dimension.

Suppose you have already defined a variable called actual with the dimensions line, division, and month. The actual variable does not contain any NA values. You need to define a variable called budget, which requires much less detail than actual. For example, budget only needs 10 percent of the line dimension values, while actual needs all of them.

If you define budget without setting sparsity, then all of the line dimension values are present for every month and division, but 90 percent of the line dimension cells will have NA values.

To handle sparse data in this case, you define budget with an unnamed composite for only the line dimension as shown below.

DEFINE budget DECIMAL <SPARSE <line> division month>

Go to previous page Go to beginning of chapter Go to next page
Oracle
Copyright © 2001, 2002 Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback