• Keine Ergebnisse gefunden

Background and Related Work

2.2 OLAP and the Multidimensional Data Model

2.2.2 OLAP Operations

Cube structure builds the foundation of OLAP applications. A variety of cross-dimensional calculations and aggregations for real-time data analysis can be performed within a cube or across multiple cubes by applying specialized query operations. OLAP operators pursue different tasks, from specifying the subset of interest and manipulating its dimensionality and granularity to filtering the dataset, obtaining new measures by linking multiple cubes, ranking the aggregates according to some function, specifying user-defined hierarchies, etc.

The operators can be subdivided by function into the following groups:

18 Chapter 2 : Background and Related Work

Roll-up Drill-down

Roll-up Drill-down

Roll-up Drill-d

own

Roll-up

Drill-down

Figure 2.5:DRILL-DOWNandROLL-UPacross dimensions (top) and within a dimension hierarchy (bottom)

1. AggregationoperatorsDRILL-DOWNandROLL-UPand their variants (DRILL-THROUGH,DRILL-WITHIN, DRILL-ASIDE,PROJECT) manipulate dimensionality and granularity of the output data cube. The abil-ity to drill down into any level of any dimension is known as“drilling anywhere”.

2. FilteringoperatorsSLICE&DICEand its special cases (SLICE,DICE,SELECT,FILTER,CONDITIONAL HIGHLIGHTING) reduce the size of the data set by adding predicates on dimensional characteristics.

RANKoperator applies filtering and ranking to the aggregated values themselves.

3. ReorderingoperatorsPIVOT(ROTATE) andSWITCHenable visual rearrangement of the output without any changes in the underlying dataset.

4. Restructuring operatorsDRILL-ACROSS,DRILL-AROUND,PUSH&PULL, andINSERT/DELETE LEVEL transform the cube’s schema to obtain new measures, dimensions, or hierarchy levels.

To gain an insight into the specifics of OLAP queries, we provide a detailed description of the popular operators from the above enumeration:

DRILL-DOWNandROLL-UPare inverse operations that use dimension hierarchies to perform aggrega-tion steps. ROLL-UPaggregates a measure to a coarser granularity whereas DRILL-DOWN navigates from aggregated data to a higher level of detail. Drilling down and rolling up by respectively adding and eliminating dimensions from the drill path changes the dimensionality of the resulting cube, as shown in the upper part of Figure 2.5. However, the dimensionality remains unaffected when drilling within a dimension, as presented in the lower part of Figure 2.5 at the example ofCountrydimension.

DRILL-THROUGHallows to jump back to the original fact data behind the selected aggregate values.

2.2 : OLAP and the Multidimensional Data Model 19

Define a slice by reducing Degree to

a single value "Mathematics, Dipl." Dice by filtering Semester to select only winter semester values

Figure 2.6: Example of aSLICE&DICEoperation

DRILL-WITHINrefers to switching from one classification to another within the same dimension. Drill-within prompts the user to select the path for drilling down at each level where multiple hierarchies are encountered within a dimension. For instance, at theYearlevel the user may choose to drill either into Monthor intoWeek.

DRILL-ASIDEenables navigation in a many-to-many mapping of the aggregation path. For instance, when drilling fromYeardown toWeek, the last week of a year may partially belong to the next year and, therefore, the user is prompted to decide how the respective subaggregates should be handled.

SLICE&DICEcorresponds to reducing the cube’s dimensionality by projecting the data onto a subset of dimensions while setting other dimensions to selected values, and is composed of two sub-operations:

SLICEselects a subcube corresponding to a single value for some dimension in the drill path, while DICEreduces the size of a slice by filtering its data along any dimension(s) in the drill path. Figure 2.6 visualizes the effects of a slice&dice in the student enrollments cube:Degreedimension is “sliced”

down to a single value “Mathematics, Dipl.” with subsequent “dicing” ofSemesterdimension by selecting values“WS04”and“WS05”.

SELECTis the dual of dicing: rather then selecting the elements of a dimension to be included, the user is prompted to specify the condition on eliminating the data from the result.

RANK, or topn/bottomnqueries, retrieves only the first/last cells in the result sorted by the aggregate’s value – for example, the 10 most popular degrees in the academic year 2004/05.

PIVOT, orROTATE, is a visualization operation that rotates the dimension axes in the view in order to provide an alternative presentation of the data. The data subset itself remains unaffected. Figure 2.7 gives an example of pivoting the original data cube by means of two rotations.

DRILL-ACROSSallows to query multiple cubes that have at least one common dimension, combining the results into a single data set [48, 82, 145]. The join is preceded by theROLL-UPof all participating cubes to the set of their common dimensions to ensure compatible granularity of the measures to be combined. Consider a sample task of deriving a new measureacceptance ratio(i.e., the number of enrollments divided by the number of applications), which can be achieved by combining enrollment facts with the corresponding application facts. The latter is available in form of a 2-dimensional cube with dimensionsSemesterandCountry. Figure 2.8 demonstrates the intermediate operation of aggre-gating the enrollment facts bySemesterandCountry– the set of dimensions common to both cubes – and the final step of computing a new cube. Advanced drill-across options are elaborated in [4].

PUSHandPULLoperations, proposed in [5], enable the interchange of dimension and measure roles in a query. The push operator is used to convert a dimension into a measure to be aggregated. The pull

20 Chapter 2 : Background and Related Work

Figure 2.7: Example of aPIVOToperation with two rotations (shaded cells are visible in both views)

Student enrollment numbers

Roll-up

Drill-across

Student enrollment numbers

Application numbers Acceptance ratios

Figure 2.8: Example of aDRILL-ACROSSoperation with two input cubes

operator is the converse of push: it converts a measure attribute into a dimension. Combined, push and pull enable uniform treatment of measure and dimension characteristics.