General Research Report
Implementation and
Optimization Issues of the ROLAP Algebra
F. Ramsak, M.S. (UIUC) Dr. V. Markl
Prof. R. Bayer, Ph.D.
Contents
Motivation
ROLAP Algebra Recap
Optimization Issues
– Handling of Restrictions – Aggregation Networks
Future Work & Summary
Example DW Model
Sales Cost Quantity
Year
Month All Time
Quarter Year
Month All Time
Quarter
Region Nation Trade Type Business Type
l Customer Region
Nation Trade Type Business Type
l Customer PR
Sector Categ
ory Produ
ct Group Conta
iner
All Products PR
Sector Categ
ory Produ
ct Group Conta
iner
All Products
User‘s View of a Report
Sum - Sales Year Quarter
1998 1998 Total Total
Region Nation 1 2 3 4
Asia China
Japan Asia Total
Europe France Germany Spain Europe Total
Total
1
1 2
2
3 9
7
5 8
4 7
5 8
6 4
Grouping combinations used to fill pivot table:
(1){Y,Q,R,N} (2){Y,Q,R} (3) {Y,Q}
(4) {Y,R,N} (5){Y,R} (6){Y}
(7) {R,N} (8) {R} (9){} = ALL
G
g
g
h F
R ALL
PAD F
G R
POT
G
h
) )),
( (
( )
, ,
(
POT: Pivot Organized Tuples
We may also write
for POT(R,G,F).
G,F
(R)
POT-Example
POT(R,{{A},{A,B}},{sum(D)}) yields the table:
A B Sum( D)
a1 ALL ***
… ***
n
ALL ***
a1 b1 ***
… … ***
an ***
… … ***
a1 bm ***
… … ***
an bm ***
a
A A ,B
) ( D
sum sum ( D )
R
, ( ALL )
pad A B pad A , B ( ALL )
A B s u m ( D ) A s u m ( D )
A B s u m ( D ) A B s u m ( D )
A B s u m ( D )
POT Extension: Group Filtering
Filtering of generated groups
(like with the HAVING clause in SQL)
with H containing a predicate H[g] for each grouping g in G
G
g
g F
g
h H
R ALL
PAD
F H
G R
POT
G
h
) ))),
( (
( (
) ,
, ,
(
]
[
Group Filtering Example
Report Years, Product-Group sales totals and sales/year for PGs with less than 10 Mio sales
)}) (
{
}}, 10
) (
{ {}, {{},
}}, ,
{ }, {
}, {{
, (
Sales sum
Mio Sales
sum
PG Y
PG Y
Fact POT
Straight Forward SQL Generation
POT(R,{{A},{A,B}},{sum(D)}) maps directly to:
SELECT A, ‘ALL’, sum(D)
FROM R
GROUP BY A
UNION
SELECT A, B, sum(D)
FROM R
GROUP BY A,B
Disadvantages:
– Efficient execution depends on optimizer of underlying DBMS – no UB-Tree support on SQL interface guaranteed
Handling Restrictions
Semantic of ALL value
Pushing Restrictions Down
– Pushing Through POT: Restrictions on all groups
– Pushing down inside POT:
Restrictions on individual groups may be pushed down (i.e., before grouping) if they do not contain constraints on the aggregation
results
) , ), ( (
)) ,
, (
(POT R G F POT R G F
? ))
}}, {
}, {{
, (
1998
(
POT R Year PG F
Year) ( Sales
sum ,
( ALL ) pad
Year Month
Year
Year,Month) ( Sales
sum
,
( ALL ) pad
Year MonthYear
Year1997 1998)
() ( Sales
sum ,
( ALL )
pad
Year Month
Year
Year,Month) ( Sales
sumFACT
,
( ALL )
pad
Year MonthYear
Year1997 1998)
(
Aggregation Networks
Efficient generation of multiple groups
– Usage of previous generated (more finer) groups instead of fact table as starting point
– Only one access to the fact table for multiple groups
Problems: Size of aggregation nets
– Hierarchy semantic reduces aggregation nets significantly
UB-Tree & Tetris techniques have high potential to optimize aggregation nets
– Grouping requires sorting
– Sorted writing of large temporary results saves additional processing time
Example of an Aggregation Network
(Year, Month, Productgroup)
(Year) (Month) (Productgroup)
(Year, Month) (Year, Productgroup) (Month, Productgroup)
Aggregation Net with
Hierarchies
(Year) (Productgroup)(Year, Month) (Year, Productgroup)
Tetris: sort according to Y
Sort according to PG
(or sorted writing+Scan)
POT and
Aggregation Nets
) ( Sales
sum
, ( ALL )
pad Year Month
Year
Month
Year ,
) ( Sales
sum
, ( ALL )
pad Year Month
CTV oductgroup
Germany Country
Year
Year 1997 1998 ) Pr
(
Optimization Issues of AggregationNets
Find minimal spanning tree for the specified groupings
– Vertices: groupings
– Edge weights: cost of computing new group#
Cost factors:
– Group size
– Required sorting – ...
Optimization Issues
of Aggregation Nets
(Year, Month, Productgroup)
(Year) (Productgroup)
(Year, Month) (Year, Productgroup)
C1 C2
C5
C4
C3
C6 C7
Summary and Future Work
Aggregation networks have a very potential to speed up POT operations
Standard grouping/aggregation algorithms may
benefit significantly from UB-Tree/Tetris techniques
Upon availability of resources: Implementation of basic ROLAP algebra processing as part of a
master thesis