© 1999 FORWISS
FORWISS FORWISS
General Research Report
Implementation and Optimization Issues of the
ROLAP Algebra
F. Ramsak, M.S. (UIUC) Dr. V. Markl
Prof. R. Bayer, Ph.D.
FORWISS FORWISS
Contents
● Motivation
● ROLAP Algebra Recap
● Optimization Issues
–Handling of Restrictions –Aggregation Networks
● Future Work & Summary
© 1999 FORWISS
FORWISS FORWISS
Example DW Model
Sales Cost Quantity
Year
Month All Time
Quarter Year
Month All Time
Quarter
Region Nation Trade Type Business Type
CUSTOM ER All Customer Region Nation Trade Type Business Type
CUSTOM ER All Customer PROD
UCT Sector
Category Produ
ct Group Conta
iner
All Produ PR cts
ODUCT Sector
Category Produ
ct Group Conta
iner
All Produ cts
FORWISS FORWISS
User‘s View of a Report
Sum - Sales Year Quarter
1998 1998 Total Total
Region Nation 1 2 3 4
Asia China
Japan Asia Total Europe France
Germany Spain Europe Total Total
1
1 2
2
3 9
7
5 8
4 7
5 8
6 4
Grouping combinations used to fill pivot table:
(1){Y,Q,R,N} (2){Y,Q,R} (3) {Y,Q}
(4) {Y,R,N} (5){Y,R} (6){Y}
(7) {R,N} (8) {R} (9){} = ALL
© 1999 FORWISS
FORWISS FORWISS
7
G7
g
g
h F
R ALL
PAD F
G R POT
G
∈ h
÷÷ø ö
ççè æ
=
∈
) )),
( ( ( )
, ,
( α γ
POT: Pivot Organized Tuples
● We may also write
forPOT(R,G,F).
G,F
(R)
FORWISS FORWISS
POT-Example
POT(R,{{A},{A,B}},{sum(D)}) yields the table:
A B Sum(D)
a1 ALL ***
… ***
n
ALL ***
a1 b1 ***
… … ***
an ***
… … ***
a1 bm ***
… … ***
an bm ***
a
γA γA,B
) (D
αsum αsum(D)
R
7
{ }, (ALL)
pad AB pad{ }A,B (ALL)
A B sum(D) A sum(D)
A B sum(D) A B sum(D)
A B sum(D)
© 1999 FORWISS
POT Extension: Group Filtering
● Filtering of generated groups
(like with the HAVING clause in SQL)
with Hcontaining a predicate H[g] for each grouping g inG
7 7
G g
g F g
h H
R ALL
PAD
F H G R POT
G
∈ h
÷÷ø ö
ççè æ
=
∈
) ))),
( ( ( (
) , , , (
]
[
α γ
σ
FORWISS FORWISS
Group Filtering Example
● Report Years, Product-Group sales totals and sales/year for PGs with less than 10 Mio sales
)}) (
{
}}, 10
) (
{ {}, {{},
}}, ,
{ }, {
}, {{
, (
Sales sum
Mio Sales
sum
PG Y PG Y
Fact POT
<
© 1999 FORWISS
FORWISS FORWISS
Straight Forward SQL Generation
● POT(R,{{A},{A,B}},{sum(D)}) maps directly to:
SELECT A, ‘ALL’, sum(D) FROM R
GROUP BY A
UNION
SELECT A, B, sum(D) FROM R GROUP BY A,B
● Disadvantages:
–Efficient execution depends on optimizer of underlying DBMS –no UB-Tree support on SQL interface guaranteed
FORWISS FORWISS
Handling Restrictions
● Semantic of ALLvalue
● Pushing Restrictions Down
–Pushing Through POT: Restrictions on all groups
–Pushing down inside POT:
Restrictions on individual groups may be pushed down (i.e., before grouping) if they do not contain constraints on the aggregation results
) , ), ( ( ))
, , (
(POT R G F POT ρ R G F
ρ σ
σ =
? )) }}, {
}, {{
, (
1998
( =
=
POT R Year PG F
σ
Year© 1999 FORWISS
) (Sales
α
sum{ , }
( ALL ) pad
YearMonthγ
Yearγ
Year,Month) (Sales
α
sumFACT
7
{ , }
( ALL ) pad
YearMonthYear Year=1997∨ =1998)
σ(
FORWISS FORWISS
) (Sales
α
sum{ , }
( ALL ) pad
YearMonthγ
Yearγ
Year,Month) (Sales
α
sumFACT
7
{ , }
( ALL ) pad
YearMonthYear Year=1997∨ =1998)
σ(
© 1999 FORWISS
FORWISS FORWISS
Aggregation Networks
● Efficient generation of multiple groups
–Usage of previous generated (more finer) groups instead of fact table as starting point
–Only one access to the fact table for multiple groups
● Problems: Size of aggregation nets
–Hierarchy semantic reduces aggregation nets significantly
● UB-Tree & Tetris techniques have high potential to optimize aggregation nets
–Grouping requires sorting
–Sorted writing of large temporary results saves additional processing time
FORWISS FORWISS
Example of an Aggregation Network
(Year, Month, Productgroup)
(Year) (Month) (Productgroup)
(Year, Month) (Year, Productgroup) (Month, Productgroup) ( )
© 1999 FORWISS
FORWISS FORWISS
Aggregation Net with
Hierarchies
(Year, Month, Productgroup)
(Year) (Productgroup)
(Year, Month) (Year, Productgroup) ( )
Tetris: sort according to Y
Sort according to PG (or sorted writing+Scan)
FORWISS FORWISS
POT and Aggregation Nets
) (Sales
αsum
{ , }(ALL) padYearMonth
γYear
Month Year,
γ
) (Sales
αsum
FACT
7
{ , }(ALL) padYearMonth
CTV oductgroup Germany Country Year
Year=1997∨ =1998)∧ = ∧Pr =
σ(
© 1999 FORWISS
FORWISS FORWISS
Optimization Issues of AggregationNets
● Find minimal spanning tree for the specified groupings
–Vertices: groupings
–Edge weights: cost of computing new group#
● Cost factors:
–Group size –Required sorting –...
FORWISS FORWISS
Optimization Issues
of Aggregation Nets
(Year, Month, Productgroup)
(Year) (Productgroup)
(Year, Month) (Year, Productgroup) ( )
C1 C2
C5
C4
C3
C6 C7
© 1999 FORWISS
Summary and Future Work
● Aggregation networks have a very potential to speed up POToperations
● Standard grouping/aggregation algorithms may benefit significantly from UB-Tree/Tetris techniques
● Upon availability of resources: Implementation of basic ROLAP algebra processing as part of a master thesis