Continuous Inverse Ranking Queries in Uncertain Streams

Academic year: 2023

(1)

Continuous Inverse Ranking Queries in Uncertain Streams

Thomas Bernecker*, Hans-Peter Kriegel*, Nikos Mamoulis**, Matthias Renz* and Andreas Zuefle*

*) Ludwig-Maximilians-Universität München (LMU), Munich, Germany
http://www.dbs.ifi.lmu.de
{bernecker, kriegel, renz, zuefle}@dbs.ifi.lmu.de

**) University of Hong Kong (HKU), Hong Kong
http://www.cs.hku.hk
nikos@cs.hku.hk

(2)

1. Motivation: Probabilistic Inverse Ranking
2. Continuous Inverse Ranking Queries
   – Initial Computation
   – Incremental Processing
3. Experimental Evaluation
4. Summary

(3)

Inverse Ranking: Return the position of the query object q w.r.t. the score function S

Probabilistic Inverse Ranking: Find all possible positions of q

• Example: Stock rating system, S = Chances − Risk

[Figure: query object q and uncertain objects Stock I, Stock II, Stock III; axis label: Risk]

Possible ranks of q: Rank 1 → 0 %, Rank 2 → 50 %, Rank 3 → 50 %, Rank 4 → 0 %

(4)

• Probabilistic Inverse Ranking (PIR) Query

– Probabilistic database DB where |DB| = n

– Uncertain object o: m alternative locations (discrete uncertainty) or pdf (continuous uncertainty)

– Query object q

– Score function $S : DB \to \mathbb{R}_0^+$

– Definition: ∀ i = 1, ..., k : P(q is on rank i w.r.t. S), where q is on rank i iff there exist exactly i − 1 objects o ∈ DB with S(o) > S(q)

• Challenge: Application to dynamic data

– General stream model with location updates retrieved at a time t

– P(q is on rank i at time t) = $P_q^t(i)$

– Initial computation

– Incremental processing
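The PIR definition for discrete uncertainty can be illustrated by brute force: enumerate every possible world, count the objects that score above q, and add the world's probability to the corresponding rank. The sketch below is only an illustration of the definition, not the method of the talk. The function name `pir_bruteforce` and the stock data are hypothetical; the data was chosen so that the result matches the 0 % / 50 % / 50 % / 0 % rank distribution of the stock example.

```python
from itertools import product


def pir_bruteforce(db, q_score, score, k):
    """Rank distribution of q obtained by enumerating all possible worlds.

    db: list of uncertain objects; each object is a list of
        (location, probability) alternatives whose probabilities sum to 1.
    Returns [P(q on rank 1), ..., P(q on rank k)].
    """
    result = [0.0] * k
    for world in product(*db):           # pick one alternative per object
        prob = 1.0
        higher = 0                       # objects scoring above q in this world
        for location, p in world:
            prob *= p
            if score(location) > q_score:
                higher += 1
        if higher < k:                   # q is on rank higher + 1
            result[higher] += prob
    return result


def score(location):
    """S = Chances - Risk for a (chances, risk) pair."""
    chances, risk = location
    return chances - risk


# Hypothetical (chances, risk) alternatives, not taken from the slides.
stocks = [
    [((9, 1), 1.0)],                     # Stock I: always above q
    [((3, 2), 1.0)],                     # Stock II: always below q
    [((7, 2), 0.5), ((4, 3), 0.5)],      # Stock III: above q in half the worlds
]
ranks = pir_bruteforce(stocks, q_score=3, score=score, k=4)
print(ranks)
```

The number of worlds grows as m^n for n objects with m alternatives each, which is why the slides turn to a recurrence instead of enumeration.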

(5)

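– Object o ∈ DB: $p_o^t$ = P(S(o) > S(q) at time t)

– j objects have been processed so far ($o_j$ is the latest)

– Successive processing by the Poisson Binomial Recurrence (PBR):

$$
P^t_{i,j} =
\begin{cases}
1 & \text{if } i = j = 0 \\
0 & \text{if } i < 0 \lor i > j \\
P^t_{i-1,j-1} \cdot p^t_{o_j} + P^t_{i,j-1} \cdot \left(1 - p^t_{o_j}\right) & \text{else}
\end{cases}
$$

where

– $P^t_{i,j}$: i out of j objects have S(o) > S(q)

– $P^t_{i-1,j-1} \cdot p^t_{o_j}$: i−1 out of j−1 objects have S(o) > S(q) and S($o_j$) > S(q)

– $P^t_{i,j-1} \cdot (1 - p^t_{o_j})$: i out of j−1 objects have S(o) > S(q) and S($o_j$) ≤ S(q)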
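The recurrence can be evaluated with a single array of length k that is updated in place, one object at a time. The following is a minimal sketch under that assumption; the function name `pbr` and the in-place layout are illustrative choices, not taken from the slides.

```python
def pbr(probs, k):
    """Poisson Binomial Recurrence, truncated to i = 0..k-1.

    probs[j-1] is the probability that object o_j scores above q.
    Returns row with row[i] = P(exactly i of the objects score above q).
    """
    row = [1.0] + [0.0] * (k - 1)        # P_{0,0} = 1, P_{i,0} = 0 for i > 0
    for p in probs:
        # Walk from high i to low i so row[i-1] still holds P_{i-1,j-1}.
        for i in range(k - 1, 0, -1):
            row[i] = row[i - 1] * p + row[i] * (1.0 - p)
        row[0] *= 1.0 - p                # P_{-1,j-1} = 0
    return row


# Two objects with probabilities 0.1 and 0.6 of scoring above q.
print(pbr([0.1, 0.6], 2))
```

Each object costs O(k), so processing n objects takes O(k·n) time and O(k) space.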

(6)

j = n (i = 0,...,k-1): $P^t_{i,j} = P^t_{i,n} = P_q^t(i+1)$

⇒ PIR result for q ⇒ runtime: O(k·n)

• Optimizations:

– $p_o^t = 0$: o has no effect on the rank of q

– $p_o^t = 1$: increment counter $C^t$

• General case ($0 < p_o^t < 1$) ⇒ process o by PBR: ∀i = 0,...,k-1 :

P(i objects processed by PBR have a higher score than q) = $P^t_{PBR}(i)$

• Initial PIR result:

$$
P_q^t(i) =
\begin{cases}
P^t_{PBR}(i - 1 - C^t) & \text{if } C^t + 1 \le i \le C^t + 1 + k \\
0 & \text{else}
\end{cases}
$$
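The initial computation with both optimizations can be sketched as follows. Objects with probability 0 are skipped, objects with probability 1 only increment the counter, and all others go through the PBR. The names `initial_pir` and `pir_entry` are hypothetical, and the sketch reports ranks 1..k only, treating any index outside the stored PBR array as probability 0.

```python
def pir_entry(row, counter, i):
    """P_q(i) = P_PBR(i - 1 - C); 0 if that index is not stored in row."""
    idx = i - 1 - counter
    return row[idx] if 0 <= idx < len(row) else 0.0


def initial_pir(probs, k):
    """Return (pbr_row, counter, pir) for the probabilities of all objects."""
    row = [1.0] + [0.0] * (k - 1)
    counter = 0
    for p in probs:
        if p == 0.0:                     # no effect on the rank of q
            continue
        if p == 1.0:                     # certainly above q: shift ranks by one
            counter += 1
            continue
        for i in range(k - 1, 0, -1):    # general case: one PBR step
            row[i] = row[i - 1] * p + row[i] * (1.0 - p)
        row[0] *= 1.0 - p
    pir = [pir_entry(row, counter, i) for i in range(1, k + 1)]
    return row, counter, pir


# Probabilities of the four objects in the running example (n = 4, k = 2).
row, counter, pir = initial_pir([0.1, 0.0, 0.6, 1.0], k=2)
print(row, counter, pir)
```

For the example values this yields a counter of 1 and a PIR result of approximately (0, 0.36) for ranks 1 and 2, in line with the worked example on the slides.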

(7)

$p^t_{o_1} = 0.1$, $p^t_{o_2} = 0$, $p^t_{o_3} = 0.6$, $p^t_{o_4} = 1$, $C^t = 0$

[Figure: query object q and uncertain objects o1, o2, o3, o4]

(8)

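• Example: n = 4, k = 2, j = 1

j = 1:

$$P^t_{0,1} = P^t_{-1,0} \cdot p^t_{o_1} + P^t_{0,0} \cdot (1 - p^t_{o_1}) = 0 \cdot 0.1 + 1 \cdot 0.9 = 0.9$$

$$P^t_{1,1} = P^t_{0,0} \cdot p^t_{o_1} + P^t_{1,0} \cdot (1 - p^t_{o_1}) = 1 \cdot 0.1 + 0 \cdot 0.9 = 0.1$$

$p^t_{o_1} = 0.1$, $p^t_{o_2} = 0$, $p^t_{o_3} = 0.6$, $p^t_{o_4} = 1$, $C^t = 0$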

(10)

• Example: n = 4, k = 2, j = 3

j = 1: $P^t_{0,1} = 0.9$, $P^t_{1,1} = 0.1$

j = 3:

$$P^t_{0,2} = P^t_{-1,1} \cdot p^t_{o_3} + P^t_{0,1} \cdot (1 - p^t_{o_3}) = 0 \cdot 0.6 + 0.9 \cdot 0.4 = 0.36$$

$$P^t_{1,2} = P^t_{0,1} \cdot p^t_{o_3} + P^t_{1,1} \cdot (1 - p^t_{o_3}) = 0.9 \cdot 0.6 + 0.1 \cdot 0.4 = 0.58$$

$p^t_{o_1} = 0.1$, $p^t_{o_2} = 0$, $p^t_{o_3} = 0.6$, $p^t_{o_4} = 1$, $C^t = 0$

(11)

j = 1, j = 3: results unchanged ($P^t_{0,2} = 0.36$, $P^t_{1,2} = 0.58$)

$p^t_{o_4} = 1$ ⇒ $C^t = 1$

(12)

• Example: n = 4, k = 2

j = 1: $P^t_{0,1} = 0.9$, $P^t_{1,1} = 0.1$

j = 3:

$$P^t_{0,2} = P^t_{-1,1} \cdot p^t_{o_3} + P^t_{0,1} \cdot (1 - p^t_{o_3}) = 0 \cdot 0.6 + 0.9 \cdot 0.4 = 0.36 = P^t_{PBR}(0)$$

$$P^t_{1,2} = P^t_{0,1} \cdot p^t_{o_3} + P^t_{1,1} \cdot (1 - p^t_{o_3}) = 0.9 \cdot 0.6 + 0.1 \cdot 0.4 = 0.58 = P^t_{PBR}(1)$$

$p^t_{o_1} = 0.1$, $p^t_{o_2} = 0$, $p^t_{o_3} = 0.6$, $p^t_{o_4} = 1$, $C^t = 1$

• Initial PIR result:

$$P_q^t(1) = P^t_{PBR}(1 - 1 - 1) = P^t_{PBR}(-1) = 0$$

$$P_q^t(2) = P^t_{PBR}(2 - 1 - 1) = P^t_{PBR}(0) = 0.36$$

(13)

• Given $P_q^t(i)$, compute the result for time t+1 ∀i = 1,...,k

• Naive solution: Apply PBR ⇒ O(n) ∀i = 1,...,k

• Enhanced solution: just consider update of o ⇒ O(1) ∀i = 1,...,k

– Phase 1

• Remove effect of old value $p_o^t$ from $P^t_{PBR}(i)$, i = 0,...,k-1

• Obtain intermediate result $\hat{P}^t_{PBR}(i)$

– Phase 2

• Incorporate effect of new value $p_o^{t+1}$ in $\hat{P}^t_{PBR}(i)$

• Obtain new PIR result $P_q^{t+1}(i)$

(14)

• Phase 1: Three cases

1. $p_o^t = 0$ ⇒ $\hat{P}^t_{PBR}(i) = P^t_{PBR}(i)$

2. $p_o^t = 1$ ⇒ $\hat{P}^t_{PBR}(i) = P^t_{PBR}(i)$ and $C^{t+1} = C^t - 1$

3. $0 < p_o^t < 1$ ⇒ remove $p_o^t$ from $P^t_{PBR}(i)$:

$$P^t_{PBR}(i) = \hat{P}^t_{PBR}(i-1) \cdot p_o^t + \hat{P}^t_{PBR}(i) \cdot (1 - p_o^t)$$

$$\Rightarrow \quad \hat{P}^t_{PBR}(i) = \frac{P^t_{PBR}(i) - \hat{P}^t_{PBR}(i-1) \cdot p_o^t}{1 - p_o^t}, \qquad \hat{P}^t_{PBR}(0) = \frac{P^t_{PBR}(0)}{1 - p_o^t}$$

(15)

• Phase 2: Three cases

1. $p_o^{t+1} = 0$ ⇒ $P^{t+1}_{PBR}(i) = \hat{P}^t_{PBR}(i)$

2. $p_o^{t+1} = 1$ ⇒ $P^{t+1}_{PBR}(i) = \hat{P}^t_{PBR}(i)$ and $C^{t+1} = C^t + 1$

3. $0 < p_o^{t+1} < 1$ ⇒ compute $P^{t+1}_{PBR}(i)$ applying PBR:

$$P^{t+1}_{PBR}(i) = \hat{P}^t_{PBR}(i-1) \cdot p_o^{t+1} + \hat{P}^t_{PBR}(i) \cdot (1 - p_o^{t+1})$$

• New PIR result:

$$
P_q^{t+1}(i) =
\begin{cases}
P^{t+1}_{PBR}(i - 1 - C^{t+1}) & \text{if } C^{t+1} + 1 \le i \le C^{t+1} + 1 + k \\
0 & \text{else}
\end{cases}
$$
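The two phases can be combined into a single update routine that touches only the k stored PBR values and the counter. This is a minimal sketch of that idea; the names `update` and `pir` are hypothetical, and exact comparisons with 0.0 and 1.0 are assumed to be acceptable because those values act as sentinels for the two special cases.

```python
def update(row, counter, p_old, p_new):
    """Replace one object's probability p_old by p_new.

    row[i] = P_PBR(i) for i = 0..k-1, counter = C.
    Returns the updated (row, counter).
    """
    k = len(row)
    # Phase 1: remove the effect of the old value.
    if p_old == 0.0:
        hat = list(row)
    elif p_old == 1.0:
        hat = list(row)
        counter -= 1
    else:
        hat = [0.0] * k
        hat[0] = row[0] / (1.0 - p_old)
        for i in range(1, k):
            hat[i] = (row[i] - hat[i - 1] * p_old) / (1.0 - p_old)
    # Phase 2: incorporate the effect of the new value.
    if p_new == 0.0:
        new = hat
    elif p_new == 1.0:
        new = hat
        counter += 1
    else:
        new = [0.0] * k
        new[0] = hat[0] * (1.0 - p_new)
        for i in range(1, k):
            new[i] = hat[i - 1] * p_new + hat[i] * (1.0 - p_new)
    return new, counter


def pir(row, counter):
    """PIR result for ranks 1..k; indices outside row count as 0."""
    k = len(row)
    return [row[i - 1 - counter] if 0 <= i - 1 - counter < k else 0.0
            for i in range(1, k + 1)]


# Replay of the running example: o3 changes 0.6 -> 0.2, then o4 changes 1 -> 0.
row, counter = [0.36, 0.58], 1
row, counter = update(row, counter, 0.6, 0.2)
print(pir(row, counter))
row, counter = update(row, counter, 1.0, 0.0)
print(pir(row, counter))
```

One caveat of this sketch: when the old probability is close to 1, the division by (1 − p_old) in Phase 1 can amplify floating-point rounding errors, so a practical implementation might fall back to a full recomputation in that situation.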

(16)

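• Example: n = 4, k = 2

$p^t_{o_1} = 0.1$, $p^t_{o_2} = 0$, $p^t_{o_3} = 0.6 \to p^{t+1}_{o_3} = 0.2$, $p^t_{o_4} = 1 \to p^{t+2}_{o_4} = 0$, $C^t = 1$

[Figure: query object q and uncertain objects o1, o2, o3, o4, before and after the updates]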

(17)

Update of $o_3$ at time t+1: $p^t_{o_3} = 0.6$, $p^{t+1}_{o_3} = 0.2$, $C^t = 1$

– Phase 1 (Case 3):

$$\hat{P}^t_{PBR}(0) = \frac{P^t_{PBR}(0)}{1 - p^t_{o_3}} = \frac{0.36}{0.4} = 0.9$$

$$\hat{P}^t_{PBR}(1) = \frac{P^t_{PBR}(1) - \hat{P}^t_{PBR}(0) \cdot p^t_{o_3}}{1 - p^t_{o_3}} = \frac{0.58 - 0.9 \cdot 0.6}{0.4} = 0.1$$

– Phase 2 (Case 3):

$$P^{t+1}_{PBR}(0) = \hat{P}^t_{PBR}(-1) \cdot p^{t+1}_{o_3} + \hat{P}^t_{PBR}(0) \cdot (1 - p^{t+1}_{o_3}) = 0 \cdot 0.2 + 0.9 \cdot 0.8 = 0.72$$

$$P^{t+1}_{PBR}(1) = \hat{P}^t_{PBR}(0) \cdot p^{t+1}_{o_3} + \hat{P}^t_{PBR}(1) \cdot (1 - p^{t+1}_{o_3}) = 0.9 \cdot 0.2 + 0.1 \cdot 0.8 = 0.26$$

– PIR result:

$P_q^t(1) = 0$ → $P_q^{t+1}(1) = 0$

$P_q^t(2) = 0.36$ → $P_q^{t+1}(2) = 0.72$

(18)

• Example: n = 4, k = 2

Update of $o_4$ at time t+2: $p^t_{o_4} = 1$, $p^{t+2}_{o_4} = 0$

– Phase 1 (Case 2):

$$\hat{P}^{t+1}_{PBR}(0) = P^{t+1}_{PBR}(0) = 0.72, \qquad \hat{P}^{t+1}_{PBR}(1) = P^{t+1}_{PBR}(1) = 0.26, \qquad C^{t+2} = C^{t+1} - 1 = 0$$

– Phase 2 (Case 1):

$$P^{t+2}_{PBR}(0) = \hat{P}^{t+1}_{PBR}(0) = 0.72, \qquad P^{t+2}_{PBR}(1) = \hat{P}^{t+1}_{PBR}(1) = 0.26$$

– PIR result:

$P_q^{t+1}(1) = 0$ → $P_q^{t+2}(1) = P^{t+2}_{PBR}(1 - 1 - 0) = P^{t+2}_{PBR}(0) = 0.72$

$P_q^{t+1}(2) = 0.72$ → $P_q^{t+2}(2) = P^{t+2}_{PBR}(2 - 1 - 0) = P^{t+2}_{PBR}(1) = 0.26$

(19)

[Figure: time per update [ms], enhanced vs. naive]

(20)

[Figure: time to process the full stream [ms] vs. database size n, enhanced vs. naive; dimensions = 2, m = 10, σ = 5, k = n, buffer = 3]

(21)

[Figure: time to process the full stream [ms], enhanced vs. naive]

(22)

[Figure: time to process the full stream [ms] vs. standard deviation σ, enhanced vs. naive; n = 10,000, dimensions = 2, m = 10, k = n, buffer = 3]

(23)

• Enhanced incremental processing: update costs of O(k) instead of O(k·n)

• The framework can be adapted to other query types, e.g. the probabilistic threshold inverse ranking query

• Future work: approximate approach using lower and upper bounds for the probabilities and applying the concept of Generating Functions
