• Keine Ergebnisse gefunden

Join Tuning Database Tuning

N/A
N/A
Protected

Academic year: 2022

Aktie "Join Tuning Database Tuning"

Copied!
2
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Assignment 5

Join Tuning

Database Tuning

Start date: Dec 1, 2016

Due date: December 13, 23:59 Grading: 5 points

In this assignment you will experiment with different join algorithms in PostgreSQL.

1. Download http://dbresearch.uni-salzburg.at/downloads/teaching/2016ws/dbt/dblp.zip This archive contains two tab separated files (publ.tsv and auth.tsv) that store authors and their publications as found in the DBLP1 bibliography. The imported tables have the following schemas:

• Auth(name(49),pubID(129))

• Publ(pubID(129),type(13),title(700),booktitle(132), year(4),publisher(196))

You can assume that all attribute values are strings; the maximum string length is shown in brackets. Publ.pubID is a key.

2. Study index nested loop join, merge join, and hash join for the following queries:

SELECT name,title FROM Auth, Publ

WHERE Auth.pubID=Publ.pubID;

SELECT title FROM Auth, Publ

WHERE Auth.pubID=Publ.pubID AND Auth.name=’Divesh Srivastava’

(a) What join strategies does the system propose (i) without use of an index, (ii) with a unique non-clustering index on Publ.pubID, and (iii) with two clustering indexes, one onPubl.pubID and the other one onAuth.pubID?

(b) Test the index nested loop join with a non-clustering index (i) on Publ.pubID, (ii) on Auth.pubID, (iii) and both Publ.pubID and Auth.pubID. Give the response times and discuss the query plans.

(c) Test the merge join (i) without index, (ii) with two non-clustering indexes, and (iii) with two clustering indexes. Give response times and discuss the query plans.

1http://www.informatik.uni-trier.de/~ley/db/

(2)

(d) Test the hash join without index and give the response time.

(e) Are the results (query plan and throughput) expected? Why (not)?

Note: You can stop queries that run for more than 10 minutes onbiber. Check the query plan to avoid queries with excessive runtime.

Notes about PostgreSQL

• Clustering indexes: You first create an index, then you use the index to cluster the table (i.e., physically sort the table by the index attribute):

CREATE INDEX year idx ON publ(year);

CLUSTER publ USING year idx;

• Query plan: The command EXPLAIN shows the query plan without executing the query. The commandEXPLAIN ANALYZEalso executes the query. Example:

EXPLAIN ANALYZE SELECT * FROM publ WHERE year=’2006’;

• Join strategy: You can influence the optimizer choice with the switches enable hashjoin, enable mergejoin, and enable nestloop. Example:

SET enable hashjoin TO true;

SHOW enable hashjoin;

Please indicate the average time per group member that was spent solving this assignment. The time that you indicate will haveno impact on your grade.

Grading scheme:

Category max. Points

Description of your setup 0.5

Join strategies (2a) 0.5

Response times (2b-2d) 0.5

Query plans discussion (2b-2d) 1.5

Interpretation of results 2

Referenzen

ÄHNLICHE DOKUMENTE

free list: list of unused database buffer pages a thread that needs a free page locks the free list during the lock no other thread can get a free page Solution: Logical

Augsten (Univ.. Overhead of Table vs. Salzburg) DBT – Concurrency Tuning Sommersemester 2019 25 / 74. Overhead of

Original query 1: description of query plan 0.5 Rewritten query 1: description of query plan 0.5 Difference between original and rewritten query 1 0.5 Original query 2: description

free list: list of unused database buffer pages a thread that needs a free page locks the free list during the lock no other thread can get a free page Solution: Logical

Write-back mode: request terminates when data is written to cache data is written from cache to disk later. writes become faster since they do not have to wait for the disk if

The account of the company (account number 0, initial balance 100) and the accounts of all employees (account numbers 1 to 100, initial balance 0) are with the same bank.. The

(a) What join strategies does the system propose (i) without use of an index, (ii) with a unique non-clustering index on Publ.pubID, and (iii) with two clustering indexes, one

The account of the company (account number 0, initial balance 100) and the accounts of all employees (account numbers 1 to 100, initial balance 0) are with the same bank.. The