Relational Database Systems 2

(1)

Wolf-Tilo Balke

Benjamin Köhncke

Institut für Informationssysteme

Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

Relational Database Systems 2

3. Indexing and Access Paths

(2)

3.1 Introduction to access paths 3.2 Files and blocks

3.3 Indexing

– Single-level indexes – Multi-level indexes – Hash indexes

3.4 Physical Tuning

Datenbanksysteme 2 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 2

3 Indexing and Access Paths

(3)

• Databases persistently store data

• Major problem: efficiency of data access

– Obviously depends on the hardware used

• But also depends to a large degree

– On the allocation of data on the disks – On intelligent buffer management

– On creating access paths and indexing data

• Physical tuning of the database is a main task of database administrators

3.1 Introduction to Access Paths

(4)

• For persistent storage databases are mapped into a number of files

– Located in specially protected parts of the file system (tablespaces, etc.)

– Actually maintained by the operating system

• Each file is partitioned into fixed length blocks (pages)

– Smallest unit transferred from/to storage – Block size is specified on DB creation – Is a multiple of the OS block size

– A block usually contains several data records

Datenbanksysteme 2 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 4 SKS 10.5

3.1 Introduction to Access Paths

(5)

• Harddisk Sectors are abstracted by the file system to blocks

• DBMS abstracts FS blocks to DBMS blocks

3.1 Sectors and Blocks

DISK

FS Block 1 DBMS Block 1

DBMS Block 2

DBMS Block 5 DBMS Block 3 DBMS Block 4

DBMS Block 6 FS Block 2

FS Block 3 FS Block 4 FS Block 5 FS Block 6

(6)

• For data access always complete blocks (pages) are transferred from disk into the main memory

– The part used for holding copies of blocks is referred to as DB buffer and managed by the buffer

manager

– Optimized (pre-)fetching of blocks greatly improves performance

• If a data record is requested by the DB

– If the block is buffered, return main memory address – Else allocate space in the buffer, fetch the block

from disk, and return main memory address

3.1 Buffer Management

(7)

• Once new blocks are fetched, generally currently buffered blocks have to be evicted

– If blocks have been modified, they must be written back to disk (which is not always possible)

– Writing of blocks depends on the recovery strategy:

• Blocks that cannot yet be written are called pinned blocks

• Before checkpoints there might be a forced output of blocks

• Several buffer replacement strategies can be applied

3.1 Buffer Management

(8)

• Least recently used (LRU): the „oldest‟ block is replaced

– Usually used in OS buffer management – Simple, yet effective strategy

– Does not use semantics of the data

• Toss immediate: After the DB has finished to process a block, the block is immediately replaced

– Example: block-nested loop join between tables R, S

• Take one block from R and compare against all blocks in S

• The block from R can be thrown out, but no block from S

3.1 Replacement Strategies

(9)

• Expected Reuse: statistics about query frequencies assess how useful a buffered block is

– The block with least expected usefulness is evicted – Example: index blocks are more often

addressed than data blocks

• In any case, whatever the strategy:

Once a block has been modified and has not yet been written to disk, it cannot be evicted!

– Note: recovery subsystem has to agree before writing a block to disk

3.1 Replacement Strategies

(10)

• Overhead & Payload (e.g., ORACLE data blocks)

– Header contains general block information like block address, type of segment (data, index,…) – Table directory contains

tables that have rows in the block

– Row directory contains information about the actual rows in the block (IDs, addresses,…)

– Row data is actual data

3.2 Blocks (Pages)

(11)

• An Extent is a logical collection of blocks (usually adjacent)

– Fixed size, once more data space is needed for rows a new extent is allocated

– Remember: reading adjacent blocks improves access time

• Segments are collections of logically connected extents

– For instance tables, index segments, or rollback segments

3.2 Extents and Segments

(12)

• A tablespace is the logical storage space needed for the data in a table

– A grouping of multiple files allocated on disk

• Example: Data1.ora, system.ora, test.dbf

– Good practice to have one tablespace for tables, and a different one for indexes

CREATE TABLESPACE user_data

DATAFILE ‘udata.ora’ SIZE 10M EXTENT MANAGEMENT LOCAL

SEGMENT SPACE MANAGEMENT AUTO

3.2 Tablespaces

(13)

• Records of different relations should be stored in individual files

– Storing related records in the same block minimizes disk accesses

• Multitable clustering file organization may

store records of different tables in the same block

– Good e.g. for some often occurring joins

3.2 File Organization

(14)

• Files reserve space in the file system

– File size can be specified or change dynamically

– Default values strongly differ (e.g., DB2 allocates 100 MB)

3.2 File Organization

Schema file

Index file with index blocks Files related to table

“content_type_lecture”

Data file with data blocks

(15)

• Organization of records in the file

– Heap – a record can be placed anywhere in the file where there is space

– Sequential – store records in sequential order, based on the value of the search key of each record

– Hashing – a hash function is computed on some

attribute of each record. The result specifies in which block of the file the record should be placed

3.2 File Organization

(16)

• Data records have to be written in a file such that the entire record can be accessed with minimal disk accesses

– Fixed length records (easy to implement)

– Variable length records (storage space efficient)

TYPE deposit = RECORD

name : CHAR(22);

accNo : CHAR(10);

balance : REAL;

END

– If each char is 1 byte and a real 8 bytes, a record takes 40 bytes

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

(17)

• Fixed length records

– 40 bytes for the first record, 40 bytes for the second,…

– Problem: if the block size is not a multiple of 40, records may cross boundaries

– Problem: deleting a record can be either done by marking it as deleted, or by replacing it with some other record of the file

• Keeping deleted items? Re

ading slo

w!

• Move up all other items? Efficiency!

• Fill space with next inserted item?

Changes sequence!

• Pointer lists? Wastes space in each record!

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

(18)

• Variable length records

– Necessary for multi-table files or records that allow variable length attributes

– Problem: How to know when a record ends?

• Introduce ‘end-of-record’ symbols or store record length at beginning of the record. But what if a record is updated?

• Slotted page structure: header contains number of entries, end of free space and pointers to location/size of entries. Records are

moved to use up space: no fragmentation!

3.2 File Organization

(19)

• Typical database records like in our banking example easily fit into one block

– Name, account number, balance,…

– Usually multiple records per block

• Unspanned record organization fills blocks only with

complete records

• Spanned organization uses pointers to divide records

Datenbanksysteme 2 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 19 EN 4.4

3.2 File Organization

i i+1

record 1 record 2 record 3 record 4 record 5

i i+1

record 1 record 2 record 3 4 record 5

4

p

(20)

• Necessary for large objects: binary large objects (blobs) and character large objects (clobs)

• Large objects are not interpreted in databases

– Text documents, images, audio and video data

– Need to be stored in a contiguous sequence of bytes – If an object is bigger than a block, contiguous pages

of the buffer pool have to be allocated for storage – Sometimes preferable to

disallow direct access, but only allow access through file-system-like API to allow for fragmentation

20

3.2 Non-Standard Blocks

(21)

• Indexes help to locate records in a DB file

– Creation of indexes is part of the physical tuning task of database administrators

– Indexes often influence the actual location of storage for a record

• Example: sequential storage, storage via a hash function

– If the location is determined by the index, not all attributes can

be directly indexed (but secondary access paths may be used)

3.3 Indexing

(22)

• When items have to be found quickly, indexing mechanisms are used to

speed up access

– Alphabetical author catalog in libraries

– Lexicographic ordering – …

3.3 Basic Concepts

(23)

• Ordered by indexing field (search key)

– Attribute(s) used to look up records in a file

• An index file consists of index entries

– Records of the form

• Two basic kinds of queries

– Exact matches

• Locating a record(s) with a certain value

– Range queries

• Locating all records in a certain value range

3.3 Basic Concepts

search key record location(s)

(24)

• Two basic kinds of indexes

– Ordered indexes: search keys are stored in sorted order

• Single-level ordered indexes

• Multi-level ordered indexes

– Hash indexes: search keys are distributed uniformly across blocks using some hash function

• Hash function distributes records uniformly over blocks

• Hashed value of search key decides for storage block

3.3 Basic Concepts

(25)

• Index links indexing fields to logical file/block locations

– Dense indexes index all items, sparse indexes only selected ones

– Much smaller file than the data file

• Hence, a binary search on the index file requires fewer block accesses than a binary search on the data file

– Works exactly like a book‟s index

• Only few pages at beginning/end, alphabetically ordered, references to respective page(s)

3.3 Single-Level Ordered Indexes

(26)

• Primary indexes

– Order data by some usually unique attribute as indexing field (primary key), store database records in this order

– Index record contains pointer to the respective storage place (block address)

• To save entries usually there is only a single index entry

for each block (block anchor)

– But sparse indexes do not directly

show, whether some record is in the DB

3.3 Single-Level Ordered Indexes

AccNo 1 1, Adams, $ 887,00 2, Bertram, $19,99 AccNo 4

AccNo 7 3, Behaim, $ 167,00 4, Cesar, $ 1866,00

5, Miller, $179,99 6, Naders, $ 682,56

7, Ruth, $ 8642,78 8, Smith, $675,99 9, Tarrens, $ 99,00

(27)

• Advantages

– Number of blocks needed for storing the index is small compared to the data

• Not all records are indexed (non-dense)

• Index entries are smaller than data records

– Can often be kept in buffer

• Disadvantages

– Insertions and Deletions need to move data in storage and to update index entries affected by the shifts

3.3 Single-Level Ordered Indexes

(28)

• Clustering indexes store data records in the order of a non-unique indexing field

– One entry in the index per distinct value of the indexing field

– Search keys are linked to the block address of the containing the search key

– Is also a sparse index

• But existence of records with a certain key can be assessed by index look-up

3.3 Single-Level Ordered Indexes

(29)

• Still problems with insertion/deletion

– Often one or more complete blocks are reserved for each single search key to allow for block anchors

– Blocks do not have to be adjacent, but can use block

pointers, if more space is needed

• Structurally very similar to a hash index

3.3 Single-Level Ordered Indexes

Adams 1, Adams, $ 887,00 2, Adams, $19,99 Miller

Tarrans

4, Miller, $ 1866,00 5, Miller, $179,99 6, Miller, $ 682,56 7, Miller, $ 8642,78

8, Tarrens, $ 856,78

(30)

• Secondary indexes point to locations of records regarding a non-ordering attribute

– Indexing does not affect the storage order

– There can be multiple secondary indexes for the same DB file

• Secondary indexes are usually dense

– Objects with same or adjacent values are usually not adjacent on disk

– If the indexing field has unique values (secondary key) definitely all records have to be indexed

3.3 Single-Level Ordered Indexes

(31)

• If the indexing field is not unique there are several possibilities to create a secondary index

– Create a dense index by including duplicate search keys (one for each record)

– Use variable-length index entries, where each search key is assigned a list of pointers

– Keep fixed-length index entries, but point to a block containing (multiple) pointers to the actual records

• Introduces a level of indirection, but allows for a sparse index

• Usually used in practice

3.3 Single-Level Ordered Indexes

(32)

• Dense secondary index or sparse with indirection

3.3 Single-Level Ordered Indexes

Adams

1, Cesar, $ 887,00 2, Tarrens, $19,99 Cesar

Miller

4, Orwell, $ 1866,00 5, Miller, $179,99 6, Tarrens, $ 682,56

7, Post, $ 8642,78 8, Adams, $ 856,78 Miller

3, Miller, $19,99

9, Snyder, $ 856,78 Orwell

Post

Tarrens Tarrens Snyder

Adams

Miller

7, Post, $ 8642,78 8, Adams, $ 856,78

3, Miller, $19,99

Post

Tarrens Snyder

p p

p

p p

(33)

• Characteristics of secondary indexes

– Speeds up retrieval, because if it does not exist, the entire file would have to be scanned linearly

• For non-existent primary indexes files can still be scanned in a binary search fashion

– Uses more search time and space, because it is dense – Secondary indexes provides a logical ordering

• Accessing records in that order might not be the most efficient way regarding block accesses

• Each record access may fetch a new block into the buffer

3.3 Single-Level Ordered Indexes

(34)

• Improving secondary indexes for range queries

– Same block may be (unnecesarrily) accessed multiple times

3.3 Single-Level Ordered Indexes

Adams

Miller

3, Miller, $19,99

Post

• Stupid Example:

Cesar-Orwell

(buffer holds only 1 block)

 Cesar: fetch first block

 Miller: evict first block, fetch second block

 Miller: evict second block, fetch first block

 Orwell: evict first block,

fetch second block

(35)

• Improving secondary indexes for range queries

– Remember: working on blocks in main memory is cheap, fetching blocks from disk is expensive!

3.3 Single-Level Ordered Indexes

Adams 1, Cesar, $ 887,00

2, Tarrens, $19,99 Cesar

Miller

3, Miller, $19,99

Post

• Improved Example: Cesar-Orwell

• Fetch ‘Cesar’ and scan whole block

• Output ‘1’ & ‘3’

• Mark block as completed and evict

• Fetch ‘Miller’ and scan whole block

• Output ‘4’ & ‘5’

• Mark block as completed and evict

• Skip second ‘Miller’

• Skip ‘Orwell’

Done

(36)

• Short Summary

– Primary and cluster indexes affect storage order, secondary indexes don‟t

– Primary and clustering indexes may be sparse, secondary indexes are usually dense

• Primary and clustering indexes can use block anchors

– Index sizes

• Primary index – number of blocks

• Clustering index – number of distinct search keys

• Secondary index – up to number of records in table

3.3 Single-Level Ordered Indexes

(37)

• Example: What if a primary index does not fit in main memory?

– Bad look-up efficiency!

• Simple multi-level solution

– Treat primary index kept on disk as a sequential file and construct a sparse index on it

• outer index – sparse index of primary index

• inner index – primary index file

– If even outer index is too large to fit in main memory, yet another level of index can be created, and so on

3.3 Multi-Level Ordered Indexes

(38)

• Multi-level indexes can further improve search speed

– In single level indexes a binary search can be applied to locate pointers to blocks

• Efficiency of log

₂

N

– Multi-level indexes allow for higher search efficiency

• Fan-out of the index should be higher than 2

• Efficiency of log

_fan-out

N

3.3 Multi-Level Ordered Indexes

(39)

• Concept of Multi-Level Indexes can be applied to primary,

clustering and secondary indexes as long as values are distinct

• Example: 2-Level Primary Index

– Level 1 is Primary Index – Level 2 is Primary Index on

level 1

– Efficiency of log

_fan-out

N

• fan-out: Number of 1^st level blocks per 2^nd level block

3.3 Multi-Level Ordered Indexes

1

1, Adams, $ 887,00 2, Bertram, $19,99 3, Behaim, $ 167,00

4, Cesar, $ 1866,00 5, Miller, $179,99 6, Naders, $ 682,56

7, Ruth, $ 8642,78 8, Smith, $675,99 9, Tarrens, $ 99,00 10, Arnold, $ 3442,76

11, Marks, $435,19 12, Black, $ 319,44 3

5

7 9

11 1

7 11

Level 1 Level 2

(40)

• There is a huge amount of different tree structures for multi-level indexing

– Classical structures B-tree, B*-tree, etc.

– Multidimensional structures R-trees, Quad-trees, etc.

– Lots of literature

3.3 Multi-Level Ordered Indexes

(41)

• Index Structure based on Hashing

– Idealized Access Speed: O(1)

• Basic Idea:

– Fixed-Size directly addressable Index Space [0..M] containing M+1 buckets

• Buckets contain links to data

• Single link in internal hashing (i.e. in memory)

• Multiple links for external hashing (i.e. on harddisk)

– Hash Function h: ℤ → [0..M]

• Maps any number to [0..M]

• Should be uniform (i.e.: same probability for any x ∈ [0..M])

– Especially, should also be surjective (i.e. use the full range of [0..M])

• Needs to be deterministic (i.e.: same input → same output)

• Should be simple and fast

Datenbanksysteme 2 – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 41 EN 13.8.1

3.3 Hash Indexes

(42)

• Basic Idea

– Convert value which is to be indexed to numeric representation – Hash the value

– Store block anchor of value in the bucket with computed hash index

• Example: M=8

3.3 Hash Indexes

1, Cesar, $ 887.00 5, Miller, $179.99 9, Snyder, $ 856.78

Miller Cesar Snyder

h(Miller) = 4 h(Cesar) = 7 H(Snyder) = 1

0 1 2 3 4 5 6 7 8

(43)

• Problem: Collision

– Hash functions are not injective – collision may happen

• Block pointers for full blocks (overflow list)

– Probability of collision increases with load factor

• Deteriorates to linear behavior in worst case

3.3 Hash Indexes

1, Cesar, $ 887.00

5, Miller, $179.99 9, Snyder, $ 856.78 Miller

Cesar Snyder

h(Miller) = 4 h(Cesar) = 1 h(Snyder) = 1

0 1 2 3 4 5 6 7 8

Smith h(Smith) = 1 Rogers h(Rogers) = 1

3, Rogers, $ 1866.58

7, Smith, $ 0.29

(44)

• Problem: Static Address Range

– Addressable range is fixed to [0..M]

– What happens if address range is exhausted?

• E.g., load factor too high, too many collisions occur

– Possible solutions

• Rehashing :

– Create a new larger hashmap and rehash all values – For example, java hashmaps follow that policy

• Extendible Hashing:

– Uses dynamic directory to double and half number of buckets

• Linear Hashing:

– Buckets are split into two one after another and individually rehashed with additional hash-function

– Not in this lecture

3.3 Hash Indexes

(45)

• Extendible Hashing

– Hash values are positive integers between [0..2 ⁿ ]

• Numbers can be represented in binary

• Choose large n

– Create a directory of depth d with 2 ^d entries for the first d bits of values

• d is global depth of directory

• Directory cells link to buckets containing links to data with hash value starting with given bit pattern

– Adjacent cells may link to the same bucket depending on local depth of bucket

3.3 Hash Indexes

(46)

• Extendible Hashing

3.3 Hash Indexes

000 001 010 011 100 101 110 111

Global Depth = 3

Local depth = 1

Local depth = 2

Local depth = 3

All hash values starting with 0

(47)

• Extendible Hashing

– Allows to dynamically splitt and coalesc buckets as database grows and shrinks

• If bucket is full and local depth is lower than global depth:

– More than one cell links to this bucket

– Bucket is split into two with increased local depth and values are distributed accordingly

– Links in cells are adjusted accordingly

• If local depth already equals global depth:

– Global depth is increased by one and thus the size of the directory is doubled

– Each cell is replaced by two cells, both of which contain the same pointer as the original cell

3.3 Hash Indexes

(48)

• May perform in O(1)

– But: Management overhead can be quite large for growing data collections

• Overflow lists have small management overhead, but may decrease access performance to O(n)

• Rehashing is very expensive for each growth stage

• Extendible hashing has a smaller overhead and is especially suitable for external storage hashing

3.3 Hash Indexes

(49)

• Why not simply index every attribute?

– Physical index on primary key, logical indexes on every other attribute

• Results in good read efficiency

• But… whenever a DB file is modified, every index on the file has to be updated

– Updating indices imposes overhead on database modification (insert, delete, update)

• Especially for multi-level indexes all levels may have to be updated

3.3 Indexing

(50)

• One of the tasks of the database administrator

• Black Magic…

– Get DB performance statistics

– Try something that seems sensible – Get new performance statistics

– If it did not improve, reset it to previous configuration

– Try something different until satisfied...

3.4 Physical Tuning

(51)

• Black Magic today needs advanced tools to operate correctly

• For database tuning, you need

– Snapshot Monitors

• Continuously collect cumulative statistics for the DB within a given time span

– Event Monitors

• Hook into certain events and analyze them

3.4 Physical Tuning

(52)

• Snapshot Monitors

– Monitor collects cumulative statistics

– Internal counters set at global or session levels – Collected statistics can be evaluated afterwards

for system tuning on general level

– Data collection is explicitly enabled or disabled

3.4 Diagnostic Database Tools

(53)

• Useful for collecting point-in-time information regarding overall DB/DBMS behavior

– Locking –lists all locks & types held by all applications – Bufferpools – cumulative stats for memory, physical &

logical I/O‟s, synch/asynch

– Sorts – provides detailed statistics regard sort-heap usage, overflows, # active etc.

– Tablespaces – detailed I/O and activity statistics for a given tablespace

– UOW – displays status of application Unit of Work at point-in-time

– Dynamic SQL – shows contents of package cache and related stats at point-in-time

3.4 Diagnostic Database Tools

(54)

• Snapshot Monitors

– Overhead continuously is in the 3% - 5% range

• But no I/O since counters are usually RAM resident, not written to disk

– Quick list of useful metrics to identify particular problems include:

• bufferpool hit ratios

• sort overflows/problems

• effectiveness of prefetch

• need for indexing

• transaction logging issues

• single query/application resource domination, etc.

3.4 Diagnostic Database Tools

(55)

• Snapshot Monitor: IBM DB2 Performance Expert

3.4 Diagnostic Database Tools

http://www.redbooks.ibm.com/abstracts/sg246470.html

(56)

• Event Monitors

– Collects information regarding events or chains of events

• No continuously collected data as with snapshots

– Must be explicitly created and activated via DBMS commands or API‟s

– Are the best way to effectively diagnose and resolve transaction problems (e.g., deadlock issues)

– Output may be directed to a database table and results then analyzed using SQL

3.4 Diagnostic Database Tools

(57)

• Event Monitor can be used to

– Identify and rank most frequently executed or most time consuming SQL

– Track the sequence of statements leading up to a lock timeout or deadlock

– Track connections to the database

– Breakdown overall SQL statement execution time into sub-operations such as sort time and use or system CPU time

3.4 Diagnostic Database Tools

(58)

• Event Monitor: DBI Brother Panther

3.4 Diagnostic Database Tools

http://www.dbisoftware.com

(59)

• Physical DB Tuning is a complicated and intransparent task

• Usually trial and error

– Measure some hopefully meaningful metrics of you DB usage statistics

– Adjust around on something

• Mostly indexes and tablespace properties

– Measure again

• If results better : Great! Continue tuning!

• If result worse : Bad! Undo everything you did and try something else.

• But what to measure and what to do?

– Use best practices or try yourself!

3.4 The Black Art of Physical Tuning

(60)

• What to measure?

• First, what kind of database you have?

– OLTP (Online Transaction Processing) : Database with many small, concurrent read / write queries

• Optimize for fast read write accesses

• Keep an eye on “real-time” response times

• Optimize for concurrency

– Warehouse : Using larger and more complex queries for primary retrieving data

• Optimize for read accesses

3.4 Physical Tuning

(61)

• There is a huge amount of knobs and screws to turn

– DB planning and tuning is for professionals – Special certificates by each database vendor

3.4 Tuning Metrics

http://www.microsoft.com/learning/mcp/mcdba/

http://www-03.ibm.com/certify/certs/dm_index.shtml IBM DB2 IBM Certified Database Associate

IBM Certified Database Administrator IBM Certified Application Developer DB2

IBM Certified Advanced Database Administrator

http://education.oracle.com

(62)

• Scott Hayes

– President & CEO of Database-Brothers Inc (DBI)

• If I only had five minutes to assess the status of a database, I would look at…

– Average Result Set Size – Index Read Efficiency

– Synchronous Read Percentage

– Average Rows Read per DB Transaction per Table – Average Read Time

3.4 Tuning Metrics

(63)

• ARSS - Average Result Set Size

– ARSS = RowsSelected / NumSelectQueries – Rule of Thumb:

• ARSS ≤ 10 indicates a OLTP database

– If you think you have OLTP and ARSS>>10, there is something wrong…

• ARSS > 10 indicates a warehouse database

3.4 Tuning Metrics

(64)

• IREF - Index Read Efficiency

– “How many rows have to read to retrieve one row?”

• i.e. How much overhead is within the where predicates?

– IREF = RowsRead / RowsReturned

• Well-designed indexes may evaluate a where-clause without reading additional rows!

– Rule of Thumb:

• IREF should be less than 10 for OLTP Databases

• May be higher for Warehouses (>100)

– What to do?

• Introduce more efficient indexes!

• User simpler queries!

3.4 Tuning Metrics

(65)

• SRP – Synchronous Read Percentage

– Synchronous Read means the DBMS reads just the index file and the required data page (good)

– Asynchronous Read means the DBMS employs

prefetching to scan for an indexes or data (BAD) – Rule of Thumb:

• >90% for OLTP (>50% for warehouses) : Good

• 80%-90% (25%-50%) : Ok

• 50%-80% (<25%) Bad

– What to do?

• Improve index structures!

3.4 Tuning Metrics

(66)

• TBRRTX – Average Rows Read per DB Transaction per Table

– TBRRTX

_table

= rowsRead / (attemptedCommits + attemptedRollBacks)

– Rule of Thumb:

• <10 for OLTP : Good

• 10-100 : Ok

• >100 : Bad

• >1000 : Horrific

– What to do?

• Improve index structures!

• Improve Queries

3.4 Tuning Metrics

(67)