
To follow a discussion of commit processing, two basic terms must first be understood. We begin this section by defining a transaction and the "moment of commit."


A transaction is the execution of one or more statements that access data managed by a database system. Generally, database management systems guarantee that the effects of a transaction are atomic, that is, either all updates performed within the context of the transaction are recorded in the database, or no updates are reflected in the database.

The point at which a transaction's effects become durable is known as the "moment of commit." This concept is important because it allows database recovery to proceed in a predictable manner after a transaction failure. If a transaction terminates abnormally before it reaches the moment of commit, then it aborts. As a result, the database system performs transaction recovery, which removes all effects of the transaction. However, if the transaction has passed the moment of commit, recovery processing ensures that all changes made by the transaction are permanent.

Transaction Profile

For the purpose of analysis, it is useful to divide a transaction processed by KODA into four phases: the transaction start phase, the data manipulation phase, the logging phase, and the commit processing phase. Figure 1 illustrates the phases of a transaction in time sequence. The first three phases are collectively referred to as "the average transaction's CPU cost (excluding the cost of commit)" and the last phase (commit) as "the cost of writing a group commit buffer."

Vol. 3 No. 1 Winter 1991 Digital Technical Journal

Figure 1 Phases in the Execution of a Transaction

The transaction start phase involves acquiring a transaction identifier and setting up control data structures. This phase usually incurs a fixed overhead.

The data manipulation phase involves executing the actions dictated by an application program. Obviously, the time spent in this phase and the amount of processing required depend on the nature of the application.

At some point a request is made to complete the transaction. Accordingly in KODA, the transaction enters the logging phase, which involves updating the database with the changes and writing the undo/redo information to disk. The amount of work done in the logging phase is usually small and constant (less than one I/O) for transaction processing.

Finally, the transaction enters the commit processing phase. In KODA, this phase involves writing commit information to disk, thereby ensuring that the transaction's effects are recorded in the database and now visible to other users.

For some transactions, the data manipulation phase is very expensive, possibly requiring a large number of I/Os and a great deal of CPU time. For example, if 500 employees in a company were to get a 10 percent salary increase, a transaction would have to fetch and modify every employee/salary record in the company database. The commit processing phase, in this example, represents 0.2 percent of the transaction duration. Thus, for this class of transaction, commit processing is a small fraction of the overall cost. Figure 2 illustrates the profile of a transaction modifying 500 records.

Figure 2 Profile of a Transaction Modifying 500 Records

In contrast, for transaction processing applications such as hotel reservation systems, banking applications, stock market transactions, or the telephone system, the data manipulation phase is usually short (requiring few I/Os). Instead, the logging and commit phases comprise the bulk of the work and must be optimized to allow high transaction throughput. The transaction profile for a transaction modifying one record is shown in Figure 3. Note that the commit processing phase represents 36 percent of the transaction duration in this example.

Figure 3 Profile of a Transaction Modifying One Record

Group Commit

Generally, database systems must force write information to disk in order to commit transactions. In the event of a failure, this operation permits recovery processing to determine which failed transactions were active at the time of their termination and which ones had reached their moment of commit. This information is often in the form of lists of transaction identifiers, called commit lists.
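The role of a commit list during recovery can be illustrated with a minimal sketch. It assumes, purely for illustration, that a commit list is a plain list of integer transaction identifiers and that recovery either redoes or undoes each failed transaction; the function name `classify_for_recovery` is hypothetical and not taken from KODA.

```python
def classify_for_recovery(failed_txns, commit_list):
    """Split failed transactions by whether they reached the moment of commit.

    failed_txns: identifiers of transactions interrupted by the failure.
    commit_list: identifiers found in the commit information on disk.
    Both shapes are assumptions made for this sketch.
    """
    committed = set(commit_list)
    # Past the moment of commit: changes must be made permanent.
    redo = [t for t in failed_txns if t in committed]
    # Not yet committed: all effects must be removed.
    undo = [t for t in failed_txns if t not in committed]
    return redo, undo


redo, undo = classify_for_recovery([101, 102, 103], commit_list=[101, 103])
print("redo:", redo)
print("undo:", undo)
```

Here transactions 101 and 103 appear in the commit list, so their changes are preserved, while 102 is rolled back.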

Many database systems perform an optimized version of commit processing where commit information for a group of transactions is written to disk in one I/O operation, thereby amortizing the cost of the I/O across multiple transactions. So, rather than having each transaction write its own commit list to disk, one transaction writes to disk a commit list containing the commit information for a number of other transactions. This technique is referred to in the literature as "group commit."
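The I/O amortization described above can be sketched with a small in-memory model. This is an illustration under stated assumptions, not KODA's implementation: the class `CommitLog`, its methods, and the use of a Python list to stand in for the disk are all hypothetical.

```python
class CommitLog:
    """Toy commit log: each flush writes one commit list in one simulated I/O."""

    def __init__(self):
        self.pending = []  # transaction ids waiting to be committed
        self.disk = []     # one entry per simulated I/O (a commit list)

    def request_commit(self, txn_id):
        """A transaction reaches the commit phase and joins the pending group."""
        self.pending.append(txn_id)

    def flush(self):
        """Write the commit list for every pending transaction in one I/O."""
        if not self.pending:
            return []
        group, self.pending = self.pending, []
        self.disk.append(list(group))
        return group

    @property
    def io_count(self):
        return len(self.disk)


# Without grouping: every transaction flushes its own commit information.
solo = CommitLog()
for txn in range(5):
    solo.request_commit(txn)
    solo.flush()

# With grouping: one transaction flushes on behalf of the others.
grouped = CommitLog()
for txn in range(5):
    grouped.request_commit(txn)
grouped.flush()

print(solo.io_count, grouped.io_count)
```

In this toy run, five transactions cost five writes when each commits alone and one write when they commit as a group.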

Group commit processing is essential for achieving high throughput. If every transaction that reached the commit stage had to actually perform an I/O to the same disk to flush its own commit information, the throughput of the database system would be limited to the I/O rate of the disk. A magnetic disk is capable of performing 30 I/O operations per second. Consequently, in the absence of group commit, the throughput of the system is limited to 30 transactions per second (TPS). Group commit is essential to breaking this performance barrier.


Transaction Processing, Databases, and Fault-tolerant Systems

There are several variations of the basic algorithms for grouping multiple commit lists into a single I/O. The specific group commit algorithm chosen can significantly influence the throughput and response times of transaction processing. One study reports throughput gains of as much as 25 percent by selecting an optimal group commit algorithm.¹

At high transaction throughput (hundreds of transactions per second), efficient commit processing provides a significant performance advantage.

There is little information in the database literature about the efficiency of different methods of performing a group commit. Therefore, we analyzed several grouping designs and evaluated their performance benefits.

Factors Affecting Group Commit

Before proceeding to a description of the experiments, it is useful to have a better understanding of the factors affecting the behavior of the group commit mechanism. This section discusses the group size, the use of timers to stall transactions, and the relationship between these two factors.

Group Size An important factor affecting group commit is the number of transactions that participate in the group commit. There must be several [...] the group consists of 2 transactions, each of them does one-half a write. If the group size increases [...] disk, the maximum transaction commit rate is I x G TPS. For example, if the group size is 45 and the rate [...] (on the order of tens of milliseconds) during commit processing. During the stall, more transactions enter the commit processing phase and so the group size becomes larger. The stalls provided by the timers have the advantage of increasing the group size, and the disadvantage of increasing the response time.
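The arithmetic behind group size can be worked through in a few lines. The sketch below assumes that G is the group size and that the other factor in the commit-rate product is the disk's I/O rate per second, taken here as the 30 I/Os per second quoted earlier for a magnetic disk; the function names are hypothetical.

```python
from fractions import Fraction


def writes_per_transaction(group_size):
    """Share of one commit write borne by each transaction in the group."""
    return Fraction(1, group_size)


def max_commit_rate(ios_per_second, group_size):
    """Upper bound on commits per second when each I/O commits a whole group."""
    return ios_per_second * group_size


print(writes_per_transaction(2))   # a group of 2: one-half a write each
print(max_commit_rate(30, 1))      # no grouping: bounded by the disk I/O rate
print(max_commit_rate(30, 45))     # group of 45 at the assumed 30 I/Os per second
```

Under these assumptions, a group size of 1 reproduces the 30 TPS ceiling, and a group size of 45 raises the bound to 1350 TPS.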

Trade-offs This section discusses the trade-offs between the size of the group and the use of timers to stall transactions. Consider a system where there are 50 active database programs, each repeatedly processing transactions against a database. Assume that on average each transaction takes between 0.4 and 0.5 seconds. Thus, at peak performance, the database system can commit approximately 100 transactions every second, each program actually completing two transactions in the one-second time interval. Also, assume that the transactions arrive at the commit point in a steady stream at different times. [...] should be able to approach its peak throughput of 100 TPS. However, if the mechanism delays commit processing for one second, an entirely different behavior sequence occurs. Since the transactions complete in approximately 0.5 seconds, they accumulate at the commit stall and are forced to wait until the one-second stall completes. The group size then consists of 50 transactions, thereby maximizing the I/O amortization. However, throughput is also limited to 50 TPS, since a group commit is occurring only once per second.

Thus, it is necessary to balance response time and the size of the commit group. The longer the stall, the larger the group size; the larger the group size, the better the I/O amortization that is achieved. However, if the stall time is too long, it is possible to limit transaction throughput because of wasted CPU cycles.
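The 50-program example can be reproduced with a deliberately crude model. It assumes, as a simplification not made in the text, that group commits fire on a fixed period equal to the stall time, that each program starts its next transaction immediately after a commit, and that disk speed is not a constraint. The function name and the millisecond parameters are hypothetical.

```python
def stalled_throughput(programs, txn_ms, stall_ms):
    """Return (transactions per second, transactions per group commit).

    Each program runs a transaction for txn_ms, then waits for the next
    group commit, which is assumed to fire every stall_ms.
    """
    # Ceiling division: whole commit periods consumed per transaction.
    periods = -(-txn_ms // stall_ms)
    cycle_ms = periods * stall_ms
    tps = programs * 1000 / cycle_ms
    group_size = tps * stall_ms / 1000
    return tps, group_size


for stall_ms in (10, 1000, 2000):
    tps, group = stalled_throughput(programs=50, txn_ms=500, stall_ms=stall_ms)
    print(f"stall={stall_ms} ms  throughput={tps:.0f} TPS  group size={group:.0f}")
```

With a 10-millisecond stall the model stays at 100 TPS with very small groups; with a one-second stall the group grows to 50 transactions but throughput falls to 50 TPS, matching the example. Lengthening the stall further only lowers throughput, since the group cannot grow beyond the number of programs.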


Motivation for Our Work

The concept of using commit timers is discussed in great detail by Reuter. However, there are significant differences between his group commit scheme and our scheme. These differences prompted the work we present in this paper.

In Reuter's scheme, the timer expiration triggers the group commit for everyone. In our scheme, no single process is in charge of commit processing based on a timer. Our commit processing is performed by one of the processes desiring to write a commit record. Our designs involve coordination between the processes in order to elect the group committer (a process).

Reuter's analysis to determine the optimum value of the timer based on system load assumes that the total transaction duration, the time taken for commit processing, and the time taken for performing the other phases are the same for all transactions. In contrast, we do not make that assumption. Our designs strive to adapt to the execution of many different transaction types under different system loads. Because of the complexity introduced by allowing variations in transaction classes, we do not attempt to calculate the optimal timer values as does Reuter.