deposit_hagen
Publikationsserver der Universitätsbibliothek
Mathematik und
Informatik
Informatik-Berichte 07 – 08/1980
Optimistic Methods for Concurrency
Control in Distributed Database Systems
G. Schlageter
University of Hagen, Postfach 940
5800 Hagen, West Germany
Abstract
Recently, methods for concurrency control have been proposed which were called "optimistic". These methods do not detect access conflicts when they occur; instead, a transaction always proceeds, and at its end a check is performed whether a conflict has happened. If so, the transaction is backed up.
This basic approach is investigated in two directions:
First, a method is developed which frees read transactions from any consideration of concurrency control; all responsibility for correct synchronization is assigned to the update transactions. This method has the great advantage that, in case of conflicts between read transactions and update transactions, no backup is performed.
Then, the application of optimistic solutions in distributed database systems is discussed, and a solution is presented.
1. INTRODUCTION
We consider the problem of coordinating transactions which concurrently access a database. The database is assumed to consist of objects; for this paper it is not of interest what objects are. As usual, we assume that a transaction transforms a database from a consistent state into a consistent state, if run alone. If transactions are allowed to access a database concurrently, a concurrency control has to synchronize the transactions such that serializability is guaranteed. A system of concurrent transactions is said to be serializable if there exists at least one serial execution of the transactions which produces the same results for each transaction and the same final state of the database.
Much work has been done in the field of concurrency control in database systems. Until recently all approaches have been based on locking, i.e. a transaction acquires a lock for an object before accessing it, and concurrent transactions cannot acquire a non-compatible lock until the lock is released. If a lock cannot be acquired, the requesting transaction is made to wait. Acquiring and releasing of locks must be done according to certain protocols in order to guarantee serializability /EGL,GRA,SCH/.
Other approaches have been proposed in the context of distributed database systems: timestamping may be used as a basic means of concurrency control /THO,BSR/.
A third type of concurrency control has been introduced recently by Kung and Robinson /KUR/. They call their approach optimistic, because they assume that conflicts between transactions are rare, such that methods can be applied which allow conflicts to happen without immediate notice; if, at a later point, a possible conflict is detected, a transaction is backed up. So the optimistic approaches rely on backup as a central means of controlling concurrent transactions.
In this paper optimistic approaches are discussed in two directions:
(1) A method is developed which allows read transactions to proceed without any consideration of concurrency control; all responsibility for correct synchronization is assigned to the update transactions. This scheme is especially suitable for query dominant systems.
The essential feature of this method is that, in case of conflict between read and update transactions, backup is not required.
(2) The application of optimistic solutions in distributed database systems is investigated.
A solution is presented which treats global read transactions differently from global update transactions.
2. OPTIMISTIC SOLUTIONS
Kung and Robinson /KUR/ discuss methods for concurrency control which they call optimistic. They assume that conflicts among concurrent transactions are sufficiently rare such that concurrency control need not be based on locking, but can be based on backup as the main control mechanism. That is, a transaction is always executed using local copies of the required database objects, but the transaction has to be "validated" before the updates are made visible in the database. In the validation phase it is determined whether a conflict has possibly happened. If validation fails the transaction is backed up and started again as a new transaction.
We can summarize the basic idea of the proposal of /KUR/
as follows:
Each update transaction is divided into three phases:
the read phase, the validation phase, and the write phase.
In the read phase the intended work of the transaction is done; all updates are made on local copies of the database objects. In the write phase the local database objects are made global. In the validation phase a test for serializability of the developing parallel transaction system is performed in the following way:
Each transaction is assigned a unique transaction number after positive validation, before the write phase. Let tstart be the highest transaction number at the start of transaction T, and let tfinish be the highest transaction number at the beginning of the validation phase of T. Then, in the validation phase the following check is performed:
valid := true;
for t from tstart + 1 to tfinish do
    if (write set of transaction t intersects read set of T)
        then valid := false
Validation and subsequent write are one critical section (in the usual sense in the context of synchronization).
In the validation phase all transactions t which had their write phase after the beginning of T and before the validation phase of T are considered. For these transactions we check whether the set of objects written by t intersects with the set of objects read by T. If it does, we possibly have a conflict which may destroy serializability, such that T has to be backed up and restarted.
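The check can be illustrated with a minimal Python sketch. The function name `validate` and the dictionary `write_sets`, which maps each transaction number to the set of objects written by that transaction, are assumptions made for this illustration; they do not appear in /KUR/. The critical section around validation and write is not modelled.

```python
def validate(read_set, write_sets, tstart, tfinish):
    """Return True if no transaction numbered tstart+1..tfinish wrote
    an object that T has read (hypothetical helper)."""
    for t in range(tstart + 1, tfinish + 1):
        # A non-empty intersection signals a possible conflict.
        if write_sets.get(t, set()) & read_set:
            return False
    return True


# Transactions 4 and 5 completed their write phase while T was reading.
write_sets = {4: {"a"}, 5: {"c", "d"}}
print(validate({"a", "b"}, write_sets, tstart=3, tfinish=5))  # False: 4 wrote "a"
print(validate({"b"}, write_sets, tstart=3, tfinish=5))       # True
```

A failed validation corresponds to backing up T and restarting it as a new transaction.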
For more details and for various implementation refinements of this basic idea see /KUR/.
The above check is rather rough, because intersection automatically enforces backup; in fact, the sequence of operations in the intersection might have been such that the resulting schedule still is correct. To check for this possibility, however, would be much more expensive, and would no longer be an "optimistic" type of solution.
Read transactions of course do not have a write phase, but they also have to be validated. Again, the write sets of all transactions with numbers from tstart + 1 to tfinish have to be examined to detect intersections with the read set of T. The main difference is that in this case validation need not be done in a critical section.
Kung and Robinson argue that their approach is especially suitable for systems where transaction conflict is highly unlikely. Query dominant systems are an example. We think that especially in this environment a different but still "optimistic" approach would be even more appropriate: to free read-only transactions from all burden of concurrency control, and to assign all responsibility for it to the update transactions. This approach is outlined in the following.
3. NEW APPROACH
As was indicated in /KUR/, optimistic solutions are certainly valuable in query dominant systems. Nevertheless /KUR/ treat read transactions in the same way as update transactions: though reads are completely unrestricted, returning the results from a query is considered to be equivalent to a write and is therefore subject to validation, with the risk of backup for read transactions as well. In a query dominant system it would certainly be desirable to treat read transactions in a different way from write transactions. Our idea is to let read transactions always proceed and terminate without any consideration of concurrency control. The idea is as follows:
Read transaction: a read transaction is executed without any consideration of concurrency control, there is no validation phase. The read set, indicating all objects read, is maintained.
We assume that a read transaction reads the same object only once.
Update transaction: an update transaction consists again of a read phase, a validation phase, and a write phase.
Validation not only considers other update transactions, as in /KUR/, but also parallel read transactions. Only a conflict with parallel update transactions may result in backup; a conflict with read transactions merely defers the write phase.
The validation phase is as follows:
T is the current update transaction. At time t, the read set of a read transaction indicates all objects read up to time t.
L: waitset := ∅;
   < for all t ∈ {active read transactions} do
         if (write set of T intersects with read set of t)
             then waitset := waitset ∪ {t};
     if waitset ≠ ∅ then begin
         wait (waitset);
         goto L end;
     validation as to update transactions;
     write phase >
< > indicates the critical section. As usual, wait implies that the critical section is left. Wait (waitset) means that T waits until all t ∈ waitset are terminated.
If T detects a read transaction, say TR, which accessed objects to be modified by T, T has to wait until TR reaches its end. This guarantees that TR cannot see some objects in their state before modification by T and others in their new state. Of course, while T waits for some TR to terminate, other read transactions may proceed so that the test for a possible conflict has to be repeated.
Obviously, we have to cope with the risk of indefinite delay of an update transaction. As this risk is practically very low in the considered environment, any brute-force solution will be sufficient, e.g. counting the number of cycles of T in the validation phase, and, after this number exceeds some value, applying some global lockout mechanism. Similar approaches are appropriate if T has to be backed up repeatedly because of concurrent update transactions.
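The part of the validation phase that concerns read transactions, together with the cycle counter mentioned above, might be sketched as follows. This is a single-threaded illustration only: the critical section is not modelled, and the names `conflicting_readers`, `validate_against_readers`, `wait_for` and `max_cycles` are hypothetical.

```python
def conflicting_readers(write_set, active_readers):
    """Return the readers whose current read set intersects write_set.

    active_readers maps a reader id to the set of objects read so far.
    """
    return {r for r, read_set in active_readers.items() if read_set & write_set}


def validate_against_readers(write_set, active_readers, wait_for, max_cycles=3):
    """Repeat the check until no active reader conflicts.

    wait_for(waitset) is a hypothetical callback that blocks until all
    readers in waitset have terminated and removes them from
    active_readers. Returns the number of wait cycles, or None when
    max_cycles is reached and a fallback (e.g. a global lockout) is needed.
    """
    cycles = 0
    while True:
        waitset = conflicting_readers(write_set, active_readers)
        if not waitset:
            return cycles          # proceed to validation against updates
        if cycles == max_cycles:
            return None
        wait_for(waitset)
        cycles += 1


readers = {"TR1": {"x"}, "TR2": {"y"}, "TR3": {"z"}}

def wait_for(waitset):
    # Simulation: the conflicting readers simply terminate.
    for r in waitset:
        del readers[r]

print(validate_against_readers({"x", "y"}, readers, wait_for))  # 1
print(sorted(readers))                                          # ['TR3']
```

In the simulated run T waits once for TR1 and TR2; TR3 is never affected because it has read none of the objects T writes.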
Proof of correct synchronization:
We only have to show that a read transaction TR always gets a consistent view of the database, i.e., the set of objects read by TR reflects either all updates of T or none of them. Remember that validation and write phase are a critical section.
Let $R_t$ be the set of objects read by TR up to time $t$, $R$ the read set of TR at the end of TR, and $U$ the write set of the update transaction T.
Consider the beginning of the validation phase of T, say at time $t$. We have to distinguish between two cases:
(1) $R_t \cap U = \emptyset$:
TR either gets no objects modified by T, i.e. $R \cap U = \emptyset$; or, if finally $R \cap U \neq \emptyset$, $R$ is consistent because all objects in $R \cap U$ have been updated by T at time $t$ before $R_t$ could be extended.
(2) $R_t \cap U \neq \emptyset$:
In this case T is made to wait for TR to terminate; thus, TR cannot see any object modified by T.
•
To see the main difference between /KUR/ and the approach proposed here, consider the following example, figure 1.
Figure 1: Update transaction parallel to some read transactions.
An update transaction TU runs in parallel to several read transactions TR1 to TR3 in such a way that TU reaches its end while the readers are active. In this case we have:
/KUR/: TU does not consider the read transactions; it gets positive validation and writes its objects to the database. Each read transaction, after reaching its end, has to validate its results with respect to TU, and is backed up if validation fails. In the example TR1, TR2 and TR3 must be backed up in the worst case.
New approach: TU has to examine the current read sets of each read transaction; if a conflict is detected with TR3, say, the write phase of TU is delayed until the end of TR3. No backup can occur.
With respect to TR4 there is no activity concerning concurrency control at all.
As compared to locking, the main differences are:
- read transactions are not concerned with concurrency control; in a system which uses locking, read locks would have to be set and released.
- deadlock cannot occur.
4. OPTIMISTIC APPROACHES IN A DISTRIBUTED DATABASE SYSTEM
In this chapter we want to discuss optimistic solutions in the environment of a distributed database system. Here the problem is to correctly synchronize global transactions, i.e. transactions which access data on several sites of the distributed database. We mainly aim at non-redundant distributions of data, because it seems that redundant copies of data in a distributed database should be treated in a special way /THO, BSR, ELL/.
A global transaction is executed as a set of local subtransactions. These subtransactions are coordinated by a primary subtransaction. The two-phase commit
protocol is a well known means of synchronizing a set of cooperating subtransactions /GRA/; essentially, the following is done:
A subtransaction acquires its resources and does its work independently of the other subtransactions. When it has done its job and is ready to terminate, it enters a state which allows it to undo or redo all actions performed so far on the database. Then it sends a message to the primary subtransaction. After receiving such messages from all subtransactions, the primary subtransaction sends 'commit' to all subtransactions.
Otherwise, all subtransactions get a message 'backup'.
Thus, we have the following structure of a subtransaction:
(1) subtransaction ready to commit
(2) message to primary subtransaction
(3) wait for 'commit' message (or 'backup' message)
(4) commit (or backup)
The two-phase commit protocol guarantees serializability of global transactions as well as atomicity (backup or redo of global transactions as a whole).
Consider the same organization of global transactions in the presence of optimistic synchronization on the sites. For the same reason for which the release of objects has to be coordinated in the network, we have to coordinate validation and write if we want to use optimistic approaches. A direct application of two-phase commit would be as follows:
(1) validation
(2) message to primary subtransaction: 'ready' or 'backup'
(3) wait for 'ok' or 'backup'
(4) write or backup
Obviously, there is a serious obstacle to applying the two-phase commit protocol in this way: validation and write have to be done in one critical section, so the wait in step (3) would fall inside that critical section. Because of the unpredictable delays of inter-computer communication this is clearly unacceptable.
We therefore propose the following solution:
PROTOCOL FOR GLOBAL UPDATE TRANSACTIONS:
(1) subtransaction ends read phase
(2) validation; if successful: tentative write
(3) message to primary subtransaction ('backup' or 'ready')
in case of positive validation:
(4) wait for 'ok' or 'backup'
(5) change tentative write to write, or back up.
Tentative write writes the objects to a safe place so that a final write (objects are made global) as well as a backup is possible with certainty. Alternatively, tentative write might actually perform the write, and guarantee that backup is possible. The wait for the response is not within a critical section. The critical section is substituted by another mechanism. We only want to mention that locking could be used, before presenting the final solution.
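The update protocol can be illustrated by a small single-process simulation. It is only a sketch under simplifying assumptions: message passing is replaced by direct method calls, the outcome of local validation is supplied as a flag, and the class `Subtransaction` and the function `run_global_update` are hypothetical names. The tentative write is modelled as a copy kept in a "safe place" outside the site's database.

```python
class Subtransaction:
    """One local subtransaction of a global update transaction (sketch)."""

    def __init__(self, site, valid, updates):
        self.site = site          # dict standing in for the local database
        self.valid = valid        # outcome of local validation (assumed given)
        self.updates = updates    # local copies produced in the read phase
        self.tentative = None

    def prepare(self):
        # Steps (1)-(3): validate, write tentatively, report to the primary.
        if not self.valid:
            return "backup"
        self.tentative = dict(self.updates)   # the "safe place"
        return "ready"

    def finish(self, decision):
        # Step (5): make the tentative write global, or discard it.
        if decision == "ok" and self.tentative is not None:
            self.site.update(self.tentative)
        self.tentative = None


def run_global_update(subtransactions):
    """Primary subtransaction: 'ok' only if every subtransaction is 'ready'."""
    votes = [s.prepare() for s in subtransactions]
    decision = "ok" if all(v == "ready" for v in votes) else "backup"
    for s in subtransactions:
        s.finish(decision)
    return decision


site_a, site_b = {"x": 1}, {"y": 1}
print(run_global_update([Subtransaction(site_a, True, {"x": 2}),
                         Subtransaction(site_b, True, {"y": 2})]))   # ok
print(run_global_update([Subtransaction(site_a, True, {"x": 3}),
                         Subtransaction(site_b, False, {"y": 3})]))  # backup
print(site_a, site_b)  # {'x': 2} {'y': 2}
```

In the second run the subtransaction on site A has already written tentatively when the 'backup' decision arrives, and its tentative write is discarded without touching the database.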
Locking of the objects in the write set after positive validation.
This would result in a mixture of optimistic approaches and locking, though locking would only be applied in the case of global transactions. Other transactions cannot access objects locked because of tentative write. One must be very careful with this approach. For instance, one may allow transactions to wait for objects locked in this way; there are no problems with local transactions, but there is the risk of deadlock for global transactions, as can be seen by the example of figure 2: While P1 waits for 'ok', P2 may run into an object locked by Q2. Symmetrically, Q1 may run into an object locked by P1. P2 and Q2 will never be able to reach their final write because their partner subtransaction is blocked, which is a typical deadlock.
Figure 2: Deadlock in case of locking tentatively written objects
Though correct solutions can be devised which use locking, a more consistent solution is as follows:
DISTRIBUTED OPTIMISTIC SOLUTION:
The validation is extended in the following way:
Consider the points of time of tentative write $t_{tw}$ and of write $t_w$ of transaction T. Then any transaction must examine T for a possible conflict, for which
$t_{\text{begin read phase}} < t_w(T)$ and $t_{\text{end read phase}} > t_{tw}(T)$
This guarantees that no transaction can modify data which are tentatively written but not yet finally written, and that no transaction can come to completion which has read "unsafe" (i.e. tentatively written) objects.
If tentative write is done as a real write, one might argue that a transaction S for which $t_{\text{begin read}} > t_{tw}(T)$ should not be backed up automatically. In case of successful termination of T, S would work correctly.
This, however, would introduce the risk of backup propagation.
REALIZATION:
Consider transaction T. Let $\{t_{tentative}\}$ be the set of transaction numbers of those transactions which have tentatively written but not yet finally written at the start of T.
Validation phase:
$M := \{t_{start} + 1, \ldots, t_{finish}\} \cup \{t_{tentative}\}$
Each transaction in M has to be examined for possible conflict, as described in chapter 2.
Transaction numbers are assigned at tentative write, such that tentative write is seen as a normal write as far as validation is concerned.
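The extended validation set can be sketched in Python as follows. The function names and the dictionary `write_sets` (transaction number to write set) are assumptions for illustration; the conflict test itself is the read set / write set intersection of chapter 2.

```python
def transactions_to_examine(tstart, tfinish, tentative):
    """Build M = {tstart+1, ..., tfinish} united with the tentative set."""
    return set(range(tstart + 1, tfinish + 1)) | set(tentative)


def validate_distributed(read_set, write_sets, tstart, tfinish, tentative):
    """Examine every transaction in M for an intersection with read_set."""
    m = transactions_to_examine(tstart, tfinish, tentative)
    return all(not (write_sets.get(t, set()) & read_set) for t in m)


write_sets = {2: {"a"}, 6: {"b"}}
# Transaction 2 had written tentatively, but not finally, when T started.
print(sorted(transactions_to_examine(5, 7, {2})))            # [2, 6, 7]
print(validate_distributed({"a"}, write_sets, 5, 7, {2}))    # False
print(validate_distributed({"a"}, write_sets, 5, 7, set()))  # True
```

The second and third calls differ only in the tentative set: T fails validation exactly because it has read an object that transaction 2 had written tentatively.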
READ TRANSACTIONS
Of course, the subtransactions of a global read transaction TR have to be synchronized, too, if TR is to get a consistent view of the database. We would like to extend the approach of chapter 3 directly to the distributed environment. This approach is characterized by the fact that
(1) read transactions do no validation
(2) read transactions always proceed
(3) update transactions consider read transactions in the validation phase, and are possibly delayed by read transactions.
Unfortunately, the following assertion holds:
Assertion:
A mechanism with the properties (1) to (3) cannot guarantee the consistency of the result of global read transactions.
•
To verify this assertion, one has to note that serializability requires that subtransactions of two global transactions TR and TU, say, have to be executed in the same sequence on all sites. This, however, cannot be guaranteed by a mechanism which can only delay subtransactions of TU and cannot back up either TR or TU. Consider the following example:
Figure 3: TR gets inconsistent view of database (sites A and B; TU delayed because of TR)
TR gets new data on A but old data on B. This can only be prevented if TR, rather than the write of TU, is delayed on B, or if TR is restarted. A similar situation can be constructed if tentative write is a real write to the database.
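One interleaving that produces this situation can be replayed in a few lines of Python. The concrete order of events and the object names `x` and `y` are assumptions chosen to match the description above; the two dictionaries stand in for the databases on sites A and B.

```python
site_a = {"x": "old"}
site_b = {"y": "old"}
view = {}                    # what the global read transaction TR sees

view["y"] = site_b["y"]      # TR reads y on site B first
# TU validates on A: TR has read nothing on A yet, so TU writes at once.
site_a["x"] = "new"
# TU validates on B: TR has read y, so the write of y is delayed.
view["x"] = site_a["x"]      # TR now reads x on site A
site_b["y"] = "new"          # TR has terminated on B; TU finally writes
print(view)  # {'y': 'old', 'x': 'new'}
```

Delaying TU on B does not help, because nothing prevents TR from reading on A after TU has already written there.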
A consequence of the above assertion is that in the distributed case read transactions have to do validation against global update transactions. Local update transactions may behave as described in chapter 3, such that validation by read transactions is restricted to a small percentage of all read transactions (subtransactions).
Remember that global transactions are assumed to be a very small portion of all transactions in any distributed database system.
Remark: Local read transactions also have to do validation against global update transactions. The reason is that the interval tentative write to write is not a critical section.
We suggest the following structure of global read transactions:
subtransaction:
(1) start read phase, send 'started' to primary subtransaction
(2) stay in read phase at least until 'validation allowed' message arrives
(3) after read phase performed do validation against global update transactions
(4) positive validation: send 'ok' to primary subtransaction.
(5) subtransaction terminates.
primary subtransaction:
- after receiving 'started' from all subtransactions it sends 'validation allowed' to all subtransactions
- after receiving 'ok' from all subtransactions, the read transaction is successfully terminated; otherwise the transaction has to be restarted.
•
The 'started' message may be combined with the message acknowledging the receipt of the request; the 'ok' message should be combined with the transmission of the requested objects read from the database.
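The message flow of the read protocol can be sketched as a sequential simulation. This is an illustration under assumptions: real subtransactions run concurrently on different sites, the write sets each subtransaction has to examine are supplied directly, and `validate_read` and `run_global_read` are hypothetical names.

```python
def validate_read(read_set, global_update_write_sets):
    """Local validation of a read subtransaction against the write sets of
    the global update transactions it has to examine (assumed given)."""
    return all(not (ws & read_set) for ws in global_update_write_sets)


def run_global_read(subtransactions):
    """Primary subtransaction of a global read transaction.

    Each subtransaction is a dict with keys 'read_set' and 'to_examine'.
    Returns the message log and the final outcome.
    """
    log = []
    # (1) every subtransaction starts its read phase and reports 'started'
    for i, _ in enumerate(subtransactions):
        log.append((i, "started"))
    # the primary answers only after all 'started' messages have arrived
    log.append(("primary", "validation allowed"))
    # (3)-(4) each subtransaction validates and reports 'ok' on success
    oks = []
    for i, sub in enumerate(subtransactions):
        ok = validate_read(sub["read_set"], sub["to_examine"])
        if ok:
            log.append((i, "ok"))
        oks.append(ok)
    return log, ("terminated" if all(oks) else "restart")


subs = [
    {"read_set": {"x"}, "to_examine": [{"x", "y"}]},  # site A: conflict on x
    {"read_set": {"y"}, "to_examine": []},            # site B: nothing to examine
]
log, outcome = run_global_read(subs)
print(outcome)  # restart
```

Because the subtransaction on site A fails validation, the primary never collects 'ok' from all subtransactions and the read transaction is restarted as a whole.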
This protocol guarantees that global read transactions always get a consistent view of the database.
Proof:
Note that:
(1) The global update transaction protocol assures that there is a point of time $t_U$ for each global update transaction at which all subtransactions are in the state "tentatively written".
(2) The global read transaction protocol assures that there is a point of time $t_R$ for each global read transaction at which all subtransactions are in the read phase.
Since write is atomic, inconsistency can only occur if a read transaction TR gets new data from an update transaction TU on a site A, and old data with respect to TU on a site B.
This implies that TR starts on A after TU writes and terminates on A, and that TR terminates on B before TU writes tentatively. It follows from (1) and (2) that this is not possible:
Consider point $t_R$ of TR. We would have termination of TU before $t_R$ on A, but the tentative write of TU after $t_R$ on B. This contradicts (1). Hence, at time $t_R$ transaction TU is in the state "tentatively written" either on A or on B, and TR detects this during validation, and is backed up.
•
Remark: The protocol for read transactions differs in an essential point from the protocol for update transactions: since coordination as to the validation is done in parallel to the read phases, the read phases are not prolonged artificially by the concurrency control (as would be the case if read transactions started coordination after the end of the read phase, just as update transactions do).
If the communication system is reasonably fast, the message 'validation allowed' may very well arrive before or near the end of the real read phase of the subtransactions.
SUMMARY OF PROTOCOLS
local update transaction:
validation against local and global update transactions (may result in backup);
validation against read transactions as described in chapter 3 (may result in a delay of write)
local read transaction:
validation against global update transactions (may result in backup)
global update transaction:
validation against local and global update transactions, coordination with respect to write.
global read transaction:
validation against global update transactions,
coordination of validation (after all subtransactions are in read phase).
5. CONCLUSION
The paper has discussed optimistic methods for concurrency control in two directions:
First, a method has been presented which is especially suitable for query dominant environments, where the majority of transactions are pure read transactions.
Read transactions are not concerned at all with concurrency control in this proposal; concurrency control is the job of the update transactions. Conflicts between read and update transactions are resolved by delaying update transactions.
Then, optimistic solutions have been discussed in the context of distributed database systems. Protocols for global update and read transactions have been developed. Global read transactions behave differently from global write transactions; the coordination of the subtransactions is very fast since it is done in parallel to the read phase of the subtransactions.
Local and global read transactions have to do validation against global update transactions in a distributed system, but not against local update transactions. Since local transactions are always assumed to be the majority of transactions in a distributed database system, this certainly is a satisfying approach.
I wish to thank P. Dadam for his comments on a first draft of this paper.
References
/BSR/ Bernstein, P.A., Shipman, D.W., Rothnie, J.B.: Concurrency control in a system for distributed databases (SDD-1). ACM Trans. Database Syst. 5, 1 (March 1980), 18-51
/EGL/ Eswaran, K.P., Gray, J.N., Lorie, R.A. and Traiger, I.L.: The notions of consistency and predicate locks in a database system. Commun. ACM 19, 11 (Nov. 1976), 624-633
/ELL/ Ellis, C.A.: A robust algorithm for updating duplicate databases. Proc. 2nd Berkeley Conf. on Distributed Data Management and Networks, May 1977
/GRA/ Gray, J.N.: Notes on database operating systems. In: "Operating systems: an advanced course". Springer-Verlag, Berlin, 1978, pp. 394-481
/KUR/ Kung, H.T., Robinson, J.T.: On optimistic methods for concurrency control. Proc. VLDB Conf., Rio de Janeiro, Oct. 1979
/SCH/ Schlageter, G.: Process synchronization in database systems. ACM Trans. Database Syst. 3, 3 (Sept. 1978), 248-271
/THO/ Thomas, R.H.: A majority consensus approach to concurrency control. ACM Trans. Database Syst. 4, 2 (June 1979), 180-209