
Scheduling TSS/360 for responsiveness

In the document FALL JOINT (pages 108-124)

by WALTER J. DOHERTY

IBM T. J. Watson Research Center, Yorktown Heights, New York

INTRODUCTION

The performance of Release 4 of TSS/360 at the T. J. Watson Research Center was dramatically improved in the three-month period from November, 1969, through January, 1970. The improvements consist of an increase in system responsiveness by a substantial factor together with an increase in throughput. This was achieved by methodically adjusting the parameters of the TSS/360 Table-Driven Scheduler in accordance with the Principles of Balanced Core Time and Working Set Size.

The purpose of this paper is to set forth principles and methodology used to achieve the above initial results. The available evidence of improvement will be exhibited so that each reader can judge for himself the validity of the results.

CONCEPTS AND PRINCIPLES

Performance

Performance is a highly subjective term having a broad spectrum of connotation to different classes of people. Fundamentally, performance is the degree to which a computing system meets the expectations of the person involved with it. The terms responsiveness, throughput, turn-around time, availability, reliability, number of terminals supported, CPU utilization, channel and device utilization, channel balance, and efficiency are but a few of the concepts that are usually included as aspects of performance.

Responsiveness

To a user of TSS/360, sitting at a terminal, the ability of the system to respond to his commands is his predominant view of performance.1 He does not care if only one other person is using the system simultaneously with him or one hundred people. If he expects that TSS/360 will respond to his EDIT request in two seconds and it takes four seconds, he is usually far more irritated than if he expects a response of ten minutes to some partial differential equation and it takes thirty minutes. The system should be substantially more responsive to those requests to which the user expects an immediate reply than to those during which he turns his attention elsewhere. This is the primary assumption I made when I set out to improve the performance of TSS/360.

On the other hand, if a person expects that his request will take a while, say ten minutes, he usually turns his attention to other activities, or else he executes it in the background. Since his attention is not concentrated on the response, he doesn't feel large delays nearly as intensely. In the days of batch computing, turn-around times in the range of one to two hours were frequently not distinguished by users who only turned their attention to it every two-and-a-half hours.

Throughput

To a system manager, the number of terminals he can support with TSS/360 is most important. Of course it is also important to consider the categories of work that the users are doing. Thus it is not unreasonable to speak of ranges from two to one hundred simultaneous users when qualified by the work categories.

An intuitively obvious but rarely mentioned concept is that, for some categories of trivial work, as responsiveness improves, the number of terminals in use may increase only after a threshold of human performance

is reached. That is, if the system is responding at a rate slower than a person's response time, any initial improvements in system response will first result in the individual users getting more work done; only then will the system be able to handle more users at that level of responsiveness. This is a most important consideration.

98 Fall Joint Computer Conference, 1970

Allowing variable delays in processing longer-running programs to build up as the load increases ensures that the very fast ones can constantly provide their users with a fast response. This delay for long-running programs is analogous to the concept of turn-around time in batch, but is on the order of a few seconds instead of a few hours.

Folded forms of programs2

"By the unfolded form of a program we mean the form a program would take if it had available to it a large enough uniform memory to hold both itself and its data .... On the folded forms the addresses have been rearranged (folded) to fit into the smaller address space actually available."2 In the TSS/360, unfolded forms of programs and data exist in virtual memory. When a program is executed, it is folded into as small a real memory space as possible without causing undue inefficiencies (called thrashing) due to an unnatural folding. A high degree of folding is important since it then permits many programs to be folded into main memory simultaneously, thereby providing a potentially significant increase in the level of multiprogramming. The relocation hardware on the Model 67 makes automatic folding possible.

Program locality of reference3,4,5

"Program performance on any paging system is directly related to its page demand characteristics.

A program which behaves poorly accomplishes little on the CPU before making a reference to a page of its virtual address space that is not in real core and thus spends a good deal of time in page wait. A program which behaves well references storage in a more acceptable fashion, utilizing the CPU more effectively before referencing a page which must be brought in from back-up store. This characteristic of storage referencing is often referred to as a program's 'locality of reference.'"4 Thus a program's locality of reference influences the degree of folding to which that program can be subjected with a minimal impact on its performance.

The working set of a program5

The working set W(t, T) of a program at time t is the set of pages referenced in the T page references immediately prior to time t. As t progresses, W(t, T) may or may not change, depending on the locality of reference of that program in that time interval.
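The working-set definition translates directly into code over a page-reference trace; the following is a minimal sketch (the trace, the window size T, and the function name are illustrative, not from the paper):

```python
def working_set(trace, t, T):
    """W(t, T): the distinct pages referenced in the T page
    references immediately prior to time t, where t counts
    references (1-based), as in the definition in the text."""
    return set(trace[max(0, t - T):t])

# Illustrative reference trace: page numbers touched in order.
trace = [1, 2, 1, 3, 2, 4, 4, 5]

# W(5, 3): the pages touched by references 3, 4, and 5.
print(working_set(trace, 5, 3))  # {1, 2, 3}
print(working_set(trace, 8, 2))  # {4, 5}
```
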

The working set size of a program5,6

The working set size s(t, T) of a program at time t is the number of pages contained in the working set W(t, T). Thus it is quite possible to have the working set change and the working set size remain unchanged.

It appears natural to try to refold the program whenever its working set changes. This currently is difficult to do since it is not known in advance just when the working set is changing. In most paging systems, a working set size change is more easily detected. Thus it is possible to detect working set changes at least when the working set size changes. This paper describes a method for doing this. The relocation hardware of the Model 67 makes the application of this concept possible.
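Since the working set size (the number of pages in W(t, T)) is cheaper to track than the set itself, the detection idea above can be sketched as follows; the trace, T, and all names are illustrative assumptions:

```python
def working_set(trace, t, T):
    # Pages referenced in the T references immediately prior to time t.
    return set(trace[max(0, t - T):t])

def size_change_points(trace, T):
    """Times t at which the working set *size* changes -- the cheaply
    detectable subset of working-set changes described in the text."""
    points = []
    prev = None
    for t in range(1, len(trace) + 1):
        size = len(working_set(trace, t, T))
        if prev is not None and size != prev:
            points.append(t)
        prev = size
    return points

trace = [1, 2, 1, 3, 2, 4]
# Note the set can change while its size stays constant:
# W(4, 2) = {1, 3} but W(6, 2) = {2, 4} -- same size, different pages,
# so only the change at t = 2 (size 1 -> 2) is detected this way.
print(size_change_points(trace, 2))  # [2]
```
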

To put the concepts of locality of reference, working set, and working set size in perspective, consider this:

During a single interaction between a user at a terminal and TSS/360, several programs are usually executed for that user. Thus for the virtual execution time which spans this interaction, the working set size may or may not change; however, the working set will almost always change several times. Furthermore, for those programs having good locality of reference, the working set size during any one time slice will usually be much smaller than the working set size for the whole interaction time interval. And, in addition, the maximum working set size for all the time slices will probably always be smaller than the working set size for the whole interaction time interval. For those programs having poor locality of reference, the working set size for each time slice may frequently approach the working set size for the entire interaction time interval. Good locality relates more to the rate at which new pages enter W(t, T) than to its actual size.

Balanced core time

Programs having poor locality of reference and a large working set size would greatly reduce the level of multiprogramming if allowed to remain in core for very long periods of time. This would initially appear to affect throughput. However, responsiveness is also affected, since new requests for service cannot be quickly honored if core is currently tied up. Therefore the scheduling strategy proposed here will penalize programs with poor locality and large working set size.

The Principle of Balanced Core Time states that the length of the time slice, in terms of virtual CPU execution time, for any one task is inversely proportional to the working set size in that time interval. This will minimize the elapsed time that any large program can clog memory. It will also allow programs with good locality to progress very rapidly. If there were no overhead associated with paging these programs in and out of memory, this balanced core time principle could be applied in its pure form. But this is not the case. Therefore a minimum time slice length will be established for programs having a large s(t, T) and poor locality, to prevent paging overhead from dominating the system.
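The balanced-core-time rule with its minimum-slice compromise can be sketched as below; the proportionality constant and the floor are invented illustrative values, not the tuned TSS/360 parameters:

```python
def time_slice_ms(wss_pages, k_page_ms=1600, floor_ms=100, max_wss=64):
    """Balanced core time: the virtual-CPU time slice is inversely
    proportional to the working set size, with a minimum slice so that
    paging overhead cannot dominate for large, poorly-localized
    programs.  k_page_ms, floor_ms, and max_wss are assumptions."""
    wss = min(max(wss_pages, 1), max_wss)
    return max(k_page_ms // wss, floor_ms)

print(time_slice_ms(4))   # small working set -> long slice: 400
print(time_slice_ms(64))  # large working set -> floor applies: 100
```

Under this sketch, the per-slice product (slice length x working set size) stays roughly constant until the floor cuts in, which is the sense in which "core time" is balanced.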

To compensate for this compromise, the duration between such time slices will be considerably longer than the duration between slices for programs with smaller working set sizes. Since the latter constitute an observed large majority, the aggregate paging load on the system will decrease. The multiprogramming level will increase since more core is available more often. Responsiveness will also improve for the same reason. In addition, the degree of CPU utilization will increase. These trends should be evident in the RESULTS section of this paper.

Thus a paging system strikes back by reducing the service it provides to those who would misuse it. These scheduling characteristics become more a function of the goodness of the program than of the length of time it has been running. Therefore well-behaved programs will clearly be good, and bad programs will hopefully become obsolete.

TSS/360 table driven scheduler7,8

The TSS/360 table driven scheduler consists of a set of programs in the resident supervisor of TSS/360 used for scheduling, and a table with many rows (levels) of entries. The entries in any one level of the table contain sufficient information to completely control any one task. Each task in the system has another table describing itself to the system. This table is called the Task Status Index (TSI). Each TSI has a pointer to some level in the schedule table. Thus, by changing the value of that pointer, a task is given a completely new set of scheduling parameters. These parameters include:

1. Time, Space, and I/O limits to be used when executing.


2. Priority, Space, and Time values to be used to determine when to schedule a task to be run.

3. Pointers to other levels of the table which will replace the current schedule pointer in the task's TSI when some special condition occurs, or when one of the execution limits is reached.
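A toy model of this table-driven arrangement is sketched below; the field names and values are illustrative simplifications, not the actual TSS/360 control-block layout:

```python
from dataclasses import dataclass

@dataclass
class ScheduleLevel:
    # Item 1: limits applied while the task executes (illustrative fields).
    time_limit_ms: int
    space_limit_pages: int
    # Item 2: a value used to decide when to schedule the task.
    priority: int
    # Item 3: successor levels (table indices) installed in the TSI
    # when an execution limit is reached.
    on_time_limit: int
    on_space_limit: int

@dataclass
class TSI:
    """Task Status Index: one per task, pointing at a schedule level."""
    level: int

# A two-level table: exceeding a limit at level 0 moves the task to level 1.
table = [ScheduleLevel(100, 16, priority=10, on_time_limit=1, on_space_limit=1),
         ScheduleLevel(500, 64, priority=5,  on_time_limit=1, on_space_limit=1)]

task = TSI(level=0)
# Rescheduling a task is just repointing its TSI at another level,
# which swaps in a completely new set of scheduling parameters.
task.level = table[task.level].on_time_limit
print(task.level)  # 1
```
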

The supervisor programs used for scheduling are described in the TSS/360 Program Logic Manual for the Resident Supervisor.7 The schedule table entries are described in the TSS/360 Program Logic Manual called System Control Blocks.8

Structuring the table entries

A broad spectrum of scheduling strategies can be implemented by changing only the entries in the schedule table. In this section of the paper, one of several strategies implemented at the IBM T. J. Watson Research Center will be described. It attempts to embody the concepts and scheduling principles described above. As such, it should not be confused with the scheduling strategy normally distributed with TSS/360.

To better understand the scheduling strategies in the table it is helpful to consider sets of levels grouped according to some primary goals of scheduling.

First note that several specific programs are treated separately from all other programs. They are:

1. The System Operator Task
2. The Bulkio Task
3. Logon
4. Logoff

In this initial work not much attention was paid to applying the above scheduling concepts and principles to these programs.

All other programs are divided into two categories, interactive and batch. In general, the same sets of levels exist for both. The only differences are:

1. Interactive Programs have priority over batch.

2. Initially, interactive programs have greater urgency to get started than do batch.

3. The number of batch programs that are allowed to run simultaneously is arbitrarily restricted, to leave spare capacity to handle anticipated interactive programs.

With these exceptions, the following applies for scheduling interactive as well as batch programs.

The interactive sets of table levels are the Starting Set, the Looping Set, the AWAIT Set, the Holding Interlock Set, and the Waiting-for-Interlock Set.

The Starting Set

The Starting Set of table levels is used to handle new inputs from the terminal. This is somewhat similar to the pipeline of M. V. Wilkes.9 This set of table levels has a twofold function:

a. Facilitate a fast reply to the terminal if possible, and

b. make an initial judgment of the current working set size of longer running programs so the best entrance to the Looping Set of table levels can be chosen for this program.

This is accomplished by several successive table levels with high priority, small execution time limits (say 100 ms), and increasingly larger core space limits (say 16, 32, 48 pages). Each program, as it enters from the terminal, will progress upward through these levels each time it exceeds its space limit.

Whenever it exceeds its time limit at any of these levels, the space limit of that level is used as the estimate of the current working set size of that program.

That program is then considered to be a longer running program. Its future execution will be controlled by the Looping Set of table levels.

If the program exceeds its largest space limit, the largest allowable working set size (currently 64 pages) is used as the first estimate for future execution under control of the Looping Set of table levels.

Any time the program finishes it is returned to the initial Starting Set table level for the next input from the terminal.
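The Starting Set progression just described can be sketched as follows; the 100 ms time limit, the 16/32/48-page space limits, and the 64-page maximum are the values quoted in the text, while the function itself is an illustrative simplification:

```python
def starting_set_estimate(run_ms, pages_used,
                          time_limit_ms=100, space_limits=(16, 32, 48),
                          max_wss=64):
    """Sketch of the Starting Set logic: a new request climbs levels
    with growing space limits; if it exhausts a time limit, that
    level's space limit becomes the working-set-size estimate and
    the task moves to the Looping Set.  All names are illustrative."""
    for limit in space_limits:
        if pages_used <= limit:
            if run_ms <= time_limit_ms:
                # Fast reply achieved within the Starting Set.
                return ("finished in Starting Set", None)
            # Time limit exceeded: this level's space limit is the estimate.
            return ("to Looping Set", limit)
    # Exceeded even the largest space limit: use the system maximum.
    return ("to Looping Set", max_wss)

print(starting_set_estimate(50, 12))   # ('finished in Starting Set', None)
print(starting_set_estimate(400, 30))  # ('to Looping Set', 32)
print(starting_set_estimate(400, 70))  # ('to Looping Set', 64)
```
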

The Looping Set

The Looping Set of table levels performs three significant functions:

a. It uses the schedule table parameters to follow the working set size of each program by regularly over- and underestimating its time and space requirements in a minimal fashion, in accordance with the balanced core time principle.

b. It causes the load generated by long running programs to be spread out in time to allow Starting Set entries to be processed quickly.

c. Finally, it optimizes the CPU utilization and penalizes bad paging programs by causing programs with minimal paging requirements to be selected for running far more frequently than those with large paging requirements.

This penalty only occurs when the working set size is large and the program's locality of reference is poor. The Looping Set of table levels quickly detects any change in these situations and dynamically adjusts to them. Thus few programs are penalized throughout their execution, while most receive consistently good service.
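One hedged reading of function (a), the follow-the-working-set behavior, as a sketch; the step size and the exact up/down rules here are assumptions for illustration, not TSS/360's actual mechanism:

```python
def adjust_estimate(estimate, hit_space_limit, hit_time_limit,
                    step=8, max_wss=64):
    """Nudge the working-set-size estimate up when the task overran its
    space limit, and probe downward when it ran its full time slice
    within the space allotted -- regularly over- and underestimating in
    a minimal fashion.  step and max_wss are invented values."""
    if hit_space_limit:
        return min(estimate + step, max_wss)
    if hit_time_limit:
        return max(estimate - step, 1)
    return estimate

print(adjust_estimate(16, True, False))   # overran space: 24
print(adjust_estimate(16, False, True))   # fit in space: probe down to 8
```
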

The AWAIT Set

The AWAIT set is a special set of table levels reserved for those tasks doing tape I/O and other kinds of AWAIT operations. There is a parameter in each table level called AWAIT extension. This parameter is an elapsed time interval during which the current working set pages of a program are kept in core while the program is idle in the AWAIT state. Since this can cause severe elongations of real time compared to virtual time, smaller values of virtual time are allotted in this set of table levels than for a task of the same working set size in the Looping Set.

The Holding Interlock Set

This set of levels is another special set reserved for all programs which are currently holding interlocks on some system resource. Programs running from this set have high priority so that the interlocked resource may be quickly released. I currently assume that the working set size will not change significantly while holding these interlocks. This needs further investigation.

The Waiting-for-Interlock Set

This is a special set of levels for those programs which are waiting for interlocks currently being held by other programs in the Holding Interlock Set. Programs controlled by this set of table levels will be infrequently considered for dispatching until the interlock is released. The same assumption about insignificant change in the working set size is made here as in the Holding Interlock Set.

TOOLS AND THEIR USE

Tools

The tools used in this work were:

1. The Carnegie Mellon Simulator (called SLIN, not CMS)

2. Conversational SIPE

3. A Basic Counter Unit (BCU)
4. Level Usage Counters

Of these, SLIN and BCU were used to gather some evidence of the improvements; however, the Level Usage Counters and conversational SIPE were the most important tools used for tuning, the primary one being the Level Usage Counters.

SLIN

The Carnegie Mellon Simulator is a program developed at Carnegie Mellon University that can coexist in the Model 67 memory with TSS/360. It simulates multiple terminals and interfaces with TSS/360 at the CCW level of the transmission control unit.

Each simulated terminal can use a different set of commands (called a script) or all can use the same script. The overhead is quite low both in core space and CPU time.

Conversational SIPE

SIPE is a selective event trace capable of tracing many combinations of system functions simultaneously.10 Depending on the events traced, overhead ranges from about 30 percent to less than 1 percent.

Conversational SIPE traces all CCWs and their data at the transmission control unit. Its overhead is about 1 percent. It was used only for user session measurements.

The BCU

The BCU is a set of 16 hardware counters capable of measuring summary information about either time duration or frequency of use of the various hardware components of any computer. It counts at a one-microsecond rate and was used occasionally to measure loads on the Model 67 CPU and channels during user sessions as well as runs with SLIN.

The Level Usage Counters

The Level Usage Counters are a set of software counters, one for each level of the schedule table, that are incremented by one each time a task is dispatched at that level. They were used during the user sessions as well as during the SLIN runs. They provide information about utilization of the various schedule table levels and sets of levels.
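Since the counters are simple software tallies, their behavior is easy to sketch; the class and method names below are invented for illustration:

```python
class LevelUsageCounters:
    """One counter per schedule-table level, incremented at each dispatch."""

    def __init__(self, n_levels):
        self.counts = [0] * n_levels

    def on_dispatch(self, level):
        # Called by the (hypothetical) dispatcher each time a task
        # is dispatched at this schedule-table level.
        self.counts[level] += 1

    def set_total(self, levels):
        """Aggregate usage over a set of levels (e.g. the Starting Set)."""
        return sum(self.counts[lvl] for lvl in levels)

c = LevelUsageCounters(4)
for lvl in [0, 0, 1, 3, 0]:
    c.on_dispatch(lvl)
print(c.counts)             # [3, 1, 0, 1]
print(c.set_total({0, 1}))  # 4
```
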

Use of the tools

The initial experiments with the TSS/360 schedule table were run using SLIN, the BCU, and the Level Usage Counters for instrumentation. Although we


had two million bytes of LCS on our Model 67, this was rarely included when experimenting so results could be made as relevant to other installations as possible. The SHARE script (Figure 1) was used, initially running on 20 simulated terminals and, later, running on 36 simulated terminals. The script (running with a single user) took approximately 2400 seconds. Thus to minimize the probability of several terminals simultaneously executing the same lines of

