AEGIS Kernel Services - AEGIS System Overview

AEGIS System Overview

2.2. AEGIS Kernel Services

The AEGIS system is designed as a structured set of subsystems called managers. Each manager is composed of one or more modules that define the manager's set of operations and its private database. Many of the modules within a particular manager are available to other managers in the system. Consequently, each manager can use its own internal database plus modules in other managers to build a complete set of system services.

And, because the system is composed of small, independent modules, a change to a module does not require a change to the entire operating system.

The services provided by the managers of the AEGIS kernel can be separated into the following categories:

• File management

• Process management

• Virtual memory management

• Network management

• I/O management

• Time management

• Access control

• System initialization and shutdown

The next sections introduce the methods that the AEGIS system uses to carry out each of these functions.

2.2.1. File Management

AEGIS file management at the kernel level is, for the most part, object management. At this level, files are abstracted, just as are all other system resources, into objects. In general, file system objects are simply storage containers for bits. The AEGIS managers that handle objects at this level make no attempt to interpret the bits within the object as representations of an object type. It is the system's higher-level managers that interpret the bits.

The AEGIS file system carries out two functions: object management and object naming.

2.2.1.1. Object Management

The AEGIS components that carry out object management are collectively known as the object storage system (OSS). The object storage system manages the storage of objects on disk and provides the ability to read and write 1024-byte portions, or pages, of an object from local or remote disk to main memory. At any given time, the permanent storage for an object resides entirely at only one node, called the home node. The object storage system returns the results of any remote modifications to an object back to the home node for permanent storage. In addition, the system does not arbitrarily shift an object's home node from one node to another.


The interpretation of the bits within an object is left to the object's type manager. Most of the object type managers exist at the user-space level; an example of a user-space type manager is the stream interface. There are, however, two kernel-space managers that interpret the bits of file system objects: the naming server, which recognizes directory objects, and the ACL manager, which interprets access control list objects.

2.2.1.2. Object Naming

Objects are identified by 64-bit unique identifier strings, or UIDs. When an object is created, the system manufactures a UID for it by concatenating the unique node ID of the node generating the object with a time stamp from the node's timer. (The DOMAIN network does not use a global clock; instead, each node keeps its own time.) The UID is the mechanism the system uses to locate the object; that is, it is the system's internal name for the object.
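As a rough illustration of the scheme just described, the following Python sketch builds a 64-bit UID by concatenating a node ID with a time stamp. The field widths (20 bits of node ID, 44 bits of time stamp), the field order, and the names make_uid and split_uid are assumptions made for this example; the text states only that a UID is 64 bits wide and combines these two values.

```python
NODE_BITS = 20   # assumed width of the node ID field
TIME_BITS = 44   # assumed width of the time stamp field (20 + 44 = 64)

def make_uid(node_id, timestamp):
    """Concatenate a time stamp and a node ID into one 64-bit integer."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node ID does not fit in the assumed field")
    if not 0 <= timestamp < (1 << TIME_BITS):
        raise ValueError("time stamp does not fit in the assumed field")
    return (timestamp << NODE_BITS) | node_id

def split_uid(uid):
    """Recover (node_id, timestamp) from a UID built by make_uid."""
    return uid & ((1 << NODE_BITS) - 1), uid >> NODE_BITS
```

Because the node ID is part of every UID, two nodes whose local timers happen to read the same value still produce different UIDs, which is consistent with the network not needing a global clock.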

The naming server is the AEGIS manager that allows users and programs to refer to objects in the network using text string names instead of UIDs. The naming server on each node manages a collection of directories organized as a network-wide, multilevel tree. The directories contain the associations between text string names and the UIDs of objects local to that node. (By convention, an object is located on the same node as the directory in which it is catalogued.) A user refers to an object by its text string name, or pathname; the naming server's function is to translate this text string name to the object's UID using the directory data structures. Chapter 8 describes naming server function and structure in detail.
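The translation from pathname to UID can be pictured as a walk through directory objects, one pathname component at a time. The Python sketch below is only a model of that idea: the Directory class, the resolve function, the use of "/" as a separator, and the small integer UIDs are all hypothetical and do not reflect the actual naming server data structures, which Chapter 8 describes.

```python
class Directory:
    """A directory object: maps text names to the UIDs of catalogued objects."""
    def __init__(self):
        self.entries = {}          # name -> UID

def resolve(root_uid, pathname, directories):
    """Translate a pathname to a UID by walking directory objects.

    directories maps the UID of each directory object to its Directory.
    """
    uid = root_uid
    for component in [c for c in pathname.split("/") if c]:
        directory = directories.get(uid)
        if directory is None:
            raise NotADirectoryError(component)   # tried to descend into a non-directory
        if component not in directory.entries:
            raise FileNotFoundError(pathname)
        uid = directory.entries[component]
    return uid
```

For example, with a root directory (UID 1) that catalogues "user" as UID 2, and "user" cataloguing "notes" as UID 3, resolve(1, "/user/notes", directories) returns 3.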

2.2.2. Process Management

Process management concerns the allocation of processor resources. The AEGIS kernel manages processor resources by multiplexing the processor into many virtual processors, or processes. A process is an independent, asynchronously executing entity.

The AEGIS kernel supports two levels of processes:

• Level 1 processes, also called supervisor, kernel, or PROC1 processes

• Level 2 processes, also called user or PROC2 processes

2.2.2.1. Level 1 Processes

Level 1 processes are processes that only run the protected operating system software and thus run exclusively in supervisor mode. Level 1 processes are completely internal to the AEGIS kernel: their context (processor state and stack) exists in a protected portion of virtual memory. In addition, level 1 process context is permanently wired; that is, the process's context is permanently resident in physical memory and can never be paged out by the virtual memory management subsystem. Note, however, that although a level 1 process's context is wired, it can still run pageable AEGIS kernel procedures.

There are 32 level 1 processes; the system names them with small integers (1-32) called process IDs, or PIDs. Process IDs are not unique; when the system deletes a level 1 process, it reissues its PID to the next new level 1 process it creates. (A level 1 process is deleted when the level 2 process on top of it is deleted.) Because PIDs, unlike UIDs, are not unique identifiers, the system can only refer to level 1 processes on a single machine, rather than on a network-wide basis.
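The reuse of PIDs can be modelled with a small table of free slots. This Python sketch is an assumption-laden illustration, not the kernel's data structure: the PidTable class and its method names are hypothetical, and it assumes that a freed PID is handed to the very next process created, as the text describes.

```python
class PidTable:
    """Hands out level 1 process IDs from a fixed pool of small integers."""
    def __init__(self, count=32):
        self.free = list(range(1, count + 1))   # PIDs 1..32
        self.in_use = set()

    def create(self):
        """Return the PID for a new level 1 process."""
        if not self.free:
            raise RuntimeError("no free level 1 process slots")
        pid = self.free.pop(0)
        self.in_use.add(pid)
        return pid

    def delete(self, pid):
        """Release a PID so that the next process created reuses it."""
        self.in_use.remove(pid)
        self.free.insert(0, pid)
```

Since the same small integer can name different processes over time, a PID is meaningful only on the node that issued it, which is why network-wide references rely on UIDs instead.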


2.2.2.2. Level 2 Processes

At system initialization, eight level 1 processes are reserved to the AEGIS kernel. The remaining 24 processes can be used as additional level 1 processes, or they can be augmented, or bound to level 2 processes (also called user or PROC2 processes).

Level 2 processes are level 1 processes with additional, user-mode context: their own process virtual address space. For the most part, level 2 processes run user-mode software. To run the supervisor-mode AEGIS kernel services, level 2 processes enter supervisor mode via the SVC catcher.

The user-mode context of a level 2 process is pageable: the virtual memory management subsystem can move pages of the process's virtual memory in and out of physical memory (unless the process specifically wires its pages). However, while the level 2 process context is pageable, the level 1 process context underneath it is not. The level 2 process's pageable user-mode context provides the environment for its user-mode activity. The level 1 process context that is bound to every level 2 process supports the level 2 process's supervisor-mode activity.

The system gives a unique name to a level 2 process by assigning it a UID. Consequently, a level 2 process on one machine can explicitly refer to a level 2 process on another machine. Both the supervisor-mode level 2 process (PROC2) manager and the user-mode process manager (PM) handle level 2 process operations. The PROC2 manager handles the binding of level 2 process context to level 1 processes, while the process manager handles user-mode process operations such as program invocation, fault handling, and resource cleanup. Figure 2-1 illustrates the relationship between level 1 and level 2 processes.

Figure 2-1. Process Levels (surviving labels: PIDs 1 through 32; kernel processes; reserved for Display Manager (DM process); 24 user processes; UID)

2.2.2.3. Process Synchronization

In the AEGIS system, process synchronization is based on eventcounts. An eventcount is an object that keeps a count of the number of events within a particular class that have occurred so far in the execution of the system. A process signals the occurrence of an event by advancing the eventcount associated with it. Each time the eventcount is advanced, the counter value is incremented. Consequently, waiting processes can synchronize their operations around an eventcount by:

• Waiting on the very next event by waiting for the eventcount to be advanced to a new value

• Waiting on a future event by reading the eventcount's current value, then waiting for it to reach the future trigger value (the current value plus the nth event value)

As with processes, there are two levels of eventcounts: level 1 (EC1) and level 2 (EC2). Level 1 processes use level 1 eventcounts to synchronize operations in the kernel, while level 2 processes use level 2 eventcounts to synchronize operations with other level 2 processes.

Because the AEGIS eventcount operates as a shared object, only processes running on the same machine can use it. See the section on memory management for an explanation of shared objects.
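The eventcount idea can be sketched in a few lines of Python using threads in place of processes. This is a minimal model under stated assumptions, not the AEGIS EC1 or EC2 interface: the EventCount class and its read, advance, and wait methods are hypothetical names, and a threading.Condition stands in for whatever mechanism the kernel uses to block and wake waiters.

```python
import threading

class EventCount:
    """A counter that only moves forward, plus a way to wait for a value."""
    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def read(self):
        """Return the current count."""
        with self._cond:
            return self._value

    def advance(self):
        """Signal one event: increment the count and wake all waiters."""
        with self._cond:
            self._value += 1
            self._cond.notify_all()
            return self._value

    def wait(self, trigger, timeout=None):
        """Block until the count reaches trigger; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= trigger, timeout)
```

In this model, waiting for the very next event is ec.wait(ec.read() + 1), and waiting for the nth future event is ec.wait(ec.read() + n). Because the trigger is an absolute value, a waiter that is slow to block does not miss an event that was signalled in the meantime.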


2.2.2.4. Process Scheduling

The AEGIS system schedules processes for execution based on their priority, running the highest priority process first. The system calculates process priority inversely against the amount of CPU time the process requires. Consequently, a process that requires a large amount of CPU time is assigned a low priority. Process scheduling is dynamic. The system's scheduling procedure, called the scheduler, periodically checks a process's CPU needs; as its CPU need changes, so does its priority. The scheduler then performs a process exchange (also known as a context switch or dispatch): it switches from the lower priority process to the higher. Dynamic scheduling intends to give interactive processes priority, on the theory that an interactive process is usually waiting for the user to type.
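The inverse relationship between CPU use and priority can be shown with a toy calculation. The formula below is purely an assumption chosen for illustration (the text does not give the scheduler's actual rule), and the function names, the priority range, and the idea of counting "recent CPU ticks" are all hypothetical.

```python
def priority(recent_cpu_ticks, max_priority=16):
    """Assumed rule: the more CPU time recently used, the lower the priority."""
    return max(1, max_priority - recent_cpu_ticks)

def pick_next(ready):
    """ready maps process name -> recent CPU ticks; return the name to run."""
    return max(ready, key=lambda name: priority(ready[name]))
```

With ready = {"editor": 1, "compiler": 12}, pick_next chooses the editor, since a process that mostly waits for keystrokes has used little CPU time. If the measured CPU use later changes, recomputing the priorities changes the choice, which is the dynamic behavior the paragraph describes.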

2.2.2.5. Trap, Interrupt, and Fault Handling

The AEGIS system distinguishes between traps, interrupts, and faults. A trap is an instruction, like any other M680x0 processor instruction. Traps include SVC traps and traps to the PROM.

For example, typing CTRL/RETURN executes a trap instruction to the SF trap. A trap generates a hardware exception that changes the normal flow of program execution. When an exception occurs, the processor hardware indexes to the appropriate trap vector address in the trap page and uses this address as the next instruction to execute. The trap page contains the entry points to routines that the processor hardware uses to handle exceptions and interrupts.

Once the trap is handled, code execution resumes at the next user code instruction. Hardware exceptions include bus errors, zero divide, and privilege violations.
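Vectoring through the trap page amounts to indexing a table of entry points by exception number. The Python sketch below models that lookup only; it is not AEGIS code. The vector numbers 2, 5, and 8 are the standard M68000 assignments for bus error, zero divide, and privilege violation, while the function names, the handler bodies, and the 256-entry table built here are assumptions for the example.

```python
BUS_ERROR, ZERO_DIVIDE, PRIVILEGE_VIOLATION = 2, 5, 8   # M68000 vector numbers

def make_trap_page(size=256):
    """Build a table of entry points, one per exception vector."""
    def unexpected(vector):
        return f"unexpected exception {vector}"
    # Every slot gets a default routine; v=v binds the slot's own number.
    page = [lambda v=v: unexpected(v) for v in range(size)]
    page[BUS_ERROR] = lambda: "bus error handler"
    page[ZERO_DIVIDE] = lambda: "zero divide handler"
    page[PRIVILEGE_VIOLATION] = lambda: "privilege violation handler"
    return page

def take_exception(trap_page, vector):
    """Index the trap page by vector number and run the routine found there."""
    return trap_page[vector]()
```

The point of the table is that the hardware needs no knowledge of what a handler does; replacing an entry is enough to change how a given exception is serviced.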

An interrupt is a hardware-generated event that takes the processor away from the currently running process. Interrupts vector through interrupt entry points in the trap page directly to driver interrupt service routines (ISRs). Interrupts may or may not restart or wait for completion. Although an interrupt changes the flow of execution, it is generated by system activity that occurs independently of instruction execution, while an exception always occurs as the result of instruction execution. Chapter 15 describes interrupt handling in more detail.

In addition to hardware exception vectors and interrupt vectors, the trap page contains five vectors that the AEGIS system uses to handle SVC traps, which are traps from user to supervisor mode. The trap handlers to which these vectors point field user-mode calls to AEGIS supervisor-mode services; these handlers are collectively called the SVC catcher. Chapter 19 describes these trap handlers in detail.

Faults are generated by either the hardware or software. The fault interceptor manager (FIM) handles hardware-generated faults; user code fault handlers deal with software-generated faults.

Hardware-generated faults restart the instruction that caused the fault; resumption of execution after a software fault depends on how the user fault handler is designed. Chapter 18 describes the fault handling carried out by the AEGIS kernel. See the manual Programming with General System Calls for more information on user fault handlers.

2.2.3. Layout of Virtual Address Space

The AEGIS system allocates virtual memory into private and shared areas called per-process and global address spaces. In addition, it separates both per-process and global spaces into protected and unprotected areas. Figure 2-2 illustrates this allocation of virtual address space.

Figure 2-2. Layout of Virtual Address Space (surviving labels: User Mode; Supervisor Mode; 16 or 256 MB)

Per-process address space is the virtual address space that the system gives to each level 2 process it creates. The unprotected portion of per-process address space is called user private address space and contains the process's private programs and data.

Supervisor private address space is the protected portion of per-process address space; the process can only access supervisor private address space while it is running in supervisor mode.

Consequently, user-mode processes must call the supervisor via the SVC catcher to get the information mapped in supervisor private space on their behalf. For example, the system maps a process's working and naming directories into its supervisor private space. When the user-mode process wants to access its working directory, it makes an SVC call to the naming server to fetch the directory from its supervisor private address space.

Because the contents of per-process address space vary with each process, different processes view different objects at the same virtual address. In contrast, global address space is shared among all processes in the system, so that each process views the same object at the same virtual address. Global address space is also separated into protected and unprotected regions. User global address space (also known as global A) is unprotected shared virtual memory that all the user-mode programs in the system can access. User global address space contains the global libraries and other unprotected global data.


Supervisor global (or global B) address space is the protected virtual memory shared among all supervisor-mode processes. Supervisor global space contains supervisor-mode programs and data such as the AEGIS kernel source code and system data structures.

The size of virtual address space differs depending on the DOMAIN node model. Chapter 9 provides more details about the contents of virtual address space for each node model.

2.2.4. Virtual Memory Management

The AEGIS virtual memory management subsystem provides network-wide access to objects.

Virtual memory management includes two related operations:

• Mapping (or single level storage), where the system sets up an association between a local or remote object and a process's virtual address space so that the process can refer to the object directly by referencing addresses in its virtual memory

• Demand paging, where the system dynamically transfers 1024-byte pages of an object residing on a local disk or a remote node to the requestor, be it on the local node or on another node in the network

2.2.4.1. Mapping

Some systems separate storage into levels; main memory is the primary storage level while the disk is the secondary storage. These multilevel storage systems allow programs direct access only to the primary storage level. A program must explicitly copy an object from secondary to primary storage before it can access the data. In contrast, the AEGIS system uses a single level storage mechanism, called SLS. Under SLS, a process gains access to an object by requesting that it be mapped directly into its address space, associating network-wide object pages with pages of process virtual address space. The direct mapping feature of SLS allows processes to access objects using programming language variables, arrays, strings, and other constructs.

In addition, once the object is mapped, the system does not demand page any data until the processor actually references it; consequently, processes can map objects to regions of process address space without incurring excessive system overhead.

The mapping between object space and process address space is the fundamental I/O primitive of the DOMAIN architecture. It provides one level of storage for all the objects in the network, whether the objects exist on local disk or on another disk in the network. It also allows users to share single copies of programs and data files. Because mapping proceeds independently of whether the object is local or remote, it provides a uniform, network-transparent way to access objects. As a result, the user can execute a program without being concerned about its location or the location of the files it uses. For example, it is possible to execute on node A a program that resides on node B, reads input from node C, and creates output on node D.
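A loose present-day analogy to mapping is the memory-mapped file, which lets a program index a file as if it were an array instead of issuing explicit reads and writes. The Python sketch below uses the standard mmap module on a local temporary file; it is an analogy only, not the AEGIS mapping interface, and it does not model remote objects. The function names and the file contents are made up for the example.

```python
import mmap
import os
import tempfile

def map_and_modify(path):
    """Map a file into memory and change it with ordinary slice assignment."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as view:
        view[0:5] = b"HELLO"          # a store through the mapping, no write() call
        return bytes(view[:11])

def demo():
    """Create a one-page file, modify it through a mapping, and re-read it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello world" + b"\0" * 1013)   # 1024 bytes in total
        seen = map_and_modify(path)
        with open(path, "rb") as f:
            on_disk = f.read(11)
        return seen, on_disk
    finally:
        os.remove(path)
```

Calling demo() returns the bytes seen through the mapping and the bytes later read from the file; both reflect the modification, even though the program never copied the file into a buffer of its own.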

2.2.4.2. Demand Paging

AEGIS manages virtual memory over physical memory by paging 1024-byte pieces of virtual memory in and out of physical memory both locally and over the network. Each node has a remote paging server process that handles remote requests to read and/or write 1024-byte pages of objects on that node. When a page belonging to an object is referenced by another node on the network, the remote paging server dynamically transfers, or demand pages, it to the requesting node.

The paging system saves, or caches, copies of the pages it has fetched; consequently, subsequent references proceed at main memory speeds. The object storage system ensures that these copies are always up-to-date by purging obsolete, or stale, pages as necessary; it also automatically reflects an object's page modifications back to the node on which the object is stored. Because the AEGIS virtual memory management subsystem uses a node's main memory as a cache over objects from the entire network, the subsystem only needs to read mapped objects from local disk or from the network when they are actually demanded and are not already cached locally.
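The interplay of demand paging, caching, and write-back can be modelled in a few lines. The Python sketch below is a simplified illustration under stated assumptions: HomeNode and PageCache are hypothetical classes, a byte array stands in for the object's permanent storage, every write is reflected home immediately, and the policy that decides when a cached page is stale is left out (purge is called by hand).

```python
PAGE_SIZE = 1024

class HomeNode:
    """Holds the permanent copy of an object, split into 1024-byte pages."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.reads = 0                    # counts page transfers to requestors

    def read_page(self, n):
        self.reads += 1
        return bytes(self.data[n * PAGE_SIZE:(n + 1) * PAGE_SIZE])

    def write_page(self, n, page):
        self.data[n * PAGE_SIZE:n * PAGE_SIZE + len(page)] = page

class PageCache:
    """A requesting node's main-memory cache over one remote object."""
    def __init__(self, home):
        self.home = home
        self.pages = {}                   # page number -> cached copy

    def read_byte(self, offset):
        n, within = divmod(offset, PAGE_SIZE)
        if n not in self.pages:           # not cached: fetch the page on demand
            self.pages[n] = bytearray(self.home.read_page(n))
        return self.pages[n][within]

    def write_byte(self, offset, value):
        n, within = divmod(offset, PAGE_SIZE)
        self.read_byte(offset)            # make sure the page is cached
        self.pages[n][within] = value
        self.home.write_page(n, bytes(self.pages[n]))   # reflect back home

    def purge(self, n):
        """Discard a stale page so the next reference fetches it again."""
        self.pages.pop(n, None)
```

Only the first reference to a page costs a transfer from the home node; later references to the same page are served from the cache until the page is purged.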

2.2.5. Network Management

The proprietary DOMAIN network consists of a token-passing ring. It is called a ring because the communications cable connects the nodes in a circle. The ring uses a token-passing architecture, in which a special bit pattern circulates around the ring, passing through each node.

In order to transmit a message, a node must gain control of this token. The node's ring transmitter generates the message, which is received at each successive node and re-transmitted to keep the message going around the ring; this process is called transceiving. The ring hardware carries out the transceiving sequence without intervention from the central processor.

Although messages sent by a transmitting node pass through each node on the ring, only the target node actually processes the message. If the ring hardware on a node decides to receive a message passing through it, it awakens the processor by signaling an interrupt as well as transceiving the message. Because each node transceives the message, it eventually returns to the node that generated it. This node checks the message for evidence of the target node's receipt. [...] small number. The message sender addresses the packet to a node and to a socket number within that node. When the packet arrives at the target node, the ring receive interrupt handler places it into the socket specified by the socket number in the packet.

Because sockets provide the only means for packet delivery, a process that intends to receive messages from the network must have a socket into which the messages can be queued. The AEGIS system runs special server processes that handle requests for service from remote nodes.
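Socket-based delivery can be pictured as a table of queues indexed by socket number. The Python sketch below is a hypothetical model, not the AEGIS socket manager: the Node class, the dictionary-shaped packet, and the choice to report an undeliverable packet by returning False are all assumptions made for the example.

```python
from collections import deque

class Node:
    """A node's sockets: each open socket number owns a queue of packets."""
    def __init__(self):
        self.sockets = {}               # socket number -> queue of packet data

    def open_socket(self, number):
        """Give a receiving process a socket in which packets can be queued."""
        self.sockets[number] = deque()

    def receive(self, packet):
        """Stand-in for the ring receive interrupt handler: queue or refuse."""
        queue = self.sockets.get(packet["socket"])
        if queue is None:
            return False                # no such socket: nowhere to deliver
        queue.append(packet["data"])
        return True
```

A server process would call open_socket once and then repeatedly take packets from its queue. A packet addressed to a socket number that nobody has opened has no queue to land in, which illustrates why a process must own a socket before it can receive network messages.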

Since these servers expect to receive messages from the network, they are assigned sockets at system initialization. These sockets are called well-known sockets because the socket number assigned to a given AEGIS process is the same on all nodes; for example, the paging server is always assigned socket number 1, regardless of the node on which it is running. Well-known socket owners within the AEGIS kernel include the paging server, the remote file server, and the information server. Well-known sockets are also assigned to AEGIS user-space system services,

In the document AEGIS Internals and Data Structures (pages 28-37)