DESPERATE MEASURES

From the document Micronix Operating System (pages 171-200)

If you are reading this chapter of the manual, you must either have a serious problem, or just be curious. What are desperate measures? They are techniques that allow you to recover from serious problems without requiring deep knowledge of Micronix. The problems that we will teach you how to deal with here are:

o Freeing up a terminal that was captured by a runaway program,

o Replacing a forgotten root password, and

o Repairing the root file system from the Standalone Micronix diskette.

You might have come to this chapter looking for a translation of a message that appeared on your console. The explanations of console messages are contained in a later chapter of this division named Console Error Messages, of course. This chapter will bail you out of other difficulties, and does explain how to use fsck from the Standalone Micronix diskette.

10.1. Runaway Terminals

Ever see a runaway terminal? Runaway terminals mostly get away from the designer of a new piece of software. What happens is that the new software has a serious defect (a bug) that causes Micronix to ignore everything you type on the terminal to try and stop the program. The program just keeps running wildly along, doing its own thing. On a simpler system, this problem is solved by resetting the computer. With Micronix, you don't want to reset the system: you might damage the file system and/or disrupt other users who don't have any problems, but will if you reset the system. What you do instead is KILL the program.

Now, maybe you don't like killing things, but relax. In Micronix, you kill a program by sending a signal to Micronix that gracefully halts the runaway program. Killing a program (actually better described as killing a process) stops execution of the program, closes all its open files, flushes all disk buffers and releases the memory and swap space used by the program. After the program is killed, the runaway terminal will show the normal shell prompt and respond correctly.

Killing a program is a three step procedure:

1. Log on as yourself on a working terminal,

2. Identify the process number of the offending program,

3. Kill the process.

STEP ONE

The first step may be performed in two ways. If you have more than one terminal hooked up to your Decision, log in on one of the other terminals. If someone is currently logged in, you can use the su program (switch user) to temporarily log in. For example, suppose your user name is john and you want to temporarily log in on someone else's terminal. After asking the person politely, you would sit down and type

% su john
Password:
% []

Now, you are logged in at two different terminals. (You could also do this as superuser, but it's more complicated.)

If you don't have two terminals, we hope you were paying attention earlier (during the Adding Terminals chapter) and left unused ports configured for log-in. Micronix comes configured for two terminals: one on port ttyA, and a second on ttyB. If you have a single terminal system, your terminal is the console, and is connected to port ttyA. It is also the runaway terminal.

So here's the trick: reach around the back of your Decision, remove the RS232 connector from the ttyA socket (in the lower right hand corner of the back), and plug the connector into ttyB, immediately to the left of ttyA. Now, type a RETURN and you should see

Name: john
Password:
Last logged in on ttyA at 11:37 Thu Jun 22, 1983
% []

As the example shows, you should log in as yourself, and you'll be ready for the next step. If our trick doesn't work, and you're certain that the RS232 cable is plugged in correctly, you must have changed the /etc/ttys file so that ttyB is not a login port. Maybe you made ttyC a login port? If nothing is plugged in there, try connecting your cable to it instead.

If you can't login, your only option is to reset the system and run fsck. You should also add the word "login" to the entry for port ttyB in /etc/ttys, or use the recon program, so you won't get stuck again.

STEP TWO

Once you are logged in on a functioning terminal as yourself, you need to identify the offending process. To start with, let's call your runaway program "mustang". To get the process number of mustang, you use the process status program, ps, to display all current processes:

% ps a
PID TTY  COMMAND
  2 ttyA -sh
  4 ttyA update
  5 ttyA mustang
  6 ttyB -sh
  7 ttyB ps
% []

The process number of mustang, also known as the PID, is readily apparent in the output of ps as being 5. This is what we needed from step two, so onward.
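The lookup performed by eye in step two can also be written down as a short script. The following is a minimal sketch in Python, which is not part of Micronix; it assumes the three-column PID / TTY / COMMAND layout of the ps a listing above, and the function name find_pids is hypothetical.

```python
def find_pids(ps_output, command):
    """Return the PIDs of every listed process whose COMMAND column equals `command`."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything else that does not
        # start with a numeric PID followed by a TTY and a COMMAND.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        if fields[2] == command:
            pids.append(int(fields[0]))
    return pids


# The listing from step two, reproduced as sample input.
SAMPLE = """\
PID TTY  COMMAND
  2 ttyA -sh
  4 ttyA update
  5 ttyA mustang
  6 ttyB -sh
  7 ttyB ps
"""

print(find_pids(SAMPLE, "mustang"))
```

A list is returned rather than a single number because the same command may be running on more than one terminal; in that case the TTY column tells you which process belongs to the runaway terminal.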

STEP THREE

The final step is to kill the program. This is simply done with the command

% kill 5
% []

Look over at the runaway terminal and see if the shell prompt, usually a percent sign (%), has appeared. If it hasn't, you might need to use ps a to see if you got the process number wrong. Once your runaway terminal is back, log off on the terminal you are temporarily using by typing exit. If you only have a single terminal, you need to switch the RS232 cable back to port ttyA. Otherwise, you're done.

10.2. Replacing the Root Password

This procedure is reserved for those instances when you have somehow forgotten the root password. (Perhaps some malicious person has changed it without telling you, the "official" system administrator.) You will need the root password to backup the entire file system, create or change user accounts, check the file system with fsck, etc. And you can only change the root password when you ARE the root. A real catch-22, eh?

Well, where there's a will, there's a way. You should have a copy of the Standalone Micronix floppy diskette safely locked away somewhere. (If you don't lock the standalone diskette away, anyone can use this procedure.) With Micronix properly brought down, load the system with the Standalone Micronix floppy in the floppy drive. (This will involve booting CP/M and using the DJBOOT program, unless you have SYSGENed DJLOAD onto your Standalone diskette).

Then, we want to edit the /etc/passwd file on the hard disk. This is a three step process, with fairly simple steps, so here goes...

STEP ONE

The first step is mounting the hard disk. This is performed with a single instruction. The only trick to it is knowing which hard disk to mount. There are four possible hard disk names which you could use with the mount command. Each name is related to the size and capacity of the first hard disk in your system, your root hard disk. Here are the four possible names:

m5a     5 megabyte capacity 5 1/4" hard disk
m10a    10 megabyte capacity 5 1/4" hard disk
m16a    16 megabyte capacity 5 1/4" hard disk
hda     any capacity 8" or 14" hard disk

Choose one of these four names for the mount and umount commands that follow. Do NOT use the name "root", because when you are using the Standalone Micronix, the root device is the floppy diskette, and you want to change the passwd file on the hard disk.

Okay, insert your choice on the command line that follows (where we used m10a, insert the chosen name instead, if different):

# mount m10a /b
# []

If this doesn't work, you've chosen the wrong name. Please try again.

STEP TWO

The mount command in step one added the hard disk to directory /b of your floppy file system. Now, we can edit the passwd file, and remove the encrypted password from the entry for the root user.

# edit /b/etc/passwd
"/b/etc/passwd" [reading] 14 lines
:/root/
root:l"ie)3rBnvHo:0:0:Super User:/:/bin/sh
:c
root::0:0:Super User:/:/bin/sh
.
:w
"/b/etc/passwd", 14 lines
:q
# []

The bold face type is, as always, the letters that you will type when you edit the passwd file on the hard disk. The first editor command, "/root/", locates the line with the root user entry.

The "c" command stands for change this line. On the following line, you type a new root entry without a password: just enter the line exactly as it appears in our example. A period by itself on the next line ends the change. Then, exit the editor by typing "w" <RETURN> and "q" <RETURN> and you're finished with step two.
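The effect of the edit in step two is easy to state precisely: the second colon-separated field of the root entry is emptied and nothing else changes. The following is a minimal sketch of that transformation in Python, which is not part of Micronix. The function names are hypothetical, the encrypted password in the sample is a made-up placeholder, and the code assumes that the password field itself never contains a colon.

```python
def clear_password(passwd_line):
    """Return the passwd entry with its second field (the encrypted password) emptied."""
    fields = passwd_line.split(":")
    fields[1] = ""
    return ":".join(fields)


def clear_root_password(passwd_text):
    """Empty the password field of the root entry; leave every other line untouched."""
    result = []
    for line in passwd_text.splitlines():
        # The user name is the first colon-separated field of each entry.
        if line.split(":")[0] == "root":
            line = clear_password(line)
        result.append(line)
    return "\n".join(result) + "\n"


# "XXXXXXXXXXX" is a placeholder, not a real encrypted password.
before = "root:XXXXXXXXXXX:0:0:Super User:/:/bin/sh"
print(clear_password(before))
```

With an empty password field, root can log in without being asked for a password, which is why the procedure ends with assigning a new one as soon as Micronix is back up.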

STEP THREE

All that you need to do now is unmount the hard disk. Just type

# umount m10a
# down
# []

replacing "m10a" with the name of your hard disk (from step one).

The next command, "down", prepares Micronix for being reset. Now, you can insert your normal boot floppy and bring up Micronix.

After bringing up Micronix, log in as "root" and assign a password. Please remember to lock the Standalone diskette in a safe place, so it will be there when you need it.

10.3. Repairing the Root Hard Disk

You run the fsck program on the hard disk every day if you have been following our recommendations. This section is for problems that fsck seems to be fixing over and over again on the root hard disk. For example, while fsck was checking the I-list (second pass), it produced the messages

Inode 1, 13 Directory entries, Link count 12
Inode 491, 1 Directory entries, Link count 2

and during the "Hunting up filenames of casualties" produced the message

dir491: Directory's name changed

These messages will occur over and over again in this particular case because the fsck program is trying to repair Inode 1. Inode 1 will always be in memory while you are running fsck from the hard disk. So when fsck finishes fixing Inode 1 on the disk, the copy of Inode 1 in memory will be written over the repaired Inode 1 on the hard disk.

We are sorry if this explanation is a little bit obscure. But there are really only two things you need to understand. One, problems that fsck repeatedly reports and fixes are being stored in memory while fsck repairs the disk copy. Two, the way to clear up the problem is to run fsck from the Standalone Micronix diskette.

In the previous section, we discussed how to find out the name of your root hard disk device (as part of Step One). If you don't know the name of your root hard disk device, please refer to that section, then return here.

To run fsck from the Standalone Micronix diskette, you must start when hard disk Micronix is down. Reset the system with the Standalone Micronix diskette in the floppy drive. (This is also explained in the previous section.) Once you have Standalone Micronix running, type

# fsck /dev/m10a
Checking /dev/m10a
...
# []

replacing "m10a" with the name of your root hard disk, if it is different. When fsck finishes, you can bring down Standalone Micronix by typing

# down
# []

Insert your regular load diskette and turn the key to RESET to boot Micronix.

10.4. Conclusion

No doubt we haven't covered every possible set of dire straits that you'll come into over the years, but these have been shown to be the most common. If you do encounter other situations that call for desperate measures, and they aren't the fault of defective hardware, you should let us know about them.

Address correspondence to the Documentation Department, Morrow Inc., at the address shown in the front of this manual.

11. ADDING A HARD DISK

You may find it necessary or desirable to add an additional hard disk to your Micronix system. Physically, adding a hard disk means changing the cabling to your current hard disk to include an extension of the control cable to the new disk. You will also need to connect the data cable to the new hard disk, remove the terminating resistor from the hard disk in the Decision, and set a jumper on the new hard disk to make it the second (or third or fourth) hard disk. The details for connecting the hardware should be included with your new hard disk.

Once you have installed the new hard disk, you will need to format it. Please follow the instructions for formatting that are in the INSTALLATION section of this guide.

You next need to add a Micronix file system to the formatted hard disk. This is done with a single command, mkfs. For example, if you have added a 16 megabyte hard disk as the second hard disk in your system, you type

% mkfs /dev/m16b

to create a file system on it. The third 16 megabyte hard disk is called m16c, the fourth is named m16d. If you were adding a 5 or 10 megabyte hard disk instead, you would be using m5b or m10b for the second hard disk. You can only have one "a" drive, one "b" drive, etc., regardless of the drives' capacity.
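The naming rule can be summarized as: the number in the name is the capacity, and the letter is the drive's position, with each letter used only once. The following is a minimal sketch of that rule in Python, which is not part of Micronix. The function name next_drive_name is hypothetical, and the sketch assumes that only 5 1/4" drives (the m5, m10 and m16 names) are involved and that positions are filled in order from "a" to "d".

```python
VALID_CAPACITIES = (5, 10, 16)   # megabytes, as listed for 5 1/4" hard disks
DRIVE_LETTERS = "abcd"           # first through fourth hard disk


def next_drive_name(existing, capacity_mb):
    """Return the device name for a new drive, given the names already in use."""
    if capacity_mb not in VALID_CAPACITIES:
        raise ValueError("capacity must be 5, 10 or 16 megabytes")
    # Each letter may be used once, regardless of the capacity in the name.
    used = {name[-1] for name in existing}
    for letter in DRIVE_LETTERS:
        if letter not in used:
            return "m%d%s" % (capacity_mb, letter)
    raise ValueError("all four drive positions (a-d) are already in use")


# A system with a 10 megabyte root disk gains a 16 megabyte second disk.
print(next_drive_name(["m10a"], 16))
```

The name returned is the one you would then prefix with /dev/ for the mkfs and fsck commands.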

The next step is to check the new file system. The fsck program is used with the same name as you used with the mkfs command.

% fsck /dev/m16b

The new hard disk can now be mounted on the root file system. For example,

% mount m16b /b

mounts the second hard disk (that happens to have a 16 megabyte capacity) on directory /b.

The best way to handle checking and mounting additional hard disks is to add these commands to the /etc/rc file. This is discussed back in the section on Customizing the Environment.

Having these commands in the /etc/rc file makes them automatic each time the system goes multiuser.

12. CONSOLE ERROR MESSAGES

Micronix is designed so that system errors are reported to the console. The error messages will appear on the console terminal regardless of the current task being performed on the console. This may result in the error message appearing in the middle of the console screen during the editing of a file. If this happens, don't worry about the message being added to the file you are editing. Your real concern at this point is dealing with the problem that caused the error message.

Some of the error messages received are simply warnings; others indicate that a serious problem has occurred, and the system is bringing itself down. Most of the error messages that you are likely to see on your console are warnings about running out of space on a disk device, or disk read or write error reports. Other errors reflect trouble occurring in the operation of Micronix.

There are three classes of console error messages: file system warnings, operating system warnings, and fatal error messages. The three sections that follow contain information about the interpretation of these messages and the responses that are needed to remedy the situation.

12.1. File System Warning Messages

File system warning messages are reports about the file system(s) that you are using. As we mentioned previously, all messages will appear on the console terminal for any mounted file system. Each message will tell you which disk the file system with the problem is resident on. For example,

No more space on disk hddma/8

means that you have no free space left on the first 16 megabyte hard disk in your system. The "No more space on disk" part of the message is clear enough. Translating "hddma/8" into the device name /dev/m16a is a little obscure at first.

When you get a file system warning message, the disks will be named according to the controller board used for the disk.

There are three different controller boards used:

djdma    floppy disk controller
hdca     8" and 14" hard disk controller
hddma    5 1/4" hard disk controller

These names will make up the beginning of the disk name. The slash and number following the name refer to something known as the minor device number. The minor device number is used to select which type and number of drive. Complete information on the major and minor device numbers appears in the Device section of the binder, for those of you who are interested. This is also part of the online documentation, so you can find out what hddma/4 means in absolute terms by typing "man hddma".

For the rest of us, the following tables provide the information needed to interpret which drive the message refers to, for systems with multiple drives using the same controller.

If you only have one floppy disk drive, a disk designated as djdma/140 will always be the floppy disk drive, regardless of the number that follows the slash after the controller name. The same is true for hard disks; that is, if you have a single 10 megabyte hard disk, disk error messages that refer to hddma/4 mean your only hard disk, of course.

There are three tables that you might need. The table for the HDCA controller is very simple: since the HDCA controller can detect the drive capacity, the number after the slash (hdca/1, for example) simply identifies which of the four hard disks the message refers to.

HDCA TABLE

0   hda
1   hdb
2   hdc
3   hdd

The table for the hddma controller is a bit larger than the hdca table because different capacity 5 1/4" drives are indistinguishable to the controller, and must be specified by minor device numbers. The following table reproduces the minor device numbers that you will see as part of file system warning messages:

HDDMA TABLE

5 megabytes    10 megabytes    16 megabytes
0   m5a        4   m10a         8   m16a
1   m5b        5   m10b         9   m16b
2   m5c        6   m10c        10   m16c
3   m5d        7   m10d        11   m16d

This table includes the device name following the minor device number (that is, minor number 4 represents /dev/m10a). By now, you should be a pro at deciphering things like hddma/9, which is the second 16 megabyte miniwinchester, /dev/m16b, right?
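The HDDMA table follows a regular pattern: each capacity owns a block of four minor numbers, and the position inside the block selects the drive letter. The following is a minimal sketch of that pattern in Python, which is not part of Micronix. The function names are hypothetical; the /dev/ prefix on the hdca names is an assumption made by analogy with /dev/m10a; and djdma names are deliberately rejected, because their minor numbers need a separate table.

```python
HDDMA_CAPACITIES = (5, 10, 16)   # megabytes; one block of four minor numbers each
DRIVE_LETTERS = "abcd"           # first through fourth drive


def hddma_device(minor):
    """Translate an hddma minor device number (0-11) into a device name."""
    if not 0 <= minor <= 11:
        raise ValueError("hddma minor numbers in the table run from 0 to 11")
    capacity = HDDMA_CAPACITIES[minor // 4]   # which block of four
    letter = DRIVE_LETTERS[minor % 4]         # position within the block
    return "/dev/m%d%s" % (capacity, letter)


def hdca_device(minor):
    """Translate an hdca minor device number (0-3) into a device name."""
    if not 0 <= minor <= 3:
        raise ValueError("hdca minor numbers in the table run from 0 to 3")
    return "/dev/hd" + DRIVE_LETTERS[minor]


def disk_from_message(message):
    """Pick the controller/minor name off the end of a warning and translate it."""
    controller, _, minor = message.split()[-1].partition("/")
    if controller == "hddma":
        return hddma_device(int(minor))
    if controller == "hdca":
        return hdca_device(int(minor))
    raise ValueError("no table in this sketch for controller %r" % controller)


print(disk_from_message("No more space on disk hddma/8"))
print(hddma_device(9))
```

The integer division and remainder by four simply restate the table; if your system's device names differ from the ones listed, the table in the manual remains the reference.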

Now, let's look at the DJDMA table, which is the most difficult one to use. The numbers that will appear after the slash come from an 8 bit byte where all of the bits are used. This tends to hide the minor device number, which is what we are interested in. Of course, if you have a single floppy drive system, you don't need this table. Any error message that refers to djdma/"anything" means your only floppy drive has a problem.

drive It example. Suppose the message

No more space on disk djdma/140

Let's proceed to understanding the file system warning messages and what to do about them.

The files that were being written to disk may have been partially copied when the disk became full. Uncompleted transfers will complete if you are able to free up disk space without stopping any ongoing processes. You can, for example, log in on another port (as outlined in Step 1 of Runaway Terminals, in Desperate Measures) to remove, or back up and remove, files.

o Out of range block number - A block address is discovered in a file (inode) or the free list that cannot possibly exist in the file system on disk. This is one of the things that
