
Using Regular Expressions

In the document The Organization of this Manual (pages 131-140)

The DM search and substitute operations (described in the next several sections) allow you to use a special notation, called regular expressions, to specify patterns for search and substitute text strings.

You can also use regular expressions with the Shell commands ED (EDIT), EDSTR (EDIT_STREAM), FPAT (FIND_PATTERN), FPATB (FIND_PATTERN_BLOCK), and CHPAT (CHANGE_PATTERN). See the DOMAIN System Command Reference for descriptions of these Shell commands.

Regular expressions permit you to concisely describe text patterns without necessarily knowing their exact contents or format. You can create expressions to describe patterns in particular positions on a line, patterns that always contain certain characters and at times may include others, or patterns that match text of indefinite length.

Table 5-5 provides a list of the characters used to construct regular expressions and a brief description of their functions.

CAUTION: The special characters described in Table 5-5 apply only to regular expression operations. Some of these characters also have meanings (often radically different) in Shell commands and other software products. If you want to use a regular expression as a part of one of those Shell commands or products, be sure to enclose the expression in quotation marks so that it will not be misinterpreted.

Table 5-5. Characters Used in Regular Expressions

ASCII Character

Any standard ASCII character (except those listed in this table) matches one and only one occurrence of that character. By default, the case of the characters is insignificant. Use the SC (SET_CASE) command to control case significance (see the "Setting Case Comparison" section). The following examples are all valid expressions:

SAM

... not the last character in the expression, it simply matches the dollar sign character. Use this special feature to mark the ...


Question Mark (?)

A question mark (?) matches any single character except a NEWLINE character. The only exception to this is when the ? appears inside a character class (see the "[string]" description in this table), in which case it represents the question mark character itself.

... or more occurrences of that expression. The only exception to this is when the * appears inside a character class (see the "[string]" description in this table), in which case it represents the asterisk character itself. Matching zero or more occurrences of some pattern is called a closure. An ...

... "[string]" description in this table). Strings like Mary would not match, since Mary does not begin with two uppercase characters.


[string]

A string of characters enclosed in square brackets [string] is called a character class. This pattern matches any one character in the string but no others. Note that the other regular expression characters % $ ? * lose their special meaning inside square brackets, and simply represent themselves. For example:

[sam] matches the single character s, a, or m. (If you want to match the word sam, omit the square brackets.)

[~string]

A string enclosed in square brackets whose first character is a tilde [~string] matches any single character that does not appear in the string. If a tilde (~) is not the first character in the string, it simply matches the tilde character itself. For example:

[~sam] matches any single character except s, a, or m.
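The two class forms described so far can be sketched in a few lines of Python. This is only an illustration of the matching rule, not DM code: the function name matches_class is hypothetical, the sketch handles literal characters only (no ranges and no @ escapes), and it compares case-sensitively, which corresponds to a search with case comparison turned on.

```python
def matches_class(dm_class, ch):
    """Return True if the single character ch is matched by a DM-style class.

    Assumption of this sketch: dm_class is written as [string] or [~string]
    and contains only literal characters (no ranges, no @ escapes).
    """
    body = dm_class[1:-1]          # strip the enclosing square brackets
    if body.startswith("~"):       # leading tilde negates the class
        return ch not in body[1:]
    return ch in body              # otherwise any listed character matches


print(matches_class("[sam]", "a"))    # a is listed in the class
print(matches_class("[~sam]", "a"))   # a is excluded by the negated class
print(matches_class("[?*]", "?"))     # ? is an ordinary character inside a class
```

Because the body is treated as a plain list of characters, ? and * need no special handling inside the brackets, which mirrors the rule that they lose their special meaning there.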

[letter-letter] or [digit-digit]

Within a character class, you can specify any one of a range of letters or digits by indicating the beginning and ending characters separated by a hyphen (-). For example:

[A-Z] matches any single uppercase letter in the range A through Z.

[a-z] matches any single lowercase letter in the range a through z.

[0-9] matches any single digit in the range 0 through 9.


Remember, though, that the actual search ignores case unless you have used the SC command to specify a case-sensitive search (see the "Setting Case Comparison" section).

The range can be a subset of the letters or digits. However, the first and last characters in the range must be of the same type: uppercase letter, lowercase letter, or digit. For example, [a-n] and [3-8] are valid expressions. [A-9] is invalid.

Note that a hyphen (-) has a special meaning inside square brackets. If you want to include the literal hyphen character in the class, it must be either the first or last character in the class (so that it does not appear to separate two range-marking characters), or you can precede the hyphen with the escape character @ (see the @ description in this table).

The right bracket ( ] ) also has special meaning inside a character class; it closes the class descriptor list. If you want to include the right bracket in the class, precede it with the escape character @ (see the @ description in this table). For example:

[a-d] matches any single occurrence of a, b, c, or d.

%[A-Z] matches any uppercase letter that is also the first character on the line.

5-[1-9][0-9]* matches any of the page numbers in this chapter.

[0A-Z] matches any string containing a zero or an uppercase letter.

[~a-z0-9] matches any uppercase letter or punctuation mark (i.e., no lowercase letter or digit).
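For readers more familiar with other regular expression dialects, the examples above can be tried out in Python. The mapping below is an assumption of this sketch rather than anything defined by the DM: Python writes a negated class as [^...] where the DM uses [~...], and anchors to the start of a line with ^ where the DM uses %. The names PYTHON_EQUIVALENTS and first_match are hypothetical, and matching here is case-sensitive, as in a DM search with case comparison turned on.

```python
import re

# DM pattern -> assumed Python equivalent (an illustration, not a DM feature).
PYTHON_EQUIVALENTS = {
    "[a-d]": r"[a-d]",
    "%[A-Z]": r"^[A-Z]",               # % anchors to line start; Python uses ^
    "5-[1-9][0-9]*": r"5-[1-9][0-9]*",
    "[0A-Z]": r"[0A-Z]",
    "[~a-z0-9]": r"[^a-z0-9]",         # leading ~ negates; Python uses ^
}


def first_match(dm_pattern, text):
    """Return the first substring of text matched by the pattern, or None."""
    match = re.search(PYTHON_EQUIVALENTS[dm_pattern], text)
    return match.group(0) if match else None


print(first_match("%[A-Z]", "Mary"))                  # uppercase letter at line start
print(first_match("%[A-Z]", "mary"))                  # no uppercase letter at line start
print(first_match("5-[1-9][0-9]*", "see page 5-42"))  # a page number in this chapter
```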


At Sign (@)

The at sign (@) is an escape character. Characters preceded by the @ character have special meaning in regular expressions, as indicated in the following list:

@n matches a NEWLINE character.

@t matches a tab character. Note, however, that the <TAB> key does not insert a tab character. It simply moves the cursor to the display's next tab stop. In a regular expression, @t matches only tab characters that have been inserted with @t.

@f matches a form feed character.

In addition, you can use the escape character inside a character class to specify literal occurrences of a hyphen (-) or a right bracket (]). You may also use the @ character to specify a literal occurrence of the other special characters used in regular expressions: % $ ? * @. For example:

[A-Z@-@]] matches any uppercase letter, a hyphen, or a right bracket.

@?@* matches a question mark followed by an asterisk, rather than zero or more occurrences of any character (?*).

{expr}

You can "tag" parts of a regular expression to help rearrange pieces of a matched string. The DM remembers a text pattern surrounded by braces {expr} so that you can refer to it with @n, where n is a single digit referring to the string remembered by the nth pair of braces. For example:


S/{???}{?*}/@2@1/

S is the DM command for substituting strings of text (see the "Substituting All Occurrences of a String" section). This example of the S command moves a three-character sequence from the beginning of a line to the end of the line. ??? matches the first three characters of the line, and ?* matches the rest of the line. The @2 expression refers to the string ?* inside the second pair of braces, and @1 refers to the string ??? inside the first pair of braces. Another example:

SO/{?}{?}/@2@1/

SO is also a DM command for substituting strings of text, but it only substitutes the first occurrence of the first pattern on a line (see the "Substituting the First Occurrence of a String" section). This example of the SO command transposes two characters beginning with the one under the cursor. This can be a handy key definition if you often type ie for ei, etc.
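The effect of the two tagged substitutions can be reproduced with Python's re module, which may help in checking what a tagged expression will do before defining a key for it. This is a sketch under stated assumptions, not DM behavior: Python uses parentheses where the DM uses braces, a period where the DM uses ?, and \1 and \2 where the DM uses @1 and @2. The function names, and the use of a character index to stand in for the cursor, are hypothetical.

```python
import re


def move_first_three_to_end(line):
    """Rough Python analogue of S/{???}{?*}/@2@1/ applied to one line."""
    # (...) tags the first three characters, (.*) tags the rest of the line;
    # the replacement emits the second tagged string, then the first.
    return re.sub(r"(...)(.*)", r"\2\1", line, count=1)


def transpose_at(text, cursor):
    """Rough Python analogue of SO/{?}{?}/@2@1/ starting at index cursor."""
    # Only the first pair of characters at or after the cursor is swapped.
    return text[:cursor] + re.sub(r"(.)(.)", r"\2\1", text[cursor:], count=1)


print(move_first_three_to_end("abcdef"))
print(transpose_at("recieve", 3))
```

As with the DM's ?, the period in Python does not match a newline, so both functions operate within a single line.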

The search operations shown in Table 5-6 locate strings of characters in a pad. You describe the string pattern using regular expressions (see the previous section).

Table 5-6. Commands for Searching for Text

Task                                   DM Command         Predefined Key

Search forward for string              /string/           None

Search backward for string             \string\           None

Repeat last forward search             //                 CTRL/R

Repeat last backward search            \\                 CTRL/U

Cancel search or any action            ABRT               CTRL/X
involving the ECHO command

Set case comparison for search         SC [-ON] [-OFF]    None

To search forward from the current cursor position, enclose the regular expression in slashes as follows:

/string/

To search backward from the current cursor position, enclose the regular expression in backslashes as follows:

\string\

A search operation moves the cursor to the first character in the pattern specified by string. If necessary, the pad moves under the window to display the matching string. If the search fails, the cursor position does not change, and the DM displays the message "No match" in its output window.

Searches do not wrap around the end or beginning of the file. Therefore, to search an entire pad, position the cursor at the beginning of ... operation. Note (as described in the "Defining Points and Regions" section in Chapter 3) that one way to specify a point in a pad is by matching a regular expression. This means that the search operation is really a pointing action followed by a null command. Consequently, you should not think of search operations as operating on a text range, but rather as searching from the initial cursor position to the end (or beginning) of the file in order to properly position the cursor.
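The no-wraparound behavior can be modeled with a short Python sketch. Everything here is an assumption made for illustration: the pad is represented as a list of lines, the cursor as a (row, column) pair, the pattern is written in Python's regular expression syntax rather than the DM's, and the function names are hypothetical. Whether the DM begins matching at the cursor itself or just past it is not stated above; this sketch starts at the cursor for a forward search and strictly before it for a backward search.

```python
import re


def search_forward(lines, cursor, pattern):
    """Return (row, col) of the first match at or after cursor, else None."""
    row, col = cursor
    rx = re.compile(pattern, re.IGNORECASE)  # case is ignored by default
    for r in range(row, len(lines)):
        start = col if r == row else 0
        match = rx.search(lines[r], start)
        if match:
            return (r, match.start())
    return None  # end of pad reached: no wraparound, the cursor stays put


def search_backward(lines, cursor, pattern):
    """Return (row, col) of the last match starting before cursor, else None."""
    row, col = cursor
    rx = re.compile(pattern, re.IGNORECASE)
    for r in range(row, -1, -1):
        limit = col if r == row else len(lines[r]) + 1
        starts = [m.start() for m in rx.finditer(lines[r]) if m.start() < limit]
        if starts:
            return (r, starts[-1])
    return None  # beginning of pad reached: no wraparound


pad = ["alpha beta", "gamma", "beta gamma"]
print(search_forward(pad, (0, 0), "beta"))
print(search_forward(pad, (1, 0), "alpha"))  # alpha lies before the cursor
```

A None result corresponds to the "No match" case: the caller leaves the cursor where it was.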

If the DM scans more than 100 lines in a search operation, it displays a "Searching for /string/ ..." message in its output window. Then it ... (see the "Cancelling a Search Operation" section).

• Use the keyboard; it works as it normally does. You can type into any pad except the one being searched. You can specify any DM command except another search or substitute command. The DM executes these commands when it completes the search. You can type input to another Shell or program (if it was previously waiting for input). The process executes these commands when the DM finishes the search.
