Grammar Notations - RSDL Language Definition

Part 3: RSDL Language Definition

3.3 Grammar Notations

The following presentation forms are used to describe the syntax of RSDL.

3.3.1 Abstract Syntax

A definition in the abstract syntax can be regarded as a named composite object (a tree) defining a set of sub-components.

For example, the abstract syntax for a channel definition is

Channel-path :: Originating-gate Destination-gate Signal-identifier-set

which defines the domain for the composite object (tree) named Channel-path. This object consists of three sub-components, which in turn might be trees.

The abstract syntax definition

Agent-identifier = Identifier

expresses that an Agent-identifier is an Identifier and therefore cannot syntactically be distinguished from other identifiers.
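As an informal illustration (not part of the RSDL definition), the two definition forms above can be mirrored in Python: a "::" definition becomes a record type with one field per sub-component, and an "=" definition becomes a plain alias. Representing gates and identifiers as strings is an assumption made only for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumption: identifiers and gates are plain strings in this sketch;
# their real domains are defined elsewhere in the abstract syntax.
Identifier = str

# "Agent-identifier = Identifier": an alias, not a new kind of object.
AgentIdentifier = Identifier


@dataclass(frozen=True)
class ChannelPath:
    """Channel-path :: Originating-gate Destination-gate Signal-identifier-set"""
    originating_gate: Identifier
    destination_gate: Identifier
    signal_identifier_set: FrozenSet[Identifier]


# A tree with three sub-components.
path = ChannelPath("g1", "g2", frozenset({"sig_a", "sig_b"}))
```

Because AgentIdentifier is only an alias, nothing in the type distinguishes it from any other Identifier, which matches the statement above.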

An object might also belong to an elementary (non-composite) domain. In the context of RSDL, these are:

a) Integer objects

Example

Number-of-instances :: Initial-number [Maximum-number]

Initial-number = Int

Maximum-number = Int

Number-of-instances denotes a composite domain containing one mandatory integer (Int) value and one optional integer ([Int]) denoting the initial number and the optional maximum number of instances.

b) Token objects

Token denotes the domain of tokens. This domain can be considered to consist of a potentially infinite set of distinct atomic objects for which no representation is required.

Example

Name :: Token

A name consists of an atomic object such that any Name can be distinguished from any other name.
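The two elementary domains can be sketched in Python as follows. This is only an illustration under stated assumptions: Int is mapped to Python's int, the optional component to Optional, and Token to a class whose instances carry no data and are equal only to themselves. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumberOfInstances:
    """Number-of-instances :: Initial-number [Maximum-number]"""
    initial_number: int                    # Initial-number = Int (mandatory)
    maximum_number: Optional[int] = None   # [Maximum-number] (optional)


class Token:
    """An atomic object: it has identity but no required representation."""
    __slots__ = ()


@dataclass(frozen=True)
class Name:
    """Name :: Token"""
    token: Token


unbounded = NumberOfInstances(1)       # maximum number omitted
bounded = NumberOfInstances(1, 5)      # maximum number present
n1, n2 = Name(Token()), Name(Token())  # two names built from distinct tokens
```

Since Token instances compare by identity, two Name objects are equal only when they wrap the same token, so any Name can be distinguished from any other.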

The following concrete syntax operators (constructors) in BNF (see below) have the same use in the abstract syntax: “*” for possibly empty list, “+” for non-empty list, “|” for alternative, and “[” “]” for optional.

Parentheses are used for grouping of domains that are logically related.

Finally, the abstract syntax uses another postfix operator “-set” yielding a set (unordered collection of distinct objects).

Example

State-transition-graph :: Start-node State-node-set Free-action-set

A State-transition-graph consists of a Start-node, a set of State-nodes and a set of Free-actions.
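A minimal sketch of what the "-set" operator implies, assuming nodes and actions are represented by strings (an assumption of this example only): a frozenset is unordered and holds each object at most once, so two graphs built from the same elements in a different order, or with duplicates, are the same object value.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class StateTransitionGraph:
    """State-transition-graph :: Start-node State-node-set Free-action-set"""
    start_node: str
    state_node_set: FrozenSet[str]
    free_action_set: FrozenSet[str]


g1 = StateTransitionGraph("start", frozenset(["idle", "busy"]), frozenset())
# Different order and a repeated element: still the same set.
g2 = StateTransitionGraph("start", frozenset(["busy", "idle", "idle"]), frozenset())
```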

3.3.2 Concrete Syntax

In the Backus-Naur Form for lexical rules the terminals are <space> and the ASCII printed characters. In the Backus-Naur Form for non-lexical rules, a terminal symbol is one of the lexical units defined in Section 3.4 (<name>, <special>, <composite special> or <keyword>). In non-lexical rules, a terminal can be represented by one of the following:

a) a keyword (such as state);

b) the character for the lexical unit, if it consists of a single character (such as “=”);

c) the lexical unit name (such as <name>);

d) the name of a <composite special> lexical unit (such as <implies sign>).

To avoid confusion with the BNF grammar, the lexical unit names <asterisk> and <plus sign> are always used rather than the equivalent characters. Note that the special terminal <name> may also have semantics stressed as defined below.

The angle brackets and enclosed word(s) are either a non-terminal symbol or one of the lexical units. Syntactic categories are the non-terminals indicated by one or more words enclosed between angle brackets. For each non-terminal symbol, a production rule is given in concrete grammar. For example,

<block reference> ::=

block <block name> referenced <end>

A production rule for a non-terminal symbol consists of the non-terminal symbol on the left-hand side of the symbol “::=” and one or more constructs, consisting of non-terminal and/or terminal symbols, on the right-hand side. For example, <block reference> and <end> in the example above are non-terminals; block, <block name> and referenced are terminal symbols.

Sometimes the symbol includes an underlined part. This underlined part stresses a semantic aspect of that symbol. For example, <block name> is syntactically identical to <name>, but semantically it requires the name to be a block name.

At the right-hand side of the “::=” symbol several alternative productions for the non-terminal can be given, separated by vertical bars (“|” ). For example,

<definition> ::=

<agent definition> | <agent type definition>

expresses that a <definition> is an <agent definition> or an <agent type definition>.

Syntactic elements may be grouped together by using curly brackets (“{” and “}”), similar to the parentheses in the abstract syntax above. A curly bracketed group may contain one or more vertical bars, indicating alternative syntactic elements. For example,

<state machine graph> ::=

<start> { <state> | <free action> }*

Repetition of syntactic elements or curly bracketed groups is indicated by an asterisk (“*”) or plus sign (“+”). An asterisk indicates that the group is optional and can be further repeated any number of times; a plus sign indicates that the group must be present and can be further repeated any number of times. The example above expresses that <state machine graph> contains a <start> followed by any number of <state> or <free action>.
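The meaning of the two repetition operators can be sketched as simple checks over a sequence of element kinds. This is an illustrative reading of the notation, not a parser for RSDL; the function names and the use of strings for element kinds are assumptions.

```python
from typing import Sequence


def is_state_machine_graph(kinds: Sequence[str]) -> bool:
    """Check <start> { <state> | <free action> }* over a sequence of kinds."""
    if not kinds or kinds[0] != "start":
        return False  # <start> is mandatory and must come first
    # The starred group may occur zero or more times, in any mix.
    return all(k in ("state", "free action") for k in kinds[1:])


def one_or_more(kinds: Sequence[str], kind: str) -> bool:
    """Check { kind }+ : at least one occurrence and nothing else."""
    return len(kinds) >= 1 and all(k == kind for k in kinds)
```

For instance, is_state_machine_graph(["start"]) holds because the starred group may be absent, whereas one_or_more([], "state") does not, because a plus sign requires at least one occurrence.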

If syntactic elements are grouped using square brackets (“[” and “]”), then the group is optional. For example,

expresses that an <identifier> may, but need not, contain <qualifier>.

3.4 Lexical Rules

Lexical rules define lexical units. Lexical units are the terminal symbols of the concrete grammar.

| <note>

| <composite special>

| <special>

| <keyword>

<name> ::= <underline>* <word> {<underline>+ <word>}* <underline>*

| {<decimal digit>}+ [ {<full stop>} <decimal digit>+ ]

<word> ::= {<alphanumeric>}+

<uppercase letter> ::=

A | B | C | D | E | F | G | H | I | J | K | L | M

| N | O | P | Q | R | S | T | U | V | W | X | Y | Z

<lowercase letter> ::=

a | b | c | d | e | f | g | h | i | j | k | l | m

| n | o | p | q | r | s | t | u | v | w | x | y | z

<decimal digit> ::=

0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
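As a rough sketch, the <name> production can be transcribed into a regular expression. It assumes that <alphanumeric>, which is not defined in the rules shown here, stands for an uppercase letter, a lowercase letter or a decimal digit; the helper name is_name is hypothetical.

```python
import re

# Assumption: <alphanumeric> is a letter (either case) or a decimal digit.
ALNUM = r"[A-Za-z0-9]"

NAME = re.compile(
    # <underline>* <word> {<underline>+ <word>}* <underline>*
    rf"_*{ALNUM}+(?:_+{ALNUM}+)*_*"
    # | {<decimal digit>}+ [ <full stop> <decimal digit>+ ]
    rf"|[0-9]+(?:\.[0-9]+)?"
)


def is_name(text: str) -> bool:
    """Return True if the whole text is a <name> under the assumptions above."""
    return NAME.fullmatch(text) is not None
```

Under this reading, wait_for_ack, _x1_ and 3.14 are accepted, while a lone underline and a.b are rejected.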

| <other special>

| <asterisk>+ <not asterisk or solidus>

| <solidus>

| <apostrophe> }*

| <implies sign>

| <is assigned sign>

| <less than or equals sign>

| <not equals sign>

| <qualifier begin sign>

| <qualifier end sign>

| <plus sign> | <comma> | <hyphen>

| <colon> | <semicolon>

| <less than sign> | <equals sign> | <greater than sign>

| <quotation mark> | <dollar sign> | <percent sign>

| <ampersand> | <question mark> | <commercial at>

| <reverse solidus> | <circumflex accent> | <underline>

| <grave accent> | <vertical line> | <tilde>

| <left square bracket> | <right square bracket>

| <left curly bracket> | <right curly bracket>

<exclamation mark> ::= !
<quotation mark> ::= "
<left parenthesis> ::= (
<right parenthesis> ::= )
<asterisk> ::= *
<plus sign> ::= +
<comma> ::= ,
<hyphen> ::= -
<full stop> ::= .
<solidus> ::= /
<colon> ::= :
<semicolon> ::= ;
<less than sign> ::= <
<equals sign> ::= =
<greater than sign> ::= >
<left square bracket> ::= [
<right square bracket> ::= ]
<left curly bracket> ::= {
<right curly bracket> ::= }
<number sign> ::= #
<dollar sign> ::= $
<percent sign> ::= %
<ampersand> ::= &
<apostrophe> ::= '
<question mark> ::= ?
<commercial at> ::= @

The characters in <lexical unit>s and in <note>s as well as the character <space> and control characters are defined by the International Reference Version of the International Reference Alphabet (Recommendation T.50), which is basically the same as ASCII. The lexical unit <space> represents the T.50 SPACE character (acronym SP), which (for obvious reasons) cannot be shown.

When an <underline> character is followed by one or more <space>s or control characters, all of these characters (including the <underline>) are ignored, e.g. A_ B denotes the same <name> as AB. This use of <underline> allows <lexical unit>s to be split over more than one line. This rule is applied before any other lexical rule.
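The underline rule amounts to a preprocessing pass over the input text. The sketch below assumes that "control characters" means the T.50 code points 0x00 to 0x1F plus DEL (0x7F); the function name is hypothetical.

```python
import re

# An <underline> followed by one or more spaces (0x20) or control characters
# (assumed here to be 0x00-0x1F and 0x7F) is dropped together with them.
_CONTINUATION = re.compile(r"_[\x00-\x20\x7f]+")


def join_continuations(text: str) -> str:
    """Apply the underline rule before any other lexical processing."""
    return _CONTINUATION.sub("", text)
```

With this pass, join_continuations("A_ B") gives "AB", and a name split as "wait_" at the end of one line and "for" at the start of the next is rejoined into a single lexical unit. An underline followed directly by a letter or digit is left untouched.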

A (non-space) control character may appear where a <space> may appear, and has the same meaning as a <space>.

Any number of <space>s may be inserted before or after any <lexical unit>. Inserted <space>s or <note>s have no syntactic relevance, but sometimes a <space> or <note> is needed to separate one <lexical unit> from another.

In all <lexical unit>s uppercase <letter>s and lowercase <letter>s are distinct. Therefore AB, aB, Ab and ab represent four different <word>s. A <keyword> with all uppercase letters has the same use as the (lowercase) <keyword> with the same spelling (ignoring case), but a mixed case letter sequence with the same spelling as a <keyword> represents a <word>.

For conciseness within the grammar, a <keyword> as a terminal denotes both the uppercase and the lowercase variant with the same spelling. For example, the concrete syntax terminal

endblock

represents the lexical alternatives

{ endblock | ENDBLOCK }

However, the two alternatives are not considered to be distinct within the concrete grammar.
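The case rule for keywords can be sketched as a small classification function. The keyword table below is deliberately tiny and hypothetical (it lists only keywords that appear in the examples of this chapter); the full RSDL keyword list is not reproduced here.

```python
# Hypothetical, incomplete keyword table used only for this illustration.
KEYWORDS = frozenset({"block", "endblock", "referenced", "state"})


def classify(text: str) -> str:
    """Return "keyword" or "word" following the case rule described above."""
    # Only the all-lowercase and all-uppercase spellings count as the keyword;
    # any mixed-case spelling is an ordinary <word>.
    if text.lower() in KEYWORDS and (text.islower() or text.isupper()):
        return "keyword"
    return "word"
```

Here classify("endblock") and classify("ENDBLOCK") both yield "keyword", while classify("EndBlock") yields "word".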

A <lexical unit> is terminated by the first character which cannot be part of the <lexical unit> according to the syntax specified above. If a <lexical unit> can be both a <name> and a <keyword>, then it is a <keyword>.

For similarity with the SDL grammar the following production is introduced for RSDL.

From the document: Formal Semantics for SDL (pages 59-63)