
1.2 Linguistic Deduction Algorithms

1.2.6 Constraints in Linguistic Deduction

Almost all contemporary grammatical formalisms can be characterised as constraint-based. The two interesting questions are what types of constraints they use, and how and when constraints are processed. This section provides a brief overview of these issues and gives pointers to the relevant literature.

1.2.6.1 Types of Constraints

Constraint Logic Programming was introduced by Jaffar and Lassez [Jaffar and Lassez, 1987], and has been generalised in Höhfeld and Smolka’s constraint logic programming scheme [Höhfeld and Smolka, 1988], which allows various constraint-based grammar formalisms to be treated in a unified framework. The framework has been applied to natural language grammar formalisms in [Smolka, 1992], [Frisch, 1993] and [Crouch, 1994].

The constraints allowed in Prolog, and in grammar formalisms based on Prolog such as DCG, are equality constraints between first-order terms. Merging of constraints is performed by unification of first-order terms, and the unifying substitution is the merged constraint. Constraint entailment in the case of Prolog is subsumption of terms.
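The merging of equality constraints by unification can be illustrated with a minimal sketch. The term representation is an assumption made for this example only: variables are strings beginning with an uppercase letter, compound terms are tuples of the form (functor, arg1, ...), and everything else is a constant. The helper names (`is_var`, `walk`, `unify`, `resolve`) are hypothetical, and the occurs check is omitted, so `resolve` would not terminate on a cyclic binding.

```python
def is_var(t):
    """Assumed convention: a variable is a string starting with uppercase."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def unify(a, b, subst=None):
    """Return a substitution (dict) that unifies a and b, or None on failure.

    The returned substitution plays the role of the merged constraint.
    No occurs check is performed.
    """
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and len(x) == len(y) and x[0] == y[0]):
            # Same functor and arity: unify the arguments pairwise.
            stack.extend(zip(x[1:], y[1:]))
        else:
            return None  # clash between distinct constants or functors
    return subst


def resolve(t, subst):
    """Apply a substitution to a term, replacing bound variables."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
    return t


# Two partial descriptions of a noun phrase are merged into one.
s = unify(("np", "Agr", "sg"), ("np", ("agr", "third"), "Num"))
merged = resolve(("np", "Agr", "Num"), s)
```

Threading the substitution returned by one call into the next call accumulates constraints across inference steps; a result of `None` corresponds to failure of the proof branch.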

In most work on constraint-based grammars, first-order terms as data structures have been replaced by feature structures. Early developments in this direction are Functional Unification Grammar [Kay, 1979; Kay, 1984; Kay, 1985], Lexical-Functional Grammar [Kaplan and Bresnan, 1982], the PATR-II formalism [Shieber et al., 1983], and the work on the formal foundations [Kasper and Rounds, 1986; Rounds and Kasper, 1986; Johnson, 1988; Backofen and Smolka, 1995]. Current systems make use of sorted feature structures, whose formal foundations have been worked out by Smolka [Smolka, 1988] and Carpenter [Carpenter, 1992]. Examples are Head-Driven Phrase Structure Grammar (HPSG) [Pollard and Sag, 1987; Pollard and Sag, 1994], most current grammar formalisms (STUF [Bouma et al., 1988; Dörre and Seiffert, 1991], CUF [Dörre and Eisele, 1991; Dörre and Dorna, 1993], ALE [Carpenter, 1993a; Carpenter, 1993c; Carpenter and Penn, 1994], ALEP [Alshawi et al., 1991; BIM-SEMA, 1993; Meylemans, 1994], ProFIT [Erbach, 1995], TDL [Krieger and Schäfer, 1994], TFS [Zajac, 1992] and others), and CLP languages such as LIFE [Aït-Kaci, 1991] or Oz [Smolka et al., 1995].
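To make the contrast with first-order terms concrete, the following sketch unifies two feature structures. It is a deliberate simplification and not the algorithm of any of the systems cited: feature structures are assumed to be nested Python dicts with atomic string values, and sorts, reentrancy (structure sharing) and cyclic structures are all left out. The function name `unify_fs` is hypothetical.

```python
def unify_fs(a, b):
    """Unify two feature structures given as nested dicts.

    Returns a new merged structure, or None if two atomic values clash
    or an atomic value meets a complex one. Neither input is modified.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify_fs(result[feature], value)
                if merged is None:
                    return None
                result[feature] = merged
            else:
                # Unlike a first-order term, a feature structure has no
                # fixed arity: new features can simply be added.
                result[feature] = value
        return result
    return a if a == b else None


np_description = {"cat": "np", "agr": {"num": "sg"}}
subject_requirement = {"agr": {"per": "3"}}
combined = unify_fs(np_description, subject_requirement)
```

The sketch shows the main practical difference from term unification: arguments are identified by feature name rather than by position, so partial descriptions with different sets of features can be combined.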

Cyclic terms were first introduced in Prolog II [Colmerauer, 1982] as rational trees, and are allowed in most current grammar formalisms and logic programming languages. Prolog II also introduced inequality constraints (dif/2), which are allowed in most current formalisms.

A wide variety of other types of constraints for different kinds of applications (e.g., linear equations) have been introduced in various CLP languages. A number of these are important for linguistic applications, most notably:

Finite Domains: Finite domains [van Hentenryck, 1989] were introduced in the CLP language CHIP. A finite domain allows simple disjunctions to be handled without the creation of choice points. A finite domain variable can take a fixed finite set of possible values. When two finite domain variables are unified, the result is the intersection of their possible values; unification fails if the intersection is empty.

Set Descriptions: Set descriptions and set constraints are widely used in linguistic descriptions, but have only recently been formalised and introduced in grammatical formalisms by Pollard and Moshier [Pollard and Moshier, 1990; Moshier and Pollard, 1994], by Carpenter [Carpenter, 1993b] and by Manandhar [Manandhar, 1993; Manandhar, 1994].

Linear Precedence Constraints: Linear precedence constraints have various uses in linguistic descriptions. Their most obvious use is the modelling of word order phenomena. Other uses are in natural language semantics in the description of temporal precedence relations and of underspecified quantifier scope. The logical foundations and a constraint solving algorithm for linear precedence constraints have been worked out by Manandhar [Manandhar, 1995].

Guarded Constraints: Guarded constraints are constraints whose execution is delayed until the precondition for their applicability is satisfied. Guarded constraints can be used to attach goals to variables, and have been used in NLP in the implementation of HPSG principles and to integrate morphological constraints into an HPSG grammar [Matiasek, 1994a; Trost and Matiasek, 1994]. They are also used in the CLG(n) Constraint Logic Grammar framework [Balari et al., 1990; Damas et al., 1991; Damas and Varile, 1992].

Tree Constraints: Tree constraints extend linear precedence constraints by adding constraints on dominance, and permit the underspecified representation of trees through tree descriptions. A complete first-order axiomatisation for tree descriptions has been worked out in [Backofen et al., 1995].

Boolean Logic: Prolog III [Colmerauer, 1987] introduced a solver for boolean constraints. This is put to linguistic use by Lehner [Lehner, 1993; Lehner, 1994], who uses boolean constraints to achieve the same effect as finite domains.
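The behaviour of finite domain variables described in the list above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the CHIP implementation: a domain is represented as a plain Python set, the function name `unify_domains` is hypothetical, and the case values are illustrative only.

```python
def unify_domains(dom_x, dom_y):
    """Unify two finite domain variables, each given as a set of values.

    Returns the intersection as the new shared domain, or None when the
    intersection is empty (unification fails). No choice point is
    created: the remaining alternatives stay packed inside one set.
    """
    common = frozenset(dom_x) & frozenset(dom_y)
    return common or None


# Illustrative values only: a determiner that is ambiguous between three
# cases meets a verb whose object slot allows two cases.
determiner_case = {"nom", "dat", "gen"}
object_case = {"acc", "dat"}

shared = unify_domains(determiner_case, object_case)  # narrowed domain
clash = unify_domains({"nom"}, {"acc"})               # empty intersection
```

Without finite domains, the same ambiguity would typically be expressed as three alternative lexical entries or clauses, each of which would have to be explored by backtracking.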

In the following, we will abstract away from the particular types of constraints, and talk about constraints quite generally, whether they are just equality constraints between first-order terms, or a more powerful constraint language.

1.2.6.2 Order of Constraint Checking

The previous discussion has only been concerned with the types of constraints for definite clauses, but left open the question at what point in a proof the constraints are checked, or assumed that they are checked immediately at every inference step, as in Prolog and its successors. It has also left open the question in which order different constraints should be checked.

Checking constraints immediately has the advantage of detecting failure as soon as possible. The drawback is that constraint checking may be computationally expensive, and redundant if the particular branch of the search fails anyway due to constraints that are cheaper to check.

A case study for this has been done in the framework of LFG, where constraints are divided into phrasal (c-structure) and functional (f-structure) constraints [Maxwell and Kaplan, 1993]. Experience has shown that processing is most efficient if phrasal constraints are evaluated first and all functional constraints are delayed, provided that some pieces of information are moved from the f-structure into the c-structure. Similar observations have been made in the LILOG project, where all the constraints that build up logical forms have been delayed.

Since this tradeoff between early failure and the cost of constraint checking is highly dependent on the particular grammar and on the efficiency of the constraint checking algorithms, we will not pursue the question further in this thesis, but just note that selective delaying of constraints can be applied as a means to fine-tune the performance of a system.

In the actual implementation, we have chosen to check constraints as soon as possible and to minimise the cost of checking equality constraints between sorted feature structures by compiling them into Prolog terms, and using Prolog’s built-in term unification (cf. chapter 5).

Uszkoreit [Uszkoreit, 1991] proposes another model which makes use of statistical information for controlling the order of constraint checking in order to optimise performance. Conjuncts are given priority if they have a high failure potential, and disjuncts if they have a high success potential. The reason is that a conjunction fails if one of its conjuncts fails, and a disjunction is satisfied if one of the disjuncts is satisfied. Since this model applies internally to the constraint solver, it can easily be added to the processing algorithms proposed in this thesis.
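The ordering heuristic just described can be sketched as follows. This is not Uszkoreit's model itself, only an illustration of its ordering principle: the `Check` class, the `p_success` estimates and the two functions are hypothetical names, and the probabilities are assumed to come from some earlier statistics over previous runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    test: Callable[[dict], bool]
    p_success: float  # assumed estimate of how often this check succeeds


def check_conjunction(checks, candidate):
    """Try conjuncts with the highest failure potential first.

    Returns (result, names of the checks actually evaluated).
    """
    tried = []
    for c in sorted(checks, key=lambda c: c.p_success):
        tried.append(c.name)
        if not c.test(candidate):
            return False, tried  # one failing conjunct is enough
    return True, tried


def check_disjunction(checks, candidate):
    """Try disjuncts with the highest success potential first."""
    tried = []
    for c in sorted(checks, key=lambda c: c.p_success, reverse=True):
        tried.append(c.name)
        if c.test(candidate):
            return True, tried  # one satisfied disjunct is enough
    return False, tried


checks = [
    Check("category", lambda fs: fs.get("cat") == "np", 0.9),
    Check("agreement", lambda fs: fs.get("num") == "sg", 0.2),
]
conj_result, conj_order = check_conjunction(checks, {"cat": "np", "num": "pl"})
disj_result, disj_order = check_disjunction(checks, {"cat": "np", "num": "pl"})
```

In the conjunction, the agreement check is tried first and fails, so the category check is never evaluated; in the disjunction, the category check is tried first and succeeds, so the agreement check is skipped.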

1.2.6.3 Coroutining

Coroutining is an important technique for handling complex constraints such as relational dependencies (e.g., concatenation constraints). Coroutining ensures that these constraints are only checked when certain variables are sufficiently instantiated to guarantee termination. This is important in cases where no good ordering of the goals in a clause can be found, especially when it is not known in advance which arguments of a predicate are its input and its output arguments.

Coroutining was introduced in Prolog II with the freeze/2 construct, which delays a goal until a variable becomes instantiated. In SICStus Prolog, the condition can be that a particular variable is either instantiated, or instantiated to a ground term, or known to be equal to or different from another variable. Conjunctions and disjunctions of these conditions are possible.
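The idea behind freeze/2 can be approximated outside Prolog with a small sketch. It is a toy model under stated assumptions, not a description of how any Prolog system implements suspension: the `Var` class and the `freeze` and `bind` functions are hypothetical, goals are plain Python functions returning True or False, and there is no backtracking, so bindings are never undone.

```python
class Var:
    """Hypothetical logic variable that can carry suspended goals."""

    def __init__(self):
        self.bound = False
        self.value = None
        self.suspended = []


def freeze(var, goal):
    """Run goal(value) at once if var is bound; otherwise suspend it."""
    if var.bound:
        return goal(var.value)
    var.suspended.append(goal)
    return True  # a suspended goal does not fail yet


def bind(var, value):
    """Bind var and wake its suspended goals.

    Returns False if the variable is already bound to a different value
    or if a woken goal fails.
    """
    if var.bound:
        return var.value == value
    var.bound, var.value = True, value
    goals, var.suspended = var.suspended, []
    return all(goal(value) for goal in goals)


# A check that only makes sense once the word list is known is
# attached to the variable and runs when the variable is bound.
words = Var()
seen = []


def at_most_three(ws):
    seen.append(ws)
    return len(ws) <= 3


delayed = freeze(words, at_most_three)   # nothing is checked yet
result = bind(words, ["the", "dog", "barks"])
```

The goal is stated before its input is available and is evaluated at the moment of binding, which is the effect the text describes for goals whose arguments are not known in advance to be instantiated.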

In LIFE [Aït-Kaci and Podelski, 1991], coroutining is achieved by treating functions as passive constraints, i.e., functional expressions in a ψ-term are only evaluated when their arguments become sufficiently instantiated to determine subsumption of the arguments specified in the function’s definition. Otherwise the function residuates, i.e., waits for further instantiation [Smolka, 1993].

The most general form of coroutining is known in logic programming languages under the name of guarded rules or guarded constraints, where the execution of a rule is delayed until the specified conditions for execution (the guard) are satisfied (by the instantiation of variables) [Aït-Kaci and Podelski, 1994; Smolka and Treinen, 1994].

Pfahringer and Matiasek make use of constraint logic programming for parsing of HPSG grammars (see footnote 30). In their approach, the principles of HPSG are attached as constraints to variables, and are checked when these variables become instantiated by unification. Only a few relations serve as generators of structures, to which the principles then apply, filtering out the ill-formed ones. In this algorithm, processing consists almost entirely of constraint checking.

Their algorithm is implemented with a Prolog extension known as attributed variables, which allow user-defined unification and the attachment of arbitrary constraints to logic variables [Holzbaur, 1992; Pfahringer, 1992]. Attributed variables are a generalisation of metaterms known from logic programming systems such as ECLiPSe.

30 [Matiasek, 1993; Matiasek and Heinz, 1993; Pfahringer and Matiasek, 1992; Matiasek, 1994a; Matiasek, 1994b]