

4.1.7 Determination of Preference Values

We still need to answer the question of how the preference values can be determined from a given corpus. What is needed is a corpus annotated with analyses (e.g. HPSG signs). Since such a corpus is currently not available, the ideas in this section cannot be backed up with empirical evidence and rates of accuracy.

There are two kinds of values that must be determined:

1. the probabilities of clauses (lexical entries, grammar rules, principles)

2. the relative weights of different goals in a clause

For the probabilities of the rules and lexical entries, we would follow the suggestion of Eisele to count the frequency of occurrence in a corpus [Eisele, 1994]. Of course, this raises a number of problems, especially whether just absolute frequencies are counted or frequencies relative to a given context, and about the choice of the context.
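The two counting schemes just mentioned can be sketched as follows. This is a minimal illustration, not part of the GeLD system; the rule names and the notion of context are placeholders for whatever the annotated corpus provides.

```python
from collections import Counter, defaultdict

def absolute_probabilities(rule_occurrences):
    """P(rule) = count(rule) / total count over the corpus."""
    counts = Counter(rule_occurrences)
    total = sum(counts.values())
    return {rule: n / total for rule, n in counts.items()}

def conditional_probabilities(rule_context_pairs):
    """P(rule | context) = count(rule, context) / count(context)."""
    by_context = defaultdict(Counter)
    for rule, context in rule_context_pairs:
        by_context[context][rule] += 1
    return {
        context: {rule: n / sum(counts.values()) for rule, n in counts.items()}
        for context, counts in by_context.items()
    }
```

The choice of context (left-hand rule, mother category, lexical head, etc.) is exactly the open problem noted above; the sketch merely makes the two alternatives concrete.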

The assignment of weights is an even harder problem, since it is not obvious from a corpus how strong the influence of the different factors is. The problem is complex because it involves a large number of variables that can be varied, and for which an optimal assignment of values must be found. Since this is an optimisation problem with a vast search space, appropriate search techniques must be used, such as evolution strategies [Bäck et al., 1991] or genetic algorithms [Holland, 1975; Goldberg, 1989]. These methods work by choosing several initial sets of weights and applying them to a corpus. The resulting performance of each set of weights is used as its fitness measure. The sets with the highest fitness values are selected, and variation operators (mutation and crossover) are applied to them to produce new sets of weights. These new sets are the input for the next step of the algorithm, which converges on a (near-)optimal solution after some number of steps.
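The evolutionary scheme described above can be sketched as follows, under the assumption that a fitness function is available (e.g. the accuracy of the preference-driven parser on an annotated corpus, given a weight assignment). The operators and parameter values are illustrative, not a proposal for specific settings.

```python
import random

def optimise_weights(fitness, n_weights, population_size=20,
                     generations=50, mutation_rate=0.1, rng=None):
    """Evolutionary search for a good weight assignment.

    fitness maps a weight vector to a score; higher is better.
    """
    rng = rng or random.Random(0)
    population = [[rng.random() for _ in range(n_weights)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        population.sort(key=fitness, reverse=True)
        parents = population[:population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            # Crossover: splice two parent weight vectors at a random cut.
            cut = rng.randrange(1, n_weights)
            child = a[:cut] + b[cut:]
            # Mutation: perturb individual weights with small probability.
            child = [w + rng.gauss(0, 0.1) if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Since the fittest parents are carried over unchanged, the best assignment found so far is never lost; only the variation operators explore the search space.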

4.2 Conclusion

We have presented a formalisation of the notion of degrees of grammaticality by augmenting definite clauses with preference values. Best-first search can be applied in a straightforward manner to obtain solutions with the highest preference values first.
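To make the search regime concrete, best-first search over an agenda ordered by preference value can be sketched as below. The sketch is generic: it assumes preference values in [0, 1] that combine multiplicatively along a derivation, so that the value of a partial proof never increases and the first solution taken from the agenda is a best one. The state representation and the expand function are placeholders, not GeLD internals.

```python
import heapq
from itertools import count

def best_first_search(initial, expand, is_solution):
    """Enumerate solutions in decreasing order of preference value.

    expand(state) yields (preference, successor) pairs with preferences
    in [0, 1]; a state's value is the product of the preferences along
    its derivation, so values are monotonically non-increasing.
    """
    tie = count()  # tie-breaker so states themselves are never compared
    agenda = [(-1.0, next(tie), initial)]  # min-heap over negated values
    while agenda:
        neg_value, _, state = heapq.heappop(agenda)
        if is_solution(state):
            yield -neg_value, state
        else:
            for pref, successor in expand(state):
                heapq.heappush(agenda, (neg_value * pref, next(tie), successor))
```

With this discipline, a complete enumeration still yields every solution, but the highest-valued readings come out first, which is exactly the behaviour wanted for preference-driven disambiguation.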

We have shown by giving some examples that a notion of numerical preference value can indeed have beneficial effects for language engineering purposes because it provides criteria for making decisions in non-deterministic situations.

More theoretical and empirical work is required in order to arrive at a satisfactory foundation of preference values. The exact relationship between preference values and probabilities must be clarified, and methods for obtaining preference values from observable data (such as corpora) must be developed.

The methods discussed in this chapter are more speculative than those presented in the previous chapters, and not yet supported by empirical evidence.

Since such empirical work involves a lot of effort (e.g. the annotation of corpora with HPSG signs), we find it necessary first to argue that the expected results may be worth the effort.

5 Implementation

The methods discussed in the previous chapters are implemented in an experimental NLP system called GeLD (Generalised Linguistic Deduction). Considerable attention has been paid to efficiency issues in the implementation.

The aim of the implementation is to combine the useful linguistic deduction algorithms described in the preceding chapters into a logic programming system, in order to make them applicable to constraint-based grammars that do not rely on a particular rule format. Grammars can be written in a definite clause language that is augmented with sorted feature terms and the possibility of adding control information that determines how grammars are processed.

The emphasis of the system is on providing a framework for easy experimentation with different kinds of processing strategies for different kinds of grammars by adding control information. As such, it is a tool for the developer of processing algorithms, rather than a tool for the development of declarative grammars.

The realization of the linguistic deduction methods in a logic programming framework is achieved by manipulating the Control part in the famous equation that defines logic programming [Kowalski, 1979].

Algorithm = Logic + Control

We can characterise our Generalised Linguistic Deduction (GeLD) system by instantiating Logic, Control, and Algorithm as follows:

Logic. As our logic, we have a declarative theory of grammar, stated as definite clauses.

Control. Instead of a fixed control strategy (e.g. Prolog’s depth-first top-down strategy, or more specialised schemes such as chart parsers, etc.), control information is added to the clauses of the grammar, in order to permit experimentation with different processing algorithms.


Algorithm. Depending on the control information, different algorithms (and mixtures of algorithms) result.
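The slogan can be illustrated with a toy propositional prover in which the logic (a fixed set of definite clauses) is held constant while the control (the agenda discipline) is varied, yielding a depth-first or a breadth-first algorithm from the same clauses. The clause set below is invented for illustration and has nothing to do with GeLD's actual representations.

```python
from collections import deque

# Toy logic: propositional definite clauses, head -> alternative bodies.
CLAUSES = {
    "s": [["np", "vp"]],
    "np": [["det", "n"]],
    "vp": [["v", "np"], ["v"]],
    "det": [[]], "n": [[]], "v": [[]],  # facts: empty bodies
}

def prove(goal, clauses, strategy="depth_first"):
    """Return True if goal is derivable; strategy fixes the search order."""
    agenda = deque([[goal]])  # each entry is a list of open goals
    while agenda:
        # Control: treat the agenda as a queue or as a stack.
        goals = agenda.popleft() if strategy == "breadth_first" else agenda.pop()
        if not goals:
            return True  # all goals discharged: a proof has been found
        first, rest = goals[0], goals[1:]
        for body in clauses.get(first, []):
            # Resolve the first open goal against each matching clause.
            agenda.append(body + rest)
    return False
```

Both strategies prove the same goals from the same logic; only the order in which partial proofs are explored (the algorithm) differs.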

Like Prolog, GeLD is a proof procedure for definite clause programs. Unlike Prolog, however, our interest is not in providing a universal procedural programming language. Therefore, we do not support control constructs in programs that eliminate some of the solutions (e.g. the cut, negation as failure, or tests of the instantiation state of variables such as var/1, nonvar/1, etc.). We find the control information presented here more appropriate for linguistic applications than the control facilities offered by Prolog. The control constructs we provide instead concern choosing different proof procedures for different goals, and preferred choices in case of non-determinism.
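As a schematic illustration of what per-goal control amounts to, one can imagine a table that assigns each predicate the proof procedure that should handle its goals. The procedure names, predicates, and the dispatch mechanism below are all invented; GeLD's actual control annotations are declared in the grammar itself and differ from this sketch.

```python
# Schematic illustration of per-goal control (not GeLD syntax).

def earley_deduction(goal):
    return "earley:" + goal       # tabled processing, e.g. for recursion

def head_driven(goal):
    return "head_driven:" + goal  # head-driven processing

def prolog_execution(goal):
    return "prolog:" + goal       # plain top-down Prolog-style execution

# Control table: predicate name -> proof procedure (names are invented).
CONTROL = {
    "sentence": earley_deduction,
    "head_complement": head_driven,
    "append": prolog_execution,
}

def solve(predicate, goal):
    # Dispatch on the goal's predicate; default to Prolog-style execution.
    return CONTROL.get(predicate, prolog_execution)(goal)
```

The point of the sketch is only that the clause base stays fixed while the table decides, goal by goal, which proof procedure runs.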

The GeLD system consists of three parts:

1. A sorted feature term language including set descriptions and set constraints, guarded constraints, and linear order constraints.

2. A partial deduction system for grammar transformations.

3. A linguistic deduction system which allows the combination of Earley deduction with head-driven processing, top-down processing and direct Prolog execution.

The processing of a grammar proceeds in three stages:

1. Sorted Feature Terms are translated into a Prolog term representation by the ProFIT system (cf. section 5.1).

2. Various grammar transformations are carried out by the partial deduction system according to the specified control information, and the grammar is converted into an internal format (cf. section 5.3.5).

3. The linguistic deduction system is used to parse and generate strings.

The overall architecture of the system is shown in figure 5.1.

The system is implemented in SICStus Prolog. The sizes of the system’s components, given in figure 5.2, provide a rough idea of the overall size of the system.

The following sections give an overview of the implementation. Full user documentation for ProFIT and CL-ONE is available [Erbach, 1995; Ruessink, 1994; Erbach et al., 1995c].

[Figure 5.1 shows the architecture of the linguistic deduction system: a feature-term-based, principle-based grammatical specification is compiled by ProFIT into a Prolog-term-based specification, which partial deduction transforms into GeLD clauses and Prolog clauses. A feature-term-based query is likewise compiled by ProFIT into a Prolog-term-based query and processed by the deduction engine (bottom-up Earley deduction, top-down Earley deduction, top-down processing, head-driven processing, and the Prolog system); the resulting Prolog-term-based solution is uncompiled by ProFIT into a feature-term-based solution.]

Figure 5.1: Architecture of the linguistic deduction system

System Component                          Lines of Prolog Code   Size in Kilobytes

Sorted Feature Terms (ProFIT)                     2850                  77

Extended Constraint Language (CL-ONE)             3250                  67

Deduction System (GeLD)                           1320                  36

Total                                             7420                 180

Figure 5.2: Components of the GeLD system