

From the document On the metatheory of linguistics (pages 103-107), Chapter 4: The Classical Metatheory of Language

4.7 Methodological Universals

4.7.1 Which Languages Do We (Not) Obtain?

All structural pre-theories we have considered so far yield context-free languages, so we have an upper bound for the class of languages we induce. However, we do not obtain all context-free languages, as can easily be deduced from the following facts: 1. all finite languages are context-free; 2. there are finite languages which are projected, and so not induced by themselves; and 3. all induced languages are infinite. This tells us that no class containing the finite languages can serve as a lower bound for the languages we obtain.

But this result is not only very unspecific, it is also in some sense trivial: we are only interested in infinite languages (as candidates for “language”), so the fact that we do not obtain certain finite languages is of no concern to us.

What should be of concern to us are the infinite languages which are context-free, yet not induced by any finite language under some pre-theory under consideration.

We will first try to bring some order into the relations between the languages induced by $(g, Pr)$, $(g, P1)$, $(g2, P2)$, and the normalized pre-theories. Then we show some interesting examples of languages we cannot obtain with them. This will also shed more light on the properties of the languages we can obtain. We will restrict our attention mostly to $C(f, P)$, because firstly the finite languages are the ones which remain invariant under our pre-theories, and secondly they do not have any relevance for us.

Lemma 78 We have $C(g, Pr) \not\subseteq C(g2, P2)$ and $C(g2, P2) \not\subseteq C(g, Pr)$.

Proof. $C(g, Pr) \not\subseteq C(g2, P2)$: Take the language $\{a^n b^n : n \geq 4\}$. This is in $C(g, Pr)$ but not in $C(g2, P2)$ because of the elementary string condition.

$C(g2, P2) \not\subseteq C(g, Pr)$: Conversely, take the language $\{aaabbb, aaaabbbb\} \cup L$, where $L$ is any infinite language in $C(g2, P2)$ over an alphabet $\Sigma$ such that $a, b \notin \Sigma$.

So we have a relation of incomparability. The second part of the proof shows, however, that results of this kind are not very meaningful, because we can always resort to finite, alphabetically distinct sublanguages. So the case of the finite languages comes back to us, and we have to be aware that inclusion results are only meaningful if they do not rely on this sort of argument. The reason why arguments of this kind will always work with our pre-theories is the following general property, which holds for our pre-theories so far and which we will have for all pre-theories we look at.

Definition 79 A pre-theory $(f, P)$ is alphabetically innocent if, for $I \subseteq \Sigma^*$ and $J \subseteq T^*$ with $\Sigma \cap T = \emptyset$, we have $f_P(I \cup J) = f_P(I) \cup f_P(J)$.

Lemma 80 $(g, Pr)$, $(g, P1)$, $(g2, P2)$ are alphabetically innocent.

This is immediate to see. So in a word, all our pre-theories are alphabetically innocent.
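The defining equation of alphabetical innocence (Definition 79) can be checked mechanically on finite inputs. The following is a minimal sketch under stated assumptions: the two operators `doubling` and `pairwise_concat` are hypothetical toy maps invented for illustration, not the pre-theories $(g, Pr)$, $(g, P1)$ or $(g2, P2)$, and the helper name `is_innocent_on` is likewise an assumption.

```python
from typing import Callable, FrozenSet

Language = FrozenSet[str]
Operator = Callable[[Language], Language]


def alphabet(lang: Language) -> set:
    """Collect the letters occurring in a finite language."""
    return {ch for word in lang for ch in word}


def is_innocent_on(f: Operator, lang_i: Language, lang_j: Language) -> bool:
    """Check f(I ∪ J) == f(I) ∪ f(J) for one pair over disjoint alphabets."""
    if alphabet(lang_i) & alphabet(lang_j):
        raise ValueError("I and J must be over disjoint alphabets")
    return f(lang_i | lang_j) == f(lang_i) | f(lang_j)


# Toy operator 1 (hypothetical): acts on each word separately.
def doubling(lang: Language) -> Language:
    return frozenset(lang | {w + w for w in lang})


# Toy operator 2 (hypothetical): mixes words, so sublanguages interact.
def pairwise_concat(lang: Language) -> Language:
    return frozenset(u + v for u in lang for v in lang)


lang_i = frozenset({"ab", "aabb"})
lang_j = frozenset({"xy"})
print(is_innocent_on(doubling, lang_i, lang_j))         # True
print(is_innocent_on(pairwise_concat, lang_i, lang_j))  # False
```

A passing check on one pair is only evidence, not a proof of innocence. A single failing pair, such as the mixed word `abxy` produced by the second toy operator, does refute it.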

Lemma 81 We have $C(pg, Pr \circ p_{(g,Pr)}) \subsetneq C(g, Pr)$.

Proof. $\subseteq$: Assume we have $L = pg_{Pr \circ p_{(g,Pr)}}(I)$. Then we put $I' = p_{(g,Pr)}(I') = I$, and then $g_{Pr}(I') = L$.

$\subsetneq$: Take a language such as $I = \{ab, aabb, aaaabbbb\}$, such that $g_{Pr}(I) = \{ab, (aa)^n (bb)^n : n \in \mathbb{N}\} =: L$. Obviously, we have $p_{(g,Pr)}(I) = \{ab, aabb\}$. Furthermore, for any finite $I'$ such that $\{ab, aabb\} \subseteq I' \subseteq L$, we will have $p_{(g,Pr)}(I') = \{ab, aabb\}$; so we cannot induce $L$.

So the normalizing map $p$ comes with a decrease in “inducing power”. The same result can be obtained if we replace $Pr$ with $P1$. The results regarding $P1$ are the following:

Lemma 82 We have $C(g, P1) \not\subseteq C(g2, P2)$ and $C(g2, P2) \not\subseteq C(g, P1)$.

Proof. See the proof of the corresponding lemma for $Pr$.

Lemma 83 $C(g, Pr) \not\subseteq C(g, P1)$ and $C(g, P1) \not\subseteq C(g, Pr)$.

Proof. $C(g, P1) \not\subseteq C(g, Pr)$: Take $L = \{(bab)^n a (bab)^n : n \in \mathbb{N}\}$. We have $L = g_{P1}(\{a, bababab\})$. But assume we have $I'$ such that $g_{Pr}(I') = L$. We need $a \in I'$, and consequently $bababab \in I'$. But then we also need $babbabababbab \in I'$, etc., so $I'$ is infinite.

$C(g, Pr) \not\subseteq C(g, P1)$: Put $I = \{ab, aabb, xaby\} \cup L$, where $L$ is an infinite language in $C(g, Pr)$ over $\Sigma$ such that $a, b, x, y \notin \Sigma$ (again, we see that this part of the lemma is quite meaningless, whereas the former is not).
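The strings of the language $\{(bab)^n a (bab)^n\}$ used in the first part of the proof can be enumerated directly. The sketch below only illustrates the shape of these strings; it does not implement $g_{Pr}$ or $g_{P1}$, and the function name `member` is an assumption made for illustration.

```python
def member(n: int) -> str:
    """Return (bab)^n a (bab)^n, the n-th string of the language."""
    return "bab" * n + "a" + "bab" * n


first = [member(n) for n in range(3)]
print(first)  # ['a', 'bababab', 'babbabababbab']

# Each member is the previous one wrapped in one more layer of 'bab',
# so the strings grow without bound.
assert all(member(n + 1) == "bab" + member(n) + "bab" for n in range(10))
```

Every member has length $6n + 1$, so no finite set can contain all of them.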

Why should we be interested in the languages we do not induce, or, more generally, why should we be interested in the classes we induce, given that they are very unnatural from the point of view of formal language theory? In my view, there is a strong and good motivation for scrutinizing their properties, even though to the “normal linguist” this motivation will seem a bit odd at first sight. They provide a first example of what we might call methodological universals. These are universal properties of “language” which are artefacts of our projection. So assume we say that $(g, Pr)$ is the right pre-theory to adopt; we formalize our linguistic observations and perform the projection under this assumption. Then we might observe some universal properties of “language” (recall that, after all, “language” is the proper subject of linguistics!). The most obvious one is: “languages” are context-free. In addition, if we work with strong “languages”, we will say that we only find phrase-structure style dependencies. But these, obviously, are not properties of the observed languages; these are properties due to our methodology, which will obtain no matter what we observe.

Formally, a methodological universal of a pre-theory $(f, P)$ is a property of the class of languages which are induced by some finite language under $(f, P)$, that is, a property of $C(f, P)$. So it is important to know the methodological universals of the pre-theories we consider, for two main reasons. For the metalinguist, they are interesting in themselves, as he can decide whether they make a pre-theory preferable or not. For example, he might opt in favor of context-free or mildly context-sensitive “languages”, or in favor of phrase-structure style dependencies. For the normal linguist who simply applies a pre-theory, they are also very important: he has to know its methodological universals in order to exclude them from the “linguistic” observations he makes, that is, his empirical observations. For example, if he notices that all the “languages” he considers have a certain property P, he should make sure that it is not a methodological universal of his pre-theory, because otherwise his observation is void of content. If, on the other hand, P is not a methodological universal of his pre-theory, then he can claim that he has made a meaningful, empirical observation (still taken “modulo the pre-theory”; we will work out what that means in the next section).

As I have tried to point out in the second chapter, the concern that we take properties of pre-theories to be properties of languages is quite realistic.

So we have already presented some methodological universals regarding our pre-theories; we will now go into a bit more detail. Obviously, given that there are finite languages which our pre-theories cannot induce, it is quite easy to construct infinite languages which cannot be induced either: just take a finite language $I$ which cannot be induced and an infinite language $L$ which can be induced, such that $I \subseteq \Sigma^*$, $L \subseteq T^*$, and $\Sigma \cap T = \emptyset$, and put $L' = I \cup L$. We have applied this argument repeatedly. It is based on our requirement of “alphabetical conservativity”, namely that our pre-theories must not introduce new letters, and on the even stricter requirement of alphabetical innocence, namely that sublanguages which do not share any letters do not interact in any way, which our pre-theories satisfy.

Another point to note is the following: take the language $L_7 := \{a^n a a^n : n \in \mathbb{N}_0\}$. Is this language induced by some finite language under $(g, Pr)$? The answer is negative, for the following reason: if we have $I_7 := \{a, aaa\}$, then we have the same $a$ in a new, non-recursive context, because we cannot distinguish the $a$ in the context $(a, a)$ from the ones in the context $(\epsilon, aa)$, and thus violate the weak $Pr$-condition. So take $I_7' := \{a, aaa, aaaaa\}$. The analogy $(aaa, aaaaa)$ is also prevented for the same reason, and so on, and so $L_7 \notin C(g, Pr)$. How about $L_8 := \{a^n a b^n : n \in \mathbb{N}_0\}$? This is in $C(g, Pr)$, but is only obtained by using larger analogies, as in the language $I_8 = \{a, aab, aaabb\}$, where we have $aab \approx^{Pr}_{I_8} aaabb$.
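The remark about indistinguishable occurrences of $a$ can be made concrete by listing the contexts in which $a$ occurs in $\{a, aaa\}$. The helper `contexts` below is an illustrative assumption: it uses the usual notion that $(u, v)$ is a context of $x$ in a language whenever $uxv$ is one of its words, and it does not implement the $Pr$-condition itself.

```python
def contexts(x: str, lang: set) -> set:
    """All pairs (u, v) such that u + x + v is a word of lang."""
    return {
        (w[:i], w[i + len(x):])
        for w in lang
        for i in range(len(w) - len(x) + 1)
        if w.startswith(x, i)
    }


I7 = {"a", "aaa"}
print(sorted(contexts("a", I7)))
# [('', ''), ('', 'aa'), ('a', 'a'), ('aa', '')]
```

The letter `a` occurs both in the context `('a', 'a')` and in `('', 'aa')`, where the empty string plays the role of the empty left context. As plain strings, nothing tells these occurrences apart.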

4.7.2 Unreasonable Restrictions of the String Case

We now come to a final characteristic of the classes of languages induced by our pre-theories, which in fact is an unreasonable restriction and will directly lead to the first major extension of our linguistic universe. Assume there is a finite language $I$, where we have $\vec{y} \leq_I \vec{x}$, as well as $\vec{x} \approx^{Pr}_I \vec{x}_1 \vec{x} \vec{x}_2$. From this it does of course not follow that $\vec{y} \approx^{Pr}_I \vec{x}_1 \vec{y} \vec{x}_2$; in fact, this only follows in a very particular case, which almost amounts to $\vec{x} \sim_I \vec{y}$ (though not exactly: $\vec{x}$ and $\vec{y}$ might have distinct contexts, as long as they are all recursive).

Consider $I = \{wxv, wx_1xx_2v, wyv, w_1yx_2v, yz\}$. In this setting, we have $x \approx^{Pr}_I x_1 x x_2$, but $y \not\approx^{Pr}_I x_1 y x_2$. This means in particular that the relation $\leq_I$ is not preserved under projection, not even for the elements of $\Sigma$ in the strong language. This is a problem for our intuition. That it is not preserved for the weak language should not bother us, as it is undecidable anyway. But obviously, for the strong language the relation $\leq_L \subseteq \Sigma \times \Sigma$, not containing any brackets, is decidable, because it remains finite. For this reason, the fact that $\leq_{g_{Pr}(I)}$ does not extend $\leq_I$ should bother us, because intuitively, we know that $y$ has a more liberal distribution in $I$ than $x$, and so it should have in $g_{Pr}(I)$ for its free occurrences.

This is not a problem of the more liberal pre-theories we considered before $Pr$ (such as the simple pre-theory $P1$); it is a problem of the restrictive $Pr$-family. Now the question is: can we be somewhat more liberal, yet not as liberal as $P1$? We will answer this question positively in the sequel.

The last problem is a consequence and particular instance of a more general problem of the pre-theories defined on sets of strings. We can only speak of strings, not of strings in a certain distribution. For example, in $I$ as defined above, we might say that the $\vec{y}$ in the word $\vec{y}\vec{z}$ “means” something entirely different from the $\vec{y}$ in the position where $\vec{x}$ can also occur (as far as we can say something like this in a purely syntactic approach; linguistically speaking, we would say that it belongs to different categories). If we were able to say something like this, then we would get rid of our problem: we would consider the set of strings $\{x, y\}$, with respect to the contexts in which both can occur. This would also solve some more general problems. Consider for example the language:

$$J := \{a, ab, abb, abbb\} \cup \{d, db, dbb, dbbb\} \cup \{ac, de\} \tag{4.24}$$

In this language, there is no pseudo-recursion; but clearly, one would say that there is a pseudo-recursive pattern in there, because it is only the strings $\{ac, de\}$ which spoil the pseudo-recursion. But as we said, for us there is no way to speak of strings in certain positions. The extension we will introduce later on will allow us to do so in a certain way, which is still based on purely language-theoretic notions.
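The idea of looking at two strings only with respect to the contexts in which both can occur can be illustrated on the language (4.24). The following minimal sketch assumes the usual notion of a context ($(u, v)$ is a context of $x$ whenever $uxv$ is a word of the language); the helper name `contexts` is hypothetical, and the sketch does not implement the extension announced in the text.

```python
def contexts(x: str, lang: set) -> set:
    """All pairs (u, v) such that u + x + v is a word of lang."""
    return {
        (w[:i], w[i + len(x):])
        for w in lang
        for i in range(len(w) - len(x) + 1)
        if w.startswith(x, i)
    }


J = {"a", "ab", "abb", "abbb", "d", "db", "dbb", "dbbb", "ac", "de"}
ctx_a, ctx_d = contexts("a", J), contexts("d", J)
shared = ctx_a & ctx_d

print(sorted(shared))          # [('', ''), ('', 'b'), ('', 'bb'), ('', 'bbb')]
print(sorted(ctx_a - shared))  # [('', 'c')]
print(sorted(ctx_d - shared))  # [('', 'e')]
```

Restricted to the shared contexts, `a` and `d` behave identically; the only contexts that separate them come from the two words `ac` and `de`.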

4.7.3 Linguistic Reason

Here we will consider the inverse of the question we asked above. The above question was: which of the properties that we ascribe to “language” are necessary, that is, methodological universals? Here we are interested in the question: on the basis of our epistemological concerns, which empirical claims can we make about “language”? As a first point, if we want to claim that “language” has property P independently of any pre-theory, we must take care that there exists a finite language $I$ such that $I$ does not have P. In this case, we say that P is finitary. Every property which is not finitary and which we ascribe to “language” depends on a projection. Contrary to what people tend to think, there are many interesting properties which satisfy this constraint. We will discuss this at length in the section on linguistic finitism, so there is no need to duplicate this discussion at this point.

What I want to point out here is the following: there are certain empirical claims we can make about infinite languages, which are based on the observation that for certain datasets $I$, $f_P(I)$ always shows a certain property P, even though P is not a methodological universal of $(f, P)$. As an example, let us reconsider the family of pre-theories $Pr\text{-}k$ we considered above. We have claimed that if we choose $k$ large enough, then $g_{Pr\text{-}k}(I)$ will be regular for any dataset $I$ corresponding to a natural language dataset. This is a property in the above sense, but only if we consider a certain pre-theory. So there could be a dataset $J$ such that $g_{Pr\text{-}k}(J)$ is not regular, but empirically, we do not find any. We say in this case that “language” has the property of being regular modulo $(g, Pr\text{-}k)$.

In general, we can say that “languages” have property P modulo $(f, P)$ if there is no dataset $I$ corresponding to an observed language such that $f_P(I)$ does not have P, and if there is some finite language $J$ such that $f_P(J)$ does not have property P.
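The condition just stated can be written compactly as follows. This is only a restatement of the sentence above. Two pieces of notation are assumptions made here and not the author's: $\mathcal{O}$ stands for the set of datasets corresponding to observed languages, and $\mathcal{Q}$ stands for the property, to avoid a clash with the $P$ of the pre-theory.

```latex
\[
\text{``languages'' have } \mathcal{Q} \text{ modulo } (f, P)
\;\iff\;
\bigl(\forall I \in \mathcal{O} :\ \mathcal{Q}(f_P(I))\bigr)
\;\wedge\;
\bigl(\exists J \text{ finite} :\ \neg \mathcal{Q}(f_P(J))\bigr)
\]
```

The first conjunct is the empirical part of the claim. The second says that some finite language is projected to a language without $\mathcal{Q}$, so that $\mathcal{Q}$ is not guaranteed by the pre-theory alone.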

Take another example: assume there is a pre-theory $(f, P)$ such that $C(f, P) \not\subseteq \mathrm{CFL}$. Assume then that we do not observe any linguistic dataset corresponding to a finite language $I$ such that $f_P(I)$ is not context-free. Then we can claim that natural languages are context-free modulo $(f, P)$. Or, to reverse the example: surely, we can make observations to the point that under the pre-theory $(g, P1)$,

