
Clustering consonants in terms of place features

From the document The Induction of Phonological Structure (pages 188-196)

Place of Articulation

7.3 Clustering consonants in terms of place features

In light of the results of the previous sections, I will now investigate the usefulness of the principle of SPA to cluster consonants in terms of their place features. The φ coefficients that have been used to show the degree of association for place categories can also be used for individual consonants. That is, rather than testing SPA for categories, the frequencies of co-occurrence of individual consonants can be statistically analyzed. In this case, no classification of consonants into place categories is required.

Since the co-occurrence of consonants with the same place feature is avoided in such contexts, the resulting φ value constitutes a distance measure for consonants with respect to their place features. In other words, the more often two consonants co-occur in a CVC context, the less similar they are in terms of their place features.

The question arises whether this distance matrix of all consonant pairs can be used to automatically induce their place feature distinctions.

The clustering of place features described here is for the most part based on the distribution of CVC sequences within word forms, regardless of their position in the word. A certain degree of noise is therefore to be expected from the inclusion of consonants from affixes (derivational or inflectional markers), as these are not known to dissimilate in order to preserve place avoidance when they occur next to consonants from stems or other grammatical markers. Compared with assimilation processes, laws describing the dissimilation of consonants (or vowels) are documented far less frequently.29 A further complication is that consonants in grammatical markers or stems whose place of articulation differs from that of the adjacent stem consonant in fact assimilate. Nevertheless, I assume that across a large amount of data such noise should be negligible. The calculation of the dissimilarity matrix will be illustrated with the results on the clustering of Maltese consonants based on their co-occurrence in verbal roots, which is presented in the next section.

29 The most famous dissimilation process is Grassmann’s Law (Grassmann 1863), which accounts for a sound change in Greek and Sanskrit where the first of two aspirated stops in a Proto-Indo-European word dissimilated to an unaspirated stop in those languages.

7.3.1 Maltese roots

In the following sections, I discuss some results of the clustering method for different languages and data sources. As mentioned before, the calculation of the distance matrix is based on the association strength of the consonant pairs that are extracted from the relevant contexts. It therefore differs from the calculations of the dissimilarity matrix for the vowel/consonant distinction in Chapter 4 or the vowel harmony features in Chapter 6, where the co-occurrence of sounds in the respective contexts is interpreted as a similarity index. In other words, in those cases two sounds are considered to be more similar the more often they co-occur in the contexts (the higher their association strength), whereas in the case of place features two consonants are taken to be less similar the more often they are found in the extracted sequences. The distance matrix on which the clustering methods are based consists of the φ coefficients for the individual consonant pairs that have been extracted from the relevant contexts.30 The reasons for choosing this statistical value and its computation have been presented in Section 3.2.

For each consonant pair in the language, a contingency table is created with two binary variables, which stand for the consonant in question and all other consonants, respectively. In the case of Maltese verbal roots, the respective context for the association of two consonants is their adjacent positions within the root. From this contingency table the respective φ coefficients are calculated for all consonant pairs as explained in Section 3.2.
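The contingency-table step can be sketched as follows. This is a minimal illustration that assumes the standard φ coefficient for a 2×2 table; Section 3.2 is not reproduced here, so the exact computation used in the study may differ. The function name `phi_coefficient`, the toy data `toy_pairs`, and the choice to return 0 when φ is undefined are assumptions made for the example.

```python
from math import sqrt

def phi_coefficient(pairs, a, b):
    """phi for 'first consonant is a' vs. 'second consonant is b'.

    pairs: list of (first, second) consonant tuples from the relevant context.
    """
    n11 = n10 = n01 = n00 = 0
    for first, second in pairs:
        if first == a and second == b:
            n11 += 1          # a followed by b
        elif first == a:
            n10 += 1          # a followed by another consonant
        elif second == b:
            n01 += 1          # another consonant followed by b
        else:
            n00 += 1          # neither a first nor b second
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0            # assumption: treat an undefined phi as "no association"
    return (n11 * n00 - n10 * n01) / denom

# Made-up toy data, not taken from the Maltese roots
toy_pairs = [("k", "t"), ("k", "t"), ("b", "d"), ("b", "d")]
phi_kt = phi_coefficient(toy_pairs, "k", "t")   # always co-occur
phi_kd = phi_coefficient(toy_pairs, "k", "d")   # never co-occur
```

A positive value indicates that the pair co-occurs more often than expected under independence, a negative value that it is avoided.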

The resulting matrix of φ coefficients can be interpreted as a distance matrix of consonants with respect to their place features. The association strength of the consonants reflects the degree to which they are dissimilar in terms of their point of articulation. In order to avoid negative distance values, the φ coefficients have been rescaled to fall in the interval [0; 1] by adding one and dividing by two. The closer the value is to 1, the less similar two consonants are; the closer it is to 0, the more similar they are. This distance matrix can then serve as the input for the clustering procedure.31
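The rescaling can be illustrated with a short sketch. It combines the step described above (adding one and dividing by two) with the averaging of φ(a, b) and φ(b, a) mentioned in footnote 30. The function names and the φ values in `toy_phi` are invented for illustration and are not results from the study.

```python
def place_distance(phi_ab, phi_ba):
    """Average both orders, then map phi from [-1, 1] to [0, 1]."""
    mean_phi = (phi_ab + phi_ba) / 2
    return (mean_phi + 1) / 2

def distance_matrix(consonants, phi):
    """Symmetric distances keyed by unordered consonant pair.

    phi: dict mapping ordered pairs (a, b) to their phi coefficient.
    """
    dist = {}
    for i, a in enumerate(consonants):
        for b in consonants[i + 1:]:
            dist[frozenset((a, b))] = place_distance(phi[(a, b)], phi[(b, a)])
    return dist

# Hypothetical phi values: <t, d> avoid each other, <t, k> and <d, k> do not
toy_phi = {
    ("t", "d"): -0.6, ("d", "t"): -0.4,
    ("t", "k"): 0.2, ("k", "t"): 0.4,
    ("d", "k"): 0.1, ("k", "d"): 0.3,
}
toy_distances = distance_matrix(["t", "d", "k"], toy_phi)
```

In this toy matrix the strongly avoided pair <t, d> receives the smallest distance, which is the intended reading: avoidance is taken as evidence for a shared place feature.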

For the clustering, the hierarchical agglomerative clustering technique Ward (Ward 1963) has been used to generate a dendrogram on the basis of the distance matrix of φ values. The motivation and a more detailed description of the method and its usefulness for the purpose of clustering speech sounds have been provided in Section 3.3.1. The result of applying the method on the distance matrix for Maltese consonants is given in Figure 7.4.
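For readers who want to experiment with the procedure, the following is a minimal sketch of agglomerative clustering with the Lance-Williams recurrence for Ward's criterion, applied directly to a precomputed distance matrix. The source does not state which implementation was used, so this is one possible reading rather than the original code. The function name `ward_linkage`, the toy distances, and the clipping of negative squared distances to zero (which can arise because φ-based distances need not be Euclidean) are assumptions.

```python
from math import sqrt

def ward_linkage(labels, dist):
    """Return merges as (members_a, members_b, height), lowest merge first.

    dist: dict mapping frozenset({x, y}) to the distance between labels x and y.
    """
    clusters = {i: (lab,) for i, lab in enumerate(labels)}
    d = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            d[(i, j)] = dist[frozenset((labels[i], labels[j]))]
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # pick the closest pair of clusters (s, t)
        (s, t), height = min(d.items(), key=lambda kv: kv[1])
        ns, nt = len(clusters[s]), len(clusters[t])
        new_d = {}
        for v in clusters:
            if v in (s, t):
                continue
            nv = len(clusters[v])
            dvs = d[(min(v, s), max(v, s))]
            dvt = d[(min(v, t), max(v, t))]
            # Lance-Williams update for Ward's criterion on squared distances
            sq = ((nv + ns) * dvs ** 2 + (nv + nt) * dvt ** 2
                  - nv * height ** 2) / (nv + ns + nt)
            new_d[(v, next_id)] = sqrt(max(0.0, sq))  # assumption: clip at zero
        merges.append((clusters[s], clusters[t], height))
        d = {k: val for k, val in d.items() if s not in k and t not in k}
        d.update(new_d)
        clusters[next_id] = clusters[s] + clusters[t]
        del clusters[s], clusters[t]
        next_id += 1
    return merges

# Made-up distances: small within <t, d> and <k, g>, large across the two groups
toy_labels = ["t", "d", "k", "g"]
toy_dist = {frozenset(p): 0.8 for p in
            [("t", "k"), ("t", "g"), ("d", "k"), ("d", "g")]}
toy_dist[frozenset(("t", "d"))] = 0.1
toy_dist[frozenset(("k", "g"))] = 0.2
toy_merges = ward_linkage(toy_labels, toy_dist)
```

Each entry of `toy_merges` corresponds to one join in a dendrogram, with the height giving the level at which the two subclusters are connected.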

The dendrogram in Figure 7.4 shows several partitions whose members share their place of articulation feature. The most important distinction is made between the cluster containing only coronal consonants (top) and a larger cluster that contains three major subclusters. One of these subclusters is the coronal sonorants <n, l, r>, which have been grouped with the other “non-coronal” sounds. The labial and dorsal consonants make up their own partitions within the larger cluster of “non-coronal” sounds. The only misclassified consonant is the glide <j>, which is grouped together with the dorsal consonants instead of the coronals. However, this might have to do with its special status within weak roots, where a glide is taken to fill up the missing third consonant in the triliteral skeleton. The labiodental fricative <v> could not be clustered because it only rarely occurs in verbal roots in Maltese.

30 In fact, the association strength values are different depending on the linear sequence of both consonants, i.e. φ(a, b) ≠ φ(b, a) for all sound pairs a and b. For the dissimilarity matrix the average φ value was taken as the distance of two consonants.

31 A related approach to visualize the relationship of consonants was presented by Weitzman (1987) for Hebrew and Arabic roots in the form of an MDS (multidimensional scaling) plot.

Figure 7.4: Dendrogram (Ward’s method) for the consonants in Maltese verbal roots. Cluster labels in the figure: coronal, coronal sonorant, labial, dorsal.

The major clusters in the dendrogram show an almost perfect partitioning into the three place categories of labial, coronal and dorsal consonants, with a somewhat deviant separate cluster of coronal sonorants that is more closely connected to the non-coronals than to the other coronal sounds. The fact that the coronal sonorants are not grouped together with the other coronal sounds might have to do with the influence of the avoidance of categories other than place. In this respect, all other sonorants <m, j, w> are in the cluster of non-coronal sounds, which might have caused the coronal sonorants to be more similar to those because of their shared manner feature.

Interestingly, a partitioning of the results into four major clusters (coronal, coronal sonorant, labial, dorsal) reflects earlier findings for other Semitic languages. Greenberg (1950:178), for instance, considers four clusters of consonants, which, except for the glide <j>, directly correspond to the partitioning into four clusters in Figure 7.4. Likewise, McCarthy (1994), when investigating SPA in Arabic, makes a more fine-grained distinction into labials, coronal sonorants, coronal stops, coronal fricatives, velars and gutturals, which is also partly reflected in the results in Figure 7.4, where coronal stops and fricatives as well as velars and gutturals are clustered together.

Although the consonant inventories of Maltese and Arabic differ, certain aspects of the clustering can be related to the partitioning made by McCarthy (1994). In the dorsal cluster, for instance, the velars <g, k, q> are separated from the gutturals <h, għ, ħ>. It thus seems that a relevant partitioning of consonants with respect to their behavior in place avoidance can be automatically inferred from their distribution in triliteral roots. On a lower level, too, consonants with similar place features are nicely clustered together, with consonant pairs that only differ in their [voice] feature showing up closely together (e.g., <t, d>).

For the generation of the dissimilarity matrix on which the clustering results are based, CVC sequences with identical first and second consonant have been ignored. Including those cases leads to slightly worse clustering results for the Maltese data. This is to be expected because all adjacent positions have been taken into account, which gives rise to a number of sequences where identical consonants are extracted (position two to three). Ignoring those sequences reduces the noise which is introduced by the fact that consonants with the same place feature co-occur with a higher frequency than expected. Although omitting sequences of identical consonants means that the clustering is, strictly speaking, no longer based directly on SPA, it is easy to implement for the induction of place features. Sequences of identical consonants can be identified in an unsupervised manner for all languages.
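The extraction step, including the filter for identical consonants, might look like the following sketch for CVC sequences in word forms. It assumes that each character represents one segment and that the vowel inventory is known in advance; the set `VOWELS` and the function name `cvc_pairs` are hypothetical and chosen only for illustration. For the Maltese roots, where the context is adjacent root positions rather than CVC windows, the same filter would apply to neighboring root consonants.

```python
VOWELS = set("aeiou")  # hypothetical vowel inventory for illustration

def cvc_pairs(word, vowels=VOWELS, skip_identical=True):
    """Return (C1, C2) for every consonant-vowel-consonant window in word."""
    pairs = []
    for i in range(len(word) - 2):
        c1, v, c2 = word[i], word[i + 1], word[i + 2]
        if c1 in vowels or v not in vowels or c2 in vowels:
            continue              # window is not of the shape CVC
        if skip_identical and c1 == c2:
            continue              # ignore sequences such as n-a-n
        pairs.append((c1, c2))
    return pairs

pairs_filtered = cvc_pairs("banana")
pairs_unfiltered = cvc_pairs("banana", skip_identical=False)
```

Because identity of two symbols can be checked without any knowledge of place categories, the filter does not compromise the unsupervised character of the procedure.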

7.3.2 English lemmas from the CELEX database

The clustering of Maltese root consonants looks promising. Now it will be tested whether a similar result can be achieved if CVC sequences in words of a non-Semitic language are considered. We have seen that English also shows a tendency to avoid homorganic consonants in such sequences (cf. Section 7.2.3; Berkley 1994; Dmitrieva et al. 2008). The analysis of English lemmas thereby revealed a clearer tendency for SPA than the analysis of word forms. I thus present the clustering on the basis of lemmas, which come closest to the equivalent of roots in concatenative languages, viz. stems.

The same method for clustering consonants that was described in the preceding section has been applied to English lemmas from the CELEX lexical database. The result of the clustering is given in the dendrogram in Figure 7.5. Compared to the Maltese result in Figure 7.4, the clustering of English consonants does not show a correct partitioning into major place categories. Most labial consonants are in the topmost cluster [m, p, b, w], which also contains the coronal glide [j]. Yet, overall the consonants with the same place features are scattered throughout the different partitions and do not nicely cluster together. The clustering on a lower level, on the other hand, shows a nice partitioning of individual consonants with their nearest neighbor in terms of place features. All stop consonants are clustered with their [voice] counterparts, as are the fricatives [s, z]. The glides [j, w] and liquids [r, l] are grouped together on the lowest-level cluster where distances are smallest. The results for German and Dutch vary depending on the parameter settings (word forms vs. lemmas, ignoring identical consonants etc.) but almost always group together certain consonant pairs on the lowest level, while a clustering into major place categories like in the Maltese result has not been obtained. The strongest effect is with the liquids, which in almost all cases form a cluster of their own (see also Figure 7.6), suggesting that these sounds are most strongly avoided in CVC sequences.


Figure 7.5: Dendrogram (Ward’s method) for the word forms in the English CELEX database.

There are various reasons why the results for word forms (or lemmas) in the CELEX database for English, German and Dutch and (to a lesser extent) also for the verbal roots in Maltese are not optimal. The major problem is that some consonant combinations have a very low frequency of occurrence, which makes it difficult to compute reliable φ statistics for the pairs on which the dissimilarity matrix and later the clustering procedure are based.32 This is especially true for consonants that are generally very infrequent or only occur in a closed class of words in the language (e.g., English [D], which only shows up in function words). The effect is even worse when the context for the extraction of CVC sequences is restricted further (for instance, to stressed syllables only). Because there are not enough instances to calculate the χ² (and φ) values for some consonant combinations, only those individual consonant pairs whose frequency of occurrence (or rather co-occurrence with consonants other than the one given) is high enough are correctly clustered. This is reflected in the results for the clusters on a lower level, where consonants which rarely co-occur in such sequences are correctly grouped together, whereas a partitioning on a higher level does not yield the expected result because not all consonant combinations are frequent enough to compute the statistical values. Some consonants even had to be omitted from the clustering because they only occurred in a small number of cases in total.
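The reliability criterion cited in footnote 32 (expected frequencies above five in all four cells of the contingency table) can be checked before a φ value is admitted to the distance matrix. The sketch below is one way to do this; the function names and the decision to reject empty tables are assumptions, and the source does not say whether such a filter was applied automatically.

```python
def expected_frequencies(n11, n10, n01, n00):
    """Expected cell counts of a 2x2 table under independence."""
    n = n11 + n10 + n01 + n00
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    return [row1 * col1 / n, row1 * col0 / n,
            row0 * col1 / n, row0 * col0 / n]

def phi_is_reliable(n11, n10, n01, n00, minimum=5):
    """True if every expected cell frequency exceeds the minimum."""
    if n11 + n10 + n01 + n00 == 0:
        return False          # assumption: an empty table is never reliable
    return all(e > minimum
               for e in expected_frequencies(n11, n10, n01, n00))

frequent_pair_ok = phi_is_reliable(10, 10, 10, 10)
rare_pair_ok = phi_is_reliable(1, 2, 3, 200)   # rare consonants: too sparse
```

A pair of rare consonants fails the check even in a large sample, because the expected count of the cell in which both consonants occur together stays well below five.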

Another problem with the approach of clustering consonants with respect to their place features is the fact that sometimes combinations of different categories are avoided that not only refer to the place feature of the consonant. We have seen before that languages do not only show a tendency for the avoidance of homorganic consonants in such sequences but also tend to have consonants with different manner and voice features (cf. Section 7.2.6; Twaddell 1939, 1940). This is also reflected in the clustering result of the English word forms where the glides [w] and [j] are most likely grouped together because of their shared manner feature rather than their place categorization.

If the avoidance of certain consonant pairs does not only rest upon the place feature alone but on a combination of feature values, the clustering procedure has to decide which feature is the strongest in the combination. This is not always necessarily the place feature (see, for instance, the pair [D, v] in Figure 7.5, whose members share the feature [±voice]). For some consonant pairs both the manner and the place feature yield the same result in this respect (e.g., the glides or the stop counterparts), which might explain their correct clustering in all the dendrograms.

In general, the method yields interesting results for the languages that I considered. However, it is not robust and mainly suffers from the fact that not all sounds are equally well represented in the relevant contexts. In particular, more sonorous sounds (such as the liquids) occur more frequently in vowel-adjacent contexts, which might be the reason for the consistently correct clustering of l and r. But even when the context is extended to cover those consonants that are not adjacent to the vowel in consonant clusters, similar inequalities with regard to the occurrence of certain consonants are still to be seen. Nevertheless, the results for Maltese are very promising and encourage future work to improve the methodology.

32 It is recommended for a reliable calculation of the χ² (and φ) value that the expected frequencies for all four cells in the contingency table are more than five (cf. Bortz 2005).

7.3.3 Cross-linguistic sample of word forms from the ASJP database

The clusterings that have been presented in the preceding sections yield promising results, yet also reveal inadequacies regarding some of the consonants in the data. These inadequacies might be due to several reasons, some of which have to do with shortcomings in the methods that are used or the contexts from which the sequences are extracted (as explained above). Others might be caused by the language under investigation and the number of consonants that adhere to the principle. Some of the consonants in a language might not entirely conform to the tendency because of their restricted occurrence in the words of the language. A comparison to the distinctive feature approach might illustrate this. Distinctive features are mostly defined in terms of the characteristics of the articulators, which are the same for all languages. Yet they do not always figure in the phonology of a particular language. Rather, languages have a choice as to how many and which features are needed in order for all sounds to be distinctive in their set of features (cf. Clements 2009). Similarly, languages could require only a subset of all consonants to be in accordance with the principle while others violate the principle. The differences in the strength of the tendency for SPA in the languages presented in Section 7.2.1 would suggest such an assumption. In the English result, it is mainly the stops that are clustered correctly on a lower level, while consonants from other manner categories are less well behaved in this respect.

If this is indeed true, then the expectation would be that a clustering on the basis of a representative sample of word forms across a large number of languages would result in a perfect clustering where such inadequacies do not play a role or can be neglected as noise in the data.

To test this assumption, the clustering procedure that was described above has been run on all the entries in the ASJP database, which I consider to be a more or less representative sample of word forms occurring across languages (cf. Section 3.5.2).

The result, which is shown in Figure 7.6, reflects a clustering into three main partitions that accurately represent the three major places of articulation of labial, coronal and dorsal consonants.33 An initial partitioning into two major clusters at a distance of 0.54 suggests a closer relationship of the labial and dorsal place features as opposed to the coronal consonants, which we have already seen in the result of the clustering of Maltese consonants (ignoring the special cluster of coronal sonorants). The same effect can be seen in Weitzman’s (1987) results for Hebrew and Arabic roots, where coronal and non-coronal consonants are clearly separated on the MDS plots. Even at smaller distances (or heights in the subtrees) the consonants are mostly grouped as would be expected from a classification based on distinctive features. The voiced and voiceless counterparts form the tightest clusters for the stops, except for [b] and [p], where [m] is first grouped together with [b]. The fricatives, on the other hand, show a different picture. However, this might be due to the fact that the ASJP orthography does not always distinguish manner of articulation categories sufficiently and, especially in the case of fricatives, merges them with other symbols from other categories.34

33 The only misclassified consonants are the voiced uvular stop [G], which ends up in the coronal cluster, and the dental nasal [n̪], which is grouped together with the dorsal consonants.

34 For instance, the bilabial fricatives [B] and [F] are merged together with their corresponding stops

Figure 7.6: Dendrogram (Ward’s method) for all the word forms across languages in the ASJP database. The most salient distinction in the dendrogram is between coronal (top) and non-coronal (bottom) sounds. For the non-coronal sounds a further clustering into labial and dorsal consonants can be seen. The only consonants which are wrongly clustered in the dendrogram are the voiced uvular stop [G], which shows up in the coronal cluster, and the dental nasal [n̪], which is grouped together with the dorsal consonants. IPA sounds belonging to the same ASJP symbol are separated by commas.

The same dissimilarity matrix on which the dendrogram in Figure 7.6 is based has been used as the input for an MDS method. Figure 7.7 shows the first (x-axis) and second (y-axis) dimension of the resulting MDS for all consonants in the ASJP data set. The partitioning in the first dimension distinguishes between non-coronals (left) and coronals (right), whereas the second dimension makes a further distinction into labials (top) and dorsals (bottom) within the non-coronals. It is remarkable that all

