
Arguments Good and Bad

From the document The Language of Life (pages 29-35)

The theory of evolution is haunted by an image and an observation: the first, that of the hapless chimpanzee, typewriter-bound, endeavoring, quite by chance, to strike off the first twenty lines of Hamlet's soliloquy; the second, the comment of an anonymous Jansenist logician, who remarked, quite sensibly, "that it would be sheer folly to bet even ten coppers against 10000 gold pieces that a child arranging at random a printer's supply of letters would compose the first twenty lines of Virgil's Aeneid." Image and observation do not quite cohere into a single argument: it is clear in neither case how the imagined stochastic experiment is to stop. Still, the notion of randomness yet lies at the center of evolutionary thought, and there it sits, toad-like and croaking.

On the simplest and most intuitive conception of probability, what can occur is weighted against the background of what might occur: five diamonds: all other combinations of the cards. In poker, there are 2 598 960 five-card hands, but only 5 148 flushes. It is their ratio that one might expect to observe as cards are actually dealt; but in the longest of long runs, the passage to the limit gives content to the intuitive idea that a number of successive trials will converge to a particular real number: 0.002, for example, if flushes are being counted.
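The poker figures in this paragraph can be checked directly. What follows is a minimal sketch in Python, assuming a standard 52-card deck and counting straight flushes among the flushes (which is what the figure of 5 148 implies); the function names are illustrative and not the author's.

```python
from math import comb

def five_card_hands() -> int:
    """Number of distinct five-card hands drawn from a 52-card deck."""
    return comb(52, 5)

def flushes() -> int:
    """Hands whose five cards share one suit (straight flushes included)."""
    return 4 * comb(13, 5)

total = five_card_hands()
flush_count = flushes()
ratio = flush_count / total  # the long-run frequency the text rounds to 0.002

print(total, flush_count, round(ratio, 5))
```

The ratio is a property of the counting alone; the claim about convergence in the long run is a separate, limiting statement about repeated deals.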

One of the curiosities of the very notion of probability is the inescapability of the improbable. The laws of thermodynamics, to take a notorious example, are anisotropic: they go in one direction; downhill, as it happens, a circumstance with what appears to be overwhelming personal support. Statistical mechanics provides a brilliant and persuasive explanation for thermodynamic laws; yet Poincaré demonstrated, in an absurdly easy proof, that any statistical mechanical configuration, of whatever degree of implausibility (k molecules of gas, for example, occupying 1/V of the total volume V of a finite and bounded container), is bound to recur, in all its vividness, poignant symmetry, and complexity, given enough time. Physicists often explain the discrepancy between thermodynamics and statistical mechanics by arguing that the time involved is very long. No doubt.

The evolution of life on this planet is, as Darwin realized, not a hurried affair. Early on, Darwinian biologists got rid of the theological limits set to the age of the Earth by Bishop Ussher and others in the seventeenth century; the scale within which Darwinian evolution might have worked is bounded by perhaps five billion years. Nineteenth century biologists assumed that whatever else one might say about Darwinian biology, it would not fail for lack of time; this thesis twentieth century biologists have carried over intact.

Five billion years is apt to seem long if one is counting the minutes; but it is not long enough to sample on a point by point basis a space whose cardinality is roughly $10^{15}$ (touching base with a new point at every second, say); and yet there are 20^[exponent illegible] possible proteins, a number larger by far than the expected life of the

22 D. Berlinski

Everyone will concede that in the games of whist or bridge any one particular hand is just as unlikely to turn up as any other. If I pick up and inspect a particular hand and then declare myself utterly amazed that such a hand should have been dealt to me, considering the fantastic odds against it, I should be told by those who have steeped themselves in mathematical reasoning that its probability cannot be measured retrospectively, but only against a prior expectation ... For much the same reason, it seems to me profitless to speak of natural selection's 'generating improbability' ... it is silly to be thunderstruck by the evolution of organ A if we should have been just as thunderstruck by a turn of events that had led to the evolution of B or C instead."

Medawar is roughly right about probability: the fallacy to which he refers is the error of retrospective specification; and consists precisely in reading back into an original sample space information revealed only on the realization of a particular event. In poker, a deal distributes n hands of equal probability: 1 in 2 598 960, as it happens. This sample space is retrospectively specified if one hand in particular is contrasted with the full set of 2 598 959 hands that remain, and probabilities assigned to the partition so created; what appears initially as one among equiprobable events becomes under retrospective specification an improbable event in a sample space of only two points. It is embarrassing for an author to point such things out. Still, Medawar is wrong in the general conclusions that he draws from this paragraph. Card sharps and statisticians are little interested in the set of all five-card sequences. In poker, sequences are initially partitioned into equivalence classes of uneven size: a royal straight flush, of which there are four, a straight flush, four of a kind, a full house, a straight, three of a kind, two pairs, and, then, finally, whatever is left: the vast majority. There are four ways to achieve a royal straight flush; many more ways in which to realize a full house.
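The contrast between a single hand and a class of hands specified in advance can be made concrete. The sketch below is an illustration under the usual rules of five-card poker, not part of the original argument; the variable names are hypothetical. It counts a few of the classes named in the text and compares their probabilities with that of any one particular hand.

```python
from math import comb

TOTAL_HANDS = comb(52, 5)

# Class sizes fixed by the rules of poker before any card is dealt.
royal_straight_flushes = 4                       # one per suit
four_of_a_kind = 13 * 48                         # rank of the four, then any fifth card
full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)  # rank and suits of the triple, then of the pair

def probability(class_size: int) -> float:
    """Probability that a dealt hand falls in a class of the given size."""
    return class_size / TOTAL_HANDS

single_hand = probability(1)  # every individual hand is equally improbable

for name, size in [
    ("royal straight flush", royal_straight_flushes),
    ("four of a kind", four_of_a_kind),
    ("full house", full_houses),
]:
    print(f"{name}: {size} hands, {probability(size) / single_hand:.0f} times one hand")
```

Every individual hand has the same probability; it is only the prior partition into classes of uneven size that makes one outcome more remarkable than another.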

Since they are specified in advance, partitions in poker carry no taint of retrospection; and plainly, in poker there is only a rough correlation between the internal character of sequences within a partition and their payoffs: what is important here, as elsewhere, is the classification, which is very largely arbitrary.

Medawar's argument, on its face, thus involves rather an uninspiring mistake, but it is not yet a mistake in evolutionary thought. The human eye, a chastened Medawar might argue, turning his back on his own analogy between life and the cards, represents one arrangement of its constituents: any other might have done as well. In admiring the structure that results, we suffer from misplaced awe, like a toad contemplating a dog. Does this argument carry conviction eye-wise? Is it reasonable to suppose that any other arrangement of the eye's constituents would result in an eye? In anything at all? The question sounds an unavoidably Aristo-


Linguistics is possible if only because human beings have strong and reliable intuitions about natural languages. The polypeptides are alien strings, accessible only through an arduous act of the biochemical imagination. Grammar effects a segregation of strings in a language-like system; beyond grammar, aloof, untouchable, there is meaning; the two concepts do not coincide. Some grammatical strings, in a natural language, at least, are grammatical and meaningless; others, meaningful but ungrammatical; but meaning and grammar belong together, yoked pairs in the same corner of some dimly understood conceptual space. An algebraic system of strings in which no distinctions of meaning and grammar are recognized is profligate; and pointless because of its profligacy.

In a preanalytic sense, the concept of meaning indicates a kind of coherence; and has a usefulness of application in domains other than language. A life well-spent is meaningful: its parts and patterns are ordered; full with life, biological creatures are filled with meaning, a kind of blunt, irrefrangible purpose; in death, this meaning disappears, and what is left, the corpse and its grim constituents, appears all at once to lose the integrity of the creature itself, and becomes, instead, a thing among other things, an object merely. To the vitalist, living creatures instantiate some unique property that remains stubbornly unseen elsewhere (in the domain of objects studied by mathematical physics, for example); in death, this property vanishes, like a fluid evaporating. In mechanistic thought, the passage from life to death is rather like a phase transition, a singularity of biological parts that correspond to living systems. The unalterable fact that living systems die and hence do not persist indicates that some of their complexions fail to preserve life and hence meaning; in fact, the number of meaningless complexions must be significant: most of the arbitrary rearrangements of a complex organ (a mammal, say) result in nothing more than a botch, a circumstance with which every surgeon is familiar.

The Central Dogma of molecular biology establishes a relationship between strings of nucleotides and strings of proteins; to the extent that the whole of a biological organism may be resolved into its protein-like parts, the Central Dogma establishes a larger, more indirect, relationship between molecular biological order and order in the larger sense of Life. This relationship has an inverse: if only certain forms of life have meaning, this, too, is reflected, as it must be, in the universe of molecular biological strings (on the level of string ensembles, for example). If certain protein ensembles are meaningful, and not others, this suggests, but does not imply, that the same distinction is palpable on the level of the individual proteins themselves. The term viable I mean as a biological coordinate to the Siamese concepts of meaning and grammar; a protein is viable only when it achieves a certain minimum level of biological organization and usefulness. What level? What kind of organization? Usefulness in what respect and to what degree? Who knows?

Full loads, fair loads, fair samples

In a natural language, sentences decompose to words; words to letters. Grammatical constraints hold weakly at the level of English words. The set of all word-like combinations of English letters of fixed length n, I shall say, makes up a full load; the set of all grammatical words, a fair load. Within molecular biology, a full load corresponds to all possible proteins of normal length: a set whose cardinality is 20^[exponent illegible]. To the fair loads in English correspond the viable proteins in molecular biology. How large is the biological fair load? Again, who knows? Whatever its ultimate size, those proteins that have already been synthesized in the course of biological history are viable if anything is: nothing succeeds like success. This set is a fair sample of a fair load. Its size Murray Eden calculates at 20^[exponent illegible]. The task that he sets himself is the infinitely delicate one of drawing inferences about the fair load from its fair sample.[15]

Between the fair sample of a fair load, and the fair load itself, is the difference between what is and what might be; between the fair load and the full load, the difference between biology and mathematics. In English, the difference between the fair load and the full load is as absolute as death. Any two words of English thus resemble each other more than they are likely to resemble a word generated at random from the letters of the English alphabet. In the case of the polypeptides, Murray Eden writes:

Two hypotheses suggest themselves. Either functionally useful proteins are very common to this space, so that almost any polypeptide one is likely to find has a useful function to perform, or else the topology appropriate to this protein space is an important feature of the exploration: that is, there exists certain strong regularities for finding paths through this space.

In asking whether the viable proteins are common in the space of all polypeptides, Eden is asking, in effect, whether the fair sample is marked by discernible statistical regularities. "We cannot now discard the first hypothesis," he adds, "but there is certain evidence which seems to be against it: if all polypeptide chains were useful proteins, we would expect that existing proteins would exhibit very different distributions of amino acids." Statistical tests appear to show that pairs of proteins are drawn from a common stock. His example involves the alpha and beta human hemoglobin chain. One form of hemoglobin has 146 amino acid residues, the other 140. The two chains may be set down, side by side, and matched, residue by residue. They agree at 61 points: there are 76 points at which they differ, and 9 points at which no match is possible because the chains are not of the same length. It is plausible that one chain was derived from the other, or that both were derived from a common ancestor. What is curious about these pairs of proteins, however, is the fact that even though the chains do not agree completely in the order of their amino acids, they do agree in their distribution; reason enough, Eden argues, to suppose that the proteins themselves are drawn from a statistically significant fair sample.
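The difference between agreement in order and agreement in distribution can be illustrated with a small sketch. The two strings below are hypothetical toy sequences, not hemoglobin chains, and the naive position-by-position pairing is an assumption made for simplicity; Eden's own matching procedure is not described in the text. The function names are likewise illustrative.

```python
from collections import Counter

def positional_agreement(a: str, b: str) -> tuple:
    """Return (matches, mismatches, unmatched) for a naive residue-by-residue pairing."""
    overlap = min(len(a), len(b))
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches, overlap - matches, abs(len(a) - len(b))

def composition_distance(a: str, b: str) -> float:
    """Total variation distance between residue frequencies (0 means identical distributions)."""
    freq_a, freq_b = Counter(a), Counter(b)
    residues = set(freq_a) | set(freq_b)
    return 0.5 * sum(abs(freq_a[r] / len(a) - freq_b[r] / len(b)) for r in residues)

# Hypothetical toy strings over the one-letter amino acid alphabet.
chain_1 = "VLSPADKTNVKAAW"
chain_2 = "LVSAPDTKNVAKWA"

print(positional_agreement(chain_1, chain_2))
print(composition_distance(chain_1, chain_2))
```

In this toy case the two strings differ at most positions yet contain exactly the same residues in the same proportions, which is the kind of pattern the passage attributes to the hemoglobin chains.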

The criticism of this historically important argument, I leave as an exercise.

Delicate inferences

In What is Life?, Schroedinger argued that living systems must have recourse to what he dubbed an "aperiodic crystal" in order to store information. Crystals are repetitive, regular, and information poor; the order of a living system is specific, irregular, information rich. There is a certain splendid effulgence to the vocabulary of theoretical biology that it would be uncharitable not to cherish.

H.P. Yockey identifies order with Kolmogorov complexity; and so does R.M. Thompson, a mathematician who in writing on theoretical biology alternates between information theory and a pious endeavor to communicate to the reader his appreciation for the many faces of Krishna.[16] On the other hand, G.J. Chaitin and R.M. Bennett identify biological order with algorithmic simplicity. A division of intuition on so fundamental a point may suggest a degree of conceptual confusion approaching the schizophrenic.

If biological words are characterized by a high degree of Kolmogorov complexity, could time and chance have combined to discover a structure comparable, say, to cytochrome c or any of the modern hemoglobin chains? This is the question raised by the redoubtable H.P. Yockey: the problem as posed has but two parameters.[17] In the beginning the primeval soup, which I always imagine as rather a viscous, Borscht-like fluid, contained perhaps [number illegible] amino acid molecules.

There is, inevitably, an element of fantasy to all quantitative calculations of this sort. At each second, over the course of $1 \times 10^{9}$ years, an indefatigable stochastic Deity arranges and then rearranges the amino acid residues in sequences whose length $N = 101$. There are

$$20^{101} \approx 2.535 \times 10^{131}$$

such sequences. The odds against discovering any one in particular thus stand at 1 in $2.535 \times 10^{131}$. Not all residues, however, are equally probable. Save for a very large set of strings of small probability, the number of sequences of length $N$ is

$$a^{NH},$$

where

$$H = -\sum_{j} p_j \log_a p_j .$$

Here $p_j$ measures the probability of the $j$th residue, and $a = 2$, so that $H$ is measured in bits.

In the end (the details are not important to my argument) Yockey concludes that

$$H = 4.153 \ \text{bits/residue}; \tag{9.12}$$

the number of 101-place sequences is

$$2^{NH} = 2^{101 \times 4.153} \approx 1.85 \times 10^{126}.$$

"Information theory," he remarks, "shows that, in this case, the actual number of sequences is smaller than the total possible number by a factor of $10^{5}$."
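The arithmetic behind these figures is easy to reproduce. The following is a minimal sketch, assuming only the values quoted in the text (N = 101, twenty amino acids, H = 4.153 bits per residue) and the standard count of 2 raised to the power N times H for sequences of non-negligible probability; the helper `sci` is a hypothetical formatting function, and logarithms are used to avoid overflowing floating-point numbers.

```python
from math import log10

N = 101        # sequence length used in the text
ALPHABET = 20  # number of amino acids
H = 4.153      # bits per residue, as quoted from Yockey

# Work with base-10 logarithms rather than the astronomically large numbers themselves.
log10_total = N * log10(ALPHABET)  # log10 of 20^101, all sequences
log10_reduced = N * H * log10(2)   # log10 of 2^(N*H), sequences of non-negligible probability
log10_factor = log10_total - log10_reduced

def sci(log10_value: float) -> str:
    """Format a number given by its base-10 logarithm in scientific notation."""
    exponent = int(log10_value // 1)
    mantissa = 10 ** (log10_value - exponent)
    return f"{mantissa:.3f} x 10^{exponent}"

print("all sequences:     ", sci(log10_total))
print("reduced count:     ", sci(log10_reduced))
print("reduction factor:  ", sci(log10_factor))
```

The reduction factor comes out slightly above one hundred thousand, in line with the factor of ten to the fifth quoted from Yockey.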

Now there are, in all, $3.8 \times 10^{61}$ families of cytochrome c sequences; in order to obtain any one of them by chance, Yockey argues, it would be necessary to repeat an elementary stochastic experiment 3.15 × 10^[exponent illegible] times on $10^{8}$ separate planets "in order to have a reasonable expectation of selecting at least once a member of the ensemble

[...] plex constituents or atoms; I have suggested something similar in arguing that the proteins inherit a grammatical distinction from the structures that they constitute. Kolmogorov complexity, however, is ill-defined on any level of biological organization past the molecular; but even if a mammal or a mollusk could be represented as a binary string, nothing suggests that those strings would be high in Kolmogorov complexity. Quite the contrary. Life in the large, on the level of the organism itself, is organized with what appears to be brisk algorithmic efficiency. Living creatures are simple in the sense of Kolmogorov complexity; but complex under the classification of their complexions. In this sense, they behave much as a language-like system. This observation is compatible with the thesis that protein strings are, nonetheless, high in Kolmogorov complexity; but it is compatible, too, with the contrary thesis that protein strings reflect the complexity of life by means of their organization

[...] cytochrome c its position of statistical distinction? "Because of the very fundamental function of the cytochromes," Yockey writes, "... the histones and

[...] curious remark inasmuch as words in a natural language are low, and not high, in Kolmogorov complexity. Still, I am sympathetic to the drift of this line; but the difficulty goes beyond the problems of an imperfect analogy. Certain classes of proteins, Yockey argues, are necessary for life. Such are the information-rich, complex strands; other strands are specific in the limited sense that they are statistically unlikely: "only a tiny fraction of the (avail-


Life might have made do with any other protein of comparable complexity. If by specificity, Yockey means statistical unlikelihood in a uniform sample space (the space of all complex proteins, for example), his surprise at the emergence of cytochrome c is attributable to retrospective specification; if not, what then is specificity, the mysterious middle term to his argument? If the specific proteins have some independent description, Yockey does not provide it; and their size, apart from suggesting that it is low, he does not calculate.
