
Forschungsgruppe Große Technische Systeme

des Forschungsschwerpunkts Technik - Arbeit - Umwelt des Wissenschaftszentrums Berlin für Sozialforschung

FS II 90-502

"Testing - One, Two, Three ... Testing!"*

Testing: A Test Case for the New Sociology of Technology

by

Trevor Pinch

* The title of this paper uses an example of testing of a technology first brought to my attention by Stuart Blume. When roadies of pop groups test the amplification equipment they can often be heard to say "Testing, One, Two, Three... Testing". It is largely in response to Stuart Blume’s and Arie Rip’s criticism of an earlier version of this paper for being too limited in its scope that I have sought the most general account of testing possible. I have benefitted from discussions at the ISA meeting, Madrid, 1990 and with Bernward Joerges and the Large Technical System Group, WZB.

Wissenschaftszentrum Berlin für Sozialforschung gGmbH (WZB)


"TESTING __ ONE, TWO, THREE __ TESTING!" - TESTING: A TEST CASE FOR THE NEW SOCIOLOGY OF TECHNOLOGY

Summary

In this paper the importance of testing as a strategic research site in the new sociology of technology is discussed. The basic characteristics of tests as a process of projection will be outlined. We increasingly live in the age of the test and the discourse of testing is pervasive in modern society. It is hoped to widen out the debate from the confines of the sociology of technology to address the phenomenon of testing in general. Examples to be discussed include rock groups testing their amplification equipment, testing of O-rings on the space shuttle Challenger, the testing of computer software and the discourse of testing in a series of German cigarette advertisements.

"TEST __ EINS, ZWEI, DREI __ TEST!" - TESTS: EIN TESTFALL FÜR DIE NEUE TECHNIKSOZIOLOGIE

Zusammenfassung

In diesem Aufsatz wird die Bedeutung des Testens als Forschungsgegenstand und seine Einbettung in die neue Techniksoziologie erörtert. Die wesentlichen Merkmale, die Tests als Projektionsprozesse verstehen lassen, werden skizziert. Wir leben heute im Zeitalter der Tests, und die moderne Gesellschaft ist von Diskursen über Tests durchdrungen. Es steht deshalb zu hoffen, daß die Debatte die Grenzen der Techniksoziologie im engeren Sinne verläßt, um sich dem Phänomen des Testens allgemein zuzuwenden. Die zur Illustration herangezogenen Beispiele reichen von Tests der Verstärkeranlagen bei Rockkonzerten, der O-Ringe bei der Raumfähre "Challenger", über Tests von Computer-Software bis hin zu einigen Überlegungen zum Phänomen des Testens in einer Werbeserie für eine deutsche Zigarettenmarke.


Testing as a Test Case in the New Sociology of Technology

Testing is a strategic area for any study of technology to focus upon. This is because testing can be seen as the attempt formally to specify and identify how the technology in question will perform, is performing or has performed. For the so-called new sociology of technology, which takes as its target the position of technological determinism1 (e.g., Bijker, Hughes and Pinch 1987; Elliott, 1988), testing is of particular relevance (MacKenzie, 1989; Pinch, Ashmore and Mulkay, 1991). Test data are usually thought of as providing access to the pure technical realm - a means whereby the immanent logic of a technology can be revealed. At the very least, the technological determinist’s account will be replete with formal assessments of a technology’s performance.2

Put most simply: the aerodynamic properties of, say, an aircraft wing can be determined by testing a scale model in a wind tunnel; these data can be compared with those produced by another design, and then both designs can be assessed as to which gives the ‘better’ performance.3 The technological determinist’s argument is the claim that we ‘know’ in a reliable, objective and independent way how a technology will perform and this provides ‘the bottom line’ for future developments.

In view of the importance attached to testing in the standard account, it is a prerequisite for any alternative social constructivist account of technology to address the issue of testing. As Mulkay (1979) has argued, it is not enough to show that a technological artefact like a television set can be given different meanings by different social groups: what has to be shown is that the very working of the TV set can be subject to sociological analysis. Testing, in that it establishes workability, is an important test case for the new sociology of technology to tackle.

1 Bimber (1990) distinguishes three different meanings given to technological determinism. It is technological determinism in the guise of what he refers to as ‘Logical Sequence Accounts’, where the technology is seen as developing according to independent and universal laws of nature, which has most exercised the social constructivist approach to technology.

2 A typical example of the genre is C. Ellam, »Developments in Aircraft Landing Gear, 1900-1939«, Transactions of the Newcomen Society, 55, 1983-4, 48-51. I am grateful to Walter Vincenti for drawing my attention to this article as an example of a technological development being totally conditioned by the internal logic of the technology reacting to increased levels of performance.

3 Of course, this version of technological development is too simple because the design of any complex piece of technology is dependent upon many different factors. The aerodynamic properties of the wing have to be balanced against its strength, weight, the cost of the material, and so on (Vincenti, 1986).

Testing as a Research Site

The testing of technology is a promising research site for sociological inquiry. This is because in testing something is at stake. Unlike merely fiddling around or getting a feel for how something works, in testing expectations are built around a certain outcome arising from the test. In other words, testing is usually carried out with a purpose in mind.

Although the purpose may not always be formally specified, on some occasions it will be clear to participants just what sorts of outcomes are to be expected. If we turn to the familiar example of roadies testing the amplification equipment of rock groups before a concert, then if »Testing - One, Two, Three . . . Testing!« is heard booming around the stadium, this is usually taken by the sound engineers as an indication that the amplification equipment is working. If nothing is heard then something has gone wrong.

Many tests are performances which can be witnessed by others. For instance, a new aircraft or a boat may be demonstrated to prospective purchasers (as was the case for the Turbina - the first turbine powered boat [Constant, 1980]). The witnesses or audience to a test may themselves have different interests in the outcome.4 Within some companies, it is during testing that competing interests in the development of the technology are manifest. Such is the case for the development of the Phantom F-4 Fighter plane discussed by Bugos (1989). Bugos identifies what he refers to as ‘Integrative Testing’ as the key research site where sets of technical specifications are negotiated. In a technically complex project such as building a modern fighter aircraft many different groups of engineers are involved. Often such engineers push for the maximum technical performance from the components for which they are responsible. This can lead to problems. For instance, the latest and most sophisticated radar technology may not match the capabilities of the aircraft’s weapons systems. It is during integrative testing that such technical specifications are negotiated between the different interest groups.

4 The importance of the audience for a public test or demonstration and the role they play in interpreting the results is discussed by Collins (1988). The importance of witnessing in the natural sciences has been stressed by Shapin and Schaffer (1985).

Testing in General

As well as playing a role in the development of technologies, testing and the discourse of the test are increasingly prevalent in modern societies. IQ tests, genetic tests, mid-term tests, MOT tests, tests for radio-activity, HIV tests, pregnancy tests - we live in the age of the test (Nelkin and Tancredi, 1989). Perhaps an indication of the extent to which the discourse of testing has become part of common parlance can be seen from its use in advertising. The format »In four out of five tests Brand X was found to be superior to Brand Y« has long been with us. One of the more intriguing uses of the discourse of testing in advertising is the current (as of August 1990) campaign mounted in West Germany for a brand of cigarettes known as ‘West’. Advertising hoardings show two people (Figs 1, 2 & 3), one of whom is being offered a cigarette by the other. The caption reads, ‘Test the West’. This slogan is to be found written in Russian at the first S-Bahn station encountered by East Berliners when entering West Berlin.5 In a follow-up campaign for the brand name of ‘West Lights’, the slogan ‘Test the Lights’ has been used.6

5 In this particular version of the advertisement (Fig. 3) what looks like a caricature of a Russian General is being offered the cigarette by a young smartly dressed woman.

6 Another pertinent example of the importance of the discourse of testing has come during the current Persian Gulf crisis. At a recent lecture on the topic the speaker (a member of the US team at the United Nations) referred to the crisis as the »first real test for the free world in the post-cold war era«. The lecture »The Crisis in the Gulf« was delivered at Cornell University, Politics Department on October 15, 1990 by Shibley Telhami.


By examining testing within the confines of technology I hope to be able to obtain insights which then can be given a wider applicability. As part of this ambition I will return to the ‘Test the West’ advertisements in the last part of the paper.

Prospective, Current and Retrospective Testing of Technologies

It is useful to distinguish between three types of activity that are often grouped together under testing. The commonest form of testing is when a new technology is tested before being introduced. This prospective testing may be carried out on scale models (as is often the case for aircraft parts), on mock-ups, on prototype models or on the final manufactured product before it is released into wider use. The idea of such testing is to find if the design is feasible, whether the technology works as specified in the design, whether different components can be integrated, and to monitor the performance for a range of putative uses. The testing is prospective because it takes place before the technology is formally released into use.

Current testing occurs once a technology is up and running. It may be carried out in order to assess performance, to make improvements, to compare it with rivals, for legal and safety reasons (as in British MOT tests of cars), or to ascertain any special operational difficulties. For instance, in Britain when pollution of a river is suspected the water authorities will often take ‘test samples’ from reservoirs for analysis. The assumption is that a limited number of these samples will give information on how the system in general is functioning. Another type of current testing occurs when an established technology is put into local operation. Such testing is carried out in order to identify potential failures because something has been done to disrupt the normal smooth working of the technology. For instance, we may have moved a machine or system to a new location or installed a new part. This is the type of testing involved in the music amplification system case. The amplification system is an established technology which has worked adequately on previous occasions at other concerts: the purpose of the test is to establish if all components work on this particular occasion, when the delicate equipment has been reassembled after a lengthy journey.7 The test simulates how the equipment will be used later during the concert.

The last sort of testing to be distinguished is retrospective testing. This occurs after a major accident or malfunction has occurred in a piece of technology. Tests are run to try and find out what went wrong. For example, after the space shuttle Challenger disaster many tests were carried out to investigate O-ring impairment and its relationship with temperature. Perhaps the most famous of these was Richard Feynman’s ‘test’ at the Rogers Commission when he clamped a piece of O-ring in a beaker of iced water to demonstrate the effect of temperature on the O-rings (see Gieryn and Figert, forthcoming). Forensic science is an area where many such post facto tests are carried out to establish the cause of road accidents or the modus operandi of criminal acts (Smith and Wynne, 1989).

In the course of the life of a piece of technology all three types of testing may be found. For instance, in the case of the space shuttle, some O-ring tests were carried out before the solid rocket boosters were built, some were carried out during the operation of the technology when the problem of O-ring blow-by was first identified and some were carried out after the fatal Challenger accident.

Testing as Projection

Now that some rudimentary distinctions have been offered, I would like to focus on the general properties of testing per se. My discussion here parallels that outlined in the excellent paper of MacKenzie (1989). Although they occur in different circumstances, the types of test described above can be seen to follow a similar pattern. A set of activities is carried out in a circumscribed environment which is designed to produce an outcome which gives us information as to the operation of the technology.

7 Of course, another purpose of the test is to obtain the best sound balance, but this can only be achieved after it has been determined that the equipment is functioning properly.


Testing always proceeds by a process of projection. If a scale model of a Boeing 747 airfoil performs satisfactorily in a wind tunnel we can project that the wing of a Boeing 747 will perform satisfactorily in actual flight. If a microphone works now we can project that it will work later when being held by Mick Jagger. If an O-ring fails in tests at low temperatures we can project that the O-ring in the right-hand solid rocket booster failed on the fatal morning of the Challenger launch when cold temperatures prevailed.

The act of projection, whether from the present to the future, from the present to the past, from the particular to the general, from the small to the large, or from the large to the small (as in miniaturization) depends crucially upon the establishment of a similarity relationship. It is assumed that the state of affairs pertaining to the test case is similar in crucial respects to the state of affairs pertaining to the actual operation of the technology. In other words, to all intents and purposes the behaviour of an airfoil tested in the wind tunnel must be similar to the behaviour of a Boeing 747 wing at 30,000 feet and at four-hundred miles an hour; the microphone in the hands of the roady must behave similarly to when it is held later in the hands of Mick Jagger; and the O-rings being tested by Richard Feynman at a press conference must have similar properties to those on the solid-fuel rocket boosters used on the Challenger. It is the assumption of this similarity relationship which enables the projection to be made and which enables engineers warrantably to use the test results as grounds that they have found out something about the actual working of the technology.

Testing in Technology and Experimentation in Science

Although it is a hazardous process arguing for similarities between science and technology (Russell, 1986; Pinch and Bijker, 1984, 1986; Pinch, 1988) and certainly any simple-minded equivalence between the two endeavors must be rejected, it is nevertheless the case that parallels can usefully be drawn in some domains. One such area is that of technological methodology and design, broadly conceived of to include testing (Vincenti, 1979; Constant, 1983; MacKenzie, 1989). This area of technology in many respects matches what scientists do at the laboratory bench. Although scientists do not straightforwardly build scale models or prototypes in the way that engineers do, and the naive inductive logic of parameter variation is rarely used in science, technological design and methodology and lab work in science share the characteristic that they are both carried out by highly-trained skilled practitioners who are engaged in the production of knowledge. In the case of technology it is knowledge about specific technological artefacts and processes; in the case of science it is knowledge about the natural world. It is in this area, of the production of knowledge amongst a group of specialists - in short the social production of knowledge - that work in the sociology of scientific knowledge has most salience for technology.

It would be tedious and unnecessary here to go over the claims and findings of the modern sociology of scientific knowledge (see Barnes and Edge, 1983 and Mulkay, 1979 for the general arguments). Suffice it to say that our understanding of experimentation - the scientist’s equivalent of testing - has undergone a radical transformation. Experiment, rather than being seen as the means to verify, confirm or refute scientific theories, or as providing the observation statements upon which correspondence theories of truth may be built, is treated as a process of argumentation and persuasion: human agency in the production of agreement about the content of nature has become the principal locus of enquiry (Latour and Woolgar, 1979; Knorr-Cetina, 1983; Pickering, 1984; Lynch, 1985; Collins, 1985; Shapin and Schaffer, 1985; Pinch, 1986; Gooding, Pinch and Schaffer, 1989).

The relevance of the sociology of scientific knowledge to our understanding of testing has been recognised by Constant (1983) in his work on the testing of water turbines and dynamometers and further developed by MacKenzie (1989) for the case of the testing of ballistic missiles. In order to understand the thrust of much of this work we need to go back to a general point made by Wittgenstein - namely that a relationship of similarity or for that matter a relationship of difference does not rest out there waiting for us to find it, but depends upon us classifying two things as being similar or different. In order to say two things are similar we bracket, or place in abeyance, all the things that make for possible differences. In other words, we select from the myriad of possibilities the relevant properties whereby we judge two things to be similar. We do this usually in a taken-for-granted way such as when we are able to recognize a person’s face as being ‘the same’ despite viewing it in different profiles and different lighting conditions. We make such similarity judgments all the time as a matter of course. Our similarity judgments of faces rest upon a network of other assumptions such as what we take to be crucial about facial features and personal identity in our society. In Wittgenstein’s terms, there exists a common ‘form of life’ within which such similarity relationships are constituted.

The purchase which this insight has given the sociology of science comes when we consider issues such as the replication of experiments (Collins, 1975, 1985). An experiment and a replication of it can always be found to be different. For instance, trivially, they may be done by different scientists on different days of the week. What studies of replication have shown (Collins, 1981) is that when two results conflict - such as the recent failures by some experimenters to replicate the claims to observe cold fusion - it is possible to explain away the conflict in two very different ways. One way, which is that usually taken by the scientists who have come up with the negative result, is to say that negative results show that the original experiment was incompetently performed and that the claimed phenomenon (e.g., cold fusion) is spurious. The other way to explain away the conflict is that often taken by the scientists who carried out the original experiment which others are trying to replicate. They point to the negative experiment as being crucially different from the original in some significant respect. Of course, in the case of cold fusion, it will convince no-one to blame the failure of other groups to observe the phenomenon on their experiments being carried out by different scientists on different days of the week. It is not part of the scientific form of life that such matters are relevant. However, if it is claimed that the cathodes in the replicating experiments have not been poisoned in the correct manner, then this may be taken as a much more plausible reason as to why in fact the second experiment is not a competent replication.

The way that scientists tell which differences are significant and which are irrelevant in part depends upon their theories and background assumptions or the ceteris paribus clauses assumed in the production of any experimental result.8 The background theories and assumptions, especially when a new phenomenon is claimed, are often themselves far from straightforward to elicit as they are embedded within the very experimental claims at issue. In short, similarity and difference relationships which are constituted within a wider framework of culture and action are at the very heart of how truth and falsehood are established in science.

In technological testing the same point arises in principle. It is possible to refuse to accept test results by questioning the similarity relationship needed to make the projection from test to ‘real world’ use. Unlikely as it is to be significant, it is nevertheless the case that there is a difference between the roady holding the microphone and Mick Jagger holding the microphone. Of course, whether this difference is considered significant depends on a lot of other assumptions. Its relevance could be made more plausible by pointing out, for instance, that the physical vibrations produced by two thousand rock and rollers gyrating at the sight of Mick Jagger holding the microphone may make it fail on the night. In other words the test results can be challenged by pointing to the potentially deleterious effects which such vibrations may produce.

In the case of ballistic missile testing discussed by MacKenzie (1989), the similarity judgments were actually disputed. For a while the manned-bomber lobby in the USA refused to accept as definitive ballistic tests of dummy nuclear missiles flown off the Pacific coastline of the USA because it was held that a strike at Moscow in a real nuclear war could be significantly different. Such judgments of similarity and difference always rest within a broader framework of commitments and assumptions as to how a technology will operate. Hence right at the heart of testing we find that the simple story that testing is about the constraints from reality can be challenged. Valid test results depend upon the acceptance of a similarity relationship and such a relationship can only be constructed within a body of conventions or within a form of life.9

8 This issue is known as the Duhem-Quine thesis within the philosophy of science. For further elaboration, see Pickering, 1983; Collins, 1985; and MacKenzie, 1989. The dilemma which this poses for experimenters in recognizing competent and incompetent experiments has become known as the ‘experimenters’ regress’ (Collins, 1985). There is no independent measure of competence in such disputes because attributes of competence depend upon whether or not the phenomenon in question is found - and whether or not the phenomenon is found depends upon which experiments are taken to be competent.

I will now go on to illustrate these points further by showing how similarity and difference judgments involved in testing can be contested. The first case I will look at - the test of financial management software systems, known as ‘clinical budgeting’ systems - is an example of ‘prospective testing’. Such systems were tested before being introduced into the British National Health Service. As we shall see, the outcome of the tests can either be taken to be a success or a failure depending upon the sorts of similarity and difference judgments made. The second example involves ‘current testing’. It concerns the interpretation of test data on the functioning of the O-rings which preceded the fateful decision to launch the Challenger. Again it will be argued that similarity and difference judgments were central to the implications drawn from these data for the safety of the shuttle.

Testing Clinical Budgeting

Clinical budgeting systems are computer-based financial management systems designed to help clinicians make economically rational decisions about patient care. The systems can be described as a social technology (Pinch, Ashmore and Mulkay, 1991) as they are intended to change social behaviour, and in particular turn clinicians into rationally calculating individuals. They work by allocating a set budget for a fixed period of time and by providing clinicians with information on the resources they have used within that period. The information may include the cost of drugs, X-rays, scans at the pathology laboratory, the length of stay of patients, and the case mix. Such systems are now in widespread use in the health sector in the UK and have recently been recommended in a Government White Paper for use by doctors in general practice. They are currently being introduced into other sectors in the UK such as education. In principle such budgetary systems could be applied to any area of social life where decisions over financial resource management are made. Such systems form a key part of the Thatcherite policy of introducing basic economic principles into the public sector where it is perceived that large amounts of resources are used without obtaining the best ‘value for money’.

9 There is, of course, much more to be said about how facts are constructed during the process of testing. For instance, we could analyze the ways the particular test results are produced before the act of projection is made. Constant (1983), MacKenzie (1989) and Nelkin and Tancredi (1989) have many useful suggestions in this area. Like scientific facts, test results are produced from theory-laden and practice-ridden instrumentation. Such instrumentation and the credibility attached to any particular result can be analyzed in standard SSK terms. In principle all the processes of social construction which have been documented for the production of scientific facts apply to this domain. In this paper I have focussed on the construction and deconstruction of test results via the notion of projection. I take this to be the central issue in testing.

In 1979 a group was funded by the UK Department of Health to test one version of a clinical budgeting system. Testing went ahead in three districts of the NHS (control districts were used for comparison). The test in one district was soon abandoned because of management changes in the NHS which meant that a properly agreed budget could not be set. In 1985 the group doing the testing reported that in the other two districts no improvements in the way resources were used could be found according to any of their quantitative indicators (for instance, spending on drugs, throughput, case mix). However, some qualitative successes were reported such as clinicians and managers talking more together. Given the economic rationale for such systems and the attempt to measure their impact, the results even by the admission of those carrying out the test were ‘disappointing’ (Wickings, Childs, Coles and Wheatcroft, 1985:133).

Prima facie this case seemed somewhat puzzling because shortly afterwards clinical budgeting was taken up by even more groups within the NHS.10 Furthermore, this particular test was widely quoted as having shown the general feasibility of such systems. How then was it possible for a test which looks to have been a failure to be interpreted in such a positive light?

The answer can be found by looking at the similarity relationships entailed in the projection from test to actual use. The reason given by those who carried out the test as to why the negative results were not damning was that, at the time the test was being carried out, the NHS was undergoing a major reorganization. This meant that managers were coming and going and this fouled up the working of the budgetary system, as stability in the management team was a prerequisite for an effective budgetary arrangement to be drawn up. Thus, according to the proponents of such systems, the projection from test to real use could not be made in this case because there was not enough similarity between the two. Thus the negative results did not imply that the technology in general did not work. Indeed, it was claimed that much was learnt from the test as to how to implement such systems more effectively in future. However, for the critics the technology had failed because it was exactly in the capricious real world of the NHS that such systems ought to operate. Thus the critics constructed a similarity relationship between the test and the real world of the NHS, enabling the projection to ‘actual use’ to be made and leading them to conclude that the technology of clinical budgeting would not work.

10 For further details, see Pinch, Mulkay and Ashmore, 1989; Ashmore, Mulkay and Pinch, 1989; and Pinch, Ashmore and Mulkay, 1991.

This example reveals that different judgments about similarity and difference can lead to radically different conclusions concerning the outcome of a test. For the proponents, clinical budgeting is difficult to implement, but is worth persevering with. For the critics it has failed miserably and should be abandoned.

Of course, a lot more can be, and has been, said about this example both in terms of the micro-sociology of testing and the macro-sociology of UK Government policies towards the NHS (e.g., Ashmore, Mulkay and Pinch, 1989). The significance of the analysis offered here is that it opens the way to showing how the outcome of tests can be treated as a matter of politics and social negotiation. The fates of technologies are not settled by test results in the way that the simple technological determinist story would have it.


O-Ring Blow-By Data and the Challenger Launch

On the night before the fateful launch of Challenger a tele-conference was held between engineers at Morton Thiokol, the manufacturers of the Shuttle’s solid rocket fuel boosters, and the NASA Marshall space flight centre. At issue were some results put before the tele-conference by Morton Thiokol engineer, Roger Boisjoly. These results indicated that there would be a risk of O-ring impairment if the shuttle was launched at the low temperatures expected the next morning at Cape Canaveral. Boisjoly’s argument in its essentials was that O-ring blow-by observed from the boosters recovered from previous missions was correlated with temperature. Boisjoly’s claim, when put in terms of the above analysis of testing, was that there was enough similarity between previous results at low temperature (the ongoing monitoring of the recovered boosters) and actual use (the launch of the space shuttle the next morning) to project that low temperature could lead to O-ring blow-by. As Boisjoly told the Rogers Commission:

I expressed deep concern about launching at low temperature . . . I started off talking about a lower temperature than current data base results in changing the primary O-ring sealing timing function. And I discussed the SRM-15 [flight January 1985] observations, namely, the 15A [left hand motor] had 80 degrees arc black grease between the O-rings, and make no mistake about it when I say black, I mean black like coal. It was jet black.

And SRM-15B [the right hand motor from the same flight] had a 110 degree arc of black grease between the O-rings. (Presidential Commission, 1986: 88)

However, the validity of this conclusion was questioned during the course of the conference when, amongst other things, it was pointed out that O-ring blow-by had also been observed on flight SRM-22 which had been launched at a high temperature. This result put in doubt the projection which Boisjoly wished to make from the SRM-15 result to the Challenger launch. The conclusion which Larry Mulloy, the NASA official who made the decision to go ahead, reached was:

. . . my assessment at that time was, that we would have an effective simplex seal, based upon the engineering data that Thiokol presented, and that none of these engineering data seemed to change that basic rationale. (ibid: 92)


For Mulloy, the projection that Boisjoly was trying to make from the case of SRM-15 to the case of the Challenger was not valid. The similarity and difference judgments involved in assessing the bearing of data from SRM-15 and SRM-22 can clearly be seen as Boisjoly relates how he made one last desperate effort at an off-air meeting at Morton Thiokol to convince his management to recommend against the launch.

I tried one more time with the photos. I grabbed the photos, and I went up and discussed the photos once again and tried to make the point that it was my opinion from actual observations that temperature was indeed a discriminator and we should not ignore the physical evidence that we had observed.

And again I brought up the point that SRM-15 had a 110 degree arc of black grease while SRM-22 had a relatively different amount which was less and which wasn’t quite as black. I also stopped when it was apparent that I couldn’t get anybody to listen. (ibid: 92)

Again much more can be, and has been, said about the Challenger case (Gieryn and Figert, forthcoming; Pinch, 1990; Wynne, 1988). Part of the difficulty Boisjoly faced in convincing Mulloy was that his data were not quantitative enough to satisfy Mulloy. The Marshall space flight centre and Mulloy’s boss, William Lucas, had a reputation for their hard-nosed quantitative orientation. It should also be stressed here that the moral of the story is not the simple one that Boisjoly was right to recommend against launching and Mulloy was wrong. Managing a capricious and uncertain technological system often means that certainty cannot be found (Wynne, 1988) and it is always easier to make such judgments with hindsight. Furthermore, as I have pointed out elsewhere (Pinch, 1990), identifying the O-rings as the definitive cause of the Challenger accident and thereby construing Boisjoly as the whistle-blowing hero is not a straightforward matter for social scientists (or indeed for Presidential Commissions). However, what the example shows is that judgments of similarity and difference lie at the core of testing, and, as we have seen, such judgments can be contested.


Testing the User?

Before concluding this paper with a discussion of testing discourse as it is encountered in other contexts I would first like to examine one further issue raised by the testing of technology. This concerns the place of the user during testing.

Many technologies depend for their operation upon the skilled actions of a user - someone who will work or operate the technology. For example, even a comparatively simple technology such as a door depends upon users who know how to operate it - pull the handle, walk through the doorway, close the door behind them, etc. (see Latour, 1991). Even if a technology is black-boxed, such as a home stereo system, the user still has to know how to switch on the amplifier, how to connect the wires correctly, and under which conditions the machine can be used and so on. In general the more the technology depends upon the concerted actions of human users for its successful operation, the more it will need to be tested in vivo. Computer software with the complex of tasks carried out by the user in order to operate the software is the obvious case. However, any technology which requires the user to act in new sorts of ways (such as when a new technology is first introduced) will involve some in vivo testing. This is because the manufacturers cannot be sure that the users will be able to do what is required of them.

We should also not forget the radical new ‘form of life’ which some new technologies entail. Just as it has been argued that paradigm revolutions in the sciences involve new ways of thinking and acting (Collins and Pinch, 1982), the same case can be made for technologies. For instance, the medical technologies which accompanied the germ theory of disease required users to act in ways that were often completely new (sterilizing equipment, wearing face masks, and so on).

When testing becomes co-extensive with use such that the difference between test and ‘actual use’ dissolves it is possible to ask just what is being tested. Is it the technology or the user? Arguably such tests are as much about testing the user as they are about testing the technology.

What is at issue in such tests is not so much the projection from test case to actual use of the technology, but the projection from test case to actual use of the user! With the example of software testing we can say that in an important sense the computer is determining our future competences.

That such a shift to testing the user has occurred is indicated by thinking about how we first encounter a new piece of software. If we can’t get the software to work we assume at first blush that it is we who are incompetent, not the software. It is we who are being tested. It is our future competences which are being projected by each engagement with the software.

And just as all tests involve constructions of similarity and difference, so too does this one. The computer assumes all kinds of similarity relationships about us in testing us - that is to say, if we get something right in our interaction with the machine, such as being able to use the cursor correctly, the computer assumes that we will act similarly in the future. As with all test results, we could challenge the computer’s conception of us by pointing out that its assumptions about similarity are invalid. For instance, the fact that we use the cursor correctly in the training exercise does not mean that we will always use it correctly.

When I refer above to the computer testing users, I do not mean to grant volition to the machine. The properties the machine exhibits are there only because they have been put there by humans who designed and manufactured the machines (or the software) in the first place. In other words, there is no need to go down the route advocated by Callon (1986) and Latour (1988) of treating machines as actants which are analytically indistinguishable from human actors.

There is, however, one important difference between the test of the user discussed above and the other tests discussed earlier in the paper. The similarity relationships in the tests previously mentioned were all actively constructed by human actors in the course of testing. They are therefore defeasible. In the case of testing the user the relationships of similarity are either preprogrammed into the software or built into the machine. In other words the technology has embedded within it assumptions about us whereby our future interaction with it can be projected. This ‘embedded projection’ gives the appearance of a kind of volition - it seems that the machines are training us to use them properly. And indeed it is the case that there are differences between dealing with a machine and with a social actor. The possibility of negotiating with and persuading the machine to view the similarity relationships in a different way is extremely curtailed. It is not easy to persuade my computer that every time I have a cup of tea in my hand and revert to one-hand keyboard input I mis-key more often, and that it should be more discriminating when I accidentally push the delete key.

Another way of describing this process, whereby the user acts in concert with the machine, is as a process of black-boxing. The aim of a successful technology can be said to be to black-box the user along with the technology.

However, this process of black-boxing users’ projected actions should not lead us to talk about machines as actants with independent volition, any more than black-boxing of a scientific instrument requires us to talk about an independent natural world (Pinch, 1986; Schaffer, 1989). The black-boxing of scientific instruments can be treated within the sociology of scientific knowledge as a process whereby social choices are frozen in the machine. Similarly we can view the black-boxing of the user as a process whereby certain classes of projected action are frozen in the machine. If theories and culture can be embedded in machines so too can projected ways of acting.

Another related approach to this issue comes from the opposite direction and asks just what kinds of acts can be programmed into machines in the first place. This is the problem of artificial intelligence. Collins (1990) has answered this question by showing that it is only our ‘machine-like acts’ which can be programmed into machines. Collins’ approach is perfectly consistent with a sociology of machines whereby there is no need to grant volition or actant status to machines.

The way to recover the ‘social’ in machines, or to open the black-box of the user, is to employ the same strategy as has been used for opening up the black-box of scientific instrumentation. There it has proved productive to look at how scientific instruments are first introduced, before they become black-boxed (Latour and Woolgar, 1979; Pinch, 1986; Schaffer, 1989). In the case of technology this means looking at cases where machines are first introduced and where the user is still an open box. In such cases whether it is the machine which is being tested or the user is still an open question. For instance, if when doors were first introduced users could not use them competently (perhaps failing to recognize them as a means to go through a wall at all) then we could attribute the failure to the door rather than to the users. Of course, once a technology is well established and a culture has built up over how to use the machine then any failures are more likely to be attributed to the user rather than to the machine. For instance, if a user with a passion for under-water music insists on trying to operate their home-stereo in the bath tub and complains that the machine does not work (or receives some nasty shocks) then we will rightly attribute blame to the user rather than the technology.

This is because we share a culture within which the correct operation of stereos is taken-for-granted. However, we can imagine cases of encounters with machines where the necessary user competences have not yet been acquired (e.g., children11 or people from pre-modern societies12). In such cases it becomes quickly evident that a way of acting with a machine rests upon a set of cultural conventions. That a set of cultural conventions can be frozen into a machine is an impressive feat but does not warrant treating machines as independent of society and culture - as actants in their own right.

Another case where the black-box of the user is at least partly open is the operation of computer software. Because software is so complex in the demands it makes upon the user (compare operating a piece of software with operating a home stereo) the black-boxing of the user has never been successfully accomplished. Computer manufacturers are, of course, increasingly sensitive to this problem and that is why so much current effort is devoted to the field of Human Computer Interaction (HCI) and to studying user interfaces. Indeed it is now HCI researchers who negotiate with the machine (actually with the designers and manufacturers) on behalf of users.

11 If a child can easily subvert the technology (for instance, by putting electrical appliances under water) we may attribute blame to the machine and label it as too dangerous to use. How such boundaries get drawn is an interesting topic for further investigation. For instance, all electrical appliances are dangerous if we assume a completely culturally incompetent user.

12 This issue is at the centre of the question of whether technologies can be successfully transferred from western industrial societies to developing societies.


Testing the West

The analysis of testing developed here is, I suggest, completely generalizable. The notion of projection and the similarity relationships which it entails are present in all situations where we would want to talk about testing. Furthermore when a test involves a user the ambiguity over what is being tested - the user’s ability to carry out some task or the appropriateness of the task - always remains. It is this ambiguity which is, I suggest, central to the ‘Test the West’ advertisements described earlier.

In the advertisements (Figs. 1, 2 & 3) a cigarette (presumably the brand is West) is offered by one person to another. The person offering the cigarette is always smartly dressed (presumably symbolic of Western culture). In the most common form of the advertisement (see Fig. 1) it is a smartly dressed young woman who is offering the cigarette, whilst the person being offered the cigarette is a man (an Elvis Presley look-alike) who is ludicrously dressed in a white spangled jacket, over-weight, and is portrayed as acting in a bizarre manner, clumsily eating a slice of cake. Of course, the advertisement works by playing on the word ‘West’. The person on the right of the advertisement is being invited not only to test a ‘West’ cigarette but also to test Western culture.

There is, however, something which is at first puzzling about the advertisement. Why should the company wish to associate themselves at all with images of such ludicrous figures smoking their brand of cigarette and why should potential purchasers identify with such ludicrous figures such that they might take up smoking West cigarettes? And isn’t such a deviant person likely to reject Western culture anyway? Such adverts convey quite different images to those of the suave and sophisticated men and women who are the staple fare of such advertisements. The recipients (usually fat people or bizarrely dressed people) do not seem to match any easily identifiable consumer group, neither do they correspond to Easterners.

There are many readings of this advertisement to be had and probably the success of the campaign lies in part in the diverse set of symbolic meanings to be found. Also we have no way of knowing for sure whether indeed the advertisement is a success. However, I would like to suggest one reading which the above analysis of testing points to and which may go some way towards explaining the anomaly. It is not so much Western culture which is at stake in the advertisement, rather, as in the case of computer software, it is the competence of the potential user (smoker) of the cigarette to consume such culture. The particular act of competence being projected in the advertisement is for the person to take the cigarette and thereby become a bona fide part of Western culture. Rather than Western culture being tested it is the man’s capacity to be part of that culture which is at stake. All his previous incompetencies (such as his bizarre dress and manner) dissolve away in comparison to his future projected competence as a smoker of ‘West’ cigarettes. Western culture is the measure of the man rather than the man being the measure of Western culture.

Figure 1

I suggest that a similar shift towards testing the user is occurring in testing in general. The problem for a sophisticated technological society is more and more that of the user. Users need to be black-boxed; culture and technology need to project how we will act. Often we are persuaded to take part in such tests because it seems a good thing to find out more about ourselves. However, the crucial aspect of such tests is, I suggest, their use as a basis from which to project our future competencies. Although we may be getting more testing we are not getting more choice - it is our future actions that are being constrained. It is we who are more and more being tested by technology and by culture as it tries to black-box us as part of its operation.

In short, crucial relationships of similarity and difference are embedded in such tests of the user and unless such relationships can be contested, one particular version of the user will increasingly be incorporated into technological systems and into culture in general. As Nelkin and Tancredi (1989) have shown in the case of genetic testing, such tests make all sorts of assumptions concerning how they project from an individual’s occasioned response in a test to some general recommendation for future action. It is not clear, for instance, how individual responses are to be aggregated. Testing for HIV is similar in that it embodies many assumptions which allow the testers to project a future course of action. For instance, it is assumed that a person who is HIV positive now can be projected into the future as a person who is a risk to others and who will develop AIDS and who therefore should not be allowed to visit the USA.

To recover our autonomy we need to question the similarity and difference relationships embedded in all such tests. The choice lies with us. It is towards fulfilling this purpose that a sociology of testing should ultimately aim.13

13 My own arguments can, of course, be deconstructed by pointing to differences where I have argued for similarities. Social science is itself a technology - see Pinch, Ashmore and Mulkay, 1991.


References

Ashmore, M. (1989) The Reflexive Thesis: Wrighting the Sociology of Scientific Knowledge, Chicago: The University of Chicago Press.

Ashmore, M., Mulkay, M. and Pinch, T. (1989) Health and Efficiency: A Sociology of Health Economics, Milton Keynes: Open University Press.

Barnes, B. and Edge, D. (1983) Science in Context, Cambridge: MIT Press.

Bijker, W., Hughes, T.S. and Pinch, T.J. (1987) The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology, Cambridge: MIT Press.

Bijker, W. and Law, J. (1991) Proceedings of the 2nd Twente Conference, Cambridge: MIT Press (forthcoming).

Bimber, B. (1990) »Karl Marx and the Three Faces of Technological Determinism«, Social Studies of Science, 20, 333-51.

Bugos, G. (1989) »Program Management and the Manufacture of Certainty in the 1950s«, paper presented to SHOT, Sacramento, October 13.

Callon, M. (1986) »Some Elements of a Sociology of Translation: Domestication of the Scallops and the Fishermen of St Brieuc Bay«, in Law, J. (ed) Power, Action and Belief: A New Sociology of Knowledge? London: Routledge and Kegan Paul, 196-233.

Collins, H.M. (1975) »The Seven Sexes: A Study in the Sociology of a Phenomenon, or The Replication of Experiments in Physics«, Sociology, 9, 205-24.

Collins, H.M. (ed) (1981) Knowledge and Controversy: Studies of Modern Natural Science, Special Issue of Social Studies of Science, 11.

Collins, H.M. (1985) Changing Order, London and Beverly Hills: Sage.

Collins, H.M. (1990) Artificial Experts, Cambridge: MIT Press.

Collins, H.M. and Pinch, T.J. (1982) Frames of Meaning: The Social Construction of Extraordinary Science, London: Routledge and Kegan Paul.

Constant, E.W. (1980) The Origins of the Turbojet Revolution, Baltimore: Johns Hopkins University Press.

Constant, E.W. (1983) »Scientific Theory and Technological Testability: Science, Dynamometers, and Water Turbines in the 19th Century«, Technology and Culture, 24, 183-98.

Elliott, B. (ed) (1988) Technology and Social Process, Edinburgh: Edinburgh University Press.

Gieryn, T.F. and Figert, A.E. (forthcoming) »Ingredients for a Theory of Science in Society: O-Rings, Ice Water, C-Clamp, Richard Feynman and the New York Times«, in S. Cozzens and T. Gieryn (eds) Theories of Science in Society, Bloomington: Indiana University Press.

Gooding, D., Pinch, T.J. and Schaffer, S. (eds) (1989) The Uses of Experiment, Cambridge: CUP.

Knorr-Cetina, K. (1983) The Manufacture of Knowledge, Oxford: Pergamon.

Latour, B. (1987) Science in Action, Harvard: Harvard University Press.


Latour, B. (1988) [& Johnson, J.] »Mixing Humans with Non-Humans: Sociology of a Door-Opener«, Social Problems, 35, 298-310.

Latour, B. (1991) »Where are the Missing Masses? Sociology of a few Mundane Artefacts«, in Bijker and Law (1991).

Latour, B. and Woolgar, S.W. (1979) Laboratory Life, London and Beverly Hills: Sage.

Lynch, M. (1985) Art and Artifact in Laboratory Science, London: Routledge and Kegan Paul.

MacKenzie, D. (1989) »From Kwajalein to Armageddon? Testing and the Social Construction of Missile Accuracy«, in Gooding, Pinch and Schaffer (1989), 409-36.

Mulkay, M. (1979) Science and the Sociology of Knowledge, London: George Allen and Unwin.

Nelkin, D. and Tancredi, L. (1989) Dangerous Diagnostics: The Social Power of Biological Information, New York: Basic Books.

Pickering, A. (1984) »Against putting the phenomena first: the discovery of the weak neutral current«, Studies in the History and Philosophy of Science, 15, 85-117.

Pinch, T.J. (1986) Confronting Nature, Dordrecht: Reidel.

Pinch, T.J. (1990) »Negotiating the Social and the Technical in Systems Failure: The Case of the Space Shuttle Challenger«, paper to be published in the Proceedings of the Large Scale Technological Systems Conference, University of California, Berkeley, October 17-21.

Pinch, T.J., Ashmore, M. and Mulkay, M. (1991) »Technology, Testing and Text«, in Bijker and Law (1991).

Pinch, T.J. and Bijker, W. (1984) »The Social Construction of Facts and Artefacts«, Social Studies of Science, 14, 399-441.

Pinch, T.J. and Bijker, W. (1986) »Science, Relativism and the New Sociology of Science: Reply to Russell«, Social Studies of Science, 16, 347-60.

Pinch, T.J., Mulkay, M. and Ashmore, M. (1989) »Clinical Budgeting: Experimentation in the Social Sciences: A Drama in Five Acts«, Accounting, Organizations and Society, 14, 271-301.

Presidential Commission (1986) Report on the Space Shuttle Challenger Accident, Washington.

Russell, S. (1986) »The Social Construction of Artefacts: A Response to Pinch and Bijker«, Social Studies of Science, 16, 331-46.

Schaffer, S. (1989) »Glass Works: Newton’s Prisms and the Uses of Experiment«, in Gooding, Pinch and Schaffer (1989), 67-104.

Shapin, S. and Schaffer, S. (1985) Leviathan and the Air-Pump: Hobbes, Boyle and the Experimental Life, Princeton: Princeton University Press.

Smith, R. and Wynne, B. (eds) (1989) Expert Evidence: Interpreting Science in the Law, London: Routledge.

Vincenti, W. (1979) »The Air-Propeller Tests of W.F. Durand and E.P. Lesley: A Case Study in Technological Methodology«, Technology and Culture, 20, 712-51.

Vincenti, W. (1986) »The Davis Wing and the Problem of Airfoil Design: Uncertainty and Growth in Engineering Knowledge«, Technology and Culture, 27, 717-58.


Wickings, I., Childs, T., Coles, J. and Wheatcroft, C. (1985) »Experiments Using PACTS in Southend and Oldham HAs«, CASPE Research Report, King Edward’s Hospital Fund, London, December.

Wynne, B. (1988) »Unruly Technology: Practical Rules, Impractical Discourse and Public Understanding«, Social Studies of Science, 18, 147-68.
