
INTERNATIONAL DESIGN CONFERENCE - DESIGN 2004
Dubrovnik, May 18 - 21, 2004.

EVALUATION OF A LIFE CYCLE ASSESSMENT TOOL - ADJUSTING THE SOFTWARE DEVELOPERS' VIEW TO THE EXPECTATION OF THE USER

T. Felsing, M. Dick, H. Birkhofer and B. Rüttinger

Keywords: design for environment (DfE), life cycle assessment (LCA), heuristic evaluation, software evaluation, IsoMetrics

1. Introduction

The Collaborative Research Center (CRC) 392 at Darmstadt University of Technology develops methods and instruments that enable product developers to assess the environmental impact of the products they design. The research is particularly focused on a computer-based design environment, which is capable of a prospective and holistic assessment [Anderl, Weißmantel, Daum, Pütter & Wolf 1999]. The Product Development Environment (PDE) consists of three components: (1) a 3D CAD system, (2) the Life Cycle Modeller (LCM) and (3) an assessment tool (LCAD - Life Cycle Assessment for Computer Aided Design) (see figure 1). This paper describes the evaluation of the PDE.

Figure 1. Components of the Product Development Environment (PDE)

The evaluation is based on a four-phase concept: In the first phase, potential users evaluated the usability of the implemented PDE prototype. In the second phase, after revision of the system, its usability was tested again. This was done to obtain empirical evidence about the success of the revision process. In the third phase, the PDE will be compared with other ecological assessment tools and evaluated with respect to a set of subjective and objective criteria. Finally, in the last phase, the PDE will be evaluated during and after its implementation in a company.

Within the second phase, which is of central interest in the present paper, the usability of the PDE was measured by a mix of qualitative and quantitative methods, concerning the following dependent variables: (1) well-being of users before and after working with the PDE, (2) the subjective assessment of the system by users and (3) usability problems noted by users after working with the PDE. The results were compared with the results of phase one, which are documented in [Wiese & Rüttinger 2001]. Furthermore, the participants of the second evaluation phase were classified with regard to their knowledge and expertise in the field of product development and especially with reference to the PDE. After this, the different groups of participants were compared with respect to the dependent variables listed above.

2. Method

2.1 Participants

Fourteen participants took part in the study. They were recruited from the staff of the CRC 392 and were mainly engineers (one participant was a psychologist).

2.2 Procedure

The participants had to work for about ninety minutes with the PDE in the Design for Environment Laboratory (DfE-Lab) of the CRC 392. Before and after doing this, they had to fill in questionnaires, which are described in detail in the following sections.

2.3 Design for Environment Laboratory (DfE-Lab)

The study was carried out in the Design for Environment Laboratory (DfE-Lab) at Darmstadt University of Technology. The DfE-Lab is equipped with several workstations with the PDE installed on them. A video observing system recorded the participants working with the PDE, as well as the video signals of the workstations. The audio and video signals were converted into digital files, making it possible to record the whole process while simultaneously marking the video file with a time and code scheme. This enables an ex-post evaluation of critical situations.

2.4 Data collection methods: Quantitative Data

The subjective assessment of the PDE was done with a slightly modified version of the IsoMetricsL. IsoMetrics is an instrument for the formative and summative evaluation of software according to the international norms of the ISO 9241 Part 10 [Gediga, Hamborg & Düntsch 1999]. With respect to these norms, the IsoMetrics includes the following subscales: suitability for the task, self-descriptiveness, controllability, conformity with user expectations, error tolerance, suitability for individualisation and suitability for learning. The IsoMetrics is described as a "reliable and valid tool, supporting formative and summative evaluation of software systems" [Gediga, Hamborg & Düntsch 1999, p. 162]. Two versions of the IsoMetrics were developed: one for summative evaluation (IsoMetricsS) and one for formative evaluation (IsoMetricsL) [Gediga, Hamborg & Düntsch 1999]. In the present study, a slightly modified version of the IsoMetricsL was used. With this instrument, formative evaluation is done in two steps: First, the software is assessed with items according to the ISO 9241 Part 10 ("rating score"). In a second step, the importance of every item for the general impression of the evaluated software is gathered ("weighting index"). The distance between rating score and weighting index is seen as a measurement for usability problems: large distances are interpreted as indications of usability problems, whereas small distances are seen as indications of good usability. The IsoMetricsL is described in detail in [Willumeit, Gediga & Hamborg 1996].
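The distance logic of the two-step formative evaluation can be illustrated with a minimal Python sketch. The subscale names are taken from the text. The numeric scores, the assumption that rating and weighting share one 1-5 scale, and the function name `usability_gaps` are hypothetical and serve illustration only; they are not data or procedures from the study.

```python
# Hypothetical mean scores per subscale (NOT data from the study).
# Assumption: rating score and weighting index use the same 1-5 scale.
ratings = {
    "suitability for the task": 3.1,
    "self-descriptiveness": 2.6,
    "controllability": 3.3,
    "error tolerance": 3.0,
}
weightings = {
    "suitability for the task": 4.2,
    "self-descriptiveness": 4.0,
    "controllability": 3.6,
    "error tolerance": 3.9,
}


def usability_gaps(ratings, weightings):
    """Return (subscale, weighting - rating) pairs, largest gap first.

    A large positive gap means the aspect is rated as important but
    poorly fulfilled, which is read as a hint at a usability problem.
    """
    gaps = {name: weightings[name] - ratings[name] for name in ratings}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for name, gap in usability_gaps(ratings, weightings):
    print(f"{name}: gap = {gap:.2f}")
```

Sorting by gap size puts the subscales that most need attention at the top of the list, which matches the formative purpose of the instrument.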

The well-being of the test persons was tested with a shortened version of the Multidimensional Mood Questionnaire of [Steyer, Schwenkmezger, Notz & Eid 1994]. With this psychometric instrument, three mood dimensions can be assessed: "pleasant-unpleasant", "awake-sleepy" and "calm-restless". For these scales, reliabilities between .85 and .97 are reported [Steyer, Schwenkmezger, Notz & Eid 1994].

Finally, the expertise of the test persons was measured by their self-reported knowledge about the PDE and their self-reported experience in product development.

2.5 Data collection methods: Qualitative Data

Qualitative data were collected with a so-called Heuristic Evaluation [Nielsen & Molich 1990]. The Heuristic Evaluation requires putting down in writing subjective impressions of the object of evaluation after having worked with it. In the present study, the participants had to write down the usability problems they encountered while working with the PDE. To make this process easier, the test participants got an evaluation sheet, which was structured according to the single working steps they had done before. As a result, people were able to assign their problems to these single steps. To stimulate answers, they got a list of nine usability heuristics that should give them an idea of what features a good computer-based working environment should have.

3. Results

3.1 Quantitative Data

The results of the IsoMetricsL show clear differences between ratings and weightings for all subscales, with the ratings always lower than the weightings (see figure 2). With regard to the subscales suitability for the task/"st", self-descriptiveness/"sd", error tolerance/"et" and suitability for learning/"sl", these differences were significant (p < .05).

Five of the seven ratings are of middle size (suitability for the task, controllability/"ct", suitability for learning, error tolerance and conformity with user expectations/"ce"), the others are a little weaker (suitability for individualisation/"si" and self-descriptiveness). Overall, the results are very similar to the results of the first evaluation phase, which are described in [Wiese & Rüttinger 2001].

Figure 2. Results of the IsoMetricsL: Whole sample (series shown: weighting, rating)

To analyse the IsoMetricsL results with reference to the expertise of the participants, the expertise of the participants was measured by their self-reported knowledge about the PDE. Along this criterion, the sample was split into two subgroups of novices (n = 8) and experts (n = 6). In addition to their significantly higher knowledge about the PDE (F = 40.18; df = 1; p < .01), experts also had significantly more experience in product development (F = 4.9; df = 1; p < .05).
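The group comparison can be sketched in Python as follows. The source reports F values with df = 1 but does not name the test; the sketch assumes a one-way analysis of variance between the two subgroups. The scores and the function name `one_way_f` are hypothetical and are not data from the study.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of scores per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical self-reported PDE knowledge scores (NOT data from the study).
novices = [1, 2, 1, 2, 2, 1, 3, 2]  # n = 8
experts = [4, 5, 4, 3, 5, 4]        # n = 6

f_value, df1, df2 = one_way_f([novices, experts])
print(f"F = {f_value:.2f}; df = {df1}, {df2}")
```

With two groups the between-groups degrees of freedom equal 1, which agrees with the df = 1 reported in the text; with 8 novices and 6 experts the within-groups degrees of freedom would be 12.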

As figure 3 shows, the experts' ratings were lower than the novices' ratings in all subscales. In the case of the subscale "suitability for individualisation" ("si") the difference was significant (F = 5.08; df = 1; p < .05). With respect to the individual weightings of the subscales, the results of experts and novices are fairly similar.

Figure 3. Results of the IsoMetricsL: Differences between experts and novices (series shown: expert weighting, expert rating, novice weighting, novice rating)

The well-being of the participants was significantly lower after working with the PDE than before (F = 4.61; df = 1; p = .05). This effect results from significant differences within the subscale "pleasant-unpleasant" (F = 8.61; df = 1; p < .05). Within the other two subscales, no significant differences occurred. The comparison between experts and novices shows that only in the subgroup of the experts was the well-being significantly impaired by working with the PDE (F = 20.00; df = 1; p < .01), whereas in the group of the novices such an effect could not be observed. The results with respect to the well-being are illustrated in figure 4.

Figure 4. Well-being before and after working with the PDE (groups shown: whole sample, experts, novices)

With regard to the quality of the scales used, satisfactory reliabilities between .58 and .84 for the IsoMetricsL and between .66 and .93 for the Multidimensional Mood Questionnaire could be obtained. Items with a corrected item-total correlation < .30 were excluded from the analysis.
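The item screening can be illustrated with a minimal Python sketch. The source does not name the reliability coefficient; Cronbach's alpha is assumed here. The corrected item-total correlation is the correlation of one item with the sum of the remaining items of the scale. The response data and all function names are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))


def cronbach_alpha(items):
    """items: list of item columns, each a list of respondent scores."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def corrected_item_total(items, index):
    """Correlate one item with the sum of all OTHER items of the scale."""
    rest = [sum(row) - row[index] for row in zip(*items)]
    return pearson(items[index], rest)


def items_to_keep(items, threshold=0.30):
    """Indices of items whose corrected item-total correlation >= threshold."""
    return [i for i in range(len(items))
            if corrected_item_total(items, i) >= threshold]


# Hypothetical answers of five respondents to three items (NOT study data).
scale = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [3, 1, 4, 2, 3],  # barely related to the other two items
]
kept = items_to_keep(scale)
print("kept items:", kept)
print("alpha of kept items:", round(cronbach_alpha([scale[i] for i in kept]), 2))
```

Removing the item itself from the total before correlating avoids inflating the coefficient, which is why the threshold is applied to the corrected rather than the raw item-total correlation.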


3.2 Qualitative Data

In addition to the quantitative data, the qualitative answers serve to detect usability problems in a more detailed and more concrete way. Of particular interest were those problems that persons wearing "professional blinkers" no longer notice.

The filled-in questionnaires were analysed as follows: (1) The questionnaires were transcribed as a basis for further information processing. (2) In order to understand the often colloquially expressed comments correctly, a semantic interpretation followed. (3) To structure the information, synonymous answers were clustered. (4) Finally, the importance of the reported problems was classified. Steps (3) and (4) were carried out in interdisciplinary workshops involving psychologists as well as engineers. The main findings were "mirrored back" to the developers of the PDE and are summed up in Figure 5 and in the following lines.

Figure 5. Identified problem fields (labels recoverable from the figure: LCM; data handling; consistency check; process insertion; information supply; product comparison; traceability; system integration (uniform appearance); feedback; help functions)

Above all, the data handling caused offence. Most of the users reacted with incomprehension as they were "forced" to roll product models and process plans in a database and to retrieve them again awkwardly when switching between different PDE components. Moreover, the three constituents of the PDE lacked a uniform appearance. A corrective measure will be the integration of the PDE system in a so-called "Eco Design Workbench" within the next revision phase.

While modelling the product life cycle with the LCM, the participants had some problems with the insertion of processes. The processes belonging to patterned features had to be inserted as often as they appeared in the product model. The users remarked that the system could not automatically assign processes to special features such as drillings, chamfers, etc. Finding the "right" process in an unstructured list of processes was judged likely to cause problems when the limited number of processes is raised in future. Apart from that, the total number of processes implemented in the software prototype was seen as still too small for the environmental assessment of complex industrial products.

Users had problems coping with the information supplied by LCAD. It turned out that the front-end of this evaluation system was conceived by environmental experts who did not take the product developers' need for information into account. The users felt overloaded with detailed information about impact categories, while a simple comparison between the cumulated environmental impact of two products in one window was not supported. Also, tracing back the environmental impacts to the processes that cause them was difficult and in some cases not possible. These findings have led to a complete revision of the LCAD front-end, which now guides the user from aggregated values down to detailed information.


Concerning the user interface, most test persons missed feedback and help functions that have not been implemented yet in the examined prototype.

4. Key Conclusions

All in all, the study gives some valuable indications about the usability of the PDE. As the data show, a further revision of the system has to be done. Within the first revision, the focus obviously lay too much on technical improvements, whereas the usability of the system was not taken into account as much as seems necessary. This may be a typical fault of people who are focussed in their daily work more on the "hard" technical characteristics of technical systems than on their more "soft" aspects.

The application of a combination of quantitative and qualitative methods in the present study can be seen as very successful. As intended, the different types of data complement each other very well: whereas the quantitative data give a more general overview of the "status quo" of the system, the qualitative results give many individual and more concrete indications of which aspects of the system should be improved and which should not.

Dividing the participants into subgroups of different expertise was also a successful step: the data show that the expertise of a software evaluator is connected with his assessment of the software he is working with. With respect to typical results in the field of expertise research, it seems probable that the assessments of experts are more realistic than the assessments of novices [Anderson, 2000]. In spite of this, we argue that the inclusion of novices in the evaluation process can be helpful, too. Novices may bring in another point of view and may, for example, recognize aspects that experts no longer notice because they take them for granted. Furthermore, the inclusion of novices makes sense if someone wants to estimate how difficult and strenuous it is to work with a particular system without basic knowledge about it.

The data about the well-being can be interpreted as an indication of slight frustration or anger within the group of experts because the PDE may not have met their requirements. In sum, the well-being of the participants was not affected very strongly by working with the PDE, so working with the PDE seems to be not too strenuous, neither for experts nor for novices.

References

Anderson, J. R., "Cognitive Psychology and its Implications", New York, 2000.

Anderl, R., Weißmantel, H., Daum, B., Pütter, C., Wolf, B., "Life Cycle Modelling - A Cooperative Method Supports Experts in the Entire Product Life Cycle", Proceedings of ICED 99, Munich, Germany, 1999.

Gediga, G., Hamborg, K.-C. & Düntsch, I., "The IsoMetrics usability inventory: an operationalization of ISO 9241-10 supporting summative and formative evaluation of software systems", Behaviour & Information Technology, Vol. 18, No. 3, 1999, pp. 151-164.

Nielsen, J. & Molich, R., "Heuristic Evaluation of User Interfaces", CHI Proceedings, April 1990.

Steyer, R., Schwenkmezger, P., Notz, P. & Eid, M., "Testtheoretische Analysen des Mehrdimensionalen Befindlichkeitsfragebogens (MDBF)" ("Testtheoretical analyses of the Multidimensional Mood Questionnaire"), Diagnostica, 40, 4, 1994, pp. 320-328.

Wiese, B. S. & Rüttinger, B., "Akzeptanz IT-gestützter Methoden der umweltgerechten Produktentwicklung: Vorschläge für eine theoriegeleitete Evaluation" ("User acceptance of IT-based methods for Design for Environment"), Darmstadt: Institutsbericht 2/2001.

Willumeit, H., Gediga, G. & Hamborg, K.-C., "IsoMetricsL: Ein Verfahren zur formativen Evaluation von Software nach ISO 9241/10" ("IsoMetricsL: An instrument for the formative evaluation of software with respect to ISO 9241/10"), Ergonomie und Informatik, 27, 1996, pp. 5-12.

Tobias Felsing, Dipl.-Psych.

Darmstadt University of Technology, Institute for Psychology
Hochschulstraße 1, 64289 Darmstadt, Germany

Telephone: ++49-(0)6151-16-2097, Telefax: ++49-(0)6151-16-4196
E-mail: felsing@psychologie.tu-darmstadt.de
