
4.6. Methods for Interface Evaluation

How can human-computer interfaces be evaluated? To get an overview of how to answer this question, I conducted an email survey in which I asked experts in TAI-related fields how they would evaluate a software system for its interaction quality.6 Twelve of the twenty people contacted responded. They are experts in the fields of information technology, mobile technology, high-level programming languages, computer science, computer vision, Exploratory Data Analysis, sonification, computer interface design, virtual reality, ambient intelligence, media art, psychology, visual communication, and chemistry. After translating the (originally German) answers into English, I extracted the described approaches to identify the steps of the

6 The translation of the originally German text of the sent email can be found in Figure 4.8.


Hello,

this is a little mini-survey.

Theme: How to evaluate the Interaction Quality of an Application?

Imagine that you designed and implemented an interactive part of a bigger software system (e.g. a photoshop plugin). Now you want to know if and how people get along with it, i.e. how well the implemented kind of interaction works.

What is your opinion on how to test for this?

Your suggestions and ideas are of great interest to me.

kind regards and happy thinking,
Till

Figure 4.8.: The translated text of the email survey on the evaluation of Human-Computer Interfaces.

proposed methods.7 In this way, I identified four modules that were considered necessary for a survey on interaction quality by almost all participants.8 These are

Scenario In which environment is the interaction device tested?

Material Which survey-related media are used? Audio, video, questionnaire, subjective monitoring or (time-)measurements were named.

Methodology Which theoretical methods are used? Qualitative, quantitative, questionnaire, comparison, heuristics were named.

Indicators Which indicators are of importance for the analysis? Qualitative, quantitative or correlation-based indicators were named.
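To make the notion of a correlation-based indicator concrete, the following sketch computes a Pearson correlation between two hypothetical measurements, task completion time and a pre/post stress-rating difference. All function names and data values are invented for illustration; they are not taken from the survey replies.

```python
# Hypothetical example of a correlation-based indicator: relating task
# completion time to a self-reported stress delta (post minus pre).
# All data below is invented for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

task_time_s  = [42.0, 55.3, 38.1, 61.7, 49.9]  # seconds per task
stress_delta = [1, 2, 0, 3, 2]                 # post - pre stress rating

print(round(pearson(task_time_s, stress_delta), 3))  # → 0.966
```

A coefficient close to 1 would suggest that longer task times go together with higher reported stress; in a real study, such a value would of course need a far larger sample and a significance test.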

Although all respondents agreed on these general terms, their opinions regarding the concrete methods were very heterogeneous. This might be caused by the broad variety of expertise; however, even experts of the same field expressed very diverse suggestions on how to cope with interaction quality evaluation. On a general level, their suggestions included qualitative and quantitative methods and indicators. Especially observation and interpretation of people's activity were considered useful. As measurable indicators for quantitative evaluation, besides measuring time and counting the number of clicks needed for a predefined task, comparative stress tests before and after interface usage were named.

Quantitative Methods

These indicators can be identified as the basis for quantitative user studies. Such a study relies on the assumption that an intelligent combination of quantitative indicators and a structured/closed survey design can be developed, which can measure the quality of the explored interface. This quickly raises the question of how to define the term quality; in many cases – at least in quantitative studies – it is defined as good performance in terms of the utilised underlying indicators for a concrete task that has to be accomplished by the participants. This means that the time users need is then directly used as a criterion

7 I used methods from grounded theory for this, a qualitative technique that will be described below [SC90].

8 The curious reader can find the anonymised and translated wording of the participants' replies in Appendix A. The codes found are summarised in Appendix A.2.

4. Interfacing Humans with Computers

for their performance, while the differences in stress tests before and after interface usage are interpreted as indicators for the quality of the interface. Although this might be a valid method for interface types in which a user task can be explicitly defined, it is not sufficient for the evaluation of Exploratory Data Analysis systems due to their original intent. Their focus is to assist in finding new structures and insights; the outcome of such a session therefore cannot be determined beforehand and will differ from user to user. This in particular makes it very difficult to design a quantitative user study, since a large part of the outcome of a data exploration session depends on the intuition of the users.
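The task-bound indicators discussed above (time on task, click counts) can be extracted from a simple timestamped event log. The log format and event names in this sketch are assumptions chosen for illustration, not part of any system described in this thesis.

```python
# Minimal sketch of extracting the quantitative indicators named above
# (time on task, number of clicks) from a timestamped event log.
# The log format and event names are assumptions for illustration.

def indicators(events):
    """events: list of (timestamp_seconds, event_name) tuples,
    ordered by time, covering one predefined task."""
    start = next(t for t, e in events if e == "task_start")
    end = next(t for t, e in events if e == "task_end")
    clicks = sum(1 for _, e in events if e == "click")
    return {"time_on_task": end - start, "clicks": clicks}

log = [
    (0.0, "task_start"),
    (1.2, "click"),
    (3.5, "click"),
    (7.8, "click"),
    (9.0, "task_end"),
]
print(indicators(log))  # → {'time_on_task': 9.0, 'clicks': 3}
```

For exploratory systems, exactly this reduction is what fails: without a predefined `task_start`/`task_end`, the indicators lose their meaning, which motivates the qualitative methods discussed next.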

One qualitative method that can be used to evaluate user interfaces is grounded theory.

Grounded theory

It was also named by one of the participants. This explorative approach can be used to generate and test hypotheses based on observations that are made when people use the system under exploration. Grounded theory was developed by Glaser and Strauss in the 1960s as a sociological method for their studies in the Department of Nursing at the University of California in San Francisco. Travers explains in his book Qualitative Research Through Case Studies [Tra01] that

[. . . ] Glaser and Strauss accepted that the study of human beings should be scientific, in the way understood by quantitative researchers. This meant that it should seek to produce theoretical propositions that were testable and verifiable, produced by a clear set of replicable procedures, and could be used to predict future events.

Grounded theory relies on observations made in the collected data and on the codes or categories generated from them. Its (possibly overlapping) phases are [Dic05]

1. data collection,
2. note taking,
3. coding,
4. memoing,
5. sampling and sorting, and
6. writing.

The research process itself relies on the translation of any collected material (1.), be it researcher notes, video or audio, into written notes (2.). During this process, the writing of memos on hypotheses and possible sub- or super-categories is an essential part (4.); it helps to codify the observed behaviour into emerging categories (3.). As core categories emerge, their validity is tested by sampling and sorting the data collection (5.). They later form the basis of the theory and demonstrate its validity with respect to the data, since they can be used to name concrete sections in the original material in which the categorised behaviour can be observed.
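The coding and sampling steps described above can be sketched as a small data structure: written notes are tagged with codes, and a code (or emerging category) can later be traced back to the original passages that ground it. The notes, code names and helper function here are invented purely for illustration of the principle.

```python
# Illustrative sketch of the coding step in grounded theory: written
# notes (phase 2) are tagged with codes (phase 3), and a code can be
# sampled (phase 5) to locate the original passages that ground it.
# All notes, codes and the helper name are invented for illustration.

notes = [
    (1, "participant hesitates before touching the device"),
    (2, "rotates object to hear change in sound"),
    (3, "hesitates, then asks if touching is allowed"),
    (4, "systematically rotates object along one axis"),
]

codes = {
    "hesitation":  [1, 3],  # ids of notes in which the behaviour occurs
    "exploration": [2, 4],
}

def sections_for(category, mapping, corpus):
    """Return the original note texts that ground a category."""
    index = dict(corpus)
    return [index[i] for i in mapping[category]]

print(sections_for("hesitation", codes, notes))
```

The point of the traceability shown here is exactly the validity argument made above: every category can be resolved back to concrete sections of the original material.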

Video analysis

Video interaction analysis is an emerging way to evaluate tasks in human-robot interaction [Hae09] and artificial conversation analysis [Kru09]. It can be combined with the grounded theory approach to form a qualitative analysis method that can be used to get insights into the recorded processes and actions. This suggests that video analysis is a valid tool for the initial evaluation of people's usage strategies when confronted with alternative human-computer interfaces.


The methods of grounded theory combined with video analysis are applied to evaluate aspects of MoveSound, Reim and AudioDB. The results of the case studies (reported in Section 9.1.6, Section 9.3.6 and Section 9.4.3) indicate that this strategy can be considered useful to get an impression of user-related characteristics of TAIs. Due to the relatively small samples of four or five participants, though, its quantitative validity can be questioned. The case studies sufficiently confirm, on a qualitative level, the usefulness of TAIs as interfaces for rich and nature-inspired representations of digital data and algorithmic processes. Based on these investigations and their results, the systems may be subjected to further quantitative studies. However, this was not feasible during the work on this thesis.
