Structure of the Thesis - Automated Field Usability Evaluation Using Generated Task Trees

The thesis is structured as follows. We start with foundations in Chapter 2, in which we introduce the terminology used in this thesis. This terminology spans the usage of software (Section 2.1), the concepts of task models and trees (Section 2.2), as well as usability engineering and evaluation (Section 2.3).

In Chapter 3, we discuss scientific work related to this thesis and put our work into a broader research context. The chapter is subdivided into existing work on automating usability evaluation in general (Section 3.1), the processing of recordings of software usage with a focus on usability evaluation (Sections 3.2 and 3.3), the generation of task trees (Section 3.4), and the automation of usability issue and smell detection (Section 3.5). We close Chapter 3 with a description of the research delta provided by this thesis.

In Chapter 4, we present the details of our approach for automated field usability evaluation. We start by introducing the approach in general (Section 4.1). Then we describe how we record user actions (Section 4.2) and derive a model of the GUI of a software (Section 4.3). Afterwards, in Section 4.4, we explain the generation of task trees from recorded user actions. The usability smell detection, subdivided into smells for task trees and smells for user actions, forms Section 4.5.

We validated our approach in three case studies. The required implementation is described in Section 5. In Section 6, we first introduce the basic setup of all case studies and provide reasons for the case study selection. Then we provide one subsection per case study (Sections 6.2, 6.3, and 6.4), in which we describe and list the results of applying our approach to two websites and one desktop application. Afterwards, in Section 6.5, we briefly mention two additional experiments that we performed in the context of this thesis. We discuss the results of the case studies in Section 7. This includes answering the research questions formulated in Section 1.2 in Sections 7.1 and 7.2, as well as showing strengths and limitations of our approach (Section 7.3). We close Section 7 with a consideration of ethical aspects. In Section 8, we conclude the thesis and provide an outlook on potential future work.

This chapter introduces the foundations of this thesis, which consist of terminology and basic concepts. We start by introducing terms related to GUIs and their usage. Then, we describe our notion of task trees and their structure as used throughout the thesis. Finally, we introduce usability and related concepts.

2.1. GUIs, Actions, and Events

Any software has a User Interface (UI) that can be utilized by users to interact with the software [18]. Nowadays, this interface is mostly graphical and, therefore, called Graphical User Interface (GUI). A GUI consists of many GUI elements. We subdivide GUI elements into interaction elements, visual elements, and container elements. Interaction elements are those directly utilized by users for executing functions of a software [18]. Examples are buttons and text fields. Visual elements present information to the users but do not allow for direct user interaction. Container elements are used for structuring interaction elements, visual elements, and other container elements of a GUI. Usually, these are panels, tabbed panes, frames, or dialogs. Container elements can be in conflict with each other regarding their visibility. For example, of several sibling tabbed panes, only one can be visible at a time. We call such a container element a view. A view belongs to a set of views of which only one is visible at a time and which have the same parent container element. In addition, container elements can have a virtual nature in that they are not presented to the user but only used for structuring the GUI.

Container elements contain other GUI elements. Because of this relationship, GUIs follow a tree structure that we call the GUI model. The leaf nodes of this tree are interaction and visual elements. The parent nodes are container elements. The root node of a GUI is a container element containing all other GUI elements directly or indirectly as its children.

A GUI model following this approach can be drawn for desktop applications, apps, and websites. An example of a GUI model for a website is shown in Figure 2.1. The root node is a virtual container element representing the whole website. Its children are also virtual and represent the individual pages of the website. These nodes are views as only one of the pages can be displayed at a time. The other nodes refer to the Hypertext Markup Language (HTML) Document Object Model (DOM) structure of the specific page by referring to the name of the HTML tag they represent. Some of them are also virtual container elements, e.g., the node representing the form tag, which itself is not displayed on a website. The leaf nodes of the GUI model represent interaction elements. Visual elements are not included in the example.

[Figure: tree diagram with the node labels host, login, content, html, head, body, div, form(id="form1"), input(id="username"), input(id="password"), and input(id="login")]

Figure 2.1.: Example of a simple GUI model.
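The tree structure of a GUI model can be illustrated with a short Python sketch. It is not the implementation used in this thesis: the class GuiElement, its fields, and the helper leaves are hypothetical names, and the nesting of the example nodes is an assumption based on the node labels of Figure 2.1 (the nodes head, div, and content are left out for brevity).

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class GuiElement:
    """One node of a GUI model (illustrative, not the thesis implementation)."""
    name: str
    kind: str = "container"   # "interaction", "visual", or "container"
    virtual: bool = False     # container that is not presented to the user
    is_view: bool = False     # only one of several sibling views is visible
    children: List["GuiElement"] = field(default_factory=list)


def leaves(element: GuiElement) -> Iterator[GuiElement]:
    """Yield the leaf nodes at or below the given element, left to right."""
    if not element.children:
        yield element
        return
    for child in element.children:
        yield from leaves(child)


# Assumed nesting of the website example: host > login > html > body > form
inputs = [
    GuiElement(f'input(id="{element_id}")', kind="interaction")
    for element_id in ("username", "password", "login")
]
form = GuiElement('form(id="form1")', virtual=True, children=inputs)
body = GuiElement("body", children=[form])
html = GuiElement("html", children=[body])
login = GuiElement("login", virtual=True, is_view=True, children=[html])
host = GuiElement("host", virtual=True, children=[login])

print([leaf.name for leaf in leaves(host)])
```

In this sketch, the leaves of the model are exactly the three interaction elements of the login form, while all inner nodes are container elements.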

The following terminology is based on several papers [19, 20, 21, 22, 23] that we published in the context of our work. Interaction elements of a GUI offer users different actions [18] that can be performed on a software. For example, a user can click on a button to trigger some functionality or enter text into a text field. We refer to the set of all available actions that can be performed on a GUI of a software as the set A.

Actions can be subdivided into the two groups of efficient and inefficient actions. Efficient actions contribute semantically to a user's task. For example, entering text into a text field can contribute to a login process. Inefficient actions are the opposite and do not contribute semantically to a user's task. For example, scrolling vertically usually does not have any semantic meaning when performing a login process.

The execution of a specific action a ∈ A by a user is called an action instance a⁰. All action instances recorded on a software belong to the set A⁰. An action instance a⁰ triggers an event inside the software. This event signals that the user performed the respective action a, and the software handles the event to process a⁰. An event has an event type and an event target [10]. The event type denotes the type of action the user performed, such as clicking, pressing a key on the keyboard, or moving the mouse. The event target refers to the GUI element on which the action was performed. Event targets are usually interaction elements. Events can also be observed on GUI elements which are not interaction elements. In this case, the corresponding action instances belong to actions which are not in A, i.e., which cannot be executed on the software. All events contain additional information, e.g., a time stamp or the coordinates of a mouse click [10]. As events are representations of action instances, this information is also available for action instances.

All action instances recorded on a software can be subdivided into lists of subsequent action instances that were performed in the same view. Each list represents one opening of the view and contains the action instances that were performed while this view was displayed. For the determination of these lists, we define the function viewActionInstances(A⁰, view). The result of this function is a list of sublists, one for each time the users opened the corresponding view.
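A possible reading of viewActionInstances can be sketched in Python as follows. The sketch makes assumptions that are not stated above: each recorded action instance carries the name of the view in which it was performed and a time stamp, and all instances stem from a single session. The class ActionInstance and its fields are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Sequence


@dataclass(frozen=True)
class ActionInstance:
    action: str        # name of the performed action
    view: str          # view in which it was performed (assumed to be known)
    timestamp: float   # time stamp taken from the underlying event


def view_action_instances(
    recorded: Sequence[ActionInstance], view: str
) -> List[List[ActionInstance]]:
    """Return one sublist per opening of the given view."""
    ordered = sorted(recorded, key=lambda instance: instance.timestamp)
    # groupby only merges neighbours, so every uninterrupted run of action
    # instances in the same view becomes one group, i.e., one opening.
    return [
        list(group)
        for key, group in groupby(ordered, key=lambda instance: instance.view)
        if key == view
    ]


recorded = [
    ActionInstance("click username", "login", 1.0),
    ActionInstance("enter username", "login", 2.0),
    ActionInstance("click help link", "help", 3.0),
    ActionInstance("click password", "login", 4.0),
]
openings = view_action_instances(recorded, "login")
print(len(openings))
```

Here, the login view was opened twice, so the result contains two sublists: the first with two action instances and the second with one.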

The interaction of a user with a software can be seen as a language spoken by the user and understood by the software [10]. Herewith, the actions correspond to the words of the language. The combination of several actions builds sentences. Word combinations of a certain length n are named n-grams. We reuse the term n-gram to denote a certain combination of n actions.
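The notion of n-grams of actions can be made concrete with a small Python sketch. The function name action_ngrams and the example action names are hypothetical; the sketch simply slides a window of length n over an ordered list of actions.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def action_ngrams(actions: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all combinations of n subsequent actions in their given order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]


session = [
    "click username",
    "enter username",
    "click password",
    "enter password",
    "click login",
]
bigrams = action_ngrams(session, 2)
# Counting n-grams shows which action combinations occur most often.
frequencies = Counter(bigrams)
print(len(bigrams), frequencies.most_common(1))
```

A list of five actions yields four 2-grams; if n exceeds the number of actions, the result is empty.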

2.2. Task Trees

The following terminology is based on several papers [19, 20, 21, 22, 23] that we published in the context of our work. Users perform an ordered list of actions a₁ ... aₙ to reach a certain goal and, hence, to perform an individual task [24]. For example, users combine the actions for entering text into text fields and clicking on a confirmation button to accomplish the task of logging in on a website.

Tasks can be combined with other tasks and actions to form higher level tasks. For example, on an online shop website, the higher level task of buying a specific product is a combination of the task of logging in on the website, actions for searching and selecting the respective product, and a further task for performing the checkout. We refer to all tasks that can be performed with a software as the set T. Formally, any task t ∈ T has an ordered list of children c(t) = c₁ ... cₙ. These children are either actions or other tasks, i.e., cᵢ ∈ A ∪ T \ {t}. The number of children of a task t is defined as |c(t)|. Additionally, we define that neither direct nor indirect children of a task t refer to t. This means that a task is never its own direct or indirect child.

A task is of a specific type through which it defines the execution order of its children. This order is called temporal relationship [25]. In our work, we consider the tasks of type sequence, iteration, selection, and optional. A sequence is a task having two or more children (i.e., |c(t)| > 1) which are executed in their given order. An iteration is a task that has only one child (i.e., |c(t)| = 1) which can be executed one or more times. A selection is a task having two or more children (i.e., |c(t)| > 1) of which only one can be executed. An optional is a task having only one child (i.e., |c(t)| = 1) which can be left out.
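The constraints on tasks defined above can be checked mechanically. The following Python sketch is only an illustration under our own assumptions: the names TaskType, Task, and check_task are hypothetical, and actions are represented by plain strings. It verifies the number of children per task type and that no task is its own direct or indirect child.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple, Union


class TaskType(Enum):
    SEQUENCE = "sequence"
    ITERATION = "iteration"
    SELECTION = "selection"
    OPTIONAL = "optional"


@dataclass(eq=False)  # tasks are compared by identity
class Task:
    type: TaskType
    children: List[Union["Task", str]] = field(default_factory=list)


def check_task(task: Task, ancestors: Tuple[Task, ...] = ()) -> None:
    """Raise ValueError if the task violates the definitions above."""
    count = len(task.children)
    if task.type in (TaskType.SEQUENCE, TaskType.SELECTION) and count < 2:
        raise ValueError(f"{task.type.value} needs two or more children")
    if task.type in (TaskType.ITERATION, TaskType.OPTIONAL) and count != 1:
        raise ValueError(f"{task.type.value} needs exactly one child")
    path = ancestors + (task,)
    for child in task.children:
        if isinstance(child, Task):
            # A task must never be its own direct or indirect child.
            if any(child is ancestor for ancestor in path):
                raise ValueError("a task refers to itself")
            check_task(child, path)


enter_user_name = Task(
    TaskType.SEQUENCE,
    ['Click on Text Field "username"', 'Enter Text in Text Field "username"'],
)
check_task(enter_user_name)  # passes silently
```

Passing the list of ancestors down the recursion keeps the check simple: a child that already appears on the path from the root would make the task its own indirect child.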

Through the child relationships defined above, a task forms a tree structure that we call a task tree. The root node of a task tree is the task itself. The leaf nodes are the actions belonging to the task. The intermediate nodes are the child tasks belonging to the root task and define together with the root task the execution order of the actions. An example of a task tree representing a typical login procedure on a website, including the entering of a user name and a password, is shown in Figure 2.2. The leaf nodes are the actions that can be performed. The parent nodes, i.e., tasks, define through their type (indicated through the node name) the execution order of the actions. For example, the task Sequence 2, which represents the entering of the user name, is a sequence and defines that its children Click on Text Field "username" and Enter Text in Text Field "username" must be executed in their given order. Selection 1 defines that the user may choose between entering the user name, represented through Sequence 2, and entering the password, represented through Sequence 3. Iteration 1 defines that the user can perform this selection any number of times. After the user name and the password are entered, the user may optionally check a check box to stay logged in, represented through Optional 1. Finally, the task is completed through a click on the login button.

Sequence 1
    Iteration 1
        Selection 1
            Sequence 2
                Click on Text Field "username"
                Enter Text in Text Field "username"
            Sequence 3
                Click on Text Field "password"
                Enter Text in Text Field "password"
    Optional 1
        Check Checkbox "stay logged in"
    Click on Button "login"

Figure 2.2.: Example of a simple task tree representing a login process on a website.

The execution of a task t is called a task instance t⁰ [26]. A task instance also has children being task or action instances. Hence, it also forms a tree structure similar to that of a task. The leaf nodes of this tree are action instances. The root node is an instance of the respective task. The number and types of children of a task instance depend on the type of the corresponding task. The children of a sequence instance s⁰ of sequence s with n children c(s) = c₁ ... cₙ are a list of instances of the children of s in the same order, i.e., c(s⁰) = c⁰₁ ... c⁰ₙ. An iteration instance has one or more children all being instances of the single child of the iteration. The number of children of an iteration instance defines how often the child of the iteration was executed. A selection instance has exactly one child being an instance of one of the children of the selection and representing the selected execution variant. An optional instance has zero or one child being an instance of the single child of the optional. If the optional instance has no child, the execution of the single child of the optional was left out. Otherwise it was performed. An example for a task instance of the task tree in Figure 2.2 is shown in Figure 2.3. The nodes are instances of the respective tasks or actions. The user first enters a user name, then a password, leaves the box for staying logged in unchecked, and performs the login through a click on the respective button.

Instance of Sequence 1
    Instance of Iteration 1
        Instance of Selection 1
            Instance of Sequence 2
                Instance of Click on Text Field "username"
                Instance of Enter Text in Text Field "username"
        Instance of Selection 1
            Instance of Sequence 3
                Instance of Click on Text Field "password"
                Instance of Enter Text in Text Field "password"
    Instance of Optional 1
    Click on Button "login"

Figure 2.3.: Example of a task instance representing an execution of the task in Figure 2.2.
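The rules for the children of task instances can be expressed as a conformance check. The following Python sketch is an illustration only: the classes Task and Instance, the function conforms, and the helper run are hypothetical names, and actions are again represented by strings. The example rebuilds the task of Figure 2.2 and the execution described for Figure 2.3.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(eq=False)  # tasks are compared by identity
class Task:
    type: str  # "sequence", "iteration", "selection", or "optional"
    children: List[Union["Task", str]]


@dataclass(eq=False)
class Instance:
    of: Union[Task, str]  # the executed task or action
    children: List["Instance"] = field(default_factory=list)


def conforms(instance: Instance) -> bool:
    """Check the children of an instance against the type of its task."""
    node, kids = instance.of, instance.children
    if isinstance(node, str):  # action instances are leaf nodes
        return not kids
    if node.type == "sequence":
        ok = len(kids) == len(node.children) and all(
            kid.of == child for kid, child in zip(kids, node.children))
    elif node.type == "iteration":
        ok = len(kids) >= 1 and all(kid.of == node.children[0] for kid in kids)
    elif node.type == "selection":
        ok = len(kids) == 1 and any(kids[0].of == c for c in node.children)
    elif node.type == "optional":
        ok = len(kids) <= 1 and all(kid.of == node.children[0] for kid in kids)
    else:
        return False
    return ok and all(conforms(kid) for kid in kids)


def run(sequence: Task) -> Instance:
    """Helper: instance of a sequence whose children are all actions."""
    return Instance(sequence, [Instance(action) for action in sequence.children])


seq2 = Task("sequence", ['Click on Text Field "username"',
                         'Enter Text in Text Field "username"'])
seq3 = Task("sequence", ['Click on Text Field "password"',
                         'Enter Text in Text Field "password"'])
sel1 = Task("selection", [seq2, seq3])
it1 = Task("iteration", [sel1])
opt1 = Task("optional", ['Check Checkbox "stay logged in"'])
seq1 = Task("sequence", [it1, opt1, 'Click on Button "login"'])

login_run = Instance(seq1, [
    Instance(it1, [Instance(sel1, [run(seq2)]), Instance(sel1, [run(seq3)])]),
    Instance(opt1),  # no child: the check box was left unchecked
    Instance('Click on Button "login"'),
])
print(conforms(login_run))
```

An instance of Selection 1 with two children, in contrast, would not conform, as a selection instance has exactly one child.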

A task has various characteristics that are important for our work. For example, we consider the depth of a task, depth(t), which we call task depth and which corresponds to the number of levels of the corresponding task tree. The task itself is the first level, its children the second, and so on. The actions are the last level of a task tree. The example task in Figure 2.2 has a depth of depth(Sequence 1) = 5. In our work, we generate task trees based on recordings of action instances, i.e., events. For generated task trees, we define several functions. One is a⁰(t), which returns all recorded action instances based on which the task t and its task tree were generated. Similarly, a⁰(t⁰) returns the recorded action instances which represent the instance t⁰ of task t. A further function is x(t), which returns all instances, i.e., executions, of task t.
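The task depth can be computed recursively. The following Python sketch assumes a simplified representation that is not part of this thesis: a task is a tuple of its type and a list of children, and an action is a plain string. Applied to the task of Figure 2.2, it reproduces the depth stated above.

```python
from typing import List, Tuple, Union

# A task is (type, children); an action is a string (assumed representation).
Node = Union[str, Tuple[str, List["Node"]]]


def depth(node: Node) -> int:
    """Number of levels of the task tree rooted at the given node."""
    if isinstance(node, str):  # an action forms a single level
        return 1
    _, children = node
    return 1 + max(depth(child) for child in children)


sequence_2 = ("sequence", ['Click on Text Field "username"',
                           'Enter Text in Text Field "username"'])
sequence_3 = ("sequence", ['Click on Text Field "password"',
                           'Enter Text in Text Field "password"'])
selection_1 = ("selection", [sequence_2, sequence_3])
iteration_1 = ("iteration", [selection_1])
optional_1 = ("optional", ['Check Checkbox "stay logged in"'])
sequence_1 = ("sequence", [iteration_1, optional_1, 'Click on Button "login"'])

print(depth(sequence_1))  # 5
```

The deepest path runs from Sequence 1 over Iteration 1, Selection 1, and Sequence 2 (or Sequence 3) down to an action, which gives five levels.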

2.3. Usability Engineering

Usability is a characteristic of products in general [27]. According to ISO 9241 part 11, the usability of products focuses on executing tasks with effectiveness, efficiency, and satisfaction [28]. Effectiveness means that the tasks are fully completed. Efficiency refers to the effort for task execution, which should be as low as possible. Satisfaction, in addition, considers that the task execution must be pleasant for the users. Usability depends on the usage context, which covers user groups, tasks to be executed, as well as the physical and social environment of the user. Usability may vary strongly between different usage contexts, which means, e.g., that a product can have a high usability for one person and a low usability for another.

Usability can be considered to reflect "... how easy a system is to learn and use, how productively users will be able to work and how much support users need" [29]. ISO 9126 part 1¹ also provides a definition for usability. There, the focus is on software quality and usability is, therefore, "[t]he capability of [a] software product to be understood, learned, used and attractive to the user, when used under specified conditions" [30]. Although differing in some aspects, this definition is similar to the one of ISO 9241 part 11 as it also considers the usage context, i.e., the specified conditions. In addition to effectiveness, efficiency, and satisfaction, further aspects may be considered. Examples are learnability [31], error rate [32], and attention [33].

In this thesis, we use the definition of usability according to ISO 9241 part 11. Although the term is defined for products in general, we consider the usability of software only. The term software in this thesis covers mainly websites and GUI-based computer programs on PCs, but we also consider apps on mobile devices.

A usability issue is a problem with the software that decreases its usability. This means that it decreases one or several of the factors effectiveness, efficiency, and satisfaction [22]. Usability issues can have different causes like the visual design, the information architecture, the performance, or failures of a software. For example, a specific color combination in the visual design can make it hard for users to identify a certain GUI element and, therefore, to fulfill a task (effectiveness). Furthermore, the information architecture may require users to perform long navigation paths through a website (efficiency) to reach specific information.

A usability smell in our work is exceptional behavior of users indicating one or more usability issues [22]. For example, users clicking on an unclickable GUI element may indicate a usability issue with respect to the visual design. Another example is that users perform long navigation paths through a website. This indicates inefficiency in finding specific information, i.e., the usability issue mentioned above. A usability smell has a description of expected user behavior and refers to usability issues it may indicate. Furthermore, it has an intensity being the likelihood of indicating a usability issue.

The goal of a usability evaluation is to measure different aspects of the usability of a software [3], like efficiency and satisfaction. Basically, it aims at identifying usability issues. This requires the predefinition of evaluation goals, the analysis of the usage context, and finally the measurement and assessment of usability aspects using dedicated methods. The analysis of the usage context includes the identification of typical tasks users perform with a software. These tasks serve as input for the evaluation methods.

The usability evaluation methods can be subdivided into expert- and user-oriented methods [34]. Expert-oriented methods are performed by experts who know how a specific method must be applied. These methods define concrete steps the expert has to take to identify usability issues. For example, an expert measures the achievable efficiency of a user executing a specific task by identifying detailed actions a user has to take and then estimating the average time for the action executions [35]. In contrast, user-oriented methods follow a process in which users use a prototype or a running software for predefined tasks while they are observed by an evaluator. The observations are then analyzed and help to identify usability issues. For gathering data during the observations, different methods like taking notes, recording user actions, letting users fill out questionnaires, or thinking aloud [27] can be applied. Thinking aloud asks the users to verbalize their thoughts while performing the tasks so that the evaluator gains insights that are not visible from the plain actions that the users perform. User-oriented usability evaluation can be done in a laboratory or in the field [36]. A laboratory setup might influence the user and, hence, the evaluation results [27]. When done in the field, user-oriented usability evaluation lets users do their tasks in their natural environment, i.e., in the matching usage context, making the results more reliable [36]. For a user-oriented usability evaluation, as few as three to six users are sufficient to determine the most important usability issues [34].

¹ In the meantime, ISO 9126 has been superseded by ISO 25000.

With model-based usability evaluation, we refer to usability evaluation methods that utilize some kind of model. These models can describe users and the way they utilize a software [35] or the software itself [18]. For example, a model can define average durations for specific actions or it may describe the GUI. Usually, the model is created before and analyzed during the evaluation. A model can be created manually or automatically, where in the latter case it is usually derived from other models (like the GUI itself) through model transformation.

The continuous application of usability evaluation methods during the development
