Usage-based Generation of Task Trees

3.4. Usage-based Generation of Task Trees

In addition to usability evaluation, we describe in this thesis an approach for automatically generating task trees based on recorded action instances. Task trees are one variant of task models, which describe the nature and structure of the tasks that users perform with a software system [5]. Mostly, they refer to the users’ goals that can be achieved when performing a task. Following an ontology for task models defined by Van Welie et al. [24], task models are formal and describe 1) the decomposition of tasks into subtasks and actions, 2) a task’s flow, 3) the objects required or important for a task execution, and 4) the task world, i.e., the environment in which tasks are executed. The task trees that we generate in our approach focus only on task decomposition and task flow description.

Task models can be applied at several stages of software development. For example, they can aid software design, help validate design decisions, or be used for generating task-oriented user interfaces [24]. The task trees generated in our approach can be used for a summative validation of a software system and, through this, support subsequent design adaptations.

The ontology of Van Welie et al. [24] allows for comparing different variants of task models. It defines a terminology for typical concepts and their relationships used in task models. The concepts important for our work are task, basic task, and user action. Van Welie et al. describe the two latter concepts as being more concrete variants of the first. A user action in Van Welie’s terminology is what we simply call an action in this thesis. A basic task is "... a task for which a system provides a single function. Usually[,] basic tasks are further decomposed into user actions and system operations" [24]. With our approach, we mainly identify tasks on the level of Van Welie’s basic tasks but without considering system operations. Van Welie et al. further use the term unit task, which they describe "... as the simplest task that a user really wants to perform" [24]. This is the level of tasks that our approach can generate as long as sufficiently many users are recorded performing these tasks. However, our tasks do not refer to a user’s goal, as this cannot be derived automatically.

In addition to the different terms for tasks, Van Welie et al. define relationships between tasks. The relationships that are generated in our approach are Van Welie’s subtask and trigger. The subtask relationship corresponds to the parent-child relationships in our task trees. The trigger relationship defines the order in which tasks are executed. In our approach, this is covered through the task types. Van Welie et al. mention three different trigger relationship types: AND, OR, and NEXT. The NEXT trigger, defining a subsequent order of tasks or actions, is covered by our task type sequence. The OR trigger, defining execution variants, is supported through our task types selection, iteration, and optional as well as their possible combinations. The AND trigger, used to define parallel task execution, is not supported by our approach. According to Van Welie et al., the trigger relationship can be implemented on task level through temporal relationships or through modeling a workflow representation [24]. In our approach, we focus on the first variant, which has the disadvantage that intermediate nodes may be required in the task trees to fully describe a trigger relationship [24]. The advantage is that we do not require an additional time axis, which is needed for the workflow representation.
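The mapping between the task types and Van Welie's triggers described above can be illustrated with a small data structure. The following is only a minimal sketch under stated assumptions: the names `Task`, `TRIGGER_FOR_TYPE`, and `describe` as well as the example actions are hypothetical and chosen for illustration; they are not taken from the implementation behind this thesis.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Task type -> trigger in Van Welie et al.'s terminology, as stated in the text.
# The AND trigger (parallel execution) has no counterpart and is therefore absent.
TRIGGER_FOR_TYPE = {
    "sequence": "NEXT",
    "selection": "OR",
    "iteration": "OR",
    "optional": "OR",
}


@dataclass
class Task:
    """Hypothetical task tree node: either a leaf action or a typed inner node."""
    kind: str                                   # "action" or a key of TRIGGER_FOR_TYPE
    name: str = ""                              # only used for leaf actions
    children: List["Task"] = field(default_factory=list)

    def trigger(self) -> Optional[str]:
        # Leaf actions do not define an execution order, so they have no trigger.
        return TRIGGER_FOR_TYPE.get(self.kind)


def describe(task: Task, depth: int = 0) -> List[str]:
    """Return one indented line per node; indentation shows the subtask relationship."""
    trigger = task.trigger()
    label = task.name if task.kind == "action" else f"{task.kind} [{trigger}]"
    lines = ["  " * depth + label]
    for child in task.children:
        lines.extend(describe(child, depth + 1))
    return lines


# Invented example: a login task with an optional step in the middle.
login = Task("sequence", children=[
    Task("action", "enter user name"),
    Task("optional", children=[Task("action", "tick remember me")]),
    Task("action", "click login"),
])

print("\n".join(describe(login)))
```

The optional node in the example is an intermediate node of the kind mentioned above: it exists only to express that one child of the sequence may be skipped.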

There are many different approaches that, like ours, use tree structures for task modeling. These approaches usually focus on a specific use of the task trees. For example, Goals, Operators, Methods, and Selection Rules (GOMS) is an approach that uses tree structures for manually describing a user’s task with all its actions, but also with all its mental and physical effort [52]. Based on this model, an evaluator can estimate the average task execution duration. In contrast, the Task Modelling Language (TaskMODL) focuses on specifying task trees with reference to the resources required for task execution [18]. This helps to identify all required resources, especially for complex tasks. Furthermore, Paternò et al. provide a notation for task trees called ConcurTaskTrees [53, 25] that allows specifying conditions, software-side activities, and data flow on top of usual task structures. In addition, parallel task executions are supported. The task structures generated in our approach are not intended to replace any of the modeling approaches introduced by other researchers. Instead, our goal is to have a simple variant of task trees which provides exactly what our approach requires or is able to generate. Through this, we hope to make our approach more understandable. Nonetheless, we show that our task tree notation can be transformed into other notations like ConcurTaskTrees.

There have been several attempts to detect tasks in recorded user actions. An example is Automated Website USability Analysis (AWUSA) [54], which tries to determine usage patterns from recorded website navigation. This work focuses on navigational patterns and does not include other actions. Furthermore, some works calculate a probabilistic model of system usage, e.g., using Petri nets [55]. These models include representations of tasks in the form of the most probable system usage. But they usually include longer action combinations which were not executed by users. In addition, these longer action combinations may have a high probability because shorter contained action combinations have a high probability. As such, these models may provide a wrong view of the actual system usage.

Our task trees may also represent invalid action combinations. However, the tasks always refer to their instances which show how the tasks were effectively executed, making our model more realistic.
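The effect that a probabilistic usage model can contain action combinations that no user executed can be shown with a very small example. The sketch below uses a first-order Markov chain as a stand-in, not the Petri-net models of [55]; the function names and the two recorded sessions are invented for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next action | current action) from recorded sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


def path_probability(probabilities, path):
    """Multiply the transition probabilities along a path (0.0 if a step is unknown)."""
    result = 1.0
    for current, following in zip(path, path[1:]):
        result *= probabilities.get(current, {}).get(following, 0.0)
    return result


# Two invented recordings that share only the middle action.
sessions = [
    ["search", "open item", "buy"],
    ["browse", "open item", "bookmark"],
]
model = transition_probabilities(sessions)

# This combination was never recorded, yet the model assigns it a non-zero probability.
never_recorded = ["search", "open item", "bookmark"]
print(never_recorded in sessions, path_probability(model, never_recorded))
```

Because the model only remembers pairs of consecutive actions, it cannot tell that "bookmark" was only ever reached by users who started with "browse". Keeping the recorded instances next to each task, as our task trees do, preserves exactly this information.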

There are several attempts that detect tasks in the form of action combinations in recorded action instances based on labeled data [4]. Labeling means identifying where a task starts or ends in a list of recorded action instances. The labels can be defined by the evaluator [56] or by the user when they start a task [57]. Other approaches try to generate the labels, for example, via time stamps of the events [58]. Afterwards, the approaches use different methods, e.g., machine learning [59] or sequence alignment [58], to identify tasks based on the labels. Our approach does not require labeled data and is as such more flexible. There are also approaches working on unlabeled data, e.g., Maximal Repeating Patterns (MRPs), which provide statistics for action combinations up to a specific length [15, 9]. This is similar to subdividing natural language texts into n-grams up to a certain length and then providing statistics for them. These approaches always consider full action combinations, which may include repetitions of single actions. In our approach, we also detect repetitions of actions as standalone tasks and include them in larger action combinations. Furthermore, MRPs do not consider shorter action combinations as being part of longer action combinations, as is done by our approach.
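The n-gram analogy can be made concrete with a few lines of code. This is a minimal sketch of plain n-gram counting over action sequences, assuming invented session data and the hypothetical function name `ngram_statistics`; it is not an implementation of the MRP approaches in [15, 9].

```python
from collections import Counter


def ngram_statistics(sessions, max_length):
    """Count every action combination of length 1..max_length in all sessions."""
    counts = Counter()
    for session in sessions:
        for n in range(1, max_length + 1):
            for start in range(len(session) - n + 1):
                counts[tuple(session[start:start + n])] += 1
    return counts


# Invented recordings; the first user typed twice, the second only once.
sessions = [
    ["click", "type", "type", "submit"],
    ["click", "type", "submit"],
]
stats = ngram_statistics(sessions, max_length=2)

for combination, count in sorted(stats.items()):
    print(combination, count)
```

In these statistics, the repeated "type" action shows up only as the separate combination ("type", "type"). Nothing relates it to the single "type" in the second session, which is the kind of relationship an iteration node in a task tree is meant to express.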

A further variant for identifying tasks from user actions is programming by example. Here, user actions are observed at runtime. If a pattern of actions is identified as a task, users are either offered next steps to execute [60] or the option to automate repeated action combinations, i.e., tasks. These approaches intend to improve the efficiency of a concrete user. They do not consider the broader view of many users having different or similar tasks.

There are approaches that generate GOMS models from recorded user actions. As mentioned above, GOMS models are a hierarchical approach for task modeling similar to task trees, with the goal of estimating task execution time. Examples of these approaches are ACT-R [61] and Convenient, Rapid, Interactive Tool for Integrating Quick Usability Evaluations (CRITIQUE) [62]. The goal of these approaches is to ease the definition of GOMS models. Usually, an optimal task execution is performed once by an evaluator and the tools then generate a GOMS model based on this single recording of actions. A further approach called TOME generates similar models from recordings of several executions of the same task [57]. For this, the recordings are manually linked to the executed task. All of these approaches require predefined tasks whose executions are recorded. Our approach aims at detecting real user tasks and not predefining them.

The interaction of a user with a software system can be seen as a language spoken by the user and understood by the software. Task trees are a grammatical description for such a language [10]. Executions of tasks are n-grams of the words belonging to the language. Therefore, approaches for grammatical inference for given language examples could be used for generating task trees based on recorded user actions. But most of the current approaches for grammatical inference require complete sentences of the language for which a grammar shall be created [63]. For natural languages, the sentences can be identified by splitting the input at, e.g., the punctuation marking the end of a sentence [64]. For recorded user actions, this cannot be automated that easily. Instead, it would mean labeling in the event streams where a task starts and where it ends. This can be time-consuming, especially if a large number of users is recorded. Our approach does not require labeled event streams and is, therefore, also applicable to larger data sets.

There are other attempts at grammatical inference for user actions that do not require a labeled event stream. But still, some of them require manual interaction of an evaluator [65], which is a disadvantage in comparison to our approach. To the best of our knowledge, ActionStreams is the only approach that generates grammars describing user actions without labeling the recordings [66]. It works by iterating over the recorded events; as soon as a subsequence, i.e., an n-gram, is detected that occurs a second time, a non-terminal node for the grammar is introduced. In this approach, iterations will only be detected as a whole and not as the single elements being repeated. Through this, ActionStreams generates grammars which are quite different from our task trees. Not the actions being executed most often are subsumed to a task, but those that occurred first in their combination. As a consequence, their approach generates different grammars for different orders of data input. Furthermore, considering a large amount of user recordings, their approach may lead to grammar structures which include non-terminals that represent seldom executed action combinations, whereas non-terminals for often executed action combinations are missing. Our approach, instead, creates the same task trees independently of the order in which user sessions are read. The task trees are generated considering the full data set at once and not only the part that has already been read.
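The difference between an order-dependent and an order-independent strategy can be illustrated with a deliberately simplified example. The sketch below is neither the ActionStreams algorithm [66] nor our task tree generation; it only contrasts two hypothetical selection rules on action pairs: "take the first pair that is seen a second time" versus "take the most frequent pair in the full data set". All names and recordings are invented.

```python
from collections import Counter


def pairs(session):
    """All pairs of directly consecutive actions in one session."""
    return list(zip(session, session[1:]))


def first_repeated_pair(sessions):
    """Incremental rule: return the first pair that is encountered a second time."""
    seen = set()
    for session in sessions:
        for pair in pairs(session):
            if pair in seen:
                return pair
            seen.add(pair)
    return None


def most_frequent_pair(sessions):
    """Whole-data-set rule: return the most frequent pair (ties broken alphabetically)."""
    counts = Counter(pair for session in sessions for pair in pairs(session))
    if not counts:
        return None
    highest = max(counts.values())
    return min(pair for pair, n in counts.items() if n == highest)


# Invented recordings: ("search", "open") occurs three times, ("help", "close") twice.
frequent = [["search", "open", "buy"], ["search", "open", "close"], ["search", "open"]]
seldom = [["help", "close"], ["help", "close"]]

order_a = frequent + seldom
order_b = seldom + frequent

print(first_repeated_pair(order_a), first_repeated_pair(order_b))
print(most_frequent_pair(order_a), most_frequent_pair(order_b))
```

With the incremental rule, the result changes with the order in which the sessions are read, and in the second order the less frequent pair wins. The rule that looks at the full data set returns the same pair for both orders.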

In parallel to the work on this thesis, there was an attempt to identify task trees having the same structure as those resulting from our approach using sequence alignment algorithms from the field of bioinformatics [67]. These alignment algorithms are usually used to align protein sequences with each other to detect similarities. The approach is capable of detecting sequences, iterations, optionals, and selections. It was compared to our approach in terms of processing duration and the resulting task tree structures. The alignment approach is significantly slower than our approach [67]. Furthermore, a major disadvantage of the alignment approach is that the resulting task trees describe many interactions that are not possible with the recorded software. In our approach, the task trees containing only sequences and iterations (which is the case at the beginning of the generation process) do not have this downside, which allows for reliable analysis. In contrast, the final merged task trees resulting from our approach may describe invalid interactions.
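How aligning two recordings can hint at optional steps may be sketched with Python's standard library. Note the assumptions: `difflib.SequenceMatcher` implements a generic longest-matching-block comparison, not the bioinformatics alignment algorithms used in [67], and the function name `align` and the two recordings are invented for illustration.

```python
from difflib import SequenceMatcher


def align(recording_a, recording_b):
    """Split two action sequences into shared and differing parts.

    Returns tuples (tag, part_of_a, part_of_b) where tag is one of difflib's
    opcodes: "equal", "delete", "insert", or "replace".
    """
    matcher = SequenceMatcher(None, recording_a, recording_b, autojunk=False)
    return [
        (tag, recording_a[a1:a2], recording_b[b1:b2])
        for tag, a1, a2, b1, b2 in matcher.get_opcodes()
    ]


# Invented recordings of the same form being submitted with and without a comment.
with_comment = ["click field", "type text", "click submit"]
without_comment = ["click field", "click submit"]

for tag, part_a, part_b in align(with_comment, without_comment):
    print(tag, part_a, part_b)
```

A part that exists in only one recording ("delete" or "insert") is a candidate for an optional, and a "replace" part is a candidate for a selection between two variants. Generalizing from such pairwise alignments is also where task trees describing interactions that were never recorded can come from.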
