
Which Quality Characteristics Can We Detect Automatically?

5. Discussion of Results

5.3. Which Quality Characteristics Can We Detect Automatically?


To discuss which defects can be detected automatically, we first have to define the population of defects. There are two approaches to defining this population: either we take definitions of quality as the population, or we refer to their instances in the form of defects in RE artifacts. In Publication G, we performed the latter type of analysis; in Publication I, we performed the former (see Section 4.4.2). In the following, we further discuss the former view.

For the sake of the argument, we assume quality to be defined by the quality models of ISO 29148 and the IREB curriculum, as displayed in Fig. 2.11 and discussed in depth in Section 2.4.3. In the following, for each characteristic, we provide a brief definition and discuss which types of defects cannot and which can be found automatically. We base the analysis upon the results for RQ 4.3 and the discussion in the previous section.

For those defects that cannot be automatically detected, we provide five reasons that make it hard or even impossible to develop an automatic requirements smell detection (see Section 5.2.2):²

R1: Requires more explicit or precise quality definition.

R2: Requires semantic understanding of text.

R3: Requires domain knowledge.

R4: Requires knowledge of the system’s scope (e.g. goal).

R5: Requires knowledge of the project’s process and progress.

For those defects that can be automatically detected, automatic requirements smell detection requires a quality factor to be broken down to or approximated at the syntactic level. Therefore, as discussed in Section 4.3.2.2, we have to break down the problem (including its semantic aspects) to one or more of the following three levels:

S1: Lexical smells.

S2: Grammatical smells.

S3: Structural smells.

We start analyzing the characteristics on the RE artifact level (i.e. for a set of requirements), before looking at characteristics for individual requirements, as proposed by IREB and IEEE 29148.

5.3.1. Characteristics for Sets of Requirements

In the following, we discuss the characteristics at the RE artifact level (i.e. for a set of requirements) as proposed by IREB and IEEE 29148.

Consistency. A consistent RE artifact is free of redundancy and contradictions.

What we cannot detect: In general, checking consistency requires understanding the semantics of a text, including the semantics in context, e.g. pragmatics [CW14], or discourse analytics [JM14]. Unfortunately, existing technologies that could provide such information do not yet achieve sufficient accuracy [CW14] (R2).

2 We drop the reason of data noise, since this analysis is performed on the definition level, not on the instance, i.e. data, level.

What we can detect: However, to some extent, we can detect redundancy through clone detection and information retrieval: As discussed by Juergens et al. [JDF+10], clone detection can find consistent and, to a certain extent, also inconsistent reuse if it stems from copy-and-paste reuse. Furthermore, Falessi et al. [FCC13] present a systematic analysis that shows the potential to detect equivalent requirements with approaches based on information retrieval. Some approaches leveraging shallow semantic parsing [RMDP16] also show promising results in first examples. These approaches work on the lexical (S1) or structural level (S3).
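To illustrate the information-retrieval idea, the following is a minimal sketch that flags pairs of requirements whose term-frequency vectors have a high cosine similarity. It is not the method of the cited studies; the function names, the threshold of 0.8, and the sample requirements are assumptions made for illustration only.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity of two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def redundancy_candidates(requirements, threshold=0.8):
    """Return (id, id, score) for requirement pairs at or above the threshold.

    The threshold is an assumed value, not an empirically calibrated one.
    """
    vectors = {rid: Counter(tokenize(text)) for rid, text in requirements.items()}
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical sample requirements.
requirements = {
    "R1": "The system shall log every failed login attempt.",
    "R2": "The system shall log every failed login attempt of a user.",
    "R3": "The operator can export reports as PDF.",
}
candidates = redundancy_candidates(requirements)
print(candidates)
```

Such a check only compares word overlap (S1). Two requirements that state the same thing in different words, or contradict each other with similar words, are outside its reach, which is the limitation attributed to R2 above.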

Completeness. A complete RE artifact includes all relevant stakeholders, goals, and requirements, as well as all information necessary for the RE artifact stakeholders³.

What we cannot detect: Completeness cannot generally be detected for three reasons: First, which stakeholders, goals, etc. are in scope of the project is a matter of the system scope (R4). Second, even if we knew the system scope, we would have to understand the semantics of the text to know whether the description is complete (R2). Third, we would need a more precise definition of completeness for different types of artifact stakeholders (R1, cf. [EVF16, EVFM16]).

What we can detect: However, if we can reduce completeness to structural completeness, e.g. if a company sets up specific templates, one can detect whether all sections contain content. In addition, heuristics can check whether certain terms expected in a section are actually used there (S1, S3).
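A structural completeness check of this kind can be sketched as follows. The template, its section titles, and the expected terms are hypothetical; the sketch assumes that a section heading is a line that exactly equals a template title.

```python
import re

# Hypothetical company template: section title -> terms expected in that section.
TEMPLATE = {
    "Stakeholders": ["user", "operator"],
    "Goals": ["goal"],
    "Requirements": ["shall"],
}


def split_sections(document):
    """Map each template heading found in the document to its body text."""
    sections, current = {}, None
    for line in document.splitlines():
        stripped = line.strip()
        if stripped in TEMPLATE:
            current = stripped
            sections[current] = []
        elif current is not None:
            sections[current].append(stripped)
    return {title: " ".join(body).strip() for title, body in sections.items()}


def structural_findings(document):
    """Report missing sections, empty sections, and absent expected terms."""
    sections = split_sections(document)
    findings = []
    for title, expected_terms in TEMPLATE.items():
        body = sections.get(title)
        if body is None:
            findings.append((title, "section missing"))
        elif not body:
            findings.append((title, "section empty"))
        else:
            words = set(re.findall(r"[a-z]+", body.lower()))
            for term in expected_terms:
                if term not in words:
                    findings.append((title, f"expected term '{term}' not found"))
    return findings


# Hypothetical artifact with an empty "Goals" section.
DOCUMENT = """\
Stakeholders
The user starts the export.

Goals

Requirements
The system shall export reports.
"""
findings = structural_findings(DOCUMENT)
print(findings)
```

The check says nothing about whether the content of a section is complete with respect to the system scope (R4); it only confirms that the template was filled in.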

Affordability. In an affordable RE artifact, the team members can create a system fulfilling all requirements within life cycle constraints (e.g. time and budget).

What we cannot detect: Affordability requires unambiguity and completeness, but goes beyond both. For automatic detection of violations of affordability, we would need an automatic approach to estimate the time required to create a system (such as an automatic version of use case point methods [ADSJ01]) that takes the project and system constraints and context into account. Judging from the current state of the even simpler problem of use case points, we argue that such an automatic detection is currently still far from reality (R3, R4).

What we can detect: However, an ambiguous or incomplete requirement inherently cannot be estimated, and estimation is the basis for affordability. Therefore, existing automatic approaches that reduce ambiguity and incompleteness also serve to improve affordability (see Unambiguity and Completeness in Characteristics for Individual Requirements; S1, S2).

Boundedness. In a bounded RE artifact, all requirements are within an identified scope, and within user goals and needs.

What we cannot detect: Boundedness refers to an externally identified scope (R4). Therefore, there is no generic approach to detect unbounded requirements without adding external information.

What we can detect: Yet, some terms, such as "including" or "etc.", can hint at some types of scope creep, i.e. imprecise definitions of the scope (S1).
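A lexical check for such terms can be as simple as a dictionary lookup. The sketch below uses only the two terms named in the text; the function name and the sample requirements are assumptions for illustration.

```python
import re

# The two open-ended terms named in the text; a real dictionary would be larger.
OPEN_ENDED_TERMS = re.compile(r"\b(including|etc\.)", re.IGNORECASE)


def open_ended_terms(requirements):
    """Return (requirement id, matched term) for every open-ended term found."""
    findings = []
    for rid, text in requirements.items():
        for match in OPEN_ENDED_TERMS.finditer(text):
            findings.append((rid, match.group(1).lower()))
    return findings


# Hypothetical sample requirements.
requirements = {
    "R1": "The system supports common formats, including PDF, CSV, etc.",
    "R2": "The system shall export reports as PDF.",
}
scope_hints = open_ended_terms(requirements)
print(scope_hints)
```

A finding is only a hint: whether the listed items actually leave the scope open still has to be judged against the externally defined scope (R4).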

Clear Structure. This IREB criterion is not properly defined by IREB. We assume it contains two factors: In a clearly structured RE artifact, the artifact follows a common structure, and this structure can be understood efficiently and effectively by the readers.

What we cannot detect: We cannot automatically detect whether a structure can be understood efficiently and effectively. This requires structured, empirical analyses, as we describe in RQ 2 (R1).

3 Please note that, according to our quality paradigm, an absolutely complete RE artifact is not necessarily advisable. It might not even be possible, as discussed by Glinz [Gli16].


What we can detect: However, if a structure is defined, we can automatically detect whether an artifact follows this structure (S3).

Modifiability. A modifiable RE artifact allows that RE artifact changes can be performed quickly and error-free.

What we cannot detect: Modifiability is an activity-based property and, as such, its definition depends on the activity, i.e. the modification. In Publication C, we analyzed executed changes to clarify this quality characteristic. In our study, we found locally dispersed information, UI details, and improper references to be particularly difficult to maintain. Another large set of changes was related to the taxonomy. We cannot detect with perfect accuracy whether two parts of a document are related; however, information retrieval approaches (cf. Falessi et al. [FCC13]) can help with this task. Furthermore, we cannot foresee future taxonomy changes, so it is difficult to judge whether an existing taxonomy will change in the future (R4).

What we can detect: To some extent, we can detect locally dispersed information (see Consistency), as well as UI details and improper references. However, all of these approaches are limited to syntactic aspects (S1, S3).

Traceability. In a traceable RE artifact, each requirement correctly defines its links to the origin, implementation and related requirements.

What we cannot detect: Whether two items should be linked ultimately depends on the domain of the system (R3) and, therefore, cannot be perfectly predicted in general. In addition, judging the correctness of existing links also requires a semantic understanding of the texts (R2).

What we can detect: However, if this traceability is part of the requirements structure, and every requirement must have certain types of traces, we can structurally detect whether these traces exist (S3). To analyze whether links are missing, or whether the existing links are correct, various heuristics exist, mostly drawing on information retrieval methods (S1). The current state of traceability is discussed, among others, by De Lucia et al. [DFOT07] and, more recently, by Eder [Ede16].
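The structural part of this check can be sketched as follows. The data model, the set of mandatory trace types, and the identifiers are hypothetical; the sketch only tests whether traces exist and whether links between requirements point to known requirements, not whether a link is semantically correct.

```python
from dataclasses import dataclass, field

# Assumed mandatory trace types; which types are mandatory is a project decision.
REQUIRED_TRACES = ("origin", "implementation")


@dataclass
class Requirement:
    rid: str
    text: str
    # Trace type -> list of target identifiers.
    traces: dict = field(default_factory=dict)


def missing_traces(requirements, required=REQUIRED_TRACES):
    """Return {requirement id: [missing trace types]} for requirements with gaps."""
    report = {}
    for req in requirements:
        missing = [t for t in required if not req.traces.get(t)]
        if missing:
            report[req.rid] = missing
    return report


def dangling_links(requirements):
    """Return (source id, target id) for 'related' links to unknown requirements."""
    known = {req.rid for req in requirements}
    return [
        (req.rid, target)
        for req in requirements
        for target in req.traces.get("related", [])
        if target not in known
    ]


# Hypothetical sample requirements.
reqs = [
    Requirement("R1", "The system shall log failed logins.",
                {"origin": ["G1"], "implementation": ["C7"], "related": ["R2"]}),
    Requirement("R2", "The system shall lock the account.",
                {"origin": ["G1"], "related": ["R9"]}),
]
print(missing_traces(reqs))
print(dangling_links(reqs))
```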

Unambiguity. Ambiguity is the degree to which a document can be understood in multiple ways. We discuss this in Characteristics for Individual Requirements (Section 5.3.2).

5.3.2. Characteristics for Individual Requirements

In the following, we discuss the characteristics at the level of individual requirements as proposed by IREB and IEEE 29148.

Unambiguity and understandability. Ambiguity is the degree to which a document can be understood in multiple ways. We will discuss this factor on the individual requirements level. Berry et al. [BKK03] differentiate between lexical, syntactic, semantic and pragmatic ambiguity.

What we cannot detect: In contrast to lexical and syntactic ambiguity, semantic and pragmatic ambiguity refer to the meaning of a word, or even the meaning of a word in context. For both of these types of ambiguity, we are constrained by the current state of NLP (R2).

What we can detect: Yet, for the other two levels, various approaches exist. For lexical ambiguity, such as synonyms [BKK03], we can make use of existing thesauri (S1). For syntactic ambiguity (which Berry et al. define as a sentence having more than one parse [BKK03, p.10]), NLP can detect whether multiple parses of a sentence are (grammatically, not semantically) possible (S2).
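The thesaurus lookup for the lexical case can be sketched as follows. The mini-thesaurus is hypothetical and stands in for an existing thesaurus; the sketch reports every synonym set of which two or more variants occur in the same artifact, which may indicate that different words are used for the same concept.

```python
import re

# Hypothetical mini-thesaurus: each set groups terms treated as synonyms.
THESAURUS = [
    {"user", "operator", "customer"},
    {"display", "screen", "monitor"},
]


def synonym_findings(text, thesaurus=THESAURUS):
    """Return the sorted variants of each synonym set used more than once."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    findings = []
    for synset in thesaurus:
        used = sorted(synset & words)
        if len(used) > 1:
            findings.append(used)
    return findings


# Hypothetical artifact text.
artifact = (
    "The user enters the PIN. "
    "The operator confirms the PIN on the display. "
    "The screen shows the result."
)
synonyms = synonym_findings(artifact)
print(synonyms)
```

Whether "user" and "operator" really denote the same role in a given project is a semantic question that this lookup cannot answer (R2); it only points a reviewer to the candidates.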


Necessity. An individual requirement is necessary if its removal leads to a deficiency, the requirement is not obsolete, and the planned expiration is clearly identified [ISO11b, p.11].

What we cannot detect: Whether a requirement is necessary is a question of the system scope; detecting it is therefore, in general, impossible to automate without additional input (R4).

What we can detect: However, it is possible to detect whether a requirement structurally contains information on a planned expiration (S3). In addition, existing approaches (e.g. AQUSA [LDBvdW15]) aim at detecting additional, unnecessary information, such as fill words (S1).

Completeness and verifiability. A complete individual requirement contains all information necessary for this requirement to be used in the subsequent process. Therefore, verifiability is one aspect of completeness of individual requirements. We detail this aspect in our related publications [EVF16, EVFM16].

What we cannot detect: As we detail in [EVF16, EVFM16], completeness depends strongly on the usage context. Therefore, which information makes an RE artifact complete must be more precisely specified (R1) and also depends on the context (R3).

What we can detect: However, certain phrases and grammatical structures are inherently incomplete. Many of our smells address this aspect, such as the unverifiable terms smell or the superlatives smell, and can therefore be considered to improve completeness or verifiability (S1, S2).

Freedom from implementation. An implementation-free specification is defined on the right level of abstraction. However, since RE is often an iterative process between the problem and the solution domain (according to, e.g. the twin-peaks-model [Nus01]), the right level of abstraction varies.

What we cannot detect: What is the right level of abstraction, what is problem and what is solution domain, cannot be generally detected since it depends on the context (R3) and on the project scope (R4). In addition, sometimes, e.g. for system constraints, we constrain the implementation. On this level, we intentionally dive down into the solution domain. Therefore, whether this level of abstraction is intentional or not, depends on the context (R3).

What we can detect: However, if we know the intended level of abstraction, e.g. in use cases, or through the definition in company templates or guidelines, we can often identify certain violations. Common examples of this are references to the user interface, or references to variables or signals (S1).

Singularity. According to ISO 29148, an individual requirements statement is singular if it contains no more than one requirement [ISO11b, p.11].

What we cannot detect: It is still unclear what an individual requirement is (unless it is structurally defined) and, consequently, how to separate individual requirements in general (R1).

What we can detect: As proposed in the standard, we can detect conjunctions, nominalizations or other grammatical smells for violations of singularity (see e.g. Koerner et al. [KB09]).
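As an illustration of the conjunction case, the following minimal sketch lists the coordinating expressions found in each requirement. The word list and the sample requirements are assumptions; the sketch matches surface words only and does not parse the sentence, so it cannot tell a conjunction that joins two requirements from one inside a fixed phrase.

```python
import re

# Assumed list of coordinating expressions; not taken from the standard.
COORDINATING = re.compile(r"\b(and|or|as well as)\b", re.IGNORECASE)


def singularity_candidates(requirements):
    """Return {requirement id: [conjunctions found]} for requirements with hits."""
    candidates = {}
    for rid, text in requirements.items():
        hits = [m.group(1).lower() for m in COORDINATING.finditer(text)]
        if hits:
            candidates[rid] = hits
    return candidates


# Hypothetical sample requirements.
requirements = {
    "R1": "The system shall store and encrypt the password.",
    "R2": "The system shall lock the account after three failed attempts.",
    "R3": "The operator can export or print reports as well as archive them.",
}
non_singular = singularity_candidates(requirements)
print(non_singular)
```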

Agreement. Although this characteristic remains undefined in the IREB glossary [Gli14], we assume that agreed requirements are accepted by all stakeholders.

What we cannot detect: There can be known and unknown disagreements. Unknown disagreements are part of the mental model, and therefore usually not manifested in the artifact, but a matter of the context (R3).