
3.2 Study Design

3.2.4 Interview Study

We interviewed 12 stakeholders using semi-structured interviews. We selected semi-structured interviews as our method because we aim to gain in-depth, qualitative insights. This research method allows us to cover a fixed set of questions while giving us the freedom to follow up on the answers to better understand the stakeholders’ reasoning.

Based on our insights from Chapter 2, we created a mockup of a tool for app review analytics. The mockup presents scenarios for filtering user feedback automatically; it separates app reviews into problem reports and feature requests. We created this mockup to propose a possible solution to the stakeholder challenges introduced in Chapter 2. In the interviews, we first asked the stakeholders whether they find user feedback useful and how they include it in their work. Later, we presented the mockup and asked the stakeholders whether they found it helpful.

In the following, we present and explain the capabilities of the mockup. Then, we describe the interview setting, including a description of the interviewed stakeholders.

App Review Analytics Mockup

Figures 3.2, 3.3, and 3.4 show the web-based mockup for app review analytics.

The objective of the mockup is to show stakeholders a potential solution for automatically filtering explicit user feedback. The mockup serves as the basis of our discussion in the semi-structured interviews.

All figures of the mockup have the top bar in common. On the top bar, stakeholders can navigate through the mockup and can log in to an existing account.

If logged in, stakeholders can see personalized notifications. Personalized notifications cover updates in the analysis but also “watched users” who update their feedback. If, for example, a stakeholder replies to an app review, the user might update their review in return. That update triggers a notification that allows the stakeholders to react to the update in a timely manner.
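The mechanism behind such notifications can be illustrated with a minimal sketch, assuming a simple in-memory watch list; the names watch, notify, and on_review_updated are illustrative and not part of the mockup.

```python
from collections import defaultdict

# Watch list: maps a feedback author to the stakeholders watching them.
watchers: dict[str, set[str]] = defaultdict(set)

def watch(stakeholder_id: str, user_id: str) -> None:
    """Register a stakeholder as a watcher of a user's feedback."""
    watchers[user_id].add(stakeholder_id)

def notify(stakeholder_id: str, message: str) -> None:
    """Stand-in delivery channel; the mockup shows notifications in the top bar."""
    print(f"[notification for {stakeholder_id}] {message}")

def on_review_updated(user_id: str, review_text: str) -> None:
    """Called whenever an imported review changes; informs all watchers."""
    for stakeholder_id in watchers[user_id]:
        notify(stakeholder_id, f"{user_id} updated their review: {review_text}")

# Example: P1 watches a user after replying and is notified when the review changes.
watch("P1", "user42")
on_review_updated("user42", "Thanks, the crash is gone in the latest update!")
```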

The landing page of the mockup is shown in Figure 3.2. It contains features such as an import for existing explicit user feedback. Stakeholders can download user feedback from the dedicated platform, such as app reviews from the Google Play Store, and import them. Alternatively, stakeholders can develop crawlers that retrieve data periodically, in case the platform provides an API. App stores, for example, usually do not provide such APIs, which makes crawling them cumbersome.
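As a rough sketch of such a periodic crawler, assuming a hypothetical REST endpoint that returns reviews as JSON (real app stores typically offer no such API, which is exactly the limitation mentioned above):

```python
import time
import requests

# Hypothetical review endpoint; most app stores provide no such API, which is
# why downloading exports or scraping is often the only option in practice.
FEEDBACK_ENDPOINT = "https://example.org/api/reviews"

imported_reviews: list[dict] = []  # stand-in for the tool's import step

def fetch_reviews(app_id: str, since: str) -> list[dict]:
    """Fetch all reviews for an app posted after `since` (ISO date string)."""
    response = requests.get(
        FEEDBACK_ENDPOINT, params={"app": app_id, "since": since}, timeout=30
    )
    response.raise_for_status()
    return response.json()  # assumed to be a list of review dicts

def crawl_periodically(app_id: str, interval_seconds: int = 3600) -> None:
    """Poll the endpoint once per interval and import any new reviews."""
    last_seen = "1970-01-01"
    while True:
        for review in fetch_reviews(app_id, since=last_seen):
            imported_reviews.append(review)
            last_seen = max(last_seen, review["date"])
        time.sleep(interval_seconds)
```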

Figure 3.2: Mockup of a review analytics tool: review types over time.

As soon as the stakeholders have imported app reviews, they see a trend analysis. Figure 3.2 shows an example of a trend. The mockup creates trends around the four user feedback categories bug report, rating, feature request, and user experience. The trend analysis visualizes the occurrences of the four categories in user feedback for each app release. Therefore, the analysis helps in understanding whether, for instance, a new release introduced more bugs or whether users report fewer bugs after the release. From that view, project managers can get inspiration for what to focus on next. For instance, if there is a peak of bug reports, the development team may want to shift its focus to fixing bugs before introducing new features. In case there are unusually many feature requests, the stakeholders of the project may want to discuss them and eventually turn them into official requirements.
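A minimal sketch of how such a trend could be computed from the imported reviews, assuming each review record carries a release and a category field (the field names are our assumption, not part of the mockup):

```python
from collections import Counter, defaultdict

# The four feedback categories used throughout the mockup.
CATEGORIES = ["bug report", "rating", "feature request", "user experience"]

def category_trend(reviews: list[dict]) -> dict[str, Counter]:
    """Map each app release to the number of reviews per feedback category."""
    trend: dict[str, Counter] = defaultdict(Counter)
    for review in reviews:
        if review["category"] in CATEGORIES:
            trend[review["release"]][review["category"]] += 1
    return trend

# Example: a spike of bug reports in release 2.1 suggests fixing bugs first.
reviews = [
    {"release": "2.0", "category": "feature request"},
    {"release": "2.1", "category": "bug report"},
    {"release": "2.1", "category": "bug report"},
]
print(category_trend(reviews))
```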

Figure 3.3: Mockup of a review analytics tool: app store comparison.

Software companies behind popular apps usually support both major mobile operating systems, Android and iOS. The next widget therefore monitors how the app performs per platform, because each operating system involves different programming languages, different development teams, and other factors such as operating system fragmentation. We illustrate the widget for this comparison in Figure 3.3.

The comparison view shows, for each app distribution platform, the total number of occurrences per user feedback category as well as their relative share. In each pie chart (one per app distribution platform), the stakeholders can see whether, e.g., the Android app receives more bug reports than the other apps. Additionally, the pie chart shows whether any user feedback category currently dominates. Figure 3.3 therefore provides a general overview for comparing the overall performance of each app and enables individual decision making.
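The data behind one pie chart per platform could be derived as in the following sketch, assuming each review record carries a platform and a category field (again, assumed field names):

```python
from collections import Counter

def platform_shares(reviews: list[dict]) -> dict[str, dict[str, float]]:
    """For each distribution platform, compute the relative share per category."""
    counts: dict[str, Counter] = {}
    for review in reviews:
        counts.setdefault(review["platform"], Counter())[review["category"]] += 1
    return {
        platform: {category: n / sum(c.values()) for category, n in c.items()}
        for platform, c in counts.items()
    }

# Example: one pie chart per platform, e.g. Google Play vs. App Store.
reviews = [
    {"platform": "Google Play", "category": "bug report"},
    {"platform": "Google Play", "category": "feature request"},
    {"platform": "App Store", "category": "rating"},
]
print(platform_shares(reviews))
```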

The previously discussed views give a general impression of the app’s performance. However, stakeholders need to look at the actual user feedback to allow discussions about, e.g., feature requests or to learn from it when fixing bugs. For that purpose, we provide a third view in the mockup. Figure 3.4 shows the review details tab of the mockup. It contains the aggregated analysis results of the app review filtering. For instance, a classification approach assigns user feedback to the four categories bug report, rating, feature request, and user experience. Stakeholders interested in discussing, e.g., feature requests, can use this view to filter all user feedback for this category. We added further filters to narrow down and focus the results. One such filter is the app distribution platform, as the Android development team most likely wants to discuss bug reports concerning their app, while the iOS development team wants to focus on their platform. Further, the view allows filtering by the user feedback language, which helps organizations that distribute apps in several countries.

Figure 3.4: Mockup of a review analytics tool: review details.
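As a minimal sketch of the filter logic behind this view, assuming review records with category, platform, and language fields (assumed names, not prescribed by the mockup):

```python
from typing import Optional

def filter_reviews(
    reviews: list[dict],
    category: Optional[str] = None,
    platform: Optional[str] = None,
    language: Optional[str] = None,
) -> list[dict]:
    """Return the reviews that match every filter that is set."""
    def matches(review: dict) -> bool:
        return (
            (category is None or review["category"] == category)
            and (platform is None or review["platform"] == platform)
            and (language is None or review["language"] == language)
        )
    return [review for review in reviews if matches(review)]

# Example: bug reports for the Android app, written in German.
# android_bugs_de = filter_reviews(reviews, category="bug report",
#                                  platform="Google Play", language="de")
```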

We display the filtered user feedback below the filter bar (see Figure 3.4). The left side of the figure shows a pie chart illustrating the overall distribution of the user feedback categories; the chart also allows filtering by category. The right side of the figure shows a list of the user feedback that matches the filter options. This list shows the available information, such as the user name, the star rating the user gave the app, as well as the feedback text body. Next to the list of feedback, the stakeholders can perform two actions. First, there is a dropdown menu that shows the classification result for the user feedback categorization. Therefore, if stakeholders filter for bug reports, all listed feedback will have “bug report” as the default selection in the dropdown menu. In case stakeholders disagree with the classification result, they can correct it manually. These manual updates are fed back to the machine learning algorithm to help improve it. The second action is a button that allows stakeholders to watch certain users. If, for example, a user gave feedback but the stakeholders need more information, they can contact that user and get notified as soon as the user gets back to them. Alternatively, the user may have given a bad star rating because of a frustrating bug. Stakeholders can then reply, e.g., to show appreciation for the feedback or to inform the user that the team is working on it. Finally, if the user updates the feedback, the stakeholders get notified.
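The feedback loop from manual corrections to the classifier is not bound to a specific algorithm in the mockup; the following sketch merely illustrates the idea with an incrementally trainable scikit-learn text classifier, and all names and parameters are assumptions.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

CATEGORIES = ["bug report", "rating", "feature request", "user experience"]

vectorizer = HashingVectorizer(n_features=2**16)  # stateless text features
classifier = SGDClassifier()                      # supports incremental updates

def apply_correction(review_text: str, corrected_category: str) -> None:
    """Update the model with the category a stakeholder selected manually."""
    features = vectorizer.transform([review_text])
    classifier.partial_fit(features, [corrected_category], classes=CATEGORIES)

# Example: a stakeholder re-labels a review as a bug report.
apply_correction("App crashes when opening the camera", "bug report")
```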

Interview Setting

We describe the interview setting in more detail by explaining how we performed the interviews and by characterizing the interviewed stakeholders.

Table 3.1: Overview of the interview participants.

# | Role | Company
P1 | Senior app developer | German SME, app development
P2 | Senior tester, quality manager | Large European social media company
P3 | Lead engineer | European telecommunication company
P4 | Project manager for apps | Global market research company
P5 | Lead architect, project manager | Mac solutions software development
P6 | RE researcher with practice experience | University
P7 | Usability/requirements engineer | Large software development company
P8 | Project manager, requirements analyst | Danish SME, app development
P9 | Project manager, requirements analyst | Danish SME, app development
P10 | Technology enterprise architect | Italian telecommunication company
P11 | Technology innovation manager | Italian telecommunication company
P12 | Research and innovation senior manager | Italian telecommunication company

Participants. We interviewed twelve stakeholders. Table 3.1 gives an overview of the interviewed stakeholders. In the stakeholder selection process, we aimed to increase diversity concerning roles, company size, and domains. The reason for varying the stakeholders’ roles is that the review analytics mockup provides different levels of abstraction regarding the information it displays. As the table reveals, some stakeholders had several roles. Of the twelve stakeholders, six stated that they have management responsibilities. The review detail view of the review analytics mockup contains actual app reviews, which, e.g., developers can use as a direct input for understanding bugs, but also to gather feedback that suggests new ideas for features. Therefore, we did not want to rely only on the management perspective but also interviewed stakeholders, such as developers, who have experience in app development. Regarding company size, we aimed at diversity because we want to understand whether smaller companies, which potentially receive less feedback, can also benefit from the approach. We cover companies ranging from small and medium-sized enterprises to companies operating in the global market. Among the stakeholders, there is one researcher from a university who previously worked as an app developer in a small iOS app development company. The domains of the companies are also diverse: we cover general app development companies that work on commission as well as companies that run their own products.

Interview details. We performed semi-structured interviews because we needed some questions to be answered directly but also allowed open questions, as we were interested in details such as potential use cases of our approach. Further, we aimed at in-depth qualitative insights, which we obtained by asking follow-up questions. The interviews with P1 to P9 took place in January 2016, while the interviews with P10 to P12 took place in March 2019. The reason for these two time frames is that in 2016, to the best of our knowledge, there had been no qualitative study including interviews with stakeholders about the usefulness of software requirements intelligence with a focus on app stores.

In 2019, we conducted a second interview study to not only extend our existing interviews but also to enrich them by including social media as a data source for the feedback analyses. The interview setting was similar, and each interview lasted about 30–45 minutes. At least two researchers conducted each interview to avoid the bias introduced when a single interviewer both leads and documents the interview. Therefore, we had dedicated roles for leading and note-taking in each interview. One difference between the two interview iterations is that for the 2019 interviews, we could also make voice recordings, which we later transcribed. Additionally, we performed the 2019 interviews face to face, while we conducted the 2016 interviews over the phone.

3.3 Review of Feedback Usefulness Studies