
7. Discussion

7.1.2. RQ 2: How can we model developer contribution behavior?

The second main research question, RQ2, is dedicated to modeling developers’ contribution behavior. We answer this question with the help of several sub-questions. The first sub-question, RQ2.1, asks whether a state-based probabilistic model is appropriate for modeling developers’ contribution behavior. The idea of using a state-based model for the activities of developers is inspired by the nature of OSS projects: developers can freely choose which project to join, at which time of day to work, and to what extent they contribute. Therefore, a model that allows for different states of involvement is beneficial. We chose HMMs as a suitable approach. A further advantage of HMMs is that the multi-dimensional version allows for multiple observations at the same point in time. We include communication activities as well as code-related activities to describe developers’ contribution. To train HMMs from empirical software repository data, a classification step is required beforehand, since no labeled contribution data exists, i.e., data stating whether a developer is low, medium, or highly involved at a given point in time. Thus, within our case study, we evaluated the choice of the classifier on the one hand and the goodness of fit of the HMM to the data on the other. The goodness of fit is measured by the share of mismatches between the learned hidden state and the classified one (misclassification rate).
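The misclassification rate described above can be illustrated with a minimal sketch. The function name `misclassification_rate` and the two label sequences are hypothetical and chosen only for illustration; they are not taken from the case study.

```python
from typing import Sequence


def misclassification_rate(decoded: Sequence[str], classified: Sequence[str]) -> float:
    """Fraction of time steps where the HMM state differs from the classifier label."""
    if len(decoded) != len(classified):
        raise ValueError("sequences must have the same length")
    if not decoded:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for d, c in zip(decoded, classified) if d != c)
    return mismatches / len(decoded)


# Hypothetical involvement labels for one developer, one entry per time step.
classified = ["low", "low", "medium", "high", "high", "medium", "low", "low"]
decoded = ["low", "low", "medium", "medium", "high", "medium", "low", "low"]

rate = misclassification_rate(decoded, classified)
print(f"misclassification rate: {rate:.3f}")  # one mismatch in eight steps -> 0.125
```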

For every developer, an individual HMM is trained. The results show that the average misclassification rate of the individual models ranges from 8.8% to 12.7%, depending on the classifier. These findings already provide the answer to RQ2.1:

Developers’ contribution behavior can be modeled accurately using state-based probabilistic models like multi-dimensional HMMs.
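To make the idea of a hidden involvement state with two simultaneous observations more concrete, the following sketch decodes the most likely state sequence with the standard Viterbi algorithm. It rests on several assumptions that are not taken from the study: observations are discretized into three activity levels (0, 1, 2), the code and communication dimensions are treated as independent given the state, and all probabilities are made-up illustration values.

```python
import math
from typing import Dict, List, Tuple

STATES = ["low", "medium", "high"]

# All numbers below are made-up illustration values, not estimates from the study.
START = {"low": 0.6, "medium": 0.3, "high": 0.1}
TRANS = {
    "low": {"low": 0.7, "medium": 0.2, "high": 0.1},
    "medium": {"low": 0.2, "medium": 0.6, "high": 0.2},
    "high": {"low": 0.1, "medium": 0.3, "high": 0.6},
}
# One emission table per observation dimension (activity level 0, 1, or 2).
EMIT_CODE = {"low": [0.8, 0.15, 0.05], "medium": [0.3, 0.5, 0.2], "high": [0.1, 0.3, 0.6]}
EMIT_MAIL = {"low": [0.7, 0.2, 0.1], "medium": [0.4, 0.4, 0.2], "high": [0.2, 0.3, 0.5]}

Observation = Tuple[int, int]  # (code activity level, communication activity level)


def log_emission(state: str, obs: Observation) -> float:
    # Simplifying assumption: both dimensions are independent given the state.
    code, mail = obs
    return math.log(EMIT_CODE[state][code]) + math.log(EMIT_MAIL[state][mail])


def viterbi(observations: List[Observation]) -> List[str]:
    """Return the most likely hidden state sequence for the observations."""
    if not observations:
        return []
    # score[s] = best log-probability of any path ending in state s.
    score = {s: math.log(START[s]) + log_emission(s, observations[0]) for s in STATES}
    backpointers: List[Dict[str, str]] = []
    for obs in observations[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = score[best_prev] + math.log(TRANS[best_prev][s]) + log_emission(s, obs)
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


observations = [(0, 0), (0, 1), (2, 2), (2, 2), (1, 1), (0, 0)]
print(viterbi(observations))
```

A decoded sequence of this kind is what would be compared against the classifier labels when computing a misclassification rate.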

To address RQ2.2, which asks how similar the retrieved contribution models are for the same developer type, we perform a correlation analysis of the transitions and compare the emissions of the general models. Since the transitions are highly correlated for the same developer type, we build general models by averaging over all HMMs that belong to the role. This yields three general HMMs, one for each developer type (core, major, minor). These general models reveal some interesting insights. First, the transitions are similar among all types of developers, whereas the emissions differ considerably. This matches our intuition: the underlying dynamics may be similar for all OSS developers, but their workload depends not only on the assigned role but also on further factors, e.g., personality, expertise, and technical and social interest in the project. Thus, the workload of developers is more complex to model. Furthermore, we evaluated the general models by applying them again to each developer’s state sequence. The general models perform only 1 to 5% worse than the individually trained models. A major advantage of the general models is that they are also applicable to developers for whom an individual model cannot be trained, e.g., due to sparse observation data. Together, these observations provide the answer to RQ2.2:

Retrieved HMMs for the same developer role are very similar in terms of their transitions, but their emissions can be more diverse.
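The two steps behind this answer, comparing transitions across developers and averaging them into a general model, can be sketched as follows. This is an illustration under assumptions: the text does not state which correlation measure was used, so Pearson correlation over the flattened matrix entries is assumed here, and the matrices `dev_a` and `dev_b` are invented values.

```python
import math
from typing import List

Matrix = List[List[float]]


def average_matrices(matrices: List[Matrix]) -> Matrix:
    """Element-wise mean of equally shaped transition matrices.

    The mean of row-stochastic matrices is again row-stochastic.
    """
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(cols)] for i in range(rows)]


def pearson(a: Matrix, b: Matrix) -> float:
    """Pearson correlation between the flattened entries of two matrices."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)


# Hypothetical transition matrices (low, medium, high) of two developers with the same role.
dev_a = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]
dev_b = [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2], [0.1, 0.2, 0.7]]

print(f"correlation of transitions: {pearson(dev_a, dev_b):.3f}")
general = average_matrices([dev_a, dev_b])
for row in general:
    print([round(v, 3) for v in row])
```

Averaging is only meaningful when the hidden states of the individual models are aligned, i.e., when state 0 denotes low involvement in every model; the sketch takes that alignment for granted.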

The next sub-question, RQ2.3, asks whether the retrieved general models can be applied in practice. To demonstrate their applicability, we use the general developer models as a stand-alone method to predict the contribution behavior of a given set of developers. In addition, we embedded the general models into our simulation tool as a refinement of the implemented developer behavior. Both uses are demonstrated in our case studies, which supports the use of general models in practice. This answers RQ2.3:

General contribution models can be applied in practice, e.g., to predict future contributions of developers or to simulate software evolution.
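One simple way such a prediction could work is to propagate a developer's current state distribution through the transition matrix of the general model. The sketch below shows this under assumptions: the text does not specify the prediction procedure, and `GENERAL_TRANS` holds invented values, not the transitions learned in the study.

```python
from typing import List

Matrix = List[List[float]]


def step(dist: List[float], trans: Matrix) -> List[float]:
    """One step of the Markov chain: new_dist[j] = sum_i dist[i] * trans[i][j]."""
    n = len(dist)
    return [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]


def predict(dist: List[float], trans: Matrix, k: int) -> List[float]:
    """State distribution after k time steps."""
    for _ in range(k):
        dist = step(dist, trans)
    return dist


STATES = ["low", "medium", "high"]
# Hypothetical general transition matrix for one developer role (rows sum to 1).
GENERAL_TRANS = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]

current = [1.0, 0.0, 0.0]  # the developer is currently in the low state
for k in (1, 2, 6):
    forecast = predict(current, GENERAL_TRANS, k)
    print(k, {s: round(p, 3) for s, p in zip(STATES, forecast)})
```

Combining the forecast state distribution with the emission distributions of each state would then give expected activity levels; that part is omitted here.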

For RQ2.4, which asks whether such a fine-grained developer contribution model improves simulation results, we compared simulation results for the STEPS model (average contribution behavior) and the DEVCON model (fine-grained state-based contribution behavior). In doing so, we compared metrics such as the project growth in number of files and the total number of commits. We observed that, for both simulation models, the best-matching metric values are achieved for mid-size projects. A significant improvement of the simulation results cannot be confirmed, which answers RQ2.4:

Although the fine-grained model is able to reflect more realistic trends in the simulation of software evolution, significant results, e.g., for the comparison of empirical and simulated data, cannot be confirmed.

The last sub-question, RQ2.5, addresses the transferability of the proposed approach. We investigate this question in a separate case study in which we use HMMs to summarize project activity and to describe the underlying dynamics of developer contribution as well as developer and user discussions. In contrast to the work done so far, we take the aggregated number of commits and mailing list (ML) posts into account, since we want to generalize from the individual level to the project level. The HMM training was successful for every project included in the case study. Although the average misclassification rate was higher than for modeling developer contribution, we can say that project activity can be modeled using state-based probabilistic models like HMMs. One major goal of this application is a possible characterization of project activity and, more importantly, project inactivity based on the trained state sequence. The early detection of software projects that are likely to become inactive is a challenging task in software engineering. No labeled data on project activity exists, which is the first hurdle in preprocessing the data. Therefore, we performed an expert labeling. The experts identified many projects that are not completely active, but also not yet inactive. Thus, we introduced a third state specifying maintenance projects. By analyzing general models built for each type of activity, we observed common characteristics, e.g., in terms of the state sequence pattern and the output produced in a low, medium, or high state.

Although the approach produced fruitful results, for universal guidelines the data used is
