
3.2 Quality Measures and Human Perception – An Empirical Study


Figure 3.20: Projections of scatterplots used in the experiment. Participants had to select the best five projections and order them by their quality. The order of the scatterplots was permuted for each participant separately using the Latin-Square method.
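The text does not say which Latin square construction was used to permute the scatterplots. As a minimal sketch, assuming a simple cyclic Latin square over the 18 projections, the presentation order for each participant could be derived as follows (the function names are hypothetical and not taken from the study):

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square; row i is the sequence 0..n-1 rotated by i."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def presentation_order(participant_index, n_projections=18):
    """Projection indices in the order shown to one participant.

    ASSUMPTION: participants are mapped to rows of a cyclic square; the
    study may have used a different (e.g. balanced) construction.
    """
    square = cyclic_latin_square(n_projections)
    return square[participant_index % n_projections]


# Each projection appears exactly once per row and once per column,
# so every projection occupies every position equally often across rows.
for participant in range(3):
    print(participant, presentation_order(participant)[:6])
```

With this construction no projection is systematically favored by its position on the form, which is the purpose of the permutation described in the caption.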

Procedure

The experiment consisted of two parts. In the first part, participants read a short description of the scenario and the task and filled out a short standardized form with general questions (such as age, study stage, and experience with computers and scatterplots)18. In the second, main part of the experiment, participants performed the task by selecting and ordering the five best representations that classified three wine types19. Clearly, the best-suited scatterplot is the one that allows a clear distinction of the three wine types by the two attributes. Participants’ effectiveness mainly depended on their ability to read and interpret scatterplots. The group of participants was quite homogeneous with regard to age and previous education. As expected, their performance did not show significant deviations or anomalies. This was verified by checking that none of the scores lay more than three standard deviations above or below the mean. In order not to bias participants towards any of the measures, they were not instructed on how to define a high-quality projection, nor on how to look for dense or consistent clusters.
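The three-standard-deviation check mentioned above can be sketched as follows. This is only an illustration: the text does not define how a participant's score was computed, so the score values below are made up, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def outliers_three_sigma(scores):
    """Return the scores lying more than three standard deviations from the mean.

    ASSUMPTION: sample standard deviation (n - 1 in the denominator).
    """
    m = mean(scores)
    s = stdev(scores)
    return [x for x in scores if abs(x - m) > 3 * s]


# Hypothetical participant scores, not data from the study.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
print(outliers_three_sigma(scores))  # empty list: no anomalous participant
```

An empty result corresponds to the situation reported above, where no participant deviated enough to be treated as an anomaly.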

3.2.3 Results

A linear regression analysis was carried out using the Pearson coefficient to assess the correlation between users’ classification and the measures’ quality assignment of the selected projections. To make the measures comparable, we normalized the quality values assigned to the projections to the range 0 to 1, individually for each measure. From the users’ answers we computed the probability of selecting a projection by counting the number of times each projection was selected. These probabilities were weighted with the averaged ranks assigned by the participants. This resulted in a sequential order of the projections reflecting users’ quality preferences. The dependent variable of the statistical evaluation was the user ranking, and each of the four measures was the independent variable in a separate computation. The results show a significant positive correlation with the users’ selection for all four measures (p < 0.05, DF = 1, DFe = 16)20, as shown in Table 3.4.

18 Appendix A.2 contains this general question form (in German) in Section A.2.1.

19 Appendix A.2 contains two examples of the experiment form (Figure A.1 and Figure A.2).
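The preparation of the two variables can be sketched as follows. The text states that selection probabilities were weighted with the averaged ranks, but it does not give the weighting formula. The linear weight used here (rank 1 counts fully, rank 5 counts one fifth) is therefore an assumption for illustration, as are the function names and the sample answers.

```python
from collections import defaultdict


def preference_scores(answers, top_k=5):
    """Combine selection probability and average rank into one score per projection.

    answers: one list per participant with the chosen projection ids, best first.
    """
    n_participants = len(answers)
    counts = defaultdict(int)
    rank_sums = defaultdict(int)
    for ordered in answers:
        for position, projection in enumerate(ordered[:top_k], start=1):
            counts[projection] += 1
            rank_sums[projection] += position

    scores = {}
    for projection, count in counts.items():
        p_select = count / n_participants          # selection probability
        avg_rank = rank_sums[projection] / count   # averaged rank, 1 = best
        weight = (top_k + 1 - avg_rank) / top_k    # ASSUMPTION: linear rank weight
        scores[projection] = p_select * weight
    return scores


def min_max_normalize(values):
    """Rescale a list of quality values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical answers of two participants, not data from the study.
answers = [["A", "B", "C", "D", "E"], ["A", "C", "B", "E", "F"]]
print(preference_scores(answers))
print(min_max_normalize([2.0, 4.0, 6.0]))
```

Projections that no participant selected do not appear in the result; treating them as score 0 would be a further assumption.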

Table 3.4: Results of the regression analysis.

Measure   t-value   StdErr.   Adj. R2
1D-HDM    3.366     0.196     0.378
2D-HDM    6.723     0.127     0.722
DCM       6.451     0.118     0.705
CDM       5.082     0.151     0.594
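For reference, quantities of the kind listed in Table 3.4 can be obtained from a simple least-squares fit with one independent variable. The following is a minimal sketch in plain Python, not the tool used in the study. The standard error computed here is that of the slope; whether the StdErr. column of the table reports this quantity is not stated, and the sample data are invented.

```python
import math


def simple_regression(x, y):
    """Least-squares fit y = intercept + slope * x with summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)           # Pearson coefficient
    r2 = r * r                                # unadjusted R2
    df_error = n - 2                          # one predictor plus intercept
    sse = max(syy - slope * sxy, 0.0)         # residual sum of squares
    std_err = math.sqrt(sse / df_error / sxx) # standard error of the slope
    t_value = slope / std_err                 # undefined for a perfect fit
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_error
    return {"slope": slope, "intercept": intercept, "r": r, "r2": r2,
            "std_err": std_err, "t": t_value, "adj_r2": adj_r2}


# Hypothetical data: x = normalized measure quality, y = user preference score.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 1.0, 3.0, 4.0]
print(simple_regression(x, y))
```

With 18 projections as data points, df_error in this sketch equals 16, which matches the DFe reported above.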

There are interesting differences in the R2 values that tie the results to our hypotheses. The R2 value indicates what proportion of the variance is explained by the regression. The highest R2 value is achieved by 2D-HDM; DCM performed slightly worse, followed by CDM, and 1D-HDM achieved the lowest value. These results partially confirmed our hypotheses and revealed some new insights. The correlations are shown in Figure 3.21. The classification made by the users is mapped to the x-axis and that made by the measures to the y-axis. The charts also show the linear regression line with its equation and the unadjusted R2 value.
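The relation between the t-values and the adjusted R2 values in Table 3.4 can be checked with a short calculation. For a regression with a single predictor, the unadjusted R2 follows from the t-value as t² / (t² + DFe), and the adjustment uses the number of data points n. The sketch below assumes n = 18, inferred from DFe = 16 and consistent with the 18 projections; the function names are illustrative.

```python
def r2_from_t(t, df_error):
    """Unadjusted R2 of a one-predictor regression, derived from its t-value."""
    return t * t / (t * t + df_error)


def adjusted_r2(r2, n, predictors=1):
    """Adjusted R2 for n data points and the given number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - predictors - 1)


# (t-value, adjusted R2) as reported in Table 3.4.
TABLE_3_4 = {
    "1D-HDM": (3.366, 0.378),
    "2D-HDM": (6.723, 0.722),
    "DCM": (6.451, 0.705),
    "CDM": (5.082, 0.594),
}
DF_ERROR = 16
N = 18  # ASSUMPTION: inferred from DFe = 16 with one predictor

for name, (t, reported_adj) in TABLE_3_4.items():
    r2 = r2_from_t(t, DF_ERROR)
    print(name, round(r2, 3), round(adjusted_r2(r2, N), 3), reported_adj)
```

Under this assumption the recomputed adjusted values agree with the table to three decimals, and the unadjusted values come out slightly higher, as expected.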

Figure 3.21: Correlation of measures with users’ classification shows highest R2 values for the 2D-HDM measure. Panels: (a) 1D-HDM, (b) 2D-HDM, (c) DCM, (d) CDM.

20 Abbreviations: p = probability value; DF and DFe are the degrees of freedom.

2D-HDM and DCM assigned the highest quality to the same projection that the users ranked best. CDM assigned this projection 99% quality (rank 2), and 1D-HDM only 68% quality (rank 4). The projection ranked highest by the users is shown in Figure 3.22(a). The projection ranked highest by CDM and 1D-HDM is shown in Figure 3.22(b). It shows a clear and very dense cluster for one of the wine types but also a high overlap of the other two types. Users assigned rank 4 to this projection.

In the users’ eyes, the worst-quality projection was the one showing a high density of all three wine types but also a high overlap, as shown in Figure 3.22(c). Three of the measures confirmed this; only CDM still assigned this projection a quality of 26.3% (rank 11).
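The rank comparison used in this discussion amounts to sorting the projections by quality and comparing positions. A minimal sketch with invented projection ids and partly invented quality values (only 99% and 26.3% echo figures from the text) might look like this:

```python
def ranks_descending(quality):
    """Map each projection to its rank, where rank 1 is the highest quality.

    Ties are broken by insertion order, which is an arbitrary choice here.
    """
    ordered = sorted(quality, key=quality.get, reverse=True)
    return {projection: rank for rank, projection in enumerate(ordered, start=1)}


def rank_differences(user_ranks, measure_ranks):
    """Absolute rank difference per projection ranked by both sides."""
    shared = user_ranks.keys() & measure_ranks.keys()
    return {p: abs(user_ranks[p] - measure_ranks[p]) for p in shared}


# Hypothetical projection ids and values for illustration only.
measure_quality = {"P1": 0.99, "P2": 1.00, "P3": 0.263}
user_ranks = {"P1": 1, "P2": 4, "P3": 3}

measure_ranks = ranks_descending(measure_quality)
print(measure_ranks)
print(rank_differences(user_ranks, measure_ranks))
```

A difference of 0 for a projection means the measure and the users agree on its position, as reported above for 2D-HDM and DCM on the best projection.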

(a) Users’ highest quality ranked projection was confirmed by the DCM and 2D-HDM quality measures.

(b) Highest quality ranked projection by the CDM and 1D-HDM measures.

(c) Users’ lowest quality ranked projection was confirmed by the DCM, 2D-HDM and also by the 1D-HDM quality measures.

Figure 3.22: Correlation of measures with users’ classification for the highest and the lowest quality projections.

It is also interesting that 8 of the 18 projections21 were not selected by any user. CDM, however, still assigned 65% quality to one of these projections, as shown in Figure 3.23(a). For the other measures, the highest quality assigned to any of these 8 projections was 58% by 1D-HDM, 50% by DCM, and only 40% by 2D-HDM. Surprisingly, the projection shown in Figure 3.23(b) was selected by a user and ranked among the best five, although all the measures ranked it second to last, and CDM even ranked it last.
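Finding the projections that no user selected, and the highest quality a measure assigns among them, is a simple set operation. The sketch below uses invented ids, answers, and quality values; only the idea of the comparison comes from the text.

```python
def unselected(all_projections, answers):
    """Projections that appear in no participant's selection."""
    chosen = {p for ordered in answers for p in ordered}
    return [p for p in all_projections if p not in chosen]


def max_quality_among(projections, quality):
    """Return the projection with the highest quality and that quality value."""
    best = max(projections, key=lambda p: quality[p])
    return best, quality[best]


# Hypothetical data for illustration only.
all_projections = ["P1", "P2", "P3", "P4", "P5"]
answers = [["P1", "P2"], ["P2", "P3"]]
cdm_quality = {"P1": 0.9, "P2": 1.0, "P3": 0.4, "P4": 0.65, "P5": 0.1}

never_chosen = unselected(all_projections, answers)
print(never_chosen)
print(max_quality_among(never_chosen, cdm_quality))
```

A high maximum among the never-chosen projections signals a disagreement between a measure and the users of the kind described for CDM above.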

(a) Not selected by any user, but assigned 65% quality by CDM.

(b) Selected by a user, but ranked second to last by all the measures, and last by CDM.

Figure 3.23: Surprising study results.

21 Figure A.3 in Appendix A.2.3 shows the 8 projections that were not selected by any user.

In summary, 2D-HDM, closely followed by DCM, reflected users’ quality assignment best: it matched the users’ highest and lowest quality rankings accurately and had the highest R2 value of the correlation. These results should not, however, be taken to mean that density (CDM) is unimportant for quality assignments. Rather, they should motivate combining and improving these measures so that they can sufficiently support users in their task.