
Size reduction and transformation processes can be harmful to a network-analytical study. I have demonstrated that the results of a study that uses such a process can differ significantly from those of one that does not. Furthermore, in most cases there is a clear bias: the error introduced is not completely random but to some extent predictable. The deviations of the estimates from the real values depend not only on the extent of size reduction; in many cases density and centralization have an influence as well, though the direction of this influence varies between network-analytical measures. The influence is too complex to allow statements such as "a low density is always bad."

I have shown that certain measures, such as closeness centrality and the detection of subgroups, are especially vulnerable to size reduction processes. Concerning subgroups, k-plexes turned out to be even slightly more vulnerable than cliques, although the idea behind k-plexes was to provide a less strict concept of subgroups that allows some ties to be missing. With the parameters used in this study there are indeed more k-plexes than cliques in the networks; however, an even higher share of them tends to disappear when networks are reduced in size.
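
To make the subgroup concepts concrete: a k-plex of size n is a set of nodes in which every member is tied to at least n − k of the other members, so a clique is simply the special case k = 1. A minimal sketch of this membership test, written in Python with networkx (the thesis itself used UCINET, so this is purely illustrative):

```python
import networkx as nx

def is_k_plex(G, nodes, k):
    """Check whether `nodes` induces a k-plex in G: every member must be
    adjacent to at least len(nodes) - k of the other members."""
    sub = G.subgraph(nodes)
    n = len(nodes)
    return all(d >= n - k for _, d in sub.degree())

# Toy graph: a triangle 1-2-3 plus node 4 tied to 2 and 3 (but not 1).
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (2, 4)])

print(is_k_plex(G, {1, 2, 3}, 1))     # True: a triangle is a clique (1-plex)
print(is_k_plex(G, {1, 2, 3, 4}, 2))  # True: each node misses at most one tie
print(is_k_plex(G, {1, 2, 3, 4}, 1))  # False: the 1-4 tie is missing
```

The relaxed condition is exactly why a single removed actor can dissolve a k-plex just as easily as a clique: each member has little slack left once k ties may already be absent.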

I have also demonstrated that some processes are more dangerous than others. For example, despite the large body of literature on the topic, sampling turned out to be a rather bad idea in network analysis. But even inevitable processes such as nonresponse and forgetting can be harmful. The most obvious way of dealing with these problems is not always the best: non-responding actors should not be left out of the matrix but rather included with values of 0, or filled in through symmetrization where applicable. One should also be aware of the danger of outliers: while the results normally vary only to some extent, there are cases in which the error is much larger than on average.
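
The two ways of keeping nonrespondents in the matrix can be sketched as follows. The toy matrix and the numpy-based treatment are my own illustration of the general idea, not the procedure of any particular software package:

```python
import numpy as np

# Toy directed adjacency matrix: actor 2 did not respond,
# so their outgoing row is entirely missing (NaN).
A = np.array([
    [0, 1, 1],
    [1, 0, 0],
    [np.nan, np.nan, np.nan],
])

# Option 1: keep the actor but code all unreported ties as absent (0).
A_zero = np.nan_to_num(A)

# Option 2 (for symmetric relations): fill the missing row from the
# ties that the other actors reported towards the nonrespondent.
A_sym = np.where(np.isnan(A), A.T, A)
A_sym = np.nan_to_num(A_sym)  # the diagonal cell stays missing; set it to 0
```

Either way the nonrespondent remains in the network, so measures such as density or centrality are computed over the full set of actors instead of a silently shrunken one.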

This leads to the question of how far these results can be generalized, which unfortunately turns out to be more problematic than I had expected. My hope was to find an overall pattern in the influence of density and centralization that would be visible with most size reduction and transformation instruments and with most network-analytical measures.

The expectations outlined in section 4.4 could not be clearly confirmed. While some of the results are in line with these ideas, others show different patterns. It is therefore difficult to say whether those cases that confirm the expectations show this pattern by chance or because there really is a pattern. When results do not hold across different processes and measures, they are problematic to interpret.

For example, when you have a research design with one dense and one sparse matrix, and you discover that your results are more valid for the dense matrix, is that because density has a positive influence on validity, or is it chance? What happens if you include a matrix with medium density (as was done here) and discover that it does not fit that trend? And how should you deal with the non-scale-free “d-out” matrix? If it confirms your expectations concerning the influence of density, does this mean that it does not matter whether networks are scale-free? If it does not fit the trend, is that because the matrix is not scale-free, or simply because density has no influence on the validity of the results? Including a non-scale-free matrix was an attempt to see what happens. The results are not straightforward enough to allow path-breaking conclusions, but I never expected them to be. All I can say is that the “d-out” matrix quite often shows patterns that differ from the trend of the other density matrices. I would therefore expect that it does make a difference whether networks are scale-free or not.

Still, it was worthwhile to include matrices with different density and centralization values in the study. First of all, it confirmed that problems with size reduction and transformation processes seem to occur with all kinds of matrices. Secondly, it prevented premature judgments about the influence of density and centralization that a design with only two matrices might have invited. Thirdly, there are cases in which a clear influence is confirmed by a trend across three or four matrices rather than by the difference between only two.

Finally, these results open up many possibilities for further studies. While this thesis was intended to give an overview, many aspects could now be investigated in more detail. My hope is that there are students of network analysis willing to investigate such aspects, for example in a term paper. I will make all materials of this thesis available on CD-ROM to everyone who is interested. But the potential does not lie in new studies alone; even the results of this study leave room for further investigation. For example, each table reported here is based on only one factor, with averages calculated over the others. Certain combinations of influencing factors could be investigated in more detail.

Finally, the consequences of these results should be discussed. When you use certain network-analytical measures in combination with certain size reduction or transformation instruments in your own study, you can consult my results to see whether this particular combination is rather safe or rather dangerous. More generally, I would conclude from these results that transformation and size reduction should be well considered: what is good or bad also depends very much on the theoretical assumptions of your study. This was already discussed in connection with dichotomization: if you want to find cliques of close friends in a valued acquaintance network, you should choose a rather high cut-off value indicating intensive contact. So when my study reports that the “validity” of a cut-off value of 75% is very low compared to dichotomization by the presence or absence of ties, it means first of all that there is a huge difference between the results of the two methods, and that it matters all the more which one you choose. If a high cut-off value is required by the theoretical assumptions of your study, that solution will be the most valid one. Again, there are many cases in which transformations make sense; I only argue against their blind use.
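
The difference between the two dichotomization strategies is easy to see on a small valued matrix. A sketch in Python with numpy; note that interpreting the 75% cut-off as a percentile of the observed tie values is an assumption of this example, not necessarily the definition used in the study:

```python
import numpy as np

# Hypothetical valued acquaintance matrix (contact intensity, 0 = no tie).
V = np.array([[ 0, 80, 20],
              [60,  0, 90],
              [10, 40,  0]])

# Strategy 1: dichotomize by mere presence or absence of a tie.
presence = (V > 0).astype(int)

# Strategy 2: keep only intensive contact, using a high cut-off --
# here the 75th percentile of the observed tie values.
cutoff = np.percentile(V[V > 0], 75)
intensive = (V >= cutoff).astype(int)

print(presence.sum(), intensive.sum())  # 6 ties survive vs. only 2
```

Both binary matrices are "valid" dichotomizations; which one is appropriate depends entirely on whether the theory concerns acquaintance as such or close friendship.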

Concerning the influence of density and centralization, it should be noted that these are only indicators of certain structural weaknesses that can make network-analytical measures more vulnerable to size reduction and transformation. They may not even be good indicators, as they are not independent of each other; however, they can be calculated easily and could therefore serve as a fast way for researchers to get a first impression of the vulnerability of their data sets. In the long run, it would be a good idea to develop algorithms specifically geared to detecting the structural weaknesses described in this study. If such measures were included in network-analytical software, they could serve as a better-suited warning mechanism than density and centralization. This thesis shows the necessity for such research, and I therefore hope that the debate on the error and attack vulnerability of networks will diffuse from the physical sciences into social network analysis as well. It could help to raise awareness and improve understanding of the dangers that such interferences can pose for a study.
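
As a quick first check of the kind suggested here, density and Freeman's degree centralization can be computed in a few lines. This sketch uses Python and networkx rather than the UCINET routines the thesis relied on:

```python
import networkx as nx

def degree_centralization(G):
    """Freeman degree centralization of an undirected graph: how strongly
    the network is dominated by its single most central node (0 to 1)."""
    n = G.number_of_nodes()
    degs = [d for _, d in G.degree()]
    dmax = max(degs)
    # Normalize by the maximum possible sum, attained by a star graph.
    return sum(dmax - d for d in degs) / ((n - 1) * (n - 2))

star = nx.star_graph(4)  # one hub tied to four leaves

print(nx.density(star))             # 0.4
print(degree_centralization(star))  # 1.0 -- maximally centralized
```

A highly centralized network like this star is exactly the worrying case: removing or mis-measuring the hub changes almost every result, which is why such indicators are worth a glance before any size reduction step.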


