
Interactive features may improve the usability of matrix visualization when specific aspects of the data, such as groups or connectivity, are to be explored [HFM07; HF06].

Relational data, such as computer or social networks, can be modeled as graphs. These graphs can be visualized as matrices by using their adjacency matrices directly. The matrix cells can be colored to show binary, categorical, or continuous attributes of each edge, e.g., the edge weight [Ber81, p. 33]. Matrix visualizations are particularly suitable when the associated graph is dense [GFC05].
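As a minimal sketch of the adjacency-matrix construction described above, the following Python snippet builds a dense matrix from a weighted edge list. The function name `adjacency_matrix`, the example nodes, and the assumption of an undirected graph are illustrative choices and are not taken from the cited works.

```python
def adjacency_matrix(nodes, weighted_edges):
    """Build a dense adjacency matrix (list of rows) for an undirected graph.

    Assumption: edges are given as (source, target, weight) triples and
    a missing edge is encoded as 0.0.
    """
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0.0] * n for _ in range(n)]
    for source, target, weight in weighted_edges:
        i, j = index[source], index[target]
        matrix[i][j] = weight
        matrix[j][i] = weight  # undirected: mirror the entry
    return matrix


# Hypothetical example graph with four nodes and three weighted edges.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 1.0), ("b", "c", 0.5), ("a", "d", 2.0)]
matrix = adjacency_matrix(nodes, edges)
for row in matrix:
    print(row)
```

Each cell value could then be mapped to a color; for a directed graph, the mirrored assignment would simply be dropped.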

Static and One-Dimensional Matrix Data. Matrix visualizations provide a highly scalable visual representation of graphs [Lan+11; GFC05]. They can reveal important aspects of graph structure if they are appropriately sorted and rendered. However, matrix representations are less intuitive than node-link diagrams. Thus, they need to be supplemented by additional visualizations and interaction techniques to improve understanding. In [HFM07], matrix visualizations and node-link visualizations are combined in an interactive system. The matrix is used to provide an overview representing very dense areas of the graph, and a node-link view shows details for selected parts that are globally sparse.

Semantic zoom interaction can help navigate matrices which do not fit into the available screen space. In [Elm+08b; AH04], zooming and dynamic aggregation techniques support the navigation process in large matrices. Matrices, and thus matrix visualizations, occur in many real-world analysis tasks. Matrix visualizations have been used in the analysis of social networks [HF06] and gene regulatory networks [DWW12]. Time series can efficiently be summarized in a matrix visualization [Sip+12]. Visualizations of similarity matrices are frequently used in the analysis of non-numeric attributes, such as in the pairwise comparison of text documents [Beh+12b].
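The idea of aggregating a large matrix so that it fits the available screen space, as mentioned above, can be illustrated with a small sketch. It averages non-overlapping square tiles; the tile size, the choice of the mean as aggregate, and the name `aggregate_blocks` are assumptions for illustration and do not reproduce the specific techniques of [Elm+08b; AH04].

```python
def aggregate_blocks(matrix, block_size):
    """Average non-overlapping block_size x block_size tiles of a matrix.

    Tiles at the right and bottom border may be smaller when the matrix
    dimensions are not multiples of block_size.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    result = []
    for r0 in range(0, n_rows, block_size):
        out_row = []
        for c0 in range(0, n_cols, block_size):
            cells = [
                matrix[r][c]
                for r in range(r0, min(r0 + block_size, n_rows))
                for c in range(c0, min(c0 + block_size, n_cols))
            ]
            out_row.append(sum(cells) / len(cells))
        result.append(out_row)
    return result


# Hypothetical 4 x 4 matrix reduced to a 2 x 2 overview.
full = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
coarse = aggregate_blocks(full, 2)
print(coarse)  # [[1.0, 0.0], [0.0, 0.5]]
```

In a zoomable view, the block size would typically shrink as the user zooms in, until individual cells become visible.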

Time-Dependent and Multivariate Matrix Data. Many analysis tasks involve multiple, heterogeneous matrices. For example, a social network is a time-dependent graph whose nodes correspond to entities and whose edges, and/or edge attributes, correspond to relations such as friendship or message counts. Both nodes and edges may change over time.

Each attribute dimension yields a matrix with a single value/dimension encoded in each cell. Each time step may give rise to a different attribute matrix. Many matrix visualizations were developed for one-dimensional and static matrix entries and do not support dynamic and complex matrix data well. One approach to handling time dependency in graphs is presented in [Bur+11], where graph states are represented as consecutive narrow stripes, in which vertices are arranged vertically on each side. Directed edges connect vertices from left to right to show the graph evolution. In [Bre+10], the interactive visualization of pairs of matrices was addressed. Specifically, one matrix contains weight values, and the other contains target values in a correspondence matrix representation of molecular data, and interaction allows cross-filtering in both matrices. In [Beh+12a], time-series data is presented in a triangular matrix, where the matrix cells are statistical aggregates over all possible subintervals.
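To make the triangular-matrix idea concrete, the following sketch computes one statistical aggregate for every subinterval of a time series. The mean is used here as an assumed aggregate, and the name `interval_means` is hypothetical; the actual aggregates and layout of [Beh+12a] may differ.

```python
def interval_means(series):
    """Return a triangular matrix t with t[i][j] = mean(series[i..j]) for i <= j.

    Cells below the diagonal (i > j) do not describe a valid interval
    and are left as None.
    """
    n = len(series)
    # Prefix sums let each interval sum be computed in constant time.
    prefix = [0.0]
    for value in series:
        prefix.append(prefix[-1] + value)
    triangle = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            triangle[i][j] = (prefix[j + 1] - prefix[i]) / (j - i + 1)
    return triangle


# Hypothetical three-point series.
series = [2.0, 4.0, 6.0]
triangle = interval_means(series)
for row in triangle:
    print(row)
```

The diagonal holds the original values, and the top-right cell holds the aggregate over the whole series, so short-term and long-term behavior appear in the same view.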

Sequential and Non-Sequential Data Comparison. Much work exists on visually analyzing and comparing sequential (ordering, ranking) data. One instance of such sequential data is the computed linearization of a matrix, which sequentially aligns all vertices in a graph. The notion of sequential data per se is very broad and comprises many applications. The article by Gleicher et al. [Gle+11] surveys and structures the solution space for visual comparisons of different data types.

Time series are an important instance of sequential data. Time series visualization is concerned with visual mappings for series of measurements, typically given by quantitative, equally-spaced consecutive values [Aig+11]. The comparison of two or more sequential data sets is a fundamental problem in many applications. In fact, many time series visualization techniques were designed for comparison tasks, such as dense pixel-based approaches for comparing large numbers of time series [KAK95]. The elements of a series or sequence can also be symbolic, as, e.g., in DNA sequences. The analysis of sequences of values may include relationships among them. An example is sequences of email messages sharing reply/forward relationships [Ker03].

Techniques also exist for comparing inherently non-sequential data by finding a linear mapping of the data elements, to which sequence visualization can then be applied. Examples include the TreeJuxtaposer [Mun+03] system, which compares pairs of hierarchies side-by-side by finding correspondences between tree nodes mapped in sequential order (e.g., by a dendrogram). Another example is given in [HW08], where pairs of hierarchies are compared by linear (icicle) mappings with bundled connectors showing element relationships. A further example is the TimeArcTrees [GBD09] approach for comparing sequences of directed graphs. It is based on a linear mapping of nodes, a sequence of which is shown with nodes aligned for comparability.

To compare different matrix ordering solutions, we are interested in comparing differences in the positions of elements among sets of sequences. Our approach, presented in Section 3.4.2, is inspired by the Scatter Plot Matrix technique [Cha+83], which allows comparing pairwise combinations of variables in high-dimensional data. Matrix structures have been exploited previously for the comparison of relational data, e.g., in [BN11; GHS10; SM07]. Small-multiple views of graphs for comparison based on clustering and projection have been proposed in [LGS09]. In Section 3.4.2 we present an approach that combines a matrix visualization with a custom glyph, based on a radial network layout, to compare the differences between pairs of sequences whose data elements are permuted.
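The underlying comparison, differences in element positions between pairs of orderings, can be sketched numerically as follows. This is only an illustration under stated assumptions: the summed absolute position change (Spearman's footrule) is a swapped-in stand-in measure, the names `position_shifts` and `total_displacement` and the example orderings are hypothetical, and the sketch does not reproduce the glyph-based approach of Section 3.4.2.

```python
def position_shifts(ordering_a, ordering_b):
    """Map each element to its signed position change from ordering_a to ordering_b."""
    if sorted(ordering_a) != sorted(ordering_b):
        raise ValueError("orderings must be permutations of the same elements")
    pos_b = {element: i for i, element in enumerate(ordering_b)}
    return {element: pos_b[element] - i for i, element in enumerate(ordering_a)}


def total_displacement(ordering_a, ordering_b):
    """Sum of absolute position changes (Spearman's footrule distance)."""
    return sum(abs(shift) for shift in position_shifts(ordering_a, ordering_b).values())


# Hypothetical orderings of the same four vertices, e.g., from three
# different matrix reordering algorithms.
orderings = {
    "A": ["v1", "v2", "v3", "v4"],
    "B": ["v2", "v1", "v3", "v4"],
    "C": ["v4", "v3", "v2", "v1"],
}
names = list(orderings)
# Pairwise comparison arranged as a matrix, one cell per pair of orderings.
distance_matrix = [
    [total_displacement(orderings[p], orderings[q]) for q in names]
    for p in names
]
for name, row in zip(names, distance_matrix):
    print(name, row)
```

Arranging the pairwise results in a matrix mirrors the scatter-plot-matrix layout: each cell summarizes one pair of orderings, and a visualization could replace the single number with a glyph showing the per-element shifts.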

Figure 3.1: Tasks for the Single Matrix Analysis and Multi Matrix Analysis. Both analysis levels are naturally intertwined since, e.g., ranking and clustering questions require the definition of similarity, which is inherently dependent on a single matrix’s ordering.

Text Data Comparison. In recent years, the visualization of text data has attracted increasing interest from researchers, who are developing techniques for the efficient display of document collections as well as single documents. For example, the visual analytics tool VISRA [Oel+10] combines readability feature selection with document visualization techniques based on Literature Fingerprinting [KO07], TileBars [Hea95], and Seesoft [ESS92] to evaluate the readability of the input text. In the domain of news analysis, several tools exist that deal with the summarization and visualization of news content. Newsmap [Wes12] is a well-known treemap visualization of data gathered by Google News. Other popular news aggregators include Yahoo! News and Europe Media Monitor [EMM12]. The TextMap website, based on Lydia [LKS05], is an entity search engine, which provides information about people and places extracted from news sources. These systems offer only limited visualization capabilities for understanding the content differences among different sources that report on the same real-world event. In the area of knowledge discovery and data mining, ongoing research addresses meme-tracking [LBK09] and refining causality [Sno+11]. In this field, the main goal is to find out how information propagates through networks and how network processes cause a specific behavior in the network, by analyzing the appearance of short phrases in document nodes. Researchers working on web indexing and crawling have developed methods for identifying near-duplicates [MJS07], i.e., redundant web documents that differ only in a small portion.
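To illustrate what "differing only in a small portion" can mean operationally, the following sketch flags near-duplicates using word shingles and Jaccard similarity. This is a common baseline chosen here for brevity and is not claimed to be the method of [MJS07]; the shingle length, the threshold of 0.5, the function names, and the example documents are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.5, k=3):
    """Assumed decision rule: shingle overlap at or above the threshold."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold


# Hypothetical documents: doc_b differs from doc_a in a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox jumps over the lazy cat"
doc_c = "matrix visualizations scale well for dense graphs"

print(jaccard(shingles(doc_a), shingles(doc_b)))  # 0.75
print(is_near_duplicate(doc_a, doc_b))            # True
print(is_near_duplicate(doc_a, doc_c))            # False
```

Comparing all document pairs this way is quadratic in the collection size, which is why web-scale systems rely on compact fingerprints rather than explicit set comparison.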