
CET: A Tool for Creative Exploration of Graphs

Stefan Haun¹, Andreas Nürnberger¹, Tobias Kötter², Kilian Thiel², and Michael R. Berthold²

¹ Data and Knowledge Engineering Group, Faculty of Computer Science, Otto-von-Guericke-University, Germany
http://www.findke.ovgu.de

² Nycomed-Chair for Bioinformatics and Information Mining, University of Konstanz, Germany
http://www.inf.uni-konstanz.de/bioml

Abstract. We present a tool for interactive exploration of graphs that integrates advanced graph mining methods in an interactive visualization framework. The tool enables efficient exploration and analysis of complex graph structures. For flexible integration of state-of-the-art graph mining methods, the viewer makes use of the open source data mining platform KNIME. In contrast to existing graph visualization interfaces, all parts of the interface can be dynamically adapted to specific visualization requirements, including the use of node-type-dependent icons, methods for marking and highlighting nodes or edges, and a fluent graph that allows for iterative growing, shrinking and abstraction of (sub)graphs.

1 Introduction

Today's search is still concerned mostly with keyword-based queries and the closed discovery of facts. Many tasks, however, can be solved by mapping the underlying data to a graph structure and searching for structural features in a network, e.g. the connection between certain pages in Wikipedia¹ or the environment of a specific document. Exploring a hyperlink structure in a graph representation enables these tasks to be fulfilled much more efficiently. Graph visualization, on the other hand, can handle quite large graphs but is rather static, i.e. the layout and presentation methods calculate the graph visualization once and are not well suited for interactions such as adding or removing nodes.

One of the best-known graph layout methods, the Spring Force Layout, can yield very chaotic results when the graph changes only slightly: removing just one node may lead to a completely different layout. Since a user relies on the node positions during interaction with the graph, such behavior is not desirable.

With the Creative Exploration Toolkit (CET), we present a user interface with several distinct features:

- support of interactive graph visualization and exploration,

- integration of a modular open source data analytics system,

- easy configuration to serve specific user requirements.

In the following sections, we will describe these features in more detail.

¹ http://www.wikipedia.org

Published in: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2010, Barcelona, Spain, September 20-24, 2010, Proceedings, Part III / ed. by José Luis Balcázar ... Berlin [et al.]: Springer, 2010, pp. 587-590 (Lecture Notes in Computer Science; 6323). ISBN 978-3-642-15938-1

Konstanzer Online-Publikations-System (KOPS) URN: http://nbn-resolving.de/urn:nbn:de:bsz:352-236936

2 The Creative Exploration Toolkit

The Creative Exploration Toolkit (CET) is the user interface that visualizes the graph and allows interaction. The global design, shown in Figure 1, consists of

- a dashboard at the top, where the controls are located,

- a logging area, below, to show information on running processes and the tool status,

- a sidebar on the right, which displays detailed information about a node,

- and the workspace in the center, which is used for visualization.

We currently use the Stress Minimization Layout [3] to determine the initial graph layout, which enables the user to interact with the graph: nodes can be moved to create certain arrangements, nodes can be selected, and nodes can be expanded by double-clicking them. Additionally, the user may issue keyword-based queries. The corresponding results consist of graphs and can be visualized as well. Subsequent query results are added to the graph, enabling the user to explore the graph itself and the structures between the query results.
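The way query results and node expansions accumulate in one displayed graph can be sketched as follows. This is only an illustration under stated assumptions: the class `ExplorationGraph`, its methods and the `neighbour_lookup` callback are hypothetical names and are not part of CET.

```python
from dataclasses import dataclass, field


@dataclass
class ExplorationGraph:
    """Hypothetical in-memory model of the graph shown in the workspace."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: set = field(default_factory=set)    # frozenset({u, v}), undirected

    def merge(self, nodes, edges):
        """Add a query result (itself a graph) to what is already shown."""
        for node_id, attrs in nodes.items():
            # Existing attributes are kept unless the result supplies a new value.
            self.nodes.setdefault(node_id, {}).update(attrs)
        for u, v in edges:
            self.edges.add(frozenset((u, v)))

    def expand(self, node_id, neighbour_lookup):
        """Expansion of one node: fetch its neighbours and merge them."""
        neighbours = neighbour_lookup(node_id)
        self.merge({n: {} for n in neighbours},
                   [(node_id, n) for n in neighbours])
```

Because results are merged rather than replaced, nodes shared by two query results appear only once, which is what makes the structures between the results visible.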

Fig. 1. Screenshot of the Creative Exploration Toolkit (CET)

While CET takes care of graph visualization and presentation, special semantics are not supported. For example, the shortest path between two nodes is displayed by highlighting all nodes on the path. However, the user interface is not aware of the path property, but only displays the highlight attribute of the nodes, while the actual calculation takes place in the underlying data analysis platform described in the next section. The user interface is therefore very flexible when it comes to tasks from different domains.
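A minimal sketch of this separation, assuming an unweighted graph given as an adjacency dictionary: the analysis side computes a shortest path and writes a plain `highlight` attribute, and the display side reads nothing but that attribute. All function names and the attribute layout are illustrative assumptions, not the interface actually used between CET and KNIME.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Analysis side: breadth-first search on an unweighted graph."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return []  # no path found


def mark_highlight(attributes, path):
    """Analysis side: translate the result into a plain display attribute."""
    on_path = set(path)
    for node in attributes:
        attributes[node]["highlight"] = node in on_path
    return attributes


def render(attributes):
    """UI side: reads only 'highlight'; it has no notion of what a path is."""
    return {node: "*" if attrs.get("highlight") else " "
            for node, attrs in attributes.items()}
```

Any other analysis that sets the same attribute, for instance a clustering result, would be displayed by `render` without changes to the display code.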

As described in the next section, any KNIME workflow may be called from inside the user interface. However, this is also meant for design and development purposes. Finalized workflows can be integrated more directly into the UI to provide a more natural and convenient user experience. In our current setup, we integrate calls to a shortest path calculation and queries to the BioMine² database. More workflow and interaction schemes will follow in the course of future work.

3 The KNIME Information Mining Platform

KNIME [1], the Konstanz Information Miner, was initially developed by the Chair for Bioinformatics and Information Mining at the University of Konstanz, Germany. KNIME is released under an open source license (GPL v3) and can be downloaded free of charge³. KNIME is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models. The KNIME base version already incorporates hundreds of processing nodes for data I/O, preprocessing and cleansing, modeling, analysis and data mining as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all analysis modules of the well-known Weka data mining environment, and additional plugins allow, among others, R scripts⁴ to be run, offering access to a vast library of statistical routines. Within the frame of the EU FP7 project "BISON", KNIME was extended to also allow the flexible processing of large graphs. Combined with the already existing nodes, KNIME can therefore be used to model complex network processing and analysis tasks.

CET offers very generic access to KNIME, enabling the user to make arbitrary calls without adapting the user interface. CET can be configured to directly call a KNIME workflow via a pre-configured button. CET also provides a list of all available workflows plus a list of parameters for a selected workflow, which can be edited by the user. Essentially, all information that would be sent by the user interface can be provided to start a KNIME workflow. The result is then visualized in the graph. New analysis methods can therefore be integrated easily into CET by simply adding a new workflow providing the corresponding functionality.
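The idea of listing workflows, exposing their editable parameters and calling them generically can be sketched with a small registry. This is a hedged illustration only: `WorkflowRegistry`, `degree_filter` and the parameter `min_degree` are hypothetical, and plain Python functions stand in for KNIME workflows; no actual KNIME or CET API is shown.

```python
class WorkflowRegistry:
    """Hypothetical stand-in for the list of workflows callable from the UI."""

    def __init__(self):
        self._workflows = {}

    def register(self, name, func, **defaults):
        # 'defaults' plays the role of the editable parameter list.
        self._workflows[name] = (func, defaults)

    def names(self):
        return sorted(self._workflows)

    def parameters(self, name):
        return dict(self._workflows[name][1])

    def call(self, name, graph, **overrides):
        func, defaults = self._workflows[name]
        unknown = set(overrides) - set(defaults)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        return func(graph, **{**defaults, **overrides})


def degree_filter(graph, min_degree=1):
    """Example workflow: keep nodes with at least min_degree neighbours."""
    return {n: nbrs for n, nbrs in graph.items() if len(nbrs) >= min_degree}
```

Adding a new analysis then amounts to one further `register` call, while the calling code stays unchanged, which mirrors the extensibility argument made above.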

² http://www.cs.helsinki.fi/group/biomine/

³ http://www.knime.org

⁴ http://www.r-project.org

Fig. 2. An example KNIME workflow for calculating the network diameter, which is called from CET

Figure 2 shows an example of a workflow computing the network diameter. In this workflow, all nodes with a certain feature value are filtered first, i.e. only those nodes that have been selected and marked by the user are taken into account. Second, degree filters are applied to nodes and edges to remove unconnected nodes. The shortest paths between all node pairs are subsequently computed, and a feature holding the path length is assigned to the nodes on the longest of these shortest paths. Finally, the graph is sent back to CET.
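The steps of this workflow can be sketched in plain Python. The sketch assumes an undirected, unweighted graph and uses breadth-first search for the shortest paths; the feature names `selected` and `diameter` are assumptions made for illustration, and the code does not reproduce the KNIME nodes themselves.

```python
from collections import deque


def bfs_paths(adjacency, source):
    """Shortest paths (as node lists) from source to every reachable node."""
    paths = {source: [source]}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in paths:
                paths[neighbour] = paths[node] + [neighbour]
                queue.append(neighbour)
    return paths


def diameter_workflow(nodes, edges, feature="selected"):
    # Step 1: keep only nodes carrying the given feature value.
    kept = {n for n, attrs in nodes.items() if attrs.get(feature)}
    adjacency = {n: set() for n in kept}
    for u, v in edges:
        if u in kept and v in kept and u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Step 2: degree filter, dropping unconnected nodes.
    adjacency = {n: nbrs for n, nbrs in adjacency.items() if nbrs}
    # Step 3: shortest paths for all node pairs; remember the longest one.
    longest = []
    for source in sorted(adjacency):
        for path in bfs_paths(adjacency, source).values():
            if len(path) > len(longest):
                longest = path
    # Step 4: annotate the nodes on that path with its length in edges.
    length = max(len(longest) - 1, 0)
    result = {n: dict(nodes[n]) for n in adjacency}
    for n in longest:
        result[n]["diameter"] = length
    return result, length
```

If the filtered graph is disconnected, this sketch reports the longest finite shortest path, and when several paths share the maximum length only the first one found is annotated.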

4 Conclusion and Future Work

We demonstrate a novel user interface for generic graph visualization with special emphasis on extensibility by integration with data and graph analysis. The presented interface allows for easy interaction with the visualized graphs. This setup is particularly interesting for researchers in the area of Data Mining and Network Analysis, as it is very simple to plug in new approaches and visualize the results, even if there is interaction involved.

Extensions of the CET aim towards the integration of more workflows, thus adding to the available interaction and analysis features. We will also further improve graph visualization by incorporating constraint-based graph layout (for a first discussion see [2]).

Acknowledgement. The work presented here was supported by the European Commission under the 7th Framework Programme FP7-ICT-2007-C FET-Open, contract no. BISON-211898.

References

1. Berthold, M.R., Cebron, N., Dill, F., Gabriel, T.R., Kötter, T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., Wiswedel, B.: KNIME: The Konstanz Information Miner. In: Data Analysis, Machine Learning and Applications - Proceedings of the 31st Annual Conference of the Gesellschaft für Klassifikation e.V., Studies in Classification, Data Analysis, and Knowledge Organization, pp. 319-326. Springer, Heidelberg (2007)

2. Haun, S., Nitsche, M., Nürnberger, A.: Interactive Visualization of Continuous Node Features in Graphs. In: Proc. of Workshop on Explorative Analytics of Information Networks (EIN), part of ECML/PKDD 2009, pp. 98-106 (2009)

3. Koren, Y., Çivril, A.: The Binary Stress Model for Graph Drawing. In: Tollis, I.G., Patrignani, M. (eds.) GD 2008. LNCS, vol. 5417, pp. 193-205. Springer, Heidelberg (2009)
