VISlib – a Reusable Source-Code Library - Point-based visualization of molecular dynamics data

and will allow for a slowly but steadily decreasing failure rate. The ideal situation is a non-periodic bathtub curve as baseline representing the effort invested into the framework which becomes stable over time. This is sketched out in the bottom diagram in Figure 80. Additional effort almost exclusively results from inserting research-grade prototype modules which obtain good stability through correct interfacing with the system. The quality of these modules is not relevant for the quality of the system as a whole. Early publications will be more expensive as the framework will still be under heavy development. However, when the system reaches a certain degree of stability and usefulness, publications will require significantly decreased development effort.

Stable system architecture obviously requires some effort up front. This is why SE is not very popular: it means shifting a lot of effort to very early stages of the development process, and the ensuing positive effect takes a relatively long time to manifest. What we need is just the right dose of SE. The framework needs to be cleanly designed and developed. Many other parts, however, like well-known basic algorithms required for some ground truth and baseline, can be implemented and integrated by anyone with a decent understanding. This can even be accomplished by involving students or interns. To control the influences of the different parts of the system, they simply need to be known first. As trivial as this sounds, this first step, installing a rudimentary configuration management, is often reduced to the simple use of version control systems for source code, e.g. Subversion¹⁰. As useful, even important, as such a version control system is in general, developing a software system of the complexity discussed here requires more elaborate configuration management functions, e.g. test configuration definition, automatic testing, and approval or rejection of new code, which these tools cannot deliver.

4.2 VISlib – a Reusable Source-Code Library

quality. While public class libraries like boost aim at maximizing generality, which often comes at the cost of slightly suboptimal performance, many software applications, such as interactive visualization, depend on runtime performance above all and therefore require speed-optimized implementations of classes that could also be found in common libraries. These implementations are not generic enough to be widely used, but they are often valuable assets for the academic group they originate from. As there usually is a strong coherence in the working areas of the members of the group, these implementations are important as basis and ground truth for ongoing research work. Thus, such algorithms must not be lost in a single prototype program, but should be collected in coherent libraries. To successfully collect and reuse valuable source code within a research group, several conditions must be met:

First, when working on a new visualization application, even if it is just a small prototype, the developer must be aware of the possibility of reusing code and creating reusable code from the start (i.e. following the concepts described below throughout the whole development time, not just cleaning up later).

Second, parts that have been identified for possible reuse must be collected in the research group’s repository and therefore require some additional work: it is crucial that these implementations are encapsulated in a cleanly defined interface following the concept of information hiding [Par72] (which is inherently supported by modern programming languages) and optimally stamp coupling [PJ88] only.

Such a design requires some thought, and the resulting implementation is usually slightly larger than what the present application alone would need.

As a third condition, all classes collected for reuse must follow some coding standards and, most importantly, must be reasonably documented. As it is the very nature of research groups to have high fluctuation of personnel, it is essential that the source code is readable, maintainable, self-contained and works out-of-the-box, to guarantee efficient reuse and to make the work reproducible [Wil06]. For the same reason, complete API documentation that particularly highlights assumptions made for performance reasons is indispensable. Tools like doxygen¹¹ are reasonable utilities in this context.
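As an illustration of such documentation, the following is a minimal sketch of a doxygen-style comment that states a performance-motivated assumption explicitly. The function DotProduct and its contract are hypothetical and not taken from the VISlib; only the comment style is the point here.

```cpp
#include <cassert>
#include <cstddef>

/**
 * Computes the dot product of two float arrays.
 *
 * @param a Pointer to the first array; must not be null.
 * @param b Pointer to the second array; must not be null.
 * @param n Number of elements to read from each array.
 *
 * @return The sum of a[i] * b[i] for all i in [0, n).
 *
 * @note Performance assumption: for speed, no bounds checking is
 *       performed. The caller guarantees that both arrays hold at
 *       least n elements. Null pointers are only detected by an
 *       assertion in debug builds.
 */
inline float DotProduct(const float* a, const float* b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Placing the assumption in a dedicated @note keeps it visible in the generated API documentation, so that a later user of the class does not have to read the implementation to discover it.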

The fourth condition is to be aware of platform independence. Optimized implementations often require direct access to operating system functions, making even migration between two Linux distributions non-trivial. Hiding different implementations behind interfaces greatly eases porting applications, which is a major problem once platform-dependent, quickly developed applications have reached some size. Implementing the same functionality for all platforms at once helps to ensure the same behaviour on all operating systems.

11 www.doxygen.org/ (last visited 30.12.2011)

Figure 81: Schematic illustration of the proposed development process: Instead of a monolithic prototype implementation, reusable classes are identified and integrated into a local repository. Additional effort is required only for these classes, guaranteeing high source code quality, e.g. through thorough documentation.

If these conditions are met, or at least if the developers are aware of these issues, a light SE process that engineers classes for reuse only when necessary can be installed. The process is illustrated in Figure 81. Increased development efforts (green boxes) are only required for identifying reusable parts and when writing the reusable code. This effort decreases over time as the number of classes to be added to the repository also decreases. Establishing such a process across (competing) institutes would require extensive management effort and would raise problematic aspects. As beneficial as this would be for all, it is not required. Instead, each research group can set up such an internal repository.

This class repository must be organised sensibly to aid the whole process. Otherwise it will be no more than a mere storage of source code text files. If structured properly, a central storage point and documented classes also implement a self-organizing knowledge management supporting the ongoing development. To do so, the source repository should be divided into multiple libraries with clean dependencies. A base library is required to mostly deal with fundamental data structures, which can be used for data exchange through the interfaces of further reusable modules. At least facade definitions are required here. The remaining classes can be designed in tiers above this core, which are all interoperable, but become more and more specialized for the specific task. Thus, the whole library is organized into modules of high cohesion which can be used independently.


Figure 82: Module structure of the VISlib. The modules, which are implemented as static libraries, are organized in three tiers building on each other.

Within the VIS research group such a library, namely the VISlib, has been installed and has proven to be practical in everyday work. The library was started by collecting valuable code pieces from graduate students of the group. After the interfaces had been made coherent and comprehensively documented, these classes formed the inner core of the library, called base. Some example classes of this module are String, Exception, Array, RawStorage, Pair, SmartPtr, and others. These classes serve as base classes for derived implementations as well as basic data types. For example, all exceptions thrown from implementations within the library are derived from Exception, which provides basic operations to get the exception’s message and origin. RawStorage is a basic type encapsulating an unformatted memory block. It nevertheless provides convenience methods for byte-addressable but typed access to sub-portions of the block, as well as content-preserving size changes. Although all of these classes are, in some form, available in other libraries, these base definitions are still required for consistent and interoperable interfaces throughout the library.
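To make the idea of an unformatted memory block with typed, byte-addressable access more concrete, the following is a minimal sketch. The class RawBlock and its methods Write, Read, and Resize are hypothetical names chosen for this illustration; they do not reproduce the actual RawStorage interface. The sketch copies values in and out with memcpy to stay independent of alignment, which is an assumption of this example rather than a statement about the VISlib implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical sketch of an untyped memory block; not the VISlib RawStorage.
class RawBlock {
public:
    explicit RawBlock(std::size_t size = 0) : bytes_(size) {}

    std::size_t Size() const { return bytes_.size(); }

    // Content-preserving size change: bytes that fit into the new size are
    // kept, newly added bytes are zero-initialized.
    void Resize(std::size_t newSize) { bytes_.resize(newSize); }

    // Typed write at an arbitrary byte offset.
    template <class T>
    void Write(std::size_t offset, const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be stored");
        assert(offset + sizeof(T) <= bytes_.size());
        std::memcpy(bytes_.data() + offset, &value, sizeof(T));
    }

    // Typed read at an arbitrary byte offset.
    template <class T>
    T Read(std::size_t offset) const {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable types can be stored");
        assert(offset + sizeof(T) <= bytes_.size());
        T value;
        std::memcpy(&value, bytes_.data() + offset, sizeof(T));
        return value;
    }

private:
    std::vector<unsigned char> bytes_;  // the unformatted memory block
};
```

A caller could, for example, create a block of 8 bytes, write a float at byte offset 4, enlarge the block to 16 bytes, and still read the same float back from offset 4.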

The structure of the whole library can be seen in Figure 82. The individual modules of the VISlib are organized in three tiers, which get, from left to right, more and more specific. Tier 1 only contains base, as this module defines all required types and utility functions.

Building on this base module, four further modules are constructed in tier 2:

sys, providing platform-independent interfaces to operating-system-specific implementations, like Threads, their synchronization mechanisms, e.g. CriticalSection or Events, Files, including optimized memory-mapped data transfer (MemmappedFile), or PerformanceCounter to access a high-precision clock. All these classes are utility classes that only wrap the corresponding calls to the operating system functions. The main problem is providing the same functionality on all supported platforms, in this case Windows and several Linux distributions. Sometimes only the commonly available functions are implemented, and some functions will throw NotImplementedException or NotSupportedException on some operating systems (which is documented). Some classes, like RegistryKey, are only available on some operating systems (Windows only in this case). Closely related is the net module, which contains corresponding classes for low-level network communication, e.g. a Socket interface definition compatible with IPv4, IPv6, and InfiniBand. Utility classes to access DNS, or implementing a simple TcpServer or a SimpleMessage, increase the usefulness of this module.
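The wrapping of operating system calls behind one interface can be sketched as follows for a high-precision clock. The class name HighResClock and the method NowMilliseconds are hypothetical and only illustrate the pattern; the actual PerformanceCounter interface may differ. The sketch assumes QueryPerformanceCounter on Windows and the POSIX function clock_gettime with CLOCK_MONOTONIC elsewhere.

```cpp
#include <cassert>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <time.h>
#endif

// Hypothetical sketch of a platform wrapper; not the VISlib PerformanceCounter.
class HighResClock {
public:
    // Returns a monotonically increasing time stamp in milliseconds.
    // Callers see the same interface on every supported platform.
    static double NowMilliseconds() {
#ifdef _WIN32
        LARGE_INTEGER frequency;
        LARGE_INTEGER ticks;
        QueryPerformanceFrequency(&frequency);
        QueryPerformanceCounter(&ticks);
        return 1000.0 * static_cast<double>(ticks.QuadPart)
                      / static_cast<double>(frequency.QuadPart);
#else
        timespec ts;
        const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
        assert(rc == 0);
        (void)rc;  // avoids an unused-variable warning in release builds
        return 1000.0 * static_cast<double>(ts.tv_sec)
             + static_cast<double>(ts.tv_nsec) / 1.0e6;
#endif
    }
};
```

Application code only calls HighResClock::NowMilliseconds() and never touches the preprocessor switch, so adding a further platform means changing this one class only.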

Another second-tier module is the math library, which contains math-related classes in the context of visualization and computer graphics. These range from simple classes like Vector, Point, Quaternion, Matrix, Line, Cuboid, etc. to more sophisticated and specialized classes like FastMap (dimensionality reduction), ForceDirected (graph layout) or pcautil (collection of utility functions for principal components analysis). All classes are implemented as templates, allowing for different type instantiations (i.e. most of the time float or double, sometimes also integral types) as well as for different memory layout and storage: e.g. a Matrix can be stored row-major or column-major to make its internal representation match requirements of API functions of other libraries. All basic types, e.g. Vector or Point, can be assigned raw memory pointers, which will then be interpreted accordingly.
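A minimal sketch of how a template can select the memory layout at compile time and interpret caller-owned raw memory is given below. The names Layout and MatrixView are hypothetical and chosen for this illustration; they are not the VISlib Matrix interface.

```cpp
#include <cassert>
#include <cstddef>

enum class Layout { RowMajor, ColumnMajor };

// Hypothetical sketch: interprets caller-owned memory as a Rows x Cols
// matrix in the layout chosen at compile time. Not the VISlib Matrix class.
template <class T, std::size_t Rows, std::size_t Cols, Layout L>
class MatrixView {
public:
    // No copy is made; the caller keeps ownership of the memory.
    explicit MatrixView(T* data) : data_(data) { assert(data != nullptr); }

    T& operator()(std::size_t row, std::size_t col) {
        assert(row < Rows && col < Cols);
        return data_[Index(row, col)];
    }

private:
    // Maps a (row, col) pair to the linear offset in the raw memory.
    static constexpr std::size_t Index(std::size_t row, std::size_t col) {
        return (L == Layout::RowMajor) ? (row * Cols + col)
                                       : (col * Rows + row);
    }

    T* data_;
};
```

With the same six floats 1 to 6 in memory, a row-major 2x3 view returns 2 for element (0, 1), while a column-major 2x3 view returns 3 for the same element. Choosing the layout via a template parameter lets the buffer be handed to another API in the order that API expects, without a transposing copy.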

The graphics module is the last second-tier module. This is a collection of utility classes and interfaces for generic drawing operations. The two third-tier modules building on graphics are gl and d3d, which specialise the provided classes either for OpenGL or for DirectX. Apart from type classes, like ColourRGBAu8, and utilities, like FpsCounter or Cursor2DRectLasso, two of the probably most important classes are Camera and CameraParameters. The latter stores parameters of a camera in 3D space, like position, direction, and field-of-view, including advanced parameters for stereo and Powerwall support, like image-space tiling or stereo disparity. The Camera class acts as facade object to this parameter object and provides operations to evaluate these values. For example, given a scene bounding box, the camera can automatically adjust its near and far clipping planes accordingly. The two derived camera classes in the gl and d3d modules are able to compute matrices from the camera settings in the corresponding memory layouts. CameraParameters allows for easy serialization of its content, provides update events, and can be connected to classes derived from AbstractCameraController to be changed in a consistent way. One example of such a controller is CameraRotate2D, which translates movements of a normal mouse cursor to rotations of the camera. Instead of the monolithic camera interaction code usually found in research prototypes, a programmer can compose many interaction patterns by simply connecting the corresponding classes, and essentially only putting mouse coordinates into the system.
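One possible way to derive clipping planes from a scene bounding box is sketched below: all eight box corners are projected onto the viewing direction, and the smallest and largest depths become the near and far distances. The types Vec3, Box, and ClipRange as well as the function FitClipPlanes are hypothetical, and this is not necessarily how the VISlib Camera computes its planes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

struct Vec3 { float x, y, z; };
struct Box { Vec3 min, max; };                    // axis-aligned bounding box
struct ClipRange { float nearPlane, farPlane; };

// Hypothetical sketch: fits near and far clipping distances to a box.
// Assumes 'dir' is the normalized viewing direction of the camera.
inline ClipRange FitClipPlanes(const Vec3& pos, const Vec3& dir,
                               const Box& box, float minNear = 0.01f) {
    assert(std::fabs(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z - 1.0f)
           < 1.0e-3f);
    float lo = std::numeric_limits<float>::max();
    float hi = std::numeric_limits<float>::lowest();
    for (int i = 0; i < 8; ++i) {
        // Enumerate the eight corners via the three lowest bits of i.
        const Vec3 corner = {(i & 1) ? box.max.x : box.min.x,
                             (i & 2) ? box.max.y : box.min.y,
                             (i & 4) ? box.max.z : box.min.z};
        // Depth of the corner along the viewing direction.
        const float depth = (corner.x - pos.x) * dir.x
                          + (corner.y - pos.y) * dir.y
                          + (corner.z - pos.z) * dir.z;
        lo = std::min(lo, depth);
        hi = std::max(hi, depth);
    }
    ClipRange range;
    // Keep the near plane strictly in front of the camera.
    range.nearPlane = std::max(lo, minNear);
    range.farPlane = std::max(hi, range.nearPlane + minNear);
    return range;
}
```

For a camera at (0, 0, 5) looking along (0, 0, -1) at the box from (-1, -1, -1) to (1, 1, 1), the corner depths range from 4 to 6, so the sketch yields a near distance of 4 and a far distance of 6. The clamping keeps the result valid when the camera is inside or behind the box.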

The last module of the library which will be discussed here is cluster, a third-tier module building upon net and containing special utility functions to support distributed rendering on CPU compute clusters. The most important classes here implement simple message passing across cluster nodes, implement rudimentary parallel rendering for OpenGL and Direct3D, and provide a UDP-based DiscoveryService to collect cluster nodes through a software service-based search.
