
An overview of existing numerical linear algebra libraries

A great number of libraries for linear algebra have been developed during the last decades. One of the first was the Basic Linear Algebra Subprograms (BLAS), [LRKK79], for dense matrices and vectors in Fortran (and later on in C). BLAS is optimized for saxpy (scalar alpha x plus y) operations like

    z := αx + y,   α ∈ ℝ,   x, y, z ∈ ℝⁿ,   (2.2)

or gaxpy (generalized A x plus y), where α is replaced by an n×n matrix A. Sparse matrix libraries like the NIST Sparse BLAS [RP] for C and SPARSKIT [Saa90] for Fortran were the next step of evolution.

²Although there are efforts to emulate this conformity relationship in C++, see [MS00].

They offer compressed storage formats for sparse matrices and several numerical methods based on these data structures. The Fortran package SMMP [BD93], for example, even offers sparse matrix-matrix multiplication for the compressed row storage format.
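To make the compressed row storage format concrete, the following sketch (the struct and function names are ours, not taken from any of the libraries above) stores only the nonzeros of each row together with their column indices and performs a matrix-vector product on them:

```cpp
#include <cstddef>
#include <vector>

// Compressed row storage (CRS): the nonzero values of each row are stored
// contiguously, together with their column indices; rowptr[i] marks where
// row i starts in val/col, so rowptr has size rows + 1.
struct CRSMatrix {
    std::vector<double> val;     // nonzero values, row by row
    std::vector<int>    col;     // column index of each stored value
    std::vector<int>    rowptr;  // start of each row; size = rows + 1
};

// Matrix-vector product y = A x, touching only the stored nonzeros.
std::vector<double> multiply(const CRSMatrix& A, const std::vector<double>& x) {
    std::size_t rows = A.rowptr.size() - 1;
    std::vector<double> y(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i)
        for (int k = A.rowptr[i]; k < A.rowptr[i + 1]; ++k)
            y[i] += A.val[k] * x[A.col[k]];
    return y;
}
```

Both storage and work are proportional to the number of nonzeros, which is the point of the format.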

These libraries were built for special purposes, and their performance can hardly be beaten by other approaches. However, they are not suitable for all applications, because of their rather monolithic design and because their algorithms are restricted to a single type of data structure. Extensions and modifications were difficult and hardly possible without rewriting large parts of the code.

For example, the CBLAS library, an implementation of BLAS for C, comes with nearly every source file in four variants: one for single precision real values, one for double precision real values, one for single precision complex values and one for double precision complex values. This replication of source text bloats the code and is prone to errors. Note also that changes to just one algorithm that is equivalent for all these types (whether single or double, real or complex) must be made in each of those files. This is of course due to the lack of native generic programming in C, which in C++ can simply be realized by templates.
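In C++, a single function template covers all four cases. A minimal sketch (the name axpy and the vector-based interface are ours, not CBLAS'):

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// One generic axpy replaces the four replicated C source files:
// z := alpha * x + y for any entry type T that provides * and +
// (float, double, std::complex<float>, std::complex<double>, ...).
template <typename T>
std::vector<T> axpy(const T& alpha,
                    const std::vector<T>& x,
                    const std::vector<T>& y) {
    std::vector<T> z(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        z[i] = alpha * x[i] + y[i];
    return z;
}
```

The same source text now serves all four instantiations, so a fix to the algorithm is made exactly once.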

In the 1990’s, with the appearance of C++ and its great success, new numerical libraries were also devised. Packages like LAPACK++ (Linear Algebra PACKage in C++, [DPW93]), SparseLib++ [DLN+94] and MV++ [Poz97] were the first steps towards object-oriented numerics. However, they were mostly mere translations of their C predecessors, only encapsulating the functionality in C++ classes. Hardly any of them made use of the new C++ features like templates, operator overloading or iterators.

It was only the next generation of numerical linear algebra libraries that incorporated these techniques. Some of them shall be presented in the following.

Blitz++

The Blitz++ library [Vel98], [Vel01] offers multidimensional arrays, including vectors, matrices and tensors. It is restricted to dense arrays, so sparse matrices are not supported. Nevertheless, it implements a very interesting approach to handling arbitrary expressions of arrays, such as

    z := −αv + w − x + βy,   α, β ∈ ℝ,   v, w, x, y, z ∈ ℝⁿ.   (2.3)

Carrying out this operation with normal BLAS operations would require three consecutive calls to a saxpy function as in (2.2), or a specialized function. In C++, with overloaded operators *, + and -, we can simply write

z = -alpha * v + w - x + beta * y;

However, this would result in five temporary vectors and five generated loops.

This is the reason why conventional approaches in C++ are so slow compared to Fortran. The classical solution to this problem (in Fortran and C) would require specialized functions for each possible expression. This is why Blitz++ uses the expression template technique.

The idea is that the result of an operation like x + y is not a vector, but an instance of a new type, a class template that models a vector expression. This type just stores the expression tree and provides a member function like operator[](int i), which evaluates the expression at entry i. No assignment and no computation is done until the program execution reaches the assignment operator =. Then only one loop is generated, which iterates over all entries i of the expression, calling operator[] each time. Obviously, no temporary vectors need to be introduced.
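The mechanism can be sketched in a few lines (a toy version of the technique, not Blitz++'s actual classes):

```cpp
#include <cstddef>
#include <vector>

class Vec;  // forward declaration

// The result of '+' is not a vector but a lightweight expression object;
// operator[] evaluates the sum at entry i on demand.
template <typename L, typename R>
struct VecSum {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

class Vec {
    std::vector<double> data;
public:
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double  operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i)       { return data[i]; }
    std::size_t size() const { return data.size(); }

    // The single loop: the whole expression tree is evaluated entry by
    // entry right here; no temporary vectors are ever created.
    template <typename E>
    Vec& operator=(const E& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

VecSum<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    return VecSum<Vec, Vec>{a, b};
}

template <typename L, typename R>
VecSum<VecSum<L, R>, Vec> operator+(const VecSum<L, R>& a, const Vec& b) {
    return VecSum<VecSum<L, R>, Vec>{a, b};
}
```

With this, an assignment like r = x + y + z compiles down to one loop over the entries, just as described above; a full implementation would add further operators and scalar multiplication in the same fashion.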

Expression templates were pioneered by Todd Veldhuizen [Vel95] and have proved to be nearly as fast as Fortran code, sometimes (for small vector sizes) even faster. In any case, this programming technique offers greater flexibility.

The lack of sparse matrix formats rules out Blitz++ for our purpose of implementing algebraic multigrid. However, it provides an easy interface and powerful data structures for finite differences and structured grids (stencils). This is due to the simplifications that can be applied in these cases. Several other libraries and packages like EXPDE (see [Pfl01]) or POOMA go in the same direction.

POOMA

POOMA [Old02] is a toolkit that was developed at the Los Alamos National Laboratory and which, like Blitz++, targets dense vector, matrix and tensor computations. It also offers data structures that contain grids for finite difference methods. Sparse matrix formats and linear algebra methods are not directly supported.

Its main benefit, however, is easy and automatic parallelization. Containers (for vectors, etc.) can be split into patches and then distributed over several processors.

POOMA also makes intensive use of techniques like template metaprogramming to supply the compiler with as much information as possible already at compile time. The included Portable Expression Template Engine framework (PETE, [CCH+]) implements expression templates for POOMA's data structures. PETE can also be used to add expression template functionality to other array/vector classes.

TNT

The Template Numerical Toolkit (TNT, [Poz02]) is intended to be the successor of LAPACK++ and SparseLib++. It offers some dense matrix and vector formats, whereas the support for sparse matrices is still only rudimentary. So far, templates are used only for specifying the entry type, and since no iterators are used for abstraction, algorithms have to be written separately for every container type.

Moreover, sparse matrix-matrix multiplication is not supported.

MTL

The Matrix Template Library (MTL, [Sie99]) by Jeremy Siek is certainly one of the most modern matrix libraries available. It offers several types of matrices, including dense, banded and sparse. The user chooses the type of matrix by specifying four template parameters: element type, shape, storage and orientation. For example, the line

typedef matrix<double, rectangle<>, compressed<>, row_major>::type SparseMatrix;

defines SparseMatrix to be a type that stores a rectangular sparse matrix with double precision real entries which are stored in the classical compressed row storage format.

Since the element type is chosen via a template parameter, and the data structure does not impose any requirements on the element type, one is not restricted to simple scalar types like double or complex<double>; one may also use a matrix type here, which leads to a matrix of matrices.
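The principle can be illustrated with a toy class of our own (not MTL's actual implementation): a dense matrix templated on its entry type accepts another matrix type as entry without any extra code.

```cpp
#include <cstddef>
#include <vector>

// A toy dense matrix templated on its entry type T.
template <typename T>
class Matrix {
    std::size_t rows_, cols_;
    std::vector<T> data;
public:
    Matrix(std::size_t r, std::size_t c, const T& init)
        : rows_(r), cols_(c), data(r * c, init) {}
    T& operator()(std::size_t i, std::size_t j) {
        return data[i * cols_ + j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        return data[i * cols_ + j];
    }
};

// Entries are just "some type T", so a matrix of matrices -- e.g. a block
// matrix whose entries are 3x3 blocks -- works out of the box:
typedef Matrix<Matrix<double> > BlockMatrix;
```

An element access then simply chains the two call operators, as in B(0, 0)(1, 1) for entry (1, 1) of block (0, 0).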

Like Blitz++, the MTL makes intensive use of template metaprogramming, which can be seen as a language of its own on the template level. It allows the user to specify every option that is known at compile time, which in turn allows the compiler to optimize the corresponding data structures and algorithms; see also Section 3.5.

Furthermore, the MTL consistently separates data structures and algorithms, using iterators as an intermediate abstraction layer.
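The idea, familiar from the STL, can be sketched as follows (our own example, not actual MTL code): an algorithm written once against iterators works unchanged for any container of entries.

```cpp
#include <cmath>
#include <cstddef>
#include <list>
#include <vector>

// A generic 1-norm written once against the iterator abstraction; it is
// oblivious to whether the entries live in a dense vector, a list, or a
// sparse structure that exposes entry iterators.
template <typename Iter>
double one_norm(Iter first, Iter last) {
    double s = 0.0;
    for (; first != last; ++first)
        s += std::fabs(*first);
    return s;
}
```

A separate implementation per container type, as in TNT, is thus unnecessary.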

With the MTL comes the Iterative Template Library (ITL) that supplies the user with several numerical linear algebra methods like Conjugate Gradient, GMRES, BiCGStab, etc.

Having all these advantages in mind, there are a few things to criticize. The construction of the matrix classes with generators and the choice of several parameters obscures the fact that, for example, the compressed row/column format is mainly implemented by the generic comp2D<> class, which in turn is quite large and monolithic, seems to be built only for this single purpose, and is therefore hardly reusable.

Another issue is the lack of possibilities to add additional functionality to existing data structures. For example, we would like to add a view of the matrix that in row i only gives us the strong neighbours Sᵢ (see Definition 5.2.22).

Furthermore, the sparse matrix-matrix multiplication that the MTL provides is not suited for our purposes, since it requires the size of the resulting matrix to be known in advance. Otherwise it returns an error, at least in the current implementation.

Boost uBLAS

The Boost [Boo] project was started by members of the C++ Standards Committee Library Working Group as an extension of the Standard Template Library (STL). Nowadays, according to its web site, thousands of programmers contribute to it. It is a collection of several libraries built for a wide range of purposes, among them graph data structures and algorithms, quaternions, regular expressions, etc. The focus of Boost, however, is the generic programming paradigm – most libraries make intensive use of the template facilities of C++.

The Boost library for numerical linear algebra is the uBLAS [uBL] library. It was mainly designed to implement the BLAS functionality on the C++ level, with extensions such as several matrix types (sparse, symmetric, banded, triangular, etc.). Expression templates are used to efficiently evaluate vector and matrix expressions.

However, its algorithms only allow matrices and vectors filled with scalar values. The corresponding traits only cooperate with the C/C++ builtin types and refuse to work when matrices or vectors are in turn used as entry types.
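The effect can be sketched with a hypothetical traits class in the same spirit (the names and members here are invented for illustration and are not uBLAS' actual traits):

```cpp
#include <cmath>
#include <complex>
#include <cstddef>

// Traits specialized only for builtin scalar types; instantiating them
// with a matrix or vector entry type fails to compile, since the primary
// template is intentionally left undefined.
template <typename T> struct scalar_traits;

template <> struct scalar_traits<double> {
    typedef double real_type;
    static real_type abs(double x) { return std::fabs(x); }
};

template <> struct scalar_traits<std::complex<double> > {
    typedef double real_type;
    static real_type abs(const std::complex<double>& x) { return std::abs(x); }
};

// An algorithm written against the traits works for double and
// complex<double>, but not for nested matrix/vector entry types.
template <typename T>
typename scalar_traits<T>::real_type abs_sum(const T* v, std::size_t n) {
    typename scalar_traits<T>::real_type s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += scalar_traits<T>::abs(v[i]);
    return s;
}
```

Supporting matrices of matrices would require the traits (and the algorithms built on them) to be specialized recursively, which uBLAS does not provide.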

Looking at uBLAS from a software design standpoint, one still observes very little code reuse. For example, iterator classes are defined for each matrix class separately. Moreover, every matrix type is a class of its own, repeating standard member functions.