Introduction

The PSBLAS library, developed to facilitate the parallelization of computationally intensive scientific applications, is designed for the parallel implementation of iterative solvers for sparse linear systems under the distributed-memory paradigm. It includes routines for multiplying sparse matrices by dense matrices, solving block diagonal systems with triangular diagonal entries, and preprocessing sparse matrices, together with additional routines for dense matrix operations. The current implementation of PSBLAS targets a distributed-memory execution model based on message passing.

The PSBLAS library is internally implemented in the Fortran 95 [14] programming language, with reuse and/or adaptation of some existing Fortran 77 software, and a handful of C routines. A similar approach has been advocated by a number of authors, e.g. [13]. Moreover, the Fortran 95 facilities for dynamic memory management and interface overloading greatly enhance the usability of the PSBLAS subroutines. In this way, the library can take care of runtime memory requirements that are quite difficult or even impossible to predict at implementation or compilation time. The current release relies on the availability of the so-called allocatable extensions, specified in TR 15581. Strictly speaking, they are outside the Fortran 95 standard; however, they have been included in the Fortran 2003 language standard and are available in practically all Fortran 95 compilers on the market, including the GNU Fortran compiler from the Free Software Foundation (as of version 4.2). The presentation of the PSBLAS library follows the general structure of the proposal for serial Sparse BLAS [7,8], which in turn is based on the proposal for BLAS on dense matrices [12,4,5].

The applicability of sparse iterative solvers to many different areas causes some terminology problems, because the same concept may be denoted by different names depending on the application area. The PSBLAS features presented in this document are discussed with reference to a finite difference discretization of a Partial Differential Equation (PDE). However, the scope of the library is wider than that: it can be applied to finite element discretizations of PDEs, and even to different classes of problems such as nonlinear optimization, for example in optimal control problems.

The design of a solver for sparse linear systems is driven by many
conflicting objectives, such as limiting the occupation of storage
resources, exploiting regularities in the input data, and exploiting
hardware characteristics of the parallel platform. To achieve an
optimal communication-to-computation ratio on distributed-memory
machines it is essential to keep the *data locality* as high as
possible; this can be done through an appropriate data allocation
strategy. The choice of the preconditioner is another very important
factor that affects the efficiency of the implemented application. The
optimal data distribution for a given preconditioner may conflict
with the distribution requirements of the rest of the solver. Finding
the optimal trade-off may be very difficult because it is
application-dependent. Possible solutions to these problems, and other
important inputs to the development of the PSBLAS software package,
have come from established experience in applying the PSBLAS solvers
to computational fluid dynamics applications.
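
The effect of the data allocation strategy on data locality can be illustrated with a small, self-contained sketch. It is not PSBLAS code and does not use the PSBLAS interface; it is written in Python purely for illustration, and every name in it (`owner_block`, `owner_cyclic`, `remote_couplings`) is hypothetical. It assumes a one-dimensional finite difference discretization in which row `i` of the matrix couples only to rows `i-1` and `i+1`, and counts how many matrix entries refer to an unknown owned by a different process under two row distributions.

```python
# Illustrative sketch only: not PSBLAS code and not its API.
# Assumption: 1D finite difference stencil on n grid points, so that
# row i of the sparse matrix has off-diagonal entries in columns i-1, i+1.

def owner_block(i, n, nproc):
    """Contiguous block distribution: consecutive rows share an owner."""
    block = -(-n // nproc)  # ceiling division
    return i // block

def owner_cyclic(i, n, nproc):
    """Cyclic distribution: rows are dealt out round-robin."""
    return i % nproc

def remote_couplings(n, nproc, owner):
    """Count entries (i, j) whose row i and column j have different owners."""
    count = 0
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n and owner(i, n, nproc) != owner(j, n, nproc):
                count += 1
    return count

n, nproc = 16, 4
block_remote = remote_couplings(n, nproc, owner_block)
cyclic_remote = remote_couplings(n, nproc, owner_cyclic)
print(block_remote, cyclic_remote)  # prints: 6 30
```

With 16 rows on 4 processes, the block distribution leaves only the 6 couplings that straddle a block boundary as remote, whereas under the cyclic distribution all 30 off-diagonal couplings are remote. Each remote coupling corresponds to a value that must be obtained from another process in a matrix-vector product, so in this toy setting the block distribution gives the better communication-to-computation ratio.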