
1 Introduction
MUMPS (“MUltifrontal Massively Parallel Solver”) is a package for solving systems of linear equations of the form Ax = b, where A is a square sparse matrix that can be either unsymmetric, symmetric positive definite, or general symmetric. MUMPS is a direct method based on a multifrontal approach which performs a direct factorization A = LU or A = LDL^T, depending on the symmetry of the matrix. We refer the reader to the papers [3, 4, 7, 18, 19, 22, 21, 9] for full details of the techniques used. MUMPS exploits both the parallelism arising from sparsity in the matrix A and that of dense factorization kernels.
The main features of the MUMPS package include the solution of the transposed system, input of
the matrix in assembled format (distributed or centralized) or elemental format, error analysis, iterative
refinement, scaling of the original matrix, out-of-core capability, detection of null pivots, basic estimate
of rank deficiency and null space basis, and computation of a Schur complement matrix. MUMPS offers
several built-in ordering algorithms, a tight interface to some external ordering packages such as PORD [27], SCOTCH [25], or METIS [23] (strongly recommended), and the possibility for the user to input a given ordering. Finally, MUMPS is available in various arithmetics (real or complex, single or double precision).
The software is written in Fortran 90 although a C interface is available (see Section 8). The parallel
version of MUMPS requires MPI [28] for message passing and makes use of the BLAS [13, 14], BLACS, and ScaLAPACK [11] libraries. The sequential version relies only on BLAS.
MUMPS is downloaded from the web site almost four times a day on average and has been run on a wide range of machines, compilers, and operating systems, although our experience is mainly with UNIX-based systems. We have tested it extensively on parallel computers from SGI, Cray, and IBM and on clusters of workstations.
MUMPS distributes the work tasks among the processors, but an identified processor (the host) is
required to perform most of the analysis phase, to distribute the incoming matrix to the other processors
(slaves) in the case where the matrix is centralized, and to collect the solution. The system Ax = b is
solved in three main steps:
1. Analysis. The host performs an ordering (see Section 2.2) based on the symmetrized pattern A + A^T, and carries out the symbolic factorization. A mapping of the multifrontal computational graph is then computed, and symbolic information is transferred from the host to the other processors. Using this information, the processors estimate the memory necessary for factorization and solution.
2. Factorization. The original matrix is first distributed to the processors that will participate in the numerical factorization. Based on the so-called elimination tree [24], the numerical factorization is then a sequence of dense factorizations of so-called frontal matrices. The elimination tree also expresses independence between tasks and enables multiple fronts to be processed simultaneously; hence the name multifrontal approach. After the factorization, the factor matrices are kept distributed (in core memory or on disk); they will be used in the solution phase.
3. Solution. The right-hand side b is broadcast from the host to the working processors, which compute the solution x using the (distributed) factors computed during factorization. The solution is then either assembled on the host or kept distributed on the working processors.
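The separation into these three phases can be illustrated by a much-simplified sequential sketch. The function names and the dense, pivot-free toy factorization below are purely illustrative assumptions and not MUMPS's API; the real package operates on sparse, possibly distributed matrices with numerical pivoting:

```python
# Toy illustration of the analysis / factorization / solution split.
# Dense, sequential, no pivoting -- conceptual only.

def analyze(A):
    """'Analysis': build the symmetrized pattern of A + A^T.
    (MUMPS would also compute an ordering and a symbolic factorization.)"""
    n = len(A)
    return [[(A[i][j] != 0) or (A[j][i] != 0) for j in range(n)]
            for i in range(n)]

def factorize(A):
    """'Factorization': A = LU by Gaussian elimination (no pivoting)."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def solve(L, U, b):
    """'Solution': forward substitution Ly = b, then back substitution Ux = y."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
pattern = analyze(A)   # phase 1: symbolic work on the pattern of A + A^T
L, U = factorize(A)    # phase 2: numerical factorization
x = solve(L, U, b)     # phase 3: triangular solves with the stored factors
```

Note that, as in MUMPS, the factors produced by the second phase can be reused for any number of subsequent right-hand sides without repeating the analysis or factorization.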
Each of these phases can be called separately, and several instances of MUMPS can be handled simultaneously. MUMPS allows the host processor to participate in the factorization and solve phases, just like any other processor (see Section 2.7).
For both the symmetric and the unsymmetric algorithms used in the code, we have chosen a
fully asynchronous approach with dynamic scheduling of the computational tasks. Asynchronous
communication is used to enable overlapping between communication and computation. Dynamic
scheduling was initially chosen to accommodate numerical pivoting in the factorization. The other
important reason for this choice was that, with dynamic scheduling, the algorithm can adapt itself at
execution time to remap work and data to more appropriate processors. In fact, we combine the main features of static and dynamic approaches: we use the estimates obtained during the analysis to map some of the main computational tasks, while the other tasks are dynamically scheduled at execution time. The main data structures (the original matrix and the factors) are similarly partially mapped during the analysis phase.
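The task-dependency side of this design can be caricatured as follows: a front in the elimination tree becomes ready only once all of its child fronts are done, and any idle processor may then pick it up. The tree below and the first-come-first-served policy are invented for illustration; MUMPS's actual dynamic scheduler is considerably more elaborate:

```python
# Caricature of dynamic scheduling over an elimination tree.
# A node (front) is ready once all of its children have been processed;
# ready fronts may be taken by any idle processor, in any order.

from collections import deque

# parent[i] is the parent of front i in the (hypothetical) elimination
# tree, with -1 marking the root: fronts 0 and 1 feed front 2, fronts
# 2 and 3 feed the root front 4.
parent = [2, 2, 4, 4, -1]

def schedule(parent):
    n = len(parent)
    pending = [0] * n               # unfinished children per front
    for p in parent:
        if p >= 0:
            pending[p] += 1
    ready = deque(i for i in range(n) if pending[i] == 0)  # the leaves
    order = []
    while ready:
        node = ready.popleft()      # an idle processor grabs a ready front
        order.append(node)          # ... and "factorizes" it
        p = parent[node]
        if p >= 0:
            pending[p] -= 1
            if pending[p] == 0:
                ready.append(p)     # last child done: parent becomes ready
    return order

order = schedule(parent)
```

The leaves (fronts 0, 1, and 3 here) are mutually independent and could be processed simultaneously on different processors; the only hard constraint the scheduler must respect is that every front is processed after all of its children.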