Crash Course: FORTRAN 77+
• FORTRAN 77+ is used in these notes to refer to the dialect of FORTRAN 77 used by LAPACK and ScaLAPACK developers.
– Straight FORTRAN 77 is quite arcane and most compilers have implemented a set of extensions.
• FORTRAN has been the language of choice for scientific and engineering computing for a long time, partly because it:
– Has extensive compiler support for multi-dimensional arrays
– Has restrictions in the language that allow aggressive compiler optimizations
– Has language support (in FORTRAN 90 and onwards) for dynamic memory management, derived types, object orientation, operator overloading, generic interfaces, array expressions, distributed arrays (co-arrays), etc.
Fixed Source Format
• FORTRAN 77+ has a strict source format known as the "fixed source format" (superseded by the free source format of FORTRAN 90 and marked obsolescent in later standards)
• Columns are used for different things:
– 1: Comment column
– 2-5: Label columns
– 6: Continuation column
– 7-72: Statement columns
– 73-: Truncated (silently)
• Set your editor to expand tabs to spaces. Use 3 as tabstop (two tabs take you to column 7).
[Figure: a source line annotated with the Comment, Label, Continuation, Statements, and Truncated column ranges]
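The column rules above can be seen in a small program. This is a minimal sketch; the program name FIXFMT and the variable TOTAL are made up for illustration.

```fortran
C     Column 1: a 'C' (or '*') makes the whole line a comment.
      PROGRAM FIXFMT
      INTEGER I, TOTAL
      TOTAL = 0
C     The label 10 sits in columns 2-5; statements start in column 7.
      DO 10 I = 1, 5
         TOTAL = TOTAL + I
 10   CONTINUE
C     Any non-blank, non-zero character in column 6 continues a line.
      WRITE(*,*) 'Sum of 1..5 is ',
     $           TOTAL
      END
```

With a compiler that accepts fixed source format, this should print the sum 15. Anything typed past column 72 would be dropped without warning, which is why long statements are split with a continuation line.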
.NOT. !
• Tridiagonal matrices
– Diagonal Storage
Standard Packed Storage Format
Matrix Indices                Memory Placement
11  *  *  *  *  *  *  *  *     0  *  *  *  *  *  *  *  *
21 22  *  *  *  *  *  *  *     1  9  *  *  *  *  *  *  *
31 32 33  *  *  *  *  *  *     2 10 17  *  *  *  *  *  *
41 42 43 44  *  *  *  *  *     3 11 18 24  *  *  *  *  *
51 52 53 54 55  *  *  *  *     4 12 19 25 30  *  *  *  *
61 62 63 64 65 66  *  *  *     5 13 20 26 31 35  *  *  *
71 72 73 74 75 76 77  *  *     6 14 21 27 32 36 39  *  *
81 82 83 84 85 86 87 88  *     7 15 22 28 33 37 40 42  *
91 92 93 94 95 96 97 98 99     8 16 23 29 34 38 41 43 44
Rectangular Full Packed Storage Format
Matrix Indices    Memory Placement
11 66 76 86 96     0  9 18 27 36
21 22 77 87 97     1 10 19 28 37
31 32 33 88 98     2 11 20 29 38
41 42 43 44 99     3 12 21 30 39
51 52 53 54 55     4 13 22 31 40
61 62 63 64 65     5 14 23 32 41
71 72 73 74 75     6 15 24 33 42
81 82 83 84 85     7 16 25 34 43
91 92 93 94 95     8 17 26 35 44
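The two memory placements above can be written as index formulas. The sketch below assumes the layout shown in the tables: lower triangle, order N = 9, zero-based offsets, and for the rectangular format an odd N with the trailing triangle stored transposed in the upper right corner. The function names IPACK and IRFP are hypothetical, not LAPACK routines.

```fortran
      PROGRAM PKIDX
      INTEGER N, NELT
      PARAMETER ( N = 9, NELT = N*( N + 1 )/2 )
      INTEGER I, J, K, NBAD
      INTEGER SEENP( 0:NELT-1 ), SEENR( 0:NELT-1 )
      INTEGER IPACK, IRFP
      EXTERNAL IPACK, IRFP
      DO 10 K = 0, NELT - 1
         SEENP( K ) = 0
         SEENR( K ) = 0
 10   CONTINUE
C     Visit the lower triangle and mark every offset that is hit.
      DO 30 J = 1, N
         DO 20 I = J, N
            SEENP( IPACK( I, J, N ) ) = 1
            SEENR( IRFP( I, J, N ) ) = 1
 20      CONTINUE
 30   CONTINUE
C     Count offsets in 0..44 that one of the mappings never hit.
      NBAD = 0
      DO 40 K = 0, NELT - 1
         IF( MIN( SEENP( K ), SEENR( K ) ).EQ.0 ) NBAD = NBAD + 1
 40   CONTINUE
      WRITE(*,*) 'A(5,3): packed', IPACK( 5, 3, N ),
     $           ' RFP', IRFP( 5, 3, N )
      WRITE(*,*) 'A(8,6): packed', IPACK( 8, 6, N ),
     $           ' RFP', IRFP( 8, 6, N )
      WRITE(*,*) 'Offsets missed:', NBAD
      END

      INTEGER FUNCTION IPACK( I, J, N )
C     Standard packed storage, lower triangle, column by column.
      INTEGER I, J, N
      IPACK = ( I - 1 ) + ( J - 1 )*( 2*N - J )/2
      END

      INTEGER FUNCTION IRFP( I, J, N )
C     Rectangular full packed storage for odd N, as in the table:
C     an N-by-(N+1)/2 rectangle stored column by column.
      INTEGER I, J, N, N1, IR, IC
      N1 = ( N + 1 )/2
      IF( J.LE.N1 ) THEN
C        Leading columns are stored in place.
         IR = I
         IC = J
      ELSE
C        Trailing triangle is stored transposed in the upper right.
         IR = J - N1
         IC = I - ( N - N1 )
      END IF
      IRFP = ( IR - 1 ) + ( IC - 1 )*N
      END
```

For A(5,3) the tables give offsets 19 (packed) and 22 (rectangular), and for A(8,6) they give 37 and 27, so those are the values to compare against. Since both formats hold exactly 45 elements, a missed-offset count of zero means each mapping covers every memory location once.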
5 x 5 matrix, 2 x 2 blocks
11 12 13 14 15
21 22 23 24 25
31 32 33 34 35
41 42 43 44 45
51 52 53 54 55
2 x 2 process grid point of view
        0           1
0   11 12 15    13 14
    21 22 25    23 24
    51 52 55    53 54
1   31 32 35    33 34
    41 42 45    43 44
– DESCA(1): (DTYPE) 1
  DESCA(2): (CTXT) BLACS context
  DESCA(3): (M) Number of rows in global matrix
  DESCA(4): (N) Number of columns in global matrix
  DESCA(5): (MB) Row blocking factor
  DESCA(6): (NB) Column blocking factor
  DESCA(7): (RSRC) Row index of owner of A(1, 1)
  DESCA(8): (CSRC) Column index of owner of A(1, 1)
  DESCA(9): (LLD) Leading dimension of the local matrix
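As an illustration of the descriptor entries, the sketch below fills DESCA by hand for the 5 x 5 example with 2 x 2 blocks, once for each process row of the 2 x 2 grid. Several things here are assumptions: the PARAMETER names are this sketch's own, the context entry is a placeholder because no BLACS grid is created, and LOCSZ is a hypothetical helper written for this example. In a real program the ScaLAPACK tool routines DESCINIT and NUMROC would normally do this work.

```fortran
      PROGRAM DESCEX
C     Named positions inside the descriptor (names are this
C     sketch's own choice, following the list above).
      INTEGER DTYPE, CTXT, M, N, MB, NB, RSRC, CSRC, LLD
      PARAMETER ( DTYPE = 1, CTXT = 2, M = 3, N = 4, MB = 5,
     $            NB = 6, RSRC = 7, CSRC = 8, LLD = 9 )
      INTEGER DESCA( 9 ), NPROW, MYROW, LOCR
      INTEGER LOCSZ
      EXTERNAL LOCSZ
      NPROW = 2
      DO 10 MYROW = 0, NPROW - 1
C        Number of matrix rows stored on this process row.
         LOCR = LOCSZ( 5, 2, MYROW, 0, NPROW )
         DESCA( DTYPE ) = 1
C        Placeholder: a real context comes from the BLACS.
         DESCA( CTXT ) = 0
         DESCA( M ) = 5
         DESCA( N ) = 5
         DESCA( MB ) = 2
         DESCA( NB ) = 2
         DESCA( RSRC ) = 0
         DESCA( CSRC ) = 0
         DESCA( LLD ) = MAX( 1, LOCR )
         WRITE(*,*) 'Process row', MYROW, ': local rows', LOCR,
     $              ', LLD', DESCA( LLD )
 10   CONTINUE
      END

      INTEGER FUNCTION LOCSZ( N, NB, IPROC, ISRC, NPROCS )
C     Rows (or columns) of an N-long dimension, cut into blocks of
C     NB and dealt out cyclically, that land on process IPROC.
      INTEGER N, NB, IPROC, ISRC, NPROCS
      INTEGER MYDIST, NBLK, EXTRA
C     Distance of this process from the one holding block 1.
      MYDIST = MOD( NPROCS + IPROC - ISRC, NPROCS )
      NBLK = N / NB
C     Every process gets this many rows from complete rounds.
      LOCSZ = ( NBLK / NPROCS )*NB
      EXTRA = MOD( NBLK, NPROCS )
      IF( MYDIST.LT.EXTRA ) THEN
         LOCSZ = LOCSZ + NB
      ELSE IF( MYDIST.EQ.EXTRA ) THEN
C        This process receives the last, possibly short, block.
         LOCSZ = LOCSZ + MOD( N, NB )
      END IF
      END
```

The result should agree with the process grid view above: process row 0 holds rows 1, 2 and 5 (three local rows), process row 1 holds rows 3 and 4 (two local rows), so the leading dimension differs between the two process rows.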
PBLAS - Example
• Parallel version of DGEMM
– CALL PDGEMM( TRANSA, TRANSB,
               M, N, K,
               ALPHA, A, IA, JA, DESC_A,
               B, IB, JB, DESC_B,
               BETA, C, IC, JC, DESC_C )
• Notice:
– PBLAS interfaces take explicit descriptions of submatrices (global start indices such as IA, JA plus a descriptor such as DESC_A)
– BLAS, on the other hand, takes submatrices implicitly (through the address of the first element and the leading dimension)
ScaLAPACK
• SCAlable LAPACK (distributed memory)
– http://www.netlib.org/scalapack/ Official ScaLAPACK releases
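To make the PBLAS notice about BLAS-style submatrices concrete, here is a minimal self-contained sketch that does not call BLAS itself. SUMSUB is a hypothetical helper that uses the same convention as BLAS routines: it receives the first element of a submatrix and the leading dimension of the enclosing array.

```fortran
      PROGRAM SUBMAT
      INTEGER LDA
      PARAMETER ( LDA = 5 )
      DOUBLE PRECISION A( LDA, LDA ), S
      INTEGER I, J
C     Fill A so that A(I,J) holds the two-digit index 10*I + J.
      DO 20 J = 1, LDA
         DO 10 I = 1, LDA
            A( I, J ) = DBLE( 10*I + J )
 10      CONTINUE
 20   CONTINUE
C     BLAS style: pass the first element of the 2-by-3 submatrix
C     A(2:3,3:5) together with the leading dimension of all of A.
      CALL SUMSUB( 2, 3, A( 2, 3 ), LDA, S )
      WRITE(*,*) 'Sum of A(2:3,3:5) =', S
      END

      SUBROUTINE SUMSUB( M, N, A, LDA, S )
C     Hypothetical helper: sums the M-by-N matrix that starts at A
C     and whose columns are LDA elements apart in memory.
      INTEGER M, N, LDA, I, J
      DOUBLE PRECISION A( LDA, * ), S
      S = 0.0D0
      DO 20 J = 1, N
         DO 10 I = 1, M
            S = S + A( I, J )
 10      CONTINUE
 20   CONTINUE
      END
```

The entries touched are 23, 24, 25, 33, 34 and 35, so the sum should be 174. This trick only works when the whole array lives in one address space. For a distributed matrix the start of the submatrix may sit on another process, which is why PDGEMM takes the full local array together with IA, JA and a descriptor.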
Utilities: INFOG2L
• SUBROUTINE INFOG2L(GRINDX, GCINDX, DESC, NPROW, NPCOL,
MYROW, MYCOL, LRINDX, LCINDX, RSRC, CSRC)
• Given a global matrix element (GRINDX, GCINDX), returns
the corresponding local matrix element (LRINDX, LCINDX)
and the coordinates of the process that owns that element
(RSRC, CSRC).
• Arguments:
– GRINDX, GCINDX Global matrix element
– DESC Descriptor of matrix
– NPROW, NPCOL Grid size
– MYROW, MYCOL My coordinates
– LRINDX, LCINDX Local matrix element (output)
– RSRC, CSRC Owner of element (output)
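The arithmetic behind such a global-to-local translation can be sketched for one dimension at a time. G2L1D below is a hypothetical helper, not a ScaLAPACK routine, and it makes a simplifying assumption: it returns the local index as seen on the owning process only. It is applied to global element (5,3) of the 5 x 5 example with 2 x 2 blocks on a 2 x 2 grid.

```fortran
      PROGRAM G2LEX
      INTEGER LR, LC, PR, PC
C     Global element (5,3) of the 5 x 5 example: 2 x 2 blocks,
C     2 x 2 process grid, A(1,1) owned by process (0,0).
      CALL G2L1D( 5, 2, 0, 2, LR, PR )
      CALL G2L1D( 3, 2, 0, 2, LC, PC )
      WRITE(*,*) 'Owner process:', PR, PC
      WRITE(*,*) 'Local indices:', LR, LC
      END

      SUBROUTINE G2L1D( IG, NB, ISRC, NPROCS, IL, IOWN )
C     Hypothetical one-dimensional helper (not a ScaLAPACK routine).
C     IG     (in)  global index, 1-based
C     NB     (in)  blocking factor
C     ISRC   (in)  process holding the first block
C     NPROCS (in)  number of processes in this grid dimension
C     IL     (out) local index on the owning process, 1-based
C     IOWN   (out) coordinate of the owning process, 0-based
      INTEGER IG, NB, ISRC, NPROCS, IL, IOWN, IBLK
C     Zero-based number of the block that contains IG.
      IBLK = ( IG - 1 ) / NB
C     Blocks are dealt out cyclically, starting at ISRC.
      IOWN = MOD( ISRC + IBLK, NPROCS )
C     Full local blocks before this one, plus offset in the block.
      IL = ( IBLK / NPROCS )*NB + MOD( IG - 1, NB ) + 1
      END
```

For element 53 this should report owner (0,1) and local indices (3,1), matching its position in the process grid view of the 5 x 5 example.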