1.6 Final Thoughts
The distributed shared-memory programming models are best when long-term availability is possible and there is an appropriate match to the algorithms or applications involved. Clearly, programming models and their associated execution models will have to evolve to reach sustained petaflops levels of computing, which will in time move to computational resources known as clusters.
Finally, we will put together a series of examples of “working” code for many of the programming models discussed in this chapter. These will be designed around small computational kernels or simple applications in order to illustrate each model. The examples will be available at the Center for Programming Models for Scalable Parallel Computing website, http://www.pmodels.org/ppde.
Acknowledgments
This work was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract W-31-109-ENG-38 with Argonne National Laboratory and under Contract W-7405-ENG-82 at Ames Laboratory. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, non-exclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. We thank the members of the Center for Programming Models for Scalable Parallel Computing [8], who have helped us better understand many of the issues of parallel software development and the associated programming models. We thank Brent Gorda, Angie Kendall, Gail W. Pieper, Douglas Fuller, and Professor Gary T. Leavens for reviewing the manuscript. We also thank the book series editors and referees for their many helpful comments.
References
1. AlphaServer SC User Guide. Quadrics Supercomputer World Ltd., Bristol, 2000.
2. R. Armstrong, D. Gannon, A. Geist, K. Keahey, S. R. Kohn, L. McInnes, S. R. Parker,
and B. A. Smolinski. Toward a common component architecture for high-performance
scientific computing. In Proceedings of the 8th High Performance Distributed Computing
(HPDC’99), 1999. URL: http://www.cca-forum.org.
3. S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith. PETSc users manual. Technical
Report ANL-95/11 - Revision 2.1.0, Argonne National Laboratory, 2001.
4. R. Bariuso and A. Knies. SHMEM’s User’s Guide. SN-2515 Rev. 2.2, Cray Research,
Inc., Eagan, MN, USA, 1994.
5. M. Bull. OpenMP 2.5 and 3.0. In Proceedings of the Workshop on OpenMP Applications and Tools, WOMPAT 2004, Houston, TX, May 17–18 2004. (Invited talk).
6. P. M. Burton, B. Carruthers, G. S. Fischer, B. H. Johnson, and R. W. Numrich. Converting
the halo-update subroutine in the MET Office unified model to Co-Array Fortran. In
24. M. Folk, A. Cheng, and K. Yates. HDF5: A file format and I/O library for high performance computing applications. In Proceedings of Supercomputing’99 (CD-ROM). ACM SIGARCH and IEEE, Nov. 1999.
25. FORTRAN 77 Binding of X3H5 Model for Parallel Programming Constructs. Draft Version, ANSI X3H5, 1992.
26. Parallel Computing Forum. PCF Parallel FORTRAN Extensions. FORTRAN Forum, 10(3), September 1991. (Special issue).
27. Global Arrays Project. URL: http://www.emsl.pnl.gov/docs/global.
28. W. D. Gropp. Learning from the success of MPI. In B. Monien, V. K. Prasanna, and S. Vajapeyam, editors, High Performance Computing – HiPC 2001, number 2228 in Lecture Notes in Computer Science, pages 81–92. Springer, Dec. 2001.
29. W. D. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir, and
M. Snir. MPI—The Complete Reference: Volume 2, The MPI-2 Extensions. MIT Press,
Cambridge, MA, 1998.
30. W. D. Gropp, E. Lusk, and A. Skjellum. Using MPI: Portable Parallel Programming with
the Message Passing Interface, 2nd edition. MIT Press, Cambridge, MA, 1999.
31. W. D. Gropp, E. Lusk, and T. Sterling, editors. Beowulf Cluster Computing with Linux.
MIT Press, 2nd edition, 2003.
32. W. D. Gropp, E. Lusk, and R. Thakur. Using MPI-2: Advanced Features of the Message-
Passing Interface. MIT Press, Cambridge, MA, 1999.
33. R. Hempel and D. W. Walker. The emergence of the MPI message passing standard for
parallel computing. Computer Standards and Interfaces, 21(1):51–62, 1999.
34. High Performance Fortran Forum. High Performance Fortran language specification.
Scientific Programming, 2(1–2):1–170, 1993.
35. J. M. D. Hill, B. McColl, D. C. Stefanescu, M. W. Goudreau, K. Lang, S. B. Rao, T. Suel, T. Tsantilas, and R. H. Bisseling. BSPlib: The BSP programming library. Parallel Computing, 24(14):1947–1980, Dec. 1998.
36. C. A. R. Hoare. Communicating sequential processes. Communications of the ACM,
21(8):666–677, Aug. 1978.
37. J. Hoeflinger. Towards industry adoption of OpenMP. In Proceedings of the Workshop on OpenMP Applications and Tools, WOMPAT 2004, Houston, TX, May 17–18 2004. (Invited talk).
38. F. Hoffman. Writing hybrid MPI/OpenMP code. Linux Magazine, 6(4):44–48, April 2004. URL: http://www.linux-mag.com/2004-04/extreme_01.html.
39. Y. Hu, H. Lu, A. L. Cox, and W. Zwaenepoel. OpenMP for networks of SMPs. In
Proceedings of the 13th International Parallel Processing Symposium, April 1999.
40. P. Hyde. Java Thread Programming. SAMS, 1999.
41. IEEE Standard for Information Technology—Portable Operating System Interface (POSIX). IEEE Standard No. 1003.1, 2004.
42. W. Jiang, J. Liu, H.-W. Jin, D. K. Panda, W. D. Gropp, and R. Thakur. High performance
MPI-2 one-sided communication over InfiniBand. Technical Report ANL/MCS-P1119-
0104, Mathematics and Computer Science Division, Argonne National Laboratory, 2004.
43. G. Jost, J. Labarta, and J. Gimenez. What multilevel parallel programs do when you are not watching: A performance analysis case study comparing MPI/OpenMP, MLP, and nested OpenMP. In Proceedings of the Workshop on OpenMP Applications and Tools, WOMPAT 2004, pages 29–40, Houston, TX, May 17–18 2004. (Invited talk).
44. P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. TreadMarks: Distributed
shared memory on standard workstations and operating systems. In Proceedings of the
Winter 94 Usenix Conference, pages 115–131, January 1994.