Cannon's algorithm
In computer science, Cannon's algorithm is a distributed algorithm for matrix multiplication for two-dimensional meshes first described in 1969 by Lynn Elliot Cannon.[1][2]
It is especially suitable for computers laid out in an N × N mesh.[3] While Cannon's algorithm works well in homogeneous 2D grids, extending it to heterogeneous 2D grids has been shown to be difficult.[4]
The main advantage of the algorithm is that its storage requirements remain constant and are independent of the number of processors.[2]
The Scalable Universal Matrix Multiplication Algorithm (SUMMA)[5] is a more practical algorithm that requires less workspace and overcomes the need for a square 2D grid. It is used by the ScaLAPACK, PLAPACK, and Elemental libraries.
Algorithm overview
When multiplying two N×N matrices A and B, we need N×N processing nodes P arranged in a 2d grid. Initially pi,j is responsible for ai,j and bi,j.
row i of matrix a is circularly shifted by i elements to the left. col j of matrix b is circularly shifted by j elements up. Repeat n times: p[i][j] multiplies its two entries and adds to running total. circular shift each row of a 1 element left circular shift each col of b 1 element up
See also
References
- ↑ Lynn Elliot Cannon, A cellular computer to implement the Kalman Filter Algorithm, Technical report, Ph.D. Thesis, Montana State University, 14 July 1969.
- 1 2 Gupta, H.; Sadayappan, P.: Communication Efficient Matrix-Multiplication on Hypercubes, dbpubs.stanford.edu
- ↑ 4.2 Matrix Multiplication on a Distributed Memory Machine, www.phy.ornl.gov
- ↑ Ph.D. Research, graal.ens-lyon.fr. The thesis itself is not available from the archived link.
- ↑ Robert A. van de Geijn and Jerrell Watts, SUMMA: scalable universal matrix multiplication algorithm, Concurrency: Practice and Experience. Volume 9, Issue 4, pages 255–274, April 1997.