DragonFly On-Line Manual Pages

Search: Section:  

HPL_pdtrsv(3)                HPL Library Functions               HPL_pdtrsv(3)


HPL_pdtrsv - Solve triu( A ) x = b.


#include "hpl.h" void HPL_pdtrsv( HPL_T_grid * GRID, HPL_T_pmat * AMAT );


HPL_pdtrsv solves an upper triangular system of linear equations. The rhs is the last column of the N by N+1 matrix A. The solve starts in the process column owning the Nth column of A, so the rhs b may need to be moved one process column to the left at the beginning. The routine therefore needs a column vector in every process column but the one owning b. The result is replicated in all process rows, and returned in XR, i.e. XR is of size nq = LOCq( N ) in all processes. The algorithm uses decreasing one-ring broadcast in process rows and columns implemented in terms of synchronous communication point to point primitives. The lookahead of depth 1 is used to minimize the critical path. This entire operation is essentially ``latency'' bound and an estimate of its running time is given by: (move rhs) lat + N / ( P bdwth ) + (solve) ((N / NB)-1) 2 (lat + NB / bdwth) + gam2 N^2 / ( P Q ), where gam2 is an estimate of the Level 2 BLAS rate of execution. There are N / NB diagonal blocks. One must exchange 2 messages of length NB to compute the next NB entries of the vector solution, as well as performing a total of N^2 floating point operations.


GRID (local input) HPL_T_grid * On entry, GRID points to the data structure containing the process grid information. AMAT (local input/output) HPL_T_pmat * On entry, AMAT points to the data structure containing the local array information.


HPL_pdgesv (3). HPL 2.1 October 26, 2012 HPL_pdtrsv(3)

Search: Section: