In many parallel programs, run-time data redistribution is required to enhance data locality and reduce remote memory accesses on distributed memory multicomputers. In heterogeneous computing environments, irregular data redistribution can be used to adjust the data assignment. Since data redistribution is performed at run-time, there is a performance trade-off between the efficiency of the new data distribution for a subsequent phase of an algorithm and the cost of redistributing the array among processors. Thus, efficient methods for performing data redistribution are of great importance for the development of distributed memory compilers for data-parallel programming languages.
For regular data redistribution, this dissertation presents two approaches: an indexing approach and a packing/unpacking approach. In the indexing approach, we propose a generalized basic-cycle calculation (GBCC) technique to efficiently generate the communication sets for redistributing data from BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors. In the packing/unpacking approach, we present a User-Defined Types (UDT) method that performs BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution using MPI user-defined datatypes. This method reduces the required memory buffers and avoids unnecessary data movement. For irregular data redistribution, this dissertation presents an Essential Cycle Calculation (ECC) method.
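To make the notion of a communication set concrete, the following is a minimal brute-force sketch in Python. It is not the GBCC technique itself; it only enumerates, element by element, which source processor must send which global indices to which destination processor. The function names (`owner`, `communication_sets`, `pattern_period`) are hypothetical, and the sketch assumes the standard definition of BLOCK-CYCLIC(b) in which global index i belongs to processor (i div b) mod p.

```python
from collections import defaultdict
from math import gcd


def owner(i, block, nprocs):
    """Processor that owns global index i under BLOCK-CYCLIC(block) over nprocs processors."""
    return (i // block) % nprocs


def communication_sets(n, s, P, t, Q):
    """Brute-force communication sets for an n-element array.

    Returns a dict mapping (source, destination) to the sorted list of
    global indices that the source processor sends to the destination.
    """
    sets = defaultdict(list)
    for i in range(n):
        sets[(owner(i, s, P), owner(i, t, Q))].append(i)
    return dict(sets)


def pattern_period(s, P, t, Q):
    """Length after which the (source, destination) pattern repeats: lcm(s*P, t*Q)."""
    a, b = s * P, t * Q
    return a * b // gcd(a, b)


if __name__ == "__main__":
    # BLOCK-CYCLIC(2) over 2 processors to BLOCK-CYCLIC(3) over 2 processors.
    for pair, indices in sorted(communication_sets(12, 2, 2, 3, 2).items()):
        print(pair, indices)
    print("period:", pattern_period(2, 2, 3, 2))
```

Because the source pattern repeats every s*P elements and the destination pattern every t*Q elements, the combined pattern repeats every lcm(s*P, t*Q) elements, so only one such period needs to be examined. The brute-force loop above is useful mainly as a reference for checking a faster method.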
The above methods were originally developed for one-dimensional arrays. However, multi-dimensional arrays can also be redistributed by applying these methods dimension by dimension, starting from the first dimension if the array is stored in column-major order or from the last dimension if it is stored in row-major order.
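One way to see why a one-dimensional method extends dimension by dimension is that, under a block-cyclic distribution on a processor grid, the owner of an element is determined independently in each dimension. The sketch below illustrates only this factorization; it does not model memory ordering or the actual data movement, and the names `owner` and `owner_nd` are hypothetical.

```python
def owner(i, block, nprocs):
    """Processor coordinate owning index i under BLOCK-CYCLIC(block) over nprocs processors."""
    return (i // block) % nprocs


def owner_nd(index, blocks, grid):
    """Grid coordinates of the processor owning a multi-dimensional index.

    index  : one global index per dimension
    blocks : block size per dimension
    grid   : number of processors per dimension
    """
    return tuple(owner(i, b, p) for i, b, p in zip(index, blocks, grid))


if __name__ == "__main__":
    # 2-D array, BLOCK-CYCLIC(2) along dimension 0 and BLOCK-CYCLIC(3)
    # along dimension 1, on a 2 x 2 processor grid.
    print(owner_nd((5, 7), (2, 3), (2, 2)))
    print(owner_nd((3, 4), (2, 3), (2, 2)))
```

Since each coordinate of the owner depends only on the index in the same dimension, the source-to-destination mapping in one dimension can be computed without reference to the others.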