Towards an Algorithmic Skeleton Framework for the Programming of the Intel Xeon Phi Processor
The Intel R Xeon PhiTM is the first processor based on Intel’s MIC (Many Integrated Cores) architecture. It is a co-processor specially tailored for data-parallel computations, whose basic architectural design is similar to the ones of GPUs (Graphics Processing Units), leveraging the use of many integrated low computational cores to perform parallel
computations. The main novelty of the MIC architecture, relatively to GPUs, is its
compatibility with the Intel x86 architecture. This enables the use of many of the tools commonly available for the parallel programming of x86-based architectures, which may lead to a smaller learning curve. However, programming the Xeon Phi still entails aspects intrinsic to accelerator-based computing, in general, and to the MIC architecture, in particular.
In this thesis we advocate the use of algorithmic skeletons for programming the Xeon Phi. Algorithmic skeletons abstract the complexity inherent to parallel programming,
hiding details such as resource management, parallel decomposition, inter-execution
flow communication, thus removing these concerns from the programmer’s mind. In
this context, the goal of the thesis is to lay the foundations for the development of a
simple but powerful and efficient skeleton framework for the programming of the Xeon
Phi processor. For this purpose we build upon Marrow, an existing framework for the
orchestration of OpenCLTM computations in multi-GPU and CPU environments. We extend
Marrow to execute both OpenCL and C++ parallel computations on the Xeon Phi.
We evaluate the newly developed framework, several well-known benchmarks, like
Saxpy and N-Body, will be used to compare, not only its performance to the existing
framework when executing on the co-processor, but also to assess the performance on the Xeon Phi versus a multi-GPU environment.