Performance analysis of OpenMP scheduling type on embarrassingly parallel matrix multiplication algorithm
Journal
Lecture Notes on Data Engineering and Communications Technologies
ISSN
2367-4512
Date Issued
2018-01-01
Author(s)
Qun N.H.
Khalib Z.I.A.
Raof R.A.A.
DOI
10.1007/978-3-319-59427-9_94
Abstract
This paper investigates the effect of different OpenMP schedule types on a matrix multiplication algorithm whose main loop is embarrassingly parallel. OpenMP schedule types and chunk sizes are intended for fine-tuning how parallel loop iterations are distributed among threads. However, the schedule type and chunk size that give the best parallel performance for this kind of loop can only be identified by benchmarking. In principle, the static schedule type should be best suited to an embarrassingly parallel loop with equal workload per iteration, because it divides the iterations into equal chunks among the threads and therefore offers good load balance with low scheduling overhead. This paper shows that the static schedule type is not necessarily the best candidate. All schedule types achieve good load balance, which implies that the compiler is capable of assigning a relatively equal workload to each thread regardless of the explicitly defined schedule type. Benchmarking allows one to make informed trade-offs when selecting OpenMP directives.
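The kind of loop the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' benchmark code: the function name `matmul`, the row-major layout, and the use of `schedule(runtime)` are assumptions made for the example. With `schedule(runtime)`, the schedule type and chunk size are taken from the `OMP_SCHEDULE` environment variable, so static, dynamic, and guided schedules can be compared without recompiling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: C = A * B for n x n row-major matrices.
 * Each iteration of the outer loop writes a distinct row of C, so the
 * iterations are independent of one another (embarrassingly parallel)
 * and each performs the same amount of work.
 * schedule(runtime) defers the choice of schedule type and chunk size
 * to the OMP_SCHEDULE environment variable. If the compiler is not
 * invoked with OpenMP support, the pragma is ignored and the loop
 * runs serially. */
void matmul(const double *a, const double *b, double *c, int n)
{
    assert(a != NULL && b != NULL && c != NULL && n > 0);

    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++) {
                sum += a[(size_t)i * n + k] * b[(size_t)k * n + j];
            }
            c[(size_t)i * n + j] = sum;
        }
    }
}
```

When built with an OpenMP-capable compiler (for example `gcc -fopenmp`), a program calling this function could be run with settings such as `OMP_SCHEDULE="static"`, `OMP_SCHEDULE="dynamic,4"`, or `OMP_SCHEDULE="guided"` and timed for each one. The chunk size of 4 is an arbitrary illustrative value, not one taken from the paper.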