Library Support for Resource Constrained Accelerators

  • Laust Brock-Nannestad and Sven Karlsson. Library Support for Resource Constrained Accelerators. In Luiz DeRose, Bronis R. de Supinski, Stephen L. Olivier, Barbara M. Chapman, and Matthias S. Müller, editors, Using and Improving OpenMP for Devices, Tasks, and More, volume 8766 of Lecture Notes in Computer Science, pages 187-201. Springer, 2014. doi:10.1007/978-3-319-11454-5_14

    Accelerators, and other resource-constrained systems, are increasingly being used in computer systems. Accelerators provide power-efficient performance and often provide a shared memory model. However, it is a challenge to map feature-rich APIs, such as OpenMP, to resource-constrained systems. In this paper, we present a lightweight system where an accelerator can remotely execute library functions on a host processor. The implementation takes up 750 bytes but can replace arbitrary library calls, leading to significant savings in memory footprint. We evaluate with a set of SPLASH-2 applications and show that the impact on execution time is negligible when compared to GCC's OpenMP implementation.

    @incollection{2014-09-BROCK-NANNESTAD,
    author = {Laust Brock-Nannestad and Sven Karlsson},
    title = {{Library Support for Resource Constrained Accelerators}},
    booktitle = {{Using and Improving OpenMP for Devices, Tasks, and More}},
    series = {Lecture Notes in Computer Science},
    volume = {8766},
    editor = {Luiz DeRose and Bronis R. de Supinski and Stephen L. Olivier and Barbara M. Chapman and Matthias S. M{\"u}ller},
    date = {2014-09-28/2014-09-30},
    pages = {187--201},
    publisher = {Springer},
    doi = {10.1007/978-3-319-11454-5_14},
    abstract = {Accelerators, and other resource-constrained systems, are increasingly being used in computer systems. Accelerators provide power-efficient performance and often provide a shared memory model. However, it is a challenge to map feature-rich APIs, such as OpenMP, to resource-constrained systems. In this paper, we present a lightweight system where an accelerator can remotely execute library functions on a host processor. The implementation takes up 750 bytes but can replace arbitrary library calls, leading to significant savings in memory footprint. We evaluate with a set of SPLASH-2 applications and show that the impact on execution time is negligible when compared to GCC's OpenMP implementation.},
    year = {2014}
    }
