Monthly Archives: September 2014

The Relation Between Diamond Tiling and Hexagonal Tiling

  • Tobias Grosser, Sven Verdoolaege, Albert Cohen, and P. (Saday) Sadayappan. The Relation Between Diamond Tiling and Hexagonal Tiling. Parallel Processing Letters, 24(3), 2014. doi:10.1142/S0129626414410023
    @Article{2014-09-GROSSER,
    author = {Tobias Grosser and Sven Verdoolaege and Albert Cohen and P. (Saday) Sadayappan},
    title = {The Relation Between Diamond Tiling and Hexagonal Tiling},
    journal = {Parallel Processing Letters},
    date = {2014-09},
    year = {2014},
    volume = 24,
    number = 3,
    doi = {10.1142/S0129626414410023},
    publisher = {World Scientific Publishing Co.}
    }

Library Support for Resource Constrained Accelerators

  • Laust Brock-Nannestad and Sven Karlsson. Library Support for Resource Constrained Accelerators. In Luiz DeRose, Bronis R. de Supinski, Stephen L. Olivier, Barbara M. Chapman, and Matthias S. Müller, editors, Using and Improving OpenMP for Devices, Tasks, and More, volume 8766 of Lecture Notes in Computer Science, pages 187-201. Springer, 2014. doi:10.1007/978-3-319-11454-5_14

    Accelerators, and other resource-constrained systems, are increasingly being used in computer systems. Accelerators provide power-efficient performance and often provide a shared memory model. However, it is a challenge to map feature-rich APIs, such as OpenMP, to resource-constrained systems. In this paper, we present a lightweight system where an accelerator can remotely execute library functions on a host processor. The implementation takes up 750 bytes but can replace arbitrary library calls, leading to significant savings in memory footprint. We evaluate with a set of SPLASH-2 applications and show that the impact on execution time is negligible when compared to GCC's OpenMP implementation.

    @incollection{2014-09-BROCK-NANNESTAD,
    author = {Laust Brock-Nannestad and Sven Karlsson},
    title = {{Library Support for Resource Constrained Accelerators}},
    booktitle = {{Using and Improving OpenMP for Devices, Tasks, and More}},
    series = {Lecture Notes in Computer Science},
    volume = {8766},
    editor = {Luiz DeRose and Bronis R. de Supinski and Stephen L. Olivier and Barbara M. Chapman and Matthias S. M{\"u}ller},
    date = {2014-09-28/2014-09-30},
    pages = {187-201},
    publisher = {Springer},
    doi = {10.1007/978-3-319-11454-5_14},
    abstract = {Accelerators, and other resource-constrained systems, are increasingly being used in computer systems. Accelerators provide power-efficient performance and often provide a shared memory model. However, it is a challenge to map feature-rich APIs, such as OpenMP, to resource-constrained systems. In this paper, we present a lightweight system where an accelerator can remotely execute library functions on a host processor. The implementation takes up 750 bytes but can replace arbitrary library calls, leading to significant savings in memory footprint. We evaluate with a set of SPLASH-2 applications and show that the impact on execution time is negligible when compared to GCC's OpenMP implementation.},
    year = {2014}
    }

Model-Based Platform Composition for Embedded System Design

  • Nicolas Hili, Christian Fabre, Ivan Llopard, Sophie Dupuy-Chessa, and Dominique Rieu. Model-Based Platform Composition for Embedded System Design. In Proceedings of IEEE 8th International Symposium on Embedded Multicore Many-core Systems-on-Chip (MCSoC-14), pages 157-164, University of Aizu, Japan, 2014. doi:10.1109/MCSoC.2014.31

    Platforms are widely used to design embedded systems. They have numerous advantages: separation from the application, industrial rationalization, standardization, division of large development teams. However, their design complexity is growing dramatically due to several sources: the intricate combination of parallelism and heterogeneity in modern architectures, the quest for ever-lower power consumption, and the diversity of sensors/actuators required by modern applications. This complexity prevents straightforward platform design in one step and calls for gradual design by composition and improvement over existing components. However, there is no systematic way of composing them, and there is no clear concept suitable for platform composition. In this paper, we propose two atomic ways of composing platforms, increment and assembly, that allow designers to build platforms gradually thanks to two concepts called world and container.

    @InProceedings{2014-09-HILI,
    author = {Nicolas Hili and Christian Fabre and Ivan Llopard and Sophie Dupuy-Chessa and Dominique Rieu},
    title = {{Model-Based Platform Composition for Embedded System Design}},
    booktitle = {{Proceedings of IEEE 8th International Symposium on Embedded Multicore Many-core Systems-on-Chip (MCSoC-14)}},
    date = {2014-09-23/2014-09-25},
    pages = {157-164},
    address = {University of Aizu, Japan},
    doi = {10.1109/MCSoC.2014.31},
    abstract = {Platforms are widely used to design embedded systems. They have numerous advantages: separation from the application, industrial rationalization, standardization, division of large development teams. However, their design complexity is growing dramatically due to several sources: the intricate combination of parallelism and heterogeneity in modern architectures, the quest for ever-lower power consumption, and the diversity of sensors/actuators required by modern applications. This complexity prevents straightforward platform design in one step and calls for gradual design by composition and improvement over existing components. However, there is no systematic way of composing them, and there is no clear concept suitable for platform composition. In this paper, we propose two atomic ways of composing platforms, increment and assembly, that allow designers to build platforms gradually thanks to two concepts called world and container.},
    year = {2014}
    }

Parallel Background Subtraction in Video Streams using OpenCL on GPU Platforms

  • Grzegorz Szwoch. Parallel Background Subtraction in Video Streams using OpenCL on GPU Platforms. In 18th IEEE Conference SPA 2014: Signal Processing: Algorithms, Architectures, Arrangements, and Applications, pages 54-59, Poznan, Poland, 2014.

    An implementation of the background subtraction algorithm using the OpenCL platform is presented. The algorithm processes a live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. However, the algorithm is intended to be used on any OpenCL-compliant device, including DSP and FPGA platforms. Various optimizations of the algorithm are presented and tested using a number of devices with varying processing power. The main aim of the work is to determine which optimizations are essential for ensuring on-line video processing in the surveillance system.

    @InProceedings{2014-09-SZWOCH,
    author = {Grzegorz Szwoch},
    title = {{Parallel Background Subtraction in Video Streams using OpenCL on GPU Platforms}},
    booktitle = {{18th IEEE Conference SPA 2014: Signal Processing: Algorithms, Architectures, Arrangements, and Applications}},
    date = {2014-09-22/2014-09-24},
    pages = {54-59},
    address = {Poznan, Poland},
    url = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7067270&filter%3DAND(p_IS_Number%3A7067255)},
    abstract = {An implementation of the background subtraction algorithm using the OpenCL platform is presented. The algorithm processes a live stream of video frames from the surveillance camera in on-line mode. Processing is performed using a host machine and a parallel computing device. The work focuses on optimizing an OpenCL algorithm implementation for GPU devices by taking into account specific features of the GPU architecture, such as memory access, data transfers and work group organization. However, the algorithm is intended to be used on any OpenCL-compliant device, including DSP and FPGA platforms. Various optimizations of the algorithm are presented and tested using a number of devices with varying processing power. The main aim of the work is to determine which optimizations are essential for ensuring on-line video processing in the surveillance system.},
    year = {2014}
    }

VIPPE: Native Simulation and Performance Analysis Framework for Multi-Processing Embedded Systems

  • Luis Diaz, Eduardo González, Eugenio Villar, and Pablo Sánchez. VIPPE: Native Simulation and Performance Analysis Framework for Multi-Processing Embedded Systems. In Proceedings of the JCE-Sarteco 2014, pages 1-7, Valladolid, Spain, 2014.
    @InProceedings{2014-09-DIAZ,
    author = {Luis Diaz and Eduardo Gonz\'{a}lez and Eugenio Villar and Pablo S\'{a}nchez},
    title = {{VIPPE: Native Simulation and Performance Analysis Framework for Multi-Processing Embedded Systems}},
    booktitle = {{Proceedings of the JCE-Sarteco 2014}},
    pages = {1-7},
    date = {2014-09-17/2014-09-19},
    address = {Valladolid, Spain},
    url = {http://www.researchgate.net/publication/267265900_VIPPE_Native_simulation_and_performance_analysis_framework_for_multi-processing_embedded_systems},
    year = {2014}
    }
