Monthly Archives: October 2015

Experimental demonstration of extended depth-of-field f/1.2 visible High Definition camera with jointly optimized phase mask and real-time digital processing

  • Marie-Anne Burcklen, Frederic Diaz, François Leprêtre, Joel Rollin, Anne Delboulbé, Mane-Si Laure Lee, Brigitte Loiseaux, Allan Koudoli, Simon Denel, Philippe Millet, Francois Duhem, Fabrice Lemonnier, Hervé Sauer, and Francois Goudail. Experimental demonstration of extended depth-of-field f/1.2 visible High Definition camera with jointly optimized phase mask and real-time digital processing. Journal of the European Optical Society – Rapid Publications, 10:1-6, 2015. doi:10.2971/jeos.2015.15046
    [BibTeX] [Abstract]

    Increasing the depth of field (DOF) of compact visible high resolution cameras while maintaining high imaging performance in the DOF range is crucial for such applications as night vision goggles or industrial inspection. In this paper, we present the end-to-end design and experimental validation of an extended depth-of-field visible High Definition camera with a very small f-number, combining a six-ring pyramidal phase mask in the aperture stop of the lens with a digital deconvolution. The phase mask and the deconvolution algorithm are jointly optimized during the design step so as to maximize the quality of the deconvolved image over the DOF range. The deconvolution processing is implemented in real-time on a Field-Programmable Gate Array and we show that it requires very low power consumption. By means of MTF measurements and imaging experiments we experimentally characterize the performance of the camera both with and without the phase mask and thereby demonstrate a significant increase in depth of field by a factor of 2.5, as expected from the design step.

    @Article{2015-10-BURCKLEN,
    author = {Marie-Anne Burcklen and Frederic Diaz and Fran\c{c}ois Lepr\^{e}tre and Joel Rollin and Anne Delboulb\'e and Mane-Si Laure Lee and Brigitte Loiseaux and Allan Koudoli and Simon Denel and Philippe Millet and Francois Duhem and Fabrice Lemonnier and Herv\'e Sauer and Francois Goudail},
    title = {{Experimental demonstration of extended depth-of-field f/1.2 visible High Definition camera with jointly optimized phase mask and real-time digital processing}},
    journal = {Journal of the European Optical Society - Rapid Publications},
    date = {2015-10-21},
    year = {2015},
    volume = {10},
    pages = {1--6},
    doi = {10.2971/jeos.2015.15046},
    publisher = {EOS},
    abstract = {Increasing the depth of field (DOF) of compact visible high resolution cameras while maintaining high imaging performance in the DOF range is crucial for such applications as night vision goggles or industrial inspection. In this paper, we present the end-to-end design and experimental validation of an extended depth-of-field visible High Definition camera with a very small f-number, combining a six-ring pyramidal phase mask in the aperture stop of the lens with a digital deconvolution. The phase mask and the deconvolution algorithm are jointly optimized during the design step so as to maximize the quality of the deconvolved image over the DOF range. The deconvolution processing is implemented in real-time on a Field-Programmable Gate Array and we show that it requires very low power consumption. By means of MTF measurements and imaging experiments we experimentally characterize the performance of the camera both with and without the phase mask and thereby demonstrate a significant increase in depth of field by a factor of 2.5, as expected from the design step.}
    }
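
The paper's six-ring mask and jointly optimized filter are specific to its design step and are not reproduced here. As a generic illustration of the digital half of such a pipeline, the standard building block is frequency-domain Wiener deconvolution with a known point-spread function; the function below (its name and the scalar `snr` parameter are simplifications introduced for this sketch) shows the idea in NumPy:

```python
import numpy as np

def wiener_deconvolve(image, psf, snr=100.0):
    """Deconvolve `image` with known point-spread function `psf` using a
    Wiener filter (an illustrative stand-in for the paper's jointly
    optimized deconvolution, not the authors' actual filter)."""
    # Zero-pad the PSF to the image size and centre it at the origin,
    # so that the FFT-based convolution model matches a centred blur.
    kernel = np.zeros_like(image, dtype=float)
    kh, kw = psf.shape
    kernel[:kh, :kw] = psf
    kernel = np.roll(kernel, (-(kh // 2), -(kw // 2)), axis=(0, 1))

    H = np.fft.fft2(kernel)
    G = np.fft.fft2(image)
    # Wiener filter: H* / (|H|^2 + 1/SNR); the 1/SNR term regularizes
    # frequencies where the blur kernel has little energy.
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft2(W * G))
```

In an extended-DOF system the PSF varies with defocus; the joint optimization in the paper is precisely what keeps a single deconvolution filter acceptable across the whole DOF range.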

Posted in Dissemination
Determining surface roughness of semifinished products using computer vision and machine learning

  • Valentin Koblar, Martin Pečar, Klemen Gantar, Tea Tušar, and Bogdan Filipič. Determining surface roughness of semifinished products using computer vision and machine learning. In Proceedings of the 18th International Multiconference Information Society (IS 2015), volume A, pages 51-54, 2015.
    [BibTeX] [Abstract] [Download PDF]

    In the production of components for various industries, including automotive, monitoring of surface roughness is one of the key quality control procedures since achieving appropriate surface quality is necessary for reliable functioning of the manufactured components. This study deals with the task of determining the surface roughness of semifinished products and proposes a computer-vision-based method for this purpose. To automate the design of the method, machine learning is used to induce suitable predictive models from the captured product images, and evolutionary computation to tune the computer vision algorithm parameters. The resulting method allows for accurate online determination of roughness quality classes and shows a potential for online prediction of roughness values.

    @InProceedings{2015-10-KOBLAR,
    title = {{Determining surface roughness of semifinished products using computer vision and machine learning}},
    author = {Valentin Koblar and Martin Pe\v{c}ar and Klemen Gantar and Tea Tu\v{s}ar and Bogdan Filipi\v{c}},
    booktitle = {{Proceedings of the 18th International Multiconference Information Society (IS 2015)}},
    volume = {A},
    pages = {51--54},
    date = {2015-10},
    url = {http://www.copcams.eu/wp-content/uploads/2015/10/Koblar_etal_IS2015_Vol.A_51-54.pdf},
    abstract = {In the production of components for various industries, including automotive, monitoring of surface roughness is one of the key quality control procedures since achieving appropriate surface quality is necessary for reliable functioning of the manufactured components. This study deals with the task of determining the surface roughness of semifinished products and proposes a computer-vision-based method for this purpose. To automate the design of the method, machine learning is used to induce suitable predictive models from the captured product images, and evolutionary computation to tune the computer vision algorithm parameters. The resulting method allows for accurate online determination of roughness quality classes and shows a potential for online prediction of roughness values.},
    year = {2015}
    }
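
The paper induces predictive models with machine learning and tunes the vision pipeline with evolutionary computation; none of that is reproduced here. The following toy sketch only shows the overall shape of such a system, i.e. image features feeding a classifier. The two features and the nearest-centroid classifier are illustrative choices for this sketch, not the paper's method:

```python
import numpy as np

def roughness_features(img):
    """Two simple texture features for a grey-level surface image:
    mean absolute gradient and grey-level standard deviation
    (illustrative; the paper uses a tuned computer vision pipeline)."""
    gx = np.diff(img, axis=1)   # horizontal intensity differences
    gy = np.diff(img, axis=0)   # vertical intensity differences
    return np.array([np.abs(gx).mean() + np.abs(gy).mean(), img.std()])

def nearest_centroid_classify(feats, centroids):
    """Assign a feature vector to the roughness class whose training
    centroid is closest in Euclidean distance."""
    d = np.linalg.norm(centroids - feats, axis=1)
    return int(np.argmin(d))
```

Rougher surfaces produce larger local intensity variation, so even these crude features separate synthetic "smooth" from "rough" textures; the value of the paper's approach lies in automating the feature and parameter choices that are hand-picked here.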

Posted in Dissemination
OpenCL Implementation of Mean Shift Algorithm with Performance Comparisons on Discrete and Embedded

  • Özge Ünel and Toygar Akgün. OpenCL Implementation of Mean Shift Algorithm with Performance Comparisons on Discrete and Embedded. Advanced Concepts for Intelligent Vision Systems (ACIVS 2015), 2015.
    [BibTeX]
    @Misc{2015-10-UNEL,
    author = {\"{O}zge \"{U}nel and Toygar Akg\"{u}n},
    title = {{OpenCL Implementation of Mean Shift Algorithm with Performance Comparisons on Discrete and Embedded}},
    howpublished = {Advanced Concepts for Intelligent Vision Systems (ACIVS 2015)},
    address = {Catania, Italy},
    date = {2015-10-26/2015-10-29},
    year = {2015}
    }
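
The contribution above is an OpenCL implementation and its performance comparison; the underlying iteration it parallelizes is the classic mean shift update. A minimal sequential NumPy version with a flat (uniform) kernel, purely for reference, looks like this (function name chosen for this sketch):

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=10):
    """Sequential mean shift with a flat kernel: each point repeatedly
    moves to the mean of the sample points within `bandwidth` of its
    current position. Each point's update is independent, which is what
    makes the algorithm a natural fit for OpenCL parallelization."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        for i in range(len(shifted)):
            dist = np.linalg.norm(points - shifted[i], axis=1)
            neighbours = points[dist <= bandwidth]
            shifted[i] = neighbours.mean(axis=0)
    return shifted
```

Points belonging to the same density mode converge to (approximately) the same location, which is how mean shift is used for clustering and image segmentation.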

Posted in Dissemination
Smart Image Sensor for Advanced Use and New Applications

  • Michael Tchagaspanian. Smart Image Sensor for Advanced Use and New Applications. Advanced Concepts for Intelligent Vision Systems (ACIVS 2015), 2015.
    [BibTeX] [Abstract]

    Multimedia applications such as video compression, image processing, and face recognition now run on embedded platforms. The huge computing power needed is provided by the evolution of transistor density and by using specialized accelerators. These accelerators are supported by multimedia instruction sets. Using these complex instructions can be a nightmare for the engineer because there are many ways to program them, the quality of compiler support can be unpredictable depending on the compiler/platform pair and, worse, performance can be data dependent. Using libraries can be an option if such libraries exist and provide sufficient performance. In this talk, I’ll illustrate the difficulty of generating binary code for this application domain with practical examples of code generation. Then I’ll present deGoal, a tool developed in-house to address these problems.

    @Misc{2015-10-TCHAGASPANIAN,
    author = {Michael Tchagaspanian},
    title = {{Smart Image Sensor for Advanced Use and New Applications}},
    howpublished = {Advanced Concepts for Intelligent Vision Systems (ACIVS 2015)},
    address = {Catania, Italy},
    date = {2015-10-26/2015-10-29},
    year = {2015},
    abstract = {Multimedia applications such as video compression, image processing, and face recognition now run on embedded platforms. The huge computing power needed is provided by the evolution of transistor density and by using specialized accelerators. These accelerators are supported by multimedia instruction sets.
    Using these complex instructions can be a nightmare for the engineer because there are many ways to program them, the quality of compiler support can be unpredictable depending on the compiler/platform pair and, worse, performance can be data dependent. Using libraries can be an option if such libraries exist and provide sufficient performance.
    In this talk, I'll illustrate the difficulty of generating binary code for this application domain with practical examples of code generation. Then I'll present deGoal, a tool developed in-house to address these problems.}
    }

Posted in Dissemination
An Adaptive Framework for Imaging Systems

  • Andreas Erik Hindborg, Lars Frydendal Bonnichsen, Nicklas Bo Jensen, Laust Brock-Nannestad, Christian W. Probst, and Sven Karlsson. An Adaptive Framework for Imaging Systems. Advanced Concepts for Intelligent Vision Systems (ACIVS 2015), 2015.
    [BibTeX] [Abstract]

    Computer vision and video processing systems handle large amounts of data with varying spatial and temporal resolution and multiple imaging modalities. The current best practice is to design video processing systems with overcapacity, which avoids underperforming in the general case but wastes resources. In this work we present an adaptive framework for imaging systems that aims at minimizing wasted resources. Depending on properties of the processed images, the system dynamically adapts both the implementation of the processing system and properties of the underlying hardware.

    @Misc{2015-10-HINDBORG,
    author = {Andreas Erik Hindborg and Lars Frydendal Bonnichsen and Nicklas Bo Jensen and Laust Brock-Nannestad and Christian W. Probst and Sven Karlsson},
    title = {{An Adaptive Framework for Imaging Systems}},
    howpublished = {Advanced Concepts for Intelligent Vision Systems (ACIVS 2015)},
    address = {Catania, Italy},
    date = {2015-10-26/2015-10-29},
    year = {2015},
    abstract = {Computer vision and video processing systems handle large amounts of data with varying spatial and temporal resolution and multiple imaging modalities. The current best practice is to design video processing systems with an overcapacity, which avoids underperforming in the general case, but wastes resources. In this work we present an adaptive framework for imaging systems that aims at minimizing waste of resources. Depending on properties of the processed images, the system dynamically adapts both the implementation of the processing system and properties of the underlying hardware. }
    }
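
The paper's framework adapts both the software implementation and hardware properties; the mechanism itself is not described in this abstract. As a toy illustration of the feedback idea only, a trivial policy that moves between processing levels based on measured frame time might look like this (function name, level names, and the `margin` heuristic are all inventions of this sketch):

```python
def adapt_resolution(levels, current, frame_time, budget, margin=0.8):
    """Toy adaptation policy: switch to a cheaper processing level when a
    frame exceeds its time budget, and move back to a richer level when
    there is ample slack. `levels` is ordered from richest to cheapest."""
    i = levels.index(current)
    if frame_time > budget and i + 1 < len(levels):
        return levels[i + 1]          # over budget: step down
    if frame_time < margin * budget and i > 0:
        return levels[i - 1]          # plenty of slack: step up
    return current                    # within the comfort band: hold
```

The `margin` band prevents oscillation between two levels when the frame time sits right at the budget boundary, a standard concern for any such controller.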

Posted in Dissemination
Binary Code Generation for Multimedia Application on Embedded Platforms

  • Henri-Pierre Charles. Binary Code Generation for Multimedia Application on Embedded Platforms. Advanced Concepts for Intelligent Vision Systems (ACIVS 2015), 2015.
    [BibTeX] [Abstract]

    Multimedia applications such as video compression, image processing, and face recognition now run on embedded platforms. The huge computing power needed is provided by the evolution of transistor density and by using specialized accelerators. These accelerators are supported by multimedia instruction sets. Using these complex instructions can be a nightmare for the engineer because there are many ways to program them, the quality of compiler support can be unpredictable depending on the compiler/platform pair and, worse, performance can be data dependent. Using libraries can be an option if such libraries exist and provide sufficient performance. In this talk I’ll illustrate the difficulty of generating binary code for this application domain with practical examples of code generation. Then I’ll present deGoal, a tool developed in-house to address these problems.

    @Misc{2015-10-CHARLES,
    author = {Henri-Pierre Charles},
    title = {{Binary Code Generation for Multimedia Application on Embedded Platforms}},
    howpublished = {Advanced Concepts for Intelligent Vision Systems (ACIVS 2015)},
    address = {Catania, Italy},
    date = {2015-10-26/2015-10-29},
    year = {2015},
    abstract = {Multimedia applications such as video compression, image processing, and face recognition now run on embedded platforms. The huge computing power needed is provided by the evolution of transistor density and by using specialized accelerators. These accelerators are supported by multimedia instruction sets.
    Using these complex instructions can be a nightmare for the engineer because there are many ways to program them, the quality of compiler support can be unpredictable depending on the compiler/platform pair and, worse, performance can be data dependent. Using libraries can be an option if such libraries exist and provide sufficient performance.
    In this talk I'll illustrate the difficulty of generating binary code for this application domain with practical examples of code generation. Then I'll present deGoal, a tool developed in-house to address these problems.}
    }
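
deGoal emits native machine code at run time, specialized for the data at hand; that cannot be shown faithfully in a few lines. As a loose Python analogy only, the same data-dependent specialization idea can be sketched by generating and compiling source code at run time once a constant becomes known (the function name and the SAXPY example are inventions of this sketch, not deGoal's API):

```python
def specialize_saxpy(a):
    """Generate, at run time, a kernel specialized for the now-known
    constant `a`. This is a Python analogy for data-dependent runtime
    code generation; deGoal itself emits native machine code and can
    exploit the multimedia instruction sets discussed in the talk."""
    # The constant is baked into the source, so the specialized kernel
    # carries no runtime parameter lookup for `a`.
    src = (
        f"def kernel(x, y):\n"
        f"    return [{a} * xi + yi for xi, yi in zip(x, y)]\n"
    )
    namespace = {}
    exec(src, namespace)   # compile the specialized source
    return namespace["kernel"]
```

For example, `specialize_saxpy(2.0)` returns a kernel with the multiplier 2.0 hard-wired in; a native code generator would additionally pick instructions based on such constants (e.g. replacing a multiply by a shift).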

Posted in Dissemination
PENCIL: a Platform-Neutral Compute Intermediate Language for Accelerator Programming

  • Riyadh Baghdadi, Ulysse Beaugnon, Albert Cohen, Tobias Grosser, Michael Kruse, Chandan Reddy, Sven Verdoolaege, Mohammed Javed Absar, Sven Van Haastregt, Alexey Kravets, Anton Lokhmotov, Robert David, Elnar Hajiyev, Adam Betts, Alastair Donaldson, and Jeroen Ketema. PENCIL: a Platform-Neutral Compute Intermediate Language for Accelerator Programming. In Proceedings of The 24th International Conference on Parallel Architectures and Compilation Techniques (PACT 2015), pages 138-149, San Francisco, California, USA, 2015. doi:10.1109/PACT.2015.17
    [BibTeX] [Abstract] [Download PDF]

    Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99—enriched with additional language constructs—that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators. PENCIL aims to serve both as a portable implementation language for libraries, and as a target language for DSL compilers. We implemented a PENCIL-to-OpenCL backend using a state-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affine array accesses, generates optimized OpenCL code. To demonstrate the potential and performance portability of PENCIL and the PENCIL-to-OpenCL compiler, we consider a number of image processing kernels, a set of benchmarks from the Rodinia and SHOC suites, and DSL embedding scenarios for linear algebra (BLAS) and signal processing radar applications (SpearDE), and present experimental results for four GPU platforms: AMD Radeon HD 5670 and R9 285, NVIDIA GTX 470, and ARM Mali-T604.

    @InProceedings{2015-10-BAGHDADI,
    author = {Riyadh Baghdadi and Ulysse Beaugnon and Albert Cohen and Tobias Grosser and Michael Kruse and Chandan Reddy and Sven Verdoolaege and Mohammed Javed Absar and Sven Van Haastregt and Alexey Kravets and Anton Lokhmotov and Robert David and Elnar Hajiyev and Adam Betts and Alastair Donaldson and Jeroen Ketema},
    title = {{PENCIL: a Platform-Neutral Compute Intermediate Language for Accelerator Programming}},
    booktitle = {{Proceedings of The 24th International Conference on Parallel Architectures and Compilation Techniques (PACT 2015)}},
    date = {2015-10-18/2015-10-21},
    address = {San Francisco, California, USA},
    url = {http://www.ketema.eu/publ/pencil.pdf},
    abstract = {Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99—enriched with additional language constructs—that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators. PENCIL aims to serve both as a portable implementation language for libraries, and as a target language for DSL compilers.
    We implemented a PENCIL-to-OpenCL backend using a state-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affine array accesses, generates optimized OpenCL code. To demonstrate the potential and performance portability of PENCIL and the PENCIL-to-OpenCL compiler, we consider a number of image processing kernels, a set of benchmarks from the Rodinia and SHOC suites, and DSL embedding scenarios for linear algebra (BLAS) and signal processing radar applications (SpearDE), and present experimental results for four GPU platforms: AMD Radeon HD 5670 and R9 285, NVIDIA GTX 470, and ARM Mali-T604.},
    doi = {10.1109/PACT.2015.17},
    pages = {138--149},
    year = {2015}
    }

Posted in Dissemination
Using Transactional Memory to Avoid Blocking in OpenMP Synchronization Directives

  • Lars Bonnichsen and Artur Podobas. Using Transactional Memory to Avoid Blocking in OpenMP Synchronization Directives. In Christian Terboven, Bronis R. de Supinski, Pablo Reble, Barbara M. Chapman, and Matthias S. Müller, editors, OpenMP: Heterogenous Execution and Data Movements, Proceedings of the 11th International Workshop on OpenMP (IWOMP), pages 149-161, Aachen, Germany, 2015. Springer. doi:10.1007/978-3-319-24595-9_11
    [BibTeX] [Abstract]

    OpenMP applications with abundant parallelism are often characterized by their high performance. Unfortunately, OpenMP applications with a lot of synchronization or serialization points perform poorly because of blocking, i.e. the threads have to wait for each other. In this paper, we present methods based on hardware transactional memory (HTM) for executing OpenMP barrier, critical, and taskwait directives without blocking. Although HTM is still relatively new in the Intel and IBM architectures, we experimentally show a 73 % performance improvement over traditional locking approaches, and 23 % better than other HTM approaches on critical sections. Speculation over barriers can decrease execution time by up to 41 %. We expect that future systems with HTM support and more cores will have a greater benefit from our approach as they are more likely to block.

    @InProceedings{2015-10-BONNICHSEN,
    author = {Lars Bonnichsen and Artur Podobas},
    editor = {Christian Terboven and Bronis R. de Supinski and Pablo Reble and Barbara M. Chapman and Matthias S. M{\"u}ller},
    title = {{Using Transactional Memory to Avoid Blocking in OpenMP Synchronization Directives}},
    booktitle = {{OpenMP: Heterogenous Execution and Data Movements, Proceedings of the 11th International Workshop on OpenMP (IWOMP)}},
    date = {2015-10-01/2015-10-02},
    year = {2015},
    publisher = {Springer},
    address = {Aachen, Germany},
    pages = {149--161},
    doi = {10.1007/978-3-319-24595-9_11},
    abstract = {OpenMP applications with abundant parallelism are often characterized by their high performance. Unfortunately, OpenMP applications with a lot of synchronization or serialization points perform poorly because of blocking, i.e. the threads have to wait for each other. In this paper, we present methods based on hardware transactional memory (HTM) for executing OpenMP barrier, critical, and taskwait directives without blocking. Although HTM is still relatively new in the Intel and IBM architectures, we experimentally show a 73 % performance improvement over traditional locking approaches, and 23 % better than other HTM approaches on critical sections. Speculation over barriers can decrease execution time by up to 41 %. We expect that future systems with HTM support and more cores will have a greater benefit from our approach as they are more likely to block.}
    }
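
Python has no hardware transactional memory, so the paper's mechanism cannot be demonstrated directly. What can be sketched is the control-flow pattern HTM implementations typically use: attempt the critical section speculatively a bounded number of times, then fall back to the conventional lock. In this analogy a non-blocking lock acquire stands in for starting a hardware transaction; the class and method names are inventions of this sketch:

```python
import threading

class ElidedLock:
    """Sketch of the speculate-then-fallback usage pattern behind
    HTM-based lock elision. A failed non-blocking acquire models a
    transaction abort; real HTM would run the critical section
    transactionally without taking the lock at all."""
    def __init__(self, max_tries=3):
        self._lock = threading.Lock()
        self.max_tries = max_tries

    def run(self, critical_section):
        for _ in range(self.max_tries):
            # "Speculative" attempt: do not block if contended.
            if self._lock.acquire(blocking=False):
                try:
                    return critical_section()
                finally:
                    self._lock.release()
        # Bounded speculation failed: pessimistic blocking fallback,
        # mirroring the HTM fallback path after repeated aborts.
        with self._lock:
            return critical_section()
```

The bound on speculative tries matters in practice: transactions that keep aborting (e.g. due to capacity or contention) must eventually take the fallback path or the program never makes progress.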

Posted in Dissemination