Ranga Siriwardena's Blog

Showing posts from October, 2010

Beyond CUDA - OpenCL

OpenCL is a standardized, cross-platform parallel computing API that allows the development of portable parallel applications for systems with heterogeneous devices. Like CUDA, it is based on the C language, and it standardizes the parallel programming approach across the rapidly growing range of parallel computing platforms. For example, if we develop an application with CUDA, it is hard to run that application on hardware from a vendor other than Nvidia. OpenCL removes this vendor specificity, but at the cost of a more complex platform and device management model. Because of this multi-vendor portability, OpenCL's device management, kernel compilation, and kernel launch are more involved than their CUDA counterparts. Even so, this standardization should lead to further improvements in parallel computing with heterogeneous devices. References: OpenCL Programming Guide for the CUDA Architecture, Version 3.1, 2010. NVIDIA CUDA Programming Guide, Version 2.3.1, 2009.

Compute Unified Device Architecture (CUDA)

Nvidia introduced its general-purpose parallel computing architecture, CUDA, in 2006. It provides a new parallel programming model and instruction set architecture for Nvidia GPUs, and it comes with a software environment that lets programmers use C as a high-level programming language to solve computationally demanding problems more efficiently. CUDA provides three key abstractions: a hierarchy of thread groups, shared memories, and barrier synchronization. These are exposed to the developer as a minimal set of language extensions. They provide fine-grained data parallelism and thread parallelism, nested within coarse-grained data parallelism and task parallelism. These abstractions also help the developer partition a task into coarse subtasks that can be solved independently in parallel, and then into finer pieces that can be solved cooperatively in parallel. With CUDA large numbers of processor cores can be used to transparently sc
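As a rough illustration of the three abstractions named in this excerpt, here is a minimal sketch of a sum over an array. It is not taken from the original post: the names `blockSum`, `sumOnDevice`, and `THREADS_PER_BLOCK`, and the block size of 256, are assumptions made for the example. Each thread block handles its own slice of the input independently (the coarse subtask), while the threads inside a block cooperate through shared memory and meet at `__syncthreads()` barriers (the finer pieces).

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

#define THREADS_PER_BLOCK 256  /* assumed block size; must be a power of two here */

/* Each block sums its own slice of the input (coarse, independent subtask).
   Threads inside the block cooperate through shared memory (fine-grained part). */
__global__ void blockSum(const float *in, float *partial, int n)
{
    __shared__ float cache[THREADS_PER_BLOCK];        /* one copy per block */

    int tid = threadIdx.x;                            /* index inside the block */
    int gid = blockIdx.x * blockDim.x + threadIdx.x;  /* index in the whole grid */

    cache[tid] = (gid < n) ? in[gid] : 0.0f;          /* pad the last block with zeros */
    __syncthreads();   /* barrier: every load is visible before reducing */

    /* Tree reduction: halve the number of active threads at every step. */
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            cache[tid] += cache[tid + stride];
        __syncthreads();   /* barrier: finish this step before the next one */
    }

    if (tid == 0)
        partial[blockIdx.x] = cache[0];               /* one partial sum per block */
}

/* Hypothetical host helper: writes the sum of n floats to *out.
   Returns 0 on success and -1 on any allocation or CUDA error. */
int sumOnDevice(const float *hostIn, int n, float *out)
{
    if (n <= 0) { *out = 0.0f; return 0; }

    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    float *devIn = NULL, *devPartial = NULL;
    float *hostPartial = (float *)malloc(blocks * sizeof(float));
    int status = -1;

    if (hostPartial != NULL &&
        cudaMalloc((void **)&devIn, n * sizeof(float)) == cudaSuccess &&
        cudaMalloc((void **)&devPartial, blocks * sizeof(float)) == cudaSuccess &&
        cudaMemcpy(devIn, hostIn, n * sizeof(float),
                   cudaMemcpyHostToDevice) == cudaSuccess) {

        /* Launch a grid of blocks; each block runs THREADS_PER_BLOCK threads. */
        blockSum<<<blocks, THREADS_PER_BLOCK>>>(devIn, devPartial, n);

        /* This copy waits for the kernel and also surfaces launch errors. */
        if (cudaMemcpy(hostPartial, devPartial, blocks * sizeof(float),
                       cudaMemcpyDeviceToHost) == cudaSuccess) {
            float total = 0.0f;
            for (int b = 0; b < blocks; ++b)
                total += hostPartial[b];              /* combine partial sums on the host */
            *out = total;
            status = 0;
        }
    }

    cudaFree(devIn);       /* cudaFree(NULL) is a harmless no-op */
    cudaFree(devPartial);
    free(hostPartial);
    return status;
}
```

On a machine with a CUDA-capable GPU, calling `sumOnDevice` on an array of 1000 ones should return 0 and store 1000 in `*out`. Finishing the last step on the host keeps the sketch short; a fuller version might reduce the partial sums on the device as well.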

GPGPU - Next generation of high performance computing

Over the past few years, the GPU (Graphics Processing Unit) has become competitive computing hardware against the CPU (Central Processing Unit) because of its rapidly increasing performance and capabilities. Recent improvements in the GPU's highly parallel programming capabilities have allowed a wide variety of complex general-purpose applications to be mapped onto it with tremendous performance improvements. This use of the GPU is called General Purpose Computation on Graphics Processors (GPGPU), and it is leading the GPU toward the next generation of high performance computing. Despite the relatively recent introduction of general-purpose GPU hardware, various kinds of applications have begun to take advantage of the GPU. GPU techniques have proven successful in diverse application areas such as image processing, video processing, scientific computing, bioinformatics, computer vision, neural networks, and database operations. For this reason many researches ar