Using graphics processors for image compression
The CUDA parallel computing platform and
application programming interface (API) allows
fast JPEG image compression on GPUs.
Image compression plays a vital
part in many imaging systems by reducing the
amount of data needed to store and/or transmit
image data. While many different methods
exist to perform such image compression, perhaps the best known and most widely adopted of
these is the baseline JPEG standard.
Originally developed by the Joint Photographic Experts Group (JPEG; https://jpeg.org),
a working group of both the International
Organization for Standardization (ISO; Geneva,
Switzerland; www.iso.org) and the International Electrotechnical Commission (IEC;
Geneva, Switzerland; www.iec.ch), the baseline JPEG standard is a lossy form of compression based on the discrete
cosine transform (DCT).
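To make the transform concrete, the following is a minimal C++ sketch of the 8x8 forward DCT (DCT-II) on which baseline JPEG is built. The names `Block8x8` and `forwardDct8x8` are hypothetical and chosen for illustration only. The direct four-loop form is used for clarity, whereas production encoders typically use faster factorizations, and the level shift that JPEG applies to samples before the transform is omitted here.

```cpp
#include <array>
#include <cmath>

// One 8x8 block of samples (input) or coefficients (output).
using Block8x8 = std::array<std::array<double, 8>, 8>;

// Naive 2-D DCT-II of a single 8x8 block.
// Illustrative sketch only; not an optimized or complete JPEG stage.
Block8x8 forwardDct8x8(const Block8x8& in) {
    const double pi = std::acos(-1.0);
    Block8x8 out{};
    for (int u = 0; u < 8; ++u) {
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int x = 0; x < 8; ++x) {
                for (int y = 0; y < 8; ++y) {
                    sum += in[x][y]
                         * std::cos((2 * x + 1) * u * pi / 16.0)
                         * std::cos((2 * y + 1) * v * pi / 16.0);
                }
            }
            // Normalization factors: 1/sqrt(2) for the zero-frequency index.
            const double cu = (u == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            const double cv = (v == 0) ? 1.0 / std::sqrt(2.0) : 1.0;
            out[u][v] = 0.25 * cu * cv * sum;
        }
    }
    return out;
}
```

For a flat block in which every sample equals 10, this sketch produces a DC coefficient of 80 at `out[0][0]` and AC coefficients that are zero to within rounding error. That concentration of energy into a few coefficients is what allows smooth image regions to be encoded compactly.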
Although a lossless version of the standard
does exist, it has not been widely adopted. However,
since the baseline JPEG standard can achieve 15:1 compression with
little perceptible loss in image quality, such
image compression is acceptable in many
image storage and transmission systems.
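As a rough worked example of what a 15:1 ratio means in practice, the C++ sketch below computes raw and compressed sizes. The frame size used in the note that follows (1920 x 1080 pixels at 3 bytes per pixel) is an assumption for illustration and does not come from the article, and the function names are hypothetical.

```cpp
#include <cstdint>

// Size in bytes of an uncompressed image.
std::uint64_t rawImageBytes(std::uint64_t width, std::uint64_t height,
                            std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Approximate compressed size in bytes for a given compression ratio
// (pass 15.0 for 15:1). Actual JPEG output size depends on image content.
double compressedBytes(std::uint64_t rawBytes, double ratio) {
    return static_cast<double>(rawBytes) / ratio;
}
```

Under those assumptions, `rawImageBytes(1920, 1080, 3)` gives 6,220,800 bytes, and `compressedBytes(6220800, 15.0)` gives 414,720 bytes, so a frame of roughly 6.2 MB shrinks to roughly 0.4 MB.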
In the past, JPEG image compression was
performed on either host PCs or digital signal
processors (DSPs). Today, with the advent of
graphics processors such as the TITAN and
GeForce series from NVIDIA (Santa Clara, CA, USA;
www.nvidia.com) that contain hundreds of processor cores,
image compression can be performed much
faster (Figure 1). Using the company’s Compute Unified Device
Architecture (CUDA), developers can now use
an application programming interface (API) to
build image compression
applications using C/C++.
Because CUDA provides a software
abstraction of the GPU's underlying hardware
and the baseline JPEG compression standard
can be somewhat parallelized, the baseline
JPEG compression process can be split into
threads that act as individual programs, working in the same memory space and executing concurrently.
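The idea of independent threads sharing one memory space can be sketched on the CPU with standard C++ threads. This is an analogy only, not CUDA code and not the encoder described in this article. The function names are hypothetical, and `processBlock` is a stand-in for real per-block work such as the DCT. The sketch assumes a grayscale image whose width and height are multiples of 8.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Stand-in for per-block work: sums the 64 samples of one 8x8 block.
std::uint32_t processBlock(const std::vector<std::uint8_t>& image,
                           std::size_t width, std::size_t bx, std::size_t by) {
    std::uint32_t sum = 0;
    for (std::size_t y = 0; y < 8; ++y)
        for (std::size_t x = 0; x < 8; ++x)
            sum += image[(by * 8 + y) * width + (bx * 8 + x)];
    return sum;
}

// Processes every 8x8 block using numThreads worker threads (numThreads >= 1).
// All threads read the same input image and write to disjoint slots of one
// shared results vector, so no locking is required.
std::vector<std::uint32_t> processBlocksParallel(
        const std::vector<std::uint8_t>& image,
        std::size_t width, std::size_t height, unsigned numThreads) {
    const std::size_t blocksX = width / 8;
    const std::size_t blocksY = height / 8;
    const std::size_t total = blocksX * blocksY;
    std::vector<std::uint32_t> results(total);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            // Thread t handles blocks t, t + numThreads, t + 2 * numThreads, ...
            for (std::size_t i = t; i < total; i += numThreads)
                results[i] = processBlock(image, width, i % blocksX, i / blocksX);
        });
    }
    for (auto& w : workers) w.join();  // Wait for all blocks to finish.
    return results;
}
```

Because each 8x8 block is processed without reference to its neighbors, the work divides cleanly among threads. A GPU applies the same principle at a much larger scale, with far more threads than a CPU can run at once.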
Before an image can be compressed, however, it must be transferred to the CPU’s host
memory and then to the GPU memory. To
capture image data, standards such as GenICam
from the European Machine Vision
Association (EMVA; Barcelona, Spain; www.emva.org)
provide a generic interface for GigE
Vision, USB3 Vision, CoaXPress, Camera
Figure 1: With 1536 CUDA cores (processors)
and a 1 GHz base clock rate, the GeForce GTX
680 from NVIDIA can JPEG-encode raw Bayer
data camera images at a rate of 500 million
Philippe Candelier, Chief
Technical Officer (CTO), NorPix
(Montréal, QC, Canada; www.