IP Cores for accelerating JPEG2000
by Cantineau Olivier, BARCO SILEX
Louvain-la-Neuve, Belgium

Abstract: This paper presents BARCO SILEX’s IP solutions for accelerating picture compression and decompression with the recent JPEG2000 algorithm. The algorithm is briefly explained and the structure of the IP cores is detailed. Finally, implementation and performance results are presented for various FPGA and ASIC technologies.

INTRODUCTION

JPEG2000 [1] is the latest algorithm from the JPEG standardization committee for still picture compression. It is based on wavelet technology and is very different from its predecessor. It features a large set of capabilities that will allow it to be adopted in a wide spectrum of applications, even extending to video encoding. In return, this compression scheme requires much more computational power than its classical JPEG predecessor, which makes software implementations poor candidates for applications requiring very short encoding times.

To address high-performance applications, BARCO SILEX has developed two JPEG2000 accelerator IP cores: the BA112JPEG2000E encoder and the BA111JPEG2000D decoder. These cores perform all the computationally intensive tasks of the JPEG2000 algorithm. Coupled to a host CPU, they allow a complete JPEG2000 encoding or decoding solution to be built.

This paper describes the structure of the BARCO SILEX JPEG2000 IP cores. The cores have been developed for sustained high-speed processing and feature a state-of-the-art pipelined parallel architecture. Several entropy encoders are implemented in parallel in order to increase the overall pixel throughput.

The first section gives an overview of the JPEG2000 algorithm. The second section details the architecture of the IP cores. Finally, the third section gives implementation and performance results on FPGA and ASIC technologies.

THE JPEG2000 ALGORITHM

JPEG2000 is based on an algorithm offering a wide range of tools for compressing and representing images.
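These are suitable for a large spectrum of applications such as Internet streaming, medical imaging and digital cameras. This algorithm encompasses various capabilities, among them progressive bitstreams, precise rate control, region of interest, and high-quality lossless and lossy compression.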
To support this rich set of features, the JPEG2000 algorithm implements two consecutive processing stages that are explained in the following sections.

First Stage: Wavelet-Based Compression

To apply JPEG2000 compression, the image can be divided into rectangular tiles of any size, each of which undergoes a 2-D wavelet transform [2] as illustrated in Figure 1. The wavelet transform is an iterative decorrelating operation that decomposes a tile into a series of smaller pictures (subbands). Each subband contains tile information limited to a given frequency range (including low-pass). Each level of wavelet decomposition builds four subbands from the low-pass subband produced by the previous decomposition level.

Figure 1: Illustration of wavelet decomposition of a square tile (“L” means result of low-pass filtering in a given direction, horizontal or vertical; “H” means result of high-pass filtering in a given direction; combining both letters yields the four 2-D filtering combinations)

In Figure 1, subbands 1HL, 1LH, 1HH and 1LL are the result of the wavelet decomposition applied to the complete tile; subbands 2HL, 2LH, 2HH and 2LL are the result of the wavelet decomposition applied to subband 1LL; and so on. This process groups information from the same frequency range together, which allows the quantization of these data to be weighted selectively.

Each subband can undergo separate quantization by a programmable factor for lossy compression. Bypassing the quantization yields lossless operation.

The resultant quantized subbands are further divided into smaller rectangular blocks (code blocks), which are separately entropy encoded. This process is achieved by a Modeler and an MQ-coder, which is an adaptive Arithmetic Encoder. The Modeler examines all bit planes of the current code block, starting from the most significant non-zero bit plane. It scans the current bit plane in a zigzag order over stripes of four rows (column by column within each stripe), with three passes per plane.
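The bit-plane scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Modeler of the BARCO SILEX cores: the function names are hypothetical, the code block is a plain list of integer rows, and the three coding passes and context computation are left out. The stripe height of four rows follows the scan pattern of JPEG 2000 Part 1.

```python
def top_bit_plane(block):
    """Index of the most significant non-zero bit plane (-1 if the block is all zero)."""
    peak = max(abs(v) for row in block for v in row)
    return peak.bit_length() - 1


def stripe_scan(width, height, stripe_height=4):
    """Yield (row, col) positions stripe by stripe, column by column inside a stripe."""
    for top in range(0, height, stripe_height):
        for col in range(width):
            for row in range(top, min(top + stripe_height, height)):
                yield row, col


def bit_plane(block, plane):
    """Magnitude bits of one bit plane, listed in scan order (hypothetical helper)."""
    height, width = len(block), len(block[0])
    return [(abs(block[r][c]) >> plane) & 1 for r, c in stripe_scan(width, height)]


# Example: a tiny 2x2 code block of quantized coefficients (made-up values).
block = [[5, -2],
         [0, 9]]
start = top_bit_plane(block)
planes = [bit_plane(block, p) for p in range(start, -1, -1)]
```

An encoder would visit `planes` from first to last, that is, from the most significant non-zero plane down to plane 0.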
During each pass, it computes a context for the current bit. The context reflects the predominant value of the neighboring bits. The adaptive Arithmetic Encoder finally encodes each scanned bit using a probability value derived from the associated context. The Arithmetic Encoder updates its probability tables after each bit encoding. The Modeler also computes compression metrics reflecting the image distortion incurred by reconstructing the code block from only its currently encoded portion. This information is used by the second stage as described below.

Second Stage (Tier-2): Packet Selection And Reordering

The codestream generated by the arithmetic encoder, together with the distortion metrics, allows the JPEG2000 post-processing stage to selectively build the final bitstream for a given compression ratio and progression order. This stage organizes the resultant packets to minimize the overall distortion while trying to attain the specified compression ratio. This allows precise control of the compressed file size while maintaining good image quality.

Moreover, JPEG2000 standardizes various orders of packet inclusion in the bitstream. This allows several kinds of bitstream progression (e.g. by resolution or by quality), which enable a fast preview of a picture from a first portion of the bitstream and further image refinements by decoding subsequent parts of the compressed file.

DESCRIPTION OF THE PROPOSED JPEG2000 IP CORES

Due to its powerful capabilities, JPEG2000 requires more computational resources than classical JPEG to achieve similar encoding and decoding speeds. Hardware solutions are therefore required for high-speed applications. BARCO SILEX has developed two IP cores able to accelerate the JPEG2000 encoding and decoding operations. These cores perform all computationally intensive tasks of the JPEG2000 algorithm by integrating the following operations: wavelet transform, quantization and arithmetic coding.
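As an illustration of the second-stage packet selection described above, the following minimal sketch greedily keeps the coding passes that give the largest distortion reduction per byte until a byte budget is exhausted. It is a simplification under stated assumptions, not the rate-control method of the standard or of any BARCO SILEX software: the function name `select_passes`, the `(size_bytes, distortion_drop)` tuples and the example numbers are all hypothetical, and every pass is assumed to have a size greater than zero.

```python
def select_passes(blocks, budget_bytes):
    """Greedy rate allocation sketch.

    blocks[i] is the ordered list of coding passes of code block i, each given
    as (size_bytes, distortion_drop). Passes of a block can only be kept in
    order. Returns (passes kept per block, total bytes used).
    """
    kept = [0] * len(blocks)
    used = 0
    while True:
        best, best_slope = None, 0.0
        for i, passes in enumerate(blocks):
            if kept[i] == len(passes):
                continue  # this block is already fully included
            size, drop = passes[kept[i]]
            if used + size > budget_bytes:
                continue  # next pass of this block does not fit
            slope = drop / size  # distortion reduction per byte
            if slope > best_slope:
                best, best_slope = i, slope
        if best is None:
            break  # nothing else fits or improves the image
        used += blocks[best][kept[best]][0]
        kept[best] += 1
    return kept, used


# Two hypothetical code blocks and a 25-byte budget.
example = [[(10, 100.0), (10, 20.0)],
           [(5, 80.0), (20, 10.0)]]
kept, used = select_passes(example, 25)
```

In this example the first block keeps both passes and the second keeps one, using the full 25 bytes. A production encoder would typically restrict the candidates to truncation points on the convex hull of each block's rate-distortion curve instead of evaluating pass by pass.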
The cores are designed as accelerating coprocessors in a complete JPEG2000 encoding or decoding system. Indeed, the Tier-2 part of the JPEG2000 algorithm is more suitably executed by a software routine running on a host processor.

Figure 2 shows a block diagram of the BA112JPEG2000E IP core designed by BARCO SILEX. It illustrates the main functional modules and a simplified view of the interfaces. Pixel data are input through the Pixel Interface, and compressed streams are made available at the Compressed Interfaces together with distortion metrics. The core features a simple generic CPU Interface suited to attaching it as a bus peripheral to various processors. The next sub-sections describe the modules constituting the BA112JPEG2000E core as depicted in Figure 2.
Figure 2: Block diagram of the BA112JPEG2000E IP core
The 1-D decomposition filters are based on a state-of-the-art lifting scheme [3], reducing the gate count. These engines also implement symmetric border extensions as specified by JPEG2000 in order to reduce border artifacts.
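To make the lifting idea concrete, here is a minimal 1-D sketch in Python. It assumes the reversible 5/3 filter of JPEG 2000 Part 1 with whole-sample symmetric extension; the paper does not state which filters the cores implement, so this is an illustration of the technique and not the cores' datapath. The function names are hypothetical and the signal is assumed to start at an even index.

```python
def _mirror(i, n):
    """Whole-sample symmetric extension: reflect index i into [0, n-1] (n >= 2)."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def forward_53(x):
    """One level of reversible 5/3 lifting; returns (low-pass, high-pass) lists."""
    n = len(x)
    if n == 1:
        return list(x), []
    y = list(x)
    # Predict step: each odd sample becomes a high-pass (detail) coefficient.
    for i in range(1, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    # Update step: each even sample becomes a low-pass coefficient.
    for i in range(0, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    return y[0::2], y[1::2]


def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse order."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2], y[1::2] = low, high
    if n == 1:
        return y
    for i in range(0, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    for i in range(1, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    return y


signal = [10, 12, 14, 13, 9, 7, 8]  # made-up sample values
low, high = forward_53(signal)
restored = inverse_53(low, high)
```

Each lifting step needs only additions and shifts on neighboring samples and works in place, which is consistent with the reduced gate count mentioned above. Because every step is undone exactly by its mirror image, `restored` equals `signal`.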
Table 1: Performance Summary
1) Worst case condition, foundry library (fsa0a_a), Tj=125°C, pre-layout timing

CONCLUSION

BARCO SILEX introduces its BA112JPEG2000E and BA111JPEG2000D IP cores targeted at high-speed JPEG2000 encoding and decoding. These cores give access to the broad capabilities of the JPEG2000 Standard. Indeed, this Standard defines an algorithm able to offer a wide spectrum of features such as progressive bitstream, precise rate control, region of interest, and high-quality lossless and lossy compression. This rich set of advantages makes JPEG2000 an important player in the compression world.

However, due to its computational complexity, hardware platforms are required for high-speed applications whose timings must be compatible with real-time video encoding. BARCO SILEX takes up this challenge with its BA111JPEG2000D and BA112JPEG2000E IP cores. These are highly optimized solutions for ASIC and FPGA technologies, acting as efficient accelerators in a complete JPEG2000 compression system.

For more information about BARCO SILEX IP cores visit www.barco-silex.com.

REFERENCES

[1] ISO/IEC 15444-1, Information Technology – JPEG 2000 image coding system – Part 1: Core coding system.

[2] S. Mallat, A Theory for Multiresolution Signal Decomposition: The Wavelet Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 761-765, July 1989.

[3] W. Sweldens, Wavelets and the Lifting Scheme: a 5-Minute Tour, Z. Angew. Math. Mech., vol. 76, no. 2, pp. 41-44, 1996.
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved. |