By Nicolas Fau, Guy Lecurieux Lafayette, R-Interface
Summary
R-Interface’s LDPC decoder platform offers wireless and wireline hardware designers an off-the-shelf, easy-to-integrate and proven solution with full standard support for Mobile Wimax (802.16e), Wifi (802.11n), 10 Gbit Ethernet (802.3an), DVB and many emerging standards covering space, optical and storage applications as well as high data rate power line communication. This article focuses on the Mobile Wimax standard as an example of an LDPC decoder implementation for base stations.
1. Introduction
Mobile Wimax is coming! Today’s demonstrations show that it is no longer a technology we simply talk about. There is still a lot to do, however, to enable high throughput and high transmission quality in very disturbed environments. The 802.16e standard, known as the Mobile Wimax standard, integrates various coding schemes in its physical layer specification, including the most efficient ones, LDPC codes. This article first presents the physical layer basics, then focuses on error correcting codes, and finally details our implementation.
2. Phy layer and LDPC
The physical layer is the most basic network layer: layer one in both the OSI model of computer networking and the TCP/IP reference model. It performs services requested by the data link layer and determines the bit rate in bit/s (also referred to as channel capacity, maximum throughput or connection speed), the modulation schemes and the channel coding classes.
Fig1 shows the most critical sub-blocks implemented in the Mobile Wimax physical layer: MIMO (Multiple-Input Multiple-Output) coding, OFDMA (Orthogonal Frequency-Division Multiple Access) modulation and the LDPC (Low-Density Parity-Check) codec used for FEC (Forward Error Correction).
[Fig1 block labels: Physical layer chipset; PHY sub-blocks: MIMO codec, Mobile OFDMA modem, LDPC codec, Controller]

 Fig1: PHY layer in Wireless mobile environment 
These three technologies appear in new standards today to enable applications and services that legacy modulation schemes could not support. But why did it take so long, when OFDM and LDPC are theoretical approaches nearly half a century old? Their complexity, and thus the number of gates required, made it impossible to implement them in a reasonably sized chipset. Now that 45 nanometer cell libraries are commonplace, the situation has changed and these technologies can be integrated together.
Here is a brief description of the three, with their main benefits:
- OFDMA (Orthogonal Frequency-Division Multiple Access) is a modulation technology that transmits multiple digital signals simultaneously on different radio frequency carriers. One of its main characteristics is its strong resistance to multipath distortion, which is one of the keys to wideband transmission. Multipath occurs when the signal received at the antenna arrives from different directions: the same emitted signal takes several paths to reach the antenna, reflecting off mountains, buildings and moving vehicles, and arrives at the destination several times with different attenuations and distortions. The same problem occurs inside buildings, where the emitted signal reflects off the walls and reaches the destination at various times and with various attenuations. Thanks to its frequency spreading, combined with the insertion of a cyclic prefix and accurate channel estimation, OFDM is able to cope with such attenuations and distortions. Multiple access is achieved in OFDMA by assigning subsets of sub-carriers to individual users, which allows simultaneous low data rate transmission from several users.
OFDM was first introduced into mass-market technologies with the DVB-T standard in the 1990s. It is now part of many standards such as DVB-H, 802.11a/g/n and 802.16e.
- OFDMA modulation is now coupled with MIMO (Multiple-Input Multiple-Output) coding schemes to increase channel capacity. MIMO uses space diversity to reduce the effect of the transmission paths. Now that multipath distortion is no longer a problem, the signal is deliberately sent several times by separate antennas so that it takes different paths, which lowers the probability that it is unrecoverable at reception. Multiple antennas are also used at reception with similar benefits, although it is harder to fit many antennas on a single device. MIMO makes systems more reliable, which in turn allows a higher channel capacity or longer transmission distances. It is part of new standards such as 802.11n and 802.16e.
- Now that echo effects and other distortions have been reduced, the remaining errors must be corrected to give the application an error-free, or nearly error-free, bit stream. The first two techniques deliver to the Forward Error Correction decoder a value, and a confidence associated with this value, for every transmitted bit.
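The "value plus confidence" pair handed to the decoder is commonly expressed as a log-likelihood ratio (LLR). The sketch below is only an illustration under simplifying assumptions that are not taken from the 802.16e specification: BPSK mapping (bit 0 sent as +1, bit 1 as -1), an additive white Gaussian noise channel and a known noise variance. The function names are hypothetical.

```python
def bpsk_llr(received, noise_variance):
    """LLR of one received sample, assuming BPSK (0 -> +1, 1 -> -1) over AWGN.

    The sign carries the most likely bit value and the magnitude carries
    the confidence in that value.
    """
    return 2.0 * received / noise_variance


def hard_decision(llr):
    """Most likely bit for a given LLR (positive LLR means bit 0)."""
    return 0 if llr >= 0 else 1


# Illustrative received samples: two clean ones and two close to zero.
samples = [0.9, -1.1, 0.1, -0.05]
noise_variance = 0.5  # assumed to be known at the receiver

for y in samples:
    llr = bpsk_llr(y, noise_variance)
    print(f"sample {y:+.2f} -> bit {hard_decision(llr)}, confidence {abs(llr):.2f}")
```

Samples close to zero produce small LLR magnitudes, which tells the decoder that these bits are the most likely to be wrong. A real Mobile Wimax receiver computes such values per bit from higher-order constellations after channel equalization.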
Thanks to the redundancy added to the useful data at emission, the decoder is able to turn a stream with a 10^-2 Packet Error Rate (PER) into a stream with a PER of 10^-6 or better. Achieving this with a realistic silicon implementation calls for LDPC codes. Low-Density Parity-Check codes were first developed by Gallager in the 1960s and, although their performance was remarkable, they remained largely unnoticed for the next few decades. The reason generally given for this neglect is that in the 1960s a practical hardware implementation would have seemed unrealistically complex. Since the rediscovery of LDPC codes in the 1990s, however, a lot of research has been aimed at finding and creating good LDPC codes. LDPC codes have even been designed with a Bit Error Rate (BER) performance within 0.0045 dB of the Shannon limit.
That makes them the Forward Error Correction code of choice for reaching the higher channel capacity of Mobile Wimax.
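To make the idea of parity-check decoding concrete, here is a minimal sketch of hard-decision bit flipping, one of the simplest iterative schemes in the family introduced by Gallager. The 4x6 matrix H is a toy example chosen for readability; it is not one of the 802.16e matrices, and the algorithm is not the one implemented in the R-Interface core. Practical decoders work on soft values with message-passing algorithms.

```python
# Toy parity-check matrix: each column (bit) takes part in 2 checks,
# each row (check) involves 3 bits. Hypothetical, for illustration only.
H = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]


def syndrome(H, bits):
    """One entry per parity check: 0 if satisfied, 1 if violated."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def bit_flip_decode(H, received, max_iterations=10):
    """Flip the bits involved in the most violated checks until all checks pass.

    Returns (bits, iterations_used, success).
    """
    bits = list(received)
    for iteration in range(max_iterations):
        s = syndrome(H, bits)
        if not any(s):
            return bits, iteration, True
        # For every bit, count the violated checks it takes part in.
        failed = [
            sum(s[i] for i in range(len(H)) if H[i][j])
            for j in range(len(bits))
        ]
        worst = max(failed)
        bits = [b ^ 1 if f == worst else b for b, f in zip(bits, failed)]
    return bits, max_iterations, not any(syndrome(H, bits))


codeword = [1, 1, 0, 1, 0, 0]   # satisfies all four checks of H
received = [1, 1, 0, 1, 1, 0]   # same word with bit 4 corrupted
decoded, iterations, ok = bit_flip_decode(H, received)
print(decoded, iterations, ok)
```

In this toy code a single corrupted bit violates both of its checks while every other bit sees at most one violated check, so one flip restores the codeword. The iteration limit is an explicit parameter, since more iterations give the algorithm more chances to converge at the cost of throughput.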
3. R-Interface LDPC Simulation platform for high quality IP
As mentioned above, error correcting codes (channel coding) are the last operations of the physical layer and correct a large number of the errors introduced by transmission distortions, Doppler effects and echoes.
R-Interface developed a multi-standard software simulation platform to analyze the capacity of LDPC codes and determine the best decoding algorithm for a given matrix or set of matrices. Based on academic research, the platform can compare a large set of decoding techniques tested in a very large number of transmission environments.
When targeting the 802.16e standard, the software covers the whole standard: 6 code rates R (1/2, 2/3A, 2/3B, 3/4A, 3/4B, 5/6), 19 code lengths (576 to 2304 bits) and modules. An example of performance is illustrated in Fig2:

 Fig2: 802.16e LDPC Decoder performance
 
The software simulation forms the basis of the evaluation to characterize the implemented architecture of the decoder outlined in the following part of the article.
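The parameter space covered by the simulation for 802.16e can be enumerated in a few lines. The sketch below assumes that the 19 code lengths are evenly spaced between 576 and 2304 bits, which gives a step of 96 bits; the names CODE_RATES, CODE_LENGTHS and information_bits are illustrative and do not come from the R-Interface platform.

```python
from fractions import Fraction

# The six code rates listed for 802.16e; the A and B variants share a rate
# but use different parity-check matrices.
CODE_RATES = {
    "1/2": Fraction(1, 2),
    "2/3A": Fraction(2, 3),
    "2/3B": Fraction(2, 3),
    "3/4A": Fraction(3, 4),
    "3/4B": Fraction(3, 4),
    "5/6": Fraction(5, 6),
}

# 19 codeword lengths from 576 to 2304 bits, assuming a constant 96-bit step.
CODE_LENGTHS = list(range(576, 2304 + 1, 96))


def information_bits(codeword_length, rate):
    """Number of payload bits carried by one codeword at the given rate."""
    return int(codeword_length * rate)


print(len(CODE_LENGTHS), "code lengths,", len(CODE_RATES), "rates")
for name, rate in CODE_RATES.items():
    shortest = information_bits(CODE_LENGTHS[0], rate)
    longest = information_bits(CODE_LENGTHS[-1], rate)
    print(f"rate {name}: {shortest} to {longest} information bits")
```

With 6 rates and 19 lengths, a full characterization has to cover 114 code configurations, which is why an automated simulation platform is useful.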
4. R-Interface LDPC-WimaxMob optimized and flexible Core
In terms of hardware implementation, LDPC codes look set to challenge Turbo codes as the coding scheme of choice for the future. An LDPC decoder is an order of magnitude less complex than a Turbo decoder with similar BER performance, and it is inherently parallel in nature. There is also no need for complex interleaving, as the interleaver is distributed in the code. These factors make LDPC codes ideally suited to communication applications that require a fast, low-power, high-performance encoder/decoder solution, such as Mobile Wimax 802.16e implementations.
In partnership with the Inria research laboratories, R-Interface developed an IP core implementation of an LDPC decoder. The core has been optimized for the Mobile Wimax 802.16e standard and gives optimal decoding throughput with a flexible implementation complexity. The hardware architecture can meet the low-complexity and low-power requirements of mobile handheld receivers, while the high throughput and low latency required by base stations can be achieved with the same generic architecture in a different implementation. The desired throughput is reached by instantiating the appropriate number of Processing Modules (PM) working in parallel. The number of iterations is programmable by the application. The figure below shows the top-level architecture of the LDPC decoder within the downlink PHY layer:

 Fig3: LDPC decoder top view implementation 
The key features of the LDPC-WimaxMob are:
- Full standard support (6 code rates, 19 code lengths)
- Modular architecture for low power or high performance (any number of PM instantiations)
- Easy system integration with several models (C, SystemC, Matlab, VHDL)
- Synthesizable on FPGA and ASIC
Example of implementation using a Virtex 5 FPGA from Xilinx
The LDPC IP is available as an EDIF or VHDL netlist. The code is synthesizable on any FPGA or ASIC. As an example, the results obtained on the Xilinx Virtex 5 are given below.
Table1 gives some results in terms of complexity and performance for 3 different implementations:
- 1 PM
- 6 PMs
- 8 PMs
 
The first figure in each cell (black in the original table) has to be multiplied by the system clock frequency in MHz and divided by the number of iterations to get the throughput in Mbit/s. The more iterations are performed, the better the chances that the algorithm converges and gives an error-free result.
| Nb Processing Modules (PM) | FFs | Memory | Ratio: 1/2 | Ratio: 2/3A | Ratio: 2/3B | Ratio: 3/4A | Ratio: 3/4B | Ratio: 5/6 |
| 1 | 2.8K | 160 Kbits | 0.65 / 5.2 Mbit/s / 2.6 Mbit/s | 0.74 / 5.92 Mbit/s / 3.94 Mbit/s | 0.74 / 5.92 Mbit/s / 3.94 Mbit/s | 0.79 / 6.32 Mbit/s / 4.74 Mbit/s | 0.79 / 6.32 Mbit/s / 4.74 Mbit/s | 0.84 / 6.72 Mbit/s / 5.60 Mbit/s |
| 6 | 14K | 160 Kbits | 3.71 / 29.68 Mbit/s / 14.84 Mbit/s | 3.71 / 29.68 Mbit/s / 19.78 Mbit/s | 3.71 / 29.68 Mbit/s / 19.78 Mbit/s | 4.39 / 35.12 Mbit/s / 26.34 Mbit/s | 4.39 / 35.12 Mbit/s / 26.34 Mbit/s | 4.39 / 35.12 Mbit/s / 29.26 Mbit/s |
| 8 | 20K | 300 Kbits | 4.88 / 39 Mbit/s / 19.5 Mbit/s | 5.66 / 45.28 Mbit/s / 30.18 Mbit/s | 5.66 / 45.28 Mbit/s / 30.18 Mbit/s | 5.66 / 45.28 Mbit/s / 30.18 Mbit/s | 5.66 / 45.28 Mbit/s / 30.18 Mbit/s | 6.42 / 51.36 Mbit/s / 42.8 Mbit/s |
Table1 : Complexity and Performance results of the LDPC decoder
Input stream (second figure, red) = first figure (black) * system clock frequency / number of iterations
Output stream (third figure, green) = second figure (red) * ratio
The second and third figures in each cell (red and green in the original table) give, as an example, the throughput in Mbit/s for 20 iterations and a 160 MHz system clock. The second figure is the decoder input stream rate, while the third is the output payload stream rate after the redundancy has been removed.
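The two formulas above can be checked with a short calculation. The sketch below reuses the example operating point quoted in the text (160 MHz clock, 20 iterations) and the 1 PM, rate 1/2 entry of Table1; the function and parameter names are illustrative only.

```python
def input_rate_mbps(base_figure, clock_mhz, iterations):
    """Decoder input rate: first table figure * clock frequency / iterations."""
    return base_figure * clock_mhz / iterations


def output_rate_mbps(input_rate, code_rate):
    """Payload rate left once the redundancy has been removed."""
    return input_rate * code_rate


# 1 PM, rate 1/2 entry of Table1 at 160 MHz with 20 iterations.
rate_in = input_rate_mbps(0.65, clock_mhz=160, iterations=20)
rate_out = output_rate_mbps(rate_in, code_rate=1 / 2)
print(f"{rate_in:.2f} Mbit/s in, {rate_out:.2f} Mbit/s out")

# Halving the iteration count doubles the throughput, at the price of
# fewer chances for the decoder to converge.
print(f"{input_rate_mbps(0.65, clock_mhz=160, iterations=10):.2f} Mbit/s in")
```

The same two functions can be used to trade the number of iterations against throughput for any other entry of the table or any other clock frequency.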
Hardware implementations and simulations have been performed on Altera and Xilinx FPGA targets and compared with the algorithmic fixed-point simulations.
Conclusion
The high throughput specified by all new communication standards, together with unpredictable transmission environments, means that systems require very flexible and strong error correction schemes. LDPC is so far the strongest available and will increasingly become a mandatory solution. R-Interface has developed a generic platform and simulation environment to deliver optimized, high-quality LDPC decoder solutions to the market for base stations and terminals. Today R-Interface is using this generic platform to deliver cores for Mobile Wimax (802.16e), Wifi (802.11n), 10 Gbit Ethernet (802.3an), DVB-S2, Power Line Communication, and more to come.
For further information please contact us at + 33 4 91 05 50 96, contact@r-interface.com, www.r-interface.com
