Extending the SoC Architecture of 3G terminal to Multimedia Applications
Feng Niu, Lei Tang & Shaojun Wei, Datang Microelectronics Technology Co., LTD.
Beijing, China

Abstract: Striking an appropriate balance between optimal support for the target applications and broad applicability has become the central problem of SoC design. Accordingly, Datang Microelectronics Technology Co., LTD., Beijing, China, is focusing on SoC platform development that targets not only TD-SCDMA but also other application series such as network multimedia. This paper describes a typical research case: how to extend an existing SoC architecture for a 3G terminal baseband processor to a player for network multimedia compressed with MPEG-4, and in particular how to build an effective multi-layer, shared memory system that copes with high-density computation and transfer tasks. Finally, we use CoCentric System Studio, one of Synopsys' ESL tools, to verify our idea and arrive at our target architecture.

1. Introduction

To meet the opportunities and challenges of SoC design, many corporations are developing more universal SoCs that can adapt to several application products and customers. As more and more different functions are integrated, inherently parallel communication and parallel data-stream transactions are increasing rapidly in application systems. With this growing parallelism, using more than one small, specialized processor core becomes a natural architecture for an advanced SoC. As a result, the performance of many applications on an SoC will be limited only by the available bandwidth, the latency, the communication mode between processors, and the capability to integrate multiple processors on one chip. As a pioneer, Datang Microelectronics Technology Co., LTD., Beijing, China, is now focusing on SoC platform-based hardware designs and software applications. The software applications include not only TD-SCDMA, one of the 3G standards, but also other application series such as network multimedia.
We have an existing SoC architecture that mainly targets the baseband processor of a 3G telecommunication terminal, but we also want our final SoC to satisfy the processing demands of playing network multimedia compressed with MPEG-4. Extending the current architecture is therefore necessary. Fig. 1 shows the divergence between the two applications.

Fig.1 Extending the 3G terminal SoC to Network Multimedia

This paper describes how to extend the existing SoC architecture of the 3G terminal to a network multimedia player, and how to build an effective multi-layer, shared memory system that copes with high-density computation and mass transfer tasks. Sections 2 and 3 analyze the SoC architecture for 3G and the challenges that the new application poses to it; section 4 focuses on extending the existing architecture for MPEG-4; section 5 uses CoCentric System Studio to verify our idea and presents the simulation results; and section 6 draws a conclusion.

2. Analysis of the current architecture for TD-SCDMA

To develop a baseband processor platform for a TD-SCDMA handset, the following points need attention:
Fig.2: Analysis of the architecture of TD-SCDMA baseband for MPEG-4 application

This led to our architecture for the baseband processor of the TD-SCDMA terminal. Fig. 2 shows a simplified framework containing the modules that are key to system performance analysis. The system is divided into two partitions. One is the ARM subsystem, with one ARM926EJ-S core that handles user application programs and OS tasks. The other is the DSP subsystem, which processes the baseband digital signal of the physical layer; it includes two ZSP540 cores, each with a peak performance of 450 MIPS at a 100 MHz system clock, and two corresponding tightly coupled memories. The two subsystems are connected by two DMA controllers (one of Synopsys' DesignWare IPs), each with one AHB-Lite bus, and both DMA modules can also work for a single subsystem. In addition, a shared memory module stores the data exchanged between the two subsystems. Finally, the memory controller interface module (MemCtl) in the ARM subsystem has only one AHB interface, so an ICM module is needed to arbitrate simultaneous requests from different AHB buses.

3. Challenges to the architecture with multimedia application

We now turn to our new application, called Multimedia On Network Storage (MONS). It is a home multimedia system based on TV display, in which a large amount of high-quality program material can be obtained through a system navigator over the broadband network. It must also support MPEG-4 decoding in addition to H.263, MP3, and JPEG decompression and display. Some assumed requirements are as follows:
The frames to be decoded are classified as I frames, P frames, and B frames, and every frame is further divided into 16x16 macroblocks. All compressed frames require IDCT, inverse quantization, zigzag scan, VLD, and up-sampling to convert the data format from YUV420 to YUV422, while P frames and B frames also need motion compensation, which uses previously decompressed pictures as reference frames. Furthermore, the ZSP core that performs motion compensation must access the reference frames under tight timing constraints, so the reference frames must be stored on chip rather than off chip [2]. From Fig. 3, it can be concluded that at most two reference frames are needed during decoding. If the reference frames are not held on chip, the ZSP core must be able to access off-chip memory via MemCtl without delay. In addition, the decompressed pictures must be sent to the LCD or TV interface in real time.

Fig.3 MPEG-4 decoding features

Unfortunately, in the SoC architecture for TD-SCDMA, the tightly coupled memories in the DSP subsystem are too small to store the reference frames; moreover, the ZSP cores cannot directly access the MemCtl interface in the ARM subsystem. As shown in Fig. 2, the reasons may be as follows:
The principal task now is to optimize the current architecture to better suit the MONS application, while minimizing the cost of modification and preserving its suitability as a 3G terminal baseband processor. From the performance estimate of the existing architecture for MPEG-4 in section 3, we know that system storage and transfer capability is the bottleneck that must be eliminated. There are two ways to increase the bandwidth of the main memories. One is to share the separate memories so that different processors can access them simultaneously; the other is to use a hierarchical system with multiple levels of intermediate memory. The intermediate memory is controlled by the program, unlike the automatic lookup of a cache. The idea is to make the memories and the pipeline cooperate closely, keeping the operands close to the processors and keeping the processors busy. After comparing several architectures with multi-port or dual-port memory and the ST GreenSIDE main memory [4], we developed an effective multi-level, shared memory system, as shown in Fig. 4. It brings the following advantages:
Fig.4: An effective multi-layer and shared memory system

In the multi-level, shared memory system, we also modified the configuration of the DMA and MemCtl modules. Each DMA now uses two AHB buses, and several AHB slave ports were added to the MemCtl module. The reasons are:
An HW/SW co-simulation platform based on Synopsys' CoCentric System Studio (CCSS) and SystemC was developed to test our extended architecture, as shown in Fig. 5 [1][3]. To speed up the simulation, we included on the virtual platform only the models that are key to system performance, such as the DMA, ARM, ZSP, tightly coupled memory, MemCtl, LCDC, AHB bus, shared AHB memory, APB bus with I2S interface, and ICTL.

Fig.5: Experimental platform in CCSS

Because we only wanted to know the capability of our architecture under the heaviest load, only the MPEG-4 decompression program needed to run on the platform. Likewise, the data flow of the MPEG-4 algorithm needed particular attention during system modeling. We therefore used ZSP models with an ISS, on which the software can be loaded, run, and debugged, and a pseudo ARM model that acts as a stimulus generator for system initialization configuration, DMA transfers, and interrupt responses. The system frequency remained 100 MHz.

As for the algorithm program, we rescheduled the MPEG-4 flow to adapt the software to the hardware. For example, each frame of 1620 macroblocks is divided into several groups according to the size of a frame row, and these groups, instead of whole frames, are processed in the motion compensation process. This reduces the memory capacity needed for reference frames. The real-time requirement of the application system can be met as long as sufficient transfer capacity is provided. The software running on the ZSP was coded in C, with some frequently used computations such as IDCT coded in ZSP assembly instructions to improve execution efficiency. The simulation results, part of which are shown in Fig. 6, show that both the execution efficiency and the transfer efficiency of the system improved with the extended SoC architecture.
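The memory saving from grouping macroblocks by frame row can be illustrated with a rough calculation. The Python sketch below is not taken from the simulation platform. It assumes 8-bit YUV420 samples, a 720x576 frame (consistent with, but not stated by, the 1620-macroblock figure, since 45 x 36 = 1620), and one macroblock row per group. All names are hypothetical.

```python
# Rough storage estimate for MPEG-4 reference data.
# Illustrative only: frame size, sample depth and group size are assumptions.

MB_EDGE = 16           # a macroblock covers 16x16 luma samples (from the text)
MBS_PER_FRAME = 1620   # macroblocks per frame (from the text)
ASSUMED_WIDTH = 720    # assumption: 720x576 yields 45 x 36 = 1620 macroblocks
ASSUMED_HEIGHT = 576   # assumption, see above


def macroblock_bytes_yuv420() -> int:
    """Bytes in one 8-bit YUV420 macroblock: 16x16 Y plus two 8x8 chroma blocks."""
    luma = MB_EDGE * MB_EDGE
    chroma = 2 * (MB_EDGE // 2) * (MB_EDGE // 2)
    return luma + chroma


def frame_bytes() -> int:
    """Bytes needed to hold one whole decoded frame."""
    return MBS_PER_FRAME * macroblock_bytes_yuv420()


def row_group_bytes() -> int:
    """Bytes needed for one group, assuming a group is one macroblock row."""
    mbs_per_row = ASSUMED_WIDTH // MB_EDGE
    return mbs_per_row * macroblock_bytes_yuv420()


if __name__ == "__main__":
    # Sanity check that the assumed frame size matches the macroblock count.
    assert (ASSUMED_WIDTH // MB_EDGE) * (ASSUMED_HEIGHT // MB_EDGE) == MBS_PER_FRAME
    print("one macroblock       :", macroblock_bytes_yuv420(), "bytes")
    print("one frame            :", frame_bytes(), "bytes")
    print("two reference frames :", 2 * frame_bytes(), "bytes")
    print("one macroblock row   :", row_group_bytes(), "bytes")
```

Under these assumptions, two full reference frames need about 1.2 MiB, whereas one macroblock row needs about 17 KiB. The actual working set in motion compensation also depends on the vertical search range, which the text does not specify.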
Fig.6: Segment of simulation results

6. Conclusion

This paper has introduced the process of extending the SoC architecture of a TD-SCDMA baseband processor to a network multimedia application. It has focused on an effective multi-layer, shared memory system that meets the high-density computation and transfer tasks of MPEG-4, which is useful for building an SoC architecture with high storage and transfer capacity. Moreover, the new architecture remains applicable to the original application. The HW/SW co-design approach is used and illustrated in this paper. An optimal SoC architecture must be driven by its applications; at the same time, modifying the schedule of the application algorithm, within feasible limits, to adapt it to the architecture may yield unexpected gains in system performance.

References
[1] ARM Inc., AMBA Specification (Rev. 2.0), www.arm.com, 1999
[2] Iain E. G. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia, John Wiley & Sons Ltd, 2003
[3] Ric Hilderink, Stefan Klostermann, Transaction Level Modeling of SoC platforms using SystemC, Design Automation and Test in Europe, 2002
[4] Remi Francard, Mick Posner, Verification Methods Applied to the ST Microelectronics GreenSIDE Project, www.design-reuse.com, 2004
[5] Synopsys Inc., DesignWare DW_ahb_dmac Databook, www.synopsys.com, 2004