Optimization Methodologies for Cycle-Accurate SystemC Models Converted from RTL VHDL
Syed Saif Abrar 1,2, Maksim Jenihhin 1, Jaan Raik 1
1{ saif | maksim | jaan}@ati.ttu.ee , Tallinn University of Technology, ESTONIA
2 saif.abrar@in.ibm.com , IBM, India
Abstract:
IP design houses are under pressure from their customers to provide SystemC models of their portfolio IPs, even when VHDL views already exist. VHDL IPs can be translated to SystemC in a way that preserves the correctness, quality and maintainability of the translated code. VHDL and SystemC are frequently co-simulated by architects as well as verification teams. This paper explores optimization scenarios that affect co-simulation performance, resulting in 30% faster co-simulation. Beyond the plain VHDL-to-SystemC conversion, a SystemC model can be implemented in alternative ways. This paper explores these alternatives to obtain 25% better simulation speed. The optimization methodologies in this paper are relevant to architects, designers, verification teams and IP design houses that need to provide high-speed simulation models, and can be used for optimizing co-simulation tools as well as system-level models.
1. INTRODUCTION
Consumer electronic devices are becoming increasingly complex, presenting lots of hardware and software design challenges. The traditional design approaches at Register Transfer Level (RTL) using VHDL or Verilog are no longer suitable, resulting in the search for a newer abstraction-level that enables unified development of the System-on-a-Chip (SoC) designs. This level has been called system level, behavioral level, C-level, algorithmic level, Electronic System Level (ESL), etc.
Electronic System Level (ESL) design is an emerging methodology to model the (sub-)system at a high level of abstraction. Transaction Level Modeling (TLM) is the widely recognized and adopted approach to realize ESL in practice. TLM models are applicable to a range of design tasks, e.g. early software development, performance analysis, architecture exploration, HW/SW partitioning, etc. A big pain point for IP design houses is the challenge of providing TLM models of their IPs. Their customers are increasingly asking for SystemC models [1] along with the RTL description. Developing the SystemC models manually risks a mismatch between RTL and SystemC, and is impractical in terms of time and effort. Hence, there is an urgent need for automated generation of SystemC models from the existing RTL description.
The rest of the paper is organized as follows. Section 2 gives an overview of the related work. Section 3 describes the VHDL-to-SystemC translation methodology. Section 4 details the experiments conducted for simulation-speed optimization. Finally, section 5 concludes the paper along with future work.
2. RELATED WORK
Carbon Model Studio [3] is a commercial offering that allows creating configurable SystemC models from RTL VHDL or Verilog descriptions. It does not intend to create human-readable output. HIFSuite [4], [5] is a design and verification framework addressing manipulation and integration of heterogeneous design parts. The output result also does not consider human-readability and correspondence to the source VHDL. The approach supports equivalence checking to prove the correctness of the result.
Examples of non-commercial and free solutions to generate SystemC from VHDL are VHDLParser by the University of Tuebingen [6] and VH2SC by HT-Lab [7]. Both approaches map the source VHDL to a limited set of SystemC constructs. These tools have demonstrated significant limitations, do not guarantee equivalence and are no longer maintained. VHDLParser dates to 2001 and addresses SystemC 1.0. Because the tools are closed-source, engineers cannot extend them to their needs.
The most relevant approach is published by OFFIS in [8]. It creates readable SystemC representations from VHDL that are targeted to be wrapped and simulated in the Simulink environment. The approach is claimed to support industrial designs, however only the details of an illustrative example are available [9]. This practical work also does not provide equivalence-checking mechanisms or results. There are known approaches for creating SystemC models from Verilog [10] and tools targeting creation of other C++ subsets [11], [12].
Co-simulation with SystemC has long been of interest. Instruction-set-simulator (ISS) co-simulation is first discussed in [16]. Simulink co-simulation is discussed in [17].
Figure 1. ITC99 b09 translated from VHDL to SystemC using the proposed approach
3. VHDL-TO-SYSTEMC TRANSLATION
VHDL-to-SystemC translation follows a set of guidelines discussed in [18]. Without any loss of generality, Fig. 1 shows the ITC99 benchmark design b09 translated from VHDL to SystemC using our methodology. Only the code of interest is shown in Fig. 1, and the omitted code is marked with a <snip> tag. The translation methodology is described below as a set of rules, referring to the line numbers in the SystemC column to aid the explanations.
A. Handling multiple architecture definitions
VHDL enables multiple architectures for a single entity. The methodology names a SystemC module by concatenating VHDL entity and architecture names, as in Line-01.
B. Using constructors rather than SC_CTOR
VHDL allows model parametrization using generics. The methodology uses a class constructor, as shown in Line-04, instead of SC_CTOR, so that constructor parameters can play the role of VHDL generics.
C. Virtual Destructors
As discussed in guideline 7 of Effective C++, a SystemC model that has any virtual function should also have a virtual destructor.
D. Mapping VHDL and SystemC port types
VHDL and SystemC have similar types of ports. As in Line-05 and Line-07, sc_in and sc_out are used for VHDL in and out ports respectively. Additionally, a VHDL buffer port can be mapped to an sc_inout port.
E. Naming the SystemC process
Process names are required in SystemC, whereas they are optional in VHDL. The methodology uses the VHDL process name, if it exists, or derives one from the line number of the process, as shown in Line-10.
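Rules A and E both derive SystemC identifiers from VHDL names. The following is a minimal sketch of such name derivation, not the zamiaCAD implementation: the "_" separator and the "process_line_" prefix are assumptions made for illustration, since the text only states that the names are concatenated or derived from the line number.

```cpp
#include <cassert>
#include <string>

// Rule A: one SystemC module per (entity, architecture) pair.
// The "_" separator is an assumption; the text only says "concatenating".
std::string module_name(const std::string& entity, const std::string& architecture) {
    assert(!entity.empty() && !architecture.empty());
    return entity + "_" + architecture;
}

// Rule E: reuse the VHDL process label when present, otherwise derive a
// name from the source line number. The prefix is hypothetical.
std::string process_name(const std::string& vhdl_label, int line_number) {
    if (!vhdl_label.empty()) return vhdl_label;
    return "process_line_" + std::to_string(line_number);
}
```

For example, under these assumptions an entity b09 with architecture BEHAV would become the module name b09_BEHAV, and an unlabeled process starting at source line 42 would be named process_line_42.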
F. Exploiting native C++ data-types for faster simulation
Using native C++ data types for faster simulation is conventional wisdom. As shown in Line-13, the bool type is used instead of the SystemC-defined sc_bit data type. However, this approach must be used with care, making sure that the values left out (e.g. 'X'/'Z' in this case) are not used anywhere in the code.
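The check for left-out values can be sketched as follows. This is an illustrative helper, not part of the described tool: the function names are hypothetical, and treating the weak values 'L' and 'H' as 0 and 1 is an assumption that a real translator might not want to make.

```cpp
#include <cassert>
#include <optional>

// Maps a VHDL std_logic character literal to bool when it has a two-valued
// meaning. Returns std::nullopt for 'X', 'Z', 'U' and any other value that
// bool cannot represent, so a translator can fall back to a multi-valued type.
std::optional<bool> to_native_bool(char std_logic_value) {
    switch (std_logic_value) {
        case '0': case 'L': return false;  // strong and weak zero (assumption)
        case '1': case 'H': return true;   // strong and weak one (assumption)
        default:            return std::nullopt;
    }
}

// A signal may be declared as bool only if every literal assigned to it or
// compared with it converts cleanly. `literals` is a NUL-terminated list of
// the std_logic characters found for that signal.
bool can_use_bool(const char* literals) {
    assert(literals != nullptr);
    for (const char* p = literals; *p != '\0'; ++p) {
        if (!to_native_bool(*p)) return false;
    }
    return true;
}
```

A signal that is only ever driven with '0' and '1' passes the check, while a single 'X' or 'Z' anywhere in the design forces the slower multi-valued type for that signal.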
G. Writing SystemC module constructor on-the-fly
SystemC module has a constructor with information, like sensitivity list, that is spread across VHDL code. Information for the SystemC constructor has to be gathered while parsing the VHDL module, accumulated on-the-fly and finally written to the SystemC file. This is shown in Line-20.
H. Translating the VHDL process
SystemC has SC_METHOD and SC_THREAD, similar to a VHDL process. A VHDL process with a wait statement is implemented as SC_THREAD, otherwise SC_METHOD might be preferred. As shown in Line-21, SC_METHOD is used, and the sensitivity of the VHDL-process becomes the sensitivity of SystemC translation, as shown in Line-22.
I. Preventing invocation of SystemC process at start-up
The SystemC scheduler invokes every process (SC_METHOD or SC_THREAD) at the start of simulation, even in the absence of any event. Use dont_initialize() to stop this default invocation, as shown in Line-23.
J. Using port-methods for clarity
SystemC uses the same operator '=' for variables and ports. To remove ambiguity when reading the translated source code, use the .read() and .write() methods for ports and '=' for variable assignments. As in Line-29, the port is read using its .read() method.
K. Handling clock-edge sensitivity
VHDL and SystemC use different notations to describe clock-edge sensitivity. Line-36 shows the translation for positive-edge clock sensitivity; with this translation, however, the process is invoked on both edges of the clock. A more efficient and recommended, though more elaborate, approach is to analyze the VHDL to determine the clock edge of interest and then make the SystemC process sensitive to that edge only. Such a scheme is better suited to an event-based simulator like SystemC.
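The cost of waking a process on both clock edges can be illustrated with a small counting model. This is a plain C++ stand-in for the simulation kernel, not SystemC code; in SystemC the two variants would roughly correspond to a process sensitive to the clock port itself versus one sensitive to its positive-edge event (for example via the port's pos() member). The struct and function names are hypothetical.

```cpp
#include <cassert>

// Counts how often a clocked process body is entered over a number of clock
// periods. Each period has one rising and one falling edge.
struct EdgeCount {
    int invocations = 0;  // times the kernel called the process
    int useful_work = 0;  // times the rising-edge branch executed
};

// Process sensitive to every clock change; it filters for the rising edge itself.
EdgeCount run_any_edge(int periods) {
    assert(periods >= 0);
    EdgeCount c;
    bool clk = false;
    for (int edge = 0; edge < 2 * periods; ++edge) {
        clk = !clk;           // toggle: rising, falling, rising, ...
        ++c.invocations;      // woken on both edges
        if (clk) ++c.useful_work;
    }
    return c;
}

// Process sensitive to the rising edge only; it is never woken on a falling edge.
EdgeCount run_pos_edge(int periods) {
    assert(periods >= 0);
    EdgeCount c;
    bool clk = false;
    for (int edge = 0; edge < 2 * periods; ++edge) {
        clk = !clk;
        if (clk) { ++c.invocations; ++c.useful_work; }
    }
    return c;
}
```

Over 100 clock periods both variants perform the same 100 units of useful work, but the first is invoked 200 times and the second only 100 times, which is the overhead the edge analysis removes.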
L. Translating switch-cases
VHDL allows variables, logic types and ports in the switch-case construct (the VHDL case statement), whereas SystemC, being C++, allows only integers in a switch. Hence, the VHDL switch literal must be converted to an integer, if possible. Another approach is to use if-then-else constructs, as shown in Line-38 and Line-43, where the VHDL switch literal
- uses a range of values
- cannot be converted to an integer, e.g. strings
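The "convert to an integer, if possible" decision can be sketched as a small helper. This is an assumption about how such a check might look, not the tool's actual code: the function name and the 32-character width limit are chosen for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Tries to turn a VHDL bit-string choice such as "0110" into an integer that
// can serve as a C++ case label. Returns std::nullopt when the literal holds
// anything other than '0'/'1' (for example 'X' or '-'), is empty, or is wider
// than 32 bits; the translator would then fall back to an if-else chain.
std::optional<unsigned long> choice_to_integer(const std::string& literal) {
    if (literal.empty() || literal.size() > 32) return std::nullopt;
    unsigned long value = 0;
    for (char c : literal) {
        if (c != '0' && c != '1') return std::nullopt;
        value = (value << 1) | static_cast<unsigned long>(c - '0');
    }
    return value;
}
```

With this helper, the choice "0110" maps to the case label 6, whereas "01-0" yields no integer and would be translated with if-then-else instead.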
M. Chain of SystemC library calls
Sometimes it is necessary in practice to employ a chain of SystemC library calls to achieve the desired behavior, e.g. comparing a single-bit value from a port, as shown in Line-45.
In addition to the above details highlighted in Fig. 1, a few other rules are followed for translation, described below:
N. Operator precedence differences
This consideration is extremely important to follow. VHDL and SystemC have different precedence for certain operators. As is common practice, use parentheses for clarity and to override the default precedence.
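One way to follow this rule mechanically is to emit every binary operation fully parenthesized, so that the grouping decided by the VHDL parser never depends on C++ precedence. The sketch below assumes a hypothetical, minimal expression tree; it is not the translator's real data structure. As a C++-side illustration of why this matters, the unparenthesized text a & b == c would be grouped by C++ as a & (b == c).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Minimal expression tree: a leaf holds an identifier, an inner node holds a
// C++ operator token and two operands.
struct Expr {
    std::string text;                // identifier (leaf) or operator token (inner node)
    std::shared_ptr<Expr> lhs, rhs;  // both null for a leaf
};

std::shared_ptr<Expr> leaf(const std::string& name) {
    return std::make_shared<Expr>(Expr{name, nullptr, nullptr});
}

std::shared_ptr<Expr> binary(const std::string& op,
                             std::shared_ptr<Expr> l, std::shared_ptr<Expr> r) {
    assert(l && r);
    return std::make_shared<Expr>(Expr{op, std::move(l), std::move(r)});
}

// Emits C++ text with every binary operation wrapped in parentheses.
std::string emit(const Expr& e) {
    if (!e.lhs) return e.text;
    return "(" + emit(*e.lhs) + " " + e.text + " " + emit(*e.rhs) + ")";
}
```

For the VHDL expression (a and b) = c, building the tree as binary("==", binary("&", leaf("a"), leaf("b")), leaf("c")) and emitting it gives ((a & b) == c), which keeps the source grouping at the cost of some redundant parentheses.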
O. Handling concurrent VHDL statements
Concurrent VHDL statements can be handled as:
a) Single SC_METHOD for all concurrent statements, sensitive to all the source RHS operations in these statements. Drawback is that all the statements are executed whenever any RHS operation changes, affecting the simulation performance.
b) Separate SC_METHOD for each statement, sensitive to only this statement's source variable. This reduces the simulation overhead, but increases the code-size.
P. Executing the SystemC model constructor
SystemC models instantiated inside a top-level module need their constructors to be executed, e.g. implementing a 4-bit adder from 4 instances of 1-bit adder.
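Rules B and P can be illustrated together with a plain C++ analogue of the adder example. This sketch deliberately avoids the SystemC library: ordinary classes stand in for modules, and the class and member names are hypothetical. The constructor parameter plays the role of a VHDL generic, and running the constructor is what creates the one-bit child instances.

```cpp
#include <cassert>
#include <vector>

// One-bit full adder, standing in for a translated leaf module.
struct FullAdder {
    bool sum = false, carry_out = false;
    void eval(bool a, bool b, bool carry_in) {
        sum = (a != b) != carry_in;                      // a xor b xor carry_in
        carry_out = (a && b) || (carry_in && (a != b));
    }
};

// Ripple-carry adder whose width acts like a VHDL generic (rule B).
// The constructor must run so that the child instances exist (rule P).
class RippleAdder {
public:
    explicit RippleAdder(unsigned width) : stages_(width) {  // builds `width` FullAdder instances
        assert(width > 0 && width < 32);
    }

    // Adds the low `width` bits of a and b; the result includes the final carry bit.
    unsigned add(unsigned a, unsigned b) {
        bool carry = false;
        unsigned result = 0;
        for (unsigned i = 0; i < stages_.size(); ++i) {
            stages_[i].eval((a >> i) & 1u, (b >> i) & 1u, carry);
            result |= static_cast<unsigned>(stages_[i].sum) << i;
            carry = stages_[i].carry_out;
        }
        return result | (static_cast<unsigned>(carry) << stages_.size());
    }

private:
    std::vector<FullAdder> stages_;
};
```

Constructing RippleAdder adder4(4) creates the four one-bit instances of the 4-bit adder mentioned above; adder4.add(9, 5) then ripples the carry through them.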
SystemC code generated by the zamiaCAD implementation has the following highly desirable qualities:
Human readability: An important consideration for the generated SystemC code is whether it will only be fed to a compiler or will also be maintained by human developers. Many VHDL-to-SystemC converters lack this feature and generate obscure code that is not fit for human use. Readability largely determines how well developers can work with the generated SystemC code.
Correspondence of the translated SystemC to VHDL: If a team of designers needs to maintain both the VHDL and SystemC code bases, then it is appropriate to have a consistent view between the two. The translation mechanism must decide on this early and enforce it. The same module names, variables, constructs, etc. must be used, unless SystemC inherently does not support a feature.
4. OPTIMIZATIONS FOR SIMULATION-SPEED
The SystemC-model available from the previous section is obtained by converting the VHDL model on a line-by-line basis. This plain translation approach does not take into account the simulation characteristics as well as implementation style of the SystemC model. This section explores various implementation aspects to improve the simulation-speed and discover potential optimization possibilities.
Table 1. Plasma VHDL simulation-profile
Module | Simulation time (%) | Remarks |
ALU | 31.4 | Highest sim-time |
Register-bank | 0.9 | Lowest sim-time |
A. Optimization for Co-simulation
This section analyzes the performance of co-simulating VHDL and SystemC models, as in [19]. The aim of this analysis is to find possible optimizations in VHDL and SystemC models, as well as to recommend an effective co-simulation approach.
Co-simulation plays an important role in SoC system-level design (SLD) [2]. Architects start the system design at a higher abstraction level, e.g. Matlab, C, etc., then refine each component step by step, while co-simulating with the rest of the system still at the higher level. HW designers co-simulate their designs under development in a low-level language, e.g. VHDL/Verilog, while the rest of the test environment, e.g. CPU, memory, bus, etc., is still in a higher-level language. SW developers typically co-simulate the instruction-set simulator (ISS) at a higher level while the HW for which a driver is being developed is simulated at a lower level. Hence co-simulation is important throughout the SoC design cycle.
Simulation profiling: Co-simulation performance analysis begins with profiling the Plasma VHDL simulation, to obtain the percentage of simulation time taken by each VHDL module. Table-1 shows that the ALU module takes the highest share of simulation time (31.4%), whereas the register-bank takes the lowest (0.9%).
Base performance analysis: VHDL-only simulation is taken as the base for performance analysis. The term VHDL-only implies the original Plasma-core in VHDL, without any module in SystemC for co-simulation. The base performance will be used to compare against the cosimulation performance. As shown in Figure-2, the time-taken varies linearly with the simulation-time, ensuring reliable and consistent results for varying simulation times.
Figure 2: VHDL-SystemC co-simulation analysis
Co-simulating ALU in SystemC: As the ALU module takes the highest percentage of simulation time in Table-1, it is a good candidate for co-simulation performance analysis. The ALU module is translated to SystemC, replacing the ALU module in VHDL, and co-simulated with the rest of the Plasma-core still in VHDL. Figure-2 shows the actual time taken against the simulation time. As expected, the time taken varies linearly. The co-simulation with the ALU in SystemC takes more time than the VHDL-only simulation. The reason for the lower performance is that both the VHDL and SystemC simulation engines are now invoked to perform the co-simulation, which increases the time taken.
Co-simulating register-bank in SystemC: The lowest percentage of simulation time is spent in the register-bank. Hence, it is considered next for co-simulation performance analysis. As earlier, the co-simulation setup consists of the register-bank translated to SystemC while the rest of the Plasma design is in VHDL. The ALU module is also reverted from SystemC back to VHDL. Figure-2 shows the actual time taken against the simulation time, for the register-bank both in SystemC and in VHDL.
A significant and interesting observation in this case is that the register-bank in SystemC performs better than the VHDL-only base. The reason is the event-based design of the SystemC kernel, which is invoked only when there is an event that requires executing the register-bank functionality. As shown earlier in Table-1, the register-bank takes negligible simulation time, implying that there are almost no events for the register-bank module. Hence the SystemC kernel is almost never invoked in this case, leading to better performance.
B. Optimization for Combinational-statements
Combinational statements in VHDL are written outside of any VHDL process and are executed concurrently, in parallel. SystemC has no such concurrent statements, as all statements are executed within a SystemC process, either SC_METHOD or SC_THREAD. Since a VHDL combinational statement does not have a 'wait' feature, only SC_METHOD needs to be considered for converting combinational statements to SystemC.
Multiple combinational VHDL statements can be converted to SystemC SC_METHOD in the following 2 alternatives:
a) Single SC_METHOD for all combinational statements
This approach implements all VHDL combinational statements within a single SC_METHOD. The single SC_METHOD is made sensitive to all the source RHS elements in these statements. Table-2 shows an example of this approach. Advantage of this implementation is its simplicity and ease of implementation. Drawback is that all the statements in the SC_METHOD are executed even if only 1 RHS element changes.
Table 2. Single SC_METHOD
VHDL syntax:
    architecture behav of E is
    begin
      X <= A and B;
      Y <= not A;
      Z <= X or Y;
    end behav;
SystemC syntax:
    SC_METHOD(P1);
    sensitive << A << B << X << Y;

    void P1(void) {
      X = A && B;
      Y = !A;
      Z = X || Y;
    }
b) Separate SC_METHOD for each combinational statement
In this approach, each combinational statement is implemented in a separate SC_METHOD. This SC_METHOD is made sensitive only to the RHS elements appearing in source VHDL statement. Table-3 shows an example of this approach. Advantage of this approach is that only the relevant SystemC statements are executed, within a particular SC_METHOD. Limitation is the increase in the source-code size due to an SC_METHOD implemented for each combinational statement.
Table 3. Separate SC_METHODs
VHDL syntax:
    architecture behav of E is
    begin
      X <= A and B;
      Y <= not A;
      Z <= X or Y;
    end behav;
SystemC syntax:
    SC_METHOD(P_X);
    sensitive << A << B;
    void P_X(void) { X = A && B; }

    SC_METHOD(P_Y);
    sensitive << A;
    void P_Y(void) { Y = !A; }

    SC_METHOD(P_Z);
    sensitive << X << Y;
    void P_Z(void) { Z = X || Y; }
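The difference between the two alternatives can be made concrete by counting executed statements in a small model of delta-cycle evaluation. The following is a plain C++ sketch and not SystemC: the Process struct, the settle function and the bit masks are hypothetical stand-ins for the kernel's sensitivity handling, applied to the three statements used in the tables above.

```cpp
#include <array>
#include <cassert>
#include <vector>

enum Signal { A, B, X, Y, Z, kSignalCount };
using Values = std::array<bool, kSignalCount>;

constexpr unsigned bit(Signal s) { return 1u << s; }

struct Process {
    unsigned sensitivity;  // bit i set: the process is sensitive to signal i
    int statements;        // number of statements in the body
    void (*body)(const Values& now, Values& next);
};

// Alternative (a): one process holding all three statements.
std::vector<Process> single_method() {
    std::vector<Process> p;
    p.push_back(Process{bit(A) | bit(B) | bit(X) | bit(Y), 3,
        [](const Values& now, Values& next) {
            next[X] = now[A] && now[B];
            next[Y] = !now[A];
            next[Z] = now[X] || now[Y];
        }});
    return p;
}

// Alternative (b): one process per statement.
std::vector<Process> separate_methods() {
    std::vector<Process> p;
    p.push_back(Process{bit(A) | bit(B), 1,
        [](const Values& now, Values& next) { next[X] = now[A] && now[B]; }});
    p.push_back(Process{bit(A), 1,
        [](const Values& now, Values& next) { next[Y] = !now[A]; }});
    p.push_back(Process{bit(X) | bit(Y), 1,
        [](const Values& now, Values& next) { next[Z] = now[X] || now[Y]; }});
    return p;
}

// Runs delta cycles until no signal changes; returns the statements executed.
// `changed` is the mask of signals that were just modified in `values`.
int settle(const std::vector<Process>& processes, Values& values, unsigned changed) {
    int executed = 0;
    while (changed != 0) {
        Values next = values;
        for (const Process& p : processes) {
            if (p.sensitivity & changed) {
                p.body(values, next);
                executed += p.statements;
            }
        }
        changed = 0;
        for (int i = 0; i < kSignalCount; ++i) {
            if (next[i] != values[i]) changed |= 1u << i;
        }
        values = next;
    }
    return executed;
}
```

Starting from the settled state A=0, B=1, X=0, Y=1, Z=1, a change on B costs three statement executions with the single process but only one with separate processes; a change on A costs six versus three. Both variants settle to the same signal values.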
Results:
To compare the two alternative methods of translating VHDL combinational statements to SystemC, the 'shifter' module of the Plasma core is selected. The 'shifter' module has 12 combinational statements, making it a good candidate for observing differences in simulation time. As seen in Figure-3, the multiple-process approach takes about 25% less simulation time, resulting in faster simulation. This can be attributed to the fact that in the multiple-process implementation only the statement whose inputs changed is executed, whereas the single-process implementation executes all statements.
Figure-3. Simulation-time for single and multiple process
C. Optimization for Events
Events are used in SystemC to synchronize actions among processes. A SystemC process can wait for multiple events, suspending its task while in wait-state. Another process triggers the event at some time during simulation. The waiting process wakes-up when it receives the notification of event-trigger and resumes its task. A process can be made to wait on a single-event or on multiple-events, as shown in Table-4. On the other hand, the event notification can be done immediately or in the next delta-cycle, as shown in Table-5.
Table 4. Events-waiting in SystemC
Single-event waiting:
    SC_THREAD(th_single_ev);
    void th_single_ev(void) {
      wait(ev_single);
      statement-1;
      statement-2;
      ...
    }
Multiple-event waiting:
    SC_THREAD(th_mult_ev);
    void th_mult_ev(void) {
      wait(ev_mult_1);
      statement-1;
      wait(ev_mult_2);
      statement-2;
      ...
    }
Table 5. Events-notification in SystemC
Immediate notification:
    sc_event ev_do;
    void pr_imm() {
      statement-1;
      ...
      ev_do.notify();
      ...
    }
Delta-cycle notification:
    sc_event ev_do;
    void pr_delta() {
      statement-1;
      ...
      ev_do.notify(SC_ZERO_TIME);
      ...
    }
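The scheduling difference between the two notification styles can be sketched with a toy scheduler. This is a simplified plain C++ model and not the SystemC kernel: MiniKernel and its members are hypothetical, and the notifier hands over the waiting process directly instead of going through an event object. It only captures the rule that an immediately notified process becomes runnable in the current evaluation phase, while a delta-notified process runs in the next delta cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class MiniKernel {
public:
    using Process = std::function<void()>;

    // Immediate notification: the waiter joins the current evaluation phase.
    void notify_immediate(Process waiter) { runnable_.push_back(std::move(waiter)); }
    // Delta notification: the waiter is deferred to the next delta cycle.
    void notify_delta(Process waiter) { next_delta_.push_back(std::move(waiter)); }

    int current_delta() const { return delta_; }

    // Runs until nothing is runnable and no delta notification is pending.
    void run(Process first) {
        runnable_.push_back(std::move(first));
        while (!runnable_.empty()) {
            // Evaluation phase: the list may grow while it is being drained.
            for (std::size_t i = 0; i < runnable_.size(); ++i) {
                Process p = runnable_[i];  // copy, since push_back may reallocate
                p();
            }
            runnable_.clear();
            // Delta notification phase: deferred waiters become runnable.
            runnable_.swap(next_delta_);
            if (!runnable_.empty()) ++delta_;
        }
    }

private:
    std::vector<Process> runnable_, next_delta_;
    int delta_ = 0;
};

// Returns the delta cycle in which the waiting process resumes.
int delta_of_wakeup(bool use_delta_notification) {
    MiniKernel kernel;
    int woke_in = -1;
    auto waiter = [&] { woke_in = kernel.current_delta(); };
    kernel.run([&] {
        if (use_delta_notification) kernel.notify_delta(waiter);
        else                        kernel.notify_immediate(waiter);
    });
    assert(woke_in >= 0);  // the waiter must have run exactly once
    return woke_in;
}
```

In this model the waiter resumes in delta cycle 0 after an immediate notification and in delta cycle 1 after a delta-cycle notification; the amount of work done by the waiter is the same in both cases.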
Results:
The modules ALU, DDR, DMA and MUX from the Plasma-core are implemented with the earlier alternatives; e.g. ALU implementation is referred to as:
- ALU SE: Single-event waiting, immediate notification
- ALU ME: Multiple-event waiting, immediate notification
- ALU ME-Z: Multiple-event waiting, delta-cycle notification
Figure-4 shows the simulation-performance for ALU, DDR, DMA and MUX modules, for the 3 alternatives. As seen in Figure-4, there is no significant variation in the simulation-performance for a given module.
The single-event implementation is simpler in terms of coding, understanding and debugging. Hence, this might be a preferred implementation alternative, though based on coding-aspect rather than the simulation-speed.
Figure 4: Simulation performance for the various event implementations
5. CONCLUSIONS
This research has discussed a methodology for translating RTL VHDL IPs to cycle-accurate SystemC models, and has experimented with various optimization methodologies for the generated SystemC model. (1) Co-simulating VHDL with a SystemC version of the module that takes the highest simulation time lowers the simulation performance. On the other hand, co-simulation with a SystemC version of the module that takes the least simulation time is 30% faster than VHDL-only simulation. (2) Implementing VHDL combinational statements using multiple SystemC processes takes about 25% less simulation time than implementing them with a single SystemC process. (3) SystemC synchronization strategies using single and multiple events take similar simulation time, although the single-event implementation is easier to develop and maintain. The optimization results obtained in this paper are relevant to a wide SystemC community, including architects, designers, verifiers as well as IP design houses and EDA vendors.
ACKNOWLEDGEMENTS
The work has been supported in part by the Estonian ICT project FUSETEST, by CEBE through the European Structural Funds, by Estonian SF grants 8478 and 9429 and by the Tiger University Program of the Information Technology Foundation for Education (HITSA).
REFERENCES
[1] IEEE 1666 Standard SystemC Language Reference Manual, 2011.
[2] ESL Models and their Application: Electronic System Level Design and Verification in Practice(Embedded Systems) B Bailey, G Martin. 2012.
[3] Carbon Design Systems, http://carbondesignsystems.com
[4] HIFSuite, EDA-Lab, http://hifsuite.edalab.it, 2012
[5] N. Bombieri, M. Ferrari, F. Fummi, et al., “HIFSuite: Tools for HDL Code Conversion and Manipulation,” in EURASIP Journal on Embedded Systems, vol. 1155, no. 10, 2010.
[6] VHDL-to-SystemC-Converter, Eberhard-Karls-University of Tübingen, http://www-ti.informatik.uni-tuebingen.de/~systemc/, 2012
[7] VH2SC, HT-Lab, http://www.ht-lab.com, 2012
[8] R. Görgen, J.H. Oetjens, W. Nebel, Automatic integration of hardware descriptions into system-level models, Proc. of IEEE DDECS 2012, Tallinn, pp.105-110
[9] R. Görgen, “VHDL-to-SystemC Transformation,” 2012, http://vhome.offis.de/ralphg/vhdl2sc.pdf
[10] W. Snyder, Verilator, Veripool, http://www.veripool.org/wiki/verilator
[11] Edwin Naroska, “FreeHDL,” http://www.freehdl.seul.org/, 2012
[12] OStatic, “VHDLC,” http://ostatic.com/vhdlc, 2012
[13] zamiaCAD Open-source HW Design Framework, http://zamiacad.sf.net
[14] Legacy SystemC co-simulation of multi-processor systems-on-chip. Benini, L. ; Bertozzi, D. ; Bruni, D. ; Drago, N. ; Fummi, F. ; Poncino, M. IEEE Int'l Conf Computer Design, 2002.
[15] An automated approach to SystemC/Simulink co-simulation. Mendoza, F.; Kollner, C.; Becker, J.; Muller-Glaser, K.D. IEEE Int'l Symp. System Prototyping (RSP), 2011
[16] Uniform SystemC Co-Simulation Methodology for System-on-Chip Designs. IEEE CyberC, 2012
[17] An automated approach to SystemC/Simulink co-simulation. Mendoza, F. ; Kollner, C. ; Becker, J. ; Muller-Glaser, K.D. IEEE Int'l Symp. On Rapid System Prototyping (RSP), 2011
[18] Extensible Open-Source Framework for Translating RTL VHDL IP Cores to SystemC, Syed Saif Abrar, Maksim Jenihhin, Jaan Raik; IEEE Int'l Symp. Design and Diagnostics of Electronic Circuits & Systems (DDECS), 2013
[19] Performance Analysis of Cosimulating Processor Core in VHDL and SystemC. Syed Saif Abrar, Shyam Kiran A., Maksim Jenihhin, Jaan Raik, C. Babu; IEEE Int'l Conf. Advances in Computing, Communications and Informatics (ICACCI),