Congestion & Timing Optimization Techniques at 7nm Design
By Jaya Patel, Atul Kumar (eInfochips - An Arrow company)
Abstract:
In a VLSI design, floorplan is the crucial stage in which chip area, size and shape of the chip can be determined. Floorplan is iterative process. When designer is done with the floorplan, the next step is to run placement and optimization, after completion of placement and optimization designer would analyse the congestion map, cell density and timing reports, before moving to next stage i.e. Clock Tree Synthesis. This article discusses about the various approaches to reduce congestion and timing violation by modifying floorplan at block level.
1. Introduction
The design complexity is increasing as the number of transistors on a chip is increasing. So, in the VLSI design flow, Physical design plays a very important role. Floorplanning is the starting step in the physical design flow. The main concern of floorplanning is to find the appropriate location of the module on the layout surface on the basis of interconnectivity. While determining the locations, one important check involved is that there should not be any overlapping between two modules. Designer determines the size of the die at floorplan stage and creates wire tracks for placement of the standard cell. Designer does a power ground connection and determines the I/O pad/pin placement location. The designer should take care of sanity checks at floorplan stage. Main aim of the floorplan is: -
- Minimize the area.
- Minimize the total wire length.
- Improve routability.
- Minimize delay.
- Minimize cost
At placement and optimization stage, PnR Tool tries to optimize data path so that data arrival time can be minimized and worst negative slack (WNS) and total negative slack (TNS) could be reduced. Placement is the process of finding a suitable physical location for each cell in the design. The quality of routing in design is highly determined by the placement. If more number of cells are placed in a small area then the number of routing tracks available for routing is less than the required routing tracks, which may cause congestion in the design. There are several reasons for congestions:
- High standard cell density in a small area.
- Placement of standard cell near macros.
- High pin density at the edge of the macro.
- Bad floorplan.
Figure 1. Congestion near the port
Congestions near the port is shown in figure 1. The designer must take care of these issues at the beginning by applying placement blockages (soft, hard, partial blockage), keep-out margin, Scan chain Re-ordering, cell padding, macro padding and create bounds. Creating bound allows the user to define region-based placement.
2. Design Challenges
In this article, two challenges would be discussed: -
- Congestion near to the feedthrough port, most of the cells placed near the port and forms the crisscross path with another port.
- Another challenge is to Reg2out violation. In this case, 48 flip flop are placed at one location and interacting with all macros placed at left side and the port placed at left top of the design.
Case 1: fixing congestion near feedthrough port
This section describes a port swapping technique by which congestion can be reduced near the ports. While doing swapping of the port, designer has to take care of the other block also. If that port of the block is interacting with other blocks, then swapping of the ports will be done accordingly. In the figure 2, top ports that are pink in color represent feed_input ports and feed_output port on the right bottom side. Similarly, for blue color top right side is feed_output and right one is feed_input port. Instances that are highlighted in blue and pink color are registers.
Figure 2.feed_input and feed_output ports
In figure 3, Top right feed_input ports which are in blue color are interacting with feed_output port which is green in color which is placed on right edge of the block and we see there is one pipe line registers between them which is shown by yellow color line.
Figure 3. Interaction between feed_input and feed_output ports
Implementation:
Figure 4 After swapping feed_input and feed_output ports
As shown in figure 4, right side feed_input* ports are swapped with feed_output* ports to improve congestion by reducing criss-cross near the ports. After implementing this Congestion improved from 4.4% to 2.2%.
Case 2: Fixing Register to output (REG2OUT) setup violation-
At floorplan stage, Macro/IP placement, pin/port placement and power planning are performed. Only setup violations would be reported at placement stage, and hold violations will be reported after CTS stage in Pnr flow.
Path groups would be creating for the timing path. This enables reporting the timing results separately for each group, as well as set the options to focus the timing optimization on specific critical timing path groups.
Command (in Innovus) createBasicPathGroups – expanded can be used to create reg2reg, reg2Cgate, in2reg, reg2out, and in2out path groups.
Figure 5 : Different timing path groups
In the above figure path between flip-flop 1 and flip-flop 2 is called as reg2reg path and the path between flip-flop 2 and the output port is called as reg2out path. The reg2reg and reg2cgate default path groups are high effort path groups for optimization and the remaining path groups are low effort path groups. We can also create custom path groups according to the requirement. STA person would define constraints for I/O timing e.g. input delay, external delay etc. in Synopsys design constraint (SDC) file. Virtual clocks are defined to constraint the I/O timing paths. While doing PnR at block-level, I/O timing should be met even though internal timing has the higher priority. While doing timing optimization, tool will locate flip-flops according to timing requirements. As internal timing has higher priority, it might be possible that the tool would place flip-flops little away from the I/O port.
METHODOLOGY: -
To fix IO timing violation, we need to analyze the cause of the timing violations so that it can be fixed at the right stage. In this design, reg2out setup violation reported at placement stage. To meet the internal timing, tool placed those flops a little away from the output port. These violations need to be addressed at the placement stage itself. It was found that there was enough margin in previous timing paths (reg2reg). So flopbound could be created near the output port to fix the reg2out setup violations. Now there are few things that need to be identify: -
- location of the bound
- Size of the bound
- Which group should be created out of soft Guide, Guide, Region and Fence.
PnR tool, Innovus support four type of physical floorplan constraints namely soft guide, guide, region and fence.
Soft Guide: -
Soft guide constraints are similar to a guide constraint except that there are no fixed locations. The soft guide constraints are not as restrictive as a fence or region constraint, so some instances/cell might be placed further away if there are talking to other modules.
Guide:
It roughly defines an area within which the PnR tool place instances/cells of given module. Instances from modules that are not part of the guide might be placed inside the guide if the area of the guide is underutilized. Similarly, cells from the modules defined as the part of the guide may be placed outside if the area of the guide is over utilized. So basically, a guide is a soft indicant for the placement of the cell that may let some cells in and let some cells out.
Region:
The region constraint is also a loose constraint. However, it is stronger than the guide constraint. When region is created, it requires all the cells of the modules contained in the defined region to be placed in the specific region area. If there is an extra space in the region, other cells from the other modules might be placed inside.
Fence:
The fence constraint is the strongest floorplan constraint. It would not allow cells from other modules to be placed inside even if the area is underutilized. Additionally, the cells in a fenced area cannot be placed outside the defined fence area.
Now it is very important to choose which type of physical floorplan constraints need to be selected for creating the flop bound. Four types of floorplan constraints have been described above.
For this case, a region had been created as the flops would be placed together and at the specified location, moreover, other cells from other modules could also be placed inside region if some extra space would be there, hence the area can be utilized completely. The distance between the output port and the bound depends on the setup slack of violating timing paths and the margin available in the previous paths. Finding the appropriate location of the flip-flop bound is an iterative process.
First, the designer has to select one location based on parameter described earlier and checked the timing violations, if violation had not been reduced to the desired limit, then the flop bound would be shifted nearer to output port and then again check the timing report. If the slack of timing path is under the desired limit, that would be the final location of the flop bound. Cells those need to be pulled inside the bound would be selected before creating the region for the flops. Start by creating an instance group and assigning members to the group. Then region would be created using the group name.
Implementation:
In the design, a region of 48 flops has been created near the output port and it would prevent the spreading of those flops. Hence, reg2out setup violations can be reduced to the desired limit.
Specification of block
Dimension | 2126 x 976.68 |
Instances | 1.06 M |
Utilization | 17% |
Macro count | 220 |
Figure6: Interaction of 48flops with highlighted macros at first level
In the figure shown above, output port has been highlighted and placement of the registers has been shown before creating region.
Figure7. bound of 48 flops
In this figure, location of the region has been shown in the design. It makes sure that internal timing should not get violated while fixing reg2out timing violations. Reg2out setup timing violation reduced to desired limit after implementation of above methodology.
CONCLUSION:
Floorplanning is an important and critical step of the PnR flow. If the quality of floorplan is not remarkable, it can create different issues like congestion, timing violations, routing issues and IR drop etc. Few real design scenarios have been discussed here. Placement of feedthrough ports at the appropriate location is required to control the congestion and reduce the cell density at the specified area. For I/O timing violations, identifying real cause is essential to apply the required approach for fixing the violations.
Creating bound could be one approach to address the I/O timing violations at floorplan stage. This would also help in reducing the number of buffers/inverters inserted by the tool for the optimization in I/O timing paths. Size and location of the bound should be selected carefully by the designer. Multiple bounds can be created in the design as per the requirement.
“eInfochips, an Arrow company, is a leading global provider of product engineering and semiconductor design services. With over 500+ products developed and 40M deployments in 140 countries, eInfochips continues to fuel technological innovations in multiple verticals.
For more information connect with eInfochips today.”
REFERENCE
[1] https://support.cadence.com/
[2] A Practical Approach to VLSI System on Chip (SoC) Design. Author: Chakravarthi , Veena
[3] J. Bhasker and R. Chadha, Static timing analysis for nanometer designs: A practical approach. 2009.
ACKNOWLEDGEMENT
The authors would like to thank EInfochips Company for making the required resources accessible.
Authors
Jaya Patel Jaya Patel is a Physical design Engineer at eInfochips working in the backend design domain. She holds bachelor degree in Electronics and communication from Nirma University, Ahmedabad |
Atul Kumar Atul Kumar is Physical design engineer working with eInfochips(An Arrow company). He has more than 3 years of experience in ASIC design, includes various technology nodes like 7nm , 22nm and 16nm. He holds bachelor degree in Electronics and communication from IPEC. |
If you wish to download a copy of this white paper, click here
|
Related Articles
- Timing Optimization Technique Using Useful Skew in 5nm Technology Node
- Reducing Power Hot Spots through RTL optimization techniques
- Distorted Waveform Phenomena in 7nm Technology Node and its Impact on Signoff Timing Analysis
- Efficient SIMD and Algorithmic Optimization Techniques for H264 Decoder on Cortex A9
- How to exploit 17 tried and true DSP power optimization techniques for wireless applications
New Articles
- Quantum Readiness Considerations for Suppliers and Manufacturers
- A Rad Hard ASIC Design Approach: Triple Modular Redundancy (TMR)
- Early Interactive Short Isolation for Faster SoC Verification
- The Ideal Crypto Coprocessor with Root of Trust to Support Customer Complete Full Chip Evaluation: PUFcc gained SESIP and PSA Certified™ Level 3 RoT Component Certification
- Advanced Packaging and Chiplets Can Be for Everyone
Most Popular
- System Verilog Assertions Simplified
- System Verilog Macro: A Powerful Feature for Design Verification Projects
- UPF Constraint coding for SoC - A Case Study
- Dynamic Memory Allocation and Fragmentation in C and C++
- Enhancing VLSI Design Efficiency: Tackling Congestion and Shorts with Practical Approaches and PnR Tool (ICC2)
E-mail This Article | Printer-Friendly Page |