Adaptive Rate Control Algorithm
By Kunal Panchal & Harshit (Applied Micro)
Abstract
Ethernet has evolved over the last few years to provide high bandwidth over aggregate gigabit links, and next-generation telecommunication networks are also shifting towards packet-processed networks carried over Ethernet. Given the growing demand for faster and wider Ethernet networks, it has become imperative to study the factors limiting the bandwidth efficiency of these networks. This paper studies the IEEE 802.3 Reconciliation Sublayer (RS layer) and shows that there is further scope for efficiency improvement by keeping a check on one of the bandwidth-limiting factors, the overheads of the Ethernet frame. Several observations are presented to demonstrate the effect of an increase in overhead octets on data rate. The paper also describes the adopted solution, an “Adaptive Rate Control Algorithm” at the RS layer, to overcome this challenge. The algorithm improves efficiency under high bandwidth occupancy over existing deployments without disturbing legacy systems.
I. Introduction:
Today, Ethernet is successfully deployed for short-, medium- and long-haul communication. It is quickly replacing legacy data transmission systems in the world's telecommunications networks and accounts for a large share of this demanding traffic. It is therefore imperative to extract all the available resources of the Ethernet protocol to produce a highly efficient network.
The following sections present the OSI model and briefly describe the part of the IEEE 802.3 Ethernet protocol where this efficiency problem can be addressed.
Figure 1: IEEE 802.3 standard relationship to the ISO/IEC Open Systems Interconnection (OSI) reference model
The OSI reference model above emphasizes the lower two layers, the Data Link layer and the Physical Layer (PHY). The Media Independent Interface (MII) at the bottom of the Reconciliation Sublayer (RS layer) is a transparent signal interface interconnecting the MAC and the PHY. MII was originally designed for Fast Ethernet (100 Mbps) as a standard interface and, together with the RS layer, now supports faster rates such as 10G, 40G and 100G. The MII supporting 10 Gbps is known as XGMII (similarly, 40G uses XLGMII, 100G uses CGMII, and so on). The RS layer maps the signals provided at the MII to the PLS service primitives. It participates in link fault detection and is also responsible for aligning data frames.
A system communicating over Ethernet divides a stream of data into frames as shown in Figure 2, where each frame contains a source address, a destination address and a Frame Check Sequence (FCS), a 32-bit cyclic redundancy check value used to detect errors in a received MAC frame. The MAC encapsulates the MAC client frame into a packet that is transmitted serially to the PHY.
Figure 2: Internal Structure of Ethernet Packet
The Inter-Packet Gap (IPG), shown in Figure 2, is the period between the transmission of Ethernet packets; it provides inter-packet recovery time to the receiving stations and the physical medium. As per the IEEE 802.3 standard, the minimum IPG and minimum average IPG requirements must be preserved for a system to work correctly.
Note that the IPG is extended beyond its minimum requirement when all the devices on the network are idle, i.e. no device is transmitting. This extended IPG consumes considerable bandwidth over the link.
Studies across the globe reveal that it is impossible to achieve a 100% data rate on an Ethernet network because of its frame structure. The following sections discuss one of the prominent efficiency-limiting factors, the overhead of the Ethernet network, and the solution adopted for this problem.
II. Performance & Challenges Faced on Ethernet Based Network:
Bandwidth is the theoretical maximum amount of data that could be transmitted over the link, while throughput is the actual amount of data that is transmitted. Over an Ethernet network, data throughput remains uneven throughout the day: idles are comparatively few at the busiest (high data traffic) hours, and the number of idles increases as the data load on the network decreases.
An Ethernet network provides services to a large number of hosts. When many hosts are active, transmissions are deferred due to limited bandwidth, which in turn suppresses the efficiency of the deployed communication system; but when the network is lightly loaded, bandwidth remains unoccupied.
The factors characterizing the performance of an Ethernet network are efficiency and data rate.
At the RS layer, the entire Ethernet frame acts as payload and the only overhead is the IPG; hence the efficiency of an Ethernet network at the RS layer can be given as a function of packet size and IPG.
Efficiency at RS Layer (%) = 100 * (Packet Size) / (Packet Size + Average IPG)
where Average IPG is the average inter-packet gap between packets arriving at a host over a period of time.
And, Maximum theoretical efficiency (%) = 100 * (Max Packet Size) / (Max Packet Size + Min Average IPG)
As per IEEE 802.3 Section 4.4.2, the XGMII (10G media independent interface) has a minimum average IPG requirement of 12 octets and a minimum IPG requirement of 5 octets.
Max Ethernet Packet Size = 1526 octets (including preamble and SFD)
Max Efficiency (%) at RS Layer (10Gbps Ethernet network) = 100 * 1526 / (1526+12) = 99.2197 %
Data rate is the number of bits conveyed or processed per unit time. At the RS layer, it is a function of bandwidth, packet size and IPG, and can be given as:
Data Rate at RS Layer = Bandwidth * (Packet Size) / (Packet Size + Average IPG)
And, Maximum Data Rate at RS Layer = Bandwidth * (Max Packet Size) / (Max Packet Size + Min Average IPG)
Maximum Data Rate at RS Layer (10Gbps Ethernet network) = 10 * 1526 / (1526+12) = 9.92197 Gbps
Similarly, data rate and efficiency can also be calculated for higher bandwidths.
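For illustration, the two formulas above can be evaluated with a few lines of Python. This is a minimal sketch (not part of the paper's RS model) that reproduces the 10 Gbps figures quoted above and the 250-octet stimulus used later in Table 1.

```python
def rs_efficiency_pct(packet_size_octets: float, avg_ipg_octets: float) -> float:
    """Efficiency at RS layer (%) = 100 * packet size / (packet size + average IPG)."""
    return 100.0 * packet_size_octets / (packet_size_octets + avg_ipg_octets)


def rs_data_rate_gbps(bandwidth_gbps: float, packet_size_octets: float,
                      avg_ipg_octets: float) -> float:
    """Data rate at RS layer = bandwidth * packet size / (packet size + average IPG)."""
    return bandwidth_gbps * packet_size_octets / (packet_size_octets + avg_ipg_octets)


if __name__ == "__main__":
    # Maximum Ethernet packet of 1526 octets with the minimum average IPG of 12 octets.
    print(rs_efficiency_pct(1526, 12))      # ~99.2197 %
    print(rs_data_rate_gbps(10, 1526, 12))  # ~9.92197 Gbps
    # 250-octet packets at the minimum average IPG, as in Table 1.
    print(rs_data_rate_gbps(10, 250, 12))   # ~9.541985 Gbps
```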
III. Problem Faced on Ethernet Based Network
Over time, Ethernet has largely replaced competing wired LAN technologies such as Token Ring, FDDI and ARCNET. It has been refined to support higher data rates and longer link distances. Despite these advantages, its performance is hindered by a few limiting factors; one of the major factors, as discussed earlier, is the overhead of the Ethernet frame. In this section, a few observations from simulating an IEEE 802.3 10G RS model are shared to demonstrate the effect of an increase in overhead octets (IPG octets, since the RS layer is transparent to other overheads) on data rate.
An IEEE 802.3 10 Gbps RS model was developed for this study, and a snapshot of its simulation results is shared in Figure 3, where 1000 packets are driven. Here, the active data rate is the input data rate and the passive data rate is the output data rate. Note that the data rate is calculated as demonstrated in Section II.
Figure 3: 10G RS Model Simulation Result
The developed model was regressed extensively, and a few of its simulation results are shared in Table 1, where multiple packets with gradually increasing IPG are driven into the model. In order to observe the effect of the increase in extended IPGs on data rate, the packet size and number of packets are kept fixed. Active data rate refers to the input of the model and passive data rate to its output.
No. of Packets | IPG (Bytes) | Active Data Rate (Gbps) | Passive Data Rate (Gbps) |
20 | 12 | 9.541985 | 9.545628 |
20 | 20 | 9.259259 | 9.26269 |
20 | 50 | 8.333333 | 8.336112 |
20 | 75 | 7.692308 | 7.694675 |
20 | 100 | 7.142857 | 7.144899 |
20 | 150 | 6.25 | 6.251563 |
20 | 200 | 5.555556 | 5.55679 |
20 | 250 | 5 | 5.001 |
Table 1: RS Model Simulation Results with fixed packet length (250 bytes)
The simulation results in Table 1 show that the data rate is significantly reduced as the number of IPG octets increases.
As illustrated above, the RS layer maintains the effective (passive) data rate against the active data rate by sometimes inserting and sometimes deleting idles to align the Start control character on lane 0. The RS layer maintains a Deficit Idle Count (DIC) that represents the cumulative count of idle characters deleted or inserted; the decision to insert or delete is constrained by bounding the DIC to a minimum value of 0 and a maximum value of 3.
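The paper does not reproduce the DIC procedure itself; the following Python fragment is only a simplified sketch of a commonly described deficit-idle-count rule, assuming a 32-bit XGMII where each packet start must land on a 4-octet lane boundary.

```python
DIC_MAX = 3  # DIC is bounded to the range 0..3 as stated above


def align_ipg(ipg_octets: int, dic: int) -> tuple[int, int]:
    """Round one inter-packet gap to a 4-octet boundary; return (new IPG, new DIC)."""
    remainder = ipg_octets % 4
    if remainder == 0:
        return ipg_octets, dic  # start already lands on lane 0
    if dic + remainder <= DIC_MAX:
        # Delete 'remainder' idles; the deficit grows but stays within bounds.
        return ipg_octets - remainder, dic + remainder
    # Otherwise insert idles up to the next 4-octet boundary; the deficit shrinks.
    return ipg_octets + (4 - remainder), dic - (4 - remainder)
```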
The above observations show that an improvement in efficiency can be achieved by keeping a check on the number of idles transmitted. The following section presents the solution adopted to improve the efficiency of the Ethernet network.
IV. Adopted Solution: Adaptive Rate Control Algorithm
While investigating the efficiency of the Ethernet network, a progressively optimized solution was developed. This adopted solution, the “Adaptive Rate Control Algorithm”, which overcomes the bandwidth-limiting factors at the RS layer and improves the efficiency of the network, is discussed below.
A. Load Diagnosis:
When the network is heavily loaded, i.e. during peak hours, the allocated bandwidth is shared by a large number of active users. The link remains idle only for short durations, because a huge amount of data from many users has to be transmitted over the aggregate gigabit link. As a result, transmissions are deferred due to limited bandwidth, which in turn reduces the Quality of Service (QoS). When the network is lightly loaded, however, there are few active users, so the link remains idle for long durations, i.e. long IPGs.
The study focuses on an adaptive rate control algorithm developed to achieve high throughput over the allocated bandwidth and to improve QoS when there is a high load on the network. A traffic rate threshold value serves as a benchmark for the adopted solution; beyond it, the network is considered heavily loaded. The Adaptive Rate Control Algorithm operates when the input data rate is above the traffic rate threshold, as shown in Figure 4.
Figure 4: ARCA Operating Region
Selection of the threshold value is application dependent; for example, the threshold may be low for a commercial application and high for a network security application in which traffic should seldom be disturbed.
B. Implementation:
Note that the bandwidth consumed by idles is significant at peak hours. The Adaptive Rate Control Algorithm incorporated within the RS layer deletes idles when the network is heavily loaded, which allows a higher output data rate to be transmitted against the input data rate, improving network efficiency. The idles deleted are later compensated by inserting idles in order to maintain the effective data rate.
Conditions that must be taken care of while deleting idles are:
- The data frame must not be corrupted while deleting idles.
- The average and minimum IPG requirements of the system, which provide inter-packet recovery time, must be preserved; the purpose of the IPG must not be defeated.
- Apart from idles, no other information, such as signal or sequence ordered sets, should be deleted.
Figure 5: An Engine
A sample Adaptive Rate Control Algorithm has been designed for reference. It queues the incoming data up to a FIFO_DEPTH of 4/8/16 (each depth entry can be N bytes). In parallel, the algorithm calculates the input data rate to determine the need to insert or delete idles. Once the traffic rate threshold is known for the network, and if the calculated input data rate is above that threshold (the operating region shown in Figure 4), an idle insertion/deletion circuit detects the right opportunity to insert or delete idles. The need for idle insertion or deletion is governed by the Resolution Logic, discussed below with an example.
The depth of the queue, FIFO_DEPTH, is network dependent and is discussed in the following sections. The queue, the Resolution Logic and the idle insertion/deletion circuitry together are referred to as an Engine, as shown in Figure 5.
A flowchart of the algorithm is shown in Figure 6.
Figure 6 : ARCA Flowchart
The Resolution Logic shown in Figure 6 decides whether to insert or delete idles based on the following conditions (a code sketch follows the list):
- Idle deletion: the average input IPG (cal_avg_ipg) is greater than the minimum IPG requirement (min_ipg).
- Idle insertion: Depth-1 idles have been deleted from the queue and cal_avg_ipg is greater than the minimum average IPG requirement (min_avg_ipg).
- No operation: Depth-1 idles have been deleted from the queue and cal_avg_ipg <= min_avg_ipg.
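The Python sketch below restates these three conditions. The signal names (cal_avg_ipg, min_ipg, min_avg_ipg) come from the list above, while the deleted-idle counter and its comparison against FIFO_DEPTH - 1 are assumptions inferred from the operation described next; the authoritative definition is the flowchart in Figure 6.

```python
def resolution_logic(cal_avg_ipg: float, min_ipg: float, min_avg_ipg: float,
                     idles_deleted: int, fifo_depth: int) -> str:
    """Return the action for the current cycle: 'DELETE', 'INSERT' or 'NOP'."""
    if idles_deleted < fifo_depth - 1 and cal_avg_ipg > min_ipg:
        return "DELETE"   # gap is still above the minimum IPG requirement
    if idles_deleted == fifo_depth - 1 and cal_avg_ipg > min_avg_ipg:
        return "INSERT"   # compensate previously deleted idles
    return "NOP"          # average IPG is already at its floor
```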
The algorithm's operation is described below; a sketch of the sliding-window measurement follows this description.
- The algorithm continuously calculates the input data rate (cal_avg_rate) and the average IPG (cal_avg_ipg) over a sliding window of length T (time).
- If cal_avg_rate is greater than the traffic rate threshold value (traffic_rate_thres) and the minimum IPG requirement (min_ipg) between packets is satisfied, the algorithm checks for idle octets in the queue (shown in Figure 5: An Engine).
- If the queue is completely filled with consecutive idles, e.g. 4 idles in a queue of depth 4, and cal_avg_ipg > min_ipg, idle deletion takes place: one of the 4 consecutive idles in the queue is deleted and the remainder are sent to the output. While doing this, the conditions for idle deletion described above are respected.
- From the next cycle it checks for 3 consecutive idles in the queue and deletes one of them as per the same logic. It keeps deleting one idle per iteration until 3 (Depth-1) idles have been deleted from the queue.
- To compensate for the deleted idles, the algorithm then looks for the opportunity to insert idles into the traffic when cal_avg_ipg > min_avg_ipg. Once all 3 (Depth-1) deleted idles have been re-inserted, the idle deletion/insertion process continues. Until all deleted idles are compensated, no further idle deletion takes place.
Note: if cal_avg_rate drops below traffic_rate_thres during this process, compensating the traffic rate by re-inserting the deleted idles is not required, as the data frames are already spaced apart by enough idles.
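As one possible way to realize the first step above, the sketch below keeps a sliding window of per-cycle samples and derives cal_avg_rate and cal_avg_ipg from it. The 4-octets-per-cycle transfer width and the bookkeeping structure are assumptions for illustration only, since the paper does not specify a particular measurement circuit.

```python
from collections import deque


class WindowMonitor:
    """Tracks cal_avg_rate and cal_avg_ipg over a sliding window of T cycles."""

    def __init__(self, window_cycles: int, octets_per_cycle: int = 4,
                 line_rate_gbps: float = 10.0):
        # Each sample is (data_octets, idle_octets, packet_starts) for one cycle.
        self.samples = deque(maxlen=window_cycles)
        self.octets_per_cycle = octets_per_cycle
        self.line_rate_gbps = line_rate_gbps

    def sample(self, data_octets: int, idle_octets: int, packet_starts: int) -> None:
        """Record one MII transfer cycle."""
        self.samples.append((data_octets, idle_octets, packet_starts))

    def cal_avg_rate(self) -> float:
        """Input data rate over the window, in Gbps."""
        total = len(self.samples) * self.octets_per_cycle
        data = sum(d for d, _, _ in self.samples)
        return self.line_rate_gbps * data / total if total else 0.0

    def cal_avg_ipg(self) -> float:
        """Average idle octets per packet over the window."""
        idles = sum(i for _, i, _ in self.samples)
        packets = sum(p for _, _, p in self.samples)
        return idles / packets if packets else float("inf")
```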
The same exercise can also be done in a cascaded fashion to further improve the throughput of a network: cascaded Engines further improve efficiency under higher bandwidth occupancy.
C. Effect on Performance:
An adaptive rate control algorithm with two Engines, each with FIFO_DEPTH = 4, is integrated with the IEEE 802.3 10G RS model as shown in Figure 7.
Figure 7: Integrated RS Model with Cascaded Engines
A few observations from simulating the integrated 10G RS model shown above are shared in Table 2, where multiple packets with gradually increasing IPG (i.e. decreasing traffic rate) are driven into the model. The data rate and the improved performance of the system with the adopted solution are also tabulated.
Note: the same stimulus as in Section III is driven, and the input data rate is above the traffic rate threshold value.
No. of Packets | IPG (Bytes) | Input Data Rate (Gbps) | Input Efficiency (%) | Output Data Rate (Gbps) | Output Efficiency (%) | Improved Performance (Mbps) |
20 | 12 | 9.541985 | 95.41985 | 9.567547 | 95.67547 | 25.562 |
20 | 20 | 9.259259 | 92.59259 | 9.276438 | 92.76438 | 17.179 |
20 | 50 | 8.333333 | 83.33333 | 8.352823 | 83.52823 | 19.49 |
20 | 75 | 7.692308 | 76.92308 | 7.713669 | 77.13669 | 21.361 |
20 | 100 | 7.142857 | 71.42857 | 7.153076 | 71.53076 | 10.219 |
20 | 150 | 6.25 | 62.5 | 6.260957 | 62.60957 | 10.957 |
20 | 200 | 5.555556 | 55.55556 | 5.566689 | 55.66689 | 11.133 |
20 | 250 | 5 | 50 | 5.009016 | 50.09016 | 9.016 |
Table 2: Simulation results of the 10G RS model implementing the Adaptive Rate Control Algorithm
The above observations show a performance improvement of up to 25 Mbps at heavy data load for a single node. Aggregating multiple nodes over a network using such adaptive rate control algorithms can result in a significant improvement in overall network performance. Efficiency improvement figures over an entire network are under further study.
Note that this solution is not restricted to any rate-specific RS layer; it can be used at higher bandwidths as well.
Bar graphs depicting the notable improvement in output data rate over input data rate with the adopted solution are plotted below.
Figure 8: Bar Graphs: Input Vs Output Data Rate (RS Layer + Adopted solution)
V. Effects & Trade-offs:
A notable performance improvement was observed in the previous section, where FIFO_DEPTH was fixed at 4. This section deals with the trade-offs of the adopted solution, such as power and hardware consumption and the delay added to the Ethernet network.
Larger FIFO depth and more Engines => higher power consumption, additional network delay and a larger gate count.
Hence, the number of Engines and the FIFO depth should not be selected heedlessly. We have undertaken a study in which the effect of increasing FIFO_DEPTH on performance is observed for data rates above the traffic rate threshold value.
No. of Packets | FIFO Depth | IPG (Bytes) | Input Data Rate (Gbps) | Input Efficiency (%) | Output Data Rate (Gbps) | Output Efficiency (%) | Improved Performance (Mbps) |
20 | 4 | 12 | 9.541985 | 95.41985 | 9.567547 | 95.67547 | 25.562 |
20 | 8 | 12 | 9.541985 | 95.41985 | 9.545628 | 95.45628 | 3.643 |
20 | 16 | 12 | 9.541985 | 95.41985 | 9.545628 | 95.45628 | 3.643 |
Table 3: Effect of Increase in FIFO Depth under high load
Table 3 shows that under high traffic load, the performance improvement reduces as FIFO_DEPTH increases.
No. of Packets | FIFO Depth | IPG (Bytes) | Input Data Rate (Gbps) | Input Efficiency (%) | Output Data Rate (Gbps) | Output Efficiency (%) | Improved Performance (Mbps) |
20 | 4 | 50 | 8.333333 | 83.33333 | 8.352823 | 83.52823 | 19.49 |
20 | 8 | 50 | 8.333333 | 83.33333 | 8.380825 | 83.80825 | 47.492 |
20 | 16 | 50 | 8.333333 | 83.33333 | 8.336112 | 83.36112 | 2.779 |
Table 4: Effect of Increase in FIFO Depth under average load
Table 4 shows that under average traffic load, the performance improvement is highest when FIFO_DEPTH is 8 and reduces with a further increase in depth.
No. of Packets | FIFO Depth | IPG (Bytes) | Input Data Rate (Gbps) | Input Efficiency (%) | Output Data Rate (Gbps) | Output Efficiency (%) | Improved Performance (Mbps) |
20 | 4 | 100 | 7.142857 | 71.42857 | 7.153076 | 71.53076 | 10.219 |
20 | 8 | 100 | 7.142857 | 71.42857 | 7.181844 | 71.81844 | 38.987 |
20 | 16 | 100 | 7.142857 | 71.42857 | 7.215007 | 72.15007 | 72.15 |
Table 5: Effect of Increase in FIFO Depth under low loads
The effect of increasing FIFO_DEPTH on performance under low traffic load, obtained by simulating the integrated 10G RS model, is summarized in Table 5. It shows that there is wide scope for performance improvement by increasing FIFO_DEPTH at low loads.
FIFO depth is thus one of the most important factors characterizing the performance of the system, and its optimum depends on the traffic load conditions. A thorough study of the network is needed before selecting FIFO_DEPTH.
VI. Looking Forward
The number of Engines is another factor determining the performance of the adopted solution; it will be studied in a subsequent paper.
Also, when this mechanism is enabled across a network, the node with the smallest FIFO depth will be the limiting factor.
VII. Conclusion:
This paper concludes that overhead is one of the major bandwidth-limiting factors in an Ethernet network, and that the Adaptive Rate Control Algorithm is a solution adopted to overcome this factor at the RS layer without disturbing legacy systems. Several other solutions may also be possible and would be of broad benefit.
Implementation of the adopted solution is not restricted to MAC-to-PHY transmission. It can also be implemented at repeaters (extended up to the data link layer) in the physical medium. In an Ethernet network there can be several such repeaters, and each can play a significant role in improving bandwidth efficiency over the aggregate gigabit link.
VIII. References
[1] IEEE Std 802.3-2008, "IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks".
[2] Spirent White Paper, "How to Test 10 Gigabit Ethernet Performance".
[3] J. F. Shoch and J. A. Hupp, "Measured Performance of an Ethernet Local Network", Xerox Corporation, 1980.
[4] L. Raptis and K. Vaxevanakis, "Ethernet as a Carrier Grade Technology: Developments and Innovations", Universidad Carlos III de Madrid, Spain.
[5] Wikipedia, "Ethernet", http://en.wikipedia.org/wiki/Ethernet.