High Bandwidth Memory (HBM) Model & Verification IP Implementation - Beginner's guide
By Amith Nagaraj, Anand V Kulkarni, Atria Logic Pvt Ltd

INTRODUCTION
With the ever-increasing need for very high operating frequencies in graphics (GPU) and general-purpose (CPU) processors, limited memory bandwidth becomes a bottleneck that prevents a system from reaching its maximum performance. In addition, the power consumption and form factor of the memory devices play significant roles in optimal and efficient solutions. DDR5 (GDDR5) memory technology has been riddled with issues like
High Bandwidth Memory (HBM) is a high-performance 3D-stacked DRAM. The technology stacks DRAM chips (memory dies) vertically on a high-speed logic layer and connects them with a vertical interconnect technology called TSV (through-silicon via), which reduces the connectivity impedance and thereby the total power consumption. The HBM DRAM uses an ultra-wide interface architecture to achieve high-bandwidth, high-speed, low-power operation. It is optimized for high-bandwidth operation to a stack of multiple DRAM devices across a number of independent interfaces called channels.

In this article, we discuss the implementation details of an HBM memory model and verification IP, along with an example testbench environment involving an HBM controller.

The Atria Logic High Bandwidth Memory (HBM) Verification IP is a SystemVerilog (SV) based IP that can be used to verify an HBM memory controller design. The VIP is pre-verified and configurable.

FEATURES
Supported Features
This version of the HBM VIP is compliant with the JEDEC JESD235A specification. The following are the features supported by the VIP:
Unsupported Features The following features, present in the JEDEC JESD235A specification, are currently not supported by the VIP:
NOTE: Since the mode registers can be used to enable DWORD training, it is up to the user to ensure that DWORD training is not enabled at any point. If DWORD training is enabled through the mode registers, the READ and WRITE operations of the HBM are suspended until the DWORD mode is disabled. In the DWORD training mode, the VIP expects only READ, WRITE, and NOP commands to be sent on the address bus.

VIP ARCHITECTURE
The HBM VIP has been implemented using SV classes. The class descriptions are part of a package that is imported into a top module. The module then instantiates the classes as needed. The module itself needs to be instantiated in the verification environment along with the HBM memory controller design unit that needs to be verified.

An HBM device can consist of up to 8 channels. The VIP provides the implementation of a single channel in the form of a class. Multiple objects of this class can be created to mimic a multi-channel HBM device. Each HBM channel operates independently and has its own interface.

HBM operation consists of the following phases:

Power-up phase
The HBM operation begins with the power-up phase. The power-up phase involves issuing a reset. The HBM VIP waits for an initial reset. Until this reset is issued, the channel is stuck in the power-up phase. The channel moves on to the initialization phase once it detects the reset.

Initialization phase
This phase is entered as soon as a reset is detected. The reset may occur during the power-up phase or it may be an intermittent reset, that is, one issued during normal operation. Upon entry into this phase, the device memory is wiped. If the phase is entered after an intermittent reset, the device’s mode registers are initialized to default values. In this phase, the model implements checks for the initialization sequence. It also drives various output signals to valid levels as per the initialization protocol.

Configuration phase
Each HBM channel contains 16 mode registers.
These registers set the device configuration and are modified by means of MRS commands. The memory controller can write to the registers but cannot read from them. While some of the fields of these registers have default values, not all of them do. The register fields (that are not reserved) without default values are initialized to ‘x’ after every reset. These fields need to be configured before the memory is accessed. When an MRS command is issued to the HBM, the model checks the validity of the configuration at the time of updating the mode registers. If the user inputs an invalid configuration, the model indicates an error.

Normal operation
The JESD235A specification details the protocol to be followed during normal operation. The model complies with the protocol and, if it detects a protocol violation from the memory controller, it reports an error. The protocol checks are mainly timing checks between commands, as well as setup and hold time checks and pulse width checks for the inputs.

Clock frequency change
The model supports this feature and implements checks for this protocol. Once the clock frequency has been changed, the user may move to either the configuration phase or normal operation. It is not mandatory for the user to re-configure the mode registers. The mode register values are retained even after the clock frequency has been changed. In fact, these values are only changed when an MRS command or a reset is detected.

Intermittent reset
The user may issue a reset during normal operation. Whenever a reset is detected, all ongoing processes are stopped and the model resets the values of its internal flags (initialization_done and programming_done) to zero. It then moves on to the initialization phase. The model checks the duration of the reset as part of the initialization sequence protocol checks.

THE HBM CHANNEL ARCHITECTURE
As previously mentioned, the HBM channel is implemented as a SystemVerilog class.
The class properties are of two types:
The class methods are also of two types:
Figure 1 provides a high-level architecture of the HBM channel:

Figure 1: HBM High-level Channel Architecture

The VIP config file is common to the entire VIP and is accessed by all of the channels. None of the other blocks, including the interface, are shared between channels. In the channel architecture, the various blocks are implemented as SystemVerilog tasks or functions. These blocks are described in the following paragraphs.

Command and Timing Block
The command and timing block has several sub-blocks.
Setup and Hold Time Checks
This block implements setup and hold time checks for all of the channel interface inputs.

MRS Block
The MRS block is invoked by the Column Command Decode Block whenever an MRS command is detected. The MRS block then checks the register to which the command was issued and, after waiting for the TMOD period, updates the registers in the Global Register Block with the data that was provided as part of the command.

Activate and Precharge Block
The Activate and Precharge block is called by the Row Command Decode Block whenever an ACT, PRE, or PREALL command is detected. Depending on the command, the block makes the appropriate changes to the row_enable variable (which keeps track of all the rows that are currently open).

Read and Write Block
The read and write block is divided into two sub-blocks: the read block and the write block.

Read Block
Once a read command (with or without auto-precharge) is detected, the block looks up the current configuration of the channel from the global registers block and checks if the bank to which the command was issued has an open row. If no rows are open, the block indicates an error. Otherwise, it implements the read protocol. If the read command is issued with the auto-precharge bit set, then the read block updates the row_enable variable. If read parity is enabled, the block sends out read parity after parity_latency time has elapsed from the time at which read data transmission starts. If ECC is enabled, the read block sends out the stored ECC data on the ECC/DM bus. Once the current read transaction has been completed, high impedance values are driven on the bus, provided that no other read transaction (pre-conditioning or data transmission) is in progress.

Write Block
Once a write command is detected (with or without auto-precharge), the block looks up the current channel configuration from the global registers block and checks if a row is open in the bank to which the command was issued.
If no row is open, the block indicates an error; otherwise, it implements the write protocol. If the auto-precharge bit is set in the write command, then the block updates the row_enable variable. Once the write data is received, it is processed as per the current mode register configuration. It is then stored into the memory. If ECC is enabled, then the block stores the received ECC into memory. The block then waits for the write data parity, if enabled, and checks for parity mismatch. If any mismatch is found, the block drives the appropriate DERR signals to 1. Refresh Block The refresh block is a set of countdown timers which count down from 9*TREFI. Each bank has a separate timer. The timer of a bank is reset whenever a REF or REFSB command (to that bank) is detected or whenever the HBM enters self-refresh mode. In case the timer of a bank counts down to 0, an error is recorded and all of the data in the bank is erased. Low Power Modes Block The low-power modes block is divided into two sub-blocks: the power down mode block and the self-refresh mode block. The low-power modes block is triggered whenever the CKE signal goes low during normal operation. Depending on the command on the row bus when CKE goes low, one of the two sub-blocks is invoked. Power down Mode Block The power down mode block is called whenever there is an RNOP command on the bus when CKE goes low during normal operation. Upon entry into power down mode, the model implements a number of checks. These checks, together, cover all of the protocol points provided in the JEDEC JESD235A specification for power down mode. The protocol checks include checks for clock stopping and clock frequency change as well. If clock frequency is changed during this mode, at the time of power down exit, the new frequency is recorded and the timing parameter values are updated. The power down mode block exits when the tXP time has elapsed after PDX command – NOP on the buses with CKE high. 
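The per-bank countdown behaviour of the Refresh Block described above can be sketched in a few lines. This is only a behavioural illustration in Python, not the VIP's SystemVerilog code. The class name `RefreshWatchdog`, the choice of clock cycles as the unit for tREFI, the treatment of REF as reloading every bank, and the decision to hold an expired timer at 0 until the next refresh are all assumptions made for the example.

```python
class RefreshWatchdog:
    """Per-bank countdown timers, each loaded with 9 * tREFI (hypothetical model)."""

    def __init__(self, num_banks, trefi_cycles):
        # trefi_cycles: tREFI expressed in clock cycles (illustrative unit).
        self.reload = 9 * trefi_cycles
        self.timers = [self.reload] * num_banks
        self.errors = []  # banks whose timer has expired

    def refresh(self, bank=None):
        """Reload one bank's timer (REFSB), or every timer when bank is None
        (assumed here for REF and for self-refresh entry)."""
        banks = range(len(self.timers)) if bank is None else [bank]
        for b in banks:
            self.timers[b] = self.reload

    def tick(self, memory):
        """Advance one clock cycle; on expiry record an error and erase the bank."""
        for b, remaining in enumerate(self.timers):
            if remaining == 0:
                continue  # already expired; held at 0 until refreshed (assumption)
            self.timers[b] = remaining - 1
            if self.timers[b] == 0:
                self.errors.append(b)
                memory[b].clear()
```

Here `memory` is assumed to be a list with one dictionary per bank. A bank that receives a refresh in time keeps its contents, while a starved bank is logged in `errors` and emptied.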
Self-refresh Mode Block The self-refresh mode block is called whenever there is a refresh (REF/REFSB) command on the bus when CKE goes low during normal operation. In this block, the model implements checks to cover all of the protocol points for self-refresh mode. The specification allows for the clock frequency to be changed in this mode as well. If clock frequency change is detected, the new frequency is recorded and the timing parameter values are updated upon self-refresh exit. The block exits when the tXS time has elapsed after SRX command – NOP on the buses with CKE high. Interface The HBM interface consists of all of the signals that are part of the channel’s interface. For each channel, a separate instance of the interface is instantiated. The specification specifies multiple bi-directional buses (for data, dbi, dm/ecc) and several uni-directional buses (address/command, cke, etc.). All of the bi-directional buses have been split into two uni-directional buses in the interface. The top module, however, contains only bi-directional buses and the interface’s signals have been assigned appropriately. The AERR signal has also been split into row_err and col_err signals in the interface and the AERR signal has been assigned to the logical OR of these signals. Memory The memory is implemented in the form of a multi-dimensional dynamic array. The size of the memory will depend on the VIP configuration. If the device is configured as pseudo-channel mode, the memory will have two pseudo-channels, with each pseudo-channel having 8 or 16 banks. In case of legacy mode, the memory will not be divided and will have 8 or 16 banks in total. Each bank is further divided into a number of rows and each row will have 64 columns. In pseudo-channel mode, each column will store 128 bits of data along with ECC, if supported. In legacy mode, each column will store 256 bits of data along with ECC, if supported. 
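As a rough illustration of the memory organisation just described, the following Python sketch derives the array dimensions for a channel and the resulting number of data bits. The function names and the `rows_per_bank` parameter are hypothetical (the text only says each bank has "a number of rows"), and ECC storage is left out.

```python
def channel_geometry(pseudo_channel_mode, banks, rows_per_bank):
    """Return (pseudo_channels, banks, rows, columns, bits_per_column).

    rows_per_bank is an illustrative parameter; the real value depends on
    the VIP configuration.
    """
    if banks not in (8, 16):
        raise ValueError("banks must be 8 or 16")
    pseudo_channels = 2 if pseudo_channel_mode else 1   # legacy mode is undivided
    bits_per_column = 128 if pseudo_channel_mode else 256
    columns_per_row = 64
    return (pseudo_channels, banks, rows_per_bank, columns_per_row, bits_per_column)


def data_bits(geometry):
    """Total data bits (excluding ECC) for a geometry tuple."""
    total = 1
    for dim in geometry:
        total *= dim
    return total
```

With the same bank and row counts, the two modes hold the same number of data bits in this sketch: pseudo-channel mode halves the column width but doubles the number of independently addressed halves.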
A burst length setting of 2 will involve data from one column whereas a burst length setting of 4 will involve data from two consecutive columns.

Functional Coverage

Functional Coverage Using SystemVerilog “covergroup”
Functional coverage using SystemVerilog cover groups is implemented as a separate class. This keeps track of back-to-back commands ({row, row}, {row, column}, {column, row}, {column, column}) and is useful for understanding the spacing between the commands. The cover points include row commands, column commands, and back-to-back non-NOP command combinations crossed with the number of clock cycles between the back-to-back commands.

Functional Coverage Using SVA “cover property”
Functional coverage has also been implemented in the interface using SVA cover property statements. A large number of SVA properties have been written to cover various possible input scenarios and all of these properties have an associated cover property statement. These cover property statements are exhaustive and provide a better basis for measuring functional coverage than the cover groups detailed above. The properties are implemented using a number of interface signals (each of size 1 bit) that are not part of the HBM specification. These signals, driven by the Row Command Decode Block and Column Command Decode Block, indicate the latest command recorded.

USING THE VIP IN A TESTBENCH
This section provides details on using the VIP to verify a memory controller. It has sections that provide details of VIP configuration and instantiation, the log files generated by the VIP, and the testbench architecture used to verify Atria Logic’s HBM Memory Controller. Please refer to https://www.design-reuse.com/articles/41186/design-considerations-for-high-bandwidth-memory-controller.html for memory controller details.

INITIALIZATION SEQUENCE TIMING PARAMETER VALUES
The VIP configuration file contains the set of timing parameters that are used during the initialization sequence.
These parameters are: tINIT1, tINIT2, tINIT3, tINIT4, tINIT5, tINIT6, and tPWRESET. They are defined in the JESD235A specification in the section on Initialization. The specification provides values for these parameters. However, in the interest of speeding up simulation, the VIP allows the user to modify these values. Note that it is the responsibility of the user to choose values that maintain the relative magnitudes given in the specification. Of the seven timing parameters, some have units of clock cycles while the others use an analog delay. Those that use an analog delay are to be specified in nanoseconds.

COMPILER DIRECTIVES
Apart from the timing parameter values mentioned in the previous section, the VIP configuration file can also be used to enter compiler directives. These directives configure the memory size and mode of operation and also help the user to configure the functional coverage aspects of the VIP. It is not mandatory to enter the compiler directives in the configuration file; they can instead be declared from the command line during compilation. However, the directives need to be used to ensure proper behavior of the VIP.

VIP TOP MODULE PARAMETERIZATION
The instantiation of the VIP is covered later. For now, it is sufficient to know that the VIP has a top module that needs to be instantiated. This module has 9 parameters that can be configured during instantiation. Each parameter has a default value.

Protocol Analyzer vs Memory Model
The VIP can be configured for use as either a memory model or a protocol analyzer through the parameter PROTCOL_ANALYSER. By default, this parameter has the value 0, i.e., the VIP is configured as a memory model. When this parameter is set to 1, the VIP works as a protocol analyzer. When used as a protocol analyzer, the VIP does not respond to memory access commands (READ/WRITE). However, it still implements all of the timing checks.
The protocol analyzer currently gives a fatal error and stops the simulation as soon as it detects a protocol violation. In future versions, an option will be provided that allows the analyzer to continue working even after it detects errors, subject to an upper limit. This option can be provided for the model even when it is not configured as a protocol analyzer.

Initial Mode Register Values
The VIP allows the user to set the values of the mode registers of the memory channels once, upon initial reset, without having to issue MRS commands. This is done through a set of parameters named MODE_REGISTER_x, where x takes values from 0 to 8. These parameters are declared as arrays and the size of each array is equal to the number of channels. The values are for mode registers 0 to 8 as defined in the HBM specification. The parameters are 8 bits wide each (declared as 7:0) and the LSB of each parameter corresponds to the LSB of the mode register that it is associated with. Note that these values will not be loaded into the mode registers of the HBM after intermittent resets. In that case, if a mode register field does not have a default value, it is given the value ‘x’.

USING THE VIP IN A TESTBENCH TO VERIFY HBM MEMORY CONTROLLER
The VIP has been developed to verify an HBM memory controller. The architecture of the testbench used to verify Atria Logic’s HBM Memory Controller using the VIP is shown in Figure 2. The testbench instantiates the memory controller RTL design along with the HBM VIP in the top module and uses the verification environment as a PHY layer to enable data transfer between the controller and the memory model. The testbench is implemented using SV and UVM. The verification environment is explained in the following sections.

Figure 2: HBM Memory Controller Testbench Architecture

- UI_ENV
The UI_ENV speaks with the Memory Controller’s user interface.
It is responsible for providing commands to the controller and for configuring both the memory controller’s registers and the mode registers of the memory itself (via the controller).

The User Interface
The memory controller’s user interface is divided into two components:
- UI_AGENT The UI_AGENT is responsible for providing the READ, WRITE, and NOP commands to the memory controller. It only communicates with the controller’s FIFO interface. The agent has a driver, a monitor, and a sequencer, and it is connected to the UI_SCOREBOARD in the UI_ENV. Data is collected by the monitor and sent to the scoreboard in every clock cycle. - REG_AGENT The REG_AGENT is responsible for configuring the memory controller’s registers, for disabling clocks during low power modes (and re-enabling them), for changing clock frequency, and for updating the values of the timing parameters after every clock frequency change. This agent has a driver, a monitor, and a sequencer, and it is connected to the REG_SCOREBOARD in the UI_ENV. Data is collected by the monitor and sent to the scoreboard in every clock cycle. - REG_DRIVER The driver follows the memory controller’s register interface protocol. Upon reset, the driver first updates all of the timing parameter values. It then configures the mode registers of the memory (via the corresponding register set in the memory controller). In case any of the timing parameters need to be updated after configuring the mode registers, the driver does so. Details of the memory controller’s registers can be found in the memory controller’s documentation. Once the driver writes into the mode register set inside the memory controller, it instructs the controller to write those values into the mode registers of the HBM device. The driver is also responsible for putting the HBM device into low power modes. It again does so by instructing the memory controller through a register. To come out of low power modes, the driver again instructs the controller through a register. Note that the memory controller’s register interface protocol says that commands should not be issued on consecutive clock cycles. Between two commands (read-read, read-write, write-write, write-read), there should be a NOP. 
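The NOP-spacing rule of the register interface can be illustrated with a small Python sketch. The helper names and the use of plain strings for commands are assumptions made for the example; the actual driver is a UVM component written in SystemVerilog.

```python
def with_nops(commands):
    """Insert a NOP between every pair of consecutive commands."""
    out = []
    for cmd in commands:
        if out:
            out.append("NOP")  # separate this command from the previous one
        out.append(cmd)
    return out


def violates_spacing(stream):
    """True if two non-NOP commands sit on consecutive clock cycles."""
    return any(a != "NOP" and b != "NOP" for a, b in zip(stream, stream[1:]))
```

A sequence could build its command stream with `with_nops`, while a monitor-style check such as `violates_spacing` could flag streams that break the rule.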
Also note that while the HBM’s mode registers cannot be read from, the memory controller provides an option for reading its registers. - UI_SCOREBOARD: The scoreboard is responsible for only data integrity checks. The memory controller sends additional error bits with the read data. If either there is a data mismatch or if any of the error bits are set to 1, the scoreboard indicates an error. - REG_SCOREBOARD: The scoreboard is responsible for only data integrity checks with respect to the memory controller’s registers. If the scoreboard detects an update command, it updates the expected values of the mode register based on previous write commands. If the scoreboard detects a read command, it checks the values of the register being read from in the next clock cycle. Note that the memory controller provides the register read data one clock cycle after receiving the command. Further note that the scoreboard only does data integrity checks for those registers whose values cannot be modified by the memory controller. - PHY_ENV The PHY_ENV works as a PHY layer. It is responsible for facilitating data transfer between the model and the memory controller. It is also responsible for implementing the initialization sequence as per the JEDEC JESD235A specification. PHY interfaces The PHY_ENV is connected to two interfaces:
The memory controller’s output interface has data buses whose width is twice the width of the HBM channel interface buses. This is because the memory controller operates at SDR whereas the HBM device operates at DDR. The lower half of each bus corresponds to the rising-edge data whereas the upper half corresponds to the falling-edge data. When operating in pseudo-channel mode, the first and third quarters of the bus correspond to pseudo-channel 0 while the second and fourth quarters carry the data of pseudo-channel 1. Whenever the memory controller wants to write data into the HBM, it sets the wdata_valid signal high. There is also the rdata_valid signal, which, when high, indicates to the controller that read data is being transmitted from the HBM.

The PHY_ENV is responsible for converting data from DDR to SDR and back again as and when necessary. To accomplish this, the PHY_ENV has two “agents”.

- HBM_TO_MC_AGENT
This “agent” is responsible for receiving read data from the HBM and sending it to the memory controller. It has a driver and nothing else.

- HBM_TO_MC_DRIVER
The driver monitors the PHY interfaces and, whenever it detects a READ command, prepares to receive read data, read DBI, and read ECC, as well as read parity, from the HBM. After it collects all of the data associated with a particular read command, it converts the DDR data to SDR data and then sends it to the memory controller. If there is a non-zero parity latency, it sends the read data, waits to receive the read parity, converts the received parity into SDR data, and then sends the parity data to the memory controller.

- MC_TO_HBM_AGENT
This “agent” is responsible for:
The agent uses a driver and a monitor to perform the above tasks.

- MC_TO_HBM_DRIVER
Upon reset, the driver implements the initialization sequence. It then hands over control to the memory controller. At the end of initialization, the driver indicates to the memory controller that calibration is complete by means of a calibration_done signal. The driver converts the SDR command/address signals to DDR in every clock cycle and forwards the command/address to the memory model. Whenever the driver detects a WRITE command, it waits to receive data from the memory controller, converts it into DDR, and transmits it to the HBM.

- MC_TO_HBM_MONITOR
The monitor checks that the HBM model implements the initialization sequence as per protocol. Whenever it detects a WRITE command, the monitor waits for write_latency and checks that the wdata_valid signal is generated and held high for the appropriate number of clock cycles. Further, the monitor also indicates an error whenever the HBM drives one of the DERR or AERR signals to ‘1’.

MEMORY MODEL INSTANTIATION
The HBM VIP provides a separate interface for each channel, a set of global signals, an IEEE port, and a DA port. Currently, the IEEE port (excluding the WRST_n signal), the DA port, the CATTRIP signal, and the TEMP signal are not used. Therefore, these signals may be left unconnected. However, the signals for each channel interface should be connected. On the VIP side, the channel signals are declared as an array and, therefore, the model instance expects an array as input for the channel interface signals. If the memory controller talks to just one channel, as in the case of Atria Logic’s HBM memory controller, the channel signals should still be declared as an array, albeit of size 1. The HBM VIP can be instantiated like a design unit and can be parameterized as previously mentioned.
The WRST_n signal of the IEEE port should be connected and it should be driven to valid levels as per the protocol, at least during the initialization phase. The model checks the WRST_n signal and will indicate an error if it is not driven to valid levels. The signal names are identical to the ones provided in the JEDEC JESD235A specification.

- RESET_ENV
The testbench includes a separate environment for issuing reset. The environment contains an agent that includes just a driver. The driver issues resets under the control of a virtual sequence.

- Test Cases for the Verification of Memory Controller
Note that in the verification environment described in the previous sections, there are two agents that speak with the memory controller on the user interface side simultaneously: UI_AGENT and REG_AGENT. It is the responsibility of the user to ensure that the two agents work in sync. For this purpose, using a virtual sequence to coordinate between the sequences running on the two agents’ sequencers is recommended.

This section lists a few test cases for the verification of the memory controller. It is by no means an exhaustive list and, to ensure complete coverage, the user should try to implement cases based on the SVA cover properties. Further, in addition to writing sequences for functional coverage, the user is also recommended to implement tests that ensure complete code coverage. The tests listed below are only for functional coverage.

1. Choose a memory location randomly. Then continuously provide alternate write and read commands to this location. This test has been implemented. The details of this test are as follows: Sequence name: write_read_same_location_test_sequence
2. Write to location A, then read from location A. Then repeat for different locations. This test has been implemented. The details of this test are as follows: Sequence name: write_read_same_location_burst_test_sequence
3.
Write to all of the columns from 0 to 63 of a particular row. Then read from all of these columns. This test has been implemented. The details of this test are as follows: Sequence name: write_read_same_row_test_sequence
4. Choose a bank of the memory channel randomly. Give consecutive write commands to locations in different rows of the bank. Then give consecutive read commands to different rows of the bank to read back the previously written data. This test has been implemented. The details of this test are as follows: Sequence name: write_read_same_bank_different_row_test_sequence
5. Write to location A of a random bank. Then read from the same location. Repeat by changing the bank address. This test has been implemented. The details of this test are as follows: Sequence name: write_read_write_read_different_bank_test_sequence
6. Choose a random row from a random bank. Give alternate write and read commands to locations from that row. This test has been implemented. The details of this test are as follows: Sequence name: write_read_write_read_same_row_test_sequence
7. Choose a random bank. Write to and read from a location in a randomly chosen row of the bank. Then randomly choose another row of the bank and repeat. This test has been implemented. The details of this test are as follows: Sequence name: write_read_write_read_same_bank_different_row_test_sequence
8. Give consecutive writes to locations in different banks. Then give consecutive reads to locations in different banks. This test has been implemented. The details of this test are as follows: Sequence name: write_read_different_bank_test_sequence Additional notes: This sequence is very useful for testing that the memory controller implements the tFAW protocol correctly.
9. Only for PSEUDO-CHANNEL mode. Give consecutive writes to different pseudo channels. Then give consecutive reads to different pseudo channels. This test has been implemented.
The details of this test are as follows: Sequence name: write_write_read_read_different_pseudo_channel_test_sequence
10. Only for PSEUDO-CHANNEL mode. Write to and read from a location in pseudo-channel 0. Then write to and read from a location in pseudo-channel 1. Repeat. This test has been implemented. The details of this test are as follows: Sequence name: write_read_write_read_different_pseudo_channel_test_sequence
11. Write to the last column of a row. Read from the last column of the same row. Change row and repeat.
12. Write to the last column of the last row of a bank. Read from the same location. Change bank and repeat.
13. No read and write transactions at all. Just NOPs.

Details of all of the tests that have been implemented in this testbench have been provided above. Users are recommended to write tests for the other test cases suggested above. It is also recommended to repeat all of the tests for all valid mode register settings.

Note that for each of the above tests, you can randomly choose to enter (or not enter) one of the low power modes after completing the transactions. Then, if you enter a low power mode, you can choose whether or not to reset the system while in that mode. You can also choose whether or not to stop clocks during the low power modes. Further, if you choose to stop clocks, you can also choose whether or not to change the clock frequency.

Further, you are required to run the tests with a sufficient number of transactions to ensure that the memory controller has time to issue REF (or REFSB) commands. Also, if your memory controller implements the option to delay or advance REF commands, then you should run a test that allows your memory controller to do so. In fact, you can just run test number 13, where you do no read and write transactions at all but just keep the memory controller active, for long enough to check for REF/REFSB commands (delayed or otherwise).

The test cases above will ensure a fair amount of functional coverage.
However, they do not cover everything. For example, you will need to write tests that move from self-refresh mode to power down mode or vice versa.

LOG FILES
The VIP generates a set of log files for each channel. The log files have the .txt extension. Each set of log files includes 7 files:
The “*” in the file name is replaced by the channel number to which the file corresponds. In all of the files, data and address values are in decimal unless otherwise specified. Binary representations are preceded by (‘b) whereas hexadecimal representations are preceded by (‘h). The MRS command details are in binary. Note that all of the files, except mrs_details_channel_*.txt, also capture reset events whenever they occur.
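When post-processing these logs, the number-format convention described above could be decoded along the following lines. This is a minimal Python sketch: the exact token layout inside the log files is not documented here, so the helper names, the assumption that each value is an isolated token, and the acceptance of both straight and typographic quote characters are all illustrative.

```python
def parse_log_value(token):
    """Decode a value using the 'b / 'h prefix convention; default is decimal."""
    token = token.strip()
    # Accept a straight apostrophe as well as typographic quotes (assumption).
    for quote in ("'", "\u2018", "\u2019"):
        if token.startswith(quote + "b"):
            return int(token[2:], 2)
        if token.startswith(quote + "h"):
            return int(token[2:], 16)
    return int(token, 10)


def log_file_name(pattern, channel):
    """Replace the '*' placeholder in a log file pattern with the channel number."""
    return pattern.replace("*", str(channel))
```

For example, `log_file_name("mrs_details_channel_*.txt", 3)` yields the name of the MRS details file for channel 3.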
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved. |