An efficient way of loading data packets and checking data integrity of memories in SoC verification environments

Abhinav Gaur, Gaurav Jain, Ankit Khandelwal (Freescale Semiconductors)

Introduction

The applications for modern-day SoCs are becoming bulkier, which in turn requires a huge amount of storage. These storage solutions can be on-chip or off-chip, depending on the system requirements. SoCs today contain various types of on-chip and off-chip storage, such as SRAM, DRAM, flash and ROM, which can be used for a multitude of applications. For example, a piece of software code can be placed in different memories in a system, incoming data from cameras can be stored in different memories, and computation-intensive peripherals use different kinds of memories for performing various operations. There are use-cases in which data is processed by multiple masters and more than one memory type lies in the complete data path (storing the intermediate data), so data sanity needs to be assured at each and every step. With execution timelines already crunched, it becomes necessary to devise methods that save verification time. This paper discusses the requirement for backdoor loading and comparison of processed data in a digital verification environment, and explains a method to implement such a scheme that saves a lot of simulation time while maintaining data integrity.

Why memory backdoor loading/comparison is required

Memories are used by different masters to store code and data. In a real application, these masters use the different memories present in the system according to the use-case requirements, writing data to and reading data from memory as they go. Over time, the amount of data that needs to be stored and processed keeps increasing. In simulation, loading the data and comparing the processed data through the masters is virtually impossible when it comes to huge data sets, say on the order of megabytes.
The general practice is to load the initial data that the masters will process via backdoor. After the processing, a few chunks of data in the memory are checked for sanity at the start, middle and end locations, with the core present on the SoC reading each memory location and comparing it against the expected data. This method is very inefficient.
For example, suppose we want to load 2 KB of data via the core (running at, say, 100 MHz). Assuming the core takes 4 cycles per transfer over a 64-bit data bus (8 bytes per transfer), the time taken to load this data is (2000 bytes / 8 bytes per transfer) * 4 cycles per transfer * 10 ns per cycle = 10 us. In simulation, those 10 us translate into considerable test time, spent simply loading the data! If instead we load this data via backdoor, we save the entire 10 us of simulation time. In short, backdoor loading is far more efficient than front-door loading for test cases where multiple memories are to be loaded or compared. Hence, we need a flow that provides a much faster, yet accurate, method of backdoor loading and comparing the data.

However, preloading so many different kinds of memories, each with a different simulation model, can be difficult to manage in a testbench. This paper presents an approach that wraps the details in user-friendly tasks, which not only make life simple for the verification engineer but also give good confidence in the correct functionality of the complete data path of the system. These tasks load and compare the final processed data with the expected data in zero simulation time, independent of the data size, thus saving a lot of simulation time.

Building blocks for memory loading/comparison

The section below explains the basic setup for memory loading/comparison. The first step is to assign unique IDs to the various memory chunks present in the design. Let us try to understand this with the help of an example in which we take the different memories present in the system and develop the corresponding tasks.
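As an illustration, the ID assignment can be modeled as a lookup over the SoC memory map. The sketch below uses Python to model the idea (a real testbench would implement it as a SystemVerilog task); the memory names, address ranges and ID values are hypothetical, not taken from any real SoC.

```python
# Hypothetical SoC memory map: memory-mapped base and size (bytes) -> unique ID.
# All names, ranges and IDs here are illustrative assumptions.
MEMORY_MAP = {
    "FLASH0": {"base": 0x0000_0000, "size": 0x0040_0000, "id": 4},
    "SRAM1":  {"base": 0x2000_0000, "size": 0x0001_0000, "id": 1},
    "SRAM2":  {"base": 0x2010_0000, "size": 0x0001_0000, "id": 2},
    "DDR0":   {"base": 0x8000_0000, "size": 0x1000_0000, "id": 3},
}

def return_mem_id(address):
    """Return the unique memory ID for a memory-mapped address."""
    for mem in MEMORY_MAP.values():
        if mem["base"] <= address < mem["base"] + mem["size"]:
            return mem["id"]
    raise ValueError("address 0x%08x does not map to any memory" % address)
```

With this map, `return_mem_id(0x2000_0100)` resolves to SRAM1's ID, while an address outside every range raises an error, which catches bad test inputs early.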
A task is created that takes an address as input and returns the memory ID; let us name it "return_mem_id". The address here is the memory-mapped address.

Another important factor to consider is the width of each memory. Different memories have different widths depending on the system requirements. For this, another task can be created, say "return_mem_width", which takes the memory ID as input.

Also, the memory-mapped address of each memory differs from the actual physical address of its memory cells. For backdoor loading we need the physical address of the memory, not its memory-mapped address. A task is required for this as well, which again takes the memory ID as input and returns the physical address by which the particular memory needs to be addressed for backdoor loading. Let us name this task "return_physical_address".

With the basic tasks for calculating the memory ID, width and physical address in place, we can proceed to build the "memory_write" task. It takes the memory ID, physical address and write data as input; the width of the data to be written depends on the memory width. Depending on the memory ID, it performs the backdoor load for that particular memory. Each memory has its own backdoor-loading mechanism. For example, if Denali models are used for external interfaces like flash and DRAM, there are special PLI tasks for loading the data, which can be found by studying the model of that memory. Similar to the "memory_write" task, a "memory_read" task can be created, which takes the memory ID and physical address as input and returns the read data.

Many memories also support an ECC or parity feature; for example, 8 bits of data may have 5 ECC bits associated with them. The "memory_write" task can be enhanced to calculate the ECC each time data is loaded into a memory location.
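A minimal, self-contained sketch of these building blocks is shown below, again modeled in Python for illustration rather than as testbench code. The width table, base addresses, dict-backed storage and the single-parity-bit stand-in for ECC are all assumptions made for the example; a real flow would use the memory model's own backdoor mechanism (for instance, the Denali PLI calls mentioned above) and the design's actual ECC code.

```python
# Illustrative per-memory attributes keyed by memory ID (hypothetical values).
MEM_WIDTH    = {1: 32, 2: 32, 3: 64, 4: 128}   # data width in bits
MEM_MAP_BASE = {1: 0x2000_0000, 2: 0x2010_0000, 3: 0x8000_0000, 4: 0x0}

# Backdoor storage stand-in: {physical word address: (data, ecc)} per memory.
# In a real testbench this would be the RTL or Denali memory array itself.
BACKDOOR_MEM = {1: {}, 2: {}, 3: {}, 4: {}}

def return_mem_width(mem_id):
    """Data width (in bits) of the memory with this ID."""
    return MEM_WIDTH[mem_id]

def return_physical_address(mem_id, mapped_address):
    """Translate a memory-mapped address into the physical (word) address
    used for backdoor access: subtract the base, divide by the word size."""
    word_bytes = MEM_WIDTH[mem_id] // 8
    return (mapped_address - MEM_MAP_BASE[mem_id]) // word_bytes

def calc_ecc(data):
    """Stand-in ECC: a single even-parity bit over the data word. A real
    design would use its SECDED polynomial (e.g. 5 ECC bits per 8 data
    bits, as mentioned in the text)."""
    return bin(data).count("1") & 1

def memory_write(mem_id, phys_addr, data):
    """Backdoor-load one word plus its computed ECC, in zero simulation time."""
    BACKDOOR_MEM[mem_id][phys_addr] = (data, calc_ecc(data))

def memory_read(mem_id, phys_addr):
    """Backdoor-read one word; flag a mismatch if the stored ECC does not
    match the ECC recomputed from the data."""
    data, stored_ecc = BACKDOOR_MEM[mem_id][phys_addr]
    assert stored_ecc == calc_ecc(data), "ECC mismatch at 0x%x" % phys_addr
    return data
```

A typical sequence would be `phys = return_physical_address(1, 0x2000_0010)`, then `memory_write(1, phys, 0xDEADBEEF)`, then later `memory_read(1, phys)` to verify both the data and its stored ECC, mirroring the write/read pairing described above.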
Similarly, the "memory_read" task can be enhanced to read the data from each memory location, feed it to an ECC-calculating task, and compare the result with the ECC read back from that location. This also checks that the ECC is stored correctly in the memory at the various stages of the data flow in a use-case. The tasks mentioned above can be stitched together to provide a seamless flow for backdoor loading and comparison of the different memories of the SoC.

Testbench infrastructure for loading/comparison of memories through the command line (as simulation arguments)

1. Loading of data

For loading of memories, the user simply needs to provide the data file, the memories/locations to be loaded, and optionally the format of the data.
A top-level task processes all this information. Depending on the format of the data, it parses each line of the data file and loads the data at the various memory locations using the tasks mentioned earlier (memory_write, etc.). The top-level task repeats this process for every memory passed through the command line for loading. The task can be made user-friendly so that, if the user does not give the data format, it calculates the format itself by parsing lines of the data file.

2. Comparison of data

For comparison of the data present in the memories, a C-side function is created that can be called at any point in the test case to trigger memory comparisons. The memory-comparison information passed through the command line is, likewise, the comparison data file, the memories/locations to be checked, and optionally the format of the data.
The C-side function sends a trigger to the testbench side, invoking a top-level memory-comparison task. This task parses the expected data one entry at a time (depending on the data format) from the data file and compares it with the actual read data returned by the "memory_read" task. As with memory loading, if the user does not give the data format, the task can calculate it itself by parsing lines of the comparison file. For example, a sample C test containing memory comparisons at various stages looks like this:

<Initialization code>
<Input data fetched from a source and stored at a location in SRAM1>
memory_comparison(1); // Compare the data at the location, format, etc. given by the 1st memory-comparison argument
<Fetch this data, process it, and store it in SRAM2>
memory_comparison(2); // Compare per the 2nd memory-comparison argument
<Process the data again and place the final data in DDR0>
memory_comparison(3); // Compare per the 3rd memory-comparison argument
<End of Test>

Conclusion

The method described above has been implemented and has uncovered many issues in various SoCs during the design phase. The main advantages of using this scheme in a verification environment are:

- Data is loaded and compared in zero simulation time, independent of the data size.
- A single, user-friendly interface hides the model-specific backdoor mechanisms of the different memories.
- Data integrity, including stored ECC, can be checked at every stage of a use-case's data path.
With this, we achieve faster data integrity checks without any software overhead. This is very useful from a digital verification perspective, because simply waiting for the software (usually running on the core) to load and compare the data takes a lot of time. Such methodologies are the need of the hour, for, as someone rightly said, "The more you value your time, the more value it will bring."
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved.