Optimizing System Management in the Platform SoC Era

Howard Pakosh, ChipStart

Introduction

Consumer-focused SoCs have evolved into platform architectures that are now driven by the requirements of operating systems such as Android, iPhone, Linux, and Windows, and of the thousands of applications they support. Over time, more of the system is moving into silicon, and system management functions have moved into the SoC with it. Traditional feature-based regression testing at the silicon level must now be increasingly complemented with complex system-level testing in order to maintain a high level of system coverage across SoC road maps. Balancing price, performance, and power against high system-level test coverage therefore creates complex system management design challenges that affect both hardware and software operation.

System management must now be considered a central feature and responsibility of the SoC architecture, not just a tactical design consideration in the development of each individual SoC. System management should provide adequate synchronization of hardware state changes driven by software, maintain a reasonable time to market, and maximize system test coverage and support.

The remainder of this paper discusses design considerations and compares three system management architectures. The first is ad hoc system management, which comprises combinations of hardware and software elements that serve a dual purpose: normal operation and system management. The second includes system management as part of the on-chip interconnect implementation. The third introduces a control plane approach for system management that complements the data-centric global interconnect. Finally, the paper discusses the growing importance of integrated subsystem design and IP for SoCs, and how system-level partitioning will play a growing role in achieving efficient system management.
System Management Design Considerations

One of the key challenges in designing SoC system management schemes stems from the growing number of programmable devices on-chip. Each added programmable device multiplies the number of combinations of software operations that drive hardware state changes in real time, so the total grows exponentially with the number of devices. This in turn complicates the system-level testing needed to achieve reasonable test coverage. Optimizing the SoC design for a single operating system provides little relief, because the diversity of applications running on the SoC continues to multiply the testing complexity at the system level.

System-level testing via traditional silicon-level functional and data path regressions must now be augmented by system functional test suites that include the programmable elements and their impact on hardware state changes. Each programmable core can be isolated and tested to achieve a high level of code coverage, and each execution path through the different core combinations can be tested. However, the combinations of hardware state changes that the cores require as a result of application behavior make it almost impossible to achieve adequate system-level coverage solely by testing the cores and the buses in isolation, or even in pseudo-random combinations.

It is at this point that compromises are often made in the SoC design. How much risk is affordable when trading off the cost and time of building these complex system-level regression suites against the test coverage actually achieved? As volumes grow, the answer is that risk must be mitigated, and it therefore becomes essential to minimize these tradeoffs. This paper challenges the increasing “tax” that current system management architecture assumptions impose on project costs in order to balance adequate system-level test coverage against risk.
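To give a rough sense of the combinatorial growth described above, the following sketch counts joint hardware states under a simplifying assumption that is not taken from the paper: every programmable core contributes an independent set of states, so the joint count is the product of the per-core counts. The figure of four states per core and the function names are purely illustrative.

```python
from itertools import product


def combined_state_count(states_per_core):
    """Number of joint hardware states, assuming each core's states are independent."""
    total = 1
    for n in states_per_core:
        total *= n
    return total


def enumerate_joint_states(states_per_core):
    """Iterate over every joint state as a tuple (only practical for tiny systems)."""
    return product(*(range(n) for n in states_per_core))


# Hypothetical example: each programmable core exposes 4 power/clock states.
for cores in (1, 2, 4, 8, 16):
    count = combined_state_count([4] * cores)
    print(f"{cores:2d} cores -> {count} joint states")
```

Even in this toy model, exhaustively enumerating joint states stops being practical after a handful of cores, which is the pressure on regression suites that the text describes.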
Specifically, instead of continuing to grow regression suites and make risk choices based on the assumption that the levels of hardware and system testing are tightly coupled, abstraction layers can be inserted into the architecture to decouple the hardware, operating system, and application support functions. Furthermore, each of these components can be tested through independent elements introduced into the SoC architecture.

In fact, this trend has already begun. The growing use of decoupled global interconnect structures, such as those that employ OCP or similar features, provides a proven example of how to ease chip architecture design as it evolves from single-core to multicore or multi-layer. By “abstracting” the data plane, and allowing the associations between the IP cores to be linked through the independent global interconnect structure, system performance at the hardware level becomes more predictable and tunable (CPU to off-chip memory, for example). This predictability affords opportunities to streamline the design process, because these loosely coupled associations are less affected by specific design changes. This leads to more rapid timing closure even though the complexity of the data plane has grown significantly.

Similar abstraction techniques can be applied to system management. The software and hardware layers, the system management, and the functional operation of the SoC can be decoupled, making it easier to test each component of the system-level architecture while still accounting for the hardware state changes driven at the system level. The result is a system-level design that is more easily understood and has better test coverage. This approach also abstracts the operational complexity of system management between hardware and software, even as the number of applications grows. The next section of the paper discusses three potential methods of abstraction, which optimize system management to varying degrees.
System Management Scheme Comparisons

Given that the objective is to reduce overall system management complexity, there are three baseline characteristics against which system management schemes should be benchmarked:
By applying these benchmark criteria, three methods can be evaluated.

Method 1: Using a single operating system hosted on a “master” CPU.

This has been a popular approach to system management because silicon elements already required for real-time operation also execute the system management functions. When SoC complexity is relatively low, this scheme is very efficient: it requires no extra silicon and only some extra software development, which is easily contained. However, the growth in complexity associated with today's multicore consumer SoCs has weakened the effectiveness of this approach. As system tasks become distributed, and therefore more interdependent as more cores are added to the SoC, the visibility and control that any one core has over the others is reduced with each new element. Visibility and control become more dependent on the global interconnect as well as on the cores, adding even more complexity to the execution of control functions. The global interconnect must be included in system testing in this case because it controls access to external memory, a key element in system operation.

If the master CPU can no longer manage and verify the hardware state changes of the other core elements, the growing number of possible states results in unpredictable coverage, and the methodology no longer has value. Extending the scheme to add system test then does not return meaningful dividends on the potentially massive investment of developing the tests and verification infrastructure.

Applying the criteria then to this method for today’s platform SoCs
Method 2: Introducing global interconnect structures and additional logic to support pseudo-control plane system management functions.

This approach is an extension of method 1, because the host CPU often continues to act as the system management master. Sideband signaling, either contained in the interconnect or designed separately, is used for the control functions. Mixing data plane and control functions introduces abstraction levels that aid in achieving higher system test coverage, as long as the SoC does not drive the interconnect requirements to become so complex that the control functions become a small, lower-priority part of the overall mix of functions. When this occurs, the control tasks are executed sub-optimally: priority choices between functional operations and system management tasks introduce delays, caused by complex arbitration sequences and by communication held up on blocked hierarchical buses.

Applying the criteria then to this method for today’s SoCs
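The delay effect described for method 2 can be illustrated with a deliberately simple model. This is a sketch under stated assumptions, not a model of any real interconnect: a single channel serves requests strictly in arrival order, and the request names, arrival times, and durations (in arbitrary cycles) are all hypothetical.

```python
def completion_latency(requests):
    """Serve (name, arrival, duration) requests in arrival order on one channel.

    Returns a dict mapping each request name to its latency, measured
    from its arrival until its completion.
    """
    clock = 0
    latency = {}
    for name, arrival, duration in sorted(requests, key=lambda r: r[1]):
        start = max(clock, arrival)   # wait if the channel is still busy
        clock = start + duration
        latency[name] = clock - arrival
    return latency


# Hypothetical traffic: two long data bursts and one short control request.
data_traffic = [("dma_burst_0", 0, 64), ("dma_burst_1", 1, 64)]
control_traffic = [("clock_gate_req", 2, 2)]

shared = completion_latency(data_traffic + control_traffic)
separate = completion_latency(control_traffic)

print("control latency on shared channel:  ", shared["clock_gate_req"])    # 128
print("control latency on separate channel:", separate["clock_gate_req"])  # 2
```

In this toy case the two-cycle control request waits behind both data bursts when it shares the channel, which is the kind of sub-optimal execution of control tasks that the text attributes to heavily loaded interconnects.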
Method 3: Introducing a control plane that complements a data plane global interconnect.

This approach differs from the first two methods because it does not extend the traditional approach of using the host CPU as system master. Rather, it introduces a separate control plane and an independent system controller to perform system management tasks. An independent control plane essentially abstracts the system management tasks from any one entity. As such, it can be controlled by any or all SoC elements as required, and therefore offers multiple layers of abstraction. System tests can be developed by software, hardware, verification, and system engineers and applied using a common framework with equal effectiveness.

This approach is also advantageous because it separates targeted control tasks, which are ideally executed with low latency, from longer, more complex, and often performance-sensitive data plane tasks. This separation is often necessary when complexity is high, because traditional approaches reach the ceiling of effectiveness discussed under method 2.

Applying the criteria then to this method for today’s SoCs
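One way to picture the independent system controller of method 3 is as a small software model. The sketch below is an illustration only; the class name, method names, and resource names are assumptions and do not describe any particular product or the author's implementation. The controller owns the hardware state, accepts requests from any core, applies them one at a time, and keeps a log that a test framework could inspect.

```python
from collections import deque


class SystemController:
    """Toy control-plane controller that owns hardware state for the whole SoC."""

    def __init__(self, resources):
        self.state = dict(resources)   # resource name -> current state
        self.pending = deque()         # requests waiting to be applied
        self.log = []                  # (requester, resource, old, new)

    def request(self, requester, resource, new_state):
        """Queue a state change; any core may call this directly."""
        if resource not in self.state:
            raise KeyError(f"unknown resource: {resource}")
        self.pending.append((requester, resource, new_state))

    def step(self):
        """Apply one pending request. Returns False when nothing is pending."""
        if not self.pending:
            return False
        requester, resource, new_state = self.pending.popleft()
        old_state = self.state[resource]
        self.state[resource] = new_state
        self.log.append((requester, resource, old_state, new_state))
        return True

    def drain(self):
        """Apply all pending requests in the order they were received."""
        while self.step():
            pass


# Hypothetical usage: two different cores drive state changes directly.
controller = SystemController({"gpu_clock": "off", "ddr_power": "on"})
controller.request("cpu0", "gpu_clock", "on")
controller.request("dsp", "ddr_power", "low")
controller.drain()
print(controller.state)
print(controller.log)
```

Because every state change passes through one place and is logged, the same model can be exercised by software, hardware, or verification tests without routing requests through a master CPU.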
Summary: While method 3 introduces new control plane functionality, it also enables SoCs of virtually any complexity to be tested and operated efficiently using the same approach. As such, it is best suited to road maps that span a wide range of complexity, or to cases where extreme flexibility is required of the SoC architecture. The ability to direct the system controller from any SoC core is especially noteworthy, because it allows multiple applications to control hardware states directly and in real time when needed, without the overhead of channeling their requests through other entities. This avoids inter-function dependencies, complexities, and delays.

The Impact of SoC Subsystems on System Management

The basic theme in achieving better system management is successful partitioning in order to reach adequate levels of system test coverage. This is why method 3 was chosen as the most effective for today’s system management needs. It stands to reason, then, that the use of subsystems further abstracts the system management tasks. However, creating systems within systems also introduces hierarchies of complexity, and as such makes traditional methods of system management even less useful.

The growing use of subsystems over the next generations of SoC design will therefore accelerate the adoption of control plane based system management as the preferred architecture. Hierarchical levels of complexity can be absorbed into the system management architecture while maintaining a common architecture that provides flexibility and scalability. This in turn minimizes the risk and cost of expensive architecture redesigns, which will otherwise become more frequent as system requirements continue to grow more complex.
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved.