Configurable Microprocessor for Life Essential Devices
Fergus Casey and Thang Tran (Synopsys Inc.)
Abstract:
As new use cases emerge in the personal electronics market, the processors that run them must evolve to meet the changing requirements. In the 1990s, the productivity era drove growth in personal computers and laptops. In the 2000s, the connectivity era brought with it smart phones and tablets. And in the 2010s, with the assistance of Wi-Fi, the world became connected in real time everywhere through social media (FaceTime, IM, Skype, Twitter), live streaming, and interactive gaming. A new era of essential devices to improve quality of life is now upon us. These essential devices, many of which belong to the Internet of Things (IoT), require a step-function improvement in power and performance efficiency together with extreme reliability. This paper explores how processors have evolved with attributes such as configurability and extensibility to enable the next generation of electronics.
I. INTRODUCTION
In the productivity era of the 1990s, PCs and laptops eventually reached a point at which more performance from their processors did not add any value. Other features and electronics started to evolve around the PC/laptop, such as thin, light, large-screen monitors, portable mice, and faster modems and routers. Then came an era focused on connectivity and mobility, in which the smart phone and tablet were at the center of innovation. But now even they are reaching maturity, and any further performance improvement for the microprocessor seems unnecessary. The maturity is evident in the following:
- Smart phone cameras exceed 10 megapixels
- The audio and video quality of a smart phone is as good as business video conferencing
- Kids watch video on smart phones rather than on TV
- More games are played on smart phones than on portable game consoles
Now we are seeing the emergence of new connected devices such as smart cars, medical monitors, and smart cards that are becoming “life-essential” [1]. Many fall into the category commonly known as the Internet of Things (IoT) and are driving processor requirements beyond what was needed by PCs and smart phones. Some of the unique requirements for life-essential devices include financial and medical security and reliability, ultra-low power capabilities for weeks or months of battery life, support for safety standards to enable autonomous driving, and reliability to sustain device lifespans of 15+ years. These requirements are driving the evolution of a new type of processor that is optimized for the application through configurability and extensibility. This paper explores the evolution of processors over the years and explains the genesis of a new wave of processors for the applications of the next 20 years.
II. BACKGROUND
Microprocessors are an important differentiator for many companies [2]. Intel® became the largest semiconductor company in the world with x86 processors. Intel and AMD are integrated circuit producers that provide microprocessor design, full software support, fabrication, assembly, packaging, and production test. The microprocessor chip can be purchased off the shelf and built into the PC/laptop. For smart phones and tablets, a new model emerged: the processor as intellectual property (IP), where the customer pays for a license to use the core in the design of a chip and pays a royalty fee based on the number of chips sold. The business model is completely different, with the processor IP becoming part of a system-on-chip (SoC) design. This processor IP business is successful, as illustrated by the recent acquisition of ARM® by SoftBank for 32 billion dollars. The processor IP was still a fixed design and compiler provided off the shelf. Customers could not modify the RTL and had very limited options in terms of configurability. The SoC design is harder to copy, but most components on the SoC are off-the-shelf IP. As the market evolves to include more life-essential devices such as smart cards, medical devices and fitness monitors, yet another type of processor IP is needed. The ability to configure and extend a processor to meet the requirements of the application is a promising new technology model for processors. It provides a high degree of flexibility, with the ability for the SoC developer to include custom instructions that are unknown to competitors. The SoC design can also reach the optimal tradeoffs in performance, power and area by customizing the processor IP to the application requirements. In addition, customizing the processor makes it possible to keep up with evolving standards.
With the wide variety of applications and requirements today, from consumer to automotive, there are simply too many different types of applications to standardize on a single fixed processor IP. Synopsys’ DesignWare® ARC® processors are an example of a configurable and extensible processor that is well positioned to ride the wave of the era of life-essential devices.
III. BUSINESS MODEL FOR CONFIGURABLE PROCESSOR
Design flexibility is enhanced in this wave of processor designs. The IP model allows significant customer adaptation in the SoC design, where power, performance, and area (PPA), security, and flexibility are the highest priorities. Key attributes of a processor to drive successful next-generation designs include:
- Significant improvement in power and area. With configurability, processors can be tailored to achieve an optimum PPA for a given target application and specific type of memory.
- Ease of adding custom instructions and hardware accelerators.
- Ability to add security features that can be customized to an application as well as to the semiconductor, using obfuscation techniques known only to the SoC developer.
- Processor IP that comes with certification from industry recognized consultants or standards bodies, for example ISO 26262 ASIL D ready certification.
- Highly integrated synthesis flow for optimal PPA.
- Simple business model, in which the processor IP is licensed with mix and match of hardware accelerators per the customer’s need. Having the same processor IP platform with many possible configurations makes it easier for customers to provide product variation and support future upgrades.
As previously stated, configurability provides customers with (1) product differentiation and (2) faster time to market without the burden (and expense) of hiring a full processor design team. This generation of processor covers all aspects of PPA for the specific application, resulting in a tailored processor within the PPA triangle as shown in Figure 1. The optimal solution for a design moves toward the center of the PPA triangle. Configurable processor IP allows simpler trade-offs to reach the optimal design.
Fig. 1. Microprocessors PPA Triangle
An SoC generally consists of many cores: one or more application CPUs, a graphics processing unit, hardware accelerators for audio/video/imaging, and a power-management core. A configurable and extensible processor can provide options to unify many of these cores. All the cores could use variants of a single ISA, so both hardware and software can be reused.
Synopsys’ ARCv2 ISA is an extensible instruction set, with support for optional and predefined extensions, and a mechanism for creating variable-length extensions. The compiler technology includes the extensible instructions.
IV. APPLICATIONS OF CONFIGURABLE MICROPROCESSOR
A. Advantages of configurable processors
One key advantage of configurable processors is that the same processor ISA can be used for all designs. Processors can be adapted to the application with the addition of custom instructions and configurable functional units. The advantages of adding custom instructions that are supported by the compiler are:
- Ease of software pipelining: the custom instructions are part of the instruction set and are optimized by the compiler.
- Ease of adding custom instructions and hardware accelerators: the hardware accelerator operations are part of the instruction pipeline execution, so there is no need for interrupts or status polling.
- Ease of adding and removing functional units: for example, if integer divide is not used in the application code, then it should not be part of the processor.
- Ease of configuring instruction and data memories: the exact size and type of memory, closely coupled or cache, can be chosen for optimal size and power.
The configurable microprocessor meets the exact requirements of an application with ease of integration. Another main goal of a configurable and extensible microprocessor is to shorten the customer's design time.
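The tailoring described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the unit names, the operation-to-unit mapping, and the power-of-two memory sizing are hypothetical and are not taken from the ARC tools or documentation.

```python
from dataclasses import dataclass

# Hypothetical mapping from optional functional units to the operations
# that require them. A real configurable core defines its own options.
OPTIONAL_UNITS = {
    "integer_divider": {"div", "rem"},
    "multiplier": {"mul"},
    "dsp_sqrt": {"sqrt"},
}


@dataclass(frozen=True)
class CoreConfig:
    units: frozenset  # optional functional units kept in the core
    iccm_kb: int      # closely coupled instruction memory size, in KB
    dccm_kb: int      # closely coupled data memory size, in KB


def next_pow2_kb(size_kb):
    """Round up to a power-of-two size in KB (an assumed memory granularity)."""
    kb = 1
    while kb < size_kb:
        kb *= 2
    return kb


def tailor(ops_used, code_kb, data_kb):
    """Keep only the units the application uses and size memories to fit."""
    used = set(ops_used)
    units = frozenset(
        name for name, ops in OPTIONAL_UNITS.items() if ops & used
    )
    return CoreConfig(units, next_pow2_kb(code_kb), next_pow2_kb(data_kb))


if __name__ == "__main__":
    # An application that multiplies but never divides: the divider is dropped.
    print(tailor({"add", "mul", "load", "store"}, code_kb=12, data_kb=3))
```

In this sketch an application that never issues a divide ends up with a configuration that has no integer divider, which is the trade-off the list above describes.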
B. Extremely Low Power and Security
Many applications, such as drones, smart cards, medical monitoring devices, and wearable devices, need extremely low power to operate because of their size or required operating time. Some devices, such as medical devices and smart bank cards, require security features that should be known only to the developer. The combination of low power and security is a very tough goal to achieve.
The following example of implementations of the SHA-256 cryptographic standard helps demonstrate the trade-offs achieved through the use of extensibility to create a hybrid hardware/software solution.
Figure 2 illustrates the performance and area trade-offs of a “pure” software implementation of SHA-256 (running on a baseline core), a “pure” hardware solution, and the hybrid solution based on using hardware extensions to accelerate a software algorithm. As seen in Figure 2, with the addition of simple extension instructions allowing specific bit manipulation and shifting tailored to the algorithm, significant performance can be achieved with a very low gate count. In the hybrid solution, the SHA-256 algorithm instructions are part of the ISA with inline execution and minimal hardware addition; the control logic and state machine of the hardware SHA solution are part of the microprocessor.
Fig. 2. ARC EM processor Security Options
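To make the hybrid approach concrete, the following Python model of SHA-256 isolates the rotate, shift, and XOR combinations that dominate the inner loop. The paper does not state which operations the ARC extension instructions implement, so treating these six helper functions as extension-instruction candidates is an assumption made for illustration. The rest of the algorithm (message schedule, round loop, padding) stays in ordinary software.

```python
import hashlib
import math
import struct

MASK32 = 0xFFFFFFFF


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32


# Bit-manipulation helpers: assumed candidates for extension instructions.
def big_sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def big_sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def small_sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
def small_sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
def ch(x, y, z): return (x & y) ^ (~x & z)
def maj(x, y, z): return (x & y) ^ (x & z) ^ (y & z)


def _primes(count):
    """Return the first `count` primes by trial division."""
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


def _icbrt(n):
    """Integer cube root: the largest r with r**3 <= n."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# Constants: fractional bits of square roots (H0) and cube roots (K) of primes.
PRIMES = _primes(64)
H0 = [math.isqrt(p << 64) & MASK32 for p in PRIMES[:8]]
K = [_icbrt(p << 96) & MASK32 for p in PRIMES]


def compress(state, block):
    """Process one 64-byte block; this is the loop the extensions accelerate."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK32)
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        t1 = (h + big_sigma1(e) + ch(e, f, g) + K[t] + w[t]) & MASK32
        t2 = (big_sigma0(a) + maj(a, b, c)) & MASK32
        a, b, c, d, e, f, g, h = (
            (t1 + t2) & MASK32, a, b, c, (d + t1) & MASK32, e, f, g)
    return [(s + v) & MASK32 for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def sha256(message):
    """Pad the message and hash it block by block."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    state = list(H0)
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return struct.pack(">8I", *state)


if __name__ == "__main__":
    # Cross-check the model against the standard library implementation.
    print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
```

Each helper is a handful of rotates and XORs on a single word, which is why mapping them to single-cycle instructions can plausibly cost few gates while removing many instructions from each of the 64 rounds.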
C. Tools and Design Flow
As mentioned earlier, Synopsys ARC processors exemplify the qualities of a processor ideal for the next generation of SoC design. As an example, the ARC EM processors exhibit leadership PPA with 1.81 DMIPS/MHz performance efficiency, as little as 3 µW/MHz power consumption, and as low as 0.01 mm² area in a 28-nm HPM process. ARC EM processors also provide a full range of configurable features, such as the ability to tailor the design to the specific application requirements, extensibility with custom instructions/hardware through APEX technology, and certifications for automotive safety applications.
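As a rough illustration of what the per-MHz figures imply, the sketch below scales them linearly with clock frequency. This is a simplification under stated assumptions: the quoted power and area values are "as little as" figures that may apply to different configurations than the DMIPS figure, and the model ignores leakage, memories, and peripherals. The function name and the 50 MHz example clock are hypothetical.

```python
# Per-MHz figures quoted in the text for ARC EM processors (28-nm HPM).
DMIPS_PER_MHZ = 1.81
POWER_UW_PER_MHZ = 3.0  # best-case ("as little as") figure


def estimate(clock_mhz):
    """Linear first-order estimate of core throughput and dynamic power."""
    return {
        "dmips": DMIPS_PER_MHZ * clock_mhz,
        "power_uw": POWER_UW_PER_MHZ * clock_mhz,
        "uw_per_dmips": POWER_UW_PER_MHZ / DMIPS_PER_MHZ,
    }


if __name__ == "__main__":
    # Hypothetical 50 MHz operating point.
    for name, value in estimate(50.0).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the core's power per unit of throughput is independent of clock frequency, so the choice of operating point mainly trades absolute performance against absolute power.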
Furthermore, Synopsys’ ARChitect configuration tool eases the process of putting it all together (Figure 3). The ARC processor environment includes compilers, debuggers, simulators, synthesis, backend and development systems. It is important for a processor IP vendor to supply all necessary tools to shorten the time to market for customers.
Fig. 3. ARChitect Tool for SoC Configuration
Two things that life-essential devices generally have in common are that they incorporate sensors to collect data and connectivity to transmit and receive data. For example, the smart band design includes medical monitoring, a gyroscope to detect falling, and Bluetooth to connect to the internet for sending information directly to medical providers. As noted above, the exact configuration and extensibility to include the sensors as part of the processor provide the optimal power for these applications.
As many SoC developers know, designing a system does not stop at the core. Memory, peripherals, drivers and software libraries all factor into the design process. This is a key reason that Synopsys offers ARC-based subsystems, including the DesignWare Smart Sensor and Control Subsystem, which consists of an ARC EM processor, tightly coupled peripherals and memory, application-specific hardware accelerators, and production-ready software libraries (Figure 4):
- Host Interface: AHB bus, JTAG, UART, I2C Slave
- Sensor and Actuator Connectivity:
  - Power Management
  - UART
  - DAC/ADC
  - GPIO
  - SPI
  - I2C
  - APB Interface
  - CREG
- Hardware accelerators:
  - Filtering Accelerators
  - Interpolation
  - Vector Processing
  - Fast Math Accelerators
Fig. 4. DesignWare Smart Sensor and Control Subsystem [3]
The ARC EM processor based subsystem targets many different types of applications, including:
- Industrial: real-time control, robotics
- Automotive: ABS, digital motor control, electric power steering, in-vehicle networking
- Consumer: mobile phone/tablet sensors, e-metering, home automation, connected appliances
- Portable medical devices: blood-pressure monitors, heart-rate monitors, digital thermometers
D. Intelligent Sensor Application Example
Intelligent Sensor, a reference design provided in the ARC Databook [4], is shown in Figure 5. The following ARC EM5D processor features are configurable, allowing the customer to remove them if not required:
- DSP fixed point functions: SINCOS, SQRT
- 16-bit program counter (PC)
- 8-bit loop counter (LC) registers
- Sixteen core registers
- 16 KB closely coupled memories for instructions
- 4 KB closely coupled memories for data
- Seven external interrupts
- I/O functions: ADC Master, CREG Control, CREG Observation, I2C Slave
- Software drivers that are part of the I/O library
- A software process that frequently processes the raw sensor data acquired by the AD Controller
Fig. 5. Intelligent Sensor Hardware Block Diagram
Figure 6 compares the energy consumption of Synopsys’ Smart Sensor and Control Subsystem with that of the ARM M0 and M4, demonstrating the power efficiency that can be achieved with the ARC configurable and extensible architecture when it is optimized for sensor-domain applications.
Fig. 6. Performance Comparison
SUMMARY
The defining characteristic of the last two generations of microprocessor-related products, the PC/laptop and the smart phone/tablet, is their mass worldwide appeal to young and old, for personal and business use. The next big market is not a single product but an explosion of a multitude of life-essential applications. The processor market faces a variety of challenges as demands for higher performance, longer battery life, and security grow in emerging applications. The configurable and extensible processor is the next generation in the processor family tree. It is exciting to talk about the evolving capabilities enabled by the new class of processors for smart automobiles, smart cards, smart homes, drones, and connected medical monitoring devices.
This new class of processor enables many possibilities and many opportunities to enhance the quality of life.
ACKNOWLEDGMENT
The authors acknowledge contributions from the marketing department of Synopsys.
REFERENCES
[1] http://semiengineering.com/executive-insight-aart-de-geus-3/
[2] Tran, T., “Microprocessor: Past, Present, and Future.” 2016 International Conference on IC Design and Technology (ICICDT), 2016.
[3] “DesignWare Sensor and Control IP Subsystem User Guide.” ARC Synopsys, Version 4027-006, February 2015.
[4] “DesignWare Sensor and Control IP Subsystem Infrastructure Databook.” ARC Synopsys, Version 4022-004, February 2015.