Partitioning to optimize AI inference for multi-core platforms
By Rami Drucker, Ceva
EDN (January 8, 2024)
Not so long ago, artificial intelligence (AI) inference at the edge was a novelty easily supported by a single neural processing unit (NPU) IP accelerator embedded in the edge device. Expectations have accelerated rapidly since then. Now we want embedded AI inference to handle multiple cameras, complex scene segmentation, voice recognition with intelligent noise suppression, fusion between multiple sensors, and even very large and complex generative AI models.
Such applications can deliver acceptable throughput for edge products only when run on multi-core AI processors. NPU IP accelerators are already available to meet this need, scaling to eight or more parallel cores and able to run multiple inference tasks concurrently. But how should you partition expected AI inference workloads for your product to take maximum advantage of all that horsepower?
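To make the partitioning question concrete, consider the simplest scheme: statically assigning independent inference tasks to cores. The sketch below is conceptual only; the core count, task names, and run_inference() helper are hypothetical placeholders standing in for a real NPU runtime, not any specific vendor API.

# Conceptual sketch: static round-robin partitioning of independent
# inference tasks across a fixed number of NPU cores.
# run_inference() is a hypothetical placeholder for dispatching a
# compiled network to one core; it is not a real NPU driver call.
from concurrent.futures import ThreadPoolExecutor

NUM_NPU_CORES = 8  # e.g., an eight-core NPU accelerator

def run_inference(task_name: str, core_id: int) -> str:
    # Placeholder for launching a compiled network on a given core.
    return f"{task_name} completed on core {core_id}"

# Independent workloads of the kind named above.
tasks = [
    "camera_0_detection",
    "camera_1_detection",
    "scene_segmentation",
    "voice_recognition",
    "noise_suppression",
    "sensor_fusion",
]

# Round-robin assignment of tasks to cores; all cores work in parallel.
with ThreadPoolExecutor(max_workers=NUM_NPU_CORES) as pool:
    futures = [
        pool.submit(run_inference, task, i % NUM_NPU_CORES)
        for i, task in enumerate(tasks)
    ]
    for f in futures:
        print(f.result())

Static round-robin assignment is only a starting point: it ignores the fact that a segmentation network and a keyword spotter demand very different amounts of compute, which is exactly the load-balancing problem the rest of this article addresses.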