Ceva Adds Homogenous AI Acceleration to Third-Gen Engine
By Sally Ward-Foxton, EETimes (January 13, 2022)
Ceva has revamped its NeuPro AI accelerator engine IP, adding specialized co-processors for Winograd transforms and sparsity operations and a general-purpose vector processing unit alongside the engine’s MAC array. The new generation engine, NeuPro-M, can boost performance 5-15X (depending on the exact workload) compared to Ceva’s second generation NeuPro-S core (released Sept 2019). For example, ResNet-50 performance was improved 4.9X without using the specialized engines – boosted to 14.3X when using specialized co-processors, according to Ceva. Results for Yolo-v3 showed similar speedups. The core’s power efficiency is expected to be 24 TOPS/Watt for 1.25 GHz operation.
The NeuPro-M engine architecture allows for parallel processing on two levels — between the engines (if multiple engines are used), and within the engines themselves. The main MAC array has 4000 MACs capable of mixed precision operation (2-16 bits). Alongside this are new, specialized co-processors for some AI tasks. Local memory in each engine breaks the dependence on the core shared memory and on external DDR; the co-processors in each engine can work in parallel on the same memory, though they sometimes transfer data from one to another directly (without passing through memory). The size of this local memory is configurable based on network size, input image size, number of engines in the design and customers’ DDR latency and bandwidth.
E-mail This Article | Printer-Friendly Page |
|
Ceva, Inc. Hot IP
Related News
- AImotive's latest aiWare3P delivers superior NN acceleration for production L2-L3 automotive AI
- PiMCHIP Deploys Ceva Sensor Hub DSP in New Edge AI SoC
- Fractile Licenses Andes Technology's RISC-V Vector Processor as It Builds Radical New Chip to Accelerate AI Inference
- BrainChip Introduces Lowest-Power AI Acceleration Co-Processor
- Ceva and Edge Impulse Join Forces to Enable Faster, Easier Development of Edge AI Applications
Breaking News
- HPC customer engages Sondrel for high end chip design
- Ubitium Debuts First Universal RISC-V Processor to Enable AI at No Additional Cost, as It Raises $3.7M
- TSMC drives A16, 3D process technology
- Frontgrade Gaisler Unveils GR716B, a New Standard in Space-Grade Microcontrollers
- Blueshift Memory launches BlueFive processor, accelerating computation by up to 50 times and saving up to 65% energy
Most Popular
- Cadence Unveils Arm-Based System Chiplet
- Eliyan Ports Industry's Highest Performing PHY to Samsung Foundry SF4X Process Node, Achieving up to 40 Gbps Bandwidth at Unprecedented Power Levels with UCIe-Compliant Chiplet Interconnect Technology
- TSMC drives A16, 3D process technology
- CXL Fabless Startup Panmnesia Secures Over $60M in Series A Funding, Aiming to Lead the CXL Switch Silicon Chip and CXL IP
- Blueshift Memory launches BlueFive processor, accelerating computation by up to 50 times and saving up to 65% energy