Kalray Announces the Release of an Efficient Manycore Processing Solution Dedicated to Deep Learning
Kalray will be demonstrating its deep learning processing solution at Embedded World, March 14-16, 2017 in Nuremberg, Germany
NUREMBERG, Germany, March 14, 2017 -- Rounding out its embedded technology offering, Kalray has expanded its processing capabilities to the world of artificial intelligence. The company is introducing a highly optimized deep learning solution targeting embedded applications such as autonomous cars, avionics, drones and robotics. The solution supports the most commonly used deep learning neural networks and frameworks, such as GoogleNet, SqueezeNet and Caffe.
Kalray's deep learning solution includes:
- MPPA®2-256 Bostan manycore processor: industry-recognized 288-core processor
- Kalray Neural Network (KaNN): deep learning software tool used in the development and evaluation of neural networks on MPPA®. KaNN is compatible with all commonly-used deep learning networks.
In terms of pure performance, Kalray leverages the 288 cores of its MPPA®2-256 Bostan processor to efficiently process notoriously compute-heavy deep learning algorithms. To do this, the solution uses the extensive on-chip memory of the Bostan processor and spreads the compute-heavy aspects of deep learning, the data-dependent layers and weight parameters, across the MPPA®'s numerous cores. The result is particularly efficient processing, up to 60 frames per second while running GoogleNet, outperforming the most efficient GPUs addressing today's embedded market.
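As an illustration only, the minimal sketch below (Python, with hypothetical names; it is not Kalray's KaNN API) shows one way a convolutional layer's output channels could be partitioned across the MPPA®'s 16 compute clusters, so that each cluster keeps its slice of the weights in local on-chip memory, in the spirit of the layer-by-layer parallelization described above.

```python
# Hypothetical sketch (not the KaNN API): splitting one convolutional layer's
# output channels across the MPPA(R)2-256's 16 compute clusters of 16 cores each
# (the remaining cores sit in the I/O subsystems).
NUM_CLUSTERS = 16
CORES_PER_CLUSTER = 16

def partition_layer(out_channels: int, num_clusters: int = NUM_CLUSTERS):
    """Assign a contiguous slice of output channels to each cluster."""
    base, extra = divmod(out_channels, num_clusters)
    plan, start = [], 0
    for cluster in range(num_clusters):
        count = base + (1 if cluster < extra else 0)
        plan.append((cluster, start, start + count))  # (cluster id, first channel, last+1)
        start += count
    return plan

# Example: a GoogleNet-style layer with 192 output channels
for cluster, lo, hi in partition_layer(192):
    print(f"cluster {cluster:2d}: channels [{lo:3d}, {hi:3d}) "
          f"-> weights kept in that cluster's local on-chip memory")
```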
But developing an embedded deep learning application demands much more than pure performance. Beyond basic deep learning processing, the system has been honed to respond to the specific needs of the highly demanding embedded applications market. In addition to highly efficient processing, the chip also meets the constraints of the embedded industry, offering low latency, low power consumption, certifiability and parallel processing, all at an affordable cost. The chip will be used for deep learning purposes in autonomous vehicles, drones, robotics, visual inspection, avionics, aeronautics and more.
Kalray's deep learning solution offers:
- 40 MB on-chip memory: allows the neurons (activations) and the weight parameters to be stored on-chip for compute efficiency (see the worked example after this list).
- Memory bandwidth: high-speed internal memory used to store intermediate data or to pre-fetch data from global memory, avoiding the latency associated with data access. In total, it delivers more than 1 TB/s of on-chip memory bandwidth.
- Low-latency Network-on-Chip (NoC): carries out high-bandwidth data transfers between clusters, with broadcasting capabilities.
- 288 energy-efficient cores: each core is a 5-issue VLIW core with single- and double-precision floating-point support, offering 1 TFLOPS of on-chip processing capability.
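As a rough check of the on-chip-storage claim above (the parameter counts are approximate public figures and FP32 storage is assumed; neither comes from this release), the following arithmetic shows that the weights of the networks mentioned fit comfortably within 40 MB:

```python
# Back-of-the-envelope check: do FP32 weights fit in 40 MB of on-chip memory?
# Parameter counts are approximate public figures, not taken from the release.
BYTES_PER_FP32 = 4
ON_CHIP_MB = 40

networks = {
    "SqueezeNet": 1.25e6,  # ~1.25 M parameters
    "GoogleNet": 7.0e6,    # ~7 M parameters
}

for name, params in networks.items():
    mb = params * BYTES_PER_FP32 / 1e6
    verdict = "fits within" if mb <= ON_CHIP_MB else "exceeds"
    print(f"{name}: ~{mb:.0f} MB of FP32 weights -> {verdict} the 40 MB on-chip memory")
```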
KaNN is the first version of Kalray's dedicated deep learning solution. The company already anticipates sampling a second generation in 2018, providing a 20x performance improvement.
Kalray will be demonstrating this solution at the 2017 edition of Embedded World, March 14-16 in Nuremberg, Germany. The company will be located at booth 4A-330.
About Kalray
Kalray Inc. is a fabless semiconductor company and pioneer in manycore processor solutions. Its innovative MPPA® architecture delivers real-time, low-latency processing for embedded applications, including avionics, aeronautics, automotive and more. For more information, visit http://www.kalrayinc.com