Kalray Announces the Release of an Efficient Manycore Processing Solution Dedicated to Deep Learning
Kalray will be demonstrating its deep learning processing solution at Embedded World, March 14-16, 2017 in Nuremberg, Germany
NUREMBERG, Germany, March 14, 2017 -- Rounding out its embedded technology offering, Kalray has expanded its processing capabilities into the world of artificial intelligence. The company is introducing a highly optimized deep learning solution targeting embedded applications such as autonomous cars, avionics, drones, robotics and more. The solution supports the most commonly used deep learning neural networks and frameworks, including GoogleNet, SqueezeNet and Caffe.
Kalray's deep learning solution includes:
- MPPA®2-256 Bostan manycore processor: an industry-recognized 288-core processor
- Kalray Neural Network (KaNN): deep learning software tool used in the development and evaluation of neural networks on MPPA®. KaNN is compatible with all commonly-used deep learning networks.
In terms of pure performance, Kalray leverages the 288 cores of its MPPA®2-256 Bostan processor to efficiently process notoriously compute-heavy deep learning algorithms. To do this, the solution uses the processor's extensive on-chip memory and spreads the compute-heavy aspects of deep learning, the data-dependent layers and weight parameters, across the MPPA®'s numerous cores. The result is particularly efficient processing, up to 60 frames per second running GoogleNet, outperforming the most efficient GPUs addressing today's embedded market.
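Kalray has not published KaNN's internals, but the idea of spreading a layer's weights across many compute units can be sketched generically. The snippet below (an illustrative assumption, not Kalray's implementation) splits a layer's output channels across the MPPA®2-256's 16 compute clusters so that each cluster holds only its own slice of the weights:

```python
# Illustrative sketch only: not Kalray's actual KaNN implementation.
# The MPPA2-256 Bostan groups its processing cores into 16 compute
# clusters; a common way to parallelize a neural-network layer is to
# split its output channels (and their weights) across clusters.

NUM_CLUSTERS = 16  # compute clusters on the MPPA2-256 Bostan

def partition_output_channels(num_channels, num_clusters=NUM_CLUSTERS):
    """Assign each output channel of a layer to a cluster as evenly as
    possible, so each cluster stores only its slice of the weights."""
    base, extra = divmod(num_channels, num_clusters)
    ranges = []
    start = 0
    for c in range(num_clusters):
        size = base + (1 if c < extra else 0)
        ranges.append((start, start + size))  # [start, end) channel range
        start += size
    return ranges

# Example: a 192-channel layer splits into 12 channels per cluster.
print(partition_output_channels(192))
```

Each cluster would then compute its channel range independently, exchanging only input activations over the on-chip network.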
But developing an embedded deep learning application takes much more than pure performance. Beyond raw processing, the system has been honed to meet the specific needs of the highly demanding embedded applications market: low latency, low power consumption, certifiability and parallel processing, all at an affordable cost. The chip will be used for deep learning in autonomous vehicles, drones, robotics, visual inspection, avionics, aeronautics and more.
Kalray's deep learning solution offers:
- 40 MB on-chip memory: stores neuron activations and weight parameters close to the cores for compute efficiency.
- Memory bandwidth: high-speed internal memory stores intermediate data and can pre-fetch data from global memory, avoiding the latency associated with external data access. Overall, this delivers more than 1 TB/s of on-chip memory bandwidth.
- Low-latency Network-on-Chip (NoC): carries out high-bandwidth data transfers between clusters, with broadcasting capabilities.
- 288 energy-efficient cores: each core is a 5-issue VLIW with single- and double-precision floating-point support, for a total of 1 TFLOPS of on-chip processing capability.
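A quick back-of-the-envelope check (ours, not Kalray's) shows why the 40 MB of on-chip memory matters: using a commonly cited approximate figure of about 7 million parameters for GoogleNet, the full fp32 weight set fits entirely on-chip, avoiding off-chip weight traffic during inference:

```python
# Back-of-the-envelope check, not a Kalray figure: do GoogleNet's
# weights fit in the MPPA2-256's 40 MB of on-chip memory?
# ~7 million parameters is an approximate, commonly cited count.

GOOGLENET_PARAMS = 7_000_000   # approximate parameter count
BYTES_PER_FLOAT32 = 4
ON_CHIP_MB = 40

weights_mb = GOOGLENET_PARAMS * BYTES_PER_FLOAT32 / (1024 * 1024)
verdict = "fits within" if weights_mb < ON_CHIP_MB else "exceeds"
print(f"GoogleNet fp32 weights: ~{weights_mb:.1f} MB, "
      f"{verdict} the {ON_CHIP_MB} MB on-chip memory")
```

Larger networks whose weights exceed on-chip capacity would instead rely on the pre-fetching from global memory described above.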
KaNN is the first version of Kalray's dedicated deep learning solution. The company already anticipates sampling a second generation in 2018, promising a 20x performance improvement.
Kalray will be demonstrating this solution at the 2017 edition of Embedded World, March 14-16 in Nuremberg, Germany. The company will be located at booth 4A-330.
About Kalray
Kalray Inc. is a fabless semiconductor company and pioneer in manycore processor solutions. Its innovative MPPA® architecture delivers real-time, low-latency processing for embedded applications, including avionics, aeronautics, automotive and more. For more information, visit http://www.kalrayinc.com