GenAI v1-Q launched with 4-bit Quantization support to accelerate larger LLMs at the Edge
The new version brings a 276% speed increase for the top LLMs in low-cost systems, while maintaining their intelligence.
Spain, September 24, 2024 -- Raiderchip has presented a new hardware accelerator product that adds 4-bit and 5-bit quantization support (Q4_K and Q5_K) to the extraordinary efficiency of the base GenAI v1. The new Generative AI LLM hardware accelerator runs inside FPGA devices and is ideal for boosting the capabilities of devices equipped with low-cost DDR and LPDDR memories, increasing inference speed by 276%.
GenAI v1-Q running the Llama 2-7B LLM with 4-bit quantization on a low-cost Versal FPGA with LPDDR4 memory
The new acceleration engine not only increases inference speed but also lowers memory requirements by up to 75%. This allows the largest and most intelligent LLMs to fit into smaller systems, lowering overall cost and reducing energy consumption while keeping real-time speed. All of this comes with minimal impact on model accuracy and perceived intelligence.
The GenAI v1-Q, which like its predecessor is already available for a wide range of FPGAs, aims to expand the range of available features. In the words of Raiderchip's CTO, Victor Lopez, ‘We seek to offer maximum flexibility to our customers, with highly configurable hardware that allows them to balance criteria such as accuracy, inference speed, model size, unit cost of hardware, or energy consumption goals according to their needs, finding the perfect balance that best fits their objectives.’
The current demonstrator accelerates Meta’s Llama 2-7B with 4-bit quantization using barely 4 GB of memory, whereas the unquantized (vanilla) version requires 16 GB of DDR.
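As a rough sanity check on these memory figures, the weight storage of a model can be estimated from its parameter count and the number of bits stored per weight. The sketch below is illustrative only and is not Raiderchip's sizing method: it assumes a nominal 7 billion parameters, 16-bit weights for the unquantized model, and exactly 4 bits per weight for the quantized one. The function name `weight_memory_gb` is hypothetical. Quantization formats such as Q4_K also store per-block scaling data, and a running system needs extra memory beyond the weights, so real footprints sit somewhat above these estimates.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal "7B" parameter count, not the exact Llama 2-7B figure.
N_PARAMS = 7e9

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # assumed 16-bit unquantized weights
q4_gb = weight_memory_gb(N_PARAMS, 4)     # nominal 4 bits, ignoring block metadata
reduction = 1 - q4_gb / fp16_gb

print(f"16-bit weights: {fp16_gb:.1f} GB")  # 14.0 GB
print(f"4-bit weights:  {q4_gb:.1f} GB")    # 3.5 GB
print(f"reduction:      {reduction:.0%}")   # 75%
```

Under these assumptions the weights shrink from about 14 GB to about 3.5 GB, which is consistent with the 16 GB and 4 GB memory sizes quoted for the demonstrator and with the stated reduction of up to 75%.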
Companies interested in trying the GenAI v1-Q may reach out to Raiderchip for access to its demo or a consultation on how its IP cores can accelerate their AI workloads.
More information at https://raiderchip.ai/technology/hardware-ai-accelerators