GenAI v1-Q launches with 4-bit quantization support to accelerate larger LLMs at the Edge
The new version delivers a 276% inference speed increase for leading LLMs on low-cost systems, while preserving their intelligence.
Spain, September 24, 2024 -- RaiderChip has presented a new hardware accelerator product adding 4-bit and 5-bit quantization support (Q4_K and Q5_K) to the extraordinary efficiency of the base GenAI v1. The new Generative AI LLM hardware accelerator runs inside FPGA devices and is ideal for boosting their capabilities with low-cost DDR and LPDDR memories, increasing inference speed by 276%.
GenAI v1-Q running the Llama 2-7B LLM with 4-bit quantization on a low-cost Versal FPGA with LPDDR4 memory
The new acceleration engine not only increases inference speed but also lowers memory requirements by up to 75%, allowing the largest and most intelligent LLM models to fit into smaller systems. This lowers overall cost and reduces energy consumption while keeping real-time speed, all with minimal impact on model accuracy and perceived intelligence.
Like its predecessor, the GenAI v1-Q is already available for a wide range of FPGAs and expands the set of available features. In the words of its CTO, Victor Lopez: ‘We seek to offer maximum flexibility to our customers, with highly configurable hardware that allows them to balance criteria such as accuracy, inference speed, model size, unit hardware cost, or energy consumption goals according to their needs, finding the perfect balance that best fits their objectives.’
The current demonstrator accelerates Meta’s Llama 2-7B quantized to 4 bits using barely 4 GB of memory, whereas the vanilla model requires 16 GB of DDR.
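The memory savings follow directly from the bits stored per weight. A rough back-of-the-envelope sketch is below; the ~4.5 effective bits per weight for Q4_K (scales and metadata included) and the 7-billion-parameter count are assumptions for illustration, not RaiderChip implementation details:

```python
# Approximate weight-storage footprint of an LLM at different quantization levels.
# Illustrative only: bit widths per format are assumed, not vendor-confirmed.

def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Storage in GiB needed for the model weights alone."""
    return n_params * bits_per_weight / 8 / 2**30

params = 7e9  # Llama 2-7B parameter count

fp16 = weight_memory_gib(params, 16)    # 16-bit floats, the unquantized baseline
q4k = weight_memory_gib(params, 4.5)    # assumed ~4.5 effective bits/weight for Q4_K

print(f"fp16: {fp16:.1f} GiB, Q4_K: {q4k:.1f} GiB, reduction: {1 - q4k/fp16:.0%}")
```

With these assumptions the weights drop from roughly 13 GiB to under 4 GiB, a reduction of around 72%, consistent with the up-to-75% figure and the 16 GB vs 4 GB system totals quoted above (runtime buffers and the KV cache add to the weight footprint).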
Companies interested in trying the GenAI v1-Q may contact RaiderChip for access to a demo or a consultation on how its IP cores can accelerate their AI workloads.
More information at https://raiderchip.ai/technology/hardware-ai-accelerators