Fujitsu Presents Post-K CPU Specifications
Peak performance of over 2.7 TFLOPS to realize outstanding HPC and AI capabilities
Tokyo, August 22, 2018 - Fujitsu today announced publication of specifications for the A64FX™ CPU to be featured in the post-K computer, a supercomputer being developed by Fujitsu and RIKEN as a successor to the K computer, which achieved the world's highest performance in 2011. The organizations are striving to achieve post-K application execution performance up to 100 times that of the K computer.
A64FX is the world's first CPU to adopt the Scalable Vector Extension (SVE), an extension of the Armv8-A instruction set architecture for supercomputers. Building on over 60 years' worth of Fujitsu-developed microarchitecture, this chip offers peak performance of over 2.7 TFLOPS, demonstrating superior HPC and AI performance.
Fujitsu made the announcement at Hot Chips 30[1], an international symposium on high performance processors and related technologies held in Silicon Valley, California from August 19-21.
Post-K is the successor to the K computer, which in 2011 ranked first on the TOP500 list of the world's supercomputers. Fujitsu and RIKEN are developing post-K with the aim of beginning operation around 2021.
A64FX is the high-performance CPU that will be used in post-K. It offers a number of features, including broad utility supporting a wide range of applications, massive parallelization through the Tofu interconnect, low power consumption, and mainframe-class reliability.
A64FX is the world's first CPU to adopt the SVE of Arm Limited's Armv8-A instruction set architecture, extended for supercomputers. Fujitsu collaborated with Arm, contributing to the development of the SVE as a lead partner, and adopted the results in the A64FX.
Fujitsu developed the microarchitecture of the A64FX by building on the technology of its previous supercomputers, mainframes, and UNIX servers. Hardware technology that draws out the high memory bandwidth of the high-performance stacked memory lets the system make efficient use of the CPU's highly functional computational processing units, delivering high application execution performance. The CPUs will be directly connected by the proprietary Tofu interconnect developed for the K computer, improving parallel performance.
The system can provide peak double precision (64 bit) floating point performance of over 2.7 TFLOPS, with twice that throughput for single precision (32 bit) and four times that throughput for half precision (16 bit). In other words, applications that can use single precision or half precision operations can get results even faster. Fujitsu has also enhanced computational performance for 16 bit and 8 bit integer operations. Accordingly, this CPU is suited to a wide range of fields such as big data and AI, not just the computer simulations at which traditional supercomputers excel.
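The precision scaling described above is consistent with simple lane arithmetic on a 512-bit vector, the SIMD width listed in the specifications: halving the element width doubles the number of elements processed per instruction. The following Python sketch illustrates that arithmetic only. It assumes peak throughput scales linearly with the number of lanes per vector, and the names `lanes_per_vector` and `peak_lower_bound` are hypothetical, not part of any Fujitsu or Arm software.

```python
# Illustrative arithmetic only. The 2.7 TFLOPS figure and the 512-bit width
# come from the announcement; the linear-scaling model is an assumption.
SVE_VECTOR_BITS = 512      # SIMD width listed in the specifications
PEAK_FP64_TFLOPS = 2.7     # stated lower bound ("over 2.7 TFLOPS")


def lanes_per_vector(element_bits: int) -> int:
    """Number of elements of the given width that fit in one 512-bit vector."""
    return SVE_VECTOR_BITS // element_bits


def peak_lower_bound(element_bits: int) -> float:
    """Scale the 64-bit peak by the ratio of lanes per vector (assumed model)."""
    return PEAK_FP64_TFLOPS * lanes_per_vector(element_bits) / lanes_per_vector(64)


if __name__ == "__main__":
    for bits in (64, 32, 16):
        lanes = lanes_per_vector(bits)
        print(f"{bits} bit: {lanes} lanes per vector, over {peak_lower_bound(bits):.1f} TFLOPS")
```

Under this model, the 32 bit and 16 bit figures come out at twice and four times the 64 bit figure, matching the ratios stated in the text.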
The Arm architecture is widely accepted by software developers and users. By participating in the Arm community, Fujitsu can draw on the community's software resources, including open source software, while also contributing to the expansion of the Arm architecture ecosystem.
Photo of the A64FX package
A64FX block diagram
Category | Details |
---|---|
Instruction Set Architecture | Armv8.2-A SVE (512-bit wide SIMD) |
Number of cores | 48 computing cores, 4 assistant cores |
Memory | 32GiB (HBM2) |
Process Technology | 7 nm FinFET |
Number of Transistors | About 8.7 billion transistors |
Peak Performance (TOPS) | Double precision (64 bit) floating point operations: over 2.7 TOPS (DGEMM[2] execution efficiency over 90%); single precision (32 bit) floating point operations: over 5.4 TOPS; half precision (16 bit) floating point operations/16 bit integer operations: over 10.8 TOPS; 8 bit integer operations: over 21.6 TOPS |
Peak Memory Bandwidth | 1024 GB/second (STREAM Triad[3] execution efficiency over 80%) |
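The efficiency figures in the table can be turned into rough sustained numbers by multiplying each peak value by its stated efficiency. The Python sketch below does only that arithmetic. All constant and variable names are hypothetical, the inputs are the lower bounds quoted in the table ("over 2.7", "over 90%", "over 80%"), and the results are estimates rather than measured values.

```python
# Lower-bound arithmetic from the specification table; names are hypothetical.
PEAK_FP64_TFLOPS = 2.7        # "over 2.7" double precision peak
PEAK_BANDWIDTH_GBS = 1024.0   # peak memory bandwidth in GB/second
DGEMM_EFFICIENCY = 0.90       # "over 90%" DGEMM execution efficiency
TRIAD_EFFICIENCY = 0.80       # "over 80%" STREAM Triad execution efficiency

# Sustained throughput implied by the stated efficiencies.
sustained_dgemm_tflops = PEAK_FP64_TFLOPS * DGEMM_EFFICIENCY
sustained_triad_gbs = PEAK_BANDWIDTH_GBS * TRIAD_EFFICIENCY

# Bytes of memory traffic available per double precision operation at peak
# (1 TFLOPS = 1000 GFLOPS, so GB/s divided by GFLOPS gives bytes per operation).
bytes_per_flop = PEAK_BANDWIDTH_GBS / (PEAK_FP64_TFLOPS * 1000.0)

if __name__ == "__main__":
    print(f"DGEMM sustained: over {sustained_dgemm_tflops:.2f} TFLOPS")
    print(f"STREAM Triad sustained: over {sustained_triad_gbs:.1f} GB/second")
    print(f"Memory bytes per double precision operation: about {bytes_per_flop:.2f}")
```

The bytes-per-operation ratio is a common way to judge how well memory bandwidth is balanced against compute. Because the peak figure is only a lower bound, the ratio computed here is an approximate upper bound.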
Future Plans
Through the development of post-K, which will be equipped with this CPU, Fujitsu will contribute to the resolution of social and scientific issues in computer simulation fields such as cutting-edge research, health and longevity, disaster prevention and mitigation, energy, and manufacturing. It will also enhance industrial competitiveness and contribute to the creation of Society 5.0 by promoting applications in the big data and AI fields.
[1] Hot Chips 30
A symposium held every year by the US Institute of Electrical and Electronics Engineers (IEEE).
[2] DGEMM
A component in computational programs used for benchmarking. A subroutine that computes matrix multiplication.
[3] STREAM Triad
A benchmark used as an indicator of memory access performance. It measures sustained memory bandwidth when the processor is accessing memory.
About Fujitsu
Fujitsu is the leading Japanese information and communication technology (ICT) company, offering a full range of technology products, solutions, and services. Approximately 140,000 Fujitsu people support customers in more than 100 countries. We use our experience and the power of ICT to shape the future of society with our customers. Fujitsu Limited (TSE: 6702) reported consolidated revenues of 4.1 trillion yen (US $39 billion) for the fiscal year ended March 31, 2018. For more information, please see www.fujitsu.com.