Embedded, on-chip SRAM has long been a fundamental building block for both custom and standard chips. Early designs typically paired small blocks of on-chip SRAM with off-chip DRAM devices. Those off-chip devices became more sophisticated over time, with higher-performance interfaces (e.g., GDDR6) or new form factors (e.g., HBM2 3D memory stacks). The on-chip memory portion continued to grow as well.
Today, on-chip memory typically occupies over 60 percent of the silicon real estate of an advanced FinFET-class design. Single-port, two-port, pseudo two-port, fast cache and multiple flavors of register files are just some of the memory types that fill that area. This memory acts as a supporting fabric spread throughout the chip, serving the computation wherever it takes place. With memories occupying so much of the die, increasing their speed or reducing their power or area, even by a small amount, can have a significant impact on the whole chip. For example, if memory accounts for 60 percent of the die, a 5 percent reduction in memory area alone shrinks the entire die by about 3 percent. More on that in a moment.
