Back at its Financial Analyst Day in 2020, AMD showed a diagram emphasizing its server CPU design chops and making the point that chiplets weren’t the final step in the evolution of its CPUs. AMD chose to draw a line from its first deployment of HBM in 2015 through to the launch of chiplets, along with a future CPU offering X3D packaging with a mix of 2.5D and 3D technologies.
AMD never announced a specific product that would bring X3D to market, but a new rumor suggests the company is working on a product codenamed “Milan-X.” Milan-X would be based on AMD’s most recent Epyc processor architecture, but it would deploy far more memory bandwidth than we’ve seen in an AMD server before.
AMD’s next-gen I/O die is supposedly called Genesis I/O, and the entire combined 2.5D/3D stack sits on top of a large interposer. AMD’s official diagram shows a 4-high stack of HBM per CPU cluster, with one HBM stack dedicated to each chip.
It’s possible that AMD’s diagram is only intended to show the general concept of what the company intends to build, not to accurately convey the final design of the product. If the diagram is accurate, it suggests that Milan-X will either feature more cores per chiplet (16 would be needed to hit 64 cores in four chiplets) or top out at 32 cores. The diagram also implies AMD’s interposer die must be underneath the cluster of chiplets.
This would definitely qualify as 3D chip stacking, but it also raises questions about how much power the I/O die will draw. It seems likely that AMD would have finally shrunk down to 7nm for I/O, just to limit the overall power consumption.
3D die stacking has always been difficult outside of low-power environments, due to the problem of moving heat from the bottom to the top of the stack without cooking some part of the chip in the process. The Holy Grail of chip stacking is to put multiple high-power chiplets on top of each other as opposed to laying them out side-by-side, but Intel and AMD have both decided to tackle something a bit easier first: putting a hot chip on top of a cool one.
Intel doesn’t use the same X3D technology that AMD is rumored to be shipping for Milan-X, but its Foveros 3D interconnect allowed the company’s low-power Lakefield processor to feature one big-core Ice Lake CPU stacked on top of four low-power “Tremont” CPU cores. With Milan-X, AMD would be tackling something considerably more complex, again assuming both that this rumor is true and that the I/O die is beneath the chip cluster.
Milan-X is said to be a data-center-only chip, and it isn’t clear what kind of cooling solution would be required to deal with the CPU’s unique structure. Presumably, AMD will want to stick with forced air, but liquid and immersion cooling are also possible.
The amount of bandwidth Milan-X would offer in this configuration is unparalleled. Our recent TRACBench debut illustrated how much additional memory bandwidth could improve the performance of the eight-channel 3995WX compared with the quad-channel 3990X, even when the latter is running at a higher clock speed. In that comparison, a Threadripper 3995WX has up to 204.8GB/s worth of memory bandwidth to split across 64 cores.
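To put the 204.8GB/s figure in per-core terms, here is a minimal back-of-the-envelope sketch. It assumes DDR4-3200 and a 64-bit (8-byte) memory channel on both chips; only the 204.8GB/s total, the channel counts, and the 64-core count for the 3995WX come from the comparison above. The function name is purely illustrative, and the 64-core count for the 3990X is likewise an assumption.

```python
# Peak theoretical DRAM bandwidth = channels x transfer rate x bytes per transfer.
# Assumptions (not stated in the article): DDR4-3200 (3200 MT/s) and an
# 8-byte-wide channel. Real-world sustained bandwidth will be lower.

def peak_bandwidth_gbs(channels, mega_transfers_per_s=3200, bytes_per_transfer=8):
    """Return peak theoretical bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000

CORES = 64  # both Threadripper parts are assumed to have 64 cores

for name, channels in (("3995WX", 8), ("3990X", 4)):
    total = peak_bandwidth_gbs(channels)
    print(f"{name}: {total:.1f} GB/s total, {total / CORES:.2f} GB/s per core")
```

Under those assumptions, the eight-channel part has twice the theoretical bandwidth per core of the quad-channel part, which is the gap the benchmark comparison is probing.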
If each Milan-X chiplet is still eight cores and the chip uses mainstream, commercially available HBM2E, we’d be looking at somewhere between 300GB/s and 500GB/s worth of memory bandwidth per chiplet. Total available memory bandwidth across the entire chip should break 1TB/s and could reach 2TB/s. Whatever other constraints might bind Milan-X at that point, bandwidth wouldn’t be among them. The chip also presumably supports off-package memory, however. Even if we assume a near-term breakthrough allowing for 32GB per HBM2E stack, four stacks would only be 128GB of RAM and eight stacks would offer just 256GB. AMD’s current servers support 4TB of RAM per socket, so there’s no chance of replacing that kind of capacity with an equivalent amount of on-package HBM2E.
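The arithmetic behind those estimates can be sketched as follows. This is only an illustration of the rumor-based figures above: it assumes one HBM2E stack per chiplet, uses the four-stack layout for the bandwidth totals, and takes the hypothetical 32GB-per-stack capacity at face value. The function and constant names are made up for this example.

```python
# All inputs are the article's estimates, not confirmed Milan-X specifications.

def hbm_bandwidth_gbs(stacks, gbs_per_stack):
    """Aggregate HBM bandwidth in GB/s, assuming it scales linearly with stacks."""
    return stacks * gbs_per_stack

def hbm_capacity_gb(stacks, gb_per_stack=32):
    """Total on-package capacity in GB (32GB per stack is a hypothetical)."""
    return stacks * gb_per_stack

DDR_CAPACITY_GB = 4 * 1024  # 4TB of off-package RAM per socket

# Bandwidth range for a four-stack layout at 300-500GB/s per stack.
for per_stack in (300, 500):
    total = hbm_bandwidth_gbs(4, per_stack)
    print(f"4 stacks at {per_stack} GB/s each: {total / 1000:.1f} TB/s")

# Capacity for four and eight stacks, compared with 4TB of DDR4 per socket.
for stacks in (4, 8):
    cap = hbm_capacity_gb(stacks)
    share = 100 * cap / DDR_CAPACITY_GB
    print(f"{stacks} stacks: {cap} GB, {share:.2f}% of {DDR_CAPACITY_GB} GB")
```

Four stacks land between 1.2TB/s and 2TB/s, in line with the range quoted above, while even eight 32GB stacks would cover only a small fraction of the 4TB a socket supports today. An eight-stack layout would double the bandwidth totals under the same linear-scaling assumption.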
Milan-X looks like the kind of chip AMD might bring to bear against Sapphire Rapids. That CPU is expected to feature somewhere between 56 and 80 cores (reports have varied), and it also integrates HBM2 on-package. Sapphire Rapids is currently expected in late 2021 or early 2022. No launch date for Milan-X has been reported.